Conversation
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@           Coverage Diff           @@
##           master      #42   +/-   ##
=======================================
  Coverage   99.48%   99.48%
=======================================
  Files           9        9
  Lines         195      195
=======================================
  Hits          194      194
  Misses          1        1

View full report in Codecov by Sentry.
const auto to_local_node_data = to_local_node.data_ptr<scalar_t>();
auto deg_data = deg.data_ptr<scalar_t>();

// Compute induced subgraph degree, parallelize with 32 threads per node:
I'm actually not sure it is necessary to parallelize with 32 threads per node. Most of the time we are dealing with sparse data, and many threads will never enter the for loop.
If you are looking for extreme performance, you can bundle to_local_node_data and col_data into one iterator structure and use this function. I haven't seen anything outperform it in the past:
https://nvlabs.github.io/cub/structcub_1_1_device_segmented_reduce.html#a4854a13561cb66d46aa617aab16b8825
Do you have an example of bundling to_local_node_data and col_data into one iterator structure? This looks really interesting.
I am okay with dropping the warp-level parallelism for now, but we would lose the contiguous access to col_data and probably under-utilize the number of threads available on modern GPUs.
On a second look, this doesn't seem possible: col_data refers to edges, while to_local_node_data refers to nodes, and we actually want to do the computation across the number of nodes in the induced subgraph.
// We maintain a O(N) vector to map global node indices to local ones.
// TODO Can we do this without O(N) storage requirement?
const auto to_local_node = nodes.new_full({rowptr.size(0) - 1}, -1);
Does N mean the number of nodes in the graph?
What if we filtered on each node in nodes_data instead, since it should be much smaller than rowptr_data?
Otherwise, we may consider caching this tensor to avoid the memory allocation each time.
Good points! We use this vector as the mapping from global node indices to new local ones. In C++ we use a map for this, but we can't do the same in CUDA. I don't know of a more elegant solution.
Caching is an option as well, but it requires a (non-intuitive and backend-specific) change to the input arguments. I added it as a TODO for now.
There are GPU hash tables/sets, which may require some atomic operations when you build them, but lookup is fast.
I found that caching is not a good option, since you have to reset the array every time.
Since you can sample on the GPU, the graph is not that big; a node array is not that bad and keeps the code less complicated.
yaoyaowd left a comment:
Let's see if we can improve it later.