Remove Unbuffered Sampling from cuGraph Examples #10079
remove use of __dict__ Co-authored-by: Matthias Fey <[email protected]>
remove use of __dict__ Co-authored-by: Matthias Fey <[email protected]>
simplify isinstance syntax Co-authored-by: Matthias Fey <[email protected]>
simplify isinstance call Co-authored-by: Matthias Fey <[email protected]>
remove explicit inheritance from object Co-authored-by: Matthias Fey <[email protected]>
remove direct __dict__ access Co-authored-by: Matthias Fey <[email protected]>
remove direct __dict__ access Co-authored-by: Matthias Fey <[email protected]>
…hi-nv/pytorch_geometric into initial-cugraph-storage
Create Initial Version of cuGraph Storage and Data Classes
ty will review by eod
puririshi98
left a comment
Code LGTM, but can you share logs from running this on the master branch and on your branch to show the difference in outputs and speed? Also, please share the device specs for context.
akihironitta
left a comment
It'd be nice to have some reference to details on buffered/unbuffered sampling in the PR description, but LGTM!
Co-authored-by: Akihiro Nitta <[email protected]>
Just added single-GPU benchmarks, working on multi-GPU now...
@puririshi98 could you please re-review? I have added benchmarks indicating the speedup from buffered sampling and linked to the PR that introduced the feature in
puririshi98
left a comment
good to go
Removes unbuffered sampling (dumping data to disk) since it is about to be deprecated and it has numerous bugs that we do not intend to fix. Buffered sampling does not require writing to disk, and is significantly faster as shown in the benchmarks below.

Buffered sampling was introduced in rapidsai/cugraph-gnn#48 as part of the overhaul to add support for negative sampling and general link prediction workflows in cuGraph-PyG.

Performance Analysis (w/ GraphSAGE):

| Dataset | GPUs | Epoch Time (unbuffered) | Epoch Time (buffered) | Speedup |
|----------|-------|------------------------|---------------------|------------|
| ogbn-arxiv | 1x A100 80GB | 14 s | 5 s | 2.8x |
| ogbn-arxiv | 2x A100 80GB | 14 s | 4 s | 3.5x |
| ogbn-products | 1x A100 80GB | 69 s | 45 s | 1.5x |
| ogbn-products | 2x A100 80GB | 43 s | 25 s | 1.7x |
| ogbn-products | 4x A100 80GB | 42 s | 19 s | 2.2x |
| ogbn-papers100M | 1x A100 80GB | 479 s | 375 s | 1.3x |
| ogbn-papers100M | 2x A100 80GB | 435 s | 305 s | 1.4x |
| ogbn-papers100M | 4x A100 80GB | 207 s | 120 s | 1.7x |
| ogbn-papers100M | 8x A100 80GB | 169 s | 100 s | 1.7x |

---------

Co-authored-by: Matthias Fey <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Rishi Puri <[email protected]>
Co-authored-by: Akihiro Nitta <[email protected]>
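The Speedup column in the benchmark table above appears to be the unbuffered epoch time divided by the buffered epoch time, rounded to one decimal place. A minimal Python sketch that recomputes it from the reported figures follows; the `BENCHMARKS` list and the `speedup` helper are names introduced here for illustration and are not part of the PR or of cuGraph-PyG.

```python
# Epoch times in seconds, copied from the benchmark table above.
# Each row: (dataset, number of A100 80GB GPUs, unbuffered, buffered, reported speedup)
BENCHMARKS = [
    ("ogbn-arxiv", 1, 14, 5, 2.8),
    ("ogbn-arxiv", 2, 14, 4, 3.5),
    ("ogbn-products", 1, 69, 45, 1.5),
    ("ogbn-products", 2, 43, 25, 1.7),
    ("ogbn-products", 4, 42, 19, 2.2),
    ("ogbn-papers100M", 1, 479, 375, 1.3),
    ("ogbn-papers100M", 2, 435, 305, 1.4),
    ("ogbn-papers100M", 4, 207, 120, 1.7),
    ("ogbn-papers100M", 8, 169, 100, 1.7),
]


def speedup(unbuffered_s: float, buffered_s: float) -> float:
    """Ratio of unbuffered to buffered epoch time (hypothetical helper)."""
    return unbuffered_s / buffered_s


for dataset, gpus, unbuffered_s, buffered_s, reported in BENCHMARKS:
    computed = round(speedup(unbuffered_s, buffered_s), 1)
    print(f"{dataset:16s} {gpus} GPU(s): computed {computed}x, reported {reported}x")
```

Under that assumed definition, every recomputed value agrees with the reported column to one decimal place.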