@zaristei zaristei commented Sep 16, 2024

Follow-up to PR 9597. Includes multiple changes related to LLM+GNN experiments and scaling up to a remote backend, including:

  • LargeGraphIndexer for building a large knowledge graph locally from multiple samples in an arbitrary dataset
  • Remote Backend Loader and examples for deploying a retrieval algorithm to a third-party backend FeatureStore or GraphStore
  • NVTX profiling tools for nsys users
  • Quality of Life improvements and benchmarking scripts for G-Retriever.

Updates using these for WebQSP will be moved to a separate PR.

UPDATE:
PR is being broken up into smaller PRs. These can be previewed here:

  • zaristei#6
  • zaristei#7
  • zaristei#8

@puririshi98 puririshi98 changed the title from "G Retriever Experiments and Improvements (full)" to "G-retriever API updates (NVTX, Remote Backend, Large Graph Indexer, Examples)" Nov 25, 2024

@puririshi98 puririshi98 left a comment

just make new PR w/ webqsp changes and move advanced/rag.rst to that. and apply the fix for skipping pytest if not correct versioning.
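One way to skip a test when the installed package version is too old is sketched below. This is an illustration only, not the fix that was applied in this PR: the helper names `version_tuple` and `has_min_version`, the `torch` threshold of `2.0.0`, and the test name are all hypothetical.

```python
import importlib.metadata
import re

import pytest


def version_tuple(version: str) -> tuple:
    """Turn a string such as '2.1.0+cu121' into (2, 1, 0).

    Only the leading digits of each dot-separated piece are kept, and any
    local suffix after '+' is ignored.
    """
    parts = []
    for piece in version.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def has_min_version(package: str, minimum: str) -> bool:
    """Return True if `package` is installed at version `minimum` or newer."""
    try:
        installed = importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return False
    return version_tuple(installed) >= version_tuple(minimum)


# Hypothetical threshold: the real requirement depends on the feature tested.
requires_recent_torch = pytest.mark.skipif(
    not has_min_version("torch", "2.0.0"),
    reason="requires torch>=2.0.0",
)


@requires_recent_torch
def test_remote_backend_example():
    # Placeholder body; pytest reports this test as skipped when the
    # version requirement above is not met.
    assert True
```

Comparing integer tuples rather than raw strings avoids ordering `2.10` before `2.9`.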

zaristei and others added 8 commits November 25, 2024 19:14
Will be re-added with an updated version of WebQSP

zaristei commented Nov 25, 2024

> just make new PR w/ webqsp changes and move advanced/rag.rst to that. and apply the fix for skipping pytest if not correct versioning.

New PR to be found here for WebQSP and doc changes


@puririshi98 puririshi98 left a comment

LGTM. This has been reviewed 10+ times over the last few months, and the code has been tested. @zaristei has addressed all reviews. Most of the reviews are much further up in this thread.

@puririshi98 puririshi98 merged commit 742f790 into pyg-team:master Nov 26, 2024
16 checks passed
mattjhayes3 pushed a commit to mattjhayes3/pytorch_geometric that referenced this pull request Dec 14, 2024
…xamples) (pyg-team#9666)

Follow up to [PR
9597](pyg-team#9597). Includes
multiple changes related to LLM+GNN experiments and scaling up to a
remote backend. Including:

- LargeGraphIndexer for building a large knowledge graph locally from
multiple samples in an arbitrary dataset
- Remote Backend Loader and examples for deploying a Retrieval algorithm
to a third party backend FeatureStore or GraphStore
- NVTX profiling tools for nsys users
- Quality of Life improvements and benchmarking scripts for G-Retriever.

Updates using these for WebQSP will be moved to a separate PR

UPDATE:
PR is being broken up into smaller PRs. These can be previewed here:

- zaristei#6
- zaristei#7
- zaristei#8

---------

Co-authored-by: Zack Aristei <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Zachary Aristei <[email protected]>
Co-authored-by: Rishi Puri <[email protected]>
puririshi98 added a commit that referenced this pull request Apr 2, 2025
Successor to
[9666](#9666), this:
- ~~updates the documentation to show how to utilize GNN RAG and~~(now
handled by separate branch)
- updates WebQSP to help serve as a toy example for LargeGraphIndexer.
- fixes issues with LargeGraphIndexer running out of memory by
introducing a default batch size and multithreading ability
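
The general idea of bounding memory with a default batch size and worker threads can be sketched as follows. This is a minimal illustration using only the standard library, not the actual LargeGraphIndexer code: `DEFAULT_BATCH_SIZE`, `batched`, and `map_in_batches` are hypothetical names, and the value 256 is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")

DEFAULT_BATCH_SIZE = 256  # hypothetical default, chosen for illustration


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def map_in_batches(
    fn: Callable[[T], R],
    items: Iterable[T],
    batch_size: int = DEFAULT_BATCH_SIZE,
    num_workers: int = 4,
) -> Iterator[R]:
    """Lazily apply `fn` to every item, one batch at a time.

    Only one batch of inputs is materialized at once, and the items inside
    a batch are processed by a pool of threads. Results are yielded in the
    original input order.
    """
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        for batch in batched(items, batch_size):
            yield from pool.map(fn, batch)


if __name__ == "__main__":
    squares = list(map_in_batches(lambda x: x * x, range(10), batch_size=3))
    print(squares)
```

Because the function is a generator, the caller decides whether to collect all results or consume them incrementally.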

~~currently blocked by a bug that causes the g_retriever.py example to
get 1% less accuracy.~~ Bug is due to a fp32 precision issue related to
batch kernels in Huggingface's transformers. Performance difference is
too inconsequential to require a fix.

may also be the cause of low retrieval precision in
#9846
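
As a general illustration of why fp32 results can shift when values are accumulated in a different order (as batched and unbatched kernels may do), the sketch below emulates single-precision addition with the standard library. It is not a reproduction of the transformers behaviour described above; `f32` and `add32` are hypothetical helpers.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float (double precision) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def add32(a: float, b: float) -> float:
    """Add two numbers with the result rounded to float32."""
    return f32(f32(a) + f32(b))


a, b, c = 1e8, -1e8, 1.0

# Same three numbers, two different accumulation orders.
left_to_right = add32(add32(a, b), c)  # (a + b) + c
right_to_left = add32(a, add32(b, c))  # a + (b + c)

print(left_to_right, right_to_left)
```

Near 1e8 adjacent float32 values are 8 apart, so adding 1.0 to -1e8 rounds back to -1e8 and the second ordering loses the 1.0 entirely.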

---------

Co-authored-by: Zack Aristei <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Zachary Aristei <[email protected]>
Co-authored-by: Rishi Puri <[email protected]>
Co-authored-by: Rishi Puri <[email protected]>
Kh4L pushed a commit to Kh4L/pytorch_geometric that referenced this pull request Apr 15, 2025
3 participants