puririshi98 (Contributor) commented Sep 4, 2025

Replaces #9992.
Adds a new torch_geometric.llm subpackage for NVIDIA to maintain.
Example of how to set this up in an NVIDIA Docker container:
git config --global credential.helper store
huggingface-cli login --token <insert_token>
cd /opt/pyg
pip uninstall -y torch-geometric
rm -rf pytorch_geometric
git clone -b latest-txt2kg https://github.com/pyg-team/pytorch_geometric.git
cd /opt/pyg/pytorch_geometric
pip install .
pip install openai

Example to run:
python3 examples/llm/txt2kg_rag.py
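
Before running the example, it can help to confirm that the reinstalled package actually exposes the new subpackage. The following is only a minimal sketch using the Python standard library; the helper name has_module is hypothetical and not part of PyG, and the check assumes the subpackage is importable as torch_geometric.llm, as this PR describes.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the dotted module name can be located.

    has_module is a hypothetical helper for illustration, not a PyG API.
    find_spec imports parent packages, so a missing parent raises
    ModuleNotFoundError, which is treated here as "not available".
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


if __name__ == "__main__":
    # Assumed module path, taken from the PR title and description.
    target = "torch_geometric.llm"
    if has_module(target):
        print(f"{target} is available")
    else:
        print(f"{target} not found; check that the pip install step succeeded")
```

If the check reports the subpackage as missing, the most likely cause is that an older torch-geometric install is still on the path.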

Thanks to @Kh4L, @zaristei, @rlratzel, and @rliu for their contributions.

This has been thoroughly tested by many internal and external users and should be good to merge. We will continue to improve this pipeline in future PRs, but this PR is ready. Future PRs:

- Add the RAG CI job back after figuring out why the tests pass but the CI gives a red X.
- Add a non-toy dataset as the default data (it needs to be released by NVIDIA on Hugging Face; waiting on that).
- Further improve the user friendliness of the example.

I will work with @zaristei on refining RAGQueryLoader and the Feature/Graph Store workflow as directed by @wsad1.
Note: NVIDIA CI will run all unit tests (including these RAG-related ones) as well as the full txt2kg_rag.py example every time we update our NVIDIA container. These runs cover H100, B100, and A100 GPUs across a few different SKUs. NVIDIA QA also runs on almost every hardware SKU before each bi-monthly release.

(Closed sub-PR: #10368)
I had broken this PR down, but Matthias said I could merge if he hadn't reviewed it by the time I came back from vacation, since we have tested and reviewed this so thoroughly at NVIDIA and externally, on top of my involvement in most of the LLM features of PyG.

NVIDIA's backing of this project spans many orgs all the way to the top, and Fortune 500 companies are adopting it left and right. Getting this merged is the first step towards cementing PyG as THE framework for anything "Graph" at NVIDIA, and I am leading this charge. We will continue to optimize over time and treat any issues that come up as P0s to fix ASAP. But merging this is an important first step.

puririshi98 and others added 30 commits February 4, 2025 21:06
Co-authored-by: riship <[email protected]>
Co-authored-by: riship <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
akihironitta (Member) left a comment

LGTM as per offline discussion. I left some final comments before merging.

akihironitta added this to the 2.7.0 milestone Sep 4, 2025
akihironitta changed the title from "Reorg txt2kg" to "Reorganize TXT2KG and introduce torch_geometric.llm" Sep 5, 2025
akihironitta self-assigned this Sep 5, 2025
github-actions bot removed the sampler label Sep 5, 2025
akihironitta (Member) left a comment

As per our offline discussion amongst @rusty1s, @wsad1, @puririshi98, and myself, we have decided to create a new subpackage, torch_geometric.llm, and assign @puririshi98 as its codeowner. This lets NVIDIA and the PyG community iterate on the integration much more quickly, without needing to guarantee the same standards as the PyG core, e.g., thorough test coverage, documentation, and code quality.

Thanks again @puririshi98 and the team for your patience and the exciting work! 🚀

akihironitta merged commit d4a442b into master Sep 5, 2025
16 checks passed
akihironitta deleted the reorg-txt2kg branch September 5, 2025 18:19