GPSE Implementation #9018
Conversation
Codecov Report

Attention: Patch coverage is …

Additional details and impacted files:

```
@@ Coverage Diff @@
##           master    #9018      +/-   ##
==========================================
+ Coverage   87.33%   89.23%   +1.90%
==========================================
  Files         460      472      +12
  Lines       30385    30594     +209
==========================================
+ Hits        26536    27301    +765
+ Misses       3849     3293    -556
```

View full report in Codecov by Sentry.
Super, let me resolve the linting issue.
@semihcanturk @rusty1s What is the status on this PyG implementation of GPSE? The GPSE paper says, "For convenience, GPSE has also been integrated into the PyG library to facilitate downstream applications." I was hoping to try it out soon.
Hi @pjspol, and thanks for checking in! The implementation here is ready and is simply awaiting a merge -- I know @rusty1s was unavailable for a while so that has delayed the merge a bit, but I'm expecting it to be merged soon. In the meantime, you should be able to install this branch in your environment to work with GPSE:
Hi @semihcanturk. I believe there is a minor ambiguity in the docs for the GPSENodeEncoder. In the … Thanks
Hi @rusty1s, finally got around to this :) There were minor issues re: …
What is the status on this PyG implementation of GPSE? I think it will be very helpful for my research!
@zzzzzzyc I will work with @semihcanturk to get this merged, but in the meantime you can test this branch in your own work. Please do share your experience with running it. I would also recommend testing in the NVIDIA PyG container for the easiest setup.
puririshi98 left a comment
Based on testing with the NVIDIA stack, I think this is safe to merge.
Please fix linting, and hopefully the other CI failures clear up (unrelated to you).
Head branch was pushed to by a user without write access
Following #9018, provides a comprehensive example in `examples/gpse.py` to compute GPSE encodings and use them for a graph regression task on the ZINC dataset. Two methods to compute GPSE encodings are demonstrated:

- Through `precompute_GPSE`: Given a PyG dataset, computes GPSE encodings in-place _once_ before training, without saving them to storage. Ideal if you want to compute the encodings only once per run (unlike a dataset transform) but do not want to save the pre-transformed dataset to storage (unlike a PyG pre-transform). To run with the default pretrained weights (molpcba):

  ```
  python examples/gpse.py --gpse
  ```

  To run with pretrained weights from any other dataset, provide the pretraining dataset name from the available options as an argument:

  ```
  python examples/gpse.py --gpse geom
  ```

- Through the `AddGPSE` transform: A PyG transform analogous to [AddLaplacianEigenvectorPE](https://pytorch-geometric.readthedocs.io/en/2.6.0/generated/torch_geometric.transforms.AddLaplacianEigenvectorPE.html#torch_geometric.transforms.AddLaplacianEigenvectorPE) and [AddRandomWalkPE](https://pytorch-geometric.readthedocs.io/en/2.6.0/generated/torch_geometric.transforms.AddRandomWalkPE.html#torch_geometric.transforms.AddRandomWalkPE), which can be used as a pre-transform or transform on a PyG dataset:

  ```
  python examples/gpse.py --gpse --as_transform
  ```

  Using it as a transform is not recommended, since recomputing the encodings for every batch in every epoch is quite inefficient; using it as a pre-transform or through `precompute_GPSE` is suggested instead.

In either case, `torch_geometric.nn.GPSENodeEncoder` is then used to map the GPSE encodings to the desired dimension and append them to `batch.x`, preparing them as inputs to a GNN. This PR has been tested with the latest (25.01-py3) NVIDIA stack and works without any issues.
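The trade-off between the two methods can be sketched in plain Python. This is an illustrative stand-in, not the actual PyG API: `Graph`, `encode`, `precompute_once`, and `AddEncoding` are hypothetical names mimicking the roles of `Data`, the GPSE forward pass, `precompute_GPSE`, and `AddGPSE`.

```python
# Illustrative sketch of the two GPSE workflows; all names here are
# hypothetical stand-ins, not the real torch_geometric API.

class Graph:
    """Toy stand-in for a PyG Data object."""
    def __init__(self, x):
        self.x = x               # node features
        self.pestat_GPSE = None  # slot for precomputed encodings

def encode(graph):
    """Stand-in for the (expensive) GPSE forward pass."""
    return [v * 0.5 for v in graph.x]

def precompute_once(dataset):
    """precompute_GPSE-style: run the encoder once, store in-place."""
    for g in dataset:
        g.pestat_GPSE = encode(g)

class AddEncoding:
    """AddGPSE-style transform: recomputes on every call, i.e. for
    every batch in every epoch when used as a runtime transform."""
    def __call__(self, g):
        g.pestat_GPSE = encode(g)
        return g

dataset = [Graph([1.0, 2.0]), Graph([3.0])]
precompute_once(dataset)           # cost paid once, before training
assert dataset[0].pestat_GPSE == [0.5, 1.0]

transform = AddEncoding()          # cost paid on every access if used as transform
g = transform(Graph([4.0]))
assert g.pestat_GPSE == [2.0]
```

The in-place precompute mirrors why the example recommends `precompute_GPSE` or a pre-transform: the encoder runs once per graph rather than once per batch per epoch.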
---------

Co-authored-by: Semih Cantürk <=>
Co-authored-by: rusty1s <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Rishi Puri <[email protected]>
Co-authored-by: Rishi Puri <[email protected]>
[Graph Positional and Structural Encoder](https://arxiv.org/abs/2307.07107) implementation as per pyg-team#8310. Adapted from the original repository: https://github.com/G-Taxonomy-Workgroup/GPSE. This version is a standalone implementation that is decoupled from GraphGym, and thus aims for better accessibility and a smoother integration into PyG. While the priority of this PR is to enable loading and using pre-trained models in plug-and-play fashion, it also includes the custom loss function used to train the model. Nevertheless, it might be easier to use the original repository for pre-training and fine-tuning new GPSE models for the time being.

This PR includes the following:

- `GPSE`: The main GPSE module, which generates learned encodings for input graphs.
- Several helper classes (`FeatureEncoder`, `GNNStackStage`, `IdentityHead`, `GNNInductiveHybridMultiHead`, `ResGatedGCNConvGraphGymLayer`, `Linear`, `MLP`, `GeneralMultiLayer`, `GeneralLayer`, `BatchNorm1dNode`, `BatchNorm1dEdge`, `VirtualNodePatchSingleton`) and wrapper functions (`GNNPreMP`, `GNNLayer`), all adapted from their GraphGym versions for compatibility and to enable loading weights pre-trained with the GraphGym/original version.
- The class method `GPSE.from_pretrained()`, which returns a model with pre-trained weights from the original repository/Zenodo files.
- `GPSENodeEncoder`, a helper linear/MLP encoder that takes the GPSE encodings precomputed as `batch.pestat_GPSE` in the input graphs, maps them to a desired dimension, and appends them to node features.
- `precompute_GPSE`, a function that takes in a GPSE model and a dataset, and precomputes GPSE encodings in-place for the dataset using the helper function `gpse_process_batch`.
- The transform `AddGPSE`, which, in similar fashion to `AddLaplacianEigenvectorPE` and `AddRandomWalkPE`, adds the GPSE encodings to a given graph using the helper function `gpse_process`.
- The testing modules `test/test_gpse.py` and `test/test_add_gpse.py`.
- The loss function `gpse_loss` and helper functions `cosim_col_sep` and `process_batch_idx` used in GPSE training.
- A comprehensive example in `examples/gpse.py`, provided as a separate PR in pyg-team#10118.

This PR has been tested with the latest (25.01-py3) NVIDIA stack and works without any issues.

---------

Co-authored-by: Semih Cantürk <=>
Co-authored-by: rusty1s <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Rishi Puri <[email protected]>
Co-authored-by: Rishi Puri <[email protected]>
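The role of a `GPSENodeEncoder`-style module described above can be sketched in plain Python: map each node's precomputed encoding through a linear layer to a target dimension, then append it to the node's features. The function names and the toy weights below are hypothetical, purely for illustration; the real module operates on PyTorch tensors.

```python
# Hypothetical sketch of what a GPSENodeEncoder-style module does:
# map precomputed encodings to a target dimension and append them
# to node features. Not the real torch_geometric implementation.

def linear_map(vec, weights):
    """Minimal dense layer: weights is a list of rows (out_dim x in_dim)."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

def encode_nodes(x, pestat, weights):
    """For each node, append its mapped encoding to its feature vector."""
    return [feat + linear_map(pe, weights) for feat, pe in zip(x, pestat)]

x = [[1.0, 0.0], [0.0, 1.0]]       # 2 nodes, 2 features each (batch.x)
pestat = [[2.0, 4.0], [6.0, 8.0]]  # precomputed encodings (batch.pestat_GPSE)
weights = [[0.5, 0.5]]             # toy weights mapping dim-2 encodings to dim 1

out = encode_nodes(x, pestat, weights)
assert out == [[1.0, 0.0, 3.0], [0.0, 1.0, 7.0]]
```

In the real module the mapping is a learned `Linear` or `MLP`, and the concatenated result replaces `batch.x` before it is fed to the downstream GNN.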