GPSE Implementation #9018

semihcanturk · 2024-03-05T06:27:55Z

[Graph Positional and Structural Encoder](https://arxiv.org/abs/2307.07107) implementation as per #8310. Adapted from the original repository: https://github.com/G-Taxonomy-Workgroup/GPSE. This version is a standalone implementation that is decoupled from GraphGym, and thus aims for better accessibility and smoother integration into PyG. While the priority of this PR is to enable loading and using pre-trained models in a plug-and-play fashion, it also includes the custom loss function used to train the model. Nevertheless, for the time being it might be easier to use the original repository for pre-training and fine-tuning new GPSE models.

This PR includes the following:

- `GPSE`: The main GPSE module, which generates learned encodings for input graphs.
- Several helper classes (`FeatureEncoder`, `GNNStackStage`, `IdentityHead`, `GNNInductiveHybridMultiHead`, `ResGatedGCNConvGraphGymLayer`, `Linear`, `MLP`, `GeneralMultiLayer`, `GeneralLayer`, `BatchNorm1dNode`, `BatchNorm1dEdge`, `VirtualNodePatchSingleton`) and wrapper functions (`GNNPreMP`, `GNNLayer`), all adapted from their GraphGym versions for compatibility and to enable loading weights pre-trained with the GraphGym/original version.
- The class method `GPSE.from_pretrained()`, which returns a model with pre-trained weights from the original repository/Zenodo files.
- `GPSENodeEncoder`, a helper linear/MLP encoder that takes the GPSE encodings precomputed as `batch.pestat_GPSE` in the input graphs, maps them to a desired dimension and appends them to node features.
- `precompute_GPSE`, a function that takes a GPSE model and a dataset, and precomputes GPSE encodings in-place for that dataset using the helper function `gpse_process_batch`.
- The transform `AddGPSE`, which, in similar fashion to `AddLaplacianEigenvectorPE` and `AddRandomWalkPE`, adds the GPSE encodings to a given graph using the helper function `gpse_process`.
- The testing modules `test/test_gpse.py` and `test/test_add_gpse.py`.
- The loss function `gpse_loss` and helper functions `cosim_col_sep` and `process_batch_idx` used in GPSE training.
- A comprehensive example in `examples/gpse.py` is provided as a separate PR in GPSE example #10118.

This PR has been tested with the latest (25.01-py3) NVIDIA stack, and works without any issues.

codecov · 2024-03-27T05:46:55Z

Codecov Report

Attention: Patch coverage is 64.41860% with 153 lines in your changes missing coverage. Please review.

Project coverage is 89.23%. Comparing base (61c47ee) to head (f24acbc).
Report is 14 commits behind head on master.

❗ Current head f24acbc differs from pull request most recent head f9ce537

Please upload reports for the commit f9ce537 to get more accurate results.

| Files | Patch % | Lines |
|---|---|---|
| torch_geometric/nn/models/gpse.py | 62.68% | 153 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #9018      +/-   ##
==========================================
+ Coverage   87.33%   89.23%   +1.90%     
==========================================
  Files         460      472      +12     
  Lines       30385    30594     +209     
==========================================
+ Hits        26536    27301     +765     
+ Misses       3849     3293     -556


rusty1s · 2024-06-05T15:23:07Z

Super, let me resolve the linting issue.

pjspol · 2024-07-15T19:00:25Z

@semihcanturk @rusty1s What is the status on this PyG implementation of GPSE? The GPSE paper says, "For convenience, GPSE has also been integrated into the PyG library to facilitate downstream applications." I was hoping to try it out soon.

semihcanturk · 2024-08-16T13:00:05Z

Hi @pjspol, and thanks for checking in! The implementation here is ready and is simply awaiting a merge. I know @rusty1s was unavailable for a while, which has delayed things a bit, but I'm expecting it to be merged soon. In the meantime, you should be able to install this branch in your environment to work with GPSE: `pip install git+https://github.com/semihcanturk/pytorch_geometric.git@gpse`

luke-a-thompson · 2024-10-17T05:45:20Z

Hi @semihcanturk. I believe there is a minor ambiguity in the docs for the GPSENodeEncoder. In the expand_x option (line 663), it does not specify if the expansion will be to dim_pe_in or dim_pe_out. I assume it's dim_pe_out, but it still might be worth noting exactly which shape the output will take.

expand_x (bool, optional): Expand node features :obj:`x` from
    :obj:`dim_in` to (:obj:`dim_emb` - :obj:`dim_pe`)

Thanks
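To make the shape question concrete, here is a small, library-free sketch of the dimension bookkeeping. It assumes the reading suggested above, namely that `dim_pe` in the docstring means `dim_pe_out`, and that the projected encodings are concatenated to the (optionally expanded) node features. The function name `encoder_output_dims` and the example sizes are hypothetical and not part of PyG.

```python
def encoder_output_dims(dim_in, dim_emb, dim_pe_out, expand_x):
    """Return the feature widths implied by the docstring (assumed semantics).

    dim_in:     width of the raw node features x
    dim_emb:    target embedding width after concatenation
    dim_pe_out: width of the projected GPSE encodings
    expand_x:   if True, x is first mapped to (dim_emb - dim_pe_out)
    """
    if expand_x:
        if dim_emb <= dim_pe_out:
            # No room would be left for the node features themselves.
            raise ValueError("dim_pe_out must be smaller than dim_emb")
        x_dim = dim_emb - dim_pe_out
    else:
        # x is left as-is; the output is simply wider than x by dim_pe_out.
        x_dim = dim_in
    return {"x": x_dim, "pe": dim_pe_out, "out": x_dim + dim_pe_out}


# Illustrative sizes only.
print(encoder_output_dims(dim_in=9, dim_emb=64, dim_pe_out=32, expand_x=True))
print(encoder_output_dims(dim_in=9, dim_emb=64, dim_pe_out=32, expand_x=False))
```

Under these assumptions the concatenated output is exactly `dim_emb` wide when `expand_x=True`, and `dim_in + dim_pe_out` wide otherwise.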


semihcanturk · 2025-01-15T18:58:02Z

Hi @rusty1s, finally got around to this :) There were minor issues re: bn_eps and bn_mom args after your update which are now fixed + fixed docs (thanks for the catch @luke-a-thompson!); I've also updated the branch to latest PyG. Once again I can't pass the linting check for whatever reason, but otherwise we should be good to go here -- lmk if you need anything on my end.

zzzzzzyc · 2025-03-09T10:40:51Z

What is the status on this PyG implementation of GPSE? I think it will be very helpful for my research!

puririshi98 · 2025-03-13T22:09:10Z

@zzzzzzyc i will work with @semihcanturk to get this merged in but for the meantime you can test this branch in your own work.

please do share your experience with running this
pip uninstall -y torch-geometric; rm -rf pytorch_geometric; git clone -b gpse https://github.com/semihcanturk/pytorch_geometric.git; cd /opt/pyg/pytorch_geometric; pip install .;

I would also recommend testing in the nvidia pyg container for easiest setup
https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pyg/tags

puririshi98

based on testing with NVIDIA stack i think this is safe to merge

puririshi98 · 2025-04-01T01:21:06Z

please fix linting and hopefully other CI clears up (unrelated to you)

Following #9018, provides a comprehensive example in `examples/gpse.py` to compute GPSE encodings and use them for a graph regression task on the ZINC dataset. Two methods to compute GPSE encodings are demonstrated:

- Through `precompute_GPSE`: Given a PyG dataset, computes GPSE encodings in-place _once_ before training, without saving them to storage. Ideal if you want to compute the encodings only once per run (unlike a dataset transform) but do not want to save the pre-transformed dataset to storage (unlike a PyG pre-transform).

  To run with default pretrained weights (molpcba):
  ```
  python examples/gpse.py --gpse
  ```
  To run with pretrained weights from any other dataset, please provide the pretraining dataset name from the available options as a kwarg:
  ```
  python examples/gpse.py --gpse geom
  ```
- Through the `AddGPSE` transform: A PyG transform analogous to [AddLaplacianEigenvectorPE](https://pytorch-geometric.readthedocs.io/en/2.6.0/generated/torch_geometric.transforms.AddLaplacianEigenvectorPE.html#torch_geometric.transforms.AddLaplacianEigenvectorPE) and [AddRandomWalkPE](https://pytorch-geometric.readthedocs.io/en/2.6.0/generated/torch_geometric.transforms.AddRandomWalkPE.html#torch_geometric.transforms.AddRandomWalkPE), which can be used as a pre-transform or transform on a PyG dataset.
  ```
  python examples/gpse.py --gpse --as_transform
  ```
  Using `AddGPSE` as a transform is not recommended, since recomputing the encodings for every batch in every epoch is quite inefficient; using it as a pre-transform or going through `precompute_GPSE` is suggested instead.

In either case, `torch_geometric.nn.GPSENodeEncoder` is then used to map the GPSE encodings to the desired dimension and append them to `batch.x`, preparing them as inputs to a GNN.

This PR has been tested with the latest (25.01-py3) NVIDIA stack, and works without any issues.
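The cost difference between running the encoder as a per-access transform and precomputing once can be illustrated with a toy, library-free sketch. `CountingEncoder`, `ToyDataset`, `precompute` and `run_epochs` are hypothetical stand-ins written for this illustration; they are not PyG APIs and do not reflect how `AddGPSE` or `precompute_GPSE` are implemented internally.

```python
class CountingEncoder:
    """Stand-in for an expensive encoder; counts how often it is invoked."""

    def __init__(self):
        self.calls = 0

    def __call__(self, graph):
        self.calls += 1
        encoded = dict(graph)
        # Dummy "encoding": one constant value per node.
        encoded["pestat"] = [float(len(graph["edges"]))] * graph["num_nodes"]
        return encoded


class ToyDataset:
    """Minimal dataset that applies an optional transform on every access."""

    def __init__(self, graphs, transform=None):
        self.graphs = list(graphs)
        self.transform = transform

    def __len__(self):
        return len(self.graphs)

    def __getitem__(self, idx):
        graph = self.graphs[idx]
        return self.transform(graph) if self.transform is not None else graph


def precompute(encoder, dataset):
    """Encode every graph once and store the result in the dataset."""
    dataset.graphs = [encoder(g) for g in dataset.graphs]


def run_epochs(dataset, epochs):
    """Touch every sample once per epoch, as a training loop would."""
    for _ in range(epochs):
        for i in range(len(dataset)):
            _ = dataset[i]


graphs = [
    {"num_nodes": 3, "edges": [(0, 1), (1, 2)]},
    {"num_nodes": 2, "edges": [(0, 1)]},
]

# Mode 1: encoder used as a transform, re-run on every access.
transform_encoder = CountingEncoder()
run_epochs(ToyDataset(graphs, transform=transform_encoder), epochs=5)

# Mode 2: encodings computed once before training.
precompute_encoder = CountingEncoder()
pre_ds = ToyDataset(graphs)
precompute(precompute_encoder, pre_ds)
run_epochs(pre_ds, epochs=5)

print(transform_encoder.calls)   # 10 (2 graphs x 5 epochs)
print(precompute_encoder.calls)  # 2  (once per graph)
```

In this toy setup the transform path scales with the number of epochs, while the precompute path pays the encoding cost once per graph.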
--------- Co-authored-by: Semih Cantürk <=> Co-authored-by: rusty1s <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Rishi Puri <[email protected]> Co-authored-by: Rishi Puri <[email protected]>



Semih Cantürk added 3 commits March 5, 2024 00:06

GPSE implementation

0e84df7

rename example module

a3ac5a2

cosim_col_sep raises ValueError if no batch_idx

e4ed760

semihcanturk requested review from EdisonLeeeee and wsad1 as code owners March 5, 2024 06:27

github-actions bot added nn example transform labels Mar 5, 2024

remove examples/gpse.py, to be added as a separate PR

e7add7e

github-actions bot removed the example label Mar 7, 2024

semihcanturk mentioned this pull request Mar 7, 2024

Graph Positional and Structural Encoder #8310

Open

rusty1s assigned semihcanturk Mar 25, 2024

rusty1s added feature 0 - Priority P0 labels Mar 25, 2024

Semih Cantürk added 5 commits March 27, 2024 00:40

GPSE implementation

7726933

rename example module

6bf3042

cosim_col_sep raises ValueError if no batch_idx

8a295af

remove examples/gpse.py, to be added as a separate PR

971fa48

Merge remote-tracking branch 'origin/gpse' into gpse

f24acbc

semihcanturk requested review from a team, mananshah99 and rusty1s as code owners March 27, 2024 05:40

github-actions bot added documentation benchmark example dataset data utils labels Mar 27, 2024

Merge branch 'master' into gpse

9f000d0

update

c63fc9f

semihcanturk and others added 5 commits January 15, 2025 20:27

fix bn_eps and bn_mom args

5ed9a37

merge with latest master

8dbaa40

[pre-commit.ci] auto fixes from pre-commit.com hooks

f608bc5

for more information, see https://pre-commit.ci

fix docs

c79f1ce

Merge remote-tracking branch 'origin/gpse' into gpse

f0116dd

Merge branch 'master' into gpse

5ebb2fd

semihcanturk mentioned this pull request Mar 14, 2025

GPSE example #10118

Merged

puririshi98 self-requested a review March 14, 2025 02:13

puririshi98 and others added 2 commits March 20, 2025 10:42

Merge branch 'master' into gpse

0f7b8eb

Merge branch 'master' into gpse

a56377e

puririshi98 approved these changes Apr 1, 2025


puririshi98 enabled auto-merge (squash) April 1, 2025 01:09

fix GPSE type hint

dcc9593

auto-merge was automatically disabled April 1, 2025 18:33
Head branch was pushed to by a user without write access

semihcanturk and others added 3 commits April 2, 2025 12:24

Merge branch 'master' into gpse

f1df833

Merge branch 'master' into gpse

f12f78e

Merge branch 'master' into gpse

5c9a8f2

puririshi98 merged commit a71fe2a into pyg-team:master Apr 4, 2025
16 checks passed
