Conversation
@semihcanturk (Contributor) commented Mar 14, 2025

Following #9018, this PR provides a comprehensive example in `examples/gpse.py` that computes GPSE encodings and uses them for a graph regression task on the ZINC dataset. Two methods of computing GPSE encodings are demonstrated:

- Through `precompute_GPSE`: Given a PyG dataset, computes GPSE encodings in-place _once_ before training, without saving them to storage. Ideal if you want to compute the encodings only once per run (unlike a dataset transform) but do not want to save the pre-transformed dataset to storage (unlike a PyG pre-transform).

  To run with the default pretrained weights (`molpcba`):
  ```
  python examples/gpse.py --gpse
  ```
  To run with pretrained weights from any other dataset, provide the pretraining dataset name from the available options as an argument:
  ```
  python examples/gpse.py --gpse geom
  ```

- Through the `AddGPSE` transform: A PyG transform analogous to [AddLaplacianEigenvectorPE](https://pytorch-geometric.readthedocs.io/en/2.6.0/generated/torch_geometric.transforms.AddLaplacianEigenvectorPE.html#torch_geometric.transforms.AddLaplacianEigenvectorPE) and [AddRandomWalkPE](https://pytorch-geometric.readthedocs.io/en/2.6.0/generated/torch_geometric.transforms.AddRandomWalkPE.html#torch_geometric.transforms.AddRandomWalkPE), which can be used as a transform or pre-transform on a PyG dataset:
  ```
  python examples/gpse.py --gpse --as_transform
  ```

Using `AddGPSE` as a transform is not recommended, since recomputing the encodings for every batch in every epoch is quite inefficient; using it as a pre-transform or through `precompute_GPSE` is suggested instead. In either case, `torch_geometric.nn.GPSENodeEncoder` is then used to map the GPSE encodings to the desired dimension and append them to `batch.x`, preparing them as inputs to a GNN.
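To make that data flow concrete, here is a dependency-free sketch of what `GPSENodeEncoder` conceptually does: linearly map the precomputed encodings (stored as `batch.pestat_GPSE`) to a target dimension and concatenate them onto the node features. The real module is a `torch.nn.Module` operating on tensors; the plain-list arithmetic, dimensions, and weight values below are illustrative assumptions, not the actual PyG implementation.

```python
# Hedged, stdlib-only sketch of the GPSENodeEncoder idea: map precomputed
# GPSE encodings to a target dimension with a linear layer, then append the
# result to the existing node features. Plain lists stand in for tensors so
# the sketch runs without torch; all dimensions are illustrative.

def linear_map(rows, weight):
    """Multiply each row (list of floats) by a weight matrix (in_dim x out_dim)."""
    out_dim = len(weight[0])
    return [[sum(r[i] * weight[i][j] for i in range(len(r)))
             for j in range(out_dim)] for r in rows]

def append_encodings(x, pestat, weight):
    """Concatenate linearly mapped encodings onto node features, row by row."""
    mapped = linear_map(pestat, weight)
    return [xi + mi for xi, mi in zip(x, mapped)]

# Toy data: 2 nodes, 3-dim node features, 4-dim precomputed GPSE encodings,
# mapped down to 2 dims before concatenation.
x = [[1.0, 0.0, 2.0], [0.5, 1.0, 0.0]]
pestat_GPSE = [[1.0, 2.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]  # 4 x 2

x_out = append_encodings(x, pestat_GPSE, weight)
# Each output row now has 3 + 2 = 5 entries.
```

In the actual example, the concatenated features then serve as the GNN input in place of the raw `batch.x`.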

This PR has been tested with the latest (25.01-py3) NVIDIA stack and works without any issues.

puririshi98 added a commit that referenced this pull request Apr 4, 2025
[Graph Positional and Structural
Encoder](https://arxiv.org/abs/2307.07107) implementation as per #8310.
Adapted from the original repository:
https://github.com/G-Taxonomy-Workgroup/GPSE. This version is a
standalone implementation that is decoupled from GraphGym, and thus aims
for better accessibility and a smoother integration into PyG. While the
priority of this PR is to enable loading and using pre-trained models in
plug-and-play fashion, it also includes the custom loss function used to
train the model. Nevertheless, it might be easier to use the original
repository for pre-training and fine-tuning new GPSE models for the time
being.

This PR includes the following:

- `GPSE`: The main GPSE module, which generates learned encodings for
input graphs.
- Several helper classes (`FeatureEncoder`, `GNNStackStage`,
`IdentityHead`, `GNNInductiveHybridMultiHead`,
`ResGatedGCNConvGraphGymLayer`, `Linear`, `MLP`, `GeneralMultiLayer`,
`GeneralLayer`, `BatchNorm1dNode`, `BatchNorm1dEdge`,
`VirtualNodePatchSingleton`) and wrapper functions (`GNNPreMP`,
`GNNLayer`), all adapted from their GraphGym versions for compatibility
and enabling the loading of weights pre-trained using the
GraphGym/original version.
- The class method `GPSE.from_pretrained()` that returns a model with
pre-trained weights from the original repository/Zenodo files.
- `GPSENodeEncoder`, a helper linear/MLP encoder that takes the GPSE
encodings precomputed as `batch.pestat_GPSE` in the input graphs, maps
them to a desired dimension, and appends them to node features.
- `precompute_GPSE`, a function that takes in a GPSE model and a
dataset, and precomputes GPSE encodings in-place for the given dataset
using the helper function `gpse_process_batch`.
- The transform `AddGPSE`, which, in similar fashion to
`AddLaplacianEigenvectorPE` and `AddRandomWalkPE`, adds the GPSE
encodings to a given graph using the helper function `gpse_process`.
- The testing modules `test/test_gpse.py` and `test/test_add_gpse.py`.
- The loss function `gpse_loss` and helper functions `cosim_col_sep` and
`process_batch_idx` used in GPSE training.
- A comprehensive example in `examples/gpse.py` is provided as a
separate PR in #10118.
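Since the PR ships `gpse_loss` together with the helper `cosim_col_sep`, a brief sketch of the "column-separated cosine similarity" idea that the helper's name suggests may be useful: compare predicted and target encoding matrices one encoding dimension (column) at a time. This is a hedged, stdlib-only illustration; the actual loss in PyG may differ in signature, masking, and reduction.

```python
import math

# Hedged illustration of a column-separated cosine similarity: score a
# predicted encoding matrix against a target matrix per column, i.e. per
# encoding dimension. Not the actual PyG/GPSE implementation.

def column(m, j):
    return [row[j] for row in m]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def cosim_per_column(pred, target):
    """Cosine similarity of each column of `pred` against `target`."""
    n_cols = len(pred[0])
    return [cosine(column(pred, j), column(target, j)) for j in range(n_cols)]

pred = [[1.0, 0.0], [0.0, 2.0]]
target = [[2.0, 0.0], [0.0, 1.0]]
sims = cosim_per_column(pred, target)
# Columns that point in the same direction score 1.0 regardless of scale.
```

A loss would then be built from these per-column similarities (e.g. by averaging `1 - sim` over columns), which rewards matching the direction of each encoding dimension rather than its magnitude.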

This PR has been tested with the latest (25.01-py3) NVIDIA stack and
works without any issues.

---------

Co-authored-by: Semih Cantürk <=>
Co-authored-by: rusty1s <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Rishi Puri <[email protected]>
Co-authored-by: Rishi Puri <[email protected]>
@puririshi98 (Contributor) left a comment:
LGTM, thanks!

@puririshi98 puririshi98 enabled auto-merge (squash) April 22, 2025 19:45
auto-merge was automatically disabled April 22, 2025 19:49 (invalid email address)

@puririshi98 puririshi98 merged commit 7e078e6 into pyg-team:master Apr 22, 2025
17 checks passed
chrisn-pik pushed a commit to chrisn-pik/pytorch_geometric that referenced this pull request Jun 30, 2025
chrisn-pik pushed a commit to chrisn-pik/pytorch_geometric that referenced this pull request Jun 30, 2025