
Conversation

@xnuohz (Contributor) commented on Oct 15, 2024

Issue

- pyg-team#9694
- pyg-team#9698

Feature Summary

- Add `MoleculeGPTDataset`
- Add `MoleculeGPT` as a GNN & LLM co-training model to PyG
- Add an example for training and testing
- Split the PR into 3 sub-PRs (pyg-team#9723, pyg-team#9724, pyg-team#9725)
- Due to limited hardware resources, `lmsys/vicuna-7b-v1.5` could not be loaded; `TinyLlama/TinyLlama-1.1B-Chat-v0.1` was used instead, and the full training pipeline was not tested
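To make the additions concrete, here is a minimal usage sketch of the new dataset; the `root` path is a placeholder, and any constructor arguments beyond `root` are assumptions rather than the dataset's confirmed API.

```python
# Minimal sketch: loading the MoleculeGPTDataset added by this PR.
# `root` is a placeholder path; arguments beyond it are assumptions.
from torch_geometric.datasets import MoleculeGPTDataset

dataset = MoleculeGPTDataset(root='data/MoleculeGPT')
print(len(dataset))  # number of molecule/instruction pairs
print(dataset[0])    # a graph `Data` object paired with instruction text
```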

@xnuohz changed the title from "[WIP] MoleculeGPT" to "Add MoleculeGPT" on Oct 21, 2024
@rusty1s (Member) commented on Oct 22, 2024

@xnuohz Looks great. Can you do us a favor and split the PR into multiple ones? I imagine we can merge the dataset, the model, and the example separately to ease reviewing.

@xnuohz (Contributor, Author) commented on Oct 23, 2024

> @xnuohz Looks great. Can you do us a favor and split the PR into multiple ones? I imagine we can merge the dataset, the model, and the example separately to ease reviewing.

@rusty1s Got it. I'll do this later.

@puririshi98 (Contributor) commented:
@xnuohz, notice that CI fails with:

    E ModuleNotFoundError: No module named 'transformers'

You need to add `@withPackage('transformers', 'sentencepiece', 'accelerate')`. Since the PyG GitHub CI does not have these packages installed by default, please run the unit test manually and share the results as a comment so we know everything works fine. At NVIDIA we do have CI that tests with these packages installed, so once your work is merged it will be maintained :)
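For context, this is what such a guard looks like in a PyG unit test; `withPackage` is the real helper from `torch_geometric.testing`, while the test body below is a hypothetical placeholder, not the PR's actual test:

```python
# Sketch of a package-guarded test. `withPackage` comes from
# torch_geometric.testing; the test body is a hypothetical placeholder.
from torch_geometric.testing import withPackage


@withPackage('transformers', 'sentencepiece', 'accelerate')
def test_molecule_gpt() -> None:
    # pytest skips this test when any listed package is missing,
    # instead of failing with "ModuleNotFoundError" at import time.
    from torch_geometric.nn.models import MoleculeGPT  # noqa: F401
```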

@xnuohz (Contributor, Author) commented on Oct 30, 2024

@puririshi98 Fixed CI and ran the unit test locally.

[screenshot: local unit-test results]

@puririshi98 (Contributor) commented:
LGTM

@puririshi98 (Contributor) commented:
Will wait for @rusty1s and @akihironitta to review/merge.

@puririshi98 (Contributor) commented:
```
root@keystone-dvt1d-023-114:/workspace/pytorch_geometric# python3 examples/llm/molecule_gpt.py
Setting up 'TinyLlama/TinyLlama-1.1B-Chat-v0.1' with configuration: {'revision': 'main', 'max_memory': {0: '93GiB'}, 'low_cpu_mem_usage': True, 'device_map': 'auto', 'torch_dtype': torch.bfloat16}
Some weights of RobertaModel were not initialized from the model checkpoint at DeepChem/ChemBERTa-77M-MTR and are newly initialized: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Total Preparation Time: 1.987466s
Training beginning...
Epoch: 1|3:   0%|          | 0/1719 [00:00<?, ?it/s]
/usr/local/lib/python3.12/dist-packages/torch/autograd/graph.py:825: UserWarning: cuDNN SDPA backward got grad_output.strides() != output.strides(), attempting to materialize a grad_output with matching strides... (Triggered internally at /opt/pytorch/pytorch/aten/src/ATen/native/cudnn/MHA.cpp:674.)
  return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
Epoch: 1|3: 100%|██████████| 1719/1719 [03:22<00:00,  8.51it/s]
Epoch: 1|3, Train loss: 1.067092, Val loss: 1.081951
Epoch: 2|3: 100%|██████████| 1719/1719 [02:59<00:00,  9.58it/s]
Epoch: 2|3, Train loss: 0.844542, Val loss: 1.037265
Epoch: 3|3: 100%|██████████| 1719/1719 [03:01<00:00,  9.48it/s]
Epoch: 3|3, Train loss: 0.812881, Val loss: 1.026247
/usr/local/lib/python3.12/dist-packages/torch/cuda/memory.py:369: FutureWarning: torch.cuda.reset_max_memory_allocated now calls torch.cuda.reset_peak_memory_stats, which resets /all/ peak memory stats.
  warnings.warn(
Total Training Time: 591.698638s
Test loss: 1.042540
Total Time: 602.299925s
```
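For reference, the `Setting up ... with configuration` line at the top of the log corresponds to a standard Hugging Face model load; below is a hedged reconstruction using plain `transformers` calls, not the example's actual code (PyG wraps this in its own LLM helper, so treat the exact wiring as an assumption):

```python
# Hedged reconstruction of the logged LLM setup using standard
# Hugging Face transformers APIs; the exact wiring is an assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = 'TinyLlama/TinyLlama-1.1B-Chat-v0.1'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    revision='main',
    max_memory={0: '93GiB'},     # cap memory use on GPU 0
    low_cpu_mem_usage=True,      # avoid a full CPU-side weight copy
    device_map='auto',           # let accelerate place the weights
    torch_dtype=torch.bfloat16,  # matches the logged dtype
)
```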

Merging since @rusty1s is busy until the new year. cc @akihironitta

@puririshi98 self-requested a review on November 20, 2024 01:44

@puririshi98 (Contributor) left a review comment:
LGTM

@puririshi98 merged commit 529237c into pyg-team:master on Nov 20, 2024
16 checks passed
@xnuohz deleted the moleculegpt/dataset branch on November 20, 2024 06:21
mattjhayes3 pushed a commit to mattjhayes3/pytorch_geometric that referenced this pull request on Dec 14, 2024

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Giovanni Gatti <[email protected]>
Co-authored-by: Rishi Puri <[email protected]>
