[WIP] gpt_fused #189
Conversation
For example

```
PYTHONPATH=/home/cpuhrsch/local/ao/torchao/prototype/models/gpt_fused CUDA_VISIBLE_DEVICES=0 numactl --membind 0 --cpubind 0 python generate.py --compile --checkpoint_path checkpoints/$MODEL_REPO/model.pth --prompt "Hello, my name is"
```
What's going on here lol, why do I need to set the Python path?
So that the import statements in gpt-fast pick up on the location of model.py in torchao.
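The mechanism at work here, as a minimal self-contained sketch (the temp directory and the `WHICH` marker are stand-ins for illustration, not the PR's files): Python prepends `PYTHONPATH` entries to `sys.path`, so a bare `import model` in gpt-fast's generate.py resolves to the model.py that lives in the directory named on `PYTHONPATH`.

```python
import os
import subprocess
import sys
import tempfile

# Create a stand-in for torchao's gpt_fused/model.py in a temp dir,
# then show that PYTHONPATH makes "import model" resolve to it.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "model.py"), "w") as f:
        f.write("WHICH = 'gpt_fused'\n")

    env = dict(os.environ, PYTHONPATH=d)
    out = subprocess.run(
        [sys.executable, "-c", "import model; print(model.WHICH)"],
        env=env, cwd=d, capture_output=True, text=True,
    )

print(out.stdout.strip())  # the gpt_fused stand-in wins the import
```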
```
@@ -0,0 +1,13 @@
## gpt-fused

A more handwritten version of [gpt-fast](https://github.com/pytorch-labs/gpt-fast)'s model.py for us to experiment with.
```
wdym by more handwritten?
We could use this to try various fused kernels (Triton or CUDA).
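To illustrate what fusion buys, here is a pure-Python sketch with a made-up GELU example (not code from this PR): a hand-fused kernel computes the result in one pass instead of materializing an intermediate buffer, which is the main win for memory-bound elementwise ops that handwritten Triton/CUDA kernels target.

```python
import math

def gelu_unfused(xs):
    # Two passes over the data: one materializes an intermediate list,
    # a second applies the scale. This mirrors two separate kernel launches.
    tmp = [math.erf(x / math.sqrt(2.0)) for x in xs]            # pass 1
    return [0.5 * x * (1.0 + t) for x, t in zip(xs, tmp)]       # pass 2

def gelu_fused(xs):
    # One pass: the intermediate never materializes, so the data is
    # read and written exactly once.
    return [0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0))) for x in xs]
```

Both produce identical results; the fused version simply avoids the round trip through memory for the intermediate.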
```
@@ -0,0 +1,255 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
```
Could you add a file in test, or a benchmark script, that would just sanity check that the script works with real and random weights?

Also, why the prototype namespace? I think torchao.models.gpt is better; I expect a lot of people will use this as is.
Using prototype just to get started. Yes, we can add a benchmark script. I'll work on that next.
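A benchmark script along those lines could start from a small timing helper like the following sketch (the `benchmark` helper and its parameters are hypothetical, not from this PR):

```python
import statistics
import time

def benchmark(fn, *args, warmup=3, iters=10):
    """Return the median wall-clock seconds of fn(*args) over `iters`
    timed runs, after `warmup` untimed runs to amortize startup cost."""
    for _ in range(warmup):
        fn(*args)
    times = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn(*args)
        times.append(time.perf_counter() - t0)
    return statistics.median(times)

# Usage with any forward-pass stand-in:
elapsed = benchmark(sum, range(10_000))
```

For GPU work the timed callable would need to synchronize the device before and after, otherwise the timer measures kernel launch rather than execution.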
OK! Feel free to do the benchmark script and namespace change in a future PR
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/189
Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures. As of commit de8400d with merge base e7bbbd2, the following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.
* convert group_size to groupsize
* group_size to groupsize in README.md
A torchao version of gpt-fast's model.py for experimentation.
Currently just a copy-paste of gpt-fast's model.py to get feedback on the idea.