Update int4pack related in torchchat gguf #1404


Merged: 5 commits merged on Dec 17, 2024

Conversation

@yanbing-j (Contributor) commented Dec 9, 2024

pytorch-bot bot commented Dec 9, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1404

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 4 New Failures

As of commit e7b6f14 with merge base bb72b09 (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label (managed by the Meta Open Source bot) on Dec 9, 2024
@Jack-Khuu (Contributor) commented Dec 9, 2024

I'll bump the pin real quick

#1407

@@ -24,6 +24,9 @@
pack_scales_and_zeros,
)

from torchao.dtypes.utils import is_device
from torchao.utils import TORCH_VERSION_AT_LEAST_2_6
Contributor:

torchchat locks onto a specific torch version, so we don't need to check

Assume > 2.6
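The flag being discussed gates code on the installed PyTorch version. As an illustration only (this is a hypothetical stand-in, not torchao's actual implementation of `TORCH_VERSION_AT_LEAST_2_6`), such a flag is typically derived by comparing the leading components of the version string:

```python
def parse_major_minor(version: str) -> tuple:
    """Turn a version string like '2.6.0.dev20241126' into (2, 6).

    Strips any local suffix (after '+') and ignores everything past
    the minor component, so nightly/dev builds compare correctly.
    """
    parts = version.split("+")[0].split(".")
    nums = []
    for p in parts[:2]:
        digits = "".join(ch for ch in p if ch.isdigit())
        nums.append(int(digits) if digits else 0)
    return tuple(nums)

def at_least_2_6(torch_version: str) -> bool:
    """Sketch of a TORCH_VERSION_AT_LEAST_2_6-style check."""
    return parse_major_minor(torch_version) >= (2, 6)
```

Under this scheme a 2.6 nightly such as `2.6.0.dev20241126` passes the check while a 2.5 release does not, which is why an older pinned nightly can make the gate (or the import of the flag itself) fail in CI.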

Contributor Author:

The CI failures suggest that the torchao version is not new enough, because TORCH_VERSION_AT_LEAST_2_6 was added only recently. I also saw that you pin the PyTorch nightly to 20241013, which is likewise not new and does not include pytorch/pytorch#139611. That is my question, because the nightly used in the CI is 20241126.

Contributor:

yup, working on the bump here: #1367

We'll test your fixes on there

Contributor Author:

Thanks!

weight = torch.empty(
(
out_features,
in_features // 2,
Contributor:

nice
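The `in_features // 2` dimension in the snippet above comes from packing two int4 values into each byte. A minimal pure-Python sketch of that idea (illustrative only; the real packing is done by backend-specific kernels such as `torch.ops.aten._convert_weight_to_int4pack`, whose exact layout may differ):

```python
def pack_int4_pairs(values):
    """Pack a flat list of 4-bit ints (0..15) into bytes,
    two values per byte: low nibble first, high nibble second.
    Output length is len(values) // 2, mirroring the
    (out_features, in_features // 2) packed weight shape."""
    assert len(values) % 2 == 0, "need an even number of int4 values"
    packed = []
    for lo, hi in zip(values[0::2], values[1::2]):
        assert 0 <= lo <= 15 and 0 <= hi <= 15
        packed.append((hi << 4) | lo)
    return bytes(packed)

def unpack_int4_pairs(packed):
    """Inverse of pack_int4_pairs."""
    values = []
    for b in packed:
        values.append(b & 0x0F)         # low nibble
        values.append((b >> 4) & 0x0F)  # high nibble
    return values
```

For example, `pack_int4_pairs([1, 2, 15, 0])` yields `bytes([0x21, 0x0F])`: four int4 values occupy two bytes, halving the inner dimension.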

@@ -623,7 +655,7 @@ def load_model_and_state_dict(
in_features=in_features,
out_features=out_features,
bias=False,
device="meta",
device="cpu",
Contributor:

Let's keep this as a meta device as long as we can
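The reviewer's preference for the meta device is that meta tensors carry only shape and dtype metadata with no backing storage, so a placeholder module costs nothing until real weights are loaded. A small sketch of the pattern (assumes a recent PyTorch; `Module.to_empty` is the standard way to materialize a meta module):

```python
import torch

# Build a module on the meta device: shapes/dtypes only, no storage.
with torch.device("meta"):
    linear = torch.nn.Linear(8, 4, bias=False)
assert linear.weight.is_meta  # no memory has been allocated yet

# Materialize uninitialized storage on a real device once we are
# ready to copy actual weights in (e.g. from a checkpoint or GGUF).
linear = linear.to_empty(device="cpu")
assert not linear.weight.is_meta
```

Switching the placeholder to `device="cpu"` up front works too, but it allocates real memory for weights that will immediately be overwritten, which is why staying on meta as long as possible is preferred.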


@Jack-Khuu Jack-Khuu added the Quantization Issues related to Quantization or torchao label Dec 10, 2024
@Jack-Khuu (Contributor):
I did a quick rebase for you; feel free to change as needed

@yanbing-j (Contributor, Author):
Hi @Jack-Khuu, thanks for the rebase! The remaining 4 CI failures seem related to the cuda device, and I can't see any obvious errors related to the Int4 code change. Could you please help me find a simple reproducer? Thanks!

@Jack-Khuu (Contributor):
The cuda failures are known issues.

Thanks for the fix!

@Jack-Khuu Jack-Khuu merged commit 56be609 into pytorch:main Dec 17, 2024
49 of 53 checks passed
@yanbing-j yanbing-j deleted the yanbing/fix_1389 branch December 17, 2024 05:03
vmpuri pushed a commit that referenced this pull request Feb 4, 2025
* Update int4pack related for gguf

* Update gguf_loader.py

---------

Co-authored-by: Jack-Khuu <[email protected]>
Labels
CLA Signed This label is managed by the Meta Open Source bot. Quantization Issues related to Quantization or torchao
Development

Successfully merging this pull request may close these issues.

Working around new int4wo weight packing
3 participants