Add missing Block size + Update Configs to not hardcode rope_scaling #1128
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1128
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 7cdb226 with merge base 5986ed2.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
torchchat/model.py (Outdated)

```python
old_context_len = 8192  # original llama3 length

def apply_scaling(freqs: torch.Tensor, rope_scaling: Dict[str, Any]):
    # Check for the presence of the required keys
    assert set(rope_scaling.keys()) >= {"factor", "low_freq_factor", "high_freq_factor", "original_max_position_embeddings"}
```
Could this be clearer with issubset and a raised ValueError, for a more informative error than a bare assert?

```python
required_keys = {"factor", "low_freq_factor", "high_freq_factor", "original_max_position_embeddings"}
if not required_keys.issubset(rope_scaling.keys()):
    raise ValueError(f"Missing required keys in apply_scaling. Expected: {required_keys}")
```
lgtm!
1 - I know I wasn't asked to review, but since we did a lot of work getting rope embeddings happy for distributed, I wanted to check this.
2 - Verified there is no issue with a distributed run.
3 - Left a minor suggestion to use issubset, which could read a bit more clearly.
Thanks for the review, always welcome!! Good idea with the ValueError raise.
Previously, block_size was not included in the model_params.json of llama3/3.1 models, so a hardcoded default of 2048 was used. This PR adds block_size to those configs.
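For illustration, here is roughly what a Llama 3.1 config entry could carry once block_size and rope_scaling are explicit. This is a sketch only: the values reflect the published Llama 3.1 8B defaults and the field names are illustrative, not copied from this PR.

```python
# Hypothetical sketch of a llama 3.1 model_params.json entry, written as a
# Python dict for readability; values are the published Llama 3.1 8B defaults,
# not copied from this PR.
llama3_1_8b_params = {
    "dim": 4096,
    "n_layers": 32,
    "n_heads": 32,
    "block_size": 131072,  # previously absent, so a hardcoded 2048 was used
    "rope_scaling": {
        "factor": 8.0,
        "low_freq_factor": 1.0,
        "high_freq_factor": 4.0,
        "original_max_position_embeddings": 8192,
    },
}
```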
For rope_scaling (llama 3.1), the parameters were hardcoded inside apply_scaling() instead of being taken in as TransformerArgs built from model_params. The hardcoded values happen to line up for 3.1, but this PR makes them an explicit read from the config.
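A minimal sketch of what the parameterized apply_scaling could look like, assuming the standard Llama 3.1 RoPE frequency-scaling formula and the key names from the diff above (the ValueError check follows the reviewer's suggestion); the actual torchchat implementation may differ in its details:

```python
import math
from typing import Any, Dict

import torch


def apply_scaling(freqs: torch.Tensor, rope_scaling: Dict[str, Any]) -> torch.Tensor:
    # Validate that the config supplies every parameter we need.
    required_keys = {"factor", "low_freq_factor", "high_freq_factor", "original_max_position_embeddings"}
    if not required_keys.issubset(rope_scaling.keys()):
        raise ValueError(f"Missing required keys in apply_scaling. Expected: {required_keys}")

    scale_factor = rope_scaling["factor"]
    low_freq_factor = rope_scaling["low_freq_factor"]
    high_freq_factor = rope_scaling["high_freq_factor"]
    old_context_len = rope_scaling["original_max_position_embeddings"]

    low_freq_wavelen = old_context_len / low_freq_factor
    high_freq_wavelen = old_context_len / high_freq_factor

    new_freqs = []
    for freq in freqs:
        wavelen = 2 * math.pi / freq
        if wavelen < high_freq_wavelen:
            # High-frequency bands: leave unscaled.
            new_freqs.append(freq)
        elif wavelen > low_freq_wavelen:
            # Low-frequency bands: scale down by the full factor.
            new_freqs.append(freq / scale_factor)
        else:
            # Mid-range bands: smoothly interpolate between the two regimes.
            smooth = (old_context_len / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor
            )
            new_freqs.append((1 - smooth) * freq / scale_factor + smooth * freq)
    return torch.tensor(new_freqs, dtype=freqs.dtype, device=freqs.device)
```

With this shape, rope_scaling would presumably be populated on TransformerArgs from model_params.json and threaded through to wherever the rotary frequencies are precomputed, rather than being baked into the function body.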