Simplify Tensor Parallel implementation with PyTorch TP #34184
Diff excerpt (`LlamaConfig`):

```diff
@@ -141,6 +141,16 @@ class LlamaConfig(PretrainedConfig):
     model_type = "llama"
     keys_to_ignore_at_inference = ["past_key_values"]
+    # Default tensor parallel plan for base model `LlamaModel`
+    _base_model_tp_plan = {
```
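The concrete plan entries are cut off in the excerpt above. For readers unfamiliar with the idea, a plan of this kind is a mapping from submodule name patterns to partitioning styles; the sketch below only illustrates that shape, and its keys and style strings are assumptions rather than the values this PR actually ships:

```python
# Illustrative only: not the actual plan from this PR (which is truncated above).
# Maps submodule name patterns to a sharding style for each weight matrix.
_example_tp_plan = {
    "layers.*.self_attn.q_proj": "colwise",   # shard the projection over columns (assumed style name)
    "layers.*.self_attn.o_proj": "rowwise",   # shard over rows, all-reduce the output (assumed)
    "layers.*.mlp.up_proj": "colwise",
    "layers.*.mlp.down_proj": "rowwise",
}
```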
Review comment on `class PretrainedConfig(PushToHubMixin):`

`{}`, which I believe is the best possible default for any config subclass to inherit.
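A minimal sketch of how that suggestion reads, assuming the default is simply a class attribute on the base config that model-specific configs override (the class bodies are schematic, not the real transformers code):

```python
# Schematic: in transformers, PretrainedConfig also inherits from PushToHubMixin.
class PretrainedConfig:
    # Suggested default: an empty plan, inherited by every config subclass.
    base_model_tp_plan = {}


class LlamaConfig(PretrainedConfig):
    model_type = "llama"
    # A model-specific config replaces the empty default with its own plan
    # (the single entry below is illustrative, not the PR's actual mapping).
    base_model_tp_plan = {"layers.*.self_attn.q_proj": "colwise"}
```

One caveat with a mutable `{}` class attribute is that subclasses should replace it with a new dict rather than mutate it in place, otherwise every config sharing the default would see the change.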
Thanks @kmehant @ArthurZucker for the suggestion. I moved base_model_tp_plan to PretrainedConfig in the latest commit.
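For background on the PyTorch TP side of the title: the native API such a plan ultimately maps onto is `torch.distributed.tensor.parallel.parallelize_module`, which takes a dict from submodule names to parallel-style objects. The sketch below uses a toy module and assumes the process group has already been launched with `torchrun`; how transformers translates its string-based plan into these style objects is not shown in this excerpt:

```python
import torch
import torch.nn as nn
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor.parallel import (
    ColwiseParallel,
    RowwiseParallel,
    parallelize_module,
)


class MLP(nn.Module):
    """Toy stand-in for one transformer MLP block."""

    def __init__(self, dim: int = 16):
        super().__init__()
        self.up_proj = nn.Linear(dim, 4 * dim)
        self.down_proj = nn.Linear(4 * dim, dim)

    def forward(self, x):
        return self.down_proj(torch.relu(self.up_proj(x)))


# Assumes `torchrun --nproc-per-node 2 ...` has already started two ranks.
mesh = init_device_mesh("cuda", (2,))

# Column-parallel up projection followed by row-parallel down projection:
# the classic Megatron-style split that needs only one all-reduce per block.
tp_plan = {
    "up_proj": ColwiseParallel(),
    "down_proj": RowwiseParallel(),
}
sharded_mlp = parallelize_module(MLP().cuda(), mesh, tp_plan)
```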