Open
Labels: new-feature (New feature or request)
Description
Hello,
Could we please have 13B and 7B models with the updated architecture that includes grouped-query attention (GQA)? Many people run these models on machines with limited memory, and GQA would help them use a larger context. Right now, a context of 4096 needs too much memory to run with good speed and quality on most common hardware.
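To give a rough sense of the memory involved, here is a back-of-the-envelope sketch of the key/value cache size at a 4096-token context. It is only an estimate under stated assumptions: an fp16 cache (2 bytes per element), batch size 1, a head dimension of 128, and the published layer and head counts for the 7B (32 layers, 32 heads) and 13B (40 layers, 40 heads) models. The 8 KV heads used for the GQA case is a hypothetical setting borrowed from the 70B configuration, not a confirmed design for smaller models, and `kv_cache_bytes` is an illustrative helper, not part of any library.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   bytes_per_elem=2, batch=1):
    """Estimate KV cache size: one K and one V tensor per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem * batch


GIB = 1024 ** 3

# (name, layers, attention heads); head_dim is 128 for both models.
models = [("7B", 32, 32), ("13B", 40, 40)]

# Hypothetical GQA setting, borrowed from the 70B configuration.
ASSUMED_GQA_KV_HEADS = 8

for name, n_layers, n_heads in models:
    # Standard multi-head attention: one KV head per query head.
    mha = kv_cache_bytes(n_layers, n_heads, 128, 4096)
    # Grouped-query attention: several query heads share one KV head.
    gqa = kv_cache_bytes(n_layers, ASSUMED_GQA_KV_HEADS, 128, 4096)
    print(f"{name}: MHA {mha / GIB:.3f} GiB, GQA {gqa / GIB:.3f} GiB")
```

Under these assumptions the cache shrinks in proportion to the number of KV heads, so the saving grows with context length and batch size. Model weights and activations are not included in this estimate.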
Thank you!