Fix #218 - unable to load OpenELM #228
Conversation
davidkoski commented on Mar 7, 2025
- OpenELM had optional layers that were always created
- see #214 (Qwen 2.5 fails to load: Key weight not found in Linear)
```diff
- @ModuleInfo(key: "q_norm") var qNorm: RMSNorm
- @ModuleInfo(key: "k_norm") var kNorm: RMSNorm
+ @ModuleInfo(key: "q_norm") var qNorm: RMSNorm?
+ @ModuleInfo(key: "k_norm") var kNorm: RMSNorm?
```
These really should have been optional based on the config.
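To illustrate the pattern (a hedged sketch, not the exact diff; the config flag and dimension names are assumptions), the fix makes the norm layers optional and only instantiates them when the configuration enables them, so checkpoints without `q_norm`/`k_norm` keys load cleanly:

```swift
// Sketch, assuming a config flag named `normalizeQkProjections` and a
// `headDimensions` property (the real names in the source may differ).
@ModuleInfo(key: "q_norm") var qNorm: RMSNorm?
@ModuleInfo(key: "k_norm") var kNorm: RMSNorm?

init(_ args: OpenELMConfiguration) {
    // Only create the modules when the config asks for them; otherwise
    // the weight loader would expect q_norm/k_norm keys that the
    // checkpoint does not contain.
    if args.normalizeQkProjections {
        self._qNorm.wrappedValue = RMSNorm(dimensions: args.headDimensions)
        self._kNorm.wrappedValue = RMSNorm(dimensions: args.headDimensions)
    }
}
```

If the flag is off, the wrapped values stay `nil` and the loader simply skips those keys.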
```swift
if !args.shareInputOutputLayers {
    self._lmHead.wrappedValue = Linear(
        args.numTransformerLayers, args.vocabularySize, bias: false)
}
```
Same -- this was the reported key.
```swift
    out = lmHead(out)
} else {
    out = matmul(out, transformer.embedTokens.weight.T)
```
Shouldn't that use `asLinear`?
Oh, good call -- it looks like there are quite a few uses of the bare property. I will file an issue for the rest.
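For reference, a sketch of the `asLinear` version (assuming MLXNN's `Embedding.asLinear(_:)`; the surrounding optional binding is an illustration, not the exact source):

```swift
if let lmHead {
    out = lmHead(out)
} else {
    // Apply the embedding matrix as an output projection. asLinear
    // keeps working when embedTokens is a QuantizedEmbedding, whereas
    // matmul(out, embedTokens.weight.T) assumes a plain dense weight.
    out = transformer.embedTokens.asLinear(out)
}
```

This is why the bare `weight` property is a hazard: it bypasses any quantized representation of the embedding.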
Oops, I forgot to commit/push that. I will address it in #231.
Thanks! I left one comment; please take a look and merge when ready!