Model debugger upgrades #37391
Conversation
ArthurZucker left a comment
Nice!
```python
if hasattr(value, "_local_tensor"):
    # DTensor-like handling, just use local tensor attribute
    return {
```

```python
torch.set_printoptions(sci_mode=True)
```
we could have the max line width increased as well!
it's done in the `repr_to_list` method, will unify this ;)
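For context, a minimal sketch of how the print options could be widened; the `linewidth` value here is illustrative, not the one the PR settles on:

```python
import torch

# Use scientific notation and a wider line width so tensor reprs
# are not wrapped mid-row in the debug output.
torch.set_printoptions(sci_mode=True, linewidth=200)
```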
Can we make sure hooks are removed after we exit the context manager?
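A minimal sketch of the pattern being asked about, assuming hooks are registered with `register_forward_hook`; the names here are illustrative, not the PR's actual implementation:

```python
import contextlib
from torch import nn

@contextlib.contextmanager
def debug_hooks(model: nn.Module, hook_fn):
    # Register a forward hook on every submodule and keep the handles
    # so we can detach them when the context exits.
    handles = [module.register_forward_hook(hook_fn) for module in model.modules()]
    try:
        yield model
    finally:
        # Guarantees hooks are removed even if the forward pass raises.
        for handle in handles:
            handle.remove()
```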
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
cc @eustlb @zucchini-nlp @qubvel @yonigozlan, my fellow model adders, could be helpful! Check out the doc (and @Cyrilvallez but I think you've seen/used it already maybe)
Hey @molbap, I just had a random thought while reviewing a model and was thinking about your super nice util, so dropping it here right now as I'm seeing the tag (this is NOT intended as a follow-up to your util hahaha, just to see if it could be helpful for everyone/others have ideas about that): another super cool helper IMO would be some kind of library scanner to find close models for modular. E.g., I want to find an MLP with only 2 Linear layers and Dropout -> the helper finds the related models
Following @Cyrilvallez's idea, it might indeed be very helpful to run such a tool across the whole library to find modules that have differently named attributes but are actually identical. Unfortunately, we cannot rename attributes because it would break weights loading (or we would need a hook for renaming), but at least we can choose one as a standard and leave comments for the rest, indicating e.g. that LlamaMLP is identical to this one and is the preferred choice for modular purposes
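A rough sketch of what such a scanner could look like, purely as an illustration of the idea; the matching criterion and function names are hypothetical:

```python
from torch import nn

def matches_mlp_pattern(module: nn.Module) -> bool:
    # Hypothetical criterion: exactly two Linear layers and one Dropout
    # among the direct children, as in the example above.
    children = list(module.children())
    n_linear = sum(isinstance(c, nn.Linear) for c in children)
    n_dropout = sum(isinstance(c, nn.Dropout) for c in children)
    return n_linear == 2 and n_dropout == 1

def find_candidates(model: nn.Module) -> list[str]:
    # Scan all submodules and report the names of those matching the pattern.
    return [name for name, m in model.named_modules() if matches_mlp_pattern(m)]
```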
qubvel left a comment
Nice, thanks for updating!
> The debugger will only output the first and last layer of a sequence of layers.

Is this a configurable option? I remember for mllama we messed up the self/cross-attention layer order, so the diff appears after layer 4.
```python
if is_vision_available():
    pass
```
can be removed
indeed, an artifact of `make fixup`
```python
from torch import nn


class ToyModel(nn.Module):
```
This should also be under the `if is_torch_available()` guard; otherwise we don't need a guard at all
ah indeed!
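For reference, a minimal sketch of the guard pattern being discussed, with the torch-dependent definitions kept inside the `is_torch_available()` branch (the class body here is illustrative):

```python
from transformers.utils import is_torch_available

if is_torch_available():
    import torch
    from torch import nn

    class ToyModel(nn.Module):
        # Toy module defined only when torch is importable,
        # so the test file can still be collected without torch.
        def __init__(self):
            super().__init__()
            self.linear = nn.Linear(4, 4)

        def forward(self, x):
            return self.linear(x)
```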
```md
[[autodoc]] model_addition_debugger
### Reading results
```
Comment for the above lines actually, but can't comment there.
```python
# call forward method (not .generate!)
with model_addition_debugger_context(model, "optional_path_to_your_output_file.json"):
    output = model.forward(**inputs)
```
Should we call `model.forward`? Why not `model(**inputs)`? Are we avoiding the top-level hooks for some reason?
no, top-level works as well; it's just to be explicit and contrast `model.forward` with `model.generate`! Can explain (as the latter would create a several-hundred-MB JSON file...)
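In other words, calling the model directly under the context manager is equivalent; a minimal sketch (the path and inputs are placeholders):

```python
with model_addition_debugger_context(model, "optional_path_to_your_output_file.json"):
    # Equivalent to model.forward(**inputs): nn.Module.__call__
    # dispatches to forward, so the debugger hooks still fire.
    output = model(**inputs)
```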
Ah that's true, I'll add a configuration option now to output all the layers and add it to the doc. For the
Added:
merging!
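A hedged sketch of how the new option might be passed, assuming a keyword like `do_prune_layers`; the parameter name is an assumption based on the commit message "add tests and configurable pruning of layers":

```python
# Hypothetical usage: disable layer pruning so every layer is logged,
# not just the first and last of a repeated sequence.
with model_addition_debugger_context(
    model,
    "optional_path_to_your_output_file.json",
    do_prune_layers=False,  # assumed parameter name
):
    output = model.forward(**inputs)
```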
Very nice!
* debugging improvements
* add debugging details
* add more debugging details
* debug more
* clean up layers + output
* add summary json file
* cleanup
* copies 👀
* remove hooks + add documentation
* draft a small test, why not
* respect the format (respect it)
* fixup imports
* nit
* add tests and configurable pruning of layers

What does this PR do?
A continuation of #36798, now:
- the `..._SUMMARY.json` file will contain only statistics, not full tensors.
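As an illustration of the kind of per-tensor statistics such a summary could hold; the exact fields the debugger writes are not shown in this thread, so these are assumptions:

```python
import torch

def tensor_summary(t: torch.Tensor) -> dict:
    # Compact, JSON-serializable statistics instead of the full tensor.
    return {
        "shape": list(t.shape),
        "dtype": str(t.dtype),
        "mean": t.float().mean().item(),
        "std": t.float().std().item(),
        "min": t.float().min().item(),
        "max": t.float().max().item(),
    }
```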