Conversation

@molbap
Contributor

@molbap molbap commented Apr 9, 2025

What does this PR do?

A continuation of #36798, now:

  • The debugger will only output the first and last layer of a sequence of layers.
  • mean/stds are added as well, and a ..._SUMMARY.json file will contain only statistics, not full tensors.
  • General printing improvements.
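The summary-vs-full-tensor split above can be sketched roughly like this (a torch-free illustration over plain lists; `tensor_summary` and its exact fields are my invention, not the PR's API):

```python
import statistics

def tensor_summary(values):
    # hypothetical sketch: a ..._SUMMARY.json entry keeps the shape and a few
    # statistics (mean/std) instead of serializing the full tensor values
    return {
        "shape": [len(values)],
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
    }
```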

Collaborator

@ArthurZucker ArthurZucker left a comment


Nice!

if hasattr(value, "_local_tensor"):
    # DTensor-like handling, just use local tensor attribute
    return {
torch.set_printoptions(sci_mode=True)
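The `_local_tensor` duck-typing in the snippet above boils down to something like this (`to_local` is a hypothetical name for illustration, not the PR's helper):

```python
def to_local(value):
    # DTensor-like objects expose their local shard via `_local_tensor`;
    # plain tensors (or anything else) pass through unchanged
    return getattr(value, "_local_tensor", value)
```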
Collaborator

we could have max line width increased as well!

Contributor Author

it's done in the repr_to_list method, will unify this ;)

@ArthurZucker
Collaborator

Can we make sure hooks are removed after we exit the context manager?
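One way to guarantee that is a try/finally inside the context manager; a torch-free sketch of the pattern (`ToyModule`, `_Handle`, and `debugger_context` are stand-ins for illustration, not the PR's actual implementation):

```python
from contextlib import contextmanager

class _Handle:
    # minimal stand-in for the handle torch's register_forward_hook returns
    def __init__(self, hooks, fn):
        self._hooks, self._fn = hooks, fn

    def remove(self):
        self._hooks.remove(self._fn)

class ToyModule:
    # torch-free toy with a torch-like hook API
    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, fn):
        self._forward_hooks.append(fn)
        return _Handle(self._forward_hooks, fn)

@contextmanager
def debugger_context(module):
    handle = module.register_forward_hook(lambda *args: None)
    try:
        yield module
    finally:
        # removal runs even if the body raises, so no hooks leak after exit
        handle.remove()
```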

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@molbap molbap marked this pull request as ready for review April 17, 2025 15:31
@molbap
Contributor Author

molbap commented Apr 18, 2025

cc @eustlb @zucchini-nlp @qubvel @yonigozlan , my fellow model adders, could be helpful! Check out the doc (and @Cyrilvallez but I think you've seen/used it already maybe)

@Cyrilvallez
Member

Hey @molbap, I just had a random thought while reviewing a model and was thinking about your super nice util, so dropping it here right now as I'm seeing the tag (this is NOT intended as a follow-up to your util hahaha, just to see if it could be helpful for everyone/others have ideas about that):

Another super cool helper IMO would be some kind of library scanner to find close models for modular. E.g., I want to find an MLP with only 2 Linear layers and Dropout -> the helper finds the related models.
As far as I know, we cannot easily do this with usual IDE search tools, as most of the time we don't know (and don't really care, since we usually have a weight converter anyway) the actual names of those layers.
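A minimal sketch of such a scanner primitive, assuming we fingerprint modules by the class names of their direct children rather than by attribute names (every name below is invented for illustration; a real version would walk `nn.Module.children()` across the library):

```python
from collections import Counter

class Linear:  # stand-ins for nn.Linear / nn.Dropout
    pass

class Dropout:
    pass

def structure_fingerprint(module):
    # fingerprint = multiset of child class names, so two modules match even
    # when their attributes are named differently
    return tuple(sorted(Counter(type(c).__name__ for c in module.children()).items()))

class ToyMLPA:
    def __init__(self):
        self.fc1, self.fc2, self.drop = Linear(), Linear(), Dropout()

    def children(self):
        return [self.fc1, self.fc2, self.drop]

class ToyMLPB:
    def __init__(self):
        self.up_proj, self.down_proj, self.dropout = Linear(), Linear(), Dropout()

    def children(self):
        return [self.up_proj, self.down_proj, self.dropout]
```

Two MLPs with different attribute names but the same structure produce the same fingerprint, which is exactly the lookup an IDE search cannot do.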

@qubvel
Contributor

qubvel commented Apr 18, 2025

Following @Cyrilvallez's idea, it might indeed be very helpful to run such a tool across the library to find modules that have differently named attributes but are actually identical. Unfortunately, we cannot rename attributes because it would break weight loading (or we would need a hook for renaming), but at least we can choose one as a standard and leave comments for the rest, indicating e.g. that LlamaMLP is identical to this one and is preferred for modular purposes.

Contributor

@qubvel qubvel left a comment


Nice, thanks for updating!

The debugger will only output the first and last layer of a sequence of layers.

Is it a configurable option? I remember for mllama we messed up the self/cross-attention layer order, so the diff appears after layer 4.

Comment on lines 27 to 28
if is_vision_available():
    pass
Contributor

can be removed

Contributor Author

indeed, artifact of make fixup

from torch import nn


class ToyModel(nn.Module):
Contributor

This should also be under the if is_torch_available() guard, otherwise we don't need a guard at all
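For context, the guard pattern being discussed looks like this (using a simplified stand-in for transformers' availability check so the sketch is self-contained and runs with or without torch installed):

```python
import importlib.util

def is_torch_available() -> bool:
    # simplified stand-in for transformers' real is_torch_available utility
    return importlib.util.find_spec("torch") is not None

if is_torch_available():
    from torch import nn

    class ToyModel(nn.Module):
        # only defined when torch can actually be imported, so the test file
        # stays collectable on torch-free environments
        pass
```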

Contributor Author

ah indeed!



[[autodoc]] model_addition_debugger
### Reading results
Contributor

Comment for the above lines actually, but can't comment there.

# call forward method (not .generate!)
with model_addition_debugger_context(model, "optional_path_to_your_output_file.json"):
    output = model.forward(**inputs)

Should we call model.forward? Why not model(**inputs)? Are we avoiding the top-level hooks for some reason?

Contributor Author

no, top-level works as well - it's just to be explicit and contrast model.forward with model.generate! I can explain in the doc (as the latter would create a JSON file of several hundred MB...)

@molbap
Contributor Author

molbap commented Apr 18, 2025

Ah that's true, I'll add a configuration option now to output all the layers and add to the doc.

For the modular scanner: indeed, it'd be a very nice util for model adders as well, agree that we can't easily modularize existing code (we can only if we don't change the names), but I'm pretty sure we could add a name mapping util to make sure to preserve naming. E.g. "this module is identical to that one, but rename self.attn to self.self_attn".
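Such a name-mapping util could be as simple as rewriting dotted state-dict key segments; a hypothetical sketch (`remap_state_dict` is not an existing helper, just an illustration of "rename self.attn to self.self_attn" at load time):

```python
def remap_state_dict(state_dict, segment_renames):
    # hypothetical: rewrite each dotted key segment through a rename table so
    # a modular-reused module can keep loading the original checkpoint names
    remapped = {}
    for key, value in state_dict.items():
        parts = [segment_renames.get(part, part) for part in key.split(".")]
        remapped[".".join(parts)] = value
    return remapped
```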

@molbap
Contributor Author

molbap commented Apr 18, 2025

Added:

  • configurable do_prune_layers and associated test @qubvel (+ torch guarding the whole thing, why not)
  • more documentation
  • docstrings because I'm feeling nice today
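For reference, the behaviour a `do_prune_layers` flag implies could look roughly like this (a hypothetical sketch of the first-and-last-layer pruning, not the merged implementation):

```python
def prune_layer_entries(entries, do_prune_layers=True):
    # hypothetical: with pruning on, a homogeneous run of layer logs is
    # reduced to its first and last entries; with it off, everything is kept
    if not do_prune_layers or len(entries) <= 2:
        return entries
    return [entries[0], entries[-1]]
```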

[image]

merging!

@molbap molbap merged commit 4afd3f4 into main Apr 18, 2025
21 checks passed
@molbap molbap deleted the model_debugger_upgrades branch April 18, 2025 14:45
@ArthurZucker
Collaborator

Very nice!

zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025
* debugging improvements

* add debugging details

* add more debugging details

* debug more

* clean up layers + output

* add summary json file

* cleanup

* copies 👀

* remove hooks + add documentation

* draft a small test, why not

* respect the format (respect it)

* fixup imports

* nit

* add tests and configurable pruning of layers

6 participants