
[perf] feat: simplify precision_debugger config behavior and docs#5986

Merged
tardis-key merged 2 commits into verl-project:main from Tjh-UKN:main
Apr 14, 2026


@Tjh-UKN
Contributor

@Tjh-UKN Tjh-UKN commented Apr 13, 2026

Summary

This PR aligns and simplifies PrecisionDebugger integration and documentation.

Changes

  • Align PrecisionDebugger profiling behavior with global profiler controls.
  • Simplify precision_debugger config behavior and usage guidance.
  • Improve PrecisionDebugger docs with practical msprobe config.json examples (statistics and tensor) and simple CLI enablement examples.

Why this is not duplicate work

  • Checked existing open PRs for this head/base and did not find an existing open PR from Tjh-UKN:main to verl-project/verl:main.

Tests run

  • python -m py_compile verl/utils/profiler/config.py verl/utils/profiler/profile.py verl/utils/profiler/precision_debugger_profile.py
  • Result: pass
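For context on what this check covers: `py_compile` only confirms that each file byte-compiles, so it catches syntax errors but does not exercise any profiler behavior. A minimal sketch of the same check done programmatically, where `compile_check` is a hypothetical helper name and not part of verl:

```python
import py_compile


def compile_check(paths):
    """Return {path: error message} for every file that fails to byte-compile.

    Hypothetical helper: mirrors `python -m py_compile <files>` but collects
    failures instead of stopping at the first one.
    """
    failures = {}
    for path in paths:
        try:
            # doraise=True turns compile errors into PyCompileError
            py_compile.compile(path, doraise=True)
        except py_compile.PyCompileError as exc:
            failures[path] = exc.msg
        except OSError as exc:
            # Missing or unreadable file
            failures[path] = str(exc)
    return failures
```

An empty result corresponds to the "pass" reported above; runtime behavior still needs a real run such as the one under Test Result.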

fix #5985

Test Result

tree /data01/tjh/verl/outputs/precision_debug_SIMP/step_1/
/data01/tjh/verl/outputs/precision_debug_SIMP/step_1/
├── actor_compute_log_prob
│   └── step0
│       ├── rank0
│       │   └── dump.json
│       └── rank1
│           └── dump.json
├── actor_update
│   └── step0
│       ├── rank0
│       │   └── dump.json
│       └── rank1
│           └── dump.json
└── ref_compute_log_prob
    └── step0
        ├── rank0
        │   └── dump.json
        └── rank1
            └── dump.json

12 directories, 6 files
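A dump tree like the one above can be checked mechanically. The following is a minimal sketch that assumes the `<stage>/step<N>/rank<N>/dump.json` layout shown in this output; `collect_dumps` and `missing_ranks` are hypothetical helper names, not part of verl or msprobe.

```python
import re
from pathlib import Path

_STEP = re.compile(r"step(\d+)")
_RANK = re.compile(r"rank(\d+)")


def collect_dumps(step_root):
    """Map (stage, step, rank) -> Path of dump.json under one output directory.

    Assumes the <stage>/step<N>/rank<N>/dump.json layout; anything that does
    not match that naming is skipped.
    """
    found = {}
    for dump in sorted(Path(step_root).glob("*/*/*/dump.json")):
        rank_dir = dump.parent
        step_dir = rank_dir.parent
        stage_dir = step_dir.parent
        step = _STEP.fullmatch(step_dir.name)
        rank = _RANK.fullmatch(rank_dir.name)
        if step and rank:
            key = (stage_dir.name, int(step.group(1)), int(rank.group(1)))
            found[key] = dump
    return found


def missing_ranks(found, expected_ranks):
    """Return {(stage, step): [ranks with no dump.json]} for incomplete stages."""
    seen = {}
    for stage, step, rank in found:
        seen.setdefault((stage, step), set()).add(rank)
    expected = set(expected_ranks)
    return {
        key: sorted(expected - ranks)
        for key, ranks in seen.items()
        if expected - ranks
    }
```

For the tree above, `collect_dumps` would return six entries (three stages, two ranks each) and `missing_ranks(found, [0, 1])` would be empty.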

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


TAJh does not appear to be a GitHub user. You need a GitHub account to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request simplifies the configuration of the msprobe Precision Debugger by centralizing step filtering and output path management. Redundant fields such as data_dir and tool-specific steps have been deprecated or removed in favor of global_profiler.save_path and global_profiler.steps. Additionally, rank filtering for msprobe is now delegated to its internal config.json, ensuring the verl-side rank gate remains open when the tool is enabled. I have no feedback to provide.

@Tjh-UKN Tjh-UKN requested a review from wucong25 as a code owner April 13, 2026 11:35
@Tjh-UKN
Contributor Author

Tjh-UKN commented Apr 13, 2026

@tardis-key I have simplified the usage, please review.

@tardis-key
Collaborator

Thank you for your quick response!
This PR covers the examples for command-line startup and the configuration optimization. What about msprobe installation?

- Use global_profiler.steps as the single step gate for precision_debugger
- Default dump root to global_profiler.save_path (data_dir overrides when set)
- Mark precision_debugger.steps as deprecated/ignored in config and runtime
- Update precision_debugger docs with common config.json samples and minimal CLI usage

Co-authored-by: OpenAI Codex <codex@openai.com>
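The step-gate and dump-root rules in this commit message can be sketched as follows. This is an illustration only, not the code in `verl/utils/profiler`: the two dataclasses and both functions are hypothetical stand-ins, and treating an unset `global_profiler.steps` as "dump nothing" is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GlobalProfiler:
    """Hypothetical stand-in for the global_profiler config."""
    steps: Optional[List[int]] = None
    save_path: str = "outputs/profile"


@dataclass
class PrecisionDebugger:
    """Hypothetical stand-in for the precision_debugger tool config."""
    data_dir: Optional[str] = None
    steps: Optional[List[int]] = None  # deprecated: accepted but ignored


def should_dump(global_cfg, tool_cfg, step):
    """Single step gate: only global_profiler.steps is consulted.

    tool_cfg.steps is deliberately not read, matching "deprecated/ignored".
    Assumption: steps=None means no step is dumped.
    """
    return global_cfg.steps is not None and step in global_cfg.steps


def dump_root(global_cfg, tool_cfg):
    """Default to global_profiler.save_path; data_dir overrides when set."""
    return tool_cfg.data_dir or global_cfg.save_path
```

With `GlobalProfiler(steps=[1], save_path="outputs/pd")` and a `PrecisionDebugger(steps=[5])`, `should_dump` is true for step 1 and false for step 5, and `dump_root` returns `"outputs/pd"` unless `data_dir` is set.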
@Tjh-UKN Tjh-UKN changed the title from "profiler: simplify precision_debugger config behavior and docs" to "[feat]profiler: simplify precision_debugger config behavior and docs" Apr 14, 2026
@tardis-key tardis-key changed the title from "[feat]profiler: simplify precision_debugger config behavior and docs" to "[perf] feat: simplify precision_debugger config behavior and docs" Apr 14, 2026
- Update _generated_* trainer config snapshots to match current source configs
- Include precision_debugger tool_config propagation in generated files
- Remove stale precision_debugger enable/data_dir generated fields

Co-authored-by: OpenAI Codex <codex@openai.com>
@tardis-key tardis-key merged commit f8b0dd2 into verl-project:main Apr 14, 2026
95 of 189 checks passed
huaiyizhao pushed a commit to huaiyizhao/verl that referenced this pull request Apr 15, 2026
…rl-project#5986)

Co-authored-by: TAJh <taojiaheng1@huawei.com>


Development

Successfully merging this pull request may close these issues.

msprobe usage issue summary

3 participants