[RFC] Profiling system in async mode #4207

@tardis-key

Feature request

A profiling system designed for asynchronous mode. The system needs to support the following key scenarios:

  1. Different async backends (vllm/sglang/fsdp/megatron)
  2. AgentLoop (default)
  3. Engine worker

Motivation

The adoption of AgentLoop, full async, and similar paradigms has shifted Verl's workflow toward an asynchronous mode, and #4106 made server mode the default for rollout. However, the current profiling system appears to be incompatible with asynchronous frameworks. The profiling framework therefore needs to be redesigned so that it supports profiling data collection across different hardware architectures. The work falls into three areas:

Profiling Capability for Rollout Independent Processes

Continuous Adaptation During Reconstruction

Configuration and Documentation Optimization
