[serve] Transform replica level metrics to AutoScalingContext constructor args #1

Merged
snorkelopstesting2-coder merged 1 commit into main from
pr-57202-autoscaling-context-metrics
Oct 22, 2025

@snorkelopstesting2-coder
Contributor

Recreated from original PR: ray-project/ray#57202

Changes

  1. Wired up the AutoScalingContext constructor args so that replica-level metrics are readable inside a custom AutoScalingPolicy function.
  2. Dropped requests_per_replica, since it is expensive to compute.
  3. Renamed queued_requests to total_queued_requests for consistency with total_num_requests.
  4. Added total_running_requests.
  5. Added tests asserting that the new fields are populated correctly.
  6. Ran the custom metrics tests with RAY_SERVE_AGGREGATE_METRICS_AT_CONTROLLER set to both 0 and 1.
  7. Updated the docs.
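
To illustrate how a custom policy might consume the fields listed above, here is a minimal sketch. It does not import Ray: FakeAutoscalingContext is a hypothetical stand-in, and only the names total_num_requests, total_queued_requests, and total_running_requests come from this PR. The current_num_replicas field, the policy name, its parameters, and its plain-integer return value are assumptions made for illustration and may not match the real policy signature.

```python
import math
from dataclasses import dataclass


@dataclass
class FakeAutoscalingContext:
    """Hypothetical stand-in for the real context object (not Ray's class)."""
    total_num_requests: float       # name taken from the PR description
    total_queued_requests: float    # name taken from the PR description
    total_running_requests: float   # name taken from the PR description
    current_num_replicas: int       # assumed field, for illustration only


def queue_aware_policy(ctx, target_per_replica=2.0, max_replicas=10):
    """Return a desired replica count computed from the aggregate totals."""
    # Size the deployment so each replica handles about target_per_replica requests.
    desired = math.ceil(ctx.total_num_requests / target_per_replica)
    # Clamp to the range [1, max_replicas].
    desired = max(1, min(max_replicas, desired))
    # Do not scale up while nothing is waiting in the queue.
    if ctx.total_queued_requests == 0 and desired > ctx.current_num_replicas:
        return ctx.current_num_replicas
    return desired


ctx = FakeAutoscalingContext(
    total_num_requests=7,
    total_queued_requests=3,
    total_running_requests=4,
    current_num_replicas=2,
)
desired_replicas = queue_aware_policy(ctx)
```

With requests_per_replica gone, a policy that still wants a per-replica figure could approximate an average by dividing one of the totals by the replica count, under the same assumptions.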
