
[serve] Transform replica level metrics to AutoScalingContext constructor args #57202

Merged
zcin merged 18 commits into ray-project:master from arcyleung:autoscaling-context-metrics
Oct 16, 2025

Conversation

@arcyleung
Contributor

@arcyleung arcyleung commented Oct 6, 2025

Changes

  1. Wire up the AutoScalingContext constructor args so that metrics are readable in the custom AutoScalingPolicy function.
  2. Dropped requests_per_replica since it's expensive to compute.
  3. Renamed queued_requests to total_queued_requests for consistency with total_num_requests.
  4. Added total_running_requests.
  5. Added tests asserting that the new fields are populated correctly.
  6. Run the custom metrics tests with RAY_SERVE_AGGREGATE_METRICS_AT_CONTROLLER set to both 0 and 1.
  7. Updated docs.
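
As a rough illustration of how a custom policy might consume the fields named in the list above, here is a minimal sketch. `ContextSketch` is a hypothetical stand-in and not the Ray Serve class; only `total_num_requests`, `total_queued_requests`, and `total_running_requests` come from this PR, while `min_replicas`, `max_replicas`, the `target_per_replica` parameter, and the return shape are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class ContextSketch:
    """Hypothetical stand-in for the autoscaling context (not the Ray Serve class)."""

    # Field names taken from the PR description.
    total_num_requests: float
    total_queued_requests: float
    total_running_requests: float
    # Assumed bounds, added only so the sketch can clamp its result.
    min_replicas: int = 1
    max_replicas: int = 10


def queue_aware_policy(ctx: ContextSketch, target_per_replica: float = 2.0) -> int:
    """Return a replica count sized to the running plus queued demand."""
    demand = ctx.total_running_requests + ctx.total_queued_requests
    desired = math.ceil(demand / target_per_replica)
    # Clamp to the assumed replica bounds.
    return max(ctx.min_replicas, min(ctx.max_replicas, desired))


ctx = ContextSketch(
    total_num_requests=10,
    total_queued_requests=4,
    total_running_requests=6,
)
print(queue_aware_policy(ctx))
```

With 6 running and 4 queued requests and a target of 2 requests per replica, this sketch asks for 5 replicas; an idle deployment falls back to the assumed minimum.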

@arcyleung arcyleung requested a review from a team as a code owner October 6, 2025 01:55
@arcyleung arcyleung changed the title [serve] Transform replica level metrics to AutoScalingContext constructor args [serve][Draft] Transform replica level metrics to AutoScalingContext constructor args Oct 6, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request wires up replica-level metrics to the AutoscalingContext for use in custom autoscaling policies. The overall change is in the right direction, but the implementation for transforming the metrics has several critical issues, including potential KeyError and IndexError exceptions, as well as type mismatches with the AutoscalingContext definition. I've provided a detailed comment with a suggested fix to make the metric collection robust and correct.

@ray-gardener ray-gardener bot added serve Ray Serve Related Issue community-contribution Contributed by the community labels Oct 6, 2025
@arcyleung arcyleung force-pushed the autoscaling-context-metrics branch from f2b9a74 to 6243438 Compare October 9, 2025 15:42
@arcyleung arcyleung force-pushed the autoscaling-context-metrics branch from 6ffe2e3 to a242e68 Compare October 10, 2025 22:39
Signed-off-by: abrar <abrar@anyscale.com>
@abrarsheikh abrarsheikh force-pushed the autoscaling-context-metrics branch from c3fc84f to 3fbea82 Compare October 11, 2025 12:20
@abrarsheikh abrarsheikh added the go add ONLY when ready to merge, run all tests label Oct 11, 2025
Signed-off-by: abrar <abrar@anyscale.com>
@abrarsheikh
Contributor

@arcyleung I pushed some changes to your PR. Mainly

  1. removed and renamed some fields on autoscaling context
  2. added tests

Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
@abrarsheikh abrarsheikh changed the title [serve][Draft] Transform replica level metrics to AutoScalingContext constructor args [serve] Transform replica level metrics to AutoScalingContext constructor args Oct 11, 2025
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
@abrarsheikh abrarsheikh requested a review from a team as a code owner October 11, 2025 13:08
Signed-off-by: abrar <abrar@anyscale.com>
@arcyleung
Contributor Author

I've just added a test case that corresponds to the example in the docs, since the counter-based one might not intuitively showcase both upscaling and downscaling.

@arcyleung
Contributor Author

> @arcyleung I pushed some changes to your PR. Mainly
>
> 1. removed and renamed some fields on autoscaling context
> 2. added tests

Thanks! Appreciate the help with the refactor and docs.

"env": {
"RAY_SERVE_AGGREGATE_METRICS_AT_CONTROLLER": "1",
"RAY_SERVE_COLLECT_AUTOSCALING_METRICS_ON_HANDLE": "0",
"RAY_SERVE_REPLICA_AUTOSCALING_METRIC_RECORD_INTERVAL_S": "0.1",
Contributor Author

This test also fails locally for me when I set "RAY_SERVE_REPLICA_AUTOSCALING_METRIC_RECORD_INTERVAL_S": "0.1" but passes with "0.5"
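
One way to cover both aggregation modes from item 6 of the PR description while keeping the record interval adjustable is to generate the environment dictionaries in a helper. The sketch below is hypothetical: the helper name `build_test_envs` and the default of "0.5" are assumptions drawn from the comment above, not the PR's actual test code. The variable names are the ones shown in the reviewed snippet.

```python
from typing import Dict, List


def build_test_envs(record_interval_s: str = "0.5") -> List[Dict[str, str]]:
    """Build one env dict per aggregation mode (hypothetical helper)."""
    envs = []
    for aggregate_at_controller in ("0", "1"):
        envs.append(
            {
                "RAY_SERVE_AGGREGATE_METRICS_AT_CONTROLLER": aggregate_at_controller,
                "RAY_SERVE_COLLECT_AUTOSCALING_METRICS_ON_HANDLE": "0",
                "RAY_SERVE_REPLICA_AUTOSCALING_METRIC_RECORD_INTERVAL_S": record_interval_s,
            }
        )
    return envs


for env in build_test_envs():
    print(env["RAY_SERVE_AGGREGATE_METRICS_AT_CONTROLLER"], env)
```

Environment values are kept as strings because that is how they appear in the snippet under review.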

…RVAL_S to 0.5

Signed-off-by: Arthur Leung <arcyleung@gmail.com>
@abrarsheikh abrarsheikh requested a review from zcin October 14, 2025 21:36
arcyleung and others added 2 commits October 14, 2025 21:01
Signed-off-by: Arthur Leung <arcyleung+github@gmail.com>
Signed-off-by: Arthur Leung <arcyleung@gmail.com>
@arcyleung arcyleung force-pushed the autoscaling-context-metrics branch from d8b26d5 to dbd575f Compare October 15, 2025 03:05
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
@zcin
Contributor

zcin commented Oct 16, 2025

@abrarsheikh I think tests are failing.

@arcyleung
Contributor Author

arcyleung commented Oct 16, 2025

> @abrarsheikh I think tests are failing.

Will update; the test cases were just missing the look_back_period_s autoscaling config.

…configs

Signed-off-by: Arthur Leung <arcyleung@gmail.com>
Signed-off-by: Arthur Leung <arcyleung@gmail.com>
@zcin zcin merged commit 3763246 into ray-project:master Oct 16, 2025
6 checks passed
justinyeh1995 pushed a commit to justinyeh1995/ray that referenced this pull request Oct 20, 2025
…ctor args (ray-project#57202)

## Changes

1. Wire up the AutoScalingContext constructor args so that metrics are
readable in the custom AutoScalingPolicy function.
2. Dropped `requests_per_replica` since it's expensive to compute.
3. Renamed `queued_requests` to `total_queued_requests` for consistency
with `total_num_requests`.
4. Added `total_running_requests`.
5. Added tests asserting that the new fields are populated correctly.
6. Run the custom metrics tests with
`RAY_SERVE_AGGREGATE_METRICS_AT_CONTROLLER` set to both 0 and 1.
7. Updated docs.

---------

Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: Arthur Leung <arcyleung@gmail.com>
Signed-off-by: Arthur Leung <arcyleung+github@gmail.com>
Co-authored-by: abrar <abrar@anyscale.com>
Co-authored-by: Arthur Leung <arcyleung@gmail.com>
xinyuangui2 pushed a commit to xinyuangui2/ray that referenced this pull request Oct 22, 2025
snorkelopstesting2-coder pushed a commit to snorkel-marlin-repos/ray-project_ray_pr_57202_e39dfb53-78b5-4294-b765-088dcf6a5717 that referenced this pull request Oct 22, 2025
snorkelopstesting2-coder added a commit to snorkel-marlin-repos/ray-project_ray_pr_57202_e39dfb53-78b5-4294-b765-088dcf6a5717 that referenced this pull request Oct 22, 2025
…oScalingContext constructor args

Merged from original PR #57202
Original: ray-project/ray#57202
elliot-barn pushed a commit that referenced this pull request Oct 23, 2025
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
Aydin-ab pushed a commit to Aydin-ab/ray-aydin that referenced this pull request Nov 19, 2025
Future-Outlier pushed a commit to Future-Outlier/ray that referenced this pull request Dec 7, 2025

Labels

community-contribution: Contributed by the community
go: add ONLY when ready to merge, run all tests
serve: Ray Serve Related Issue
