[serve] Transform replica level metrics to AutoScalingContext constructor args#57202
Conversation
There was a problem hiding this comment.
Code Review
This pull request wires up replica-level metrics to the AutoscalingContext for use in custom autoscaling policies. The overall change is in the right direction, but the implementation for transforming the metrics has several critical issues, including potential KeyError and IndexError exceptions, as well as type mismatches with the AutoscalingContext definition. I've provided a detailed comment with a suggested fix to make the metric collection robust and correct.
f2b9a74 to
6243438
Compare
6ffe2e3 to
a242e68
Compare
Signed-off-by: abrar <abrar@anyscale.com>
c3fc84f to
3fbea82
Compare
|
@arcyleung I pushed some changes to your PR. Mainly
|
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
|
I've just added an additional test case corresponding to the example in docs, since the counter one might not be intuitive to showcase both up-and-downscaling. |
Thanks! Appreciate the help with the refactor and docs |
python/ray/serve/tests/BUILD.bazel
Outdated
| "env": { | ||
| "RAY_SERVE_AGGREGATE_METRICS_AT_CONTROLLER": "1", | ||
| "RAY_SERVE_COLLECT_AUTOSCALING_METRICS_ON_HANDLE": "0", | ||
| "RAY_SERVE_REPLICA_AUTOSCALING_METRIC_RECORD_INTERVAL_S": "0.1", |
There was a problem hiding this comment.
This test also fails locally for me when I set "RAY_SERVE_REPLICA_AUTOSCALING_METRIC_RECORD_INTERVAL_S": "0.1" but passes with "0.5"
…RVAL_S to 0.5 Signed-off-by: Arthur Leung <arcyleung@gmail.com>
Signed-off-by: Arthur Leung <arcyleung+github@gmail.com>
d8b26d5 to
dbd575f
Compare
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
|
@abrarsheikh I think tests are failing. |
Will update, the test cases were just missing the |
…configs Signed-off-by: Arthur Leung <arcyleung@gmail.com>
Signed-off-by: Arthur Leung <arcyleung@gmail.com>
…ctor args (ray-project#57202) ## Changes 1. Wire up the AutoScalingContext constructor args to make metrics readable in the custom AutoScalingPolicy function. 2. dropped `requests_per_replica` since its expensive to compute 3. renamed `queued_requests` to `total_queued_requests` for consistency with `total_num_requests` 4. added `total_running_requests` 5. added tests assert new fields are populated correctly 6. run custom metrics tests with `RAY_SERVE_AGGREGATE_METRICS_AT_CONTROLLER` = 0 and 1 7. updated docs --------- Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: Arthur Leung <arcyleung@gmail.com> Signed-off-by: Arthur Leung <arcyleung+github@gmail.com> Co-authored-by: abrar <abrar@anyscale.com> Co-authored-by: Arthur Leung <arcyleung@gmail.com>
…ctor args (ray-project#57202) ## Changes 1. Wire up the AutoScalingContext constructor args to make metrics readable in the custom AutoScalingPolicy function. 2. dropped `requests_per_replica` since its expensive to compute 3. renamed `queued_requests` to `total_queued_requests` for consistency with `total_num_requests` 4. added `total_running_requests` 5. added tests assert new fields are populated correctly 6. run custom metrics tests with `RAY_SERVE_AGGREGATE_METRICS_AT_CONTROLLER` = 0 and 1 7. updated docs --------- Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: Arthur Leung <arcyleung@gmail.com> Signed-off-by: Arthur Leung <arcyleung+github@gmail.com> Co-authored-by: abrar <abrar@anyscale.com> Co-authored-by: Arthur Leung <arcyleung@gmail.com> Signed-off-by: xgui <xgui@anyscale.com>
…ctor args Original PR #57202 by arcyleung Original: ray-project/ray#57202
…oScalingContext constructor args Merged from original PR #57202 Original: ray-project/ray#57202
…ctor args (#57202) ## Changes 1. Wire up the AutoScalingContext constructor args to make metrics readable in the custom AutoScalingPolicy function. 2. dropped `requests_per_replica` since its expensive to compute 3. renamed `queued_requests` to `total_queued_requests` for consistency with `total_num_requests` 4. added `total_running_requests` 5. added tests assert new fields are populated correctly 6. run custom metrics tests with `RAY_SERVE_AGGREGATE_METRICS_AT_CONTROLLER` = 0 and 1 7. updated docs --------- Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: Arthur Leung <arcyleung@gmail.com> Signed-off-by: Arthur Leung <arcyleung+github@gmail.com> Co-authored-by: abrar <abrar@anyscale.com> Co-authored-by: Arthur Leung <arcyleung@gmail.com> Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
…ctor args (ray-project#57202) ## Changes 1. Wire up the AutoScalingContext constructor args to make metrics readable in the custom AutoScalingPolicy function. 2. dropped `requests_per_replica` since its expensive to compute 3. renamed `queued_requests` to `total_queued_requests` for consistency with `total_num_requests` 4. added `total_running_requests` 5. added tests assert new fields are populated correctly 6. run custom metrics tests with `RAY_SERVE_AGGREGATE_METRICS_AT_CONTROLLER` = 0 and 1 7. updated docs --------- Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: Arthur Leung <arcyleung@gmail.com> Signed-off-by: Arthur Leung <arcyleung+github@gmail.com> Co-authored-by: abrar <abrar@anyscale.com> Co-authored-by: Arthur Leung <arcyleung@gmail.com>
…ctor args (ray-project#57202) ## Changes 1. Wire up the AutoScalingContext constructor args to make metrics readable in the custom AutoScalingPolicy function. 2. dropped `requests_per_replica` since its expensive to compute 3. renamed `queued_requests` to `total_queued_requests` for consistency with `total_num_requests` 4. added `total_running_requests` 5. added tests assert new fields are populated correctly 6. run custom metrics tests with `RAY_SERVE_AGGREGATE_METRICS_AT_CONTROLLER` = 0 and 1 7. updated docs --------- Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: Arthur Leung <arcyleung@gmail.com> Signed-off-by: Arthur Leung <arcyleung+github@gmail.com> Co-authored-by: abrar <abrar@anyscale.com> Co-authored-by: Arthur Leung <arcyleung@gmail.com> Signed-off-by: Aydin Abiar <aydin@anyscale.com>
…ctor args (ray-project#57202) ## Changes 1. Wire up the AutoScalingContext constructor args to make metrics readable in the custom AutoScalingPolicy function. 2. dropped `requests_per_replica` since its expensive to compute 3. renamed `queued_requests` to `total_queued_requests` for consistency with `total_num_requests` 4. added `total_running_requests` 5. added tests assert new fields are populated correctly 6. run custom metrics tests with `RAY_SERVE_AGGREGATE_METRICS_AT_CONTROLLER` = 0 and 1 7. updated docs --------- Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: Arthur Leung <arcyleung@gmail.com> Signed-off-by: Arthur Leung <arcyleung+github@gmail.com> Co-authored-by: abrar <abrar@anyscale.com> Co-authored-by: Arthur Leung <arcyleung@gmail.com> Signed-off-by: Future-Outlier <eric901201@gmail.com>
Changes
requests_per_replicasince its expensive to computequeued_requeststototal_queued_requestsfor consistency withtotal_num_requeststotal_running_requestsRAY_SERVE_AGGREGATE_METRICS_AT_CONTROLLER= 0 and 1