[Serve][2/n] add batching metrics by abrarsheikh · Pull Request #59232 · ray-project/ray

abrarsheikh · 2025-12-07T02:15:00Z

Performance Delta

from ray import serve
from typing import List

@serve.deployment(max_ongoing_requests=1000)
class MyDeployment:
    @serve.batch(max_batch_size=10, batch_wait_timeout_s=1)
    async def handle_batch(self, requests: List[int]) -> List[int]:
        return [request + 1 for request in requests]

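    async def __call__(self) -> int:
        # Each call passes one item; @serve.batch groups concurrent calls into a list
        # and hands each caller back its own element of the result.
        return await self.handle_batch(1)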

app = MyDeployment.bind()

Setup: `ray start --head --metrics-export-port=8080`, then `serve run batch_test:app`

Load: locust, 100 users

Metric	With Change	Master	Δ (Master – With Change)
Requests	32,033	33,541	+1,508
Fails	0	0	0
Median (ms)	170	170	0
95%ile (ms)	240	240	0
99%ile (ms)	280	270	–10 ms
Average (ms)	172.98	171.87	–1.11 ms
Min (ms)	70	84	+14 ms
Max (ms)	352	365	+13 ms
Average size (bytes)	1	1	0
Current RPS	581.9	604.1	+22.2
Current Failures/s	0	0	0
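
To put the absolute deltas in proportion, here is a small sketch that converts a few rows of the table into percentages relative to master. The numbers are copied from the table; the helper name `relative_delta` is made up for this example, and the sign is flipped relative to the Δ column (it reports With Change minus Master).

```python
# Values copied from the table above as (with_change, master).
results = {
    "Requests": (32033, 33541),
    "Average (ms)": (172.98, 171.87),
    "Current RPS": (581.9, 604.1),
}


def relative_delta(with_change: float, master: float) -> float:
    """Percent difference of the change relative to master."""
    return (with_change - master) / master * 100.0


for name, (with_change, master) in results.items():
    print(f"{name}: {relative_delta(with_change, master):+.2f}% vs master")
```

On these numbers that is roughly 4.5% fewer requests, 3.7% lower current RPS, and 0.65% higher average latency with the change.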

Signed-off-by: abrar <abrar@anyscale.com>

gemini-code-assist

Code Review

This pull request introduces metrics for batching functionality in Ray Serve. New constants for histogram buckets are added, and the batching logic is updated to record metrics for wait time, execution time, queue length, utilization, and processed batches. A new test is also added to verify these metrics.
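
For readers without the diff open, the metrics listed above could be recorded along these lines. This is a standalone sketch, not the code in `python/ray/serve/batching.py`: the class names, bucket boundaries, and the definition of utilization as `batch_size / max_batch_size` are all assumptions made for illustration, and it uses plain Python rather than Ray's metrics API.

```python
import bisect
from dataclasses import dataclass, field
from typing import List

# Hypothetical bucket boundaries; the constants added by the PR are not shown here.
TIME_BUCKETS_MS = [1, 5, 10, 50, 100, 500, 1000]
UTILIZATION_BUCKETS_PCT = [10, 25, 50, 75, 100]


@dataclass
class Histogram:
    boundaries: List[float]
    counts: List[int] = field(default_factory=list)
    total: float = 0.0

    def __post_init__(self) -> None:
        # One slot per boundary plus an overflow slot for larger values.
        self.counts = [0] * (len(self.boundaries) + 1)

    def observe(self, value: float) -> None:
        # bisect_left returns the index of the first boundary >= value.
        self.counts[bisect.bisect_left(self.boundaries, value)] += 1
        self.total += value


@dataclass
class BatchMetrics:
    max_batch_size: int
    wait_ms: Histogram = field(default_factory=lambda: Histogram(TIME_BUCKETS_MS))
    exec_ms: Histogram = field(default_factory=lambda: Histogram(TIME_BUCKETS_MS))
    utilization_pct: Histogram = field(
        default_factory=lambda: Histogram(UTILIZATION_BUCKETS_PCT)
    )
    queue_length: int = 0  # gauge: last observed value
    batches_processed: int = 0  # counter: only ever increases

    def record_batch(
        self, batch_size: int, waited_ms: float, executed_ms: float, queued_after: int
    ) -> None:
        self.wait_ms.observe(waited_ms)
        self.exec_ms.observe(executed_ms)
        # Assumed definition: how full the batch was relative to its maximum size.
        self.utilization_pct.observe(100.0 * batch_size / self.max_batch_size)
        self.queue_length = queued_after
        self.batches_processed += 1


metrics = BatchMetrics(max_batch_size=10)
# A batch of 4 that waited the full 1 s timeout and ran for 2 ms.
metrics.record_batch(batch_size=4, waited_ms=1000.0, executed_ms=2.0, queued_after=0)
print(metrics.utilization_pct.total, metrics.batches_processed)
```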

The changes are well-implemented. I have a couple of suggestions for the new test file:

Move local imports to the top of the file to follow PEP 8 guidelines.
Enhance the test to also verify the sum of the batch utilization metric, which would make the test more robust.
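
The second suggestion could look roughly like the check below, which sums the `_sum` samples of a histogram from Prometheus text-format output. The metric name, label set, and sample text are hypothetical stand-ins (the real name is in the PR diff), and the regex is a simplification that assumes label values contain no `}`.

```python
import re

# Hypothetical metric name used only for this illustration.
METRIC = "ray_serve_batch_utilization_percent"

# Made-up scrape output: two batches at 40% and 100% utilization.
SAMPLE_TEXT = """\
# TYPE ray_serve_batch_utilization_percent histogram
ray_serve_batch_utilization_percent_bucket{deployment="MyDeployment",le="50.0"} 1.0
ray_serve_batch_utilization_percent_bucket{deployment="MyDeployment",le="+Inf"} 2.0
ray_serve_batch_utilization_percent_sum{deployment="MyDeployment"} 140.0
ray_serve_batch_utilization_percent_count{deployment="MyDeployment"} 2.0
"""


def histogram_sum(exposition: str, metric: str) -> float:
    """Add up every `<metric>_sum` sample found in Prometheus text output."""
    pattern = re.compile(
        "^" + re.escape(metric) + r"_sum(?:\{[^}]*\})?\s+(\S+)",
        re.MULTILINE,
    )
    return sum((float(value) for value in pattern.findall(exposition)), 0.0)


# A test would compare this against the utilization it expects from the requests it sent.
assert histogram_sum(SAMPLE_TEXT, METRIC) == 140.0
```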

harshit-anyscale

lgtm

fixes ray-project#59218

[Serve][2/n] add batching metrics

25acb8b

Signed-off-by: abrar <abrar@anyscale.com>

abrarsheikh added the "go" label (add ONLY when ready to merge, run all tests) Dec 7, 2025

gemini-code-assist bot reviewed Dec 7, 2025

python/ray/serve/tests/test_metrics.py (resolved)

python/ray/serve/tests/test_metrics.py (resolved)

abrarsheikh added 3 commits December 7, 2025 06:41

fix metrics test

73e1d12

Signed-off-by: abrar <abrar@anyscale.com>

Merge branch 'master' of github.com:ray-project/ray into 59218-abrar-…

45ed319

…batch

move imports

cb6c9d8

Signed-off-by: abrar <abrar@anyscale.com>

abrarsheikh marked this pull request as ready for review December 13, 2025 01:44

abrarsheikh requested review from a team as code owners December 13, 2025 01:44

cursor bot reviewed Dec 13, 2025

python/ray/serve/batching.py (resolved)

python/ray/serve/batching.py (outdated, resolved)

ray-gardener bot added the "serve" (Ray Serve Related Issue) and "observability" (Issues related to the Ray Dashboard, Logging, Metrics, Tracing, and/or Profiling) labels Dec 13, 2025

fix batch size defination

7690e69

Signed-off-by: abrar <abrar@anyscale.com>

abrarsheikh requested a review from harshit-anyscale December 13, 2025 08:21

harshit-anyscale approved these changes Dec 15, 2025

abrarsheikh merged commit 8b3003b into master Dec 15, 2025
6 checks passed

abrarsheikh deleted the 59218-abrar-batch branch December 15, 2025 18:33

abrarsheikh merged 5 commits into master from 59218-abrar-batch