
[Data] - Don't reserve GPU budget for non-GPU tasks #59789

Merged
alexeykudinkin merged 7 commits into ray-project:master from
goutamvenkat-anyscale:goutam/no_gpu_budget_for_non_gpu
Jan 6, 2026

Conversation

@goutamvenkat-anyscale
Contributor

@goutamvenkat-anyscale goutamvenkat-anyscale commented Dec 31, 2025

Description

Follow-up to this PR: #59632 (comment)

Only assign GPU budget if the operator requires it.

Image classification Release Test: https://buildkite.com/ray-project/release/builds/73917#019b90b1-2a29-424f-861b-8715909fe02e
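The rule the PR implements can be sketched as a standalone function (a hypothetical simplification with made-up types; the real logic lives in Ray Data's ExecutionResources.for_limits()):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceLimits:
    # None mirrors the "unbounded" semantics of
    # ExecutionResources.for_limits() in Ray Data.
    cpu: Optional[float] = None
    gpu: Optional[float] = None


def max_resource_usage(num_gpus_per_actor: float) -> ResourceLimits:
    """Cap the GPU budget to 0 for operators that request no GPUs."""
    # A CPU-only operator must never reserve GPU budget, even when
    # every other limit is left infinite.
    return ResourceLimits(gpu=None if num_gpus_per_actor else 0)
```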

Related issues

Link related issues: "Fixes #1234", "Closes #1234", or "Related to #1234".

Additional information

Optional: Add implementation details, API changes, usage examples, screenshots, etc.

Signed-off-by: Goutam <goutam@anyscale.com>
@goutamvenkat-anyscale goutamvenkat-anyscale requested a review from a team as a code owner December 31, 2025 19:07
@goutamvenkat-anyscale goutamvenkat-anyscale added the data (Ray Data-related issues) and go (add ONLY when ready to merge, run all tests) labels Dec 31, 2025
@goutamvenkat-anyscale
Contributor Author

/gemini summary

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a sensible change to prevent non-GPU operators from reserving GPU budget, which is particularly important for unbounded operators. The changes in ActorPoolMapOperator and TaskPoolMapOperator correctly adjust the maximum resource requirements based on whether the operator uses GPUs. A comprehensive test case has been added to validate this behavior in a realistic inference pipeline, ensuring that GPU resources are allocated efficiently. The implementation is clean and the test is thorough. I've added one minor suggestion to refactor the test setup for better readability.

@gemini-code-assist
Contributor

Summary of Changes

This pull request refines Ray Data's resource allocation mechanism to prevent non-GPU tasks from being assigned GPU budget. By explicitly capping GPU requirements to zero for CPU-only operators, the system ensures that valuable GPU resources are exclusively available for tasks that genuinely need them, leading to improved resource efficiency and better performance for mixed-workload execution.

Highlights

  • GPU Budget Allocation Logic: Modified the resource allocation logic for ActorPoolMapOperator and TaskPoolMapOperator to explicitly set GPU budget to zero for operators that do not require GPUs, even when using infinite resource limits.
  • Resource Hoarding Prevention: This change prevents non-GPU tasks from inadvertently 'hoarding' or reserving GPU resources, ensuring more efficient utilization of GPU hardware.
  • New Test Case: Introduced a comprehensive test case (test_gpu_not_reserved_for_non_gpu_operators) to validate that GPU resources are correctly reserved only for GPU-dependent operators in a simulated inference pipeline.


Changelog
  • python/ray/data/_internal/execution/operators/actor_pool_map_operator.py
    • Updated min_max_resource_requirements to set gpu=0 in ExecutionResources.for_limits() if num_gpus_per_actor is zero, preventing non-GPU actors from reserving GPU budget.
  • python/ray/data/_internal/execution/operators/task_pool_map_operator.py
    • Updated min_max_resource_requirements to set gpu=0 in ExecutionResources.for_limits() if per_task.gpu is zero, preventing non-GPU tasks from reserving GPU budget.
  • python/ray/data/tests/test_reservation_based_resource_allocator.py
    • Added test_gpu_not_reserved_for_non_gpu_operators to verify that CPU-only operators (e.g., Read, Write) are allocated zero GPUs, while GPU-dependent operators (e.g., Infer1, Infer2) correctly reserve GPUs in a pipeline.
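The behavior the new test checks can be illustrated with a toy allocator (hypothetical names and shapes; the real test exercises Ray Data's reservation-based resource allocator):

```python
def allocate_gpu_budget(operators, total_gpus):
    """Split a GPU budget only among operators that request GPUs.

    `operators` is a list of (name, needs_gpu) pairs; CPU-only
    operators always receive a budget of zero.
    """
    gpu_ops = [name for name, needs_gpu in operators if needs_gpu]
    share = total_gpus / len(gpu_ops) if gpu_ops else 0
    return {name: (share if needs_gpu else 0) for name, needs_gpu in operators}


# A pipeline shaped like the one in the new test: reads and writes are
# CPU-only, while the two inference stages need GPUs.
pipeline = [("Read", False), ("Infer1", True), ("Infer2", True), ("Write", False)]
budget = allocate_gpu_budget(pipeline, total_gpus=4)
```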
Activity
  • goutamvenkat-anyscale requested a summary of the pull request.
  • gemini-code-assist[bot] provided a suggestion to refactor the new test case by defining the list of operators once to improve readability and avoid repetition.

Signed-off-by: Goutam <goutam@anyscale.com>
@goutamvenkat-anyscale goutamvenkat-anyscale changed the title [Data] - Don't allocate GPU budget for non-GPU tasks [Data] - Don't reserve GPU budget for non-GPU tasks Dec 31, 2025
Signed-off-by: Goutam <goutam@anyscale.com>
Member

@bveeramani bveeramani left a comment


LGTM except comment about CPUs

Comment on lines 481 to 485

    # Use infinite limits, but cap GPU to 0 if this operator doesn't use GPUs.
    # This prevents non-GPU operators from hoarding GPU budget.
    max_resource_usage = ExecutionResources.for_limits(
        gpu=None if num_gpus_per_actor else 0
    )
Member


I think this implementation special-cases GPUs because it assumes that all tasks/actors require logical CPUs and memory, but I don't think that assumption holds.

For example, here's a common thing users do:

ds.map_batches(Inference, num_gpus=1, batch_size=...)

In this case, I don't think the Inference actors request any logical CPUs.

For this reason, should we also include CPUs and memory?
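The reviewer's suggestion — cap every resource the operator does not request, not just GPU — could look like this sketch (hypothetical; not the merged implementation, which only special-cases GPU):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceLimits:
    # None means "unbounded", matching ExecutionResources.for_limits().
    cpu: Optional[float] = None
    gpu: Optional[float] = None
    memory: Optional[float] = None


def capped_limits(per_actor_cpu: float, per_actor_gpu: float,
                  per_actor_memory: float) -> ResourceLimits:
    # Cap each resource to 0 when the operator requests none of it,
    # e.g. map_batches(Inference, num_gpus=1) requests no logical CPUs.
    return ResourceLimits(
        cpu=None if per_actor_cpu else 0,
        gpu=None if per_actor_gpu else 0,
        memory=None if per_actor_memory else 0,
    )
```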

Contributor


Also simplify this conditional to be just:

  gpu=0 if num_gpus == 0 else max_actors * num_gpus

Comment on lines 196 to 200

    # Use infinite limits, but cap GPU to 0 if this operator doesn't use GPUs.
    # This prevents non-GPU operators from hoarding GPU budget.
    max_resource_usage = ExecutionResources.for_limits(
        gpu=None if per_task.gpu else 0
    )
Member


Sort of out-of-scope for this PR, but this might cause issues if users (or optimization rules like ConfigureMapTaskMemoryRule) specify ray_remote_args_fn.

For example, if a user does:

ds.map_batches(..., ray_remote_args_fn=lambda: {"num_cpus": 10})

Then the max resource usage will be num_cpus=1, but each task requires 10 CPUs.

An easy fix might be to consider ray_remote_args_fn when returning incremental_resource_usage.
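The suggested fix could be sketched as merging the dynamic remote args over the static defaults before reporting per-task needs (a hypothetical signature, not the actual Ray Data API):

```python
from typing import Callable, Dict, Optional


def incremental_resource_usage(
    static_remote_args: Dict[str, float],
    ray_remote_args_fn: Optional[Callable[[], Dict[str, float]]] = None,
) -> Dict[str, float]:
    """Report per-task resource needs, honoring dynamic remote args."""
    args = dict(static_remote_args)
    if ray_remote_args_fn is not None:
        # Dynamic args (supplied by users or by rules like
        # ConfigureMapTaskMemoryRule) override the static defaults,
        # so a task asking for 10 CPUs is budgeted as 10, not 1.
        args.update(ray_remote_args_fn())
    return args
```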

Signed-off-by: Goutam <goutam@anyscale.com>
Signed-off-by: Goutam <goutam@anyscale.com>
@alexeykudinkin alexeykudinkin enabled auto-merge (squash) January 6, 2026 04:05
@alexeykudinkin alexeykudinkin merged commit 9e2de8d into ray-project:master Jan 6, 2026
7 checks passed
@goutamvenkat-anyscale goutamvenkat-anyscale deleted the goutam/no_gpu_budget_for_non_gpu branch January 8, 2026 19:25
AYou0207 pushed a commit to AYou0207/ray that referenced this pull request Jan 13, 2026
Signed-off-by: jasonwrwang <jasonwrwang@tencent.com>
lee1258561 pushed a commit to pinterest/ray that referenced this pull request Feb 3, 2026
Signed-off-by: lee1258561 <lee1258561@gmail.com>
ryanaoleary pushed a commit to ryanaoleary/ray that referenced this pull request Feb 3, 2026
Signed-off-by: ryanaoleary <ryanaoleary@google.com>
peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026

Development

Successfully merging this pull request may close these issues.

Ray fails to serialize self-reference objects