[Data] Revisiting `OpResourceAllocator` to make data flow explicit by alexeykudinkin · Pull Request #57788 · ray-project/ray

alexeykudinkin · 2025-10-16T06:22:20Z

Description

This change primarily reworks the `OpResourceAllocator` APIs to make data flow explicit by exposing the required params in the method signatures.

Additionally:

1. Abstracting common methods inside the `OpResourceAllocator` base class.
2. Adding allocation to the progress bar in verbose mode, logging budgets and allocations.
3. Adding the byte size of all enqueued blocks to the progress bar.



python/ray/data/_internal/execution/backpressure_policy/resource_budget_backpressure_policy.py

bveeramani · 2025-10-20T17:56:59Z

python/ray/data/_internal/execution/operators/actor_pool_map_operator.py

        return self._actor_pool.get_actor_info()

+    def get_max_concurrency_limit(self) -> Optional[int]:
+        return self._actor_pool.max_size() * self._actor_pool.max_actor_concurrency()


Out of scope for this PR since this is an existing issue, but if self._actor_pool.max_size() is float("inf"), I think we'd probably want to return None rather than float("inf") for consistency with the return type

Good call

Looked through the code, and we need to clean this up holistically (since we define max_size as int)
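For illustration, the suggestion above could look roughly like the sketch below. `FakeActorPool` is a hypothetical stand-in rather than Ray's actual actor pool class; only the method names `max_size()` and `max_actor_concurrency()` are taken from the diff, and the free function is just a convenient way to show the `None` handling.

```python
import math
from typing import Optional, Union


class FakeActorPool:
    """Hypothetical stand-in for the actor pool; not Ray's real class."""

    def __init__(self, max_size: Union[int, float], max_actor_concurrency: int):
        self._max_size = max_size
        self._max_actor_concurrency = max_actor_concurrency

    def max_size(self) -> Union[int, float]:
        return self._max_size

    def max_actor_concurrency(self) -> int:
        return self._max_actor_concurrency


def get_max_concurrency_limit(pool: FakeActorPool) -> Optional[int]:
    """Return the concurrency cap, or None when the pool is unbounded."""
    max_size = pool.max_size()
    if math.isinf(max_size):
        # An unbounded pool has no finite limit, so report None instead of
        # leaking float("inf") through an Optional[int] signature.
        return None
    return int(max_size) * pool.max_actor_concurrency()
```

With this shape, callers only need to handle `None` as "no limit" and never have to special-case an infinite float.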

bveeramani · 2025-10-20T17:58:18Z

python/ray/data/_internal/execution/interfaces/op_runtime_metrics.py

        5000.0,
    ]
-    task_completion_time: float = metric_field(
+    task_completion_time_s: float = metric_field(


Do we need to update test_stats.py and the dashboard code after renaming these metrics?

Yep, will do

bveeramani · 2025-10-20T18:06:41Z

python/ray/data/_internal/execution/streaming_executor_state.py

@@ -266,6 +267,19 @@ def total_enqueued_input_bundles(self) -> int:

        return self._pending_dispatch_input_bundles_count() + internal_queue_size


If we change the internal queue size to represent blocks rather than bundles, then total_enqueued_input_bundles will return incorrect values, and DownstreamCapacityBackpressurePolicy will break.

I think even if we update total_enqueued_input_bundles to represent blocks, we'd still need to update the DownstreamCapacityBackpressurePolicy logic:

ray/python/ray/data/_internal/execution/backpressure_policy/downstream_capacity_backpressure_policy.py

Lines 76 to 79 in 3287523

    avg_inputs_per_task = (
        output_dependency.metrics.num_task_inputs_processed
        / max(output_dependency.metrics.num_tasks_finished, 1)
    )

Yeah, we need to fix that across the board
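To make the unit concern concrete, here is a minimal sketch. `FakeMetrics` and `estimated_tasks_for_queue` are hypothetical and are not part of the actual policy; only the two counter names and the `max(..., 1)` guard come from the snippet quoted above.

```python
from dataclasses import dataclass


@dataclass
class FakeMetrics:
    """Hypothetical stand-in exposing the two counters from the quoted snippet."""

    num_task_inputs_processed: int
    num_tasks_finished: int


def avg_inputs_per_task(metrics: FakeMetrics) -> float:
    # max(..., 1) avoids division by zero before any task has finished.
    return metrics.num_task_inputs_processed / max(metrics.num_tasks_finished, 1)


def estimated_tasks_for_queue(queue_size: int, metrics: FakeMetrics) -> float:
    # Hypothetical consumer of the ratio. The result is only meaningful when
    # queue_size and num_task_inputs_processed use the same unit
    # (both bundles, or both blocks).
    return queue_size / max(avg_inputs_per_task(metrics), 1e-9)
```

If the counters track bundles (say 40 bundles over 10 tasks) while the queue size is reported in blocks (8 bundles of 3 blocks each, so 24), the estimate comes out as 6 tasks instead of 2.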

bveeramani · 2025-10-20T18:07:30Z

python/ray/data/_internal/execution/operators/base_physical_operator.py

-        """Returns Operator's internal queue size"""
+        """Returns Operator's internal queue size (in blocks)"""
+        ...


What are we hoping to achieve by changing the unit of internal_queue_size from bundles to blocks?

I just realized that we're assuming that every bundle holds just 1 block, which is not enforced
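A small sketch of why the two counts can diverge when the one-block-per-bundle assumption isn't enforced. `FakeBundle` is a hypothetical stand-in, not Ray's bundle type, and the helper names are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeBundle:
    """Hypothetical stand-in for a bundle of blocks; not Ray's real type."""

    block_sizes_bytes: List[int] = field(default_factory=list)


def queue_size_in_bundles(queue: List[FakeBundle]) -> int:
    return len(queue)


def queue_size_in_blocks(queue: List[FakeBundle]) -> int:
    return sum(len(bundle.block_sizes_bytes) for bundle in queue)


def multi_block_bundles(queue: List[FakeBundle]) -> List[int]:
    """Indices of bundles that break the one-block-per-bundle assumption."""
    return [i for i, b in enumerate(queue) if len(b.block_sizes_bytes) != 1]
```

A queue holding one single-block bundle and one three-block bundle reports 2 bundles but 4 blocks, and `multi_block_bundles` flags the second entry.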

python/ray/data/_internal/execution/streaming_executor_state.py

python/ray/data/_internal/execution/resource_manager.py

bveeramani · 2025-10-20T18:22:09Z

python/ray/data/_internal/execution/resource_manager.py

+    def __init__(self, topology: "Topology"):
+        self._topology = topology
+        self._idle_detector = self.IdleDetector()
+        self._ticker = 0


I know this is updated in update_budgets, but is it used anywhere else?

Are subclasses required to increment this? If so, I think this should be an explicit part of the interface

Missed cleaning this up

bveeramani · 2025-10-20T18:30:24Z

python/ray/data/_internal/execution/resource_manager.py

+    @abstractmethod
+    def can_submit_new_task(self, op: PhysicalOperator) -> bool:
+        """Return whether the given operator can submit a new task."""
+        ...


What's the motivation for copying this from the backpressure policy interface to here? Would the implementation ever be non-trivial?

If the implementation of this method is always going to be like below, it might be better to remove the method to make the OpResourceAllocator interface deeper and simpler

    def can_submit_new_task(self, op):
        return op.incremental_resource_usage().satisfies_limit(budget)

The idea here is that the logic deciding whether a task can be scheduled should live with the resource allocator (it will be more complicated than the one you referred to above)
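As a rough sketch of the design being discussed, the scheduling decision could sit behind the allocator interface like this. Everything here (`Resources`, `Allocator`, `BudgetAllocator`, passing the operator by name) is a simplified, hypothetical stand-in; only the method name `can_submit_new_task` and the `satisfies_limit` check come from the thread.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Resources:
    """Hypothetical, simplified resource vector (CPU and object store bytes)."""

    cpu: float = 0.0
    object_store_memory: int = 0

    def satisfies_limit(self, limit: "Resources") -> bool:
        return (
            self.cpu <= limit.cpu
            and self.object_store_memory <= limit.object_store_memory
        )


class Allocator(ABC):
    @abstractmethod
    def can_submit_new_task(self, op_name: str, incremental_usage: Resources) -> bool:
        """Return whether the given operator can submit a new task."""
        ...


class BudgetAllocator(Allocator):
    """Trivial implementation: compare incremental usage against a budget."""

    def __init__(self, budgets: Dict[str, Resources]):
        self._budgets = budgets

    def can_submit_new_task(self, op_name: str, incremental_usage: Resources) -> bool:
        budget = self._budgets.get(op_name)
        if budget is None:
            # Assumption for this sketch: operators without a budget are
            # treated as unconstrained.
            return True
        return incremental_usage.satisfies_limit(budget)
```

Keeping the check behind the interface means a more involved policy can replace `BudgetAllocator` without callers changing.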

bveeramani

Looks reasonable to me.

Let's merge #58030 first to minimize size of the diff, and then merge this one?

bveeramani · 2025-10-24T22:20:32Z

python/ray/tests/test_runtime_env_working_dir.py

    @ray.remote
    def test_import():
        import file_module
+



cursor · 2025-10-27T20:40:22Z

python/ray/data/_internal/execution/resource_manager.py

+            op,
+            task_resource_usage=self._op_usages,
+            output_object_store_usage=self._mem_op_outputs,
+        )


Bug: Inconsistent Return Types in Resource Management

Type mismatch bug: ResourceManager.max_task_output_bytes_to_read() declares return type as int but calls self._op_resource_allocator.max_task_output_bytes_to_read() which returns Optional[int]. The abstract method in OpResourceAllocator and its implementation in ReservationOpResourceAllocator can return None, but the wrapper method signature promises to always return int. This will cause runtime type errors when None is returned but an int is expected by callers.
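One way to resolve the mismatch described above is to let the wrapper's annotation match what it forwards and have callers handle `None` explicitly. The classes below are hypothetical stand-ins, and treating `None` as "no limit" is an assumption of this sketch rather than something the report states.

```python
from typing import Optional


class FakeAllocator:
    """Hypothetical stand-in whose method may return None."""

    def __init__(self, limit: Optional[int]):
        self._limit = limit

    def max_task_output_bytes_to_read(self, op_name: str) -> Optional[int]:
        return self._limit


class FakeResourceManager:
    """Hypothetical wrapper that forwards to the allocator."""

    def __init__(self, allocator: FakeAllocator):
        self._op_resource_allocator = allocator

    def max_task_output_bytes_to_read(self, op_name: str) -> Optional[int]:
        # Propagate Optional[int] so the annotation matches what is returned.
        return self._op_resource_allocator.max_task_output_bytes_to_read(op_name)


def bytes_to_read(manager: FakeResourceManager, op_name: str, available: int) -> int:
    limit = manager.max_task_output_bytes_to_read(op_name)
    # Assumption: None means "no limit", so read everything available.
    return available if limit is None else min(available, limit)
```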


… concurrency (#59331)

#57788 renamed `max_task_concurrency()` to `get_max_concurrency_limit()`. However, the PR didn't remove the original `max_task_concurrency()` method on the `PhysicalOperator` base class (probably an oversight?). This PR fixes the issue by removing the dead method.

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>

…59672)

#57788 added `task_resource_usage` and `output_object_store_usage` to `OpResourceAllocator.max_task_output_bytes_to_read`. However, the parameters aren't actually used anywhere, so this PR removes them.

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Co-authored-by: Alexey Kudinkin <alexey.kudinkin@gmail.com>


Update the panel's Prometheus expr to use ray_data_task_completion_time_excl_backpressure_s instead of ray_data_task_completion_time_without_backpressure. The metric was renamed in ray-project#57788 (op_runtime_metrics) but the data_dashboard_panels.py panel was not updated, causing the chart to show no data. Signed-off-by: kriyanshii <kriyanshishah06@gmail.com>

…tric name (#60481)

## Description

Fixes the broken **"Task Completion Time Without Backpressure"** metrics chart in the Ray Data Grafana dashboard.

The panel was querying `ray_data_task_completion_time_without_backpressure`, which no longer exists. PR #57788 renamed the underlying metric to `task_completion_time_excl_backpressure_s` in `op_runtime_metrics.py`, but the Grafana panel in `data_dashboard_panels.py` was not updated. This PR updates the panel's Prometheus `expr` to use `ray_data_task_completion_time_excl_backpressure_s` so the chart displays data again.

**Change:** Single-line fix in `data_dashboard_panels.py`: replace the old metric name with the correct one in the panel's `expr`. The formula (average task completion time excluding backpressure over a 5-minute window) is unchanged.

## Related issues

Fixes the regression from #57788 (metric rename). Related to Ray Data monitoring / dashboard.

Closes: #60163

## Additional information

- **Metric flow:** `op_runtime_metrics.task_completion_time_excl_backpressure_s` → Stats uses `data_{name}` → Metrics agent adds `ray_` namespace → **`ray_data_task_completion_time_excl_backpressure_s`**
- **Manual verification:** Run a Ray Data job with Grafana + Prometheus (see [cluster metrics](https://docs.ray.io/en/latest/cluster/metrics.html)), then confirm the "Task Completion Time Without Backpressure" panel shows data.

Signed-off-by: kriyanshii <kriyanshishah06@gmail.com>
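The metric flow described above can be sketched as a pair of helpers. Both function names are hypothetical and this is not how Ray composes names internally; the sketch only mirrors the `data_` and `ray_` prefixing stated in the commit message, plus a simple check for panel expressions that reference names no longer exported.

```python
from typing import Iterable, List


def prometheus_metric_name(field_name: str) -> str:
    """Compose the exported name from a metric field name (per the flow above)."""
    data_name = f"data_{field_name}"  # prefix applied by Stats
    return f"ray_{data_name}"  # namespace added by the metrics agent


def stale_panel_metrics(
    panel_metric_names: Iterable[str], field_names: Iterable[str]
) -> List[str]:
    """Return panel metric names that no current field would export."""
    exported = {prometheus_metric_name(name) for name in field_names}
    return [name for name in panel_metric_names if name not in exported]
```

A check along these lines, run against the dashboard panel definitions, is one way a rename like this could be caught before the chart goes blank.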


alexeykudinkin requested a review from a team as a code owner October 16, 2025 06:22

alexeykudinkin added the go add ONLY when ready to merge, run all tests label Oct 16, 2025

ray-gardener bot added the data Ray Data-related issues label Oct 16, 2025

alexeykudinkin changed the title ~~[WIP][Data] Cleaning up OpResourceAllocator APIs~~ [Data] Cleaning up OpResourceAllocator APIs Oct 17, 2025

alexeykudinkin changed the title ~~[Data] Cleaning up OpResourceAllocator APIs~~ [Data] Revisiting OpResourceAllocator to make data flow explicit Oct 17, 2025

bveeramani reviewed Oct 20, 2025


alexeykudinkin force-pushed the ak/res-mngr-clup branch from e271ecb to e763d0c Compare October 22, 2025 23:15


alexeykudinkin force-pushed the ak/res-mngr-clup branch from 5b23ecb to 9d757b7 Compare October 23, 2025 05:02

alexeykudinkin requested a review from a team as a code owner October 23, 2025 05:02

alexeykudinkin changed the base branch from master to ak/bndl-blk-fix October 23, 2025 05:20


alexeykudinkin force-pushed the ak/bndl-blk-fix branch from bb87078 to ea982b3 Compare October 24, 2025 06:48

bveeramani approved these changes Oct 24, 2025

python/ray/tests/test_runtime_env_working_dir.py (outdated)

    @ray.remote
    def test_import():
        import file_module

bveeramani · Oct 24, 2025

Unrelated?

alexeykudinkin force-pushed the ak/bndl-blk-fix branch from 5379f9c to e8ab2c8 Compare October 27, 2025 05:05

alexeykudinkin force-pushed the ak/res-mngr-clup branch from b7ec22e to 7881742 Compare October 27, 2025 05:12


alexeykudinkin force-pushed the ak/bndl-blk-fix branch from bad2cd7 to 65e7295 Compare October 27, 2025 18:07

alexeykudinkin force-pushed the ak/res-mngr-clup branch from c42f2c3 to 0b1eb1b Compare October 27, 2025 18:38

alexeykudinkin force-pushed the ak/bndl-blk-fix branch from 65e7295 to 7165108 Compare October 27, 2025 19:14

alexeykudinkin force-pushed the ak/res-mngr-clup branch from 0b1eb1b to 6d9160a Compare October 27, 2025 19:15

alexeykudinkin deleted the branch ray-project:master October 27, 2025 20:31

alexeykudinkin closed this Oct 27, 2025

alexeykudinkin reopened this Oct 27, 2025

alexeykudinkin changed the base branch from ak/bndl-blk-fix to master October 27, 2025 20:37

alexeykudinkin added 3 commits October 27, 2025 13:38

Cleaning up OpResourceAllocator

cefafa9

Signed-off-by: Alexey Kudinkin <ak@anyscale.com> # Conflicts: # python/ray/data/_internal/execution/streaming_executor_state.py Signed-off-by: Alexey Kudinkin <ak@anyscale.com>

Tidying up

2160465

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>

Updated refs

e579fdf

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>

alexeykudinkin added 4 commits October 27, 2025 13:38

Fixed tests;

508411e

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>

Fixed tests

7aa6b18

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>

Fixing more tests

af3845a

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>

lint

eeb01a7

Signed-off-by: Alexey Kudinkin <ak@anyscale.com> # Conflicts: # python/ray/data/tests/test_autoscaler.py Signed-off-by: Alexey Kudinkin <ak@anyscale.com>

alexeykudinkin force-pushed the ak/res-mngr-clup branch from 6d9160a to eeb01a7 Compare October 27, 2025 20:38

cursor bot reviewed Oct 27, 2025


alexeykudinkin merged commit 95b011f into ray-project:master Oct 27, 2025
6 checks passed

bveeramani mentioned this pull request Dec 10, 2025

[Data] Remove duplicate PhysicalOperator method to get max operator concurrency #59331

Merged

bveeramani mentioned this pull request Dec 26, 2025

[Data] Remove dead parameters from max_task_output_bytes_to_read #59672

Merged

bveeramani mentioned this pull request Jan 15, 2026

[Data] "Task Completion Time Without Backpressure" Grafana panel is broken #60163

Closed

kriyanshii mentioned this pull request Jan 25, 2026

Fix Task Completion Time Without Backpressure Grafana panel metric name #60481

Merged
