[Data] Replace number of bundles with proper number of blocks#58030
[Data] Replace number of bundles with proper number of blocks#58030alexeykudinkin merged 20 commits intomasterfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request is a good step towards making block counting more accurate by explicitly using the number of blocks instead of assuming one block per bundle. The refactoring is applied across many files. However, I've found a few critical issues where the new counting logic is implemented incorrectly, which could lead to bugs in metrics, backpressure, and autoscaling. I've also included some minor suggestions for improving code style and efficiency.
python/ray/data/_internal/execution/operators/union_operator.py
Outdated
Show resolved
Hide resolved
.../ray/data/_internal/execution/backpressure_policy/downstream_capacity_backpressure_policy.py
Show resolved
Hide resolved
python/ray/data/_internal/execution/operators/base_physical_operator.py
Outdated
Show resolved
Hide resolved
bb87078 to
ea982b3
Compare
| @ray.remote | ||
| def test_import(): | ||
| import file_module | ||
|
|
There was a problem hiding this comment.
Unrelated change for this and python/ray/tests/test_runtime_env_working_dir_3.py?
| self._bundle_buffer_size = 0 | ||
| self._bundle_buffer_size_bytes = 0 |
There was a problem hiding this comment.
what's the difference between these 2 vars?
| def internal_queue_size(self) -> int: | ||
| return len(self._buffer) | ||
| def internal_queue_num_blocks(self) -> int: | ||
| return sum(len(b.block_refs) for b in self._buffer) |
There was a problem hiding this comment.
if b is type RefBundle, you can do len(b)
5379f9c to
e8ab2c8
Compare
bad2cd7 to
65e7295
Compare
…bundle Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
… num bytes held inside internal queue; Revisited `InternalQueueOperatorMixin.internal_queue_size` to remove assumption that every bundle holds just 1 block Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Fixing tests Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
65e7295 to
7165108
Compare
…oject#58030) ## Description Currently, we implicitly assume that `RefBundle` holds exactly 1 block. That's not a safe assumption, and this change is addressing that by explicitly referring to number of blocks instead ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
…oject#58030) ## Description Currently, we implicitly assume that `RefBundle` holds exactly 1 block. That's not a safe assumption, and this change is addressing that by explicitly referring to number of blocks instead ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <ak@anyscale.com> Signed-off-by: Aydin Abiar <aydin@anyscale.com>
…oject#58030) ## Description Currently, we implicitly assume that `RefBundle` holds exactly 1 block. That's not a safe assumption, and this change is addressing that by explicitly referring to number of blocks instead ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <ak@anyscale.com> Signed-off-by: Future-Outlier <eric901201@gmail.com>
…oject#58030) ## Description Currently, we implicitly assume that `RefBundle` holds exactly 1 block. That's not a safe assumption, and this change is addressing that by explicitly referring to number of blocks instead ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <ak@anyscale.com> Signed-off-by: peterxcli <peterxcli@gmail.com>
Description
Currently, we implicitly assume that
RefBundleholds exactly 1 block. That's not a safe assumption, and this change is addressing that by explicitly referring to number of blocks insteadRelated issues
Additional information