[Data] Fix obj_store_mem_max_pending_output_per_task reporting#58864
[Data] Fix obj_store_mem_max_pending_output_per_task reporting#58864
Conversation
Signed-off-by: Srinath Krishnamachari <srinath.krishnamachari@anyscale.com>
There was a problem hiding this comment.
Code Review
This pull request correctly adjusts the memory estimation for pending task outputs by incorporating MAX_SAFE_BLOCK_SIZE_FACTOR when no block samples are available. The accompanying tests are thorough and validate the new logic across various scenarios. I have one suggestion to refactor the implementation for better code clarity and to reduce duplication.
| if context.target_max_block_size is None: | ||
| return None | ||
| bytes_per_output = context.target_max_block_size | ||
| bytes_per_output = ( | ||
| DEFAULT_TARGET_MAX_BLOCK_SIZE * MAX_SAFE_BLOCK_SIZE_FACTOR | ||
| ) | ||
| else: | ||
| bytes_per_output = ( | ||
| context.target_max_block_size * MAX_SAFE_BLOCK_SIZE_FACTOR | ||
| ) |
There was a problem hiding this comment.
The logic for determining bytes_per_output when a sample is not available can be simplified to reduce code duplication. Both branches of the if/else statement perform the same multiplication with MAX_SAFE_BLOCK_SIZE_FACTOR. You can determine the target_block_size first and then perform the multiplication once.
target_block_size = context.target_max_block_size
if target_block_size is None:
target_block_size = DEFAULT_TARGET_MAX_BLOCK_SIZE
bytes_per_output = target_block_size * MAX_SAFE_BLOCK_SIZE_FACTOR
python/ray/data/_internal/execution/interfaces/op_runtime_metrics.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Bug: Tests not updated for MAX_SAFE_BLOCK_SIZE_FACTOR change
The code change multiplies bytes_per_output by MAX_SAFE_BLOCK_SIZE_FACTOR when estimating pending task outputs, but three existing test assertions in test_task_pool_resource_reporting, test_task_pool_resource_reporting_with_bundling, and test_actor_pool_resource_reporting still expect the old calculation without the factor. These tests will fail because the actual memory usage will be 1.5x larger than their expected values.
python/ray/data/tests/test_executor_resource_management.py#L222-L228
ray/python/ray/data/tests/test_executor_resource_management.py
Lines 222 to 228 in 523bb08
python/ray/data/tests/test_executor_resource_management.py#L282-L288
ray/python/ray/data/tests/test_executor_resource_management.py
Lines 282 to 288 in 523bb08
python/ray/data/tests/test_executor_resource_management.py#L355-L361
ray/python/ray/data/tests/test_executor_resource_management.py
Lines 355 to 361 in 523bb08
…roject#58864) > Thank you for contributing to Ray! 🚀 > Please review the [Ray Contribution Guide](https://docs.ray.io/en/master/ray-contribute/getting-involved.html) before opening a pull request. >⚠️ Remove these instructions before submitting your PR. > 💡 Tip: Mark as draft if you want early feedback, or ready for review when it's complete. ## Description > Briefly describe what this PR accomplishes and why it's needed. ### [Data] Fix obj_store_mem_max_pending_output_per_task reporting Fix `obj_store_mem_max_pending_output_per_task` when sample is unavailable to factor in, - `bytes_per_output` = `MAX_SAFE_BLOCK_SIZE_FACTOR` * `target_max_block_size`. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Srinath Krishnamachari <srinath.krishnamachari@anyscale.com> Signed-off-by: YK <1811651+ykdojo@users.noreply.github.com>
…roject#58864) > Thank you for contributing to Ray! 🚀 > Please review the [Ray Contribution Guide](https://docs.ray.io/en/master/ray-contribute/getting-involved.html) before opening a pull request. >⚠️ Remove these instructions before submitting your PR. > 💡 Tip: Mark as draft if you want early feedback, or ready for review when it's complete. ## Description > Briefly describe what this PR accomplishes and why it's needed. ### [Data] Fix obj_store_mem_max_pending_output_per_task reporting Fix `obj_store_mem_max_pending_output_per_task` when sample is unavailable to factor in, - `bytes_per_output` = `MAX_SAFE_BLOCK_SIZE_FACTOR` * `target_max_block_size`. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Srinath Krishnamachari <srinath.krishnamachari@anyscale.com>
…roject#58864) > Thank you for contributing to Ray! 🚀 > Please review the [Ray Contribution Guide](https://docs.ray.io/en/master/ray-contribute/getting-involved.html) before opening a pull request. >⚠️ Remove these instructions before submitting your PR. > 💡 Tip: Mark as draft if you want early feedback, or ready for review when it's complete. ## Description > Briefly describe what this PR accomplishes and why it's needed. ### [Data] Fix obj_store_mem_max_pending_output_per_task reporting Fix `obj_store_mem_max_pending_output_per_task` when sample is unavailable to factor in, - `bytes_per_output` = `MAX_SAFE_BLOCK_SIZE_FACTOR` * `target_max_block_size`. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Srinath Krishnamachari <srinath.krishnamachari@anyscale.com> Signed-off-by: peterxcli <peterxcli@gmail.com>
Description
[Data] Fix obj_store_mem_max_pending_output_per_task reporting
Fix
obj_store_mem_max_pending_output_per_taskwhen sample is unavailable to factor in,bytes_per_output=MAX_SAFE_BLOCK_SIZE_FACTOR*target_max_block_size.Related issues
Additional information