[data] Disable block slicing for shuffle ops (#40538) #40602
Merged
vitsai merged 1 commit into ray-project:releases/2.8.0 on Oct 24, 2023
ray-project#40248 changed output block creation so that when a task produces its output blocks, it tries to slice them before yielding, to respect the target block size. Unfortunately, all-to-all ops currently don't support dynamic block splitting. This means that if we fuse an upstream map iterator with an all-to-all op, the all-to-all task still has to fuse all of the sliced blocks back together again. This seems to increase memory usage significantly.

This PR avoids the issue by overriding the upstream map iterator's target block size to infinity when it is fused with an all-to-all op. It also adds a logger warning explaining how to work around the problem.

Related issue number

Closes ray-project#40518.

Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu>
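The effect of the override can be illustrated with a toy model. This is only a sketch and does not use Ray: `map_task`, `fused_map_shuffle`, and the list-based `Block` type are hypothetical stand-ins, and block size is counted in rows rather than bytes. It shows that a finite target block size makes the fused map stage emit several slices that the all-to-all stage must merge again, while an infinite target yields a single block.

```python
import math
from typing import Iterable, Iterator, List, Tuple

# Hypothetical stand-in for a Ray Data block: just a list of rows.
Block = List[int]


def map_task(rows: Iterable[int], target_max_block_size: float) -> Iterator[Block]:
    """Yield output blocks, slicing whenever the buffer reaches the target size."""
    buffer: Block = []
    for row in rows:
        buffer.append(row)
        # With target_max_block_size == math.inf this is never true,
        # so no slicing happens and one block is yielded at the end.
        if len(buffer) >= target_max_block_size:
            yield buffer
            buffer = []
    if buffer:
        yield buffer


def fused_map_shuffle(
    rows: Iterable[int],
    target_max_block_size: float,
    fused_with_all_to_all: bool,
) -> Tuple[int, Block]:
    """Run the toy map stage, then merge its blocks as an all-to-all task would.

    Returns the number of blocks the map stage produced and the merged rows.
    """
    if fused_with_all_to_all:
        # Mirrors the idea in this PR: disable slicing for the fused map stage.
        target_max_block_size = math.inf
    blocks = list(map_task(rows, target_max_block_size))
    # The all-to-all stage cannot consume slices separately, so it
    # concatenates them back into one block.
    merged = [row for block in blocks for row in block]
    return len(blocks), merged


sliced_count, sliced_rows = fused_map_shuffle(range(10), 4, fused_with_all_to_all=False)
unsliced_count, unsliced_rows = fused_map_shuffle(range(10), 4, fused_with_all_to_all=True)
print(sliced_count, unsliced_count)
```

In both cases the merged rows are identical; only the number of intermediate blocks differs, which is where the extra slicing and re-fusing work would come from.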
raulchen approved these changes Oct 23, 2023
zhe-thoughts approved these changes Oct 24, 2023
zhe-thoughts (Contributor) left a comment:
This is a blocker and let's pick