[data] Slice output blocks to respect target block size#40248
Merged
stephanie-wang merged 2 commits into ray-project:master on Oct 12, 2023
Conversation
Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu>
raulchen approved these changes on Oct 11, 2023

raulchen (Contributor) left a comment:
Nice fix. Have you run the benchmarks? Would like to learn the perf impact.
stephanie-wang (Contributor, Author): Good idea, will do this.
stephanie-wang (Contributor, Author): Did some spot checks on the single-node performance benchmarks, and it seems like there's no obvious difference.
stephanie-wang
added a commit
to stephanie-wang/ray
that referenced
this pull request
Oct 20, 2023
…project#40248)". This reverts commit d5f1eed.
This was referenced Oct 20, 2023
stephanie-wang
added a commit
that referenced
this pull request
Oct 23, 2023
#40248 changed output block creation so that when a task produces its output blocks, it tries to slice them before yielding to respect the target block size. Unfortunately, all-to-all ops currently don't support dynamic block splitting, so if we fuse an upstream map iterator with an all-to-all op, the all-to-all task has to fuse all of the sliced blocks back together again. This significantly increases memory usage.

This PR avoids the issue by overriding the upstream map iterator's target block size to infinity when it is fused with an all-to-all op. It also adds a logger warning explaining how to work around the problem.

Related issue number: Closes #40518.

Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu>
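The fusion rule described in that commit can be sketched in plain Python. This is a hypothetical helper, not Ray's actual code; the real logic lives in Ray Data's operator fusion pass, and the function and parameter names here are illustrative only.

```python
import math


def effective_target_block_size(
    map_target_bytes: float, fused_with_all_to_all: bool
) -> float:
    """Pick the target block size for an upstream map iterator.

    All-to-all ops don't support dynamic block splitting, so slicing the
    upstream output would only be undone (at a memory cost) when the
    shuffle task fuses the slices back together. Overriding the target
    to infinity disables slicing in that case.
    """
    if fused_with_all_to_all:
        return math.inf
    return map_target_bytes


# A standalone map op keeps its configured target; a fused one never slices.
print(effective_target_block_size(128 * 1024 * 1024, False))  # 134217728
print(math.isinf(effective_target_block_size(128 * 1024 * 1024, True)))  # True
```

The point of the override is that producing slices which are immediately re-fused buys nothing and costs memory, so the cheapest fix is to not slice at all on that path.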
stephanie-wang
added a commit
to stephanie-wang/ray
that referenced
this pull request
Oct 23, 2023
ray-project#40248 changed output block creation so that when a task produces its output blocks, it tries to slice them before yielding to respect the target block size. Unfortunately, all-to-all ops currently don't support dynamic block splitting, so if we fuse an upstream map iterator with an all-to-all op, the all-to-all task has to fuse all of the sliced blocks back together again. This significantly increases memory usage.

This PR avoids the issue by overriding the upstream map iterator's target block size to infinity when it is fused with an all-to-all op. It also adds a logger warning explaining how to work around the problem.

Related issue number: Closes ray-project#40518.

Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu>
vitsai
pushed a commit
that referenced
this pull request
Oct 24, 2023
#40248 changed output block creation so that when a task produces its output blocks, it tries to slice them before yielding to respect the target block size. Unfortunately, all-to-all ops currently don't support dynamic block splitting, so if we fuse an upstream map iterator with an all-to-all op, the all-to-all task has to fuse all of the sliced blocks back together again. This significantly increases memory usage.

This PR avoids the issue by overriding the upstream map iterator's target block size to infinity when it is fused with an all-to-all op. It also adds a logger warning explaining how to work around the problem.

Related issue number: Closes #40518.

Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu>
This was referenced Oct 28, 2023
can-anyscale
pushed a commit
that referenced
this pull request
Oct 30, 2023
This addresses #40759 and #38400 for the 2.8 release branch. Either this change or reverting #40248 appears to fix #40759, but the root cause has not been identified yet. For #38400, we will merge a longer-term fix to master for 2.9. This PR should be safe since it reverts the Data block size back to the 2.7 default.

Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu>
This was referenced Oct 31, 2023
stephanie-wang
added a commit
that referenced
this pull request
Nov 1, 2023
With #40248, block sizes are now respected. This increases the default shuffle block size to 1GiB, which restores the previous behavior in the release test dataset_shuffle_sort_1tb. This may increase worker heap memory pressure during shuffle operations, but that can be resolved by overriding DataContext.

Related issue number: Closes #38400.

Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu>
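The DataContext override mentioned there is a one-line configuration change. A hedged sketch follows; the attribute name target_shuffle_max_block_size is an assumption based on the Ray version around this PR, so check your version's ray.data.DataContext for the exact field:

```python
import ray

# Sketch, not a definitive recipe: lower the shuffle block size back down
# if shuffle tasks hit worker heap memory pressure with the 1GiB default.
ctx = ray.data.DataContext.get_current()
ctx.target_shuffle_max_block_size = 128 * 1024 * 1024  # 128 MiB
```

Because DataContext is process-wide, this must run before the dataset operations whose shuffle behavior it should affect.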
ujjawal-khare
pushed a commit
to ujjawal-khare-27/ray
that referenced
this pull request
Nov 29, 2023
With ray-project#40248, block sizes are now respected. This increases the default shuffle block size to 1GiB, which restores the previous behavior in the release test dataset_shuffle_sort_1tb. This may increase worker heap memory pressure during shuffle operations, but that can be resolved by overriding DataContext.

Related issue number: Closes ray-project#38400.

Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu>
Why are these changes needed?
This slices a task's output blocks to ensure that we respect the target max block size. This can cause a performance penalty for cases where the batch size is misaligned with the output block size, but this is necessary for stability and can be optimized later (by auto-choosing a better batch size).
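The slicing behavior can be illustrated with a toy sketch. It is row-count-based for simplicity, and the helper name is hypothetical; the real implementation slices by estimated bytes through Ray Data's block APIs:

```python
from typing import Dict, Iterator, List


def slice_output_block(
    rows: List[Dict], target_max_rows: int
) -> Iterator[List[Dict]]:
    """Yield slices of a task's output block so no slice exceeds the target.

    Ray Data's actual code measures block size in bytes; rows stand in
    for bytes here to keep the sketch self-contained.
    """
    for start in range(0, len(rows), target_max_rows):
        yield rows[start : start + target_max_rows]


# A 10-row output block with a target of 4 rows yields blocks of 4, 4, and 2.
block = [{"id": i} for i in range(10)]
sizes = [len(s) for s in slice_output_block(block, target_max_rows=4)]
print(sizes)  # [4, 4, 2]
```

The trailing undersized slice is the misalignment penalty the description mentions: when the batch size doesn't divide evenly into the target block size, some output blocks come out smaller than the target.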
Related issue number
#40026.
Checks
- I've signed off every commit (git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- If I've added a new method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.