[Data] Streaming Partition enforce row_num per block #57984
Merged: raulchen merged 62 commits into ray-project:master from owenowenisme:data/use-map-op-for-streaming-repartition on Nov 14, 2025
Changes from 8 commits
Commits
7e39adb update (owenowenisme)
ad81683 Merge remote-tracking branch 'upstream/master' into data/use-map-op-f… (owenowenisme)
02a27c5 rename _schedule_task_input (owenowenisme)
1a28369 update (owenowenisme)
344c0c7 make a task can output multiple blocks (owenowenisme)
8c63cd0 remove new term chunk (owenowenisme)
6f47d8a resolve comment (owenowenisme)
6796f01 merge _build_task_from_single_block_full_blocks & _build_single_outpu… (owenowenisme)
0c2e9a1 Merge branch 'master' into data/use-map-op-for-streaming-repartition (owenowenisme)
ff71cbe Merge remote-tracking branch 'upstream/master' into data/use-map-op-f… (owenowenisme)
05c5e61 Merge remote-tracking branch 'upstream/master' into data/use-map-op-f… (owenowenisme)
13d2280 remove _TaskInput (owenowenisme)
a60fbe6 make StreamingRepartition default preserve order (owenowenisme)
25be194 rename slice_rows num_rows_in_slice (owenowenisme)
3c0b02e unify the interface for block_ref_bundler (owenowenisme)
e09ce6d make enforce_target_num_rows_per_block True (owenowenisme)
4612f71 fix (owenowenisme)
944f920 Merge remote-tracking branch 'upstream/master' into data/use-map-op-f… (owenowenisme)
7fd7c49 rename StreamingRepartitionTaskBuilder (owenowenisme)
ba7f5f0 remove set_block_ref_bundler and put the logic into constructor (owenowenisme)
37a51a4 keep track on ref bundle fully consumed (owenowenisme)
1bdf310 remove input_bundle from task_context (owenowenisme)
f1dc0f7 Merge branch 'master' into data/use-map-op-for-streaming-repartition (owenowenisme)
c5f3532 Merge remote-tracking branch 'upstream/master' into data/use-map-op-f… (owenowenisme)
7577cef remove num_rows_in_slice property (owenowenisme)
4e83486 rename task_kwargs into task_kwargs_for_bundle (owenowenisme)
8c2e4cf make slice explicit in map op (owenowenisme)
d51180a update (owenowenisme)
75e3435 update (owenowenisme)
6231a20 remove preserve order or streaming repartition (owenowenisme)
98493fa add class description for BaseRefBundler (owenowenisme)
3106805 remove streaming_repartition_block_fn (owenowenisme)
f0c0bd3 make block slice in ref bundle (owenowenisme)
59d419c update (owenowenisme)
130fd44 remove block_index & output_index from BlockSlice (owenowenisme)
ea7a3dc update (owenowenisme)
0ab949e added slice row and bytes calculation in ref bundle with unit test (owenowenisme)
96ffe1a update (owenowenisme)
e1bbb0e add ref bundler func and unit test (owenowenisme)
6e17feb refactor (owenowenisme)
6e1ed9f update to track consumed input ref (owenowenisme)
a08fe95 update (owenowenisme)
ca02719 refine (owenowenisme)
bb8fdfd Merge branch 'master' into data/use-map-op-for-streaming-repartition (owenowenisme)
d60d898 Merge remote-tracking branch 'upstream/master' into data/use-map-op-f… (owenowenisme)
e14c832 rename consumed_bundle to sliced bundle (owenowenisme)
fd15bff use len function in sr num_blocks (owenowenisme)
75bed79 refactor _try_build_ready_bundle (owenowenisme)
98aa170 make slice a method of Refbundle (owenowenisme)
e23874e make merge_ref_bundles classmethod of ref_bundle (owenowenisme)
9bf1a09 update (owenowenisme)
c22c97f use None to represent full block (owenowenisme)
c9b66e0 add more test for test_slice_ref_bundle_invalid_rows (owenowenisme)
dfe11ab add __str__ method (owenowenisme)
82ab74b check 0 of num_rows (owenowenisme)
589b97f Merge branch 'master' into data/use-map-op-for-streaming-repartition (owenowenisme)
5b23b18 fix logic of row_need_from_last_block and add test for ref_bundle met… (owenowenisme)
970b551 add bundler testing (owenowenisme)
a4390f9 add assertion to rows_needed_from_last_bundle (owenowenisme)
042312f update (owenowenisme)
2868f9a Merge branch 'master' into data/use-map-op-for-streaming-repartition (owenowenisme)
2c5cc47 make test streaming repartition bundler unit test (owenowenisme)
    @@ -344,19 +344,26 @@ class StreamingRepartition(AbstractMap):
            Args:
                target_num_rows_per_block: The target number of rows per block granularity for
                    streaming repartition.
                enforce_target_num_rows_per_block: Whether to enforce the target number of rows per block. Default to False.
            """

        def __init__(
            self,
            input_op: LogicalOperator,
            target_num_rows_per_block: int,
            enforce_target_num_rows_per_block: bool = False,
Suggested change:

    -        enforce_target_num_rows_per_block: bool = False,
    +        strict_target_num_rows_per_block: bool = False,

Member
Author
Removed strict_target_num_rows_per_block since we always enable exact-size blocks.
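
For readers unfamiliar with what "exact-size blocks" means here, the following is a minimal sketch in plain Python, not Ray's implementation. It assumes blocks are simple lists of rows, and the function name `repartition_exact` is hypothetical. It re-cuts a stream of unevenly sized blocks so that every output block has exactly the target number of rows, with only the final block allowed to be smaller.

```python
from typing import Iterable, Iterator, List


def repartition_exact(
    blocks: Iterable[List[int]], target_num_rows: int
) -> Iterator[List[int]]:
    """Yield blocks of exactly `target_num_rows` rows, preserving row order.

    Hypothetical helper for illustration only. Only the last yielded block
    may contain fewer than `target_num_rows` rows.
    """
    if target_num_rows <= 0:
        raise ValueError("target_num_rows must be positive")

    pending: List[int] = []  # rows carried over from earlier input blocks
    for block in blocks:
        pending.extend(block)
        # Emit as many full-size blocks as the buffered rows allow.
        while len(pending) >= target_num_rows:
            yield pending[:target_num_rows]
            pending = pending[target_num_rows:]

    # Flush the remainder, if any, as a final smaller block.
    if pending:
        yield pending


input_blocks = [[0, 1, 2], [3, 4, 5, 6, 7], [8, 9]]
output_blocks = list(repartition_exact(input_blocks, target_num_rows=4))
sizes = [len(b) for b in output_blocks]
print(sizes)  # [4, 4, 2]
```

The sketch copies rows into a buffer for simplicity. The commit messages in this PR (for example "make slice a method of Refbundle") suggest the actual change works on slices of block references instead, which this example does not attempt to model.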