Closed
Labels
P1 (Issue that should be fixed within a few weeks) · bug (Something that is supposed to be working, but isn't) · data (Ray Data-related issues)
Description
What happened + What you expected to happen
Operator fusion can change the target block size of a Read op. This happens when there is a downstream AllToAll op, since AllToAll ops have a larger default target block size than Map ops. Currently there is a circular dependency between the optimizer rules:
- SplitReadOutputBlocks rule: Read op determines its split factor based on its target block size.
- OperatorFusion rule: When an AllToAll op gets fused with a Map op, the immediately upstream op inherits the larger target block size. This upstream op may be a Read op.
However, SplitReadOutputBlocks needs the inherited target block size, which is only set later by OperatorFusion. As a result, for all-to-all ops the read stage's computed parallelism may be higher than it should be.
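A minimal arithmetic sketch of the inflation, assuming the documented defaults of 128 MiB for DataContext.target_max_block_size and 1 GiB for DataContext.target_shuffle_max_block_size (the exact values and the ceiling-division split rule are assumptions for illustration):

```python
# Assumed defaults mirroring Ray Data's DataContext (illustrative only):
MAP_BLOCK_SIZE = 128 * 1024**2      # target_max_block_size, 128 MiB
SHUFFLE_BLOCK_SIZE = 1024**3        # target_shuffle_max_block_size, 1 GiB

dataset_bytes = 8 * 1024**3  # an 8 GiB dataset to read

def read_parallelism(target_block_size: int) -> int:
    """Blocks a Read op would split into for this target (ceiling division)."""
    return -(-dataset_bytes // target_block_size)

# SplitReadOutputBlocks runs first and only sees the Map-op default...
before_fusion = read_parallelism(MAP_BLOCK_SIZE)     # 64 blocks
# ...but after OperatorFusion the Read op should have inherited the
# shuffle target, which would produce far fewer, larger blocks.
after_fusion = read_parallelism(SHUFFLE_BLock_SIZE if False else SHUFFLE_BLOCK_SIZE)  # 8 blocks
```

The read stage thus ends up 8x more parallel than the fused plan intends, under these assumed sizes.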
Users can work around this issue by setting DataContext.target_max_block_size = DataContext.target_shuffle_max_block_size.
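The workaround can be applied as a configuration snippet before building the pipeline, assuming the Ray 2.7+ DataContext API:

```python
import ray
from ray.data import DataContext

# Workaround: make the Map-op target block size match the shuffle target,
# so the Read op's split factor comes out the same with or without fusion.
ctx = DataContext.get_current()
ctx.target_max_block_size = ctx.target_shuffle_max_block_size
```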
Versions / Dependencies
2.7+
Reproduction script
import ray

N = 100_000_000  # any N large enough that the Read op splits into multiple blocks
ray.data.range(N).random_shuffle()
Issue Severity
None