[Data] Fixing ReorderingBundleQueue handling of empty output sequences#60470

Merged
alexeykudinkin merged 14 commits into master from ak/reord-que-fix on Jan 28, 2026

Conversation

@alexeykudinkin
Contributor

@alexeykudinkin alexeykudinkin commented Jan 24, 2026

Description

This PR revisits `ReorderingBundleQueue`, moving pointer advancement out of `get_next_inner` and `finalize` and into `has_next`, so that the queue cannot get stuck regardless of the order in which operations are invoked.

Before this change, `ReorderingBundleQueue` could still get stuck on the sequence captured in `test_ordered_queue_getting_stuck`.

The queue is guaranteed to traverse all bundles as long as all keys are finalized (i.e., their tasks have finished).
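
A minimal sketch of the idea, under stated assumptions: `SketchReorderingQueue` and everything inside it are hypothetical names used for illustration, with plain strings standing in for bundles. It is not Ray's implementation. It only shows why advancing the pointer inside `has_next` lets the queue skip keys that were finalized without producing output.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Optional, Set


class SketchReorderingQueue:
    """Hypothetical, simplified stand-in; not Ray's ReorderingBundleQueue."""

    def __init__(self) -> None:
        self._inner: Dict[int, Deque[str]] = defaultdict(deque)
        self._finalized: Set[int] = set()
        self._current_key = 0

    def add(self, bundle: str, key: int) -> None:
        self._inner[key].append(bundle)

    def finalize(self, key: int) -> None:
        # Only records completion; the pointer is not moved here.
        self._finalized.add(key)

    def _advance(self) -> None:
        # Skip every key that is both drained and finalized.
        while (
            not self._inner[self._current_key]
            and self._current_key in self._finalized
        ):
            self._current_key += 1

    def has_next(self) -> bool:
        # All pointer movement happens here, so any caller that polls
        # has_next eventually reaches the next non-empty key.
        self._advance()
        return bool(self._inner[self._current_key])

    def peek_next(self) -> Optional[str]:
        return self._inner[self._current_key][0] if self.has_next() else None

    def get_next(self) -> str:
        if not self.has_next():
            raise ValueError("no bundle available for the current key")
        return self._inner[self._current_key].popleft()


queue = SketchReorderingQueue()
queue.add("b0", key=0)
queue.finalize(key=0)
queue.finalize(key=1)  # key 1 finished without producing any output
queue.add("b2", key=2)

drained = []
while queue.has_next():
    drained.append(queue.get_next())
```

In this sketch `finalize` never touches the pointer, so the order in which keys are finalized relative to reads does not matter; the empty key 1 is skipped the next time `has_next` is polled.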

@alexeykudinkin alexeykudinkin requested a review from a team as a code owner January 24, 2026 03:58
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request aims to fix an issue where _OrderedOutputQueue could get stuck by refactoring its state transition logic. My review identified a critical issue in the new implementation of _OrderedOutputQueue.has_next(). The updated logic can cause a crash by returning True when no output is available, which violates the has_next/get_next contract. I've provided a code suggestion to fix this bug while preserving the intended fix for the 'stuck' issue.

@ray-gardener ray-gardener bot added the data (Ray Data-related issues) label Jan 24, 2026
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
@alexeykudinkin alexeykudinkin changed the title from "[Data] Fixed _OrderedOutputQueue to avoid getting stuck" to "[Data] Fixed ReorderingBundleQueue to avoid getting stuck" Jan 27, 2026
@alexeykudinkin alexeykudinkin changed the title from "[Data] Fixed ReorderingBundleQueue to avoid getting stuck" to "[Data] Streamlining ReorderingBundleQueue" Jan 27, 2026
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Tidying up

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
@alexeykudinkin alexeykudinkin changed the title from "[Data] Streamlining ReorderingBundleQueue" to "[Data] Fixing ReorderingBundleQueue handling of empty output sequences" Jan 28, 2026
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Contributor

@iamjustinhsu iamjustinhsu left a comment

nice!!


@override
def peek_next(self) -> Optional[RefBundle]:
    return (
Contributor

Do we also need to rotate the pointer here?

Contributor Author

Good catch!

Comment on lines +226 to +228
# - `has_next` would return False, (_inner[_current_key] is empty)
# - `get_next` will never be invoked (b/c `has_next` returns false)
# - `finalize(key=1)` has already been invoked, no pointer advancement will happen
Contributor

Are these comments (and the ones above) needed? I think they might be too implementation-specific.

Contributor Author

  • This is a test, so extra verbosity is fine
  • I wanted to provide ample comments explaining how the queue would get stuck
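
For readers following along, here is a hedged reconstruction of one sequence consistent with the quoted test comments. `StuckProneQueue` is a hypothetical, simplified model of a design where the pointer moves only inside `get_next` and `finalize`, one key at a time. It is an assumption made for illustration and not the actual pre-fix code.

```python
from collections import defaultdict, deque


class StuckProneQueue:
    """Hypothetical model: pointer moves only in get_next and finalize."""

    def __init__(self) -> None:
        self._inner = defaultdict(deque)
        self._finalized = set()
        self._current_key = 0

    def add(self, bundle: str, key: int) -> None:
        self._inner[key].append(bundle)

    def _maybe_advance(self) -> None:
        # Single-step advancement: assumed here, and the source of the problem.
        if (
            not self._inner[self._current_key]
            and self._current_key in self._finalized
        ):
            self._current_key += 1

    def finalize(self, key: int) -> None:
        self._finalized.add(key)
        if key == self._current_key:
            self._maybe_advance()

    def has_next(self) -> bool:
        # Never moves the pointer in this model.
        return bool(self._inner[self._current_key])

    def get_next(self) -> str:
        bundle = self._inner[self._current_key].popleft()
        self._maybe_advance()
        return bundle


queue = StuckProneQueue()
queue.add("bundle0", key=0)
queue.add("bundle2", key=2)
queue.finalize(key=1)     # key 1 has no output; pointer is still at key 0
first = queue.get_next()  # key 0 is not finalized yet, so no advancement
queue.finalize(key=0)     # pointer moves 0 -> 1 and stops there
queue.finalize(key=2)

# Every key is finalized, yet the pointer rests on the empty key 1:
# has_next is False, get_next is never called, finalize(1) already ran.
stuck = not queue.has_next()
remaining = list(queue._inner[2])
```

In this model nothing can move the pointer past key 1 anymore, so `bundle2` is never emitted, which mirrors the three conditions listed in the test comments.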

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
@alexeykudinkin alexeykudinkin enabled auto-merge (squash) January 28, 2026 01:04
@github-actions github-actions bot added the go (add ONLY when ready to merge, run all tests) label Jan 28, 2026

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Comment on lines +235 to +236
assert queue.peek_next() is bundle2
assert queue.get_next() is bundle2
Contributor

why not always go into this case? (since target_op=="peek" also covers the "get" case)

Contributor Author

Because we want to check that get also advances the pointer.

@alexeykudinkin alexeykudinkin merged commit 0c18b72 into master Jan 28, 2026
7 of 8 checks passed
@alexeykudinkin alexeykudinkin deleted the ak/reord-que-fix branch January 28, 2026 01:52
jinbum-kim pushed a commit to jinbum-kim/ray that referenced this pull request Jan 29, 2026
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: jinbum-kim <jinbum9958@gmail.com>
limarkdcunha pushed a commit to limarkdcunha/ray that referenced this pull request Jan 29, 2026
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
400Ping pushed a commit to 400Ping/ray that referenced this pull request Feb 1, 2026
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: 400Ping <jiekaichang@apache.org>
ans9868 pushed a commit to ans9868/ray that referenced this pull request Feb 18, 2026
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Adel Nour <ans9868@nyu.edu>
peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: peterxcli <peterxcli@gmail.com>