
[Data] Fix operator name in metadata timeout warning and improve mess…#61401

Open
kaveti wants to merge 1 commit into ray-project:master from kaveti:fix/data-op-task-operator-name-warning


@kaveti kaveti commented Feb 28, 2026

[Data] Fix operator name in metadata timeout warning and improve message clarity

Pass the actual operator name to DataOpTask so the GetTimeoutError warning identifies the specific pipeline stage instead of always showing 'DataOpTask'. Also rewrites the message to include the timeout duration and actionable causes.

Fixes #61334
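For illustration, a minimal sketch of the idea behind this change, not the actual diff. `SketchOpTask`, its `timeout_warning` method, the operator name `Map(my_fn)`, and the placeholder causes are all hypothetical stand-ins; the real `DataOpTask` constructor and warning text may differ.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SketchOpTask:
    """Hypothetical stand-in for DataOpTask; not Ray's actual class."""

    # If the caller never passes a name, every warning falls back to the
    # task class name and cannot identify the pipeline stage.
    operator_name: str = "DataOpTask"

    def timeout_warning(self, timeout_s: float, causes: Sequence[str]) -> str:
        # Name the stage, state how long we waited, then list the causes.
        cause_list = "; ".join(causes)
        return (
            f"Operator '{self.operator_name}' timed out after {timeout_s:g}s "
            f"waiting for metadata. Possible causes: {cause_list}."
        )


# Placeholder causes for illustration only; not the wording used in this PR.
causes = ["example cause A", "example cause B"]

# Without an explicit name the warning is generic.
before = SketchOpTask().timeout_warning(1.0, causes)
# With the operator name passed through, the warning points at the stage.
after = SketchOpTask(operator_name="Map(my_fn)").timeout_warning(1.0, causes)

print(before)
print(after)
```

The default value mirrors the old behavior described above, where the warning always showed 'DataOpTask'; passing the name explicitly is what lets the message point at a specific stage.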


@kaveti kaveti requested a review from a team as a code owner February 28, 2026 18:13

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request improves the warning message for metadata timeouts in Ray Data. By passing the operator name to DataOpTask, the warning now clearly identifies which pipeline stage is affected, which is a great improvement for debuggability. The warning message itself has been rewritten to be more informative, including the timeout duration and potential causes. The implementation is clean and correctly applied to MapOperator and HashShuffleOperator. The changes look good and improve the user experience.

…age clarity

Pass the actual operator name to DataOpTask so the GetTimeoutError warning
identifies the specific pipeline stage instead of always showing 'DataOpTask'.
Also rewrites the message to include the timeout duration and actionable causes.

Fixes ray-project#61334

Signed-off-by: rkaveti <kavetiraviteja1992@gmail.com>
@kaveti kaveti force-pushed the fix/data-op-task-operator-name-warning branch from 1ef62cd to 507e5ec Compare February 28, 2026 18:19
@ray-gardener ray-gardener bot added the data (Ray Data-related issues) and community-contribution (Contributed by the community) labels Feb 28, 2026

Labels

community-contribution (Contributed by the community)
data (Ray Data-related issues)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Data] Fix unclear metadata warning and incorrect operator name logging
Ray fails to serialize self-reference objects

1 participant