[train][tune] Fix LightGBM v2 callbacks for Tune only usage #57042
Merged
justinvyu merged 7 commits into ray-project:master on Oct 6, 2025
Signed-off-by: Lehui Liu <lehui@anyscale.com>
Code Review
This pull request refactors the LightGBM callbacks to decouple Ray Tune and Ray Train usage, which is a great improvement. A base RayReportCallback is introduced with specific implementations for Tune and Train. My review focuses on improving the new abstractions and reducing code redundancy. I've pointed out an incorrect method signature in the new abstract base class and suggested removing redundant __init__ methods in the subclasses to make the code cleaner and more maintainable.
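The two review points can be illustrated with a minimal sketch. It does not reproduce the Ray source: `ReportCallbackBase`, `TrainStyleCallback`, `TuneStyleCallback`, and the `_report` hook are hypothetical names, and the reporting step is replaced by a list append so the snippet runs without Ray or LightGBM. It shows an abstract method whose signature the subclasses match exactly, and subclasses that inherit `__init__` instead of redefining it.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Tuple

# Stand-in for the real reporting backends; records (backend, metrics) pairs.
REPORTED: List[Tuple[str, Dict[str, Any]]] = []


class ReportCallbackBase(ABC):
    """Hypothetical shared base: owns configuration and the call flow."""

    def __init__(self, frequency: int = 1) -> None:
        self.frequency = frequency

    @abstractmethod
    def _report(self, metrics: Dict[str, Any]) -> None:
        """Send metrics to the backend. Subclasses keep this exact signature."""

    def __call__(self, iteration: int, metrics: Dict[str, Any]) -> None:
        # Report every `frequency` iterations (iterations are 0-indexed here).
        if (iteration + 1) % self.frequency == 0:
            self._report(metrics)


class TrainStyleCallback(ReportCallbackBase):
    # No __init__ override: the inherited constructor is sufficient.
    def _report(self, metrics: Dict[str, Any]) -> None:
        REPORTED.append(("train", metrics))


class TuneStyleCallback(ReportCallbackBase):
    # Same signature as the abstract method, different backend.
    def _report(self, metrics: Dict[str, Any]) -> None:
        REPORTED.append(("tune", metrics))


if __name__ == "__main__":
    TuneStyleCallback(frequency=2)(1, {"l2": 0.5})
    print(REPORTED)
```

Keeping the constructor in the base class means a new argument only has to be added in one place, which is the maintainability concern the review raises.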
Signed-off-by: Lehui Liu <lehui@anyscale.com>
Signed-off-by: Lehui Liu <lehui@anyscale.com>
justinvyu approved these changes on Oct 3, 2025
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Lehui Liu <lehui@anyscale.com>
dstrodtman pushed a commit that referenced this pull request on Oct 6, 2025

1. In the Ray Train [revamp REP](https://github.com/ray-project/enhancements/blob/main/reps/2024-10-18-train-tune-api-revamp/2024-10-18-train-tune-api-revamp.md#tune-only-usage), we decouple the Ray Train / Ray Tune dependency.
2. Hence, when `RayTrainReportCallback` reports metrics or a checkpoint, the v2 context API throws a `RuntimeError` that `TrainFnUtils` is not found.
3. This PR refactors the callback so that both variants inherit from the same base class, using `ray.tune.report` for Tune-only usage and `ray.train.report` for `RayTrainReportCallback`, based on the migration example [here](https://github.com/ray-project/enhancements/blob/main/reps/2024-10-18-train-tune-api-revamp/2024-10-18-train-tune-api-revamp.md#tune-only-usage), to further differentiate these callbacks.

---------

Signed-off-by: Lehui Liu <lehui@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Co-authored-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
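The failure in point 2 and the fix in point 3 can be sketched without Ray installed. Everything below is a hypothetical stand-in: `_train_context_active`, `train_report`, and `tune_report` only imitate the behavior this description attributes to `ray.train.report` and `ray.tune.report`, and the error text is made up for the sketch. It is not Ray's implementation.

```python
from typing import Any, Callable, Dict, List

# Tune-only usage: assume no Ray Train worker context has been set up.
_train_context_active = False
TUNE_RESULTS: List[Dict[str, Any]] = []


def train_report(metrics: Dict[str, Any]) -> None:
    """Stand-in for a Train-side report that needs a Train worker context."""
    if not _train_context_active:
        raise RuntimeError("TrainFnUtils not found (stand-in error message)")


def tune_report(metrics: Dict[str, Any]) -> None:
    """Stand-in for a Tune-side report that works inside a Tune trial."""
    TUNE_RESULTS.append(metrics)


def run_trial(report: Callable[[Dict[str, Any]], None]) -> str:
    """Pretend to finish one boosting round and report its metrics."""
    try:
        report({"binary_error": 0.1})
        return "ok"
    except RuntimeError as exc:
        return f"failed: {exc}"


if __name__ == "__main__":
    print(run_trial(train_report))  # Train-style reporting in a Tune-only run
    print(run_trial(tune_report))   # Tune-style reporting in a Tune-only run
```

In this sketch the only difference between the failing and the working run is which report function the callback is wired to, which is the distinction the refactor encodes in separate callback classes.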
justinvyu added a commit that referenced this pull request on Oct 16, 2025

Ports over the remaining unit tests that were marked as TODOs from this series of PRs: #57534, #57256, #56868, #56820, #56816. Notably:

* `test_new_dataset_config -> test_data_integration`
* `test_backend -> test_torch_trainer, test_worker_group`
* `test_gpu -> test_torch_gpu`

This PR also finishes migrating the Tune LightGBM/Keras examples which were unblocked by #57042 and #57121.

---------

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
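Why are these changes needed?

1. In the Ray Train [revamp REP](https://github.com/ray-project/enhancements/blob/main/reps/2024-10-18-train-tune-api-revamp/2024-10-18-train-tune-api-revamp.md#tune-only-usage), the Ray Train / Ray Tune dependency is decoupled.
2. Hence, when `RayTrainReportCallback` reports metrics or a checkpoint, the v2 context API throws a `RuntimeError` that `TrainFnUtils` is not found.
3. This PR refactors the callback so that both variants inherit from the same base class, using `ray.tune.report` for Tune-only usage and `ray.train.report` for `RayTrainReportCallback`, based on the migration example [here](https://github.com/ray-project/enhancements/blob/main/reps/2024-10-18-train-tune-api-revamp/2024-10-18-train-tune-api-revamp.md#tune-only-usage), to further differentiate these callbacks.

Related issue number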
Note
Refactors LightGBM callback into a shared base and implements separate Train and Tune callbacks with correct checkpointing/reporting for each API.

* `RayReportCallback` with abstract methods for checkpointing and reporting.
* `RayTrainReportCallback` now subclasses `RayReportCallback`, using `ray.train.report` and rank-aware checkpointing.
* `TuneReportCheckpointCallback` subclass using `ray.tune.report` and `tune.Checkpoint` (no rank check), replacing prior alias.

Written by Cursor Bugbot for commit 8e6abfb.
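The checkpointing difference in this summary can be sketched as follows. The names (`BaseSketch`, `TrainSketch`, `TuneSketch`, `world_rank`, `_should_checkpoint`) are hypothetical, and the checkpoint is just a temporary directory holding a text file. That only world rank 0 writes the checkpoint in the Train variant is an assumption of this sketch about what "rank-aware" means, not something the summary spells out.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Optional


class BaseSketch(ABC):
    """Hypothetical base: saves a model file only when the subclass allows it."""

    def __init__(self, filename: str = "model.txt") -> None:
        self.filename = filename

    @abstractmethod
    def _should_checkpoint(self) -> bool:
        """Decide whether this process writes a checkpoint."""

    def save(self, model_text: str) -> Optional[Path]:
        if not self._should_checkpoint():
            return None
        # Write the serialized model into a fresh temporary directory.
        ckpt_dir = Path(tempfile.mkdtemp())
        (ckpt_dir / self.filename).write_text(model_text)
        return ckpt_dir


class TrainSketch(BaseSketch):
    """Rank-aware: assumes only world rank 0 writes (an assumption of this sketch)."""

    def __init__(self, world_rank: int, filename: str = "model.txt") -> None:
        super().__init__(filename)
        self.world_rank = world_rank

    def _should_checkpoint(self) -> bool:
        return self.world_rank == 0


class TuneSketch(BaseSketch):
    """No rank check: the trial is treated as a single process."""

    def _should_checkpoint(self) -> bool:
        return True


if __name__ == "__main__":
    print(TrainSketch(world_rank=1).save("tree=0"))  # non-zero rank skips saving
    print(TuneSketch().save("tree=0"))               # always saves
```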