
[tune] Enable Train v2 in doc examples #56820

Merged
justinvyu merged 11 commits into ray-project:master from justinvyu:tune_doc_enable_v2 on Sep 24, 2025

Conversation

@justinvyu justinvyu commented Sep 23, 2025

Summary

Flip the flag for Tune doctest CI in preparation for turning on Train V2 by default. This changes no behavior; it asserts that all ray.train -> ray.tune updates have happened.

Note that a few tests have been left behind due to Tune lightgbm and Keras callbacks not being updated yet. We need to do the equivalent of this PR: #54787

  • lightgbm_example
  • lightgbm_example_cv
  • tune_mnist_keras

Deletes horovod_simple.ipynb example because we don't support HorovodTrainer anymore.

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
@justinvyu justinvyu requested review from a team as code owners September 23, 2025 02:30
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request focuses on updating documentation examples to enable and use the new Ray Tune v2 API. The changes include migrating from older APIs like session.report to tune.report, updating checkpointing logic, and refactoring examples to use tune.Tuner instead of older patterns. Overall, the changes are consistent and align with the goal of adopting the v2 API. However, I've found one critical issue where a test is being enabled for a notebook file that is being deleted in this same PR, which will break the build.
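The renames mentioned in this review can be checked mechanically. The following is a minimal sketch, not part of the PR, that scans Python source for calls that still use the old names. It covers only the two renames visible in this conversation (session.report -> tune.report and train.get_checkpoint -> tune.get_checkpoint); the function name `find_stale_calls` and the `RENAMES` table are hypothetical, and the sketch uses only the standard library, so it does not import Ray.

```python
import ast

# Old call -> replacement, limited to the two renames visible in this PR.
RENAMES = {
    "session.report": "tune.report",
    "train.get_checkpoint": "tune.get_checkpoint",
}


def find_stale_calls(source: str) -> list[tuple[int, str, str]]:
    """Return (line, old_call, suggested_call) for each stale call in `source`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Only match plain `name.attr(...)` calls such as `session.report(...)`.
        if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
            dotted = f"{func.value.id}.{func.attr}"
            if dotted in RENAMES:
                hits.append((node.lineno, dotted, RENAMES[dotted]))
    return sorted(hits)


# Hypothetical pre-migration snippet; it is parsed, never executed.
example = """
def trainable(config):
    checkpoint = train.get_checkpoint()
    session.report({"loss": 0.1})
"""

for line, old, new in find_stale_calls(example):
    print(f"line {line}: {old} -> {new}")
```

Because the check matches on the literal prefix (`session`, `train`), it would miss calls made through an alias such as `from ray import train as t`; it is meant as a quick lint, not a complete migration tool.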

cursor[bot]

This comment was marked as outdated.

@ray-gardener ray-gardener bot added the tune Tune-related issues label Sep 23, 2025
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
@justinvyu justinvyu changed the title [tune] Enable v2 in doc examples [tune] Enable Train v2 in doc examples Sep 23, 2025
@matthewdeng matthewdeng left a comment

exclude = [
"pbt_ppo_example.ipynb",
"tune-xgboost.ipynb",
"lightgbm_example.ipynb", # TODO: Uncomment after fixing Tune lightgbm callback.
Can you make sure this is tracked somewhere?

Can you include in PR description why we're removing Horovod here? Tune should still work directly with Horovod right? Is this just general cleanup since Horovod doesn't support py312?

Contributor Author

This uses HorovodTrainer, which we don't have anymore.

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
  " }\n",
  "\n",
- " checkpoint = train.get_checkpoint()\n",
+ " checkpoint = tune.get_checkpoint()\n",
Bug: Missing Import Causes Checkpoint Loading Failure

The tune.get_checkpoint() call was introduced without a visible import for the tune module. This likely causes a NameError, preventing the notebook from loading checkpoints and resuming training.
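Whether the import is really missing depends on the rest of the notebook, which is not shown here. As a rough way to check a claim like this, the sketch below parses a notebook's code cells and reports whether a module name is used as `name.<attr>` without being imported anywhere in the notebook. The helper names `imported_names` and `uses_without_import` are hypothetical, the sketch assumes the standard notebook JSON layout (a `cells` list with `cell_type` and `source` fields), and it uses only the standard library.

```python
import ast
import json


def imported_names(tree: ast.AST) -> set[str]:
    """Collect every name bound by an import statement in `tree`."""
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # `import a.b` binds `a`; `import a.b as c` binds `c`.
                names.add(alias.asname or alias.name.split(".")[0])
    return names


def uses_without_import(notebook_json: str, name: str) -> bool:
    """True if code cells reference `name.<attr>` but never import `name`."""
    notebook = json.loads(notebook_json)
    lines = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        text = source if isinstance(source, str) else "".join(source)
        # Drop IPython magics and shell escapes, which are not valid Python.
        lines += [
            ln for ln in text.splitlines()
            if not ln.lstrip().startswith(("%", "!"))
        ]
    tree = ast.parse("\n".join(lines))
    used = any(
        isinstance(node, ast.Attribute)
        and isinstance(node.value, ast.Name)
        and node.value.id == name
        for node in ast.walk(tree)
    )
    return used and name not in imported_names(tree)


# Hypothetical one-cell notebook mirroring the flagged line.
cell = ["def trainable(config):\n", "    checkpoint = tune.get_checkpoint()\n"]
broken = json.dumps({"cells": [{"cell_type": "code", "source": cell}]})
fixed = json.dumps(
    {"cells": [{"cell_type": "code", "source": ["from ray import tune\n"] + cell}]}
)

print(uses_without_import(broken, "tune"))
print(uses_without_import(fixed, "tune"))
```

The check treats the whole notebook as one module, so an import in any code cell counts, which matches how a notebook behaves when its cells are run top to bottom.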

@justinvyu justinvyu enabled auto-merge (squash) September 24, 2025 00:42
@github-actions github-actions bot added the go add ONLY when ready to merge, run all tests label Sep 24, 2025
@justinvyu justinvyu merged commit a7cd10b into ray-project:master Sep 24, 2025
8 checks passed
@justinvyu justinvyu deleted the tune_doc_enable_v2 branch September 24, 2025 02:15
marcostephan pushed a commit to marcostephan/ray that referenced this pull request Sep 24, 2025
Signed-off-by: Marco Stephan <marco@magic.dev>
elliot-barn pushed a commit that referenced this pull request Sep 27, 2025
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
dstrodtman pushed a commit to dstrodtman/ray that referenced this pull request Oct 6, 2025
Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
justinvyu added a commit that referenced this pull request Oct 16, 2025
Ports over the remaining unit tests that were marked as TODOs from this
series of PRs: #57534, #57256, #56868, #56820, #56816.

Notably:
* `test_new_dataset_config -> test_data_integration`
* `test_backend -> test_torch_trainer, test_worker_group`
* `test_gpu -> test_torch_gpu`

This PR also finishes migrating the Tune LightGBM/Keras examples which
were unblocked by #57042 and
#57121.

---------

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
justinyeh1995 pushed a commit to justinyeh1995/ray that referenced this pull request Oct 20, 2025
justinyeh1995 pushed a commit to justinyeh1995/ray that referenced this pull request Oct 20, 2025
xinyuangui2 pushed a commit to xinyuangui2/ray that referenced this pull request Oct 22, 2025
Signed-off-by: xgui <xgui@anyscale.com>
elliot-barn pushed a commit that referenced this pull request Oct 23, 2025
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
Aydin-ab pushed a commit to Aydin-ab/ray-aydin that referenced this pull request Nov 19, 2025
Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Future-Outlier pushed a commit to Future-Outlier/ray that referenced this pull request Dec 7, 2025
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Future-Outlier pushed a commit to Future-Outlier/ray that referenced this pull request Dec 7, 2025
Signed-off-by: Future-Outlier <eric901201@gmail.com>