[doc][train] Recommend tree_learner="data_parallel" in examples to enable distributed lightgbm training#58709
Conversation
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Code Review
This pull request is a valuable improvement for users of LightGBM with Ray Train. It correctly identifies a common pitfall where users may not actually be training in a distributed fashion, and addresses it by updating the documentation, examples, and the legacy trainer implementation to set tree_learner="data_parallel" and pre_partition=True. The changes are consistent and well executed. I have one minor suggestion: add a comment to the implementation explaining why these default parameters are set, to improve long-term maintainability.
config.setdefault("tree_learner", "data_parallel")
config.setdefault("pre_partition", True)
It's great that you're setting sensible defaults for distributed training. To improve code clarity and maintainability, consider adding a comment explaining why these specific default values were chosen. This will help future developers understand the reasoning behind these settings.
# Set default parameters for distributed training.
# `tree_learner="data_parallel"` enables data-parallel training.
# `pre_partition=True` is needed since the data is sharded by Ray Data.
config.setdefault("tree_learner", "data_parallel")
config.setdefault("pre_partition", True)
Description
The default is tree_learner="serial", which trains a separate model per worker. Users should set tree_learner="data_parallel" to configure LightGBM to train a single model across all of the dataset shards.
pre_partition=True should also be set when using Ray Data to shard the dataset.
Additional information
See here: https://lightgbm.readthedocs.io/en/stable/Parallel-Learning-Guide.html
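The effect of the `setdefault` calls in this PR can be sketched in plain Python, with no Ray or LightGBM dependency. The helper name `apply_distributed_defaults` below is hypothetical; the two `setdefault` calls mirror the ones added by this PR, and the key property they rely on is that `dict.setdefault` only fills in a key the user has not set.

```python
def apply_distributed_defaults(config: dict) -> dict:
    # Hypothetical helper mirroring the PR's setdefault calls.
    # `tree_learner="data_parallel"` enables data-parallel training
    # (the LightGBM default, "serial", trains a separate model per worker).
    config.setdefault("tree_learner", "data_parallel")
    # `pre_partition=True` tells LightGBM the data is already sharded,
    # which is the case when Ray Data partitions the dataset.
    config.setdefault("pre_partition", True)
    return config

# User leaves both keys unset: the distributed defaults are applied.
print(apply_distributed_defaults({"objective": "binary"}))

# User explicitly chooses another learner: setdefault preserves it.
print(apply_distributed_defaults({"tree_learner": "voting_parallel"}))
```

Because `setdefault` never overwrites an existing key, users who deliberately pick a different `tree_learner` (for example, `"voting_parallel"`) are not affected by these defaults.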