[RLlib] Revert PPO back to old API stack (by default). New stack and PPO not ready yet on several features.#40706
Conversation
| "but have not enabled the new API stack. To enable it, call " | ||
| "`config.experimental(_enable_new_api_stack=True)`." | ||
| ) | ||
| # LR-schedule checking. |
There was a problem hiding this comment.
Moved here for better overview.
|
|
||
| @override(MARWILConfig) | ||
| def validate(self) -> None: | ||
| # Can not use Tf with learner api. |
There was a problem hiding this comment.
This is already checked in validate(). We should never(!) automatically change properties inside AlgorithmConfig (unless private ones that are covered by (public) @properties).
| # `self.rl_module()` | ||
| self.rl_module_spec = None | ||
| self._enable_rl_module_api = False | ||
| self._rl_module_spec = None |
There was a problem hiding this comment.
Made this private (plus a @Property for rl_module_spec). We should never automatically (or inside validate()) set any properties. This now makes a lot of tests better as they don't require to magically wait for things to change under the hood after calling validate().
…rt_ppo_back_to_old_stack_by_default
…rt_ppo_back_to_old_stack_by_default
| dashboard. If you're seeing that the object store is filling up, | ||
| turn down the number of remote requests in flight, or enable compression | ||
| in your experiment of timesteps. | ||
| _enable_learner_api: Whether to enable the LearnerGroup and Learner |
There was a problem hiding this comment.
nit: don't remove the doc for it. Just add that it has been replaced with _enable_new_stack_api
I see that you throw the deprecation warning right away.
…PPO not ready yet on several features. (ray-project#40706)
_enable_rl_module_apiand_enable_learner_apiinto a single_enable_new_api_stacksetting to remove confusion. These two settings already had to be either both switch on or both switched off anyways.Why are these changes needed?
Related issue number
Checks
git commit -s) in this PR.scripts/format.shto lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/under thecorresponding
.rstfile.