[RLlib] Revert PPO back to old API stack (by default). New stack and PPO not ready yet on several features. by sven1977 · Pull Request #40706 · ray-project/ray

sven1977 · 2023-10-26T13:07:41Z

Revert PPO back to old API stack (by default).

PPO on the new stack does NOT yet support several features, including LSTMs, disabling exploration (e.g. on the eval workers), attention nets, and the trajectory view API.
We will re-activate PPO on the new stack by default once it has been fully moved to the EnvRunner APIs and supports multi-agent, connectors, and all of the currently missing features mentioned above.

Renamed config args: merged _enable_rl_module_api and _enable_learner_api into a single _enable_new_api_stack setting to remove confusion. These two settings already had to be either both switched on or both switched off anyway.
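A minimal sketch of the idea behind the rename, assuming a toy config class: `ToyConfig` and its error messages are hypothetical and are not RLlib's implementation. Only the three setting names and the `experimental(_enable_new_api_stack=True)` call shape come from this PR. The point is that a single flag cannot end up in the half-enabled state that two separate flags allowed.

```python
class ToyConfig:
    """Hypothetical stand-in for an algorithm config (not RLlib's class)."""

    # Old setting names mapped to the single flag that replaces them.
    _RENAMED = {
        "_enable_rl_module_api": "_enable_new_api_stack",
        "_enable_learner_api": "_enable_new_api_stack",
    }

    def __init__(self):
        self._enable_new_api_stack = False

    def experimental(self, **kwargs):
        for name, value in kwargs.items():
            if name in self._RENAMED:
                # Fail right away instead of silently accepting the old name.
                raise ValueError(
                    f"`{name}` has been replaced by `{self._RENAMED[name]}`."
                )
            if name != "_enable_new_api_stack":
                raise TypeError(f"Unknown setting: {name!r}")
            self._enable_new_api_stack = bool(value)
        return self  # Return self so calls can be chained.


config = ToyConfig().experimental(_enable_new_api_stack=True)
```

Whether the real config raises an error or only emits a warning for the old names is not shown in this description, so the `ValueError` above is an assumption made for illustration.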

Why are these changes needed?

Related issue number

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

Signed-off-by: sven1977 <svenmika1977@gmail.com>

Signed-off-by: Sven Mika <svenmika1977@gmail.com>

sven1977 · 2023-10-26T14:29:52Z

rllib/algorithms/algorithm_config.py

+                "but have not enabled the new API stack. To enable it, call "
+                "`config.experimental(_enable_new_api_stack=True)`."
+            )
+        # LR-schedule checking.


Moved here for better overview.

sven1977 · 2023-10-26T14:32:05Z

rllib/algorithms/bc/bc.py


    @override(MARWILConfig)
    def validate(self) -> None:
-        # Can not use Tf with learner api.


This is already checked in validate(). We should never(!) automatically change properties inside AlgorithmConfig (except private ones that are exposed through public @property accessors).
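A small sketch of the "check, don't mutate" rule described in this comment. `CheckedConfig` is a hypothetical class, and the specific rule it enforces (static-graph "tf" combined with the new stack) is only an illustrative assumption loosely based on the removed comment in the diff above.

```python
class CheckedConfig:
    """Hypothetical config whose validate() only checks and never mutates."""

    def __init__(self, framework="torch", enable_new_api_stack=False):
        self.framework = framework
        self.enable_new_api_stack = enable_new_api_stack

    def validate(self) -> None:
        # Report the invalid combination to the user. Do NOT silently
        # flip `enable_new_api_stack` or `framework` to "repair" it.
        if self.enable_new_api_stack and self.framework == "tf":
            raise ValueError(
                "framework='tf' cannot be combined with the new API stack "
                "in this sketch; change one of the two settings explicitly."
            )


cfg = CheckedConfig(framework="torch", enable_new_api_stack=True)
cfg.validate()  # Passes, and cfg is exactly what the user configured.
```

With this pattern, the values a user reads back from the config after `validate()` are the same values they set before it.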

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 · 2023-10-26T16:51:24Z

rllib/algorithms/algorithm_config.py

        # `self.rl_module()`
-        self.rl_module_spec = None
-        self._enable_rl_module_api = False
+        self._rl_module_spec = None


Made this private (plus a @property for rl_module_spec). We should never set any properties automatically (or inside validate()). This simplifies a lot of tests, as they no longer have to rely on things magically changing under the hood after calling validate().
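The pattern can be sketched as follows. This is a minimal illustration under assumptions: `SpecConfig`, `get_default_rl_module_spec()` returning a plain string, and the fallback logic inside the property are hypothetical; only the names `_rl_module_spec`, `rl_module_spec`, and `rl_module()` appear in the diff.

```python
class SpecConfig:
    """Hypothetical config with a private field and a public read-only view."""

    def __init__(self, enable_new_api_stack=False):
        self._enable_new_api_stack = enable_new_api_stack
        self._rl_module_spec = None  # Only set explicitly via rl_module().

    def rl_module(self, rl_module_spec):
        self._rl_module_spec = rl_module_spec
        return self

    def get_default_rl_module_spec(self):
        # Placeholder default; a real config would build a spec object here.
        return "default-spec"

    @property
    def rl_module_spec(self):
        # Computed on every access, so nothing depends on validate()
        # having been called first.
        if self._rl_module_spec is not None:
            return self._rl_module_spec
        if self._enable_new_api_stack:
            return self.get_default_rl_module_spec()
        return None


cfg = SpecConfig(enable_new_api_stack=True)
spec = cfg.rl_module_spec  # Available immediately, without validate().
```

Because the default is resolved on read, the stored private field stays `None` until the user sets it, which keeps the user's input and the derived value clearly separated.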

Signed-off-by: sven1977 <svenmika1977@gmail.com>

kouroshHakha

LGTM

kouroshHakha · 2023-10-27T16:11:08Z

rllib/algorithms/algorithm_config.py

                dashboard. If you're seeing that the object store is filling up,
                turn down the number of remote requests in flight, or enable compression
                in your experiment of timesteps.
-            _enable_learner_api: Whether to enable the LearnerGroup and Learner


~~nit: don't remove the doc for it. Just add that it has been replaced with _enable_new_stack_api~~

I see that you throw the deprecation warning right away.

sven1977 added 2 commits October 26, 2023 13:14

wip

3db0d95

Signed-off-by: sven1977 <svenmika1977@gmail.com>

wip

909cc7b

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 requested review from a team, ArturNiederfahrenhorst, avnishn, kouroshHakha, maxpumperla and smorad as code owners October 26, 2023 13:07

sven1977 added 3 commits October 26, 2023 15:57

wip

f95f72b

Signed-off-by: sven1977 <svenmika1977@gmail.com>

wip

4215c8e

Signed-off-by: sven1977 <svenmika1977@gmail.com>

Merge branch 'master' into revert_ppo_back_to_old_stack_by_default

5cdb817

Signed-off-by: Sven Mika <svenmika1977@gmail.com>

sven1977 commented Oct 26, 2023

wip

9690ff3

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 assigned kouroshHakha Oct 26, 2023

wip

211d932

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 commented Oct 26, 2023

sven1977 changed the title ~~[RLlib] Revert PPO back to old API stack (by default). Not ready yet on several features.~~ [RLlib] Revert PPO back to old API stack (by default). New stack and PPO not ready yet on several features. Oct 26, 2023

sven1977 added 5 commits October 26, 2023 22:27

wip

d52d99b

Signed-off-by: sven1977 <svenmika1977@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray into reve…

1787185

…rt_ppo_back_to_old_stack_by_default

wip

e4dd7f4

Signed-off-by: sven1977 <svenmika1977@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray into reve…

5743fd4

…rt_ppo_back_to_old_stack_by_default

wip

31b1682

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 added the tests-ok label ("The tagger certifies test failures are unrelated and assumes personal liability.") Oct 27, 2023

kouroshHakha approved these changes Oct 27, 2023

sven1977 merged commit eabd18e into ray-project:master Oct 27, 2023

sven1977 deleted the revert_ppo_back_to_old_stack_by_default branch May 17, 2024 04:33

simonsays1980 pushed a commit to simonsays1980/ray that referenced this pull request Dec 17, 2025

[RLlib] Revert PPO back to old API stack (by default). New stack and …

a3d74a3

…PPO not ready yet on several features. (ray-project#40706)

sven1977 merged 12 commits into ray-project:master from sven1977:revert_ppo_back_to_old_stack_by_default
