[RLlib] Evaluation do-over: Make parallel evaluation to training the default behavior and deprecate async eval option. #43787
Conversation
# Check, whether `training_iteration` is still a tune.Trainable property
No longer needed, imo.
In the end, it is the user's responsibility whether or not to override a Trainable method. Erroring out is too much of a consequence, agreed. We could leave it as a warning, but better, imo, is to refer more explicitly in the documentation to `tune.trainable.Trainable` as the base class - so, if a user wants to use Tune, she should not override such methods.
  # self.iteration will be 0.
  evaluate_this_iter = (
-     self.config.evaluation_interval is not None
+     self.config.evaluation_interval
This was a bug, both in code and docs. Can also be 0 to cause NO evaluation.
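To illustrate the bug with a simplified, self-contained sketch (the helper names here are hypothetical, not RLlib code): the old `is not None` check lets `evaluation_interval=0` slip through and trigger evaluation, while the truthiness check correctly treats both `None` and `0` as "no evaluation".

```python
# `evaluation_interval` semantics: None disables evaluation, and 0 should too.
# The real check also involves the current iteration; that part is omitted here.

def evaluate_this_iter_buggy(evaluation_interval, iteration):
    # Old (buggy) check: 0 passes `is not None`, so eval would wrongly run.
    return evaluation_interval is not None

def evaluate_this_iter_fixed(evaluation_interval, iteration):
    # New check: truthiness treats both None and 0 as "do not evaluate".
    return bool(evaluation_interval)

print(evaluate_this_iter_buggy(0, 1))  # True  (bug: eval would run)
print(evaluate_this_iter_fixed(0, 1))  # False (correct: no eval)
```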
@@ -1,168 +1,6 @@
import argparse
import os
msg = """
Slight examples folder cleanup. Will have to do more of these :)
@@ -0,0 +1,157 @@
from ray.rllib.algorithms.callbacks import DefaultCallbacks
Same script as before, just moved here and cleaned up a little.
Signed-off-by: sven1977 <svenmika1977@gmail.com>
Evaluation do-over:
This PR aims at simplifying our evaluation code a little bit.
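For context, a sketch of what the recommended parallel-evaluation setup looks like on an `AlgorithmConfig` (argument names assumed from RLlib's `AlgorithmConfig.evaluation()` API around the time of this PR; check your installed version for the exact signature):

```python
from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")
    .evaluation(
        # Evaluate every iteration (None or 0 means: never evaluate).
        evaluation_interval=1,
        # Run evaluation for roughly as long as the training step takes.
        evaluation_duration="auto",
        # Run evaluation in parallel with the training step.
        evaluation_parallel_to_training=True,
        # Default anyway, per the PR description.
        evaluation_force_reset_envs_before_iteration=True,
    )
)
```

This is a config fragment only; building and training the algorithm from it works as usual via `config.build()`.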
Users can use the config settings `evaluation_parallel_to_training=True` AND `evaluation_duration="auto"` AND `evaluation_interval=1` (AND `evaluation_force_reset_envs_before_iteration=True`, which is the default anyway). These settings combined should be enough to 100% replace the old async behavior, so the async eval option (under `config.fault_tolerance()`) can be deprecated.

The PR in particular:

- Splits the `self.evaluate()` method into various sub-methods, depending on the logic configured: `_evaluate_with_auto_duration`, `_evaluate_with_fixed_duration`, `_evaluate_on_local_worker`, `_evaluate_with_custom_function`.
- Determines the eval workers' `rollout_fragment_length` dynamically, depending on the estimated time it takes for the parallel training step to finish. This used to be a fixed 10 timesteps, which was causing lots of (expensive) remote calls on the eval workers.

TODOs for future PRs:
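Regarding the dynamic `rollout_fragment_length` point above, here is a minimal sketch of the idea only (all names hypothetical; this is not RLlib's actual implementation): size eval rollout requests so that roughly one round of remote calls fills the expected training time, instead of issuing many tiny 10-timestep requests.

```python
def dynamic_rollout_fragment_length(
    estimated_train_time_s: float,
    estimated_env_steps_per_s: float,
    num_eval_workers: int,
    min_len: int = 10,
) -> int:
    """Estimate how many env steps each eval worker should collect per call."""
    # Total env steps all eval workers can plausibly collect while training runs.
    total_steps = estimated_train_time_s * estimated_env_steps_per_s
    # Split across workers; never go below the old fixed fragment length.
    return max(min_len, int(total_steps / max(1, num_eval_workers)))

# e.g. a 2s train step, 500 env steps/s throughput, 4 eval workers:
print(dynamic_rollout_fragment_length(2.0, 500.0, 4))  # -> 250
```

One request of ~250 steps per worker replaces ~25 separate 10-step remote calls in this example, which is where the savings come from.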
Why are these changes needed?
Related issue number
Checks
- I've signed off every commit (`git commit -s`) in this PR.
- I've run `scripts/format.sh` to lint the changes in this PR.
- If I've added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.