[RLlib] Cleanup examples folder 04: Curriculum and checkpoint-by-custom-criteria examples moved to new API stack. #44706
Conversation
simonsays1980 left a comment:

LGTM. Very happy about the curriculum example.
On this snippet from the example's docstring:

> For debugging, use the following additional command line options
> `--no-tune --num-env-runners=0`
> which should allow you to set breakpoints anywhere in the RLlib code and …
Comment: This also works with Tune, but with `--local-mode` :)

Reply: Absolutely! I'm always afraid we're going to get rid of Ray local mode at some point. Also, for any number of Learner workers > 0, local mode doesn't work (not sure why, actually).
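For readers of this thread, a minimal sketch of the two debugging routes mentioned; the CLI flags come from the example script's docstring, and `ray.init(local_mode=True)` is the standard Ray call behind `--local-mode`:

```python
import ray

# Route 1 (example script): run with `--no-tune --num-env-runners=0`, so
# sampling and training happen in the driver process and breakpoints set
# anywhere in RLlib code are hit directly.

# Route 2 (suggested above): keep Tune, but start Ray in local mode, which
# executes all tasks and actors sequentially in one process. Note the
# caveat from this thread: this breaks with > 0 Learner workers.
ray.init(local_mode=True)
```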
On this hunk, where the checkpoint retrieval is split up:

```diff
- ckpt = results.get_best_result(metric=policy_loss_key, mode="min").checkpoint
- print("Lowest pol-loss: {}".format(ckpt))
+ best_result = results.get_best_result(metric=policy_loss_key, mode="min")
+ ckpt = best_result.checkpoint
```
Comment: We could also ask here for the best checkpoint along the training path: `best_result.get_best_checkpoint(metric=policy_loss_key, mode="min")`
Reply: Ah, cool, so `ckpt = best_result.checkpoint` returns only the very last checkpoint? And if the last one is not the best, it's better to do `best_result.get_best_checkpoint(metric=policy_loss_key, mode="min")`?
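For context, a sketch of the two retrieval paths being compared, assuming a finished run whose `ResultGrid` is `results` and using a top-level metric name (the nested-key case is exactly where this breaks, as the follow-up shows):

```python
# `results` is the ResultGrid returned by tune.Tuner(...).fit().
best_result = results.get_best_result(metric="episode_reward_mean", mode="max")

# `.checkpoint` is simply the final checkpoint of the best trial,
# regardless of whether an earlier checkpoint scored better ...
last_ckpt = best_result.checkpoint

# ... while `get_best_checkpoint` searches all checkpoints saved along the
# trial's training path for the best metric value.
best_ckpt = best_result.get_best_checkpoint(
    metric="episode_reward_mean", mode="max"
)
```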
Follow-up: This actually doesn't seem to work well with nested keys. If I do `best_result.get_best_checkpoint(policy_loss_key, mode="min")`, I get:

```
RuntimeError: Invalid metric name ('info', 'learner', 'default_policy', 'learner_stats', 'policy_loss')! You may choose from the following metrics: dict_keys(['custom_metrics', 'episode_media', 'info', 'sampler_results', 'episode_reward_max', 'episode_reward_min', 'episode_reward_mean', 'episode_len_mean', 'episodes_this_iter', 'episodes_timesteps_total', 'policy_reward_min', 'policy_reward_max', 'policy_reward_mean', 'hist_stats', 'sampler_perf', 'num_faulty_episodes', 'connector_metrics', 'num_healthy_workers', 'num_in_flight_async_reqs', 'num_remote_worker_restarts', 'num_agent_steps_sampled', 'num_agent_steps_trained', 'num_env_steps_sampled', 'num_env_steps_trained', 'num_env_steps_sampled_this_iter', 'num_env_steps_trained_this_iter', 'num_env_steps_sampled_throughput_per_sec', 'num_env_steps_trained_throughput_per_sec', 'timesteps_total', 'num_steps_trained_this_iter', 'agent_timesteps_total', 'timers', 'counters', 'done', 'episodes_total', 'training_iteration', 'trial_id', 'date', 'timestamp', 'time_this_iter_s', 'time_total_s', 'pid', 'hostname', 'node_ip', 'config', 'time_since_restore', 'iterations_since_restore', 'perf', 'experiment_tag'])
```
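A possible workaround, sketched here as an assumption rather than as the fix that landed: resolve the nested key manually against `best_result.best_checkpoints`, the list of (checkpoint, metrics) pairs Tune retains per its CheckpointConfig.

```python
from functools import reduce

# The nested metric key from the example.
policy_loss_key = (
    "info", "learner", "default_policy", "learner_stats", "policy_loss"
)

def get_nested(metrics, key_path):
    # Walk the tuple of keys down into the nested results dict.
    return reduce(lambda d, k: d[k], key_path, metrics)

# Pick the checkpoint with the lowest policy loss by hand, since
# `get_best_checkpoint` only understands top-level metric names.
ckpt, _ = min(
    best_result.best_checkpoints,
    key=lambda ckpt_and_metrics: get_nested(ckpt_and_metrics[1], policy_loss_key),
)
```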
Another hunk, selecting a checkpoint by a different criterion (highest value-function loss):

```python
ray.shutdown()
best_result = results.get_best_result(metric=vf_loss_key, mode="max")
ckpt = best_result.checkpoint
```
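Whichever criterion selects `ckpt`, restoring from it is the same; a brief sketch using RLlib's standard `Algorithm.from_checkpoint` (the follow-on training step is illustrative):

```python
from ray.rllib.algorithms.algorithm import Algorithm

# Rebuild the full algorithm state from the checkpoint chosen above by
# the custom criterion.
algo = Algorithm.from_checkpoint(ckpt)

# Continue training (or run evaluation) from the restored state.
result = algo.train()
```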
And the experiment launch, moved from a hand-rolled Tuner call to the new example-script utility:

```diff
-     param_space=config.to_dict(),
-     run_config=air.RunConfig(stop=stop, verbose=2),
+ run_rllib_example_script_experiment(
+     base_config, args, stop=stop, success_metric={"task_solved": 1.0}
```
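A self-contained sketch of how the new utility drives such an example. The two helper names are real RLlib test utilities, but the environment choice and the `task_solved` stop criterion are assumptions made for illustration:

```python
from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.utils.test_utils import (
    add_rllib_example_script_args,
    run_rllib_example_script_experiment,
)

parser = add_rllib_example_script_args()
args = parser.parse_args()

# Build the AlgorithmConfig as usual; the utility takes care of Tune setup,
# stopping conditions, and checking `success_metric` at the end.
base_config = PPOConfig().environment("FrozenLake-v1")  # hypothetical env

stop = {"task_solved": 1.0}  # hypothetical stop dict matching the hunk above

run_rllib_example_script_experiment(
    base_config, args, stop=stop, success_metric={"task_solved": 1.0}
)
```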
Why are these changes needed?

Cleanup examples folder 04: Moves the curriculum and checkpoint-by-custom-criteria examples to the new API stack.

Related issue number

Checks

- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.