[RLlib] Added functionality to add `infos` and `extra_model_outputs` to the sample output of `PrioritizedEpisodeReplayBuffer`. by simonsays1980 · Pull Request #43496 · ray-project/ray

simonsays1980 · 2024-02-28T12:35:49Z

Why are these changes needed?

Until now, PrioritizedEpisodeReplayBuffer could add infos to the samples it returns, but not extra_model_outputs. This PR adds that functionality together with a corresponding test case.

Note that the extra_model_outputs are extracted as a dict and added to the batch in this form, one dict per row (similar to infos). Later, in post-processing, the variables in this dictionary can be extracted in a corresponding learner connector. Furthermore, while infos are extracted at the end of the n_step tuple, the extra_model_outputs usually refer to a corresponding action, which comes from the first timestep in the n_step tuple. Hence, we take the extra_model_outputs from that first timestep.
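The timestep choice described above can be sketched as follows. This is only an illustration under stated assumptions, not the buffer's actual code: the function `n_step_row`, the toy episode layout (one more entry in `obs` and `infos` than in `actions`), the `action_logp` key, and the discounted reward sum are all hypothetical.

```python
def n_step_row(episode, t, n_step, gamma=0.99):
    """Build one sampled row for the n-step transition starting at index t.

    Assumed (hypothetical) episode layout: `obs` and `infos` hold T + 1
    entries (one per observation), while `actions`, `rewards`, and
    `extra_model_outputs` hold T entries (one per action taken).
    """
    end = t + n_step
    return {
        "obs": episode["obs"][t],
        "actions": episode["actions"][t],
        # Discounted sum of the n rewards collected between t and end
        # (an illustrative assumption, not taken from the PR).
        "rewards": sum(
            gamma**k * r for k, r in enumerate(episode["rewards"][t:end])
        ),
        "new_obs": episode["obs"][end],
        "n_step": n_step,
        # infos belong to the observation reached at the END of the tuple.
        "infos": episode["infos"][end],
        # extra_model_outputs belong to the action at the FIRST timestep.
        "extra_model_outputs": episode["extra_model_outputs"][t],
    }


# Toy episode with 4 steps (5 observations).
episode = {
    "obs": [0, 1, 2, 3, 4],
    "actions": [10, 11, 12, 13],
    "rewards": [1.0, 1.0, 1.0, 1.0],
    "infos": [{"step": i} for i in range(5)],
    "extra_model_outputs": [{"action_logp": -float(i)} for i in range(4)],
}

row = n_step_row(episode, t=1, n_step=2, gamma=0.5)
print(row["infos"], row["extra_model_outputs"])
```

In this sketch the sampled row pairs the action at index 1 with the model outputs recorded for that same action, while the info dict comes from the observation two steps later.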

Related issue number

Checks

- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
  - I've added any new APIs to the API Reference. For example, if I added a
    method in Tune, I've added it in doc/source/tune/api/ under the
    corresponding .rst file.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested :(

rllib/utils/replay_buffers/tests/test_prioritized_episode_replay_buffer.py

sven1977 · 2024-03-06T12:50:58Z

rllib/utils/replay_buffers/prioritized_episode_replay_buffer.py

+        if include_extra_model_outputs:
+            ret.update(
+                {
+                    "extra_model_outputs": np.array(extra_model_outputs),


Not sure this is a good idea just np'ing stuff like this. This often leads to these unwieldy object arrays that have unpredictable behavior (the same is true for np'ing the infos above, we should just keep them as a list of infos-dicts in the returned batch).

We usually separate these sub-columns in extra_model_outputs in our batches. Can we do that here, too?

ret.update( { k: batch(v) for k, v in extra_model_outputs.items() } )

The final batch (returned from sample) should have columns at the top level, e.g. OBS or ACTION_DIST_INPUTS.
Under each of these columns should be a (possibly nested) struct of numpy array leaves (or simply a numpy array if there is no complex space/struct). All leaves should have the shape (B, T?, ...), where T might be 0 or 1.

Let me know, if I'm making a thinking-mistake here. :)
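A minimal sketch of the concern raised in this comment, assuming the sampled extra_model_outputs arrive as one dict per row. The helper `batch_columns` and the keys `action_logp` and `vf_preds` are hypothetical; the helper only stands in for the `batch` utility mentioned above and is not RLlib's implementation.

```python
import numpy as np

# One dict of model outputs per sampled row (hypothetical keys and values).
rows = [
    {"action_logp": -0.5, "vf_preds": 1.0},
    {"action_logp": -1.5, "vf_preds": 2.0},
    {"action_logp": -0.25, "vf_preds": 3.0},
]

# Converting the list of dicts directly yields an object array: each element
# is still a Python dict, so numeric operations on the array do not apply.
as_object_array = np.array(rows)
print(as_object_array.dtype, as_object_array.shape)


def batch_columns(rows):
    """Turn a list of per-row dicts into one numeric array per key."""
    keys = rows[0].keys()
    return {k: np.stack([np.asarray(r[k]) for r in rows]) for k in keys}


# Each sub-column becomes its own top-level column with a leading batch
# dimension B (here B = 3).
columns = batch_columns(rows)
print({k: (v.dtype, v.shape) for k, v in columns.items()})
```

With per-key columns, every leaf is a regular numeric array of shape (B, ...), which is the layout the comment asks for in the final batch.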

Hey @sven1977, thanks for the review! Yes, it was still somewhat ambiguous how to deal with the extra model outputs. I can batch the items from this field such that each of the keys in extra_model_outputs defines a new column in the batch.

@sven1977 following your logic above it might also make sense to keep the other "batch" columns here as lists such that they can be batched in a standard way in the connectors?

… {(eps_id,): [1.3, 4.23 ...], ...}, ...}. Furthermore, implemented a tracker for the maximum tree index to sum weights during sampling faster. Implemented testing for 'sample_with_keys'. Naming was chosen such that we can deprecate the old 'sample' as soon as initial review is done. Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

sven1977

LGTM! Let's merge this, then time whether the time saved by not creating the batch in the buffer is eaten up by the additional batching step required in the Learner Connector (I don't think that will be the case).

Awesome PR @simonsays1980 ! :)

simonsays1980 added 4 commits February 28, 2024 13:26

Added funcitonality to add 'infos' and 'extra_model_outputs' to the b…

077398c

…atch when sampling from 'PrioritizedEpisodeReplayBuffer'. Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Added a doc string for the new argument 'include_extra_model_outputs'.

0b35e61

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Merge branch 'master' into extra-model-outputs-for-prioritized-episod…

e5c0011

…e-replay-buffer Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Merge branch 'master' into extra-model-outputs-for-prioritized-episod…

c26bb91

…e-replay-buffer Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

simonsays1980 changed the title ~~Added funcitonality to add infos and'extra_model_outputs to the sample output of PrioritizedEpisodeReplayBuffer.~~ Added funcitonality to add infos andextra_model_outputs to the sample output of PrioritizedEpisodeReplayBuffer. Mar 1, 2024

simonsays1980 changed the title ~~Added funcitonality to add infos andextra_model_outputs to the sample output of PrioritizedEpisodeReplayBuffer.~~ Added funcitonality to add infos and extra_model_outputs to the sample output of PrioritizedEpisodeReplayBuffer. Mar 1, 2024

simonsays1980 added 3 commits March 4, 2024 18:35

Merge branch 'master' into extra-model-outputs-for-prioritized-episod…

16efa0a

…e-replay-buffer Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Merge branch 'master' into extra-model-outputs-for-prioritized-episod…

2886d2d

…e-replay-buffer Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Merge branch 'master' into extra-model-outputs-for-prioritized-episod…

e48fdba

…e-replay-buffer Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

sven1977 changed the title ~~Added funcitonality to add infos and extra_model_outputs to the sample output of PrioritizedEpisodeReplayBuffer.~~ [RLlib] Added funcitonality to add infos and extra_model_outputs to the sample output of PrioritizedEpisodeReplayBuffer. Mar 6, 2024

sven1977 marked this pull request as ready for review March 6, 2024 12:42

sven1977 requested review from ArturNiederfahrenhorst, avnishn, kouroshHakha, maxpumperla and sven1977 as code owners March 6, 2024 12:42

sven1977 reviewed Mar 6, 2024

rllib/utils/replay_buffers/tests/test_prioritized_episode_replay_buffer.py (outdated, resolved)

sven1977 reviewed Mar 6, 2024

sven1977 and others added 3 commits March 6, 2024 13:51

Apply suggestions from code review

235b50a

Signed-off-by: Sven Mika <sven@anyscale.io>

Merge branch 'extra-model-outputs-for-prioritized-episode-replay-buff…

7c443dd

…er' of github.com:simonsays1980/ray into extra-model-outputs-for-prioritized-episode-replay-buffer Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

sven1977 approved these changes Mar 11, 2024

sven1977 self-assigned this Mar 11, 2024

sven1977 added the tests-ok label ("The tagger certifies test failures are unrelated and assumes personal liability.") Mar 11, 2024

simonsays1980 added 2 commits March 11, 2024 15:33

Changed docstring for 'sample_with_keys'.

0e19571

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Merged master

cf39c61

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

sven1977 merged commit ec68337 into ray-project:master Mar 11, 2024

sven1977 changed the title ~~[RLlib] Added funcitonality to add infos and extra_model_outputs to the sample output of PrioritizedEpisodeReplayBuffer.~~ [RLlib] Added functionality to add infos and extra_model_outputs to the sample output of PrioritizedEpisodeReplayBuffer. Mar 11, 2024

sven1977 merged 12 commits into ray-project:master from simonsays1980:extra-model-outputs-for-prioritized-episode-replay-buffer
