Tune tbxlogger add images #37822
Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>
I applied a similar fix with the TensorboardX logger. In my case, the media metrics are being compiled into a list within the [...]
I also tried to apply a similar modification to TBXLoggerCallback. When the image in episode.media is processed by the JSONLoggerCallback, the JSON logger introduces significant delays (minutes!), because it needs to rewrite the log file at each logging call (see #21416). I worked around this by disabling the other logger callbacks by setting [...].

Also, I think it needs to be decided how to handle the case when images are logged by multiple episodes at one time in

    def on_train_result(
        self,
        *,
        algorithm: "Algorithm",
        result: dict,
        **kwargs,
    ) -> None:
        """Called at the end of Algorithm.train().

        Args:
            algorithm: Current Algorithm instance.
            result: Dict of results returned from Algorithm.train() call.
                You can mutate this object to add additional metrics.
            kwargs: Forward compatibility placeholder.
        """
        if 'trajectory' in result['episode_media'].keys():
            result['episode_media']['myimage'] = result['episode_media']['myimage'][0]

My modification of the TBXLoggerCallback.log_trial_result() function looks like this. I decided to make it stricter about numpy arrays, so that not every 3D array is automatically interpreted as an image:

    ...
    elif (isinstance(value, list) and len(value) > 0) or (
        isinstance(value, np.ndarray) and value.size > 0
    ):
        valid_result[full_attr] = value
        # Check for a list of images:
        if all(
            isinstance(v, np.ndarray) and v.ndim == 3 and v.shape[0] in [1, 3]
            for v in value
        ):
            if len(value) == 1:
                # Only one image.
                self._trial_writer[trial].add_image(
                    full_attr, value[0], global_step=step
                )
            else:
                # Multiple images - stack them as TensorBoard requires.
                imgs = np.stack(value)
                self._trial_writer[trial].add_images(
                    full_attr, imgs, global_step=step
                )
            continue
        # Check for a list of videos:
        if all(
            isinstance(v, np.ndarray) and v.ndim == 5 and v.shape[2] in [1, 3]
            for v in value
        ):
            video = np.concatenate(value, axis=1)
            self._trial_writer[trial].add_video(
                full_attr, video, global_step=step, fps=20
            )
            continue
        # Cover either a single video or a single image.
        if isinstance(value, np.ndarray) and value.size > 0:
            # Video - must have 5 dimensions in NTCHW format.
            # C must be either 1 for grayscale or 3 for RGB.
            if value.ndim == 5 and value.shape[2] in [1, 3]:
                self._trial_writer[trial].add_video(
                    full_attr, value, global_step=step, fps=20
                )
                continue
            # Image - must have 3 dimensions in CHW format.
            # C must be either 1 for grayscale or 3 for RGB.
            if value.ndim == 3 and value.shape[0] in [1, 3]:
                self._trial_writer[trial].add_image(
                    full_attr, value, global_step=step
                )
                continue
        try:
            ...
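To make the branch order in the snippet above easier to check, here is a minimal sketch that only reports which writer call each input would reach. The function name classify_media and the "fallback" and "skip" labels are hypothetical and not part of Ray or tensorboardX; "fallback" stands for whatever the elided try block does.

```python
import numpy as np


def classify_media(value):
    """Return the name of the writer call the branch order above would pick.

    Hypothetical helper for illustration only; it never touches a writer.
    """
    is_list = isinstance(value, list) and len(value) > 0
    is_array = isinstance(value, np.ndarray) and value.size > 0
    if not (is_list or is_array):
        return "skip"

    # Iterating a 4-D (N, C, H, W) array yields N arrays of shape (C, H, W),
    # so such an array takes the same branch as a list of CHW images.
    if all(
        isinstance(v, np.ndarray) and v.ndim == 3 and v.shape[0] in (1, 3)
        for v in value
    ):
        return "add_image" if len(value) == 1 else "add_images"

    # List of 5-D NTCHW videos.
    if all(
        isinstance(v, np.ndarray) and v.ndim == 5 and v.shape[2] in (1, 3)
        for v in value
    ):
        return "add_video"

    # Single video or single image passed as a bare array.
    if is_array:
        if value.ndim == 5 and value.shape[2] in (1, 3):
            return "add_video"
        if value.ndim == 3 and value.shape[0] in (1, 3):
            return "add_image"

    # Everything else continues to the elided try block.
    return "fallback"


print(classify_media(np.zeros((3, 8, 8))))     # single CHW image
print(classify_media(np.zeros((4, 3, 8, 8))))  # batch of four CHW images
print(classify_media(np.zeros((8, 8, 3))))     # HWC array, not treated as an image
```

Note that a channels-last (H, W, C) array is not recognized as an image under this check, which is the stricter behavior described in the comment.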
e10cf2c to 378aac4
    )
    continue

    # Must be multi-image
Dumb question: What if this is a single video (t, w, h, c)?
According to the definition of add_video(), this function only accepts 5-dimensional inputs.
Do we pass lower-dimensional arrays into this function anywhere?
@sven1977 Do you see any cases where this approach of distinguishing arrays by their number of dimensions could come back to bite us?
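For the single-video case raised in this thread, one option is to convert the clip on the caller's side before it reaches the logger. The following is a minimal sketch, assuming the clip is channels-last with shape (T, H, W, C); the helper name to_ntchw is hypothetical and not part of Ray or tensorboardX.

```python
import numpy as np


def to_ntchw(clip):
    """Convert one (T, H, W, C) clip into the 5-D (N, T, C, H, W) layout.

    Hypothetical helper; assumes channels-last input with C in {1, 3}.
    """
    if clip.ndim != 4 or clip.shape[-1] not in (1, 3):
        raise ValueError(f"expected (T, H, W, C) with C in (1, 3), got {clip.shape}")
    # Move channels in front of height and width: (T, H, W, C) -> (T, C, H, W).
    tchw = np.transpose(clip, (0, 3, 1, 2))
    # Add a leading batch axis of size 1: (T, C, H, W) -> (1, T, C, H, W).
    return tchw[np.newaxis]


clip = np.zeros((5, 8, 6, 3), dtype=np.uint8)
print(to_ntchw(clip).shape)
```

With this conversion, a 4-dimensional clip becomes a 5-dimensional array with C at index 2, which is the layout the video branch checks for.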
Signed-off-by: Sven Mika <sven@anyscale.io>
…ray into tune-tbxlogger-add-images Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>
Why are these changes needed?
This PR enables users to also provide image arrays in the result dictionaries, which can then be displayed in TensorBoard. Images can be provided either as a single np.ndarray with dimensions (3, H, W) or as a 4-D np.ndarray with dimensions (N, 3, H, W) (in this case, the images get concatenated horizontally).

Related issue number
#21954
As this issue is a P1 rllib issue, the corresponding solution involves storing the images in episode.media, as this attribute of the episode is not summarized or appended in the metrics.collect_episodes() function.

Checks
[...] git commit -s) in this PR.
[...] scripts/format.sh to lint the changes in this PR.
[...] method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.
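The two image layouts named in the PR description can be illustrated with plain NumPy. This is only a sketch: the keys single_image and image_batch and the helper hwc_to_chw are hypothetical, and the horizontal concatenation at the end merely shows the resulting shape, not how TensorBoard renders a batch internally.

```python
import numpy as np


def hwc_to_chw(image):
    """Reorder a channels-last (H, W, C) image to channels-first (C, H, W)."""
    return np.transpose(image, (2, 0, 1))


rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(64, 48, 3), dtype=np.uint8)  # (H, W, C)

single = hwc_to_chw(frame)          # (3, H, W): one image
batch = np.stack([single, single])  # (N, 3, H, W): N images

# Placing the N images side by side along the width axis gives (3, H, N * W).
side_by_side = np.concatenate(list(batch), axis=-1)

# Hypothetical keys inside a result dict; "episode_media" is the key used
# in the conversation above.
result = {"episode_media": {"single_image": single, "image_batch": batch}}

print(single.shape, batch.shape, side_by_side.shape)
```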