Skip to content

[BugFix] Fix behavior or partial, nested dones in PEnv and TEnv #2959

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 16, 2025

Conversation

vmoens
Copy link
Collaborator

@vmoens vmoens commented May 16, 2025

[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request May 16, 2025
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 16, 2025
Copy link

pytorch-bot bot commented May 16, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2959

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures, 6 Pending, 2 Unrelated Failures

As of commit 9185c7a with merge base 36f34da (image):

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@vmoens vmoens added the Environments Adds or modifies an environment wrapper label May 16, 2025
[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request May 16, 2025
Copy link
Contributor

@thomasbbrunner thomasbbrunner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Thanks for the quick fix.

test/test_env.py Outdated
Comment on lines 2964 to 2991
if done_at_root:
assert parallel_env._simple_done
assert transformed_env._simple_done
# We expect each env to have reached a done state once.
assert parallel_td["next", "done"].sum().item() == 2
# We expect env[0] to have been reset and executed 2 steps.
# We expect env[1] to have just been reset (0 steps).
assert parallel_env._counter() == [2, 0]
assert parallel_td["next", "done"].sum().item() == 2

# We expect each env to have reached a done state once.
assert transformed_td["next", "done"].sum().item() == 2
assert_allclose_td(transformed_td, parallel_td, intersection=True)
# We expect env[0] to have been reset and executed 2 steps.
# We expect env[1] to have just been reset (0 steps).
# We only expect env[0] to have reached a done state.
else:
assert not parallel_env._simple_done
assert not transformed_env._simple_done
assert ("next", "done") not in parallel_td
assert ("next", "done") not in transformed_td
assert parallel_td["next", "agent_1", "done"].sum().item() == 2
assert parallel_env._counter() == [2, 0]
assert parallel_td["next", "agent_1", "done"].sum().item() == 2
assert transformed_td["next", "agent_1", "done"].sum().item() == 2
assert_allclose_td(transformed_td, parallel_td, intersection=True)

assert transformed_env._counter() == [2, 0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a mismatch in the asserts and comments and some asserts are duplicated. Maybe this helps:

Suggested change
if done_at_root:
assert parallel_env._simple_done
assert transformed_env._simple_done
# We expect each env to have reached a done state once.
assert parallel_td["next", "done"].sum().item() == 2
# We expect env[0] to have been reset and executed 2 steps.
# We expect env[1] to have just been reset (0 steps).
assert parallel_env._counter() == [2, 0]
assert parallel_td["next", "done"].sum().item() == 2
# We expect each env to have reached a done state once.
assert transformed_td["next", "done"].sum().item() == 2
assert_allclose_td(transformed_td, parallel_td, intersection=True)
# We expect env[0] to have been reset and executed 2 steps.
# We expect env[1] to have just been reset (0 steps).
# We only expect env[0] to have reached a done state.
else:
assert not parallel_env._simple_done
assert not transformed_env._simple_done
assert ("next", "done") not in parallel_td
assert ("next", "done") not in transformed_td
assert parallel_td["next", "agent_1", "done"].sum().item() == 2
assert parallel_env._counter() == [2, 0]
assert parallel_td["next", "agent_1", "done"].sum().item() == 2
assert transformed_td["next", "agent_1", "done"].sum().item() == 2
assert_allclose_td(transformed_td, parallel_td, intersection=True)
assert transformed_env._counter() == [2, 0]
# We expect env[0] to have been reset and executed 2 steps.
# We expect env[1] to have just been reset (0 steps).
assert parallel_env._counter() == [2, 0]
assert transformed_env._counter() == [2, 0]
if done_at_root:
assert parallel_env._simple_done
assert transformed_env._simple_done
# We expect each env to have reached a done state once.
assert parallel_td["next", "done"].sum().item() == 2
assert transformed_td["next", "done"].sum().item() == 2
assert_allclose_td(transformed_td, parallel_td, intersection=True)
else:
assert not parallel_env._simple_done
assert not transformed_env._simple_done
assert ("next", "done") not in parallel_td
assert ("next", "done") not in transformed_td
assert parallel_td["next", "agent_1", "done"].sum().item() == 2
assert transformed_td["next", "agent_1", "done"].sum().item() == 2
assert_allclose_td(transformed_td, parallel_td, intersection=True)

[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request May 16, 2025
@vmoens vmoens merged commit 9185c7a into gh/vmoens/143/base May 16, 2025
91 of 101 checks passed
vmoens pushed a commit that referenced this pull request May 16, 2025
@vmoens vmoens deleted the gh/vmoens/143/head branch May 16, 2025 10:02
vmoens pushed a commit that referenced this pull request May 16, 2025
ghstack-source-id: e36d1c8
Pull-Request-resolved: #2959
(cherry picked from commit 6ae8d43)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Environments Adds or modifies an environment wrapper
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants