Drl ppo #487

Merged: 4 commits into assume-framework:drl-ppo on Nov 20, 2024

Conversation

kim-mskw (Contributor)

Pull Request

Description

This PR merges the implementation of mini-batch sampling, which resolves the torch issue of redundantly iterating over the identical full-batch tensors. It also implements hyperparameter tuning and action-handling changes resulting from the discussion between @kim-mskw and @adiwied.

This now results in stable single-unit learning with PPO.

image

Changes Proposed

  • removed action clipping as it introduces non-stationarity
  • adjusted learning hyperparameters
  • implemented mini-batch sampling
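
The mini-batch sampling change above can be sketched as follows. This is a minimal, hypothetical illustration (the helper name `minibatch_indices` and its signature are assumptions, not the actual ASSUME code): instead of iterating repeatedly over the identical full rollout tensors in every PPO update epoch, the rollout indices are shuffled once per epoch and split into disjoint mini-batches.

```python
import numpy as np

def minibatch_indices(buffer_size, num_minibatches, rng=None):
    """Yield disjoint, shuffled index batches covering the whole rollout buffer.

    Hypothetical helper sketching the mini-batch sampling idea: shuffle the
    rollout indices once per PPO epoch, then split them into mini-batches so
    every transition is used exactly once per epoch.
    """
    rng = np.random.default_rng() if rng is None else rng
    perm = rng.permutation(buffer_size)  # one shuffle per epoch
    for chunk in np.array_split(perm, num_minibatches):
        yield chunk  # indices into the stored rollout tensors

# Usage sketch: each PPO epoch draws fresh mini-batches from the same rollout.
buffer_size, num_minibatches, num_epochs = 12, 3, 2
rng = np.random.default_rng(0)
for epoch in range(num_epochs):
    for batch_idx in minibatch_indices(buffer_size, num_minibatches, rng):
        # here one would slice observations/actions/advantages with batch_idx
        # and run a single gradient step on that mini-batch
        pass
```

Sampling without replacement within an epoch (rather than re-iterating the full batch) is the standard PPO recipe; it keeps gradient steps decorrelated while still visiting every transition.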

Testing

Algorithm tested with example_02a locally and in Docker.

Checklist

Not applicable yet, as we are only merging into a feature branch of assume, not yet into main.

@kim-mskw kim-mskw merged commit 8b3196a into assume-framework:drl-ppo Nov 20, 2024
1 check was pending
2 participants