Add overlong reward shaping for DAPO. #947

copybara-service · 2026-01-06T21:39:22Z

Add overlong reward shaping for DAPO.
Refactor `rl_learner.compute_reward` to use `reward_manager`.
Enable logging `algo_config` at learner init.

PiperOrigin-RevId: 852838075
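For readers unfamiliar with the term, "overlong reward shaping" in DAPO usually refers to the soft overlong punishment from the DAPO paper: responses up to a soft limit are not penalized, responses inside a buffer zone before the hard limit receive a linearly growing penalty, and responses past the hard limit receive a penalty of -1. The sketch below illustrates that formulation only. It is not the code from this PR, and the names `soft_overlong_penalty`, `shaped_reward`, `max_length`, and `cache_length` are hypothetical.

```python
def soft_overlong_penalty(length: int, max_length: int, cache_length: int) -> float:
    """Length penalty in [-1, 0], sketched after DAPO's soft overlong punishment.

    All parameter names are illustrative assumptions, not this PR's API.
      length:       number of generated tokens in the response
      max_length:   hard cap on response length
      cache_length: width of the buffer zone before the hard cap
    """
    if cache_length <= 0 or cache_length > max_length:
        raise ValueError("cache_length must be in (0, max_length]")

    soft_limit = max_length - cache_length
    if length <= soft_limit:
        # Short enough: no penalty.
        return 0.0
    if length <= max_length:
        # Inside the buffer zone: penalty falls linearly from 0 to -1.
        return (soft_limit - length) / cache_length
    # Past the hard cap: full penalty.
    return -1.0


def shaped_reward(base_reward: float, length: int,
                  max_length: int, cache_length: int) -> float:
    """Adds the length penalty to a task reward (hypothetical helper)."""
    return base_reward + soft_overlong_penalty(length, max_length, cache_length)
```

With `max_length=200` and `cache_length=50`, a 175-token response sits halfway through the buffer zone, so `shaped_reward(1.0, 175, 200, 50)` subtracts half of the maximum penalty from the base reward.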

copybara-service bot requested review from abheesht17, hgao327, jiangyangmu, lc5211, sizhit2, tianshub and wang2yn84 as code owners January 6, 2026 21:39

copybara-service bot had a problem deploying to testing, January 6, 2026 21:39 (Failure)

copybara-service bot force-pushed the test_852838075 branch from bf3cef0 to 5ac4f1c, January 7, 2026 22:13

copybara-service bot had a problem deploying to testing, January 7, 2026 22:14 (Failure)

copybara-service bot force-pushed the test_852838075 branch from 5ac4f1c to 349687d, January 13, 2026 22:48

copybara-service bot had a problem deploying to testing, January 13, 2026 22:48 (Failure)

Commit 1b3d6ac: Add overlong reward shaping for DAPO.

copybara-service bot force-pushed the test_852838075 branch from 349687d to 1b3d6ac, January 13, 2026 23:15

copybara-service bot had a problem deploying to testing, January 13, 2026 23:15 (Failure)
