API Change
This release fixes a critical issue introduced in v0.1.4 that prevented correct functionality.
All users on v0.1.4 are strongly encouraged to upgrade to v0.1.5.
# old:
rl_trainer = GrpoLearner(
    grpo_config=grpo_config,
)
# new:
rl_trainer = GrpoLearner(
    algo_config=grpo_config,
)
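For code that has to run against installs on either side of this rename, one option is to check which keyword the constructor accepts before calling it. The following is a minimal sketch under stated assumptions: `config_kwarg_name`, `build_learner`, and the two stand-in classes are hypothetical and not part of Tunix; only the keyword names `grpo_config` and `algo_config` come from the notes above. It also assumes the constructor declares the keyword explicitly rather than accepting `**kwargs`.

```python
import inspect


def config_kwarg_name(learner_cls):
    """Return the keyword the constructor accepts for the GRPO config.

    Hypothetical helper, not part of Tunix.
    """
    params = inspect.signature(learner_cls).parameters
    for name in ("algo_config", "grpo_config"):  # prefer the new name
        if name in params:
            return name
    raise TypeError(
        f"{learner_cls.__name__} accepts neither algo_config nor grpo_config"
    )


def build_learner(learner_cls, config, **other_kwargs):
    """Construct the learner, passing config under whichever keyword it accepts."""
    return learner_cls(**{config_kwarg_name(learner_cls): config}, **other_kwargs)


# Stand-in classes that only mimic the two constructor signatures.
class OldStyleLearner:
    def __init__(self, grpo_config):
        self.config = grpo_config


class NewStyleLearner:
    def __init__(self, algo_config):
        self.config = algo_config


old = build_learner(OldStyleLearner, {"beta": 0.1})
new = build_learner(NewStyleLearner, {"beta": 0.1})
print(config_kwarg_name(OldStyleLearner), config_kwarg_name(NewStyleLearner))
```

In a real project the stand-in classes would be replaced by the imported `GrpoLearner`, with the remaining constructor arguments passed through `other_kwargs`. If upgrading is possible, switching the call site to `algo_config` directly is simpler.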
What's Changed
- Remove grpo helper. by @copybara-service[bot] in #771
- Fix the GitHub source links in example notebooks on dpo, grpo and qlora by @rajasekharporeddy in #775
- adding support for cns file downloads in tunix cli by @copybara-service[bot] in #762
- Developing on v0.1.5 now by @wang2yn84 in #776
- Replace `grpo_config` with `algo_config` while calling `GRPOLearner` in GRPO Demo notebook by @rajasekharporeddy in #778
- Lazy load transformers by @copybara-service[bot] in #779
- Fix first_micro_batch_rollout_time by @copybara-service[bot] in #783
Full Changelog: v0.1.4...v0.1.5