v0.11
What's Changed
- ci: Remove workflow that doesn't utilize training library (medium, -mp) by @booxter in #478
- Obey the FSDP sharding option default by @Maxusmusti in #486
- Change default internal sharding strategy to HYBRID_SHARD by @Maxusmusti in #488
- chore: Update the large e2e job to use fallback logic for selecting EC2 instances by @courtneypacheco in #491
- moves deepspeed requirements into their own file; add deepspeed extras by @JamesKunstle in #455
- chore: introduce dummy workflow by @cdoern in #497
- ci: Search for necessary instance for smoke job in multiple AZs by @booxter in #481
- ci: Fix -sdk fake workflow failure on actionlint by @booxter in #501
- build(deps): Bump actions/setup-python from 5.5.0 to 5.6.0 by @dependabot in #493
- use instructlab constraints-dev.txt in e2e test by @ktdreyer in #499
- build(deps): Bump step-security/harden-runner from 2.11.1 to 2.12.0 by @dependabot in #490
- ci: Use tox-current-env to reuse prepared venv with torch by @booxter in #482
- fix: extend nccl timeout by @cdoern in #507
- always log storage by @RobotSail in #510
- deps: Remove caps on ROCm dependencies by @courtneypacheco in #517
- ci: don't trigger pull_request_target job on its own workflow by @booxter in #519
- Enable pylint 'unused-argument' check by @fynnsu in #528
New Contributors
Full Changelog: v0.10.0...v0.11