v0.11.1
What's Changed
- Add general logging implementation by @fynnsu in #500
- docs: add CI documentation by @nathan-weinberg in #555
- fix: Use default torch timeout for nccl watchdog unless overridden by @booxter in #521
- fix: Fix markdown-lint violations by @booxter in #559
- ci: add 3.12 smoke workflow flavor by @booxter in #535
- adds barriers after checkpoint saving by @JamesKunstle in #566
- ci: Fix smoke failures due to `pre` not available in local actions by @booxter in #565
- Checkout correct branch on `pull_request_target` trigger by @fynnsu in #549
- Logging Fixes & Enhancements by @RobotSail in #571
- docs: Remove badge for a no longer existing job by @booxter in #542
- uses `__name__` in logging.getLogger by @JamesKunstle in #573
- ci: stop reporting results to slack by @ktdreyer in #574
- CI: Constrain all dependencies; introduce a Monday workflow to update pins by @booxter in #558
- ci: Run jobs on constraints-dev.txt change by @booxter in #580
- chore: update constraints-dev.txt (2025-05-30) by @courtneypacheco in #579
- remove old Deepspeed-native code by @JamesKunstle in #567
- add DCO.txt by @ktdreyer in #588
- ci: Disable dependabot for pip dependencies by @booxter in #587
- feat: refactor main_ds.py (1/n) Model class by @cdoern in #572
- ci: do not require DCO job by @ktdreyer in #595
- 'granite-3.3-2b-instruct' for smoketest; smaller smoke dataset by @JamesKunstle in #590
- fixes unit tests requiring cuda by @JamesKunstle in #586
- chore: update constraints-dev.txt (2025-06-02) by @courtneypacheco in #584
- ci: Cover more test dependencies with pins by @booxter in #581
- ci: Introduce python 3.12 e2e large job flavor by @booxter in #563
- Implicit distributed backend selection by @booxter in #516
- ci: Fix incorrect indent in workflow steps by @booxter in #599
- feat: refactor main_ds.py (2/n) Accelerator class by @cdoern in #594
- chore: update constraints-dev.txt (2025-06-09) by @courtneypacheco in #602
- feat: add medium e2e CI job for each PR by @cdoern in #551
- test: fix e2e target by @cdoern in #610
- chore: update constraints-dev.txt (2025-06-16) by @courtneypacheco in #612
- Remove Dolomite support by @booxter in #616
- Revert "test: fix e2e target" by @bbrowning in #620
- ci: Remove harden-runner steps from jobs by @booxter in #617
- test: disable per-PR test by @cdoern in #631
- fix edge case for qwen3 data processing by @RobotSail in #626
- uncap accelerate in `requirements-cuda.txt` by @ktdreyer in #628
- chore: update constraints-dev.txt (2025-06-30) by @courtneypacheco in #623
- Fix a mistake in formatting a floating-point value by @mtake in #639
- Add a tutorial for fine-tuning and interpolation by @mtake in #640
New Contributors
- @bbrowning made their first contribution in #620
- @mtake made their first contribution in #639
Full Changelog: v0.11...v0.11.1