v0.3.0 - Granite 4, Mamba, Env var support, and Memory Estimation

@Maxusmusti released this 14 Oct 21:56

This release introduces memory profiling capabilities, enhanced distributed training orchestration, and support for Granite 4 and Mamba models. Backend implementations have been updated to instructlab-training v0.12.1 and mini-trainer v0.3.0.

What's New

Memory Profiling API (Experimental)

  • New memory estimation tool for fine-tuning workloads
  • Reports per-GPU VRAM requirements (parameters, optimizer state, gradients, activations, outputs)
  • Supports both SFT and OSFT algorithms
  • Returns low/expected/high memory bounds for better resource planning
  • Includes Liger-kernel-aware adjustments
  • Example notebook and documentation included (see the usage sketch after this list)
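
The sketch below shows roughly how the new estimator might be invoked. The module path (training_hub.profiling) comes from this release; the entry-point name, its parameters, and the model ID are illustrative assumptions, not the published signature.

```python
# Hypothetical usage sketch: training_hub.profiling is the module path from
# this release, but estimate_memory and its parameters are assumptions.
from training_hub.profiling import estimate_memory  # assumed entry point

estimate = estimate_memory(
    model_path="ibm-granite/granite-4.0-micro",  # placeholder model ID
    algorithm="osft",       # the release supports both "sft" and "osft"
    num_gpus=8,             # requirements are reported per GPU
    max_seq_len=4096,
    use_liger=True,         # Liger-kernel-aware adjustments
)

# Per the release notes, the report breaks per-GPU VRAM down into parameters,
# optimizer state, gradients, activations, and outputs, and returns
# low/expected/high bounds for resource planning.
print(estimate)
```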

Enhanced Distributed Training

  • Automatic torchrun configuration from environment variables
  • Full compatibility with Kubeflow and other orchestration systems
  • Support for auto and gpu process count specifications
  • Centralized launch parameter handling with hierarchical priority (illustrated in the sketch after this list)
  • Improved validation with clear conflict warnings and error messages
  • Flexible argument types (string or integer) for multi-node parameters
  • Explicit master address and port configuration options
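
The sketch below illustrates the kind of hierarchical resolution described above: explicit arguments take priority over torchrun-style environment variables (as set by Kubeflow and similar orchestrators), which in turn take priority over defaults. The function and the exact variable names are illustrative assumptions, not training_hub internals.

```python
import os

# Illustrative sketch of hierarchical launch-parameter resolution:
# explicit argument > environment variable > library default.
# Names like NNODES/NPROC_PER_NODE here are assumptions for illustration.
def resolve_launch_params(nnodes=None, nproc_per_node=None,
                          master_addr=None, master_port=None):
    return {
        "nnodes": nnodes or os.environ.get("NNODES", 1),
        # Per the release notes, "auto" and "gpu" are accepted, so the
        # resolved value may be a string or an integer.
        "nproc_per_node": nproc_per_node
            or os.environ.get("NPROC_PER_NODE", "auto"),
        "master_addr": master_addr
            or os.environ.get("MASTER_ADDR", "127.0.0.1"),
        "master_port": master_port
            or os.environ.get("MASTER_PORT", 29500),
    }

print(resolve_launch_params(nproc_per_node="gpu"))  # explicit value wins
```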

Model Support Expansion

  • Granite 4 support (transformers>=4.57.0)
  • Mamba model support with optional CUDA acceleration (mamba-ssm[causal-conv1d]>=2.2.5)
  • Enhanced compatibility through dependency updates (see the loading sketch after this list)
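
With the updated dependencies installed (transformers>=4.57.0, optionally mamba-ssm[causal-conv1d]>=2.2.5 for CUDA-accelerated Mamba layers), loading one of the newly supported architectures should follow the standard transformers pattern. The checkpoint name below is a placeholder, not a confirmed model ID.

```python
# Sketch of loading a Granite 4 model via transformers; the model ID is a
# hypothetical placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.0-micro"  # placeholder checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```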

Infrastructure Improvements

  • Uncapped NumPy for better forward compatibility
  • Minimum Numba version raised to 0.62.0
  • Liger kernel minimum raised to >=0.5.10 for stability
  • Updated backend implementations (instructlab-training>=0.12.1, rhai-innovation-mini-trainer>=0.3.0); see the version-check sketch after this list
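
The new version floors can be sanity-checked in an existing environment. The floors below are taken from the release notes; the helper itself is illustrative and not part of training_hub.

```python
# Check installed versions against this release's dependency floors.
from importlib.metadata import version, PackageNotFoundError

floors = {
    "numba": "0.62.0",
    "liger-kernel": "0.5.10",
    "instructlab-training": "0.12.1",
    "rhai-innovation-mini-trainer": "0.3.0",
}

for pkg, floor in floors.items():
    try:
        print(f"{pkg}: installed {version(pkg)}, requires >={floor}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed (requires >={floor})")
```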

What's Changed

  • Pinning liger-kernel version by @Fiona-Waters in #9
  • Adding min dependencies for Granite 4 / Mamba support by @Maxusmusti in #14
  • uncap numpy and raise minimum numba version by @RobotSail in #15
  • Adding basic API for memory profiling (src/training_hub/profiling) by @mazam-lab in #11
  • feat(traininghub): Use torchrun environment variables for default configuration by @szaher in #13
  • Update backend implementation dep versions in pyproject.toml by @Maxusmusti in #19

New Contributors

Full Changelog: v0.2.0...v0.3.0