RFC: Trainer improvements
The Trainer class is a core component of the Transformers library, and we're looking to make it even better.
We're gathering input on potential improvements, new features, and pain points you've experienced with the Trainer class.
We're particularly interested in feedback on:
- Training Performance: Speed optimizations, memory efficiency, distributed training improvements
- API & Usability: API design, documentation, ease of use, common pain points
- Features & Functionality: Missing features, callbacks, logging, evaluation, checkpointing
- Integration: Better integration with other libraries or tools
- Debugging & Monitoring: Better error messages, debugging tools, training diagnostics
- Developer Experience: Contributing to Trainer, extending functionality, custom trainers
💭 Questions to Guide Your Feedback
For Users
- What's the most frustrating aspect of using Trainer?
- What feature would save you the most time?
- Are there any workarounds you regularly implement that should be built-in?
- What use cases does Trainer not support well?
- How could the documentation be improved?
For Developers
- What would make it easier to extend or customize Trainer?
- Are there architectural changes that would improve maintainability?
- What integration points with other libraries should we prioritize?
- Where do you see technical debt or areas needing refactoring?
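When discussing extension points, it may help to ground the conversation in the two customization hooks most people reach for today: subclassing `Trainer` (typically to override `compute_loss`) and registering a `TrainerCallback`. A minimal sketch, assuming `transformers` and `torch` are installed; exact method signatures vary between versions, so `**kwargs` is used defensively, and the loss scaling is purely illustrative:

```python
# Sketch of the two main Trainer extension points (assumes `transformers`
# and `torch` are installed; signatures may differ across versions).
from transformers import Trainer, TrainerCallback


class LossLoggingCallback(TrainerCallback):
    """Print the most recently logged metrics at the end of each epoch."""

    def on_epoch_end(self, args, state, control, **kwargs):
        if state.log_history:
            print(f"epoch {state.epoch}: {state.log_history[-1]}")
        return control


class ScaledLossTrainer(Trainer):
    """Override compute_loss to post-process the model's loss."""

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        outputs = model(**inputs)
        loss = outputs.loss * 0.5  # illustrative scaling, not a real recipe
        return (loss, outputs) if return_outputs else loss
```

A callback like this is wired in via the constructor, e.g. `ScaledLossTrainer(model=model, args=args, callbacks=[LossLoggingCallback()])`. Pain points with exactly these two patterns (what state callbacks can and cannot touch, which private methods a subclass ends up copying) are the kind of feedback that is most actionable here.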
📝 How to Contribute Your Ideas
Please share your thoughts by:
- Commenting below with your feature requests or improvement ideas
- Describing your use case - help us understand the problem you're trying to solve
- Being specific - concrete examples and scenarios are extremely helpful
- Upvoting ideas from others that you'd also find valuable
🤝 What Happens Next
- We'll review all suggestions and feedback
- Popular and high-impact ideas will be prioritized
- Contributors interested in implementing features are welcome to volunteer!
🙏 Thank You
Your feedback drives the evolution of the Transformers library. Every perspective helps us build better tools for the ML community. Special thanks to @stas00 for getting this conversation started!
List of PRs created thanks to this thread
Surface-Level Unbloat
- Simplify TrainingArguments docstring #43568 : improve the TrainingArguments docstring so it's easier to navigate between the different arguments
- [Trainer] Move sort and rotate checkpoints to standalone functions #43736 : move sort and rotate checkpoints to standalone functions
- [Trainer] Move NEFTune impl to standalone functions #43714 : move the NEFTune implementation to standalone functions
- [Trainer] Move optimizer cls init to trainer_optimizer.py #43738: move optimizer initialization to the trainer_optimizer.py file
- Minor changes trainer #43744: assorted minor cleanups in Trainer
Methods
Initialization
- Refactor trainer init #43807: Trainer init
- Update TrainingArguments #43806: TrainingArguments init + test
Reorder
- Reorder Trainer methods #43914: reorder all Trainer methods
Refactor tests
- Refactor trainer data_collator and callbacks tests #43776: data collator and callbacks
- Update common tests Trainer #44260: other common trainer tests
- Update distributed tests #44338: Distributed tests
- Add testing guide for agents for trainer tests #44328: testing guide template for AGENTS files covering Trainer tests
Priority List
- soon