Difficulty: Hard
Description:
We should investigate replacing the standard Muon optimizer with NorMuon. This variant keeps Muon's Newton-Schulz orthogonalization and adds a neuron-wise normalization of the orthogonalized update, driven by a running second-moment estimate, so that update magnitudes are balanced across neurons. NorMuon has been associated with faster convergence in later speedrun records.
Task:
- Port the NorMuon optimization logic (specifically the update step and axis handling).
- (optional) Run a baseline training run with standard Muon vs. NorMuon.
- (optional) Perform a hyperparameter sweep on the learning rate, as NorMuon often requires different tuning than standard Muon.
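As a starting point for the port, here is a minimal sketch of the part that differs from standard Muon, assuming it is a per-neuron rescaling applied to the already-orthogonalized update. It is written in plain Python on nested lists so it runs without dependencies; a real port would use tensors inside the existing optimizer. The names `normuon_rescale`, `second_moment`, `beta2`, `eps`, and `axis` are hypothetical, the defaults are placeholders, and restoring the Frobenius norm at the end is an assumption made to keep the step size comparable to Muon.

```python
import math


def frobenius_norm(matrix):
    """Square root of the sum of squared entries of a 2D list."""
    return math.sqrt(sum(x * x for row in matrix for x in row))


def normuon_rescale(update, second_moment, beta2=0.95, eps=1e-8, axis=1):
    """Rescale an already-orthogonalized 2D update per neuron (sketch).

    update:        list of rows, e.g. the output of Newton-Schulz.
    second_moment: per-neuron running mean of squared entries;
                   updated in place and kept as optimizer state.
    axis:          axis that is reduced for the statistic
                   (1 -> one value per row, 0 -> one value per column).
    """
    rows, cols = len(update), len(update[0])

    # Mean of squared entries for each neuron along the reduced axis.
    if axis == 1:
        mean_sq = [sum(x * x for x in row) / cols for row in update]
    elif axis == 0:
        mean_sq = [sum(update[i][j] ** 2 for i in range(rows)) / rows
                   for j in range(cols)]
    else:
        raise ValueError("axis must be 0 or 1")
    if len(second_moment) != len(mean_sq):
        raise ValueError("second_moment length must match the kept axis")

    # Exponential moving average of the per-neuron statistic.
    for k, m in enumerate(mean_sq):
        second_moment[k] = beta2 * second_moment[k] + (1.0 - beta2) * m

    # Divide each neuron's slice by the root of its running second moment.
    scale = [1.0 / (math.sqrt(v) + eps) for v in second_moment]
    if axis == 1:
        scaled = [[x * scale[i] for x in row] for i, row in enumerate(update)]
    else:
        scaled = [[x * scale[j] for j, x in enumerate(row)] for row in update]

    # Assumption: restore the original Frobenius norm so the effective
    # step size stays close to what the Muon learning rate was tuned for.
    factor = frobenius_norm(update) / (frobenius_norm(scaled) + eps)
    return [[x * factor for x in row] for row in scaled]
```

For the 2x2 update `[[3.0, 4.0], [0.3, 0.4]]` with a zero-initialized `second_moment`, both output rows end up with approximately the same norm, while the overall Frobenius norm matches the input. The `axis` argument is where the axis handling mentioned in the task would live: which dimension counts as "one neuron" depends on how the repository lays out its weight matrices.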
References:
🛠️ General Instructions
1. Environment Setup
To get started with development, follow these steps:
- Fork this repository - Click the "Fork" button at the top right.
- Clone your fork:
git clone FORK_URL_HERE
cd 5-dollar-llm
(You may also clone it with our coding IDE)
(Note: If you have already forked or cloned, please sync your fork with this repo and pull the latest changes to your local copy before starting, as we make frequent changes.)
- Install dependencies:
pip install -r requirements.txt
2. Write your code
3. Verification & Testing
- Debug Mode: To quickly check if your code runs without errors (on CPU or GPU), use the debug script:
- Performance Test (optional): We will run the experiments anyway, but you may also run them yourself (specify a new experiment name so you don't overwrite the baseline):
python train_moe.py --experiment_name amp_speed_test
(Note: This will use the GPU24GBMoEModelConfig by default)
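If you also do the optional learning-rate sweep, each run needs its own experiment name for the same reason. The following is a rough sketch of generating one command per learning rate. Only `train_moe.py` and `--experiment_name` come from the instructions above; the `--muon_lr` flag, the `base_lr` value, and the multipliers are hypothetical placeholders, so check how the training script actually receives the optimizer learning rate before using it.

```python
def sweep_commands(base_lr, factors=(0.25, 0.5, 1.0, 2.0, 4.0),
                   lr_flag="--muon_lr"):
    """Build one train_moe.py command per learning rate (sketch).

    lr_flag is a HYPOTHETICAL placeholder: replace it with whatever
    mechanism the repository uses to set the optimizer learning rate.
    """
    commands = []
    for factor in factors:
        lr = base_lr * factor
        # Unique name per run so no result overwrites the baseline.
        name = f"normuon_lr_{lr:g}"
        commands.append(
            ["python", "train_moe.py", "--experiment_name", name,
             lr_flag, f"{lr:g}"]
        )
    return commands


if __name__ == "__main__":
    # base_lr=0.02 is an illustrative value, not the repository default.
    for cmd in sweep_commands(base_lr=0.02):
        print(" ".join(cmd))
```

The multiplicative grid is a design choice: learning rates are usually compared on a log scale, so doubling and halving around a known-good Muon value covers more ground than evenly spaced steps.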
4. Submission
Once finished, please create a Pull Request into the development branch, and preferably notify us on Discord as well.
No experiment is guaranteed to yield an improvement; however, you will be credited for your work in any case.