Commit bf2889c

Merge pull request #120 from EmelyanenkoK/pr-118
Drop MLP blocks in first layer. Includes WR Changes from PR#109, PR#117 and PR#118
2 parents 265d55d + 29f3978 commit bf2889c

22 files changed

Lines changed: 56688 additions & 5 deletions

README.md

Lines changed: 2 additions & 2 deletions
@@ -38,8 +38,8 @@ To run the current record, run the following commands.
 git clone https://github.com/KellerJordan/modded-nanogpt.git && cd modded-nanogpt
 pip install -r requirements.txt
 pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu126 --upgrade
-# downloads only the first 800M training tokens to save time
-python data/cached_fineweb10B.py 8
+# downloads only the first 900M training tokens to save time
+python data/cached_fineweb10B.py 9
 ./run.sh
 ```
 
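
In the README change above, the script argument moves from 8 to 9 while the comment moves from 800M to 900M tokens, which suggests the argument counts data chunks of roughly 100M tokens each. Below is a minimal sketch of that arithmetic under this assumption. `TOKENS_PER_CHUNK` and `chunks_needed` are hypothetical names used only for illustration and are not part of the repository.

```python
# Assumption inferred from the README comments (8 -> 800M, 9 -> 900M):
# each chunk holds about 100M training tokens.
TOKENS_PER_CHUNK = 100_000_000


def chunks_needed(target_tokens: int, tokens_per_chunk: int = TOKENS_PER_CHUNK) -> int:
    """Return the smallest chunk count whose total covers target_tokens."""
    if target_tokens <= 0 or tokens_per_chunk <= 0:
        raise ValueError("token counts must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return (target_tokens + tokens_per_chunk - 1) // tokens_per_chunk


if __name__ == "__main__":
    # Candidate argument for data/cached_fineweb10B.py given a 900M-token budget.
    print(chunks_needed(900_000_000))
```

Under the same assumption, a budget that is not a multiple of 100M tokens rounds up to the next whole chunk, so 850M tokens would also need 9 chunks.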
