
Clean SP8192 LegalTTT reproduction metadata #7

Merged: teslaeco merged 1 commit into main from codex/task-title-5jezva on Apr 18, 2026


Conversation

@teslaeco
Member

Motivation

  • Final hygiene cleanup to ensure metadata files are plain UTF-8 / ASCII-safe and free of bidi/control characters without changing any reported metrics, logs, claims, or folder structure.

Description

  • Rewrote submission.json as clean pretty-printed JSON (expanded the seeds array) and ensured README.md is plain UTF-8 text; preserved all numeric metrics, seeds, dates, hardware, compliance flags, logs, train_gpt.py, folder name, and the root leaderboard.
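A minimal sketch of the kind of rewrite described above, assuming the file is re-serialized with Python's standard `json` module. The function name and the sample field names are hypothetical, not taken from the actual submission.json. Note that round-tripping through `json.loads` preserves numeric values but can change how a float is spelled in the text (for example `1.10` becomes `1.1`), so a careful rewrite would compare parsed values before and after.

```python
import json

def pretty_print_json(raw: str) -> str:
    """Re-serialize JSON text with indentation, leaving parsed values unchanged."""
    data = json.loads(raw)
    # indent=2 expands arrays to one element per line;
    # ensure_ascii=True escapes any non-ASCII character as \uXXXX,
    # which keeps the output ASCII-safe.
    return json.dumps(data, indent=2, ensure_ascii=True) + "\n"

# Hypothetical compact input; the keys are illustrative only.
compact = '{"seeds": [1, 2, 3], "name": "example"}'
pretty = pretty_print_json(compact)

# The parsed content must be identical before and after the rewrite.
assert json.loads(pretty) == json.loads(compact)
```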

Testing

  • Validated that submission.json parses with Python's json.load.
  • Ran a byte-level scan for bidi/control characters on README.md and submission.json (no issues).
  • Verified that the required per-seed val_bpb values and dataset/tokenizer markers exist in each train_seed*.log.
  • Observed that the upstream train_gpt.py reference was missing, so no replacement was applied.
  • Confirmed that git diff --stat shows only submission.json changed (1 file changed, 5 insertions(+), 1 deletion(-)).
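The bidi/control character scan could look roughly like the sketch below. This is an assumption about how such a check might be written, not the script actually used for this PR; `find_suspicious_chars` and the allowed-whitespace set are hypothetical choices. It decodes the bytes as UTF-8 and flags any character in Unicode category Cc (control) or Cf (format), which covers the explicit bidi controls listed for clarity. Offsets are character offsets into the decoded text.

```python
import unicodedata

# Explicit bidi controls (marks, embeddings, overrides, isolates).
# All of these are also in category Cf; they are listed for readability.
BIDI_CONTROLS = {
    "\u200e", "\u200f", "\u061c",
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}
# Ordinary line and tab whitespace is treated as acceptable (an assumption).
ALLOWED_WHITESPACE = {"\n", "\r", "\t"}

def find_suspicious_chars(data: bytes):
    """Return (offset, codepoint) pairs for bidi/control/format characters."""
    text = data.decode("utf-8")  # raises UnicodeDecodeError on invalid UTF-8
    hits = []
    for offset, ch in enumerate(text):
        if ch in ALLOWED_WHITESPACE:
            continue
        if ch in BIDI_CONTROLS or unicodedata.category(ch) in ("Cc", "Cf"):
            hits.append((offset, f"U+{ord(ch):04X}"))
    return hits

# Clean input yields no hits; a right-to-left override is reported.
clean_hits = find_suspicious_chars(b'{"seeds": [1, 2]}\n')
bad_hits = find_suspicious_chars("abc\u202edef".encode("utf-8"))
```

In practice the function would be called on the bytes of each file, for example `find_suspicious_chars(open("README.md", "rb").read())`, and an empty list would correspond to the "no issues" result reported above.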

Codex Task

@teslaeco merged commit 3a1430b into main on Apr 18, 2026
