Add mirrored-recurrence MLX non-record submission #84
cschubiner wants to merge 1 commit into openai:main from
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6853b49028
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Configuration:
- Hardware: Apple `M5 Max`, MLX `0.31.1`
- Data: published `fineweb10B_sp1024` export, full validation split, `1/195` training shards
Specify the exact single training shard used
This README says the run used the published fineweb10B_sp1024 export with 1/195 train shards, but it never identifies which shard was kept or how DATA_PATH was prepared. The checked-in train.log shows the actual run only saw one shard (train_shards:1/195), so rerunning the documented command against a normal fineweb10B_sp1024 export will train on all 195 shards and produce a materially different experiment. As written, the submission is not reproducible.
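One way to close this gap is to check in the exact preparation step. Below is a minimal sketch of building a single-shard `DATA_PATH` from the full export; the directory names, glob patterns, and `KEPT_SHARD_INDEX` are all assumptions (the submission documents none of them), so the README would need to pin the real values.

```python
# Hypothetical sketch: build a single-shard DATA_PATH from a full
# fineweb10B_sp1024 export so a rerun matches train_shards:1/195.
# Directory names, glob patterns, and KEPT_SHARD_INDEX are assumptions,
# not values documented by the submission.
import shutil
from pathlib import Path

FULL_EXPORT = Path("fineweb10B_sp1024")        # full 195-shard export
DATA_PATH = Path("fineweb10B_sp1024_1shard")   # what the rerun should point at
KEPT_SHARD_INDEX = 0                           # assumed; must match train.log

DATA_PATH.mkdir(parents=True, exist_ok=True)

# Keep the full validation split unchanged.
for f in sorted(FULL_EXPORT.glob("*val*")):
    shutil.copy2(f, DATA_PATH / f.name)

# Keep exactly one training shard.
train_shards = sorted(FULL_EXPORT.glob("*train*"))
assert len(train_shards) == 195, f"expected 195 train shards, got {len(train_shards)}"
kept = train_shards[KEPT_SHARD_INDEX]
shutil.copy2(kept, DATA_PATH / kept.name)
print(f"kept {kept.name}; DATA_PATH={DATA_PATH}")
```

Documenting the kept shard (or checking in a script like this) would let the documented command reproduce the logged `train_shards:1/195` run.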
```python
self.encoder_schedule = [i % self.unique_layers for i in range(self.num_encoder_layers)]
mirrored_decoder = list(reversed(self.encoder_schedule[: self.num_skip_weights]))
trailing_decoder = [i % self.unique_layers for i in range(self.num_skip_weights, self.num_decoder_layers)]
self.decoder_schedule = mirrored_decoder + trailing_decoder
```
Reject UNIQUE_LAYERS settings that leave blocks unused
This schedule only uses every allocated block when UNIQUE_LAYERS <= NUM_LAYERS // 2 or UNIQUE_LAYERS == NUM_LAYERS. For configurations the constructor currently accepts between those ranges (for example NUM_LAYERS=18, UNIQUE_LAYERS=12), encoder_schedule is still 0..8 and decoder_schedule becomes 8..0, so blocks[9:] are dead parameters that never participate in the forward pass but still count toward the 16 MB budget. Either the schedule needs to cover all unique blocks, or those settings should be rejected.
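A self-contained sketch of such a guard is below. The four schedule lines are lifted from the diff; how `num_encoder_layers`, `num_decoder_layers`, and `num_skip_weights` are derived is assumed, not taken from the submission, so the exact set of accepted configurations may differ in the real constructor.

```python
# Standalone sketch of the mirrored-recurrence schedule from the diff,
# plus the coverage guard the review asks for. Everything outside the
# four schedule lines (the layer-count split, free-function form) is an
# assumption about the surrounding constructor.

def build_schedules(num_layers: int, unique_layers: int, num_skip_weights: int):
    num_encoder_layers = num_layers // 2                 # assumed split
    num_decoder_layers = num_layers - num_encoder_layers

    # The four lines from the submission, with `self.` dropped:
    encoder_schedule = [i % unique_layers for i in range(num_encoder_layers)]
    mirrored_decoder = list(reversed(encoder_schedule[:num_skip_weights]))
    trailing_decoder = [i % unique_layers
                        for i in range(num_skip_weights, num_decoder_layers)]
    decoder_schedule = mirrored_decoder + trailing_decoder

    # Guard: every allocated block must appear in at least one schedule,
    # otherwise the unused blocks are dead parameters that still count
    # toward the 16 MB budget.
    used = set(encoder_schedule) | set(decoder_schedule)
    unused = sorted(set(range(unique_layers)) - used)
    if unused:
        raise ValueError(
            f"UNIQUE_LAYERS={unique_layers} leaves blocks {unused} unused "
            f"with NUM_LAYERS={num_layers} and NUM_SKIP_WEIGHTS={num_skip_weights}"
        )
    return encoder_schedule, decoder_schedule

# The review's example: NUM_LAYERS=18, UNIQUE_LAYERS=12 (9-layer encoder,
# fully mirrored decoder) is rejected because blocks 9..11 never run.
# build_schedules(18, 12, 9)  # -> ValueError
```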
Community Review — Add mirrored-recurrence MLX non-record submission
Compliance: NEEDS AUTHOR ACTION
What I found: The CPU smoke test on CT2038 (proteus-engine, 128 GB RAM, Triton 3.6.0, flash_attn stub, cutlass_evt_fusion stub) failed at the import step with ModuleNotFoundError: No module named 'mlx'. This matches a class of failure seen across the 2026-04-11 sweep: the audit host cannot import the submission's framework, so the audit halts before any compliance checks run.
Recommendation: Could you fix the import failure, or document MLX as a hard platform requirement, so the script can at least be parsed on the audit machine? Once the parse/import issue is fixed, I'll re-run the compliance audit through the normal pipeline. No other flags have been identified yet because the audit halts at the import step.
Reviewed by @MatoTeziTanka — The Agora.
CPU smoke test (CT2038 proteus-engine, 2026-04-11): IMPORT_FAIL — ModuleNotFoundError: No module named 'mlx'.
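For what it's worth, a common pattern for keeping a script parseable on hosts that cannot install the framework is a guarded import. This is an illustrative sketch, not something the submission currently does:

```python
# Illustrative pattern (not from the submission): defer the hard MLX
# dependency so the module can be imported and statically audited on
# hosts where mlx cannot be installed (e.g., Linux CPU boxes).
try:
    import mlx.core as mx
    import mlx.nn as nn
except ImportError:  # hit on non-Apple-silicon audit machines
    mx = None
    nn = None

def require_mlx() -> None:
    """Raise a clear error only when MLX is actually needed at runtime."""
    if mx is None:
        raise RuntimeError(
            "This submission requires Apple's MLX framework; "
            "install mlx on Apple silicon to run training."
        )
```

A pattern like this would let the compliance audit get past the import step even on machines where training itself cannot run.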
Adds a non-record mirrored-recurrence submission under `records/track_non_record_16mb/2026-03-19_MirrorRecurrence_MLX_M5Max_sp1024`.
Summary: This PR only adds the new records folder for the submission.