add scheduler by nataliakokoromyti · Pull Request #52 · Zaneham/BarraCUDA

nataliakokoromyti · 2026-03-01T00:29:44Z

The scheduler strips the s_waitcnt after every load, builds a DAG from register deps (RAW/WAW), schedules by critical-path priority, and re-inserts waits just before the first consumer of each load. As a result, two independent loads issue back to back instead of as load, wait, load, wait. It runs on vregs between isel and regalloc, per block only, with barriers splitting each block into epochs. Should address #9.
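
To make the wait re-insertion step concrete, here is a minimal sketch in C. It is not the code from this PR: the `sk_` types and `sk_insert_waits` are hypothetical names invented for illustration, and the single `SK_WAIT` marker is a simplification (it conservatively drains every outstanding load, whereas a real s_waitcnt carries counter fields).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_MAX_VREGS 64u          /* assumed limit, for the sketch only */
#define SK_NO_REG    UINT32_MAX   /* marks an unused operand slot */

typedef enum { SK_LOAD, SK_ALU, SK_WAIT } sk_op_t;

typedef struct {
    sk_op_t  op;
    uint32_t dst;     /* vreg written, or SK_NO_REG */
    uint32_t src[2];  /* vregs read, or SK_NO_REG */
} sk_inst_t;

/* Copy `in` to `out`, inserting one SK_WAIT immediately before the first
 * instruction that reads the result of a load still in flight.
 * Returns the number of instructions written, or SIZE_MAX if `out` is too
 * small or a register index is out of range. */
size_t sk_insert_waits(const sk_inst_t *in, size_t n,
                       sk_inst_t *out, size_t cap)
{
    uint8_t pending[SK_MAX_VREGS] = {0};  /* 1 = load result not yet waited on */
    size_t pos = 0;

    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        int need_wait = 0;
        for (size_t s = 0; s < 2; s++) {
            uint32_t r = in[i].src[s];
            if (r == SK_NO_REG) continue;
            if (r >= SK_MAX_VREGS) return SIZE_MAX;
            if (pending[r]) need_wait = 1;
        }
        if (need_wait) {
            if (pos >= cap) return SIZE_MAX;
            out[pos++] = (sk_inst_t){ SK_WAIT, SK_NO_REG,
                                      { SK_NO_REG, SK_NO_REG } };
            /* Conservative: one wait drains all outstanding loads. */
            for (size_t r = 0; r < SK_MAX_VREGS; r++) pending[r] = 0;
        }
        if (pos >= cap) return SIZE_MAX;
        out[pos++] = in[i];
        if (in[i].op == SK_LOAD) {
            if (in[i].dst >= SK_MAX_VREGS) return SIZE_MAX;
            pending[in[i].dst] = 1;
        }
    }
    return pos;
}
```

With two independent loads followed by an instruction that reads both results, this emits load, load, wait, consumer: one wait instead of two, with both loads in flight at once.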

Zaneham · 2026-03-02T21:30:01Z

Hello @nataliakokoromyti!

I'll get to reviewing this soon; I'm just having a couple of low-tempo days with BarraCUDA now that the hype train has moved on.

I'll have a look at this tonight and thanks again for your submission!

Zaneham

Hey!! Thanks for this, the overall design is really solid. Stripping waits, DAG from register deps, critical-path scheduling, re-inserting waits at consumers. The architecture is exactly right and the epoch-based barrier handling is clean :-) A few things need fixing before we can merge though!

Correctness issues:

Missing WAR (Write-After-Read) on special registers. The DAG tracks RAW and WAW but not WAR. Vregs are SSA (single-def) so they're fine, but VCC/SCC get redefined. If isel emits v_cmp (writes VCC) then v_cndmask (reads VCC) then another v_cmp (writes VCC), there's no edge preventing the second cmp from being scheduled before the cndmask. It's like a librarian reshelving your book while you're still reading it. Need to track last_use alongside last_def and add use->def edges.
Missing implicit SCC read on s_cselect_b32. Isel emits it as emit2(dst, src0, src1) with no SCC operand, so the scheduler has no RAW edge from the SCC-writing cmp. Same issue for any instruction that implicitly reads SCC.
Missing implicit SCC write on most SOP2 instructions. s_and_b32, s_or_b32, s_lshl_b32, etc. all set SCC but only s_add_u32/s_sub_u32 and SOPC are tracked.
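
The three points above can be sketched together as one dependency tracker. This is an illustration under assumptions, not the PR's data structures: the `dg_` names, the register ids, and the bitmask operand encoding are all hypothetical. It also keeps a set of readers per register rather than a single last_use, because with two unordered readers a single slot would only protect the later one from the next write. Implicit operands (an SCC read by s_cselect_b32, an SCC write by s_and_b32) are simply expected to appear in the masks the caller passes in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DG_MAX_NODES 32u         /* assumed; lets a uint32_t hold a node set */
#define DG_MAX_REGS  8u          /* assumed number of tracked special regs */
#define DG_NONE      UINT32_MAX

enum { DG_VCC = 0, DG_SCC = 1 }; /* hypothetical register ids */

typedef struct {
    uint8_t  edge[DG_MAX_NODES][DG_MAX_NODES]; /* edge[a][b]: a before b */
    uint32_t last_def[DG_MAX_REGS];            /* last writer, or DG_NONE */
    uint32_t readers[DG_MAX_REGS];             /* nodes reading since last def */
} dg_t;

void dg_init(dg_t *g)
{
    memset(g, 0, sizeof *g);
    for (uint32_t r = 0; r < DG_MAX_REGS; r++) g->last_def[r] = DG_NONE;
}

static void dg_edge(dg_t *g, uint32_t from, uint32_t to)
{
    if (from != to) g->edge[from][to] = 1;
}

/* Add node `n` in program order. `use_mask` and `def_mask` are bitmasks over
 * register ids and must include implicit operands such as SCC. */
void dg_add(dg_t *g, uint32_t n, uint32_t use_mask, uint32_t def_mask)
{
    assert(g != NULL && n < DG_MAX_NODES);
    for (uint32_t r = 0; r < DG_MAX_REGS; r++) {
        uint32_t bit  = 1u << r;
        int      uses = (use_mask & bit) != 0;
        int      defs = (def_mask & bit) != 0;

        if (uses && g->last_def[r] != DG_NONE)
            dg_edge(g, g->last_def[r], n);              /* RAW */
        if (defs) {
            if (g->last_def[r] != DG_NONE)
                dg_edge(g, g->last_def[r], n);          /* WAW */
            for (uint32_t m = 0; m < DG_MAX_NODES; m++)
                if (g->readers[r] & (1u << m))
                    dg_edge(g, m, n);                   /* WAR */
            g->last_def[r] = n;
            g->readers[r]  = 0;   /* earlier readers are now ordered before n */
        } else if (uses) {
            g->readers[r] |= 1u << n;
        }
    }
}
```

For the v_cmp, v_cndmask, v_cmp sequence described above, adding the nodes as 0, 1, 2 yields edges 0->1 (RAW), 0->2 (WAW) and 1->2 (WAR), so the second compare can no longer be hoisted above the cndmask.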

Safety violations (we're pretty strict about these!):

Unbounded loop. The main scheduling loop is for (;;) with a break. All loops need bounded iteration counts -- something like for (uint32_t guard = 0; guard < SCHED_MAX_BLOCK * 2; guard++). Runaway loops in a compiler are like runaway trains except the train is on fire and the tracks are your user's deadline.
Missing bounds checks on s_final, s_ready, and s_output. The fn index during wait reinsertion, nready in the scheduling loop, and out_pos in amdgpu_sched() all increment without overflow guards. If they exceed the array size we get silent memory corruption.
add_edge silently drops edges at SCHED_MAX_DEPS (16). If a register has more than 16 consumers, dependency edges vanish and the scheduler could misorder those instructions. It's like air traffic control just forgetting about plane 17 because the whiteboard ran out of space. Should either bump the limit or pin the node as unschedulable when it overflows.

Housekeeping:

File naming. Other files in src/amdgpu/ use short names (isel.c, emit.c, encode.c). Would be great to rename to sched.c/sched.h to match.
Merge conflict. The branch is based on pre-tensix-reorg so the Makefile has src/tensix_isel.c etc. which moved to src/tensix/isel.c. Will need a rebase on master.

The core scheduling logic is great work though!! Really looking forward to getting this merged once the fixes are in :-) If you have any questions or need any help please feel free to reach out. Thanks again!

nataliakokoromyti force-pushed the add-instsched branch from 2c1aca3 to e63c858 Compare March 1, 2026 00:36

Zaneham reviewed Mar 3, 2026

nataliakokoromyti force-pushed the add-instsched branch 4 times, most recently from 90b9af4 to 610595f Compare March 3, 2026 22:14

add scheduler

75e8a99

nataliakokoromyti force-pushed the add-instsched branch from 610595f to 75e8a99 Compare March 3, 2026 22:15

nataliakokoromyti requested a review from Zaneham March 3, 2026 22:17

Zaneham merged commit 92b5d19 into Zaneham:master Mar 5, 2026
3 checks passed
