add scheduler#52

Merged
Zaneham merged 1 commit into Zaneham:master from nataliakokoromyti:add-instsched
Mar 5, 2026
@nataliakokoromyti
Contributor

The scheduler strips the s_waitcnt after every load, builds a DAG from register deps (RAW/WAW), schedules by critical-path priority, and re-inserts waits just before the first consumer of each load. Two independent loads therefore issue back to back instead of as load, wait, load, wait. It runs on vregs between isel and regalloc, per block only, with barriers splitting each block into epochs. Should address #9.
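
To make the wait re-insertion step concrete, here is a minimal sketch of the idea, not the code from this PR. The `Inst` layout, `OpKind`, `MAX_VREGS`, and `reinsert_waits` are all hypothetical names, and the model is deliberately simplified: a single wait is assumed to drain every outstanding load, whereas a real s_waitcnt carries counter values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction model (not BarraCUDA's real types). */
enum { MAX_VREGS = 64, NO_REG = -1 };
typedef enum { OP_LOAD, OP_ALU, OP_WAIT } OpKind;
typedef struct { OpKind kind; int dst; int src[2]; } Inst;

/* Copy `in` to `out`, emitting one OP_WAIT just before the first
 * instruction that reads the result of a still-outstanding load.
 * Assumption: one wait drains all outstanding loads.
 * Returns the number of instructions written, or -1 if `out` is too small. */
int reinsert_waits(const Inst *in, int n, Inst *out, int cap)
{
    bool pending[MAX_VREGS] = { false };  /* vregs with a load in flight */
    int o = 0;

    assert(in != NULL && out != NULL);
    for (int i = 0; i < n; i++) {
        bool need_wait = false;
        for (int s = 0; s < 2; s++) {
            int r = in[i].src[s];
            if (r >= 0 && r < MAX_VREGS && pending[r])
                need_wait = true;
        }
        if (need_wait) {
            if (o >= cap) return -1;      /* bounds check before writing */
            out[o++] = (Inst){ OP_WAIT, NO_REG, { NO_REG, NO_REG } };
            for (int r = 0; r < MAX_VREGS; r++)
                pending[r] = false;       /* the wait covers every load */
        }
        if (o >= cap) return -1;
        out[o++] = in[i];
        if (in[i].kind == OP_LOAD && in[i].dst >= 0 && in[i].dst < MAX_VREGS)
            pending[in[i].dst] = true;
    }
    return o;
}
```

With two loads into v1 and v2 followed by an ALU op that reads both, this produces load, load, wait, ALU: a single wait placed at the first consumer.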

@Zaneham
Owner

Zaneham commented Mar 2, 2026

Hello @nataliakokoromyti!

I'll get to reviewing this soon; I'm just having a couple of low-tempo days with BarraCUDA now that the hype train has moved on.

I'll have a look at this tonight and thanks again for your submission!

Owner

@Zaneham Zaneham left a comment

Hey!! Thanks for this, the overall design is really solid. Stripping waits, DAG from register deps, critical-path scheduling, re-inserting waits at consumers. The architecture is exactly right and the epoch-based barrier handling is clean :-) A few things need fixing before we can merge though!

Correctness issues:

  1. Missing WAR (Write-After-Read) on special registers. The DAG tracks RAW and WAW but not WAR. Vregs are SSA (single-def) so they're fine, but VCC/SCC get redefined. If isel emits v_cmp (writes VCC) then v_cndmask (reads VCC) then another v_cmp (writes VCC), there's no edge preventing the second cmp from being scheduled before the cndmask. It's like a librarian reshelving your book while you're still reading it. Need to track last_use alongside last_def and add use->def edges.

  2. Missing implicit SCC read on s_cselect_b32. Isel emits it as emit2(dst, src0, src1) with no SCC operand, so the scheduler has no RAW edge from the SCC-writing cmp. Same issue for any instruction that implicitly reads SCC.

  3. Missing implicit SCC write on most SOP2 instructions. s_and_b32, s_or_b32, s_lshl_b32, etc. all set SCC but only s_add_u32/s_sub_u32 and SOPC are tracked.
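
A rough sketch of how the three correctness points above could fit together, assuming a small hypothetical opcode set and a dense `dep` matrix rather than the PR's actual data structures. `special_effects` stands in for a table of implicit VCC/SCC reads and writes (only a few representative opcodes are listed), and `build_special_deps` tracks the last def plus every use since that def, so a redefinition picks up WAR edges as well as RAW and WAW.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; not the PR's real types or tables. */
enum { SR_VCC_BIT = 0, SR_SCC_BIT = 1, SR_COUNT = 2 };
#define SR_VCC (1u << SR_VCC_BIT)
#define SR_SCC (1u << SR_SCC_BIT)
#define MAX_N  32

typedef enum {
    OP_V_CMP, OP_V_CNDMASK, OP_S_CMP,
    OP_S_CSELECT_B32, OP_S_AND_B32, OP_S_MOV_B32
} Op;

/* Implicit special-register operands that isel does not list explicitly. */
static void special_effects(Op op, uint32_t *reads, uint32_t *writes)
{
    *reads = 0;
    *writes = 0;
    switch (op) {
    case OP_V_CMP:         *writes = SR_VCC; break;
    case OP_V_CNDMASK:     *reads  = SR_VCC; break;
    case OP_S_CMP:         *writes = SR_SCC; break;
    case OP_S_CSELECT_B32: *reads  = SR_SCC; break; /* implicit SCC read  */
    case OP_S_AND_B32:     *writes = SR_SCC; break; /* implicit SCC write */
    case OP_S_MOV_B32:     break;
    }
}

/* dep[a][b] == true means instruction b must stay after instruction a. */
void build_special_deps(const Op *ops, int n, bool dep[MAX_N][MAX_N])
{
    int  last_def[SR_COUNT];
    bool used_since_def[SR_COUNT][MAX_N];

    assert(n >= 0 && n <= MAX_N);
    memset(dep, 0, sizeof(bool) * MAX_N * MAX_N);
    memset(used_since_def, 0, sizeof used_since_def);
    for (int r = 0; r < SR_COUNT; r++)
        last_def[r] = -1;

    for (int i = 0; i < n; i++) {
        uint32_t reads, writes;
        special_effects(ops[i], &reads, &writes);

        for (int r = 0; r < SR_COUNT; r++) {
            if (!(reads & (1u << r))) continue;
            if (last_def[r] >= 0)
                dep[last_def[r]][i] = true;          /* RAW: def -> use */
            used_since_def[r][i] = true;
        }
        for (int r = 0; r < SR_COUNT; r++) {
            if (!(writes & (1u << r))) continue;
            if (last_def[r] >= 0)
                dep[last_def[r]][i] = true;          /* WAW: def -> def */
            for (int u = 0; u < i; u++) {
                if (used_since_def[r][u]) {
                    dep[u][i] = true;                /* WAR: use -> def */
                    used_since_def[r][u] = false;
                }
            }
            last_def[r] = i;
        }
    }
}
```

For the v_cmp, v_cndmask, v_cmp sequence from item 1, this yields edges 0->1 (RAW), 1->2 (WAR), and 0->2 (WAW), so the second compare can no longer be hoisted above the cndmask.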

Safety violations (we're pretty strict about these!):

  1. Unbounded loop. The main scheduling loop is for (;;) with a break. All loops need bounded iteration counts -- something like for (uint32_t guard = 0; guard < SCHED_MAX_BLOCK * 2; guard++). Runaway loops in a compiler are like runaway trains except the train is on fire and the tracks are your user's deadline.

  2. Missing bounds checks on s_final, s_ready, and s_output. The fn index during wait reinsertion, nready in the scheduling loop, and out_pos in amdgpu_sched() all increment without overflow guards. If they exceed the array size we get silent memory corruption.

  3. add_edge silently drops edges at SCHED_MAX_DEPS (16). If a register has more than 16 consumers, dependency edges vanish and the scheduler could misorder those instructions. It's like air traffic control just forgetting about plane 17 because the whiteboard ran out of space. Should either bump the limit or pin the node as unschedulable when it overflows.
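
As an illustration of the three safety points above, here is a hedged sketch rather than the fix that was actually merged. `Node`, `push_checked`, and `drain_ready` are hypothetical names, and the `SCHED_MAX_BLOCK` value of 256 is an assumption made only for this example; `SCHED_MAX_DEPS` is 16 as stated in the review. It takes the "pin the node" option for overflow, and what a pinned node means to the scheduler is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCHED_MAX_DEPS  16
#define SCHED_MAX_BLOCK 256   /* assumed value, for illustration only */

/* Hypothetical DAG node; the PR's real layout may differ. */
typedef struct {
    uint16_t succ[SCHED_MAX_DEPS];
    uint8_t  nsucc;
    bool     pinned;   /* set when an edge could not be recorded */
} Node;

/* Record from -> to. On overflow, pin both endpoints instead of silently
 * dropping the edge, and report the failure to the caller. */
bool add_edge(Node *nodes, uint16_t from, uint16_t to)
{
    Node *n = &nodes[from];
    for (uint8_t i = 0; i < n->nsucc; i++)
        if (n->succ[i] == to)
            return true;                 /* edge already present */
    if (n->nsucc >= SCHED_MAX_DEPS) {
        n->pinned = true;
        nodes[to].pinned = true;
        return false;
    }
    n->succ[n->nsucc++] = to;
    return true;
}

/* Append with an explicit capacity check instead of a bare increment. */
bool push_checked(uint16_t *arr, uint32_t *count, uint32_t cap, uint16_t v)
{
    if (*count >= cap)
        return false;
    arr[(*count)++] = v;
    return true;
}

/* Drain a ready list with a bounded loop in place of for (;;).
 * Returns false if the output overflows or the guard runs out. */
bool drain_ready(uint16_t *ready, uint32_t nready,
                 uint16_t *out, uint32_t *nout, uint32_t cap)
{
    for (uint32_t guard = 0; guard < SCHED_MAX_BLOCK * 2; guard++) {
        if (nready == 0)
            return true;
        uint16_t pick = ready[--nready];
        if (!push_checked(out, nout, cap, pick))
            return false;
    }
    return nready == 0;
}
```

Returning a status from each helper lets the caller fall back to the unscheduled instruction order when a limit is hit, rather than corrupting memory or misordering instructions.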

Housekeeping:

  1. File naming. Other files in src/amdgpu/ use short names (isel.c, emit.c, encode.c). Would be great to rename to sched.c/sched.h to match.

  2. Merge conflict. The branch is based on pre-tensix-reorg so the Makefile has src/tensix_isel.c etc. which moved to src/tensix/isel.c. Will need a rebase on master.

The core scheduling logic is great work though!! Really looking forward to getting this merged once the fixes are in :-) If you have any questions or need any help please feel free to reach out. Thanks again!

@nataliakokoromyti nataliakokoromyti force-pushed the add-instsched branch 4 times, most recently from 90b9af4 to 610595f Compare March 3, 2026 22:14
@Zaneham Zaneham merged commit 92b5d19 into Zaneham:master Mar 5, 2026
3 checks passed