[SM90] Change register allocation for TileN=208 to avoid spills #2219

tridao · 2025-04-04T16:44:16Z

With the usual register allocation (producer 40, consumer 232), compiling Gemm with tile shape 256 x 208 (cooperative) or 128 x 208 (pingpong) shows a lot of register spilling (e.g. ~3000 bytes of spills). For this case we can change the register allocation to producer 24, consumer 240, which avoids the spills.
Cc @thakkarV @hwu36
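
For readers who want the arithmetic behind the two splits, here is a minimal sketch. It is not the CUTLASS code from this PR: `RegAlloc`, `select_reg_alloc`, `fits_register_file` and `valid_setmaxnreg` are hypothetical names, and the layout of one producer plus two consumer warp groups of 128 threads each, with a 64K-register file, is an assumption about the SM90 warp-specialized kernels. Under those assumptions both splits use the same total budget (40 + 2 x 232 = 24 + 2 x 240 = 504 registers per thread triple), so the change only moves registers from the producer to the consumers.

```cpp
#include <cstdint>

// Hypothetical helper type (not a CUTLASS type): per-thread register targets
// for the producer and consumer warp groups.
struct RegAlloc {
  std::uint32_t producer;
  std::uint32_t consumer;
};

// Pick the split described in the PR: 24/240 for TileN == 208, else 40/232.
constexpr RegAlloc select_reg_alloc(int tile_n) {
  return tile_n == 208 ? RegAlloc{24, 240} : RegAlloc{40, 232};
}

// Assumed layout: 1 producer + 2 consumer warp groups, 128 threads each,
// sharing a 64K-entry 32-bit register file.
constexpr bool fits_register_file(RegAlloc r) {
  constexpr std::uint32_t kThreadsPerWarpGroup = 128;
  constexpr std::uint32_t kConsumerWarpGroups = 2;
  constexpr std::uint32_t kRegisterFileSize = 64 * 1024;
  return kThreadsPerWarpGroup * (r.producer + kConsumerWarpGroups * r.consumer)
         <= kRegisterFileSize;
}

// PTX setmaxnreg accepts counts from 24 to 256 in multiples of 8.
constexpr bool valid_setmaxnreg(std::uint32_t n) {
  return n >= 24 && n <= 256 && n % 8 == 0;
}

// Both splits fit the assumed budget and use legal setmaxnreg values.
static_assert(fits_register_file(select_reg_alloc(128)), "default split fits");
static_assert(fits_register_file(select_reg_alloc(208)), "TileN=208 split fits");
static_assert(valid_setmaxnreg(select_reg_alloc(208).producer), "24 is legal");
static_assert(valid_setmaxnreg(select_reg_alloc(208).consumer), "240 is legal");
```

Note that 24 is the smallest value `setmaxnreg` accepts, so under these assumptions the producer cannot give up any more registers than this.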


thakkarV · 2025-04-04T16:50:00Z

This is awesome! Thank you :D

thakkarV · 2025-04-04T16:50:18Z

@hwu36 @Junkai-Wu

thakkarV · 2025-04-04T16:50:55Z

Were you able to A/B test perf results by any chance?

tridao · 2025-04-04T16:53:20Z

Perf-wise, nothing changes for the usual tile shapes, since we keep the register allocation the same for those. For tile shape 256 x 208 it's obviously a lot faster without the massive spill (5-10x). In some niche cases where 256 x 208 fits the problem size just right and there's minimal wave quantization, 256 x 208 can be a good choice.
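
To illustrate what "fits just right" can mean, here is a rough sketch of a wave-quantization estimate. It is a simplified model, not anything from CUTLASS: it assumes one output tile per SM per wave and ignores persistent or stream-K scheduling. The function names, the 132-SM count, and the problem size are hypothetical choices made for the example.

```cpp
#include <cstdint>

constexpr std::int64_t ceil_div(std::int64_t a, std::int64_t b) {
  return (a + b - 1) / b;
}

// Fraction of SM slots doing useful work across all waves, assuming each SM
// processes one output tile per wave (simplified model).
constexpr double wave_efficiency(std::int64_t m, std::int64_t n,
                                 std::int64_t tile_m, std::int64_t tile_n,
                                 std::int64_t num_sms) {
  const std::int64_t tiles = ceil_div(m, tile_m) * ceil_div(n, tile_n);
  const std::int64_t waves = ceil_div(tiles, num_sms);
  return static_cast<double>(tiles) / static_cast<double>(waves * num_sms);
}

// Hypothetical problem: M = 3072, N = 2288 on a GPU with 132 SMs.
// 256 x 208 gives 12 x 11 = 132 tiles, exactly one full wave.
constexpr double kEff208 = wave_efficiency(3072, 2288, 256, 208, 132);
// 256 x 192 gives 12 x 12 = 144 tiles, so a second wave is mostly idle.
constexpr double kEff192 = wave_efficiency(3072, 2288, 256, 192, 132);
// 256 x 256 gives 12 x 9 = 108 tiles, leaving 24 SMs idle in the only wave.
constexpr double kEff256 = wave_efficiency(3072, 2288, 256, 256, 132);
```

In this made-up case the 256 x 208 tile fills the machine exactly, while the neighboring tile widths leave part of a wave idle, which is the kind of niche where the shape pays off.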


hwu36 approved these changes Apr 21, 2025


hwu36 merged commit ade6376 into NVIDIA:main Apr 21, 2025

tridao deleted the tridao/regalloc-208 branch June 8, 2025 15:56
