
Fix [Spark unit test CI]: defer torch._dynamo.disable to avoid import-time crash in CI #3290

Open

kahyunnam wants to merge 1 commit into flashinfer-ai:main from kahyunnam:knam/fix-defer-torch-dynamo-disable

Conversation

@kahyunnam (Member) commented May 11, 2026

📌 Description

  • feat(moe): add SM120 W4A16 b12x kernels #3271 added a @torch._dynamo.disable decorator to current_cuda_stream() in cute_dsl/utils.py. This eagerly imports torch._dynamo at module load time, which triggers getpass.getuser() during cache-dir initialization. That call crashes in CI containers running as unmapped UIDs (e.g. Spark runners started with -u $(id -u):$(id -g), which maps to UID 996, a UID with no /etc/passwd entry).
  • This PR replaces the eager decorator with a self-replacing lazy wrapper that defers torch._dynamo.disable to the first call of current_cuda_stream(), with zero overhead on subsequent calls.
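The self-replacing lazy-wrapper pattern can be sketched in isolation. This is a generic illustration, not the actual FlashInfer code: `heavy_decorator` stands in for `torch._dynamo.disable`, and the `calls` counter is only there to make the deferral observable.

```python
# Generic sketch of the self-replacing lazy-wrapper pattern described above.
# `heavy_decorator` stands in for torch._dynamo.disable; applying it eagerly
# at import time is what triggered the crash this PR avoids.

calls = {"decorated": 0}

def heavy_decorator(fn):
    # Stand-in for torch._dynamo.disable: imagine this has expensive or
    # fragile import-time side effects (e.g. cache-dir setup via
    # getpass.getuser()).
    calls["decorated"] += 1
    def wrapped(*args, **kwargs):
        return fn(*args, **kwargs)
    return wrapped

def _current_cuda_stream_impl():
    # Placeholder for the real CUDA stream retrieval logic.
    return "stream"

def current_cuda_stream():
    # First call: build the decorated function and rebind the module-level
    # name to it, so every later call goes straight to the wrapped function
    # with no extra indirection.
    global current_cuda_stream
    current_cuda_stream = heavy_decorator(_current_cuda_stream_impl)
    return current_cuda_stream()

assert calls["decorated"] == 0   # nothing happened at "import time"
assert current_cuda_stream() == "stream"
assert calls["decorated"] == 1   # decorator applied exactly once
current_cuda_stream()
assert calls["decorated"] == 1   # and never again
```

Because the module-level name is rebound on first use, the "zero overhead on subsequent calls" claim holds: later callers invoke the decorated function directly, with no flag check or extra indirection.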

🔍 Related Issues

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

Summary by CodeRabbit

  • Performance
    • Improved module import performance by deferring CUDA stream initialization until first use, reducing startup overhead.

Review Change Stack

… in CI

The @torch._dynamo.disable decorator on current_cuda_stream() triggered
torch._dynamo import at module load time, which initializes
torch._inductor's cache directory via getpass.getuser(). This fails in
CI containers running with -u $(id -u):$(id -g) when the UID has no
/etc/passwd entry (KeyError: 'getpwuid(): uid not found: 996').

Use a self-replacing lazy wrapper so torch._dynamo.disable is applied on
first call rather than at import time.

Co-authored-by: Cursor <cursoragent@cursor.com>
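The failure mode in the commit message can be illustrated with a small sketch. Note this defensive fallback is not what the PR implements (the PR avoids triggering the lookup at import time altogether); `safe_username` is a hypothetical helper shown only to demonstrate the behavior.

```python
# Sketch of the underlying failure mode. getpass.getuser() checks the
# LOGNAME/USER/LNAME/USERNAME environment variables, then falls back to
# pwd.getpwuid(os.getuid()); for a UID with no /etc/passwd entry that
# lookup raises (KeyError on older Pythons, OSError on 3.13+) -- the
# crash seen in the Spark CI containers.
import getpass
import os

def safe_username():
    # Hypothetical defensive helper (not part of this PR): fall back to
    # the numeric UID when the user database has no matching entry.
    try:
        return getpass.getuser()
    except (KeyError, OSError):
        return str(os.getuid())

print(safe_username())
```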
@coderabbitai Bot (Contributor) commented May 11, 2026

📝 Walkthrough

Optimizes module initialization by deferring torch._dynamo.disable wrapper creation until first call to current_cuda_stream(), reducing import-time overhead while preserving function behavior.

Changes

Lazy torch._dynamo Initialization

Deferred Wrapper Creation — flashinfer/cute_dsl/utils.py:
current_cuda_stream() is refactored to defer torch._dynamo.disable application. A private _current_cuda_stream_impl() holds the CUDA stream retrieval logic. The public current_cuda_stream() wraps itself with the decorator on first invocation, then calls the wrapped function, eliminating import-time torch._dynamo initialization.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

  • kaixih
  • aleozlx
  • yzh119
  • jimmyzho

Poem

🐰 Ah, what cunning deferment!
Load the module, light as air,
Wrap it once, when first you dare,
No torch._dynamo at import,
CUDA streams, lazily caught.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning — Docstring coverage is 66.67%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (4 passed)

  • Title check — The title clearly and specifically describes the main change: deferring torch._dynamo.disable to fix CI crashes in Spark unit tests.
  • Linked Issues check — Skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check — Skipped because no linked issues were found for this pull request.
  • Description check — The PR description comprehensively explains the problem and solution, including the root cause of the CI crash and the implementation approach.



@kahyunnam (Member, Author):

/bot run

@gemini-code-assist Bot (Contributor) left a comment:

Code Review

This pull request refactors the current_cuda_stream function in flashinfer/cute_dsl/utils.py to implement a lazy wrapper for the torch._dynamo.disable decorator. This change prevents torch._dynamo from being imported at module load time, which addresses potential failures in container environments running with unmapped UIDs. I have no feedback to provide as no review comments were submitted.

@flashinfer-bot (Collaborator):

GitLab MR !661 has been created, and the CI pipeline #50966858 is currently running. I'll report back once the pipeline job completes.

@nv-yunzheq (Collaborator) left a comment:

Approve, please merge it after CI comes clean

@kahyunnam added the v0.6.11 release blocker label and removed the v0.6.12 label on May 11, 2026.

Labels

  • arch: DGX Spark
  • v0.6.11 release blocker

Projects

None yet

3 participants