[Bug]test_bmm_fp8_reference_correctness fails with cos_sim=-0.0019 < 0.99 #3188

@cindyzxq

Found this issue in the nightly unit tests: https://gitlab-master.nvidia.com/dl/flashinfer/flashinfer-ci/-/pipelines/49533466
Failed configurations:
❌ GB200
Jobs: unit_test_gb200_pretyche: [cu129] | unit_test_gb200_pretyche: [cu130]

  • tests.trace.test_reference_correctness

❌ GB300
Jobs: unit_test_gb300_lyris: [cu129] | unit_test_gb300_lyris: [cu130]

  • tests.trace.test_reference_correctness

Summary

tests/trace/test_reference_correctness.py::test_bmm_fp8_reference_correctness, introduced in #2931 (commit 24c4aee), fails on SM100/SM103 (Blackwell) GPUs with:

AssertionError: cos_sim=-0.0019 < 0.99

Error log:

2026-04-27T10:43:50.983982Z 01O tests/trace/test_reference_correctness.py .....................s..s..s.. [ 43%]
2026-04-27T10:43:50.983982Z 01O ...........ss........................sF                                  [100%]
2026-04-27T10:43:50.983983Z 01O 
2026-04-27T10:43:50.983983Z 01O =================================== FAILURES ===================================
2026-04-27T10:43:50.983984Z 01O E   AssertionError: cos_sim=-0.0011 < 0.99
2026-04-27T10:43:50.983984Z 01O     assert -0.0010547435376793146 > 0.99
2026-04-27T10:43:50.983985Z 01O      +  where -0.0010547435376793146 = <built-in method item of Tensor object at 0xfffa45187570>()
2026-04-27T10:43:50.983985Z 01O      +    where <built-in method item of Tensor object at 0xfffa45187570> = tensor(-0.0011, device='cuda:0').item
2026-04-27T10:43:50.983986Z 01O /workspace/flashinfer/tests/trace/test_reference_correctness.py:57: AssertionError: cos_sim=-0.0011 < 0.99
2026-04-27T10:43:50.984096Z 01O - generated xml file: /tmp/junit/tests_trace_test_reference_correctness.py.1030663895.xml -
2026-04-27T10:43:50.984101Z 01O =========================== short test summary info ============================
2026-04-27T10:43:50.984102Z 01O FAILED tests/trace/test_reference_correctness.py::test_bmm_fp8_reference_correctness - AssertionError: cos_sim=-0.0011 < 0.99
2026-04-27T10:43:50.984103Z 01O assert -0.0010547435376793146 > 0.99
2026-04-27T10:43:50.984104Z 01O  +  where -0.0010547435376793146 = <built-in method item of Tensor object at 0xfffa45187570>()
2026-04-27T10:43:50.984105Z 01O  +    where <built-in method item of Tensor object at 0xfffa45187570> = tensor(-0.0011, device='cuda:0').item
2026-04-27T10:43:50.984106Z 01O ============= 1 failed, 62 passed, 6 skipped in 236.50s (0:03:56) ==============
2026-04-27T10:43:50.984106Z 01O ❌ FAILED: tests/trace/test_reference_correctness.py
2026-04-27T10:43:50.984107Z 01O 
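
For context on what the assertion measures: the test requires cos_sim > 0.99 between the kernel output and the reference output. The sketch below is not the helper from test_reference_correctness.py (that code is not shown here). `cosine_similarity` and the synthetic data are hypothetical stand-ins written in plain Python. It only illustrates that small rounding-style noise keeps the similarity close to 1, while a value near 0, as in the log above, is what two essentially unrelated outputs produce.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length sequences of floats (hypothetical helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


random.seed(0)

# Synthetic "reference" output; values are illustrative only.
reference = [random.gauss(0.0, 1.0) for _ in range(4096)]

# Case 1: the same values plus small noise, as low-precision rounding might add.
noisy = [x + random.gauss(0.0, 0.01) for x in reference]

# Case 2: values drawn independently of the reference.
unrelated = [random.gauss(0.0, 1.0) for _ in range(4096)]

sim_noisy = cosine_similarity(reference, noisy)
sim_unrelated = cosine_similarity(reference, unrelated)

print(f"noisy copy:       cos_sim={sim_noisy:.4f}")
print(f"unrelated output: cos_sim={sim_unrelated:.4f}")
```

Under these assumptions, the reported value looks less like accumulated FP8 precision loss and more like outputs that do not correspond to the reference at all. This is only a reading of the metric, not a diagnosis of the kernel.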