
fix(autoware_tensorrt_plugins): avoid tv::zeros and tv::empty #12378

Merged
veqcc merged 17 commits into autowarefoundation:main from veqcc:fix/drive-thor-compatibility
Apr 9, 2026


@veqcc
Contributor

@veqcc veqcc commented Mar 24, 2026

Description

This PR removes the calls to tv::zeros and tv::empty.
Both call cudaMalloc inside tensorview, which causes an engine build error on NVIDIA DRIVE AGX Thor.

  • In implicit_gemm_plugin.cpp, the tv::zeros call is simply moved to the constructor.
  • In get_indices_pairs_implicit_gemm_plugin.cpp, the change is a bit more involved:
    • Calculate the additional workspace size needed for the buffers that tv::zeros and tv::empty used to allocate.
    • Call tv::from_blob instead of them at runtime.
    • Pre-allocate the thrust workspace.
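
For readers unfamiliar with the pattern, here is a minimal sketch of the idea behind the second bullet: report one aligned total at build time, then carve non-owning sub-buffers out of the workspace at runtime instead of allocating. It is not the code in this PR. It uses host memory as a stand-in for the device workspace, and `kAlignment`, `align_up`, `total_workspace_size`, and `WorkspaceCarver` are hypothetical names introduced only for illustration. In the actual plugin, the carved pointers would presumably be wrapped with tv::from_blob, and zero-filling would be done on the CUDA stream rather than with memset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical alignment for each sub-buffer (assumption, not taken from the PR).
constexpr std::size_t kAlignment = 256;

// Round `size` up to the next multiple of `alignment`.
constexpr std::size_t align_up(std::size_t size, std::size_t alignment = kAlignment)
{
  return (size + alignment - 1) / alignment * alignment;
}

// Build-time side: the total a plugin would add to its reported workspace size
// so that every former tv::zeros / tv::empty buffer has room.
inline std::size_t total_workspace_size(const std::vector<std::size_t> & sizes)
{
  std::size_t total = 0;
  for (const std::size_t s : sizes) {
    total += align_up(s);
  }
  return total;
}

// Runtime side: a bump allocator over a caller-owned buffer.
// It only hands out pointers into the buffer and never allocates.
class WorkspaceCarver
{
public:
  WorkspaceCarver(void * base, std::size_t capacity)
  : base_(static_cast<std::uint8_t *>(base)), capacity_(capacity)
  {
  }

  // Returns a pointer to `bytes` bytes inside the workspace,
  // or nullptr if the remaining space is too small.
  void * take(std::size_t bytes, bool zero_fill = false)
  {
    const std::size_t aligned = align_up(bytes);
    if (aligned > capacity_ - offset_) {
      return nullptr;
    }
    void * ptr = base_ + offset_;
    offset_ += aligned;
    if (zero_fill) {
      // Host stand-in; device memory would need an asynchronous memset on the stream.
      std::memset(ptr, 0, bytes);
    }
    return ptr;
  }

  std::size_t used() const { return offset_; }

private:
  std::uint8_t * base_;
  std::size_t capacity_;
  std::size_t offset_{0};
};
```

The same list of sizes has to drive both the build-time total and the runtime carving order. If the two drift apart, `take` returns nullptr rather than silently overrunning the workspace.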

Related links

Parent Issue:

  • Link

How was this PR tested?

I confirmed that compilation, engine build, and execution all work on both an x86-64 machine and DRIVE Thor.
According to the TIER IV internal evaluator, there is no regression in inference precision.

Notes for reviewers

None.

Interface changes

None.

Effects on system behavior

None.

Signed-off-by: Ryuta Kambe <ryuta.kambe@tier4.jp>
@github-actions github-actions bot added the component:perception Advanced sensor data processing and environment understanding. (auto-assigned) label Mar 24, 2026

pre-commit-ci-lite bot and others added 6 commits March 24, 2026 07:42
Signed-off-by: Ryuta Kambe <ryuta.kambe@tier4.jp>
Signed-off-by: Ryuta Kambe <ryuta.kambe@tier4.jp>
Signed-off-by: Ryuta Kambe <ryuta.kambe@tier4.jp>
@veqcc veqcc self-assigned this Mar 24, 2026
veqcc and others added 2 commits March 24, 2026 18:37
Signed-off-by: Ryuta Kambe <ryuta.kambe@tier4.jp>
@veqcc veqcc marked this pull request as ready for review March 24, 2026 09:41
@veqcc veqcc added the run:build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) label Mar 24, 2026
@codecov

codecov bot commented Mar 24, 2026

Codecov Report

❌ Patch coverage is 0% with 34 lines in your changes missing coverage. Please review.
✅ Project coverage is 0.00%. Comparing base (f983578) to head (a08a58d).
⚠️ Report is 1 commits behind head on main.

Files with missing lines                               Patch %   Lines
...ins/src/get_indices_pairs_implicit_gemm_plugin.cpp  0.00%     31 Missing ⚠️
...ware_tensorrt_plugins/src/implicit_gemm_plugin.cpp  0.00%     3 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (f983578) and HEAD (a08a58d).

HEAD has 1 upload less than BASE
Flag    BASE (f983578)   HEAD (a08a58d)
daily   1                0
Additional details and impacted files
@@             Coverage Diff             @@
##             main   #12378       +/-   ##
===========================================
- Coverage   18.73%    0.00%   -18.74%     
===========================================
  Files        1904       96     -1808     
  Lines      129969     3497   -126472     
  Branches    43951        0    -43951     
===========================================
- Hits        24355        0    -24355     
+ Misses      85621     3497    -82124     
+ Partials    19993        0    -19993     
Flag Coverage Δ
daily ?
full-suite 0.00% <0.00%> (-18.74%) ⬇️


veqcc and others added 6 commits April 3, 2026 17:51
Signed-off-by: Ryuta Kambe <ryuta.kambe@tier4.jp>
Signed-off-by: Ryuta Kambe <ryuta.kambe@tier4.jp>
Signed-off-by: Ryuta Kambe <ryuta.kambe@tier4.jp>
Signed-off-by: Ryuta Kambe <ryuta.kambe@tier4.jp>
@veqcc
Contributor Author

veqcc commented Apr 3, 2026

@amadeuszsz
Thank you for your reviews!! I have addressed all your comments 👍

Contributor

@amadeuszsz amadeuszsz left a comment


LGTM!

@amadeuszsz
Contributor

@veqcc
Before merging, could you please make sure that BEVFusion works after latest changes? Checking x86 (desktop) is enough.

@veqcc veqcc added run:build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) and removed run:build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) labels Apr 8, 2026
@veqcc
Contributor Author

veqcc commented Apr 8, 2026

@amadeuszsz

Before merging, could you please make sure that BEVFusion works after latest changes? Checking x86 (desktop) is enough.

I have tested the following on an x86-64 machine:

  • Source build succeeded:
    colcon build --symlink-install --cmake-args -DCMAKE_BUILD_TYPE=Release --packages-up-to autoware_bevfusion
  • Engine build succeeded:
    ros2 launch autoware_bevfusion bevfusion.launch.xml build_only:=true
  • Launch succeeded:
    with ros2 launch autoware_bevfusion bevfusion.launch.xml and ros2 bag play some_rosbag running, /perception/detection/objects is published

@veqcc
Contributor Author

veqcc commented Apr 8, 2026

@amadeuszsz
I have also confirmed with the TIER IV internal evaluator that this PR introduces no regression!

Contributor

@KSeangTan KSeangTan left a comment


LGTM overall, please address comments accordingly

@veqcc veqcc merged commit a2f5077 into autowarefoundation:main Apr 9, 2026
39 of 40 checks passed
@github-project-automation github-project-automation bot moved this from To Triage to Done in Software Working Group Apr 9, 2026
@veqcc veqcc deleted the fix/drive-thor-compatibility branch April 9, 2026 01:50

Labels

component:perception Advanced sensor data processing and environment understanding. (auto-assigned) run:build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci)

Projects

Status: Done


3 participants