[Bug] _scaled_dot_product_attention__tensorrt error when exporting DETR model to onnx-trt

### Checklist

- [x] 1. I have searched related issues but cannot get the expected help.
- [x] 2. I have read the [FAQ documentation](https://github.com/open-mmlab/mmdeploy/tree/main/docs/en/faq.md) but cannot get the expected help.
- [x] 3. The bug has not been fixed in the latest version.

### Describe the bug

Exporting the DETR model to ONNX and TensorRT with the provided checkpoint fails with a `TypeError` raised from `_scaled_dot_product_attention__tensorrt`.

The Faster R-CNN export example runs without problems; only the transformer model fails.
Any guidance on fixing this issue would be appreciated. Thanks!

### Reproduction

python mmdeploy/tools/deploy.py mmdeploy/configs/mmdet/detection/detection_tensorrt_dynamic-320x320-1344x1344.py ../mmdetection/configs/detr/detr_r50_8xb2-150e_coco.py checkpoints/detr_r50_8xb2-150e_coco_20221023_153551-436d03e8.pth ../mmdetection/demo/demo.jpg --work-dir mmdeploy_model/detr --device cuda

No modifications were made; the checkpoint and config were downloaded from https://github.com/open-mmlab/mmdetection/tree/3.x/configs/detr.

### Environment

```Shell
08/09 16:55:32 - mmengine - INFO - **********Environmental information**********
08/09 16:55:33 - mmengine - INFO - sys.platform: linux
08/09 16:55:33 - mmengine - INFO - Python: 3.8.10 (default, May 26 2023, 14:05:08) [GCC 9.4.0]
08/09 16:55:33 - mmengine - INFO - CUDA available: True
08/09 16:55:33 - mmengine - INFO - numpy_random_seed: 2147483648
08/09 16:55:33 - mmengine - INFO - GPU 0: NVIDIA GeForce GTX 1080 Ti
08/09 16:55:33 - mmengine - INFO - CUDA_HOME: /usr/local/cuda
08/09 16:55:33 - mmengine - INFO - NVCC: Cuda compilation tools, release 12.0, V12.0.140
08/09 16:55:33 - mmengine - INFO - GCC: x86_64-linux-gnu-gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
08/09 16:55:33 - mmengine - INFO - PyTorch: 1.14.0a0+44dac51
08/09 16:55:33 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 9.4
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2021.1-Product Build 20201104 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.7.0 (Git Hash N/A)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: NO AVX
  - CUDA Runtime 12.0
  - NVCC architecture flags: -gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_90,code=compute_90
  - CuDNN 8.7  (built against CUDA 11.8)
  - Magma 2.6.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.0, CUDNN_VERSION=8.7.0, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS=-fno-gnu-unique -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=1.14.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 

08/09 16:55:33 - mmengine - INFO - TorchVision: 0.15.0a0
08/09 16:55:33 - mmengine - INFO - OpenCV: 4.5.5
08/09 16:55:33 - mmengine - INFO - MMEngine: 0.8.4
08/09 16:55:33 - mmengine - INFO - MMCV: 2.0.0
08/09 16:55:33 - mmengine - INFO - MMCV Compiler: GCC 9.4
08/09 16:55:33 - mmengine - INFO - MMCV CUDA Compiler: 12.0
08/09 16:55:33 - mmengine - INFO - MMDeploy: 1.2.0+
08/09 16:55:33 - mmengine - INFO - 

08/09 16:55:33 - mmengine - INFO - **********Backend information**********
08/09 16:55:33 - mmengine - INFO - tensorrt:	8.5.3.1
08/09 16:55:33 - mmengine - INFO - tensorrt custom ops:	Available
08/09 16:55:33 - mmengine - INFO - ONNXRuntime:	1.14.1
08/09 16:55:33 - mmengine - INFO - ONNXRuntime-gpu:	None
08/09 16:55:33 - mmengine - INFO - ONNXRuntime custom ops:	Available
08/09 16:55:33 - mmengine - INFO - pplnn:	None
08/09 16:55:33 - mmengine - INFO - ncnn:	None
08/09 16:55:33 - mmengine - INFO - snpe:	None
08/09 16:55:33 - mmengine - INFO - openvino:	None
08/09 16:55:33 - mmengine - INFO - torchscript:	1.14.0a0+44dac51
08/09 16:55:33 - mmengine - INFO - torchscript custom ops:	NotAvailable
08/09 16:55:33 - mmengine - INFO - rknn-toolkit:	None
08/09 16:55:33 - mmengine - INFO - rknn-toolkit2:	None
08/09 16:55:33 - mmengine - INFO - ascend:	None
08/09 16:55:33 - mmengine - INFO - coreml:	None
08/09 16:55:33 - mmengine - INFO - tvm:	None
08/09 16:55:33 - mmengine - INFO - vacc:	None
08/09 16:55:33 - mmengine - INFO - 

08/09 16:55:33 - mmengine - INFO - **********Codebase information**********
08/09 16:55:33 - mmengine - INFO - mmdet:	3.0.0
08/09 16:55:33 - mmengine - INFO - mmseg:	None
08/09 16:55:33 - mmengine - INFO - mmpretrain:	None
08/09 16:55:33 - mmengine - INFO - mmocr:	None
08/09 16:55:33 - mmengine - INFO - mmagic:	None
08/09 16:55:33 - mmengine - INFO - mmdet3d:	1.1.1
08/09 16:55:33 - mmengine - INFO - mmpose:	None
08/09 16:55:33 - mmengine - INFO - mmrotate:	None
08/09 16:55:33 - mmengine - INFO - mmaction:	None
08/09 16:55:33 - mmengine - INFO - mmrazor:	None
08/09 16:55:33 - mmengine - INFO - mmyolo:	None
```


### Error traceback

```Shell
08/09 16:58:19 - mmengine - INFO - Start pipeline mmdeploy.apis.pytorch2onnx.torch2onnx in subprocess
08/09 16:58:20 - mmengine - WARNING - Failed to search registry with scope "mmdet" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmdet" is a correct scope, or whether the registry is initialized.
08/09 16:58:20 - mmengine - WARNING - Failed to search registry with scope "mmdet" in the "mmdet_tasks" registry tree. As a workaround, the current "mmdet_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmdet" is a correct scope, or whether the registry is initialized.
Loads checkpoint by local backend from path: checkpoints/detr_r50_8xb2-150e_coco_20221023_153551-436d03e8.pth
The model and loaded state dict do not match exactly

unexpected key in source state_dict: data_preprocessor.mean, data_preprocessor.std

08/09 16:58:21 - mmengine - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future. 
08/09 16:58:21 - mmengine - INFO - Export PyTorch model to ONNX: mmdeploy_model/detr/end2end.onnx.
08/09 16:58:22 - mmengine - WARNING - Can not find mmdet.models.utils.transformer.PatchMerging.forward, function rewrite will not be applied
========== Diagnostic Run torch.onnx.export version 1.14.0a0+44dac51 ===========
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================

<frozen importlib._bootstrap>:219: RuntimeWarning: scipy._lib.messagestream.MessageStream size changed, may indicate binary incompatibility. Expected 56 from C header, got 64 from PyObject
/usr/local/lib/python3.8/dist-packages/mmdeploy/core/optimizers/function_marker.py:160: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  ys_shape = tuple(int(s) for s in ys.shape)
/mmdetection/mmdet/models/detectors/detr.py:91: TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
  img_h, img_w = img_shape_list[img_id]
Process Process-2:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.8/dist-packages/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/mmdeploy/apis/pytorch2onnx.py", line 98, in torch2onnx
    export(
  File "/usr/local/lib/python3.8/dist-packages/mmdeploy/apis/core/pipeline_manager.py", line 356, in _wrap
    return self.call_function(func_name_, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/mmdeploy/apis/core/pipeline_manager.py", line 326, in call_function
    return self.call_function_local(func_name, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/mmdeploy/apis/core/pipeline_manager.py", line 275, in call_function_local
    return pipe_caller(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/mmdeploy/apis/onnx/export.py", line 131, in export
    torch.onnx.export(
  File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 506, in export
    _export(
  File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 1533, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "/usr/local/lib/python3.8/dist-packages/mmdeploy/apis/onnx/optimizer.py", line 27, in model_to_graph__custom_optimizer
    graph, params_dict, torch_out = ctx.origin_func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 1113, in _model_to_graph
    graph, params, torch_out, module = _create_jit_graph(model, args)
  File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 989, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
  File "/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py", line 893, in _trace_and_get_graph_from_model
    trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
  File "/usr/local/lib/python3.8/dist-packages/torch/jit/_trace.py", line 1260, in _get_trace_graph
    outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1480, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/jit/_trace.py", line 127, in forward
    graph, out = torch._C._create_graph_by_tracing(
  File "/usr/local/lib/python3.8/dist-packages/torch/jit/_trace.py", line 118, in wrapper
    outs.append(self.inner(*trace_inputs))
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1480, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1467, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/mmdeploy/apis/onnx/export.py", line 123, in wrapper
    return forward(*arg, **kwargs)
  File "/mmdetection/mmdet/models/detectors/base.py", line 94, in forward
    return self.predict(inputs, data_samples)
  File "/usr/local/lib/python3.8/dist-packages/mmdeploy/codebase/mmdet/models/detectors/base_detr.py", line 89, in detection_transformer__predict
    return __predict_impl(self, batch_inputs, data_samples, rescale)
  File "/usr/local/lib/python3.8/dist-packages/mmdeploy/core/optimizers/function_marker.py", line 266, in g
    rets = f(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/mmdeploy/codebase/mmdet/models/detectors/base_detr.py", line 22, in __predict_impl
    head_inputs_dict = self.forward_transformer(img_feats, data_samples)
  File "/mmdetection/mmdet/models/detectors/base_detr.py", line 218, in forward_transformer
    encoder_outputs_dict = self.forward_encoder(**encoder_inputs_dict)
  File "/mmdetection/mmdet/models/detectors/detr.py", line 135, in forward_encoder
    memory = self.encoder(
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1480, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1467, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/mmdetection/mmdet/models/layers/transformer/detr_layers.py", line 60, in forward
    query = layer(query, query_pos, key_padding_mask, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1480, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1467, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/mmdetection/mmdet/models/layers/transformer/detr_layers.py", line 206, in forward
    query = self.self_attn(
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1480, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1467, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/mmengine/mmengine/utils/misc.py", line 395, in new_func
    output = old_func(*args, **kwargs)
  File "/mmcv/mmcv/cnn/bricks/transformer.py", line 542, in forward
    out = self.attn(
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1480, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1467, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/activation.py", line 1164, in forward
    attn_output, attn_output_weights = F.multi_head_attention_forward(
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py", line 5186, in multi_head_attention_forward
    attn_output, attn_output_weights = _scaled_dot_product_attention(
TypeError: _scaled_dot_product_attention__tensorrt() takes from 3 to 5 positional arguments but 7 were given
08/09 16:58:23 - mmengine - ERROR - /usr/local/lib/python3.8/dist-packages/mmdeploy/apis/core/pipeline_manager.py - pop_mp_output - 80 - `mmdeploy.apis.pytorch2onnx.torch2onnx` with Call id: 0 failed. exit.
```
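
The `TypeError` on the last traceback line suggests that the function substituted for `_scaled_dot_product_attention` accepts at most five positional arguments, while this PyTorch build (1.14.0a0+44dac51) calls it with seven. The sketch below reproduces that kind of mismatch with plain Python stand-ins. `narrow_rewrite`, `tolerant_rewrite`, `max_positional`, and the placeholder arguments are hypothetical names for illustration only; they are not MMDeploy or PyTorch code, and whether dropping the extra arguments is safe for the real rewrite is an assumption that would need checking.

```python
import inspect


def narrow_rewrite(q, k, v, attn_mask=None, dropout_p=0.0, **kwargs):
    """Hypothetical stand-in that accepts 3 to 5 positional arguments."""
    return "narrow", (q, k, v, attn_mask, dropout_p)


def tolerant_rewrite(q, k, v, attn_mask=None, dropout_p=0.0, *extra, **kwargs):
    """Hypothetical variant that also collects any extra positional arguments."""
    return "tolerant", (q, k, v, attn_mask, dropout_p), extra


def max_positional(func):
    """Return the largest positional-argument count func accepts, or None if unbounded."""
    params = list(inspect.signature(func).parameters.values())
    if any(p.kind is inspect.Parameter.VAR_POSITIONAL for p in params):
        return None
    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    return sum(p.kind in positional_kinds for p in params)


# Seven placeholder positional arguments, mirroring the count in the traceback.
call_args = ("q", "k", "v", None, 0.0, False, False)

try:
    narrow_rewrite(*call_args)
    narrow_error = None
except TypeError as exc:
    # Same class of failure as in the traceback: too many positional arguments.
    narrow_error = str(exc)

tolerant_result = tolerant_rewrite(*call_args)

print(narrow_error)
print(max_positional(narrow_rewrite), max_positional(tolerant_rewrite))
```

Running a check like `max_positional` against both the installed `torch.nn.functional._scaled_dot_product_attention` and the replacement function could confirm whether their signatures have diverged in this environment.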

