❓ Question
I'm trying to export a torch model that processes large inputs (e.g., 8192x2048). I have noticed that torch_tensorrt.compile fails for inputs larger than 4096x2048 (I haven't tried every size, only powers of two). Specifically, the conversion fails on a fused convolution + ReLU node with "No valid tactics" and "illegal memory access" errors:
2024-11-29 16:56:42,307 - torch_tensorrt [TensorRT Conversion Context] - ERROR - [scopedCudaResources.cpp::~ScopedCudaStream::55] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
2024-11-29 16:56:42,311 - torch_tensorrt [TensorRT Conversion Context] - ERROR - IBuilder::buildSerializedNetwork: Error Code 10: Internal Error (Could not find any implementation for node [CONVOLUTION]-[aten_ops.convolution.default]-[teacher.3/convolution_5] + [RELU]-[aten_ops.relu.default]-[teacher.4/relu_4].)
2024-11-29 16:56:42,312 - [MODEL EXPORT] - ERROR - TensorRT export failed:
Traceback (most recent call last):
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/tools/launchers.py", line 398, in <module>
    export(
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/tools/launchers.py", line 298, in export
    trt_model = torch_tensorrt.compile(model, **compile_spec)
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/_compile.py", line 269, in compile
    trt_graph_module = dynamo_compile(
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/dynamo/_compiler.py", line 288, in compile
    trt_gm = compile_module(
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/dynamo/_compiler.py", line 464, in compile_module
    trt_module = convert_module(
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/_conversion.py", line 142, in convert_module
    interpreter_result = interpret_module_to_result(
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/_conversion.py", line 121, in interpret_module_to_result
    interpreter_result = interpreter.run()
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py", line 635, in run
    assert serialized_engine
AssertionError
Attached are the script and the full output log: issue.zip
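For quick reference, here is a minimal sketch of the call pattern that fails for me. The model and compile_spec below are simplified stand-ins for illustration; the real network and the exact spec are in the attached script.

```python
import torch
import torch_tensorrt

# Simplified stand-in for the real network (the actual model is in the
# attached script); a single Conv2d + ReLU block shows the call pattern.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 64, kernel_size=3, padding=1),
    torch.nn.ReLU(),
).eval().cuda()

# 4096x2048 builds fine for me; 8192x2048 fails during engine serialization.
example_input = torch.randn(1, 3, 8192, 2048, device="cuda")

compile_spec = {
    "inputs": [torch_tensorrt.Input(shape=example_input.shape, dtype=torch.float32)],
    "enabled_precisions": {torch.float32},
}

trt_model = torch_tensorrt.compile(model, **compile_spec)
```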
Environment
- PyTorch Version (e.g., 1.0): 2.5.1+cu121
- Torch-TensorRT Version: 2.5.0
- CPU Architecture: AMD EPYC 7543 32-Core Processor
- OS (e.g., Linux): Ubuntu 22.04.5 LTS
- How you installed PyTorch (`conda`, `pip`, `libtorch`, source): pip
- Python version: 3.10.12
- CUDA version: Cuda compilation tools, release 12.1, V12.1.66 Build cuda_12.1.r12.1/compiler.32415258_0
- GPU models and configuration: NVIDIA A100-SXM4-80GB, on SLURM with MIG enabled.
Is there a limit on the input size when converting with torch_tensorrt? Is there any way to work around this?
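In case it is relevant, one variation I could try on my side is giving the builder more workspace. This is only a guess: workspace_size is an existing torch_tensorrt.compile argument, but I have not verified that it affects this failure, and I'm unsure how it interacts with a MIG slice.

```python
# Hypothetical variation: same compile_spec as in the sketch above, but with an
# explicit builder workspace limit. Whether this changes anything for the
# "No valid tactics" / illegal-memory-access failure is only my assumption.
trt_model = torch_tensorrt.compile(
    model,
    workspace_size=8 << 30,  # 8 GiB; the MIG slice may expose less than this
    **compile_spec,
)
```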
Thanks.