
❓ [Question] TensorRT Export Failure with Large Input Sizes #3307

Open
@AndreaBrg

❓ Question

I'm trying to export a torch model that processes large inputs (e.g., 8192x2048). I have noticed that torch_tensorrt.compile fails for inputs larger than 4096x2048 (I haven't tried every size, only powers of two). Specifically, the conversion fails for convolution and ReLU operations with "No valid tactics" and "illegal memory access" errors:

2024-11-29 16:56:42,307 - torch_tensorrt [TensorRT Conversion Context] - ERROR - [scopedCudaResources.cpp::~ScopedCudaStream::55] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
2024-11-29 16:56:42,311 - torch_tensorrt [TensorRT Conversion Context] - ERROR - IBuilder::buildSerializedNetwork: Error Code 10: Internal Error (Could not find any implementation for node [CONVOLUTION]-[aten_ops.convolution.default]-[teacher.3/convolution_5] + [RELU]-[aten_ops.relu.default]-[teacher.4/relu_4].)
2024-11-29 16:56:42,312 - [MODEL EXPORT] - ERROR - TensorRT export failed: 
Traceback (most recent call last):
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/tools/launchers.py", line 398, in <module>
    export(
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/tools/launchers.py", line 298, in export
    trt_model = torch_tensorrt.compile(model, **compile_spec)
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/_compile.py", line 269, in compile
    trt_graph_module = dynamo_compile(
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/dynamo/_compiler.py", line 288, in compile
    trt_gm = compile_module(
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/dynamo/_compiler.py", line 464, in compile_module
    trt_module = convert_module(
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/_conversion.py", line 142, in convert_module
    interpreter_result = interpret_module_to_result(
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/_conversion.py", line 121, in interpret_module_to_result
    interpreter_result = interpreter.run()
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py", line 635, in run
    assert serialized_engine
AssertionError

Attached are the script and the full output log: issue.zip
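For what it's worth, a rough back-of-the-envelope calculation illustrates how quickly activation memory grows at this resolution (the channel count below is hypothetical, not taken from my model; the actual layer widths are in the attached script). This may matter on a MIG slice, which only gets a fraction of the A100's 80 GB:

```python
# Rough activation-memory estimate for a single FP32 feature map
# at the reported input resolution (8192x2048).
height, width = 8192, 2048
channels = 64          # hypothetical early-conv width, for illustration only
bytes_per_elem = 4     # float32

activation_bytes = channels * height * width * bytes_per_elem
print(f"one {channels}-channel FP32 feature map: "
      f"{activation_bytes / 1024**3:.1f} GiB")  # -> 4.0 GiB
```

So a single 64-channel feature map at this size is already 4 GiB, before counting weights, workspace, or any intermediate buffers TensorRT allocates while timing tactics.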

Environment

  • PyTorch Version: 2.5.1+cu121
  • Torch-TensorRT Version: 2.5.0
  • CPU Architecture: AMD EPYC 7543 32-Core Processor
  • OS: Ubuntu 22.04.5 LTS
  • How you installed PyTorch: pip
  • Python version: 3.10.12
  • CUDA version: 12.1 (Cuda compilation tools, V12.1.66)
  • GPU models and configuration: NVIDIA A100-SXM4-80GB, on SLURM with MIG enabled

Is there any limit on the input size when converting with torch_tensorrt? Is there a solution or workaround for this problem?

Thanks.
