❓ Question
I'm trying to export a torch model that processes large inputs (e.g., 8192x2048). I have noticed that torch_tensorrt.compile fails for inputs larger than 4096x2048 (I haven't tried every size, only powers of two). Specifically, the conversion fails on a fused convolution + ReLU node with "No valid tactics" and "illegal memory access" errors:
2024-11-29 16:56:42,307 - torch_tensorrt [TensorRT Conversion Context] - ERROR - [scopedCudaResources.cpp::~ScopedCudaStream::55] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
2024-11-29 16:56:42,311 - torch_tensorrt [TensorRT Conversion Context] - ERROR - IBuilder::buildSerializedNetwork: Error Code 10: Internal Error (Could not find any implementation for node [CONVOLUTION]-[aten_ops.convolution.default]-[teacher.3/convolution_5] + [RELU]-[aten_ops.relu.default]-[teacher.4/relu_4].)
2024-11-29 16:56:42,312 - [MODEL EXPORT] - ERROR - TensorRT export failed:
Traceback (most recent call last):
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/tools/launchers.py", line 398, in <module>
    export(
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/tools/launchers.py", line 298, in export
    trt_model = torch_tensorrt.compile(model, **compile_spec)
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/_compile.py", line 269, in compile
    trt_graph_module = dynamo_compile(
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/dynamo/_compiler.py", line 288, in compile
    trt_gm = compile_module(
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/dynamo/_compiler.py", line 464, in compile_module
    trt_module = convert_module(
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/_conversion.py", line 142, in convert_module
    interpreter_result = interpret_module_to_result(
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/_conversion.py", line 121, in interpret_module_to_result
    interpreter_result = interpreter.run()
  File "/nfs/home/bragagnolo/qinstinct-fabric-inspection/.venv/lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py", line 635, in run
    assert serialized_engine
AssertionError
Attached are the script and the full output log: issue.zip
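For quick reference, here is a minimal sketch of the call pattern that fails for me. The model and compile_spec below are simplified stand-ins for illustration; the real network and the exact spec are in the attached script.

```python
import torch
import torch_tensorrt

# Simplified stand-in for the real network (the actual model is in the
# attached script); a single Conv2d + ReLU block shows the call pattern.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 64, kernel_size=3, padding=1),
    torch.nn.ReLU(),
).eval().cuda()

# 4096x2048 builds fine for me; 8192x2048 fails during engine serialization.
example_input = torch.randn(1, 3, 8192, 2048, device="cuda")

compile_spec = {
    "inputs": [torch_tensorrt.Input(shape=example_input.shape, dtype=torch.float32)],
    "enabled_precisions": {torch.float32},
}

trt_model = torch_tensorrt.compile(model, **compile_spec)
```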
Environment
- PyTorch Version (e.g., 1.0): 2.5.1+cu121
- Torch-TensorRT Version: 2.5.0
- CPU Architecture: AMD EPYC 7543 32-Core Processor
- OS (e.g., Linux): Ubuntu 22.04.5 LTS
- How you installed PyTorch (`conda`, `pip`, `libtorch`, source): pip
- Python version: 3.10.12
- CUDA version: Cuda compilation tools, release 12.1, V12.1.66 Build cuda_12.1.r12.1/compiler.32415258_0
- GPU models and configuration: NVIDIA A100-SXM4-80GB, on SLURM with MIG enabled.
Is there a limit on the input size when converting with torch_tensorrt? Is there any way to work around this?
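In case it is relevant, one variation I could try on my side is giving the builder more workspace. This is only a guess: workspace_size is an existing torch_tensorrt.compile argument, but I have not verified that it affects this failure, and I'm unsure how it interacts with a MIG slice.

```python
# Hypothetical variation: same compile_spec as in the sketch above, but with an
# explicit builder workspace limit. Whether this changes anything for the
# "No valid tactics" / illegal-memory-access failure is only my assumption.
trt_model = torch_tensorrt.compile(
    model,
    workspace_size=8 << 30,  # 8 GiB; the MIG slice may expose less than this
    **compile_spec,
)
```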
Thanks.