(installation-guide)=
# Installation

There are multiple ways to install and run TensorRT LLM. The options below are ordered from simplest to most involved. Before installing, check the Supported Hardware page to ensure your GPU is compatible.

This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.

(containers)=
## Containers

Pre-built TensorRT LLM releases are available as container images on NGC. This is the simplest way to obtain TensorRT LLM.
Replace `x.y.z` with the desired version tag. Browse the available tags on NGC to find the latest release.

```bash
docker pull nvcr.io/nvidia/tensorrt-llm/release:x.y.z
docker run --rm -it --ipc host --gpus all --ulimit memlock=-1 --ulimit stack=67108864 \
  -p 8000:8000 nvcr.io/nvidia/tensorrt-llm/release:x.y.z
```

{{container_tag_admonition}}
Sanity check the installation by running the following inside the container:

```bash
python3 -c "import tensorrt_llm"
```

(linux)=
## Installing on Linux
Tested on Ubuntu 24.04.
Before the pre-built Python wheel can be installed via pip, a few
prerequisites must be put into place:
Install CUDA Toolkit 13.1 following the CUDA Installation Guide for Linux,
and make sure the `CUDA_HOME` environment variable is properly set.
The `cuda-compat-13-1` package may be required, depending on your system's NVIDIA GPU
driver version. For additional information, refer to the CUDA Forward Compatibility documentation.
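For a default toolkit installation, the environment can be set up as follows. This is a sketch that assumes the installer's standard `/usr/local/cuda` symlink; adjust the path if your toolkit lives elsewhere:

```bash
# Point CUDA_HOME at the toolkit root and expose its binaries and libraries.
# /usr/local/cuda is the default symlink created by the CUDA installer.
export CUDA_HOME=/usr/local/cuda
export PATH="${CUDA_HOME}/bin:${PATH}"
export LD_LIBRARY_PATH="${CUDA_HOME}/lib64:${LD_LIBRARY_PATH:-}"
```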
```bash
# By default, the PyTorch CUDA 12.8 package is installed. Install the PyTorch CUDA 13.0
# package to align with the CUDA version used for building TensorRT LLM wheels.
pip3 install torch==2.10.0 torchvision --index-url https://download.pytorch.org/whl/cu130
```
```bash
sudo apt-get -y install libopenmpi-dev

# Optional step: only required for disagg-serving
sudo apt-get -y install libzmq3-dev
```

Instead of manually installing the prerequisites as described
above, it is also possible to use the pre-built [TensorRT LLM Develop container
image hosted on NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tensorrt-llm/containers/devel)
(see [here](containers) for information on container tags).
Once all prerequisites are in place, TensorRT LLM can be installed as follows:
```bash
pip3 install --ignore-installed pip setuptools wheel && pip3 install tensorrt_llm
```

Note: The TensorRT LLM wheel on PyPI is built with PyTorch 2.10.0. This version may be incompatible with the NVIDIA NGC PyTorch 25.12 container, which uses a more recent PyTorch build from the main branch. If you are using this container or a similar environment, install the pre-built wheel located at `/app/tensorrt_llm` inside the TensorRT LLM NGC Release container instead.
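Before choosing between the PyPI wheel and the in-container wheel, it helps to know which PyTorch build is already present in your environment. The snippet below is a quick diagnostic sketch; `python3 -m pip` is used to avoid ambiguity about which `pip` is on the PATH:

```bash
# Report the installed torch version, if any. A "+cu130" local suffix
# indicates a CUDA 13.0 build; a bare version is typically the default
# CUDA 12.8 build from PyPI.
torch_build=$(python3 -m pip show torch 2>/dev/null | awk '/^Version:/ {print $2}')
echo "torch: ${torch_build:-not installed}"
```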
There are some known limitations when you `pip install` the pre-built TensorRT LLM wheel package.

- MPI in the Slurm environment

  If you encounter an error like the following while running TensorRT LLM in a Slurm-managed cluster, you need to reconfigure the MPI installation to work with Slurm. The setup method depends on your Slurm configuration; check with your admin. This is not TensorRT LLM specific, but rather a general MPI + Slurm issue.

  ```text
  The application appears to have been direct launched using "srun",
  but OMPI was not built with SLURM support. This usually happens
  when OMPI was not configured --with-slurm and we weren't able
  to discover a SLURM installation in the usual places.
  ```
- Prevent `pip` from replacing an existing PyTorch installation

  On certain systems, particularly Ubuntu 22.04, users installing TensorRT LLM found that their existing CUDA 13.0-compatible PyTorch installation (e.g., `torch==2.9.0+cu130`) was uninstalled by `pip` and replaced by a CUDA 12.8 build (`torch==2.9.0`), leaving the TensorRT LLM installation unusable and causing runtime errors.

  The solution is to create a `pip` constraints file that locks `torch` to the currently installed version. Here is an example of how this can be done manually:

  ```bash
  CURRENT_TORCH_VERSION=$(python3 -c "import torch; print(torch.__version__)")
  echo "torch==$CURRENT_TORCH_VERSION" > /tmp/torch-constraint.txt
  pip3 install --ignore-installed pip setuptools wheel && pip3 install tensorrt_llm -c /tmp/torch-constraint.txt
  ```
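After installing with a constraint file, you can confirm that the pin actually held. The helper below is a sketch (the `pin_held` name is illustrative, not part of TensorRT LLM); it compares an installed distribution's version against a `name==version` pin:

```bash
# Check that an installed Python distribution matches a "name==version" pin.
# Usage: pin_held "torch==2.9.0+cu130"
pin_held() {
    name=${1%%==*}
    pinned=${1#*==}
    installed=$(python3 -m pip show "$name" 2>/dev/null | sed -n 's/^Version: //p')
    [ "$installed" = "$pinned" ]
}

# Example: verify the pin written to the constraint file earlier.
# pin_held "$(cat /tmp/torch-constraint.txt)" && echo "torch pin held"
```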
For developers who wish to modify, customize, or contribute to TensorRT LLM, see Build from Source.