Triton Inference Server on ROCm installation

2026-03-31

Applies to Linux

System requirements#

To use Triton Inference Server 25.12, you need the following prerequisites:

ROCm version: 7.2.0
Operating system: Ubuntu 24.04
GPU platform: AMD Instinct™ MI300X, MI355X
Python: 3.12
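
As a rough way to compare a host against the list above, the following sketch reads the operating system version from an os-release file and checks for the GPU device nodes that the container needs. The helper names (`os_release_field`, `is_supported_os`, `has_rocm_devices`) are hypothetical and not part of ROCm or Triton. Python 3.12 ships inside the container image, so it is not checked on the host here.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of ROCm or Triton.

# Print the value of KEY from an os-release style file (default: /etc/os-release).
os_release_field() {
  key="$1"
  file="${2:-/etc/os-release}"
  # Lines look like KEY="value"; drop the key and any surrounding quotes.
  sed -n "s/^${key}=//p" "$file" | tr -d '"'
}

# Succeed only if the file describes Ubuntu 24.04.
is_supported_os() {
  file="${1:-/etc/os-release}"
  [ "$(os_release_field ID "$file")" = "ubuntu" ] &&
    [ "$(os_release_field VERSION_ID "$file")" = "24.04" ]
}

# Succeed only if the device nodes later passed to docker run are present.
has_rocm_devices() {
  [ -e /dev/kfd ] && [ -d /dev/dri ]
}

# Example usage on the target host:
#   is_supported_os && echo "OS ok" || echo "OS is not Ubuntu 24.04"
#   has_rocm_devices && echo "GPU device nodes present" || echo "GPU device nodes missing"
```

This checks only the operating system and device nodes. It does not verify the ROCm version or the GPU model.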

Install Triton Inference Server#

To install Triton Inference Server on ROCm, you have the following options:

Use a prebuilt Docker image with Triton Inference Server pre-installed (recommended)
Build from source

Use a prebuilt Docker image with Triton Inference Server pre-installed#

Docker is the recommended method to set up a Triton Inference Server environment for model serving, as it avoids dependency conflicts. The tested, prebuilt image includes Triton Inference Server, Python, ROCm, and all other requirements.

Pull the Docker image. The image is built on Ubuntu 24.04 and ROCm 7.2.0, with the ONNX Runtime and Python backends enabled.
```
docker pull rocm/tritoninferenceserver:tritoninferenceserver-25.12.amd1_rocm7.2_ubuntu24.04_py3.12
```

Start a Docker container using the image.

```
docker run \
  --name tritonserver_container \
  --device=/dev/kfd \
  --device=/dev/dri \
  --ipc=host \
  -it \
  -p 8000:8000 \
  -p 8001:8001 \
  -p 8002:8002 \
  --net=host \
  -e ORT_MIGRAPHX_MODEL_CACHE_PATH=/migraphx_cache \
  -e ORT_MIGRAPHX_CACHE_PATH=/migraphx_cache \
  -v /path/to/your/model_repository/on/host:/models \
  -v /path/to/your/migraphx_cache_save_dir/on/host:/migraphx_cache \
  rocm/tritoninferenceserver:tritoninferenceserver-25.12.amd1_rocm7.2_ubuntu24.04_py3.12 \
  tritonserver --model-repository=/models --exit-on-error=false
```

Because the container runs with --net=host, it shares the host network stack and Docker discards the -p port mappings. Triton is reachable directly on host ports 8000 (HTTP), 8001 (gRPC), and 8002 (metrics).
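
The two host paths mounted with -v must exist before the container starts. The sketch below prepares them using Triton's standard repository layout of one directory per model, a numbered version subdirectory, and an optional config.pbtxt. The helper `init_model_dir`, the model name `example_model`, and the directory names in the usage comments are placeholders chosen for illustration.

```shell
#!/bin/sh
# Sketch only: init_model_dir and example_model are illustrative names.

# Create <repo>/<model>/<version>/ and write a minimal config.pbtxt
# that selects the ONNX Runtime backend.
init_model_dir() {
  repo="$1"
  model="$2"
  version="${3:-1}"
  mkdir -p "$repo/$model/$version"
  cat > "$repo/$model/config.pbtxt" <<EOF
name: "$model"
backend: "onnxruntime"
EOF
}

# Example usage (paths are assumptions; match them to your -v mounts):
#   mkdir -p ./migraphx_cache
#   init_model_dir ./model_repository example_model 1
#   cp /path/to/exported/model.onnx ./model_repository/example_model/1/model.onnx
```

The generated config.pbtxt is deliberately minimal. Real models usually need input, output, and batching settings that depend on the model itself.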

The prebuilt image contains the Triton Server executable, required shared libraries, backends, and repository agents in the following locations:
- Triton Inference Server executable: /opt/tritonserver/bin
- Shared libraries: /opt/tritonserver/lib
- Backends: /opt/tritonserver/backends
- Repository agents: /opt/tritonserver/repoagents
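
To confirm which backends an image actually contains, you can list the subdirectories of the backends path. The sketch below is intended to run inside the container. `list_backends` is a hypothetical helper, and its default path is taken from the list above.

```shell
#!/bin/sh
# Sketch only: list_backends is an illustrative helper, not a Triton tool.

# Print the name of each backend directory under the given path.
list_backends() {
  dir="${1:-/opt/tritonserver/backends}"
  for entry in "$dir"/*/; do
    # Skip the unexpanded pattern when the directory is empty or missing.
    [ -d "$entry" ] || continue
    basename "$entry"
  done
}

# Example usage inside the container:
#   list_backends
```

For a quick check from the host, docker exec tritonserver_container ls /opt/tritonserver/backends gives the same information without a helper.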

Build from source#

Alternatively, you can build the Triton Inference Server Docker image for ROCm from source.

Clone the ROCm/triton-inference-server-server repository, enter the directory, and run the base build script for Ubuntu 24.04 and ROCm 7.2.

```
git clone -b rocm7.2_r25.12 https://github.com/ROCm/triton-inference-server-server.git
cd triton-inference-server-server
bash scripts/build_ubuntu24.04_rocm_72_base.sh
```

Build the Docker image.

```
# Run from the root of the cloned triton-inference-server-server directory.
python3 build.py \
  --no-container-pull \
  --enable-logging \
  --enable-stats \
  --enable-tracing \
  --enable-rocm \
  --enable-metrics \
  --verbose \
  --endpoint=grpc \
  --endpoint=http \
  --backend=onnxruntime \
  --backend=python \
  --linux-distro=ubuntu
```

Ensure your build options are set as follows:
- --enable-rocm: Enable ROCm support.
- --endpoint: Build with HTTP and gRPC endpoints.
- --backend=onnxruntime / --backend=python: Build backends into the server.
- --linux-distro: Build on Ubuntu 24.04.

These settings build a Triton server with both the ONNX Runtime and Python backends enabled. After the build completes, you can run the Docker container using the same command shown in the "Use a prebuilt Docker image with Triton Inference Server pre-installed" section.

Test the Triton Inference Server installation#

After you launch Triton with the docker run command, the server logs should show the model repository loading successfully, with each model listed as “READY”.
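
Besides reading the logs, you can poll Triton's HTTP health endpoint, which is served at /v2/health/ready on port 8000 by default. The sketch below retries an arbitrary probe command until it succeeds. `wait_until_ready` and the `RETRY_DELAY` variable are hypothetical names for this example, and `example_model` in the usage comment is a placeholder for one of your own models.

```shell
#!/bin/sh
# Sketch only: wait_until_ready and RETRY_DELAY are illustrative names.

# Run the probe command up to <attempts> times, sleeping between failures.
# Returns 0 as soon as the probe succeeds, 1 if every attempt fails.
wait_until_ready() {
  attempts="$1"
  shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "${RETRY_DELAY:-2}"
  done
  return 1
}

# Example usage (requires a running server and curl on the host):
#   wait_until_ready 30 curl -sf localhost:8000/v2/health/ready && echo "server ready"
#   curl -sf localhost:8000/v2/models/example_model/ready && echo "model ready"
```

Passing the probe as a command keeps the retry logic independent of curl, so the same helper works with another HTTP client.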