Triton Inference Server on ROCm installation#
2026-03-31
System requirements#
To use Triton Inference Server 25.12, you need the following prerequisites:
Install Triton Inference Server#
To install Triton Inference Server on ROCm, you have the following options:
Use a prebuilt Docker image with Triton Inference Server pre-installed#
Docker is the recommended method to set up a Triton Inference Server environment for model serving, as it avoids dependency conflicts. The tested, prebuilt image includes Triton Inference Server, Python, ROCm, and all other requirements.
Pull the Docker image. This image is built on Ubuntu 24.04 with ROCm 7.2.0 and has the ONNX Runtime and Python backends enabled.
docker pull rocm/tritoninferenceserver:tritoninferenceserver-25.12.amd1_rocm7.2_ubuntu24.04_py3.12
Start a Docker container using the image.
docker run \
    --name tritonserver_container \
    --device=/dev/kfd \
    --device=/dev/dri \
    --ipc=host \
    -it \
    -p 8000:8000 \
    -p 8001:8001 \
    -p 8002:8002 \
    --net=host \
    -e ORT_MIGRAPHX_MODEL_CACHE_PATH=/migraphx_cache \
    -e ORT_MIGRAPHX_CACHE_PATH=/migraphx_cache \
    -v /path/to/your/model_repository/on/host:/models \
    -v /path/to/your/migraphx_cache_save_dir/on/host:/migraphx_cache \
    rocm/tritoninferenceserver:tritoninferenceserver-25.12.amd1_rocm7.2_ubuntu24.04_py3.12 \
    tritonserver --model-repository=/models --exit-on-error=false
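The directory mounted at /models must follow Triton's model repository layout: one directory per model, numeric version subdirectories, and an optional config.pbtxt. The sketch below creates a minimal layout on the host; the model name my_onnx_model and the placeholder model.onnx file are illustrative, not part of this guide.

```shell
# Sketch of a minimal Triton model repository (names are illustrative).
# Each model gets its own directory with numeric version subdirectories.
mkdir -p model_repository/my_onnx_model/1
touch model_repository/my_onnx_model/1/model.onnx   # placeholder; use a real ONNX model here

# Minimal config.pbtxt; the backend name must match a backend enabled
# in the image (onnxruntime or python for this image).
cat > model_repository/my_onnx_model/config.pbtxt <<'EOF'
name: "my_onnx_model"
backend: "onnxruntime"
max_batch_size: 8
EOF

find model_repository -print
```

Mount the resulting model_repository directory into the container in place of /path/to/your/model_repository/on/host.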
The prebuilt image contains the Triton Server executable, required shared libraries, backends, and repository agents in the following locations:
Triton Inference Server executable: /opt/tritonserver/bin
Shared libraries: /opt/tritonserver/lib
Backends: /opt/tritonserver/backends
Repository agents: /opt/tritonserver/repoagents
Build from source#
Alternatively, you can build the Triton Inference Server Docker image for ROCm from source.
Clone the ROCm/triton-inference-server-server repository, enter the directory, and build the base image.
git clone -b rocm7.2_r25.12 https://github.com/ROCm/triton-inference-server-server.git
cd triton-inference-server-server
bash scripts/build_ubuntu24.04_rocm_72_base.sh
Build the Docker image.
cd triton-inference-server-server
python3 build.py \
    --no-container-pull \
    --enable-logging \
    --enable-stats \
    --enable-tracing \
    --enable-rocm \
    --enable-metrics \
    --verbose \
    --endpoint=grpc \
    --endpoint=http \
    --backend=onnxruntime \
    --backend=python \
    --linux-distro=ubuntu
Ensure your build options are set as follows:
--enable-rocm: Enable ROCm support.
--endpoint: Build with HTTP and gRPC endpoints.
--backend=onnxruntime / --backend=python: Build the ONNX Runtime and Python backends into the server.
--linux-distro: Build on Ubuntu 24.04.
These settings build a Triton server with both the ONNX Runtime and Python backends enabled. After the build completes, run the Docker container using the same command shown in the Use a prebuilt Docker image with Triton Inference Server pre-installed section.
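Because the build enables the Python backend, a model can be a plain Python file implementing the TritonPythonModel interface. The sketch below writes a minimal echo model; the model name echo and the tensor names INPUT0/OUTPUT0 are illustrative choices, not names required by this guide.

```shell
# Sketch of a minimal Python-backend model (model and tensor names are
# illustrative). The Python backend loads a TritonPythonModel class from
# <model>/<version>/model.py.
mkdir -p model_repository/echo/1
cat > model_repository/echo/1/model.py <<'EOF'
import triton_python_backend_utils as pb_utils

class TritonPythonModel:
    def execute(self, requests):
        # Echo each request's INPUT0 tensor back as OUTPUT0.
        responses = []
        for request in requests:
            in0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            out0 = pb_utils.Tensor("OUTPUT0", in0.as_numpy())
            responses.append(pb_utils.InferenceResponse(output_tensors=[out0]))
        return responses
EOF

# The Python backend also requires a config.pbtxt declaring the
# input and output tensors.
cat > model_repository/echo/config.pbtxt <<'EOF'
name: "echo"
backend: "python"
max_batch_size: 0
input [ { name: "INPUT0", data_type: TYPE_FP32, dims: [ -1 ] } ]
output [ { name: "OUTPUT0", data_type: TYPE_FP32, dims: [ -1 ] } ]
EOF
```

Place the echo directory in the repository you mount at /models and the server will load it alongside any ONNX models.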
Test the Triton Inference Server installation#
After launching Triton using the docker run command, the server logs should show the model repository loading successfully, with each model reported as "READY".