vLLM Linux Docker Image#

Virtual Large Language Model (vLLM) is a fast and easy-to-use library for LLM inference and serving, providing high throughput and memory-efficient execution.

For additional information, visit the AMD vLLM GitHub page.

Note that this is a benchmarking demo/example; setup for other models or vLLM configurations may differ.

Additional information#

  • Ensure Docker is installed on your system. Refer to https://docs.docker.com/engine/install/ for more information. A quick sanity check is shown after this list.

  • This Docker image supports gfx1151 and gfx1150. Refer to the compatibility matrix for more information.

  • This example highlights the AMD vLLM Docker image using deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B. Other vLLM-supported models can be used as well.
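
As a quick sanity check, verify that Docker is installed and that the ROCm GPU device nodes used later by docker run are present. This is a minimal sketch; the reported versions and device entries will vary by system.

docker --version
ls /dev/kfd /dev/dri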

Download and install Docker image#

Download Docker image#

Select the applicable Ubuntu version and download the compatible Docker image before starting.

docker pull rocm/vllm-dev:rocm7.0.2_navi_ubuntu24.04_py3.12_pytorch_2.8_vllm_0.10.2rc1

Note
For more information, see rocm/vllm-dev.
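
After the pull completes, you can optionally confirm the image is available locally. The tag listed should match the one pulled above.

docker images rocm/vllm-dev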

Installation#

Follow these steps to build a vLLM Docker image and benchmark a model.

  1. Start the Docker container.

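    # --device=/dev/kfd and --device=/dev/dri expose the host's AMD GPU device nodes
    # to the container; --network=host shares the host's network namespace.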
    docker run -it \
      --privileged \
      --device=/dev/kfd \
      --device=/dev/dri \
      --network=host \
      --group-add sudo \
      -w /app/vllm/ \
      --name <container_name> \
      <image_name> \
      /bin/bash


    Note
    You can find the <image_name> by running docker images. The <container_name> is user defined; use this value to name your Docker container.

  2. Run benchmarks inside the Docker container. An example with additional benchmark options is shown after these steps.

    cd benchmarks
    python3 benchmark_latency.py --model deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
    
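
The benchmark_latency.py script measures end-to-end latency for a single batch of requests. As a sketch, you can vary the workload with additional flags; the exact flag names may differ between vLLM versions, so check python3 benchmark_latency.py --help inside the container.

python3 benchmark_latency.py \
  --model deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B \
  --input-len 128 \
  --output-len 128 \
  --batch-size 8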

Additional usage#

  • vLLM is optimized to serve LLMs faster and more efficiently, especially for applications requiring high throughput and scalability. See Quickstart - vLLM for more information. A minimal serving sketch is shown after this list.

  • To run offline inference, see Quickstart - vLLM for more information.
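
As a minimal serving sketch, you can start vLLM's OpenAI-compatible server inside the running container and query it from the host (the container uses --network=host). The port shown is vLLM's default (8000); the prompt and token count are illustrative assumptions.

# Inside the container: start the OpenAI-compatible server
vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

# From the host: send a completion request
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B", "prompt": "Hello, my name is", "max_tokens": 32}'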