Ray on ROCm installation#
2025-09-09
6 min read time
Ray is a unified framework for scaling AI and Python applications from your laptop to a full cluster, without changing your code. Ray consists of a core distributed runtime and a set of AI libraries for simplifying machine learning computations.
Ray is a general-purpose framework that runs many types of workloads efficiently. Any Python application can be scaled with Ray, without extra infrastructure.
For hardware, software, and third-party framework compatibility between ROCm and Ray, see the ROCm compatibility matrix and the third-party support matrix.
Note
Ray is supported on ROCm 6.4.1.
Install Ray#
To install Ray on ROCm, you have the following options:

- Using a prebuilt Docker image with Ray pre-installed (recommended)
- Building your own Docker image
- Installing Ray on bare metal or a custom container
- Building Ray from source
Using a prebuilt Docker image with Ray pre-installed#
Docker is the recommended method to set up a Ray environment, and it avoids potential installation issues. The tested, prebuilt image includes Ray, ROCm, and other dependencies.
Pull the Docker image
docker pull rocm/ray:ray-2.48.0.post0_rocm6.4.1_ubuntu24.04_py3.12_pytorch2.6.0
Note
For specific versions of Ray, review the periodically pushed Docker images at ROCm Ray on Docker Hub.
Additional Docker images are available at ROCm Ray on Docker Hub. These contain the latest ROCm version but might use an older version of Ray.
Launch and connect to the container
docker run -it -d --network=host --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size 64G \
  --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $(pwd):/host_dir \
  -w /app --name rocm_ray rocm/ray:ray-2.48.0.post0_rocm6.4.1_ubuntu24.04_py3.12_pytorch2.6.0 /bin/bash
docker attach rocm_ray
Tip
The `--shm-size` parameter allocates shared memory for the container. Adjust it based on your system's resources if needed.

Replace `$(pwd)` with the absolute path to the directory you want to mount inside the container.
Build your own Docker image#
If you prefer to use the ROCm Ubuntu image or already have a ROCm Ubuntu container, follow these steps to install Ray in the container.
Pull the ROCm Ubuntu Docker image. For example, use the following command to pull the ROCm Ubuntu image:
docker pull rocm/pytorch:rocm6.4.1_ubuntu24.04_py3.12_pytorch_release_2.6.0
Launch the Docker container. After pulling the image, launch a container using this command:
docker run -it -d --network=host --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size 64G \
  --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $(pwd):/host_dir \
  --name rocm_ray rocm/pytorch:rocm6.4.1_ubuntu24.04_py3.12_pytorch_release_2.6.0 /bin/bash
docker attach rocm_ray
Activate the conda environment
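The rocm/pytorch image ships its Python toolchain in a conda environment. The environment name below (`py_3.12`) is an assumption; list the available environments first and use the name your image actually reports:

```shell
# List the conda environments shipped in the image, then activate the
# Python one. "py_3.12" is an assumed name; substitute the name printed
# by `conda env list`.
conda env list
conda activate py_3.12
```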
Install from Ray nightly wheels. Inside the running container, install the required version of Ray with ROCm support using pip:
pip install -U "ray[all] @ https://s3-us-west-2.amazonaws.com/ray-wheels/master/005c372262e050d5745f475e22e64305fa07f8b8/ray-3.0.0.dev0-cp312-cp312-manylinux2014_x86_64.whl"
Verify the installed Ray version. Check whether the correct version of Ray is installed.
pip3 freeze | grep ray
Expected output:
memray==1.17.2
ray @ https://s3-us-west-2.amazonaws.com/ray-wheels/master/005c372262e050d5745f475e22e64305fa07f8b8/ray-3.0.0.dev0-cp312-cp312-manylinux2014_x86_64.whl#sha256=e8f457f1bb8009b1e2744733c269fc54f3ec78e3705e16a2f88a8305720efe1b
Verify the installation of ROCm Ray. See Test the Ray installation.
Install Ray on bare metal or a custom container#
Follow these steps if you prefer to install ROCm manually on your host system or in a custom container.
Install ROCm. Follow the ROCm installation guide to install ROCm on your system.
Once installed, verify your ROCm installation using:
rocm-smi
========================================== ROCm System Management Interface ==========================================
==================================================== Concise Info ====================================================
Device  [Model : Revision]    Temp        Power     Partitions      SCLK    MCLK    Fan  Perf  PwrCap  VRAM%  GPU%
        Name (20 chars)       (Junction)  (Socket)  (Mem, Compute)
======================================================================================================================
0       [0x74a1 : 0x00]       50.0°C      170.0W    NPS1, SPX       131Mhz  900Mhz  0%   auto  750.0W  0%     0%
        AMD Instinct MI300X
1       [0x74a1 : 0x00]       51.0°C      176.0W    NPS1, SPX       132Mhz  900Mhz  0%   auto  750.0W  0%     0%
        AMD Instinct MI300X
2       [0x74a1 : 0x00]       50.0°C      177.0W    NPS1, SPX       132Mhz  900Mhz  0%   auto  750.0W  0%     0%
        AMD Instinct MI300X
3       [0x74a1 : 0x00]       53.0°C      176.0W    NPS1, SPX       132Mhz  900Mhz  0%   auto  750.0W  0%     0%
        AMD Instinct MI300X
======================================================================================================================
================================================ End of ROCm SMI Log =================================================
Install the required version of Ray with ROCm support using pip:
pip install -U "ray[all] @ https://s3-us-west-2.amazonaws.com/ray-wheels/master/005c372262e050d5745f475e22e64305fa07f8b8/ray-3.0.0.dev0-cp312-cp312-manylinux2014_x86_64.whl"
Verify the installed Ray version. Check whether the correct version of Ray and its ROCm plugins are installed.
pip3 freeze | grep ray
Build Ray from source#
Follow the Building Ray from Source guide to build Ray with ROCm support from source.
Test the Ray installation#
If you used a prebuilt Docker image from the AMD ROCm Docker Hub, running the Ray unit tests is optional, because the image has already been validated.
To run unit tests manually and validate your installation fully, follow these steps:
After launching the container, test whether Ray detects ROCm devices as expected.
python3 -c "import ray; ray.init(); print(ray.cluster_resources())"
If the setup is successful, the output should list all available ROCm devices.
Expected output (for example, on an MI300X node):
{'memory': 1420360912896.0, 'GPU': 8.0, 'accelerator_type:AMD-Instinct-MI300X-OAM': 1.0, 'node:10.7.39.110': 1.0, 'CPU': 384.0, 'node:__internal_head__': 1.0, 'object_store_memory': 200000000000.0}