PyTorch on ROCm installation#

2025-12-03

Applies to Linux

PyTorch is an open-source tensor library designed for deep learning. PyTorch on ROCm provides mixed-precision and large-scale training using AMD MIOpen and RCCL libraries.

This topic covers setup instructions and the necessary files to build, test, and run PyTorch with ROCm support in a Docker environment. To learn more about PyTorch on ROCm, including use cases, recommendations, and hardware and software compatibility, see PyTorch compatibility.

Install PyTorch#

To install PyTorch for ROCm, you have the following options:

- Use a prebuilt Docker image with PyTorch pre-installed (recommended)
  - Docker image support
- Use a wheels package
- Build PyTorch from source
- Use the PyTorch upstream Dockerfile

Use a prebuilt Docker image with PyTorch pre-installed#

The recommended setup to get a PyTorch environment is through Docker, as it avoids potential installation issues. The tested, prebuilt image includes PyTorch, ROCm, and other dependencies. See Docker image support. To install ROCm on bare metal, follow ROCm installation overview.

Download the latest public PyTorch Docker image.
```
docker pull rocm/pytorch:latest
```
Important

The rocm/pytorch:latest and rocm/pytorch:latest-release tags point to a Docker image with the latest ROCm-tested release of PyTorch.

The rocm/pytorch:latest-release-preview tag points to a more recent PyTorch version with limited testing on ROCm.

You can download Docker images with specific ROCm, PyTorch, and operating system versions. See the available tags on Docker Hub.

Start a Docker container using the image.

```
docker run -it \
    --cap-add=SYS_PTRACE \
    --security-opt seccomp=unconfined \
    --device=/dev/kfd \
    --device=/dev/dri \
    --group-add video \
    --ipc=host \
    --shm-size 8G \
    rocm/pytorch:latest
```

Note

This will automatically download the image if it does not exist on the host. You can also pass the -v argument to mount any data directories from the host onto the container.
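
As a minimal sketch of the -v option mentioned in the note, the following Python helper assembles the same docker run command with extra host directories mounted. The function name `docker_run_args` and the example paths are hypothetical; the flags are the ones shown in the command above.

```python
import shlex


def docker_run_args(image, mounts=None):
    """Build the argv list for the `docker run` command shown above.

    `mounts` maps host directories to container directories; each pair
    becomes one `-v host:container` argument.
    """
    args = [
        "docker", "run", "-it",
        "--cap-add=SYS_PTRACE",
        "--security-opt", "seccomp=unconfined",
        "--device=/dev/kfd",
        "--device=/dev/dri",
        "--group-add", "video",
        "--ipc=host",
        "--shm-size", "8G",
    ]
    for host_dir, container_dir in (mounts or {}).items():
        args += ["-v", f"{host_dir}:{container_dir}"]
    args.append(image)  # the image name must come after all options
    return args


# Hypothetical example: expose a host dataset directory as /data.
cmd = docker_run_args("rocm/pytorch:latest", {"/home/user/datasets": "/data"})
print(shlex.join(cmd))
```

Building the command as a list keeps each option separate, so paths that contain spaces are quoted correctly when the list is printed or passed to a process launcher.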

Docker image support#

AMD validates and publishes ready-made PyTorch images with ROCm backends on Docker Hub. The following Docker image tags and associated inventories are validated for ROCm 7.1.1.

PyTorch 2.9.1

Python 3.12

Docker pull tag

```
docker pull rocm/pytorch:rocm7.1.1_ubuntu24.04_py3.12_pytorch_release_2.9.1
```

Additional software components

Ubuntu	Apex	torchvision	UCX	Open MPI
24.04	1.9.0+rocm7.1.1.git07c3ee53	0.24.0	1.16.0+ds-5ubuntu1	4.1.6-7ubuntu2

See rocm/pytorch:rocm7.1.1_ubuntu24.04_py3.12_pytorch_release_2.9.1 on Docker Hub.

Python 3.10

Docker pull tag

```
docker pull rocm/pytorch:rocm7.1.1_ubuntu22.04_py3.10_pytorch_release_2.9.1
```

Additional software components

Ubuntu	Apex	torchvision	UCX	Open MPI
22.04	1.9.0+rocm7.1.1.git07c3ee53	0.24.0	1.12.1~rc2-1	4.1.2-2ubuntu1

See rocm/pytorch:rocm7.1.1_ubuntu22.04_py3.10_pytorch_release_2.9.1 on Docker Hub.

PyTorch 2.8.0

Python 3.12

Docker pull tag

```
docker pull rocm/pytorch:rocm7.1.1_ubuntu24.04_py3.12_pytorch_release_2.8.0
```

Additional software components

Ubuntu	Apex	torchvision	UCX	Open MPI
24.04	1.8.0a0+rocm7.1.1.git3f26640c	0.23.0	1.16.0+ds-5ubuntu1	4.1.6-7ubuntu2

See rocm/pytorch:rocm7.1.1_ubuntu24.04_py3.12_pytorch_release_2.8.0 on Docker Hub.

Python 3.10

Docker pull tag

```
docker pull rocm/pytorch:rocm7.1.1_ubuntu22.04_py3.10_pytorch_release_2.8.0
```

Additional software components

Ubuntu	Apex	torchvision	UCX	Open MPI
22.04	1.8.0a0+rocm7.1.1.git3f26640c	0.23.0	1.12.1~rc2-1	4.1.2-2ubuntu1

See rocm/pytorch:rocm7.1.1_ubuntu22.04_py3.10_pytorch_release_2.8.0 on Docker Hub.

PyTorch 2.7.1

Python 3.12

Docker pull tag

```
docker pull rocm/pytorch:rocm7.1.1_ubuntu24.04_py3.12_pytorch_release_2.7.1
```

Additional software components

Ubuntu	Apex	torchvision	UCX	Open MPI
24.04	1.7.0+rocm7.1.1.git7a57becf	0.22.1	1.16.0+ds-5ubuntu1	4.1.6-7ubuntu2

See rocm/pytorch:rocm7.1.1_ubuntu24.04_py3.12_pytorch_release_2.7.1 on Docker Hub.

Python 3.10

Docker pull tag

```
docker pull rocm/pytorch:rocm7.1.1_ubuntu22.04_py3.10_pytorch_release_2.7.1
```

Additional software components

Ubuntu	Apex	torchvision	UCX	Open MPI
22.04	1.7.0+rocm7.1.1.git7a57becf	0.22.1	1.12.1~rc2-1	4.1.2-2ubuntu1

See rocm/pytorch:rocm7.1.1_ubuntu22.04_py3.10_pytorch_release_2.7.1 on Docker Hub.
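
The tags listed in this section follow a regular naming pattern. As a small sketch, the helper below composes a tag from its four version fields and prints the pull commands for the combinations listed above. The function name `pytorch_image_tag` is hypothetical, and the pattern is inferred only from the tags shown here, so check Docker Hub before relying on a tag that is not in the list.

```python
def pytorch_image_tag(rocm, ubuntu, python, pytorch):
    """Compose a rocm/pytorch tag following the naming pattern of the tags above."""
    return (
        f"rocm/pytorch:rocm{rocm}_ubuntu{ubuntu}"
        f"_py{python}_pytorch_release_{pytorch}"
    )


# Combinations listed above as validated for ROCm 7.1.1.
OS_PYTHON_PAIRS = [("24.04", "3.12"), ("22.04", "3.10")]
PYTORCH_VERSIONS = ["2.9.1", "2.8.0", "2.7.1"]

tags = [
    pytorch_image_tag("7.1.1", ubuntu, python, pytorch)
    for pytorch in PYTORCH_VERSIONS
    for ubuntu, python in OS_PYTHON_PAIRS
]
for tag in tags:
    print(f"docker pull {tag}")
```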

Use a wheels package#

PyTorch supports the ROCm platform by providing tested wheels packages. To access this feature, go to pytorch.org/get-started/locally/. For the correct wheels command, you must select Linux, Python, pip, and ROCm in the matrix.

Note

The available ROCm release differs between the Stable and Nightly PyTorch builds. More recent ROCm releases are generally available through the Nightly builds.

Choose one of the following three options:

Option 1:
1. Download a base Docker image with the correct ROCm version.
  
  Base OS	Docker Image
  Ubuntu 22.04	rocm/dev-ubuntu-22.04
  Ubuntu 24.04	rocm/dev-ubuntu-24.04
2. Pull the selected image.
```
docker pull rocm/dev-ubuntu-22.04:latest
```
3. Start a Docker container using the downloaded image.
```
docker run -it --device=/dev/kfd --device=/dev/dri --group-add video rocm/dev-ubuntu-22.04:latest
```
Option 2:
1. Select a base OS Docker image. Check System requirements (Linux).
2. Pull the selected base OS image (Ubuntu 22.04, for example).
```
docker pull ubuntu:22.04
```
3. Start a Docker container using the downloaded image.
```
docker run -it --device=/dev/kfd --device=/dev/dri --group-add video ubuntu:22.04
```
4. Install ROCm using the directions in the ROCm installation overview section.
Option 3:

Install on bare metal. Check System requirements (Linux) and install ROCm using the directions in the ROCm installation overview section.

Install the required dependencies for the wheels package.

```
sudo apt update
sudo apt install libjpeg-dev python3-dev python3-pip
pip3 install wheel setuptools
```

Install torch, torchvision, and torchaudio, as specified in the installation matrix.

Note

The following command uses the ROCm 7.0 PyTorch wheel. If you want a different version of ROCm, modify the command accordingly.
```
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm7.0
```
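
To adapt the command to another ROCm version, only the last path segment of the index URL changes. The sketch below builds the command string for a given version. The function name `rocm_wheel_command` is hypothetical, and the Stable form (no `--pre`, no `nightly/` path segment) is an assumption about how the index is laid out; always confirm the exact command in the installation matrix on pytorch.org.

```python
BASE_URL = "https://download.pytorch.org/whl"
PACKAGES = "torch torchvision torchaudio"


def rocm_wheel_command(rocm_version, nightly=True):
    """Return a pip3 command for the ROCm wheels of the given ROCm version.

    nightly=True reproduces the command shown above. The nightly=False
    form is an assumed Stable layout; verify it against the matrix.
    """
    if nightly:
        return (
            f"pip3 install --pre {PACKAGES} "
            f"--index-url {BASE_URL}/nightly/rocm{rocm_version}"
        )
    return f"pip3 install {PACKAGES} --index-url {BASE_URL}/rocm{rocm_version}"


print(rocm_wheel_command("7.0"))
```
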
(Optional) Use MIOpen kdb files with ROCm PyTorch wheels.

PyTorch uses MIOpen for machine learning primitives, which are compiled into kernels at runtime. This runtime compilation adds a short warm-up phase when PyTorch starts. MIOpen kdb files contain precompiled kernels that can shorten that warm-up.

MIOpen kdb files can be used with ROCm PyTorch wheels. However, the kdb files need to be placed in a specific location with respect to the PyTorch installation path. A helper script simplifies this task by taking the ROCm version and GPU architecture as inputs. This works for Ubuntu.

You can download the helper script here: install_kdb_files_for_pytorch_wheels.sh, or use:
```
wget https://raw.githubusercontent.com/wiki/ROCm/pytorch/files/install_kdb_files_for_pytorch_wheels.sh
```
After installing ROCm PyTorch wheels, run the following code:
```
#Optional: replace 'gfx90a' with your architecture and 6.2.4 with your preferred ROCm version
export GFX_ARCH=gfx90a

#Optional
export ROCM_VERSION=6.2.4

./install_kdb_files_for_pytorch_wheels.sh
```

Build PyTorch from source#

Use the rocm/pytorch:latest image, uninstall the preinstalled PyTorch package, and rebuild PyTorch from source. This ensures compatibility with your specific ROCm version, GPU architecture, and project requirements.

Download the latest PyTorch Docker image.
```
docker pull rocm/pytorch:latest
```

Start a Docker container using the downloaded image.

```
docker run -it \
    --cap-add=SYS_PTRACE \
    --security-opt seccomp=unconfined \
    --device=/dev/kfd \
    --device=/dev/dri \
    --group-add video \
    --ipc=host \
    --shm-size 8G \
    rocm/pytorch:latest
```

Uninstall the pre-installed PyTorch inside the container. Otherwise, the prebuilt ROCm PyTorch from the container might conflict with your source build.
```
pip3 uninstall -y torch torchvision torchaudio
```

Clone the PyTorch repository.

```
cd ~
git clone https://github.com/pytorch/pytorch.git
cd pytorch
git submodule update --init --recursive
```

(Optional) Set your ROCm architecture.

By default, PyTorch builds for a broad set of AMD architectures. To speed up compilation, you can target only your GPU architecture.

To determine your architecture:
```
rocminfo | grep gfx
```
Then set the PYTORCH_ROCM_ARCH environment variable:
```
export PYTORCH_ROCM_ARCH=<uarch>
```
Replace <uarch> with the result from rocminfo (for example, gfx90a, gfx1030). See System requirements (Linux) for the list of AMD GPU architectures.

Build and install PyTorch following the instructions in pytorch/pytorch.

Use the PyTorch upstream Dockerfile#

If you don’t want to use a prebuilt base Docker image, you can build a custom base Docker image using scripts from the PyTorch repository. This uses a standard Docker image from operating system maintainers and installs all the required dependencies, including:

ROCm
torchvision
Conda packages
The compiler toolchain

Clone the PyTorch repository.

```
cd ~
git clone https://github.com/pytorch/pytorch.git
cd pytorch
git submodule update --init --recursive
```

Build the PyTorch Docker image.

```
cd .ci/docker
./build.sh pytorch-linux-<os-version>-rocm<rocm-version>-py<python-version> -t rocm/pytorch:build_from_dockerfile
```

Where:

<os-version> = ubuntu20.04 (or focal), ubuntu22.04 (or jammy)
<rocm-version> = 6.0, 6.1, 6.2
<python-version> = 3.8 - 3.11

To verify that your image was successfully created, run:

```
docker image ls rocm/pytorch:build_from_dockerfile
```

If successful, the output looks like this:

```
REPOSITORY    TAG                       IMAGE ID         CREATED           SIZE
rocm/pytorch  build_from_dockerfile     17071499be47     2 minutes ago     32.8GB
```

Start a Docker container using the image with the mounted PyTorch folder.

```
docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
--user root --device=/dev/kfd --device=/dev/dri \
--group-add video --ipc=host --shm-size 8G \
-v ~/pytorch:/pytorch rocm/pytorch:build_from_dockerfile
```

You can also pass the -v argument to mount any data directories from the host onto the container.

Go to the PyTorch directory.
```
cd /pytorch
```
Set ROCm architecture.

To determine your AMD architecture, run:
```
rocminfo | grep gfx
```
The result looks like this (for gfx1030 architecture):
```
Name:                    gfx1030
Name:                    amdgcn-amd-amdhsa--gfx1030
```
Set the PYTORCH_ROCM_ARCH environment variable to specify the architectures you want to build PyTorch for.
```
export PYTORCH_ROCM_ARCH=<uarch>
```
where <uarch> is the architecture reported by the rocminfo command.
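
If you prefer to extract the architecture programmatically, the following sketch pulls the gfx names out of rocminfo output and prints a matching export line. The function name `gfx_targets` is hypothetical, and the sample text is the output shown above. Joining several targets with semicolons is how PYTORCH_ROCM_ARCH is commonly written for multi-architecture builds; treat that as an assumption to confirm for your PyTorch version.

```python
import re


def gfx_targets(rocminfo_output):
    """Return the unique gfx names found in rocminfo output, in order of appearance."""
    targets = []
    for name in re.findall(r"\bgfx[0-9a-f]+\b", rocminfo_output):
        if name not in targets:
            targets.append(name)
    return targets


# Sample rocminfo lines for a gfx1030 GPU, as shown above.
sample = """\
Name:                    gfx1030
Name:                    amdgcn-amd-amdhsa--gfx1030
"""

targets = gfx_targets(sample)
print(f'export PYTORCH_ROCM_ARCH="{";".join(targets)}"')
```

On a real system, you could pass the captured standard output of the rocminfo command to `gfx_targets` instead of the sample string.
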
Build PyTorch.
```
.ci/pytorch/build.sh
```
This converts PyTorch CUDA sources to HIP and builds the PyTorch framework.

To check whether the build succeeded, run the following immediately after the build command:
```
echo $? # prints 0 if the build succeeded
```

Test the PyTorch installation#

You can use PyTorch unit tests to validate your PyTorch installation. If you used a prebuilt PyTorch Docker image from AMD ROCm Docker Hub or installed an official wheels package, validation tests are not necessary.

If you want to manually run unit tests to validate your PyTorch installation fully, follow these steps:

Import the torch package in Python to test if PyTorch is installed and accessible.

Note

Do not run the following command from the PyTorch home directory.
```
python3 -c 'import torch' 2> /dev/null && echo 'Success' || echo 'Failure'
```
Check if the GPU is accessible from PyTorch. In the PyTorch framework, torch.cuda is the generic way to access the GPU, including on ROCm. The following command prints True only if an AMD GPU is available to PyTorch.
```
python3 -c 'import torch; print(torch.cuda.is_available())'
```
Run unit tests to validate the PyTorch installation fully.

Note

You must run the following command from the PyTorch home directory.
```
PYTORCH_TEST_WITH_ROCM=1 python3 test/run_test.py --verbose \
--include test_nn test_torch test_cuda test_ops \
test_unary_ufuncs test_binary_ufuncs test_autograd
```
Setting PYTORCH_TEST_WITH_ROCM=1 ensures that the test runner skips certain unit tests on ROCm. This also applies to wheel installs in a non-controlled environment.

Note

Make sure your PyTorch source code corresponds to the PyTorch wheel or the installation in the Docker image. Incompatible PyTorch source code can give errors when running unit tests.

Some tests may be skipped, as appropriate, based on your system configuration. ROCm doesn’t support all PyTorch features; tests that evaluate unsupported features are skipped. Other tests might be skipped, depending on the host or GPU memory and the number of available GPUs.

If the compilation and installation are correct, all tests that are not skipped pass.

(Optional) Run individual unit tests.
```
PYTORCH_TEST_WITH_ROCM=1 python3 test/test_nn.py --verbose
```
You can replace test_nn.py with any other test set.
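
The first two checks in this section (importing torch and querying torch.cuda) can also be combined into one short script. This is a minimal sketch: the function name `rocm_pytorch_report` is hypothetical, and it assumes that `torch.version.hip` is set on ROCm builds and is None otherwise. As with the import check, do not run it from the PyTorch home directory.

```python
def rocm_pytorch_report():
    """Collect basic facts about the installed PyTorch build and visible GPUs."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # Assumed to be a version string on ROCm builds, None otherwise.
        "hip_version": getattr(torch.version, "hip", None),
        "gpu_available": torch.cuda.is_available(),
        "gpu_names": [],
    }
    if report["gpu_available"]:
        report["gpu_names"] = [
            torch.cuda.get_device_name(i)
            for i in range(torch.cuda.device_count())
        ]
    return report


report = rocm_pytorch_report()
for key, value in report.items():
    print(f"{key}: {value}")
```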

Run a PyTorch example#

The PyTorch examples repository provides basic examples that exercise the functionality of your framework.

Two of our favorite testing databases are:

MNIST (Modified National Institute of Standards and Technology): A database of handwritten digits that can be used to train a Convolutional Neural Network for handwriting recognition.
ImageNet: A database of images that can be used to train a network for visual object recognition.

MNIST PyTorch example#

Clone the PyTorch examples repository.

```
git clone https://github.com/pytorch/examples.git
```

Go to the MNIST example folder.
```
cd examples/mnist
```

Follow the instructions in the README.md file in this folder to install the requirements. Then run:

```
python3 main.py
```

This generates the following output:

```
...
Train Epoch: 14 [58240/60000 (97%)]     Loss: 0.010128
Train Epoch: 14 [58880/60000 (98%)]     Loss: 0.001348
Train Epoch: 14 [59520/60000 (99%)]     Loss: 0.005261

Test set: Average loss: 0.0252, Accuracy: 9921/10000 (99%)
```

ImageNet PyTorch example#

Clone the PyTorch examples repository (if you didn’t already do this in the preceding MNIST example).
```
git clone https://github.com/pytorch/examples.git
```
Go to the ImageNet example folder.
```
cd examples/imagenet
```
Follow the instructions in the README.md file in this folder to install the requirements. Then run:
```
python3 main.py
```
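
Before starting a long run such as the ImageNet example, a tiny training loop on synthetic data can confirm that forward, backward, and optimizer steps execute on the selected device. This is a minimal sketch and not part of the PyTorch examples repository; the function name `smoke_train`, the model, and the data are made up for illustration.

```python
def smoke_train(steps=50, lr=0.05):
    """Fit a small linear model on synthetic data; return (first_loss, last_loss)."""
    try:
        import torch
    except ImportError:
        return None

    # On ROCm builds, "cuda" selects the AMD GPU.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    torch.manual_seed(0)

    # Synthetic regression task: the target is the sum of the inputs.
    x = torch.randn(64, 8, device=device)
    y = x.sum(dim=1, keepdim=True)

    model = torch.nn.Linear(8, 1).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)

    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses[0], losses[-1]


result = smoke_train()
if result is None:
    print("torch is not installed")
else:
    print(f"loss: {result[0]:.4f} -> {result[1]:.4f}")
```

A falling loss indicates that gradients flow and parameters update on the device; it says nothing about performance or about the full example workloads.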

Troubleshooting#

What to do if you get the following error when trying to run PyTorch:
```
hipErrorNoBinaryForGPU: Unable to find code object for all current devices!
```
This error means that the installed PyTorch, or one of its dependencies or libraries, does not support the current GPU. To work around this issue, use the following steps:
1. Confirm that the hardware supports the ROCm stack. Refer to System requirements (Linux) and System requirements (Windows).
2. Determine the gfx target.
```
rocminfo | grep gfx
```
3. Check if PyTorch is compiled with the correct gfx target.
```
TORCHDIR=$( dirname $( python3 -c 'import torch; print(torch.__file__)' ) )
roc-obj-ls -v $TORCHDIR/lib/libtorch_hip.so # check for gfx target
```
  Note
  
  If your gfx target is not listed and you build PyTorch from source, recompile PyTorch with the correct gfx target.
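
The comparison in steps 2 and 3 can also be sketched from within Python. The example below assumes that your PyTorch build exposes the device architecture as `gcnArchName` on the device properties and that `torch.cuda.get_arch_list()` reports the gfx targets the build was compiled for; both are assumptions to verify on your version. The helper names `base_target`, `is_target_built`, and `check_current_gpu` are hypothetical.

```python
def base_target(arch):
    """Strip feature suffixes such as ':sramecc+:xnack-' from a gfx name."""
    return arch.split(":")[0]


def is_target_built(device_arch, built_archs):
    """Return True if the device's gfx target is among the compiled targets."""
    return base_target(device_arch) in {base_target(a) for a in built_archs}


def check_current_gpu():
    """Return True/False for GPU 0, or None if the check cannot be performed."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    props = torch.cuda.get_device_properties(0)
    device_arch = getattr(props, "gcnArchName", None)  # assumed ROCm-only field
    if device_arch is None:
        return None
    return is_target_built(device_arch, torch.cuda.get_arch_list())


print("GPU target compiled into this build:", check_current_gpu())
```

A result of False would point to the same mismatch that the roc-obj-ls check reveals; None means the script could not determine the answer in this environment.
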

What if you are unable to access Docker or the GPU from user accounts?

Ensure that the user is added to the docker, video, and render Linux groups, as described in Configuring permissions for GPU access.

Can you install PyTorch directly on bare metal?

Bare-metal installation of PyTorch is supported through wheels. For more information, see Use a wheels package.

How do you profile PyTorch workloads?

Use the PyTorch Profiler as described in PyTorch Profiler to profile GPU kernels on ROCm.
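
As a starting point, the sketch below profiles a few matrix multiplications with `torch.profiler`. It assumes that GPU activity on ROCm is requested through `ProfilerActivity.CUDA`, in line with the torch.cuda naming used elsewhere in PyTorch; the function name `profile_matmul` and the workload are hypothetical.

```python
def profile_matmul(size=256, row_limit=5):
    """Profile a few matmuls and return a summary table, or None without torch."""
    try:
        import torch
        from torch.profiler import ProfilerActivity, profile
    except ImportError:
        return None

    use_gpu = torch.cuda.is_available()
    device = torch.device("cuda" if use_gpu else "cpu")

    activities = [ProfilerActivity.CPU]
    if use_gpu:
        # Assumption: this activity also covers AMD GPU kernels on ROCm builds.
        activities.append(ProfilerActivity.CUDA)

    a = torch.randn(size, size, device=device)
    b = torch.randn(size, size, device=device)

    with profile(activities=activities) as prof:
        for _ in range(5):
            torch.matmul(a, b)
        if use_gpu:
            torch.cuda.synchronize()  # wait for queued GPU work to finish

    return prof.key_averages().table(sort_by="cpu_time_total", row_limit=row_limit)


table = profile_matmul()
print(table if table is not None else "torch is not installed")
```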
