6.4.4 release known issues#

Note
ROCm 6.4.4 is a preview release, meaning that stability and performance are not yet optimized. Furthermore, only PyTorch is currently available on Windows; the rest of the ROCm stack is only supported on Linux.
AMD is aware of these issues and is actively working to resolve them in future releases.

Linux#

Known issues#

  • Increased memory consumption may be observed while running LLM FP8 inference workloads with PyTorch on Radeon™ RX 9060 XT graphics products.

  • Intermittent script failure may be observed while running Stable Diffusion FP16 inference workloads on some Radeon™ RX 7000 series graphics products.

  • Intermittent script failure may be observed while running text-to-image inference workloads with PyTorch.

  • Lower than expected performance may be observed while running Llama 3 8B inference workloads with Llama.cpp. Users experiencing this issue are recommended to try an older version of Llama.cpp as a temporary workaround.

  • Intermittent script failure may be observed while running Stable Diffusion workloads with ONNX Runtime and MIGraphX. Users experiencing this issue are recommended to use MXR files instead of rebuilding the model as a temporary workaround (see the example after this list).

  • Intermittent system or application crash may be observed while running Luxmark in conjunction with other compute workloads on Radeon™ RX 9000 series graphics products. Users experiencing this issue are recommended to shut down other applications and workloads while running Luxmark.

  • Intermittent script failure (out of memory) may be observed while running high-memory LLM workloads with multiple GPUs on Radeon™ RX 9060 series graphics products.

  • Intermittent script failure may be observed while running BERT training workloads with JAX.

  • Intermittent script failure may be observed while running Stable Diffusion 2.1 FP16 workloads with JAX.

  • Intermittent script failure (out of memory) may be observed while running Llama2 FP16 workloads with ONNX Runtime and MIGraphX.

  • Intermittent system or application crash may be observed while running TensorFlow ResNet50 training workloads.

  • Increased memory consumption may be observed while running TensorFlow ResNet50 training workloads.
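
For the MXR workaround mentioned above, one way to build the MXR file once and reuse it on later runs is the standalone MIGraphX Python API shown below. This is a minimal sketch; the model and file paths ("model.onnx", "model.mxr") are placeholders to adapt to your own pipeline.

# Compile the ONNX model once and serialize the compiled program to an MXR file.
import migraphx

prog = migraphx.parse_onnx("model.onnx")      # placeholder ONNX model path
prog.compile(migraphx.get_target("gpu"))
migraphx.save(prog, "model.mxr")              # placeholder MXR output path

# Later runs load the precompiled program instead of rebuilding the model.
prog = migraphx.load("model.mxr")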

Multi-GPU configuration#

AMD has identified common errors that can occur when running ROCm™ on Radeon™ multi-GPU configurations at this time, along with the applicable recommendations.

See mGPU known issues and limitations for the complete list.

Windows#

Note
If you encounter errors related to missing .dll libraries, install Visual C++ 2015-2022 Redistributables.

Known issues#

  • If you encounter an error relating to an Application Control Policy blocking DLL loading, check that Smart App Control is OFF. Note that to re-enable Smart App Control, you will need to reinstall Windows. A future release will remove this requirement.

  • Intermittent application crash or driver timeout may be observed while running inference workloads with PyTorch on Windows while also running other applications (such as games or web browsers).

  • Failure to launch may be observed after installation while running ComfyUI with Smart App Control enabled.

Limitations#

  • No backward pass support (essential for ML training).

  • Only Python 3.12 is supported.

  • On Windows, only PyTorch is supported, not the entire ROCm stack.

  • On Windows, the latest version of transformers should be installed via pip. Older versions of transformers (<4.55.5) might not be supported (see the version check after this list).

  • On Windows, only an LLM batch size of 1 is officially supported.

  • On Windows, the torch.distributed module is currently not supported. This may impact applications such as A1111, and some functions from the diffusers and accelerate modules may be affected.
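
To confirm the transformers requirement above, a minimal check from the same Python environment is shown below (install or upgrade with pip install -U transformers); the 4.55.5 threshold is taken from the limitation above.

# Print the installed transformers version; it should be at least 4.55.5.
import transformers

print(transformers.__version__)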

WSL#

Known issues#

  • Intermittent script failure may be observed while running Llama 3 inference workloads with vLLM in WSL2. End users experiencing this issue are recommended to follow the official vLLM setup instructions.

  • Intermittent script failure or driver timeout may be observed while running Stable Diffusion 3 inference workloads with JAX.

  • Lower than expected performance may be observed while running inference workloads with JAX in WSL2.

  • Intermittent script failure may be observed while running ResNet50, BERT, or InceptionV3 training workloads with ONNX Runtime.

  • Output error message (resource leak) may be observed while running Llama 3.2 workloads with vLLM.

  • Output error message (VaMgr) may be observed while running PyTorch workloads in WSL2.

  • Intermittent script failure or driver timeout may be observed while running Stable Diffusion inference workloads with TensorFlow.

  • Intermittent application crash may be observed while running Stable Diffusion workloads with ComfyUI and MIGraphX on Radeon™ RX 9060 series graphics products.

  • Intermittent script failure may occur while running Stable Diffusion 2 workloads with PyTorch and MIGraphX.

  • Intermittent script failure may occur while running LLM workloads with PyTorch on Radeon™ PRO W7700 graphics products.

  • Lower than expected performance (compared to native Linux) may be observed while running inference workloads (e.g., Llama 2, BERT) in WSL2.

Important!
Radeon™ PRO Series graphics cards are not designed nor recommended for datacenter usage. Use in a datacenter setting may adversely affect manageability, efficiency, reliability, and/or performance. GD-239.

Important!
ROCm is not officially supported on any mobile SKUs.

ROCm support in WSL environments#

Due to WSL architectural limitations for native Linux User Kernel Interface (UKI), rocm-smi is not supported.

Issue: UKI does not currently support rocm-smi.

Limitations: No current support for:

  • Active compute processes
  • GPU utilization
  • Modifiable state features

Running PyTorch in virtual environments

Running PyTorch in virtual environments requires a manual libhsa-runtime64.so update.

When using the WSL use case and the hsa-runtime-rocr4wsl-amdgpu package (installed with PyTorch wheels), users are required to update to a WSL-compatible runtime library.

Solution:

Enter the following commands:

# Locate the installed PyTorch package.
location=$(pip show torch | grep Location | awk -F ": " '{print $2}')
cd ${location}/torch/lib/
# Replace the bundled HSA runtime with the WSL-compatible runtime shipped with ROCm.
rm libhsa-runtime64.so*
cp /opt/rocm/lib/libhsa-runtime64.so.1.2 libhsa-runtime64.so
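
After replacing the library, a quick way to confirm that PyTorch in the virtual environment still sees the GPU is a minimal check such as the following (not part of the official instructions; run it from the same environment):

# Verify that the ROCm build of PyTorch detects the Radeon GPU.
import torch

print(torch.__version__)              # ROCm wheels typically report a "+rocm" suffix
print(torch.cuda.is_available())      # should print True if the GPU is visible via the HIP backend
print(torch.cuda.get_device_name(0))  # name of the detected GPU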