6.4.4 release known issues#

Note
ROCm 6.4.4 is a preview release, meaning that stability and performance are not yet optimized. Furthermore, only PyTorch is currently available on Windows; the rest of the ROCm stack is only supported on Linux.
AMD is aware of these issues and is actively working to resolve them in future releases.

Linux#

Known issues#

  • Increased memory consumption may be observed while running LLM FP8 inference workloads with PyTorch on Radeon™ RX 9060 XT graphics products.

  • Intermittent script failure may be observed while running Stable Diffusion FP16 inference workloads on some Radeon™ RX 7000 series graphics products.

  • Intermittent script failure may be observed while running text-to-image inference workloads with PyTorch.

  • Lower than expected performance may be observed while running Llama 3 8B inference workloads with Llama.cpp. Users experiencing this issue are recommended to try an older version of Llama.cpp as a temporary workaround.

  • Intermittent script failure may be observed while running Stable Diffusion workloads with ONNX Runtime and MIGraphX. Users experiencing this issue are recommended to use MXR files instead of rebuilding the model as a temporary workaround (see the example after this list).

  • Intermittent system or application crash may be observed while running Luxmark in conjunction with other compute workloads on Radeon™ RX 9000 series graphics products. Users experiencing this issue are recommended to shut down other applications and workloads while running Luxmark.

  • Intermittent script failure (out of memory) may be observed while running high-memory LLM workloads with multiple GPUs on Radeon™ RX 9060 series graphics products.

  • Intermittent script failure may be observed while running BERT training workloads with JAX.

  • Intermittent script failure may be observed while running Stable Diffusion 2.1 FP16 workloads with JAX.

  • Intermittent script failure (out of memory) may be observed while running Llama2 FP16 workloads with ONNX Runtime and MIGraphX.

  • Intermittent system or application crash may be observed while running TensorFlow ResNet50 training workloads.

  • Increased memory consumption may be observed while running TensorFlow ResNet50 training workloads.
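
For the MXR workaround mentioned above, one way to build the MXR file once and reuse it on later runs is the standalone MIGraphX Python API shown below. This is a minimal sketch; the model and file paths ("model.onnx", "model.mxr") are placeholders to adapt to your own pipeline.

# Compile the ONNX model once and serialize the compiled program to an MXR file.
import migraphx

prog = migraphx.parse_onnx("model.onnx")      # placeholder ONNX model path
prog.compile(migraphx.get_target("gpu"))
migraphx.save(prog, "model.mxr")              # placeholder MXR output path

# Later runs load the precompiled program instead of rebuilding the model.
prog = migraphx.load("model.mxr")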

Multi-GPU configuration#

AMD has identified common errors that can occur when running ROCm™ on Radeon™ multi-GPU configurations at this time, along with the applicable recommendations.

See mGPU known issues and limitations for the complete list.

Windows#

Note
If you encounter errors related to missing .dll libraries, install Visual C++ 2015-2022 Redistributables.

Known issues#

  • If you encounter an error relating to an Application Control Policy blocking DLL loading, check that Smart App Control is OFF. Note that to re-enable Smart App Control, you will need to reinstall Windows. A future release will remove this requirement.

  • Intermittent application crash or driver timeout may be observed while running inference workloads with PyTorch on Windows while also running other applications (such as games or web browsers).

  • Failure to launch may be observed after installation while running ComfyUI with Smart App Control enabled.

Limitations#

  • No backward pass support (essential for ML training).

  • Only Python 3.12 is supported.

  • On Windows, only PyTorch is supported, not the entire ROCm stack.

  • On Windows, the latest version of transformers should be installed via pip. Older versions of transformers (<4.55.5) might not be supported (see the version check after this list).

  • On Windows, only an LLM batch size of 1 is officially supported.

  • On Windows, the torch.distributed module is currently not supported. This may impact applications such as A1111, and some functions from the diffusers and accelerate modules may be affected.
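
To confirm the transformers requirement above, a minimal check from the same Python environment is shown below (install or upgrade with pip install -U transformers); the 4.55.5 threshold is taken from the limitation above.

# Print the installed transformers version; it should be at least 4.55.5.
import transformers

print(transformers.__version__)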

WSL#

Known issues#

  • Intermittent script failure may be observed while running Llama 3 inference workloads with vLLM in WSL2. End users experiencing this issue are recommended to follow the official vLLM setup instructions.

  • Intermittent script failure or driver timeout may be observed while running Stable Diffusion 3 inference workloads with JAX.

  • Lower than expected performance may be observed while running inference workloads with JAX in WSL2.

  • Intermittent script failure may be observed while running ResNet50, BERT, or InceptionV3 training workloads with ONNX Runtime.

  • Output error message (resource leak) may be observed while running Llama 3.2 workloads with vLLM.

  • Output error message (VaMgr) may be observed while running PyTorch workloads in WSL2.

  • Intermittent script failure or driver timeout may be observed while running Stable Diffusion inference workloads with TensorFlow.

  • Intermittent application crash may be observed while running Stable Diffusion workloads with ComfyUI and MIGraphX on Radeon™ RX 9060 series graphics products.

  • Intermittent script failure may occur while running Stable Diffusion 2 workloads with PyTorch and MIGraphX.

  • Intermittent script failure may occur while running LLM workloads with PyTorch on Radeon™ PRO W7700 graphics products.

  • Lower than expected performance (compared to native Linux) may be observed while running inference workloads (e.g., Llama 2, BERT) in WSL2.

Important!
Radeon™ PRO Series graphics cards are not designed nor recommended for datacenter usage. Use in a datacenter setting may adversely affect manageability, efficiency, reliability, and/or performance. GD-239.

Important!
ROCm is not officially supported on any mobile SKUs.

ROCm support in WSL environments#

Due to WSL architectural limitations for native Linux User Kernel Interface (UKI), rocm-smi is not supported.

Issue: UKI does not currently support rocm-smi.

Limitations: No current support for:

  • Active compute processes
  • GPU utilization
  • Modifiable state features

Running PyTorch in virtual environments

Running PyTorch in virtual environments requires a manual libhsa-runtime64.so update.

When using the WSL use case and the hsa-runtime-rocr4wsl-amdgpu package (installed with PyTorch wheels), users are required to update to a WSL-compatible runtime library.

Solution:

Enter the following commands:

# Locate the installed PyTorch package.
location=$(pip show torch | grep Location | awk -F ": " '{print $2}')
cd ${location}/torch/lib/
# Replace the bundled HSA runtime with the WSL-compatible runtime shipped with ROCm.
rm libhsa-runtime64.so*
cp /opt/rocm/lib/libhsa-runtime64.so.1.2 libhsa-runtime64.so
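
After replacing the library, a quick way to confirm that PyTorch in the virtual environment still sees the GPU is a minimal check such as the following (not part of the official instructions; run it from the same environment):

# Verify that the ROCm build of PyTorch detects the Radeon GPU.
import torch

print(torch.__version__)              # ROCm wheels typically report a "+rocm" suffix
print(torch.cuda.is_available())      # should print True if the GPU is visible via the HIP backend
print(torch.cuda.get_device_name(0))  # name of the detected GPU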