ROCm 7.0.2 release notes#

2025-10-10

Applies to Linux

The release notes provide a summary of notable changes since the previous ROCm release.

Note

If you’re using AMD Radeon GPUs or Ryzen APUs in a workstation setting with a display connected, see the Use ROCm on Radeon and Ryzen documentation to verify compatibility and system requirements.

Release highlights#

The following are notable new features and improvements in ROCm 7.0.2. For changes to individual components, see Detailed component changes.

Supported hardware, operating system, and virtualization changes#

ROCm 7.0.2 adds support for the RDNA4 architecture-based AMD Radeon RX 9060. For more information about supported AMD hardware, see Supported GPUs (Linux).

ROCm 7.0.2 adds support for the following operating systems and kernel versions:

  • Debian 13 (kernel: 6.12)

  • Oracle Linux 10 (kernel: 6.12.0 [UEK])

  • RHEL 10.0 (kernel: 6.12.0-55)

For more information about supported operating systems, see Supported operating systems and install instructions.

Virtualization support#

Virtualization support remains unchanged in this release. For more information, see Virtualization Support.

User space, driver, and firmware dependent changes#

The software for AMD Datacenter GPU products requires maintaining a hardware and software stack with interdependencies between the GPU and baseboard firmware, AMD GPU drivers, and the ROCm user space software.

| ROCm Version | GPU | PLDM Bundle (Firmware) | AMD GPU Driver (amdgpu) | AMD GPU Virtualization Driver (GIM) |
|---|---|---|---|---|
| ROCm 7.0.2 | MI355X | 01.25.15.04, 01.25.13.09 | 30.10.2, 30.10.1, 30.10 | 8.4.1.K |
| | MI350X | 01.25.15.04, 01.25.13.09 | 30.10.2, 30.10.1, 30.10 | |
| | MI325X | 01.25.04.02, 01.25.03.03 | 30.10.2, 30.10.1, 30.10, 6.4.z where z (0–3), 6.3.y where y (1–3) | |
| | MI300X | 01.25.05.00 (or later)[1], 01.25.03.12 | 30.10.2, 30.10.1, 30.10, 6.4.z where z (0–3), 6.3.y where y (0–3), 6.2.x where x (1–4) | 8.4.1.K |
| | MI300A | BKC 26, BKC 25 | | Not Applicable |
| | MI250X | IFWI 47 | | |
| | MI250 | MU3 w/ IFWI 73 | | |
| | MI210 | MU3 w/ IFWI 73 | | 8.4.0.K |
| | MI100 | VBIOS D3430401-037 | | Not Applicable |

[1]: PLDM bundle 01.25.05.00 will be available by October 31, 2025.

AMD Instinct MI300X GPU resiliency improvement#

Multimedia Engine Reset is now supported in AMD GPU Driver (amdgpu) 30.10.2 for AMD Instinct MI300X GPUs. This finer-grain GPU resiliency feature allows recovery from faults related to VCN or JPEG without requiring a full GPU reset, thereby improving system stability and fault tolerance. Note that VCN queue reset functionality requires PLDM bundle 01.25.05.00 (or later) firmware.

New OS support in ROCm dependent on AMD GPU Driver#

ROCm support for RHEL 10.0 and Oracle Linux 10 requires AMD GPU Driver 30.10.2 or later.

RAG AI support enabled for ROCm#

In September 2025, Retrieval-Augmented Generation (RAG) was added to the ROCm platform. Use RAG to build and deploy end-to-end AI pipelines on AMD GPUs. It enhances the accuracy and reliability of a large language model (LLM) by exposing it to up-to-date, relevant information. When queried, RAG retrieves relevant data from its knowledge base and uses it in conjunction with the query to generate accurate and informed responses. This approach minimizes hallucinations (the creation of false information) while also enabling the model to access current information not present in its original training data. For more information, see the ROCm-RAG documentation.

gsplat support enabled for ROCm#

gsplat is an open-source library for GPU-accelerated differentiable rasterization of 3D Gaussians (Gaussian splatting) with Python bindings. This ROCm-enabled release of gsplat is built on top of PyTorch for ROCm, enabling innovators in computer graphics, machine learning, and 3D vision to leverage GPU acceleration with AMD Instinct GPUs. With gsplat, you can build, research, and innovate with Gaussian splatting. To install gsplat on ROCm, see the installation instructions.

Introducing ROCm Life Science (ROCm-LS) toolkit#

The ROCm Life Science (ROCm-LS) toolkit is an open-source software collection for high-performance life science and healthcare applications built on the core ROCm platform. It helps you accelerate life science processing and analysis workloads on AMD GPUs. ROCm-LS is in an early access state. Running production workloads is not recommended. For more information, see the AMD ROCm-LS documentation.

ROCm-LS provides the following tools to build a complete workflow for life science acceleration on AMD GPUs:

  • The hipCIM library provides powerful support for GPU-accelerated I/O operations, coupled with an array of computer vision and image processing primitives designed for N-dimensional image data in fields such as biomedical imaging. For more information, see the hipCIM documentation.

  • MONAI for AMD ROCm, a ROCm-enabled version of MONAI, is built on top of PyTorch for AMD ROCm, helping healthcare and life science innovators to leverage GPU acceleration with AMD Instinct GPUs for high-performance inference and training of medical AI applications. For more information, see the MONAI for AMD ROCm documentation.

Deep learning and AI framework updates#

ROCm provides a comprehensive ecosystem for deep learning development. For more information, see Deep learning frameworks for ROCm and the Compatibility matrix for the complete list of deep learning and AI framework versions tested for compatibility with ROCm.

Updated framework support#

ROCm 7.0.2 introduces the following newly supported versions of deep learning and AI frameworks:

PyTorch#

ROCm 7.0.2 enables support for PyTorch 2.8.

New frameworks#

AMD ROCm has officially added support for the following deep learning and AI frameworks:

  • FlashInfer is a library and kernel generator for Large Language Models (LLMs) that provides high-performance implementations of graphics processing unit (GPU) kernels. FlashInfer focuses on LLM serving and inference, as well as advanced performance across diverse scenarios. It is supported on ROCm 6.4.1. For more information, see FlashInfer compatibility.

  • llama.cpp is an open-source framework for Large Language Model (LLM) inference that runs on both central processing units (CPUs) and graphics processing units (GPUs). It is written in plain C/C++, providing a simple, dependency-free setup. It is now supported on ROCm 7.0.0 and 6.4.x. For more information, see llama.cpp compatibility.

ROCm Offline Installer Creator updates#

The ROCm Offline Installer Creator 7.0.2 includes the following features and improvements:

  • Added support for RHEL 10.0, Oracle Linux 10, and Debian 13.

  • Added support for creating an offline installer for Debian 12 when the kernel version of the target operating system differs from that of the host creating the installer.

  • Removed the restriction requiring the kernels for the host and target systems to match when creating a ROCm-only (no AMD GPU Driver) offline installer.

See ROCm Offline Installer Creator for more information.

ROCm Runfile Installer updates#

The ROCm Runfile Installer 7.0.2 adds the following features and improvements:

  • Added support for RHEL 10.0, Oracle Linux 10, and Debian 13.

  • Fixed minor issues in the untar mode. For more information, see ROCm Runfile Installer.

ROCm documentation updates#

ROCm documentation continues to be updated to provide clearer and more comprehensive guidance for a wider variety of user needs and use cases.

ROCm components#

The following table lists the versions of ROCm components for ROCm 7.0.2, including any version changes from 7.0.1 to 7.0.2. Click the component’s updated version to go to a list of its changes.

| Category | Group | Name | Version |
|---|---|---|---|
| Libraries | Machine learning and computer vision | Composable Kernel | 1.1.0 |
| | | MIGraphX | 2.13.0 |
| | | MIOpen | 3.5.0 |
| | | MIVisionX | 3.3.0 |
| | | rocAL | 2.3.0 |
| | | rocDecode | 1.0.0 |
| | | rocJPEG | 1.1.0 |
| | | rocPyDecode | 0.6.0 |
| | | RPP | 2.0.0 |
| | Communication | RCCL | 2.26.6 ⇒ 2.26.6 |
| | | rocSHMEM | 3.0.0 |
| | Math | hipBLAS | 3.0.0 ⇒ 3.0.2 |
| | | hipBLASLt | 1.0.0 |
| | | hipFFT | 1.0.20 |
| | | hipfort | 0.7.0 |
| | | hipRAND | 3.0.0 |
| | | hipSOLVER | 3.0.0 |
| | | hipSPARSE | 4.0.1 |
| | | hipSPARSELt | 0.2.4 |
| | | rocALUTION | 4.0.0 |
| | | rocBLAS | 5.0.0 ⇒ 5.0.2 |
| | | rocFFT | 1.0.34 |
| | | rocRAND | 4.0.0 |
| | | rocSOLVER | 3.30.0 ⇒ 3.30.1 |
| | | rocSPARSE | 4.0.2 ⇒ 4.0.3 |
| | | rocWMMA | 2.0.0 |
| | | Tensile | 4.44.0 |
| | Primitives | hipCUB | 4.0.0 |
| | | hipTensor | 2.0.0 |
| | | rocPRIM | 4.0.0 ⇒ 4.0.1 |
| | | rocThrust | 4.0.0 |
| Tools | System management | AMD SMI | 26.0.0 ⇒ 26.0.2 |
| | | ROCm Data Center Tool | 1.1.0 |
| | | rocminfo | 1.0.0 |
| | | ROCm SMI | 7.8.0 |
| | | ROCm Validation Suite | 1.2.0 |
| | Performance | ROCm Bandwidth Test | 2.6.0 |
| | | ROCm Compute Profiler | 3.2.3 |
| | | ROCm Systems Profiler | 1.1.0 ⇒ 1.1.1 |
| | | ROCProfiler | 2.0.0 |
| | | ROCprofiler-SDK | 1.0.0 |
| | | ROCTracer | 4.1.0 |
| | Development | HIPIFY | 20.0.0 |
| | | ROCdbgapi | 0.77.3 ⇒ 0.77.4 |
| | | ROCm CMake | 0.14.0 |
| | | ROCm Debugger (ROCgdb) | 16.3 |
| | | ROCr Debug Agent | 2.1.0 |
| Compilers | | HIPCC | 1.1.1 |
| | | llvm-project | 20.0.0 |
| Runtimes | | HIP | 7.0.0 ⇒ 7.0.2 |
| | | ROCr Runtime | 1.18.0 |

Detailed component changes#

The following sections describe key changes to ROCm components.

Note

For a historical overview of ROCm component updates, see the ROCm consolidated changelog.

AMD SMI (26.0.2)#

Added#

  • Added the bad_page_threshold_exceeded field to amd-smi static --ras, which compares the retired page count against the bad page threshold. This field displays True if the retired pages exceed the threshold, False if they are within the threshold, or N/A if threshold data is unavailable. Note that sudo is required to populate the bad_page_threshold_exceeded field.
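
The three documented values of bad_page_threshold_exceeded can be sketched as a small C++ function. This is an illustration only, not AMD SMI code: the function name and parameters are hypothetical, and treating a count equal to the threshold as "within threshold" is an assumption based on the wording above.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper, not part of AMD SMI: mirrors the three documented
// values of the bad_page_threshold_exceeded field.
std::string bad_page_threshold_exceeded(std::uint64_t retired_pages,
                                        std::optional<std::uint64_t> threshold) {
    if (!threshold.has_value()) {
        return "N/A";  // threshold data is unavailable
    }
    // Assumption: "exceed" is strict, so a count equal to the threshold
    // is still reported as within the threshold.
    return retired_pages > *threshold ? "True" : "False";
}
```

For example, 5 retired pages against a threshold of 3 yields True, while a missing threshold yields N/A regardless of the count.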

Removed#

  • Removed gpuboard and baseboard temperature enums in the amdsmi Python library.

    • AmdSmiTemperatureType had issues with referencing the correct attribute. As such, the following duplicate enums have been removed:

      • AmdSmiTemperatureType.GPUBOARD_NODE_FIRST

      • AmdSmiTemperatureType.GPUBOARD_VR_FIRST

      • AmdSmiTemperatureType.BASEBOARD_FIRST

Resolved issues#

  • Fixed an attribute error in amd-smi monitor on Linux guest systems, where the violations argument caused the CLI to fail.

  • Fixed certain output in amd-smi monitor when GPUs are partitioned.

    • The fix applies to amd-smi monitor invocations such as amd-smi monitor -Vqt, amd-smi monitor -g 0 -Vqt -w 1, and amd-smi monitor -Vqt --file /tmp/test1. These commands now display output as normal in partitioned GPU scenarios.

  • Fixed an issue where using amd-smi ras --folder <folder_name> was forcing the created folder’s name to be lowercase. This fix also allows all string input options to be case insensitive.

  • Fixed an issue of some processes not being detected by AMD SMI despite making use of KFD resources. This fix, with the addition of KFD Fallback for process detection, ensures that all KFD processes will be detected.

  • Fixed multiple CPER issues:

    • Additional CPERs could not be queried after 20 were generated on a single device.

    • The RAS HBM CRC read failed due to an incorrect AFID value.

    • RAS injections did not consistently produce related CPERs.

HIP (7.0.2)#

Added#

  • Support for the hipMemAllocationTypeUncached flag, enabling developers to allocate uncached memory. This flag is now supported in the following APIs:

    • hipMemGetAllocationGranularity determines the recommended allocation granularity for uncached memory.

    • hipMemCreate allocates memory with uncached properties.

Resolved issues#

  • A compilation failure affecting applications that compile kernels using hiprtc with the compiler option -std=c++11.

  • A permission-related error occurred during the execution of hipLaunchHostFunc. This API is now supported and permitted to run during stream capture, aligning its behavior with CUDA.

  • A numerical error during graph capture of kernels that rely on a remainder in globalWorkSize, in frameworks like MIOpen and PyTorch, where the grid size is not a multiple of the block size. To ensure correct replay behavior, HIP runtime now stores this remainder in hip::GraphKernelNode during hipExtModuleLaunchKernel capture, enabling accurate execution and preventing corruption.

  • A page fault occurred during viewport rendering while running the file undo.blend in Blender. The HIP runtime now reuses the same context during image creation.

  • A segmentation fault in gpu_metrics, which is used in threshold logic for command submission patches to GPU device(s) during CPU synchronization.
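
To illustrate the graph-capture fix above, the following sketch shows the arithmetic behind a remainder in globalWorkSize. It is not the HIP runtime's implementation: the struct and function names are hypothetical, and it only demonstrates how many work-items would be lost if a replay covered whole blocks and dropped the remainder.

```cpp
#include <cstdint>

// Hypothetical illustration type; not a HIP runtime structure.
struct LaunchSplit {
    std::uint32_t full_blocks;  // blocks completely filled with work-items
    std::uint32_t remainder;    // work-items left over for a partial block
};

// Splits a global work size (counted in work-items) by the block size.
constexpr LaunchSplit split_global_work_size(std::uint32_t global_work_size,
                                             std::uint32_t block_size) {
    return LaunchSplit{global_work_size / block_size,
                       global_work_size % block_size};
}

// Work-items covered when only whole blocks are replayed and the
// remainder is not preserved.
constexpr std::uint32_t covered_without_remainder(std::uint32_t global_work_size,
                                                  std::uint32_t block_size) {
    return split_global_work_size(global_work_size, block_size).full_blocks *
           block_size;
}
```

With a global work size of 1000 and a block size of 256, the split is 3 full blocks plus a remainder of 232 work-items, so a replay that ignored the remainder would cover only 768 of the 1000 work-items.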

hipBLAS (3.0.2)#

Added#

  • Enabled support for gfx1150, gfx1151, gfx1200, and gfx1201 AMD hardware.

RCCL (2.26.6)#

Added#

  • Enabled double-buffering in reduceCopyPacks to trigger pipelining, especially to overlap bf16 arithmetic.

  • Added --force-reduce-pipeline as an option that can be passed to the install.sh script. Passing this option enables software-triggered pipelining for bfloat16 reductions (that is, all_reduce, reduce_scatter, and reduce).

rocBLAS (5.0.2)#

Added#

  • Enabled gfx1150 and gfx1151.

  • The ROCBLAS_USE_HIPBLASLT_BATCHED variable to independently control the batched hipblaslt backend. Set ROCBLAS_USE_HIPBLASLT_BATCHED=0 to disable batched GEMM use of the hipblaslt backend.
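
As a rough sketch, an application could set the variable for its own process before it first uses rocBLAS. The helper names below are hypothetical, setenv is the POSIX call (this release applies to Linux), and the point at which rocBLAS reads the variable is an assumption; exporting it in the shell before launching the application avoids that uncertainty.

```cpp
#include <cstdlib>   // std::getenv
#include <stdlib.h>  // setenv (POSIX)
#include <string>

// Hypothetical helper: requests that batched GEMM not use the hipBLASLt
// backend. Assumes rocBLAS reads the variable after this call is made.
bool disable_batched_hipblaslt() {
    // The third argument (1) overwrites any value already in the environment.
    return setenv("ROCBLAS_USE_HIPBLASLT_BATCHED", "0", 1) == 0;
}

// Returns the current setting, or an empty string if the variable is unset.
std::string batched_hipblaslt_setting() {
    const char* value = std::getenv("ROCBLAS_USE_HIPBLASLT_BATCHED");
    return value ? std::string(value) : std::string();
}
```

Calling disable_batched_hipblaslt() at the start of main, before any rocBLAS call, keeps the setting local to that process.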

Resolved issues#

  • Set the imaginary portion of the main diagonal of the output matrix to zero in syrk and herk.

ROCdbgapi (0.77.4)#

Added#

  • ROCdbgapi documentation link in the README.md file.

ROCm Systems Profiler (1.1.1)#

Resolved issues#

  • Fixed an issue where ROC-TX ranges were displayed as two separate events instead of a single spanning event.

rocPRIM (4.0.1)#

Resolved issues#

  • Fixed compilation issue when using rocprim::texture_cache_iterator.

  • Fixed a HIP version check used to determine whether hipStreamLegacy is supported. This resolves runtime errors that occur when hipStreamLegacy is used in ROCm 7.0.0 and later.

rocSPARSE (4.0.3)#

Resolved issues#

  • Fixed an issue causing premature deallocation of internal buffers while still in use.

rocSOLVER (3.30.1)#

Optimized#

Improved the performance of:

  • LARFT and downstream functions such as GEQRF and ORMTR.

  • LARF and downstream functions such as GEQR2.

  • ORMTR and downstream functions such as SYEVD.

  • GEQR2 and downstream functions such as GEQRF.

ROCm known issues#

ROCm known issues are noted on GitHub. For known issues related to individual components, review the Detailed component changes.

ROCm debugging tools might become unresponsive in SELinux-enabled distributions#

Red Hat Enterprise Linux (RHEL) and related distributions automatically enable a security feature named Security-Enhanced Linux (SELinux), which may prevent ROCm debugging tools, such as ROCgdb, ROCdbgapi, and ROCR Debug Agent, from working correctly.

The problem occurs when attempting to debug a program that contains code that runs on the GPU. The debugging session might become unresponsive while attempting to reach a breakpoint or executing instruction-stepping in device code. ROCgdb will still be responsive and accept interruptions by pressing Control+C, but the breakpoint in device code won’t be hit, and the instruction-stepping operation will not be completed.

The ROCR Debug Agent might also become unresponsive when attempting to capture data from a program that is experiencing queue errors, memory faults, or other triggering events.

For a detailed workaround, see the Installation troubleshooting documentation. This issue will be fixed in a future ROCm release. See GitHub issue #5498.

MIGraphX Python API will fail when running on Python 3.13#

Applications using the MIGraphX Python API fail when running on Python 3.13 and return the error message AttributeError: module 'migraphx' has no attribute 'parse_onnx'. The issue does not occur when you build MIGraphX manually. For detailed instructions, see Building from source. As a workaround, switch to a Python version that matches a MIGraphX Python module found in the installed location, which you can list with the following command:

ls -l /opt/rocm-7.0.0/lib/libmigraphx_py_*.so

The issue will be resolved in a future ROCm release. See GitHub issue #5500.

Applications using OpenCV might fail due to package incompatibility between operating systems#

OpenCV packages built on Ubuntu 24.04 are incompatible with Debian 13 due to a version conflict. As a result, applications, tests, and samples that use OpenCV might fail. As a workaround, rebuild OpenCV from source with the version corresponding to Debian 13, then rebuild MIVisionX or any other application that uses OpenCV on top of it. This issue will be fixed in a future ROCm release. See GitHub issue #5501.

ROCm upcoming changes#

The following changes to the ROCm software stack are anticipated for future releases.

ROCm Execution Provider (ROCm-EP) deprecation#

ROCm 7.0.2 is the last official AMD-supported distribution of ROCm Execution Provider (ROCm-EP). ROCm-EP will be removed from all upcoming ROCm releases. Refer to this Pull Request for more information. Migrate your applications to use the MIGraphX Execution Provider.

ROCm SMI deprecation#

ROCm SMI will be phased out in an upcoming ROCm release and will enter maintenance mode. After this transition, only critical bug fixes will be addressed and no further feature development will take place.

It’s strongly recommended to transition your projects to AMD SMI, the successor to ROCm SMI. AMD SMI includes all the features of the ROCm SMI and will continue to receive regular updates, new functionality, and ongoing support. For more information on AMD SMI, see the AMD SMI documentation.

ROCTracer, ROCProfiler, rocprof, and rocprofv2 deprecation#

Development and support for ROCTracer, ROCProfiler, rocprof, and rocprofv2 are being phased out in favor of ROCprofiler-SDK in upcoming ROCm releases. Starting with ROCm 6.4, only critical defect fixes are addressed for older versions of the profiling tools and libraries. All users are encouraged to upgrade to the latest version of the ROCprofiler-SDK library and the rocprofv3 tool to ensure continued support and access to new features. ROCprofiler-SDK is still in beta today and will be production-ready in a future ROCm release.

It’s anticipated that ROCTracer, ROCProfiler, rocprof, and rocprofv2 will reach end-of-life in a future release, around Q1 of 2026.

AMDGPU wavefront size compiler macro deprecation#

Access to the wavefront size as a compile-time constant via the __AMDGCN_WAVEFRONT_SIZE and __AMDGCN_WAVEFRONT_SIZE__ macros is deprecated and will be disabled in a future release. As of ROCm 7.0.0, warpSize is only available as a non-constexpr variable. You’re encouraged to update your code if needed to ensure future compatibility.

  • The __AMDGCN_WAVEFRONT_SIZE__ macro and __AMDGCN_WAVEFRONT_SIZE alias will be removed in an upcoming release. It is recommended to remove any use of this macro. For more information, see AMDGPU support.

  • warpSize is only available as a non-constexpr variable. Where required, the wavefront size should be queried via the warpSize variable in device code, or via hipGetDeviceProperties in host code. Neither of these will result in a compile-time constant. For more information, see warpSize.

  • For cases where compile-time evaluation of the wavefront size cannot be avoided, uses of __AMDGCN_WAVEFRONT_SIZE, __AMDGCN_WAVEFRONT_SIZE__, or warpSize can be replaced with a user-defined macro or constexpr variable with the wavefront size(s) for the target hardware. For example:

   #if defined(__GFX9__)
   #define MY_MACRO_FOR_WAVEFRONT_SIZE 64
   #else
   #define MY_MACRO_FOR_WAVEFRONT_SIZE 32
   #endif
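
The constexpr-variable alternative mentioned above might look like the following sketch. The names kMyWavefrontSize and wavefronts_per_block are hypothetical, and the target check is the same one used in the macro example. When the code is compiled where __GFX9__ is not defined, the fallback value of 32 applies.

```cpp
// Hypothetical name; same target selection as the macro-based example.
#if defined(__GFX9__)
constexpr unsigned kMyWavefrontSize = 64;
#else
constexpr unsigned kMyWavefrontSize = 32;
#endif

// Compile-time helper: number of wavefronts needed to cover a block of
// block_size threads, rounding up for a partially filled wavefront.
constexpr unsigned wavefronts_per_block(unsigned block_size) {
    return (block_size + kMyWavefrontSize - 1) / kMyWavefrontSize;
}

// Usable wherever a constant expression is required.
static_assert(wavefronts_per_block(kMyWavefrontSize) == 1,
              "a block of exactly one wavefront needs one wavefront");
```

A constexpr variable is type-checked and scoped, which a macro is not, but like the macro it encodes an assumption about the target hardware that you must keep in sync with the architectures you build for.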

Changes to ROCm Object Tooling#

The ROCm Object Tooling tools roc-obj-ls, roc-obj-extract, and roc-obj were deprecated in ROCm 6.4 and will be removed in a future release. Functionality has been added to the llvm-objdump --offloading tool option to extract all clang-offload-bundles found within the objects or executables passed as input into individual code objects. The llvm-objdump --offloading tool option also supports the --arch-name option, which restricts extraction to code objects for the specified target architecture. See llvm-objdump for more information.