ROCm 7.2.3 release notes#
2026-05-04
The release notes provide a summary of notable changes since the previous ROCm release.
Note
If you’re using AMD Radeon™ GPUs or Ryzen™ for graphics workloads, see the Use ROCm on Radeon and Ryzen documentation to verify compatibility and system requirements.
Release highlights#
The following are notable new features and improvements in ROCm 7.2.3. For changes to individual components, see Detailed component changes.
Supported hardware, operating system, and virtualization changes#
Hardware, operating system, and virtualization support remains unchanged in this release.
For more information about:
AMD hardware, see Supported GPUs (Linux).
Operating systems, see Supported operating systems and ROCm installation for Linux.
Virtualization support, see Virtualization support.
User space, driver, and firmware dependent changes#
The software for AMD Data Center GPU products requires maintaining a hardware and software stack with interdependencies among the GPU and baseboard firmware, AMD GPU drivers, and the ROCm user space software. While AMD publishes drivers and ROCm user space components, your server or infrastructure provider publishes the GPU and baseboard firmware by bundling AMD’s firmware releases via the AMD Platform Level Data Model (PLDM) bundle, which includes the Integrated Firmware Image (IFWI).
GPU and baseboard firmware versioning might differ across GPU families.
| ROCm Version | GPU | PLDM Bundle (Firmware) | AMD GPU Driver (amdgpu) | AMD GPU |
|---|---|---|---|---|
| ROCm 7.2.3 | MI355X | 01.26.00.02, 01.25.17.07, 01.25.16.03 | 30.30.x where x (0-3), 30.20.x where x (0-1), 30.10.x where x (0-2) | 8.7.1.K |
| | MI350X | 01.26.00.02, 01.25.17.07, 01.25.16.03 | 30.30.x where x (0-2), 30.20.x where x (0-1), 30.10.x where x (0-2) | |
| | MI325X[1] | 01.25.06.08, 01.25.04.02 | 30.30.x where x (0-2), 30.20.x where x (0-1)[1], 30.10.x where x (0-2), 6.4.z where z (0-3), 6.3.3 | |
| | MI300X[2] | 01.25.06.04, 01.25.03.12, 01.25.02.04 | 30.30.x where x (0-2), 30.20.x where x (0-1), 30.10.x where x (0-2), 6.4.z where z (0-3), 6.3.3 | 8.7.1.K |
| | MI300A | BKC 26.1 | | Not Applicable |
| | MI250X | IFWI 47 (or later) | | |
| | MI250 | MU5 w/ IFWI 75 (or later) | | |
| | MI210 | MU5 w/ IFWI 75 (or later) | | 8.7.1.K |
| | MI100 | VBIOS D3430401-037 | | Not Applicable |
[1]: For AMD Instinct MI325X KVM SR-IOV users, don't use AMD GPU driver (amdgpu) 30.20.0.
[2]: AMD Instinct MI300X KVM SR-IOV with Multi-VF (8 VF) support requires a compatible firmware BKC bundle, which will be released in the coming months.
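The "30.30.x where x (0-3)" notation in the driver column denotes an inclusive range of patch versions. As a minimal sketch of how such a range could be checked in a deployment script, the following uses the MI355X values from the table above. The dictionary and function names are hypothetical and are not part of any ROCm tool.

```python
# Supported amdgpu driver ranges for MI355X with ROCm 7.2.3, copied from the
# table above: (major, minor) -> inclusive range of patch versions.
# These names are illustrative only; no ROCm tool exposes this API.
MI355X_DRIVER_RANGES = {
    (30, 30): (0, 3),
    (30, 20): (0, 1),
    (30, 10): (0, 2),
}


def is_supported_driver(version, ranges=MI355X_DRIVER_RANGES):
    """Return True if a 'major.minor.patch' version falls inside a listed range."""
    parts = version.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        return False  # not in the three-part numeric form used by the table
    major, minor, patch = (int(p) for p in parts)
    bounds = ranges.get((major, minor))
    if bounds is None:
        return False  # this major.minor series is not listed for the GPU
    low, high = bounds
    return low <= patch <= high
```

Other GPU rows could be handled by passing a different dictionary as the second argument.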
Improved profiling accuracy for vLLM workloads#
ROCm 7.2.3 improves profiling stability for vLLM workloads traced with PyTorch torch.profiler. The large, sporadic idle gaps that previously appeared between GPU kernels in the trace have been substantially reduced in common configurations, and the traces now more accurately reflect actual runtime behavior. Coverage may vary depending on model and parallelism settings; additional improvements are in progress.
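To see whether a given trace still contains idle gaps, one option is to export the trace to JSON (torch.profiler provides export_chrome_trace for this) and measure the time between consecutive GPU kernel events. The sketch below assumes the exported file follows the Chrome trace layout: a top-level traceEvents list whose complete events have ph set to "X", and whose GPU kernels carry the category "kernel". The category name is an assumption about the exporter, and the function name is hypothetical.

```python
def kernel_idle_gaps(trace, min_gap_us=0.0):
    """Return idle gaps, in microseconds, between consecutive GPU kernel events.

    `trace` is the parsed JSON object of a Chrome-format trace. Kernels from
    all streams are merged, so a gap means that no kernel at all was running.
    """
    # Collect (start, end) intervals for kernel events, ordered by start time.
    # Assumption: GPU kernels are complete events ("ph" == "X") with "cat" == "kernel".
    kernels = sorted(
        (float(ev["ts"]), float(ev["ts"]) + float(ev["dur"]))
        for ev in trace.get("traceEvents", [])
        if ev.get("ph") == "X" and ev.get("cat") == "kernel"
    )
    gaps = []
    busy_until = None  # latest end time seen so far
    for start, end in kernels:
        if busy_until is not None and start - busy_until > min_gap_us:
            gaps.append(start - busy_until)
        busy_until = end if busy_until is None else max(busy_until, end)
    return gaps
```

A trace file can be loaded with json.load and passed to the function. Raising min_gap_us filters out the small gaps that are expected between back-to-back kernel launches.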
MIGraphX update#
MIGraphX has the following enhancements:
Improved performance of the Gather operator#
Performance for embedding-heavy inference workloads is improved by merging multiple independent gather operations from similar embedding tables into a single batched operation. This horizontal fusion of cross-embedding gather operators lets multi-gather workloads run with fewer kernel launches and less memory traffic. The gather operators have also been updated to use transpose/reshape/broadcast/slice, enabling better optimization across different backends and data layouts.
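The idea behind fusing independent gathers can be illustrated in plain Python. This is a conceptual sketch only and does not reflect how MIGraphX implements the optimization: it assumes the tables can be concatenated row-wise and that each gather's indices are shifted by its table's starting row. All function names are hypothetical.

```python
def gather(table, indices):
    """Row gather: pick the rows of `table` at the given indices."""
    return [table[i] for i in indices]


def fused_gather(tables, index_lists):
    """Serve several independent gathers with one lookup over a combined table."""
    combined = []
    offsets = []
    for table in tables:
        offsets.append(len(combined))  # starting row of this table in `combined`
        combined.extend(table)
    # Shift each gather's indices by its table's starting row.
    flat_indices = [
        offset + i
        for offset, indices in zip(offsets, index_lists)
        for i in indices
    ]
    rows = gather(combined, flat_indices)  # the single batched gather
    # Split the batched result back into one output per original gather.
    outputs, pos = [], 0
    for indices in index_lists:
        outputs.append(rows[pos:pos + len(indices)])
        pos += len(indices)
    return outputs
```

The fused version returns the same rows as calling gather once per table, but performs one lookup pass instead of several. On a GPU, that corresponds to fewer kernel launches.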
ONNX Runtime reliability improvement#
ONNX Runtime workloads accelerated with MIGraphX now provide a more reliable experience through external stream support in the MIGraphX Execution Provider, with improved memory allocation and deallocation for multi-stream inference.
ROCm documentation updates#
ROCm documentation now includes ROCm XIO documentation. ROCm XIO provides an API for Accelerator-Initiated IO (XIO) in AMD GPU __device__ code. It enables AMD GPUs to perform direct IO operations to hardware devices without CPU intervention. ROCm XIO was initially released in April 2026 as an early-access software technology preview. Running production workloads is not recommended.
For more information, see the ROCm XIO documentation and ROCm/rocm-xio GitHub repository.
ROCm components#
The following table lists the versions of ROCm components for ROCm 7.2.3, including any version changes from 7.2.2/7.2.1 to 7.2.3. Click the component’s updated version to go to a list of its changes.
| Category | Group | Name | Version |
|---|---|---|---|
| Libraries | Machine learning and computer vision | Composable Kernel | 1.2.0 |
| | | MIGraphX | 2.15.0 ⇒ 2.15.0 |
| | | MIOpen | 3.5.1 |
| | | MIVisionX | 3.5.0 |
| | | rocAL | 2.5.0 |
| | | rocDecode | 1.7.0 |
| | | rocJPEG | 1.4.0 |
| | | rocPyDecode | 0.8.0 |
| | | RPP | 2.2.1 |
| | Communication | RCCL | 2.27.7 |
| | | rocSHMEM | 3.2.0 |
| | Math | hipBLAS | 3.2.0 |
| | | hipBLASLt | 1.2.2 |
| | | hipFFT | 1.0.22 |
| | | hipfort | 0.7.1 |
| | | hipRAND | 3.1.0 |
| | | hipSOLVER | 3.2.0 |
| | | hipSPARSE | 4.2.0 |
| | | hipSPARSELt | 0.2.6 |
| | | rocALUTION | 4.1.0 |
| | | rocBLAS | 5.2.0 |
| | | rocFFT | 1.0.36 |
| | | rocRAND | 4.2.0 |
| | | rocSOLVER | 3.32.0 |
| | | rocSPARSE | 4.2.0 |
| | | rocWMMA | 2.2.0 |
| | | Tensile | 4.45.0 |
| | Primitives | hipCUB | 4.2.0 |
| | | hipTensor | 2.2.0 |
| | | rocPRIM | 4.2.0 |
| | | rocThrust | 4.2.0 |
| Tools | System management | AMD SMI | 26.2.2 |
| | | ROCm Data Center Tool | 1.2.0 |
| | | rocminfo | 1.0.0 |
| | | ROCm SMI | 7.8.0 |
| | | ROCm Validation Suite | 1.3.0 |
| | Performance | ROCm Bandwidth Test | 2.6.0 |
| | | ROCm Compute Profiler | 3.4.0 |
| | | ROCm Systems Profiler | 1.3.0 |
| | | ROCProfiler | 2.0.0 |
| | | ROCprofiler-SDK | 1.1.0 |
| | | ROCTracer | 4.1.0 |
| | Development | HIPIFY | 22.0.0 |
| | | ROCdbgapi | 0.77.4 |
| | | ROCm CMake | 0.14.0 |
| | | ROCm Debugger (ROCgdb) | 16.3 |
| | | ROCr Debug Agent | 2.1.0 |
| Compilers | | HIPCC | 1.1.1 |
| | | llvm-project | 22.0.0 |
| Runtimes | | HIP | 7.2.1 |
| | | ROCr Runtime | 1.18.0 |
Detailed component changes#
The following sections describe key changes to ROCm components.
Note
For a historical overview of ROCm component updates, see the ROCm consolidated changelog.
MIGraphX (2.15.0)#
Added#
External stream support to the MIGraphX context, allowing external HIP streams to be used during execution.
Ability to return a vector for output alias, supporting operators like make_tuple.
Changed#
Refactored move_output_instructions_after into the module class.
Updated rocMLIR to fix bert_squad and bert_tf regressions.
Optimized#
Rewrote the gather operator to use transpose/reshape/broadcast/slice for improved performance.
Horizontally fused cross-embedding gather operators.
Improved tuning for Split-K.
Removed extra assignments and inserts in find_nop_reshapes to reduce overhead.
Resolved issues#
The following issues have been fixed:
int to bf16/fp16 conversion errors.
Comparison logic in find_concat_op to match the correct I/O.
shape_transform_descriptor::rebase when flattening a broadcasted dimension.
An error with rewrite_reshapes.
A gather rewrite crash by validating the strided view element count.
A bug in gather rewrite with NHWC shapes.
A crash in rocMLIR with Inception v3 on RDNA3 architecture-based Radeon GPUs.
Filter zero-argument operators during ONNX parsing to prevent errors.
Conflict for missing no_broadcast parameter on ROCm 7.2.x.
ROCm known issues#
ROCm known issues are noted on GitHub. For known issues related to individual components, review the Detailed component changes.
Minor performance regression for MIGraphX with int8-quantized models#
You might observe a slight performance regression when running int8-quantized models with MIGraphX. This impact is generally minimal and does not affect correctness. However, workloads sensitive to peak throughput might have reduced performance when compared to non-quantized or alternative execution paths. This issue is currently under investigation and will be fixed in a future ROCm release. See GitHub issue #6195.
ROCm upcoming changes#
The following changes to the ROCm software stack are anticipated for future releases.
ROCTracer, ROCProfiler, rocprof, and rocprofv2 deprecation#
ROCTracer, ROCProfiler, rocprof, and rocprofv2 are deprecated. It’s strongly recommended to upgrade to the latest version of the ROCprofiler-SDK library and the rocprofv3 tool to ensure continued support and access to new features.
To learn about key feature improvements and benefits of ROCprofiler-SDK over the deprecated ROCProfiler and ROCTracer, see Comparing ROCprofiler-SDK to legacy ROCm profiling tools.
It’s anticipated that ROCTracer, ROCProfiler, rocprof, and rocprofv2 will reach end of support (EoS) by the end of 2026 Q2.
ROCm SMI deprecation#
ROCm SMI will be phased out in an upcoming ROCm release and will enter maintenance mode. After this transition, only critical bug fixes will be addressed and no further feature development will take place.
It’s strongly recommended to transition your projects to AMD SMI, the successor to ROCm SMI. AMD SMI includes all the features of ROCm SMI and will continue to receive regular updates, new functionality, and ongoing support. For more information on AMD SMI, see the AMD SMI documentation.
Changes to ROCm Object Tooling#
The ROCm Object Tooling tools roc-obj-ls, roc-obj-extract, and roc-obj were
deprecated in ROCm 6.4 and will be removed in a future release. The
llvm-objdump --offloading option now extracts all clang-offload-bundles found
within the objects or executables passed as input into individual code
objects. It also supports the --arch-name option, which restricts extraction
to code objects for the specified target architecture. See llvm-objdump
for more information.
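Scripts that previously called roc-obj-extract could switch to the llvm-objdump options described above. The following is a minimal sketch of building and running that command from Python. The helper names, the binary path, and the gfx942 architecture shown in the usage note are illustrative assumptions, and the output format of the tool is not described here.

```python
import subprocess


def offload_extract_command(binary, arch=None, tool="llvm-objdump"):
    """Build the argument list for extracting offload bundles from `binary`."""
    cmd = [tool, "--offloading"]
    if arch is not None:
        # Restrict extraction to code objects for one target architecture.
        cmd.append(f"--arch-name={arch}")
    cmd.append(binary)
    return cmd


def extract_code_objects(binary, arch=None):
    """Run the command and return its standard output.

    Raises subprocess.CalledProcessError if the tool exits with an error.
    """
    result = subprocess.run(
        offload_extract_command(binary, arch),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

For example, offload_extract_command("./my_app", arch="gfx942") yields the arguments llvm-objdump --offloading --arch-name=gfx942 ./my_app, where ./my_app stands in for a real executable.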