ROCm Optiq (Beta) release history#
2026-03-27
| Version | Release date |
|---|---|
| 0.3.0 | March 26, 2026 |
| 0.2.0 | February 11, 2026 |
| 0.1.0 | December 10, 2025 |
ROCm Optiq (Beta) 0.3.0#
Added#
Support for visualizing ROCm Compute Profiler data in ROCm Optiq. New features include:
Summary View: Shows a high-level overview of the captured data.
Table: Lists the top 10 longest-running kernels sorted by Total Execution Time.
Charts: Plot duration and invocation statistics across kernels.
Roofline Chart: Plots kernel performance against empirical hardware ceilings to reveal the dominant performance bottleneck for all kernels.
Kernel Details: Displays details of each kernel.
Kernel Selection Table: Lists kernels with GPU metrics. Use Add Metric to append additional GPU metric columns. The per-column search box accepts names or metric expressions (for example, metric > threshold). Click Apply Filters to execute; combine multiple filters to narrow the analysis.
Memory Chart: Shows memory transactions and throughput per cache hierarchy level for the selected kernel.
System Speed-of-Light: Displays key kernel-level performance metrics with unit, average, peak, and percentage of peak values.
Kernel Roofline Chart: Shows a kernel-specific roofline analysis to determine whether a kernel is compute-bound or memory-bound. Click the gear icon to access customization options.
Table View: Provides a complete list of available metrics for the selected kernel.
Workload Details: Provides contextual information about the workload.
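The compute-bound versus memory-bound distinction that the roofline charts draw can be sketched with the standard roofline model. This is a minimal illustration, not ROCm Optiq's implementation: the function names and the peak values in the comments are hypothetical, and real ceilings come from the empirical measurements in the profile.

```python
def attainable_gflops(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Roofline ceiling: the lower of the compute roof and the memory roof.

    arithmetic_intensity is in FLOP per byte, peak_gflops in GFLOP/s,
    and peak_bandwidth_gbs in GB/s.
    """
    memory_roof = arithmetic_intensity * peak_bandwidth_gbs
    return min(peak_gflops, memory_roof)


def classify_kernel(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Label a kernel by which side of the ridge point it falls on."""
    # The ridge point is the intensity at which the two roofs meet.
    ridge_point = peak_gflops / peak_bandwidth_gbs
    if arithmetic_intensity >= ridge_point:
        return "compute-bound"
    return "memory-bound"


# Hypothetical ceilings: 1000 GFLOP/s compute and 100 GB/s memory bandwidth,
# which puts the ridge point at 10 FLOP/byte.
low_intensity_label = classify_kernel(2.0, 1000.0, 100.0)
high_intensity_label = classify_kernel(50.0, 1000.0, 100.0)
```

A kernel whose measured performance sits well below its ceiling on either side of the ridge point is limited by something other than the roof itself, which is where the per-kernel metrics become useful.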
Changed#
Changes in ROCm Optiq for visualizing ROCm System Profiler traces:
System Topology tree was restructured to show hardware and software topologies.
Memory allocation activity tracks are now displayed in Timeline and System Topology Views.
RPD files now populate the System Topology View.
Multi-node support: Added time normalization for multi-node configurations.
Known issues#
Metrics that reference None return N/A#
If a metric expression contains None, ROCm Compute Profiler may ignore the metric value even when it isn’t None. As a result, ROCm Optiq displays N/A for the affected metrics, which are listed below by panel:
System Speed of Light (0200)
VALU Active Threads
LDS Bank Conflicts/Access
vL1D Cache Hit Rate
L2 Cache Hit Rate
L2-Fabric Read Latency
L2-Fabric Write Latency
sL1D Cache Hit Rate
L1I Fetch Latency
Memory Chart (0300)
LDS Latency
VL1 Hit
VL1 Lat
VL1 Coalesce
VL1 Stall
sL1D Hit
sL1D Lat
IL1 Lat
L2 Rd Lat
L2 Wr Lat
Command Processor CPC/CPF (0500)
CPF Utilization
CPF Stall
CPF-L2 Utilization
CPF-L2 Stall
CPF-UTCL1 Stall
CPC SYNC FIFO Full Rate
CPC CANE Stall Rate
CPC ADC Utilization
CPC Utilization
CPC Stall Rate
CPC Packet Decoding Utilization
CPC-Workgroup Manager Utilization
CPC-L2 Utilization
CPC-UTCL1 Stall
CPC-UTCL2 Utilization
Workgroup Manager SPI (0600)
VGPR Writes
SGPR Writes
Not-scheduled Rate (Workgroup Manager)
Not-scheduled Rate (Scheduler-Pipe)
Scheduler-Pipe FIFO Full Rate
Scheduler-Pipe Stall Rate
Scratch Stall Rate
Compute Units Compute Pipeline (1100)
VALU Active Threads
MFMA Instruction Cycles
VMEM Latency
SMEM Latency
Local Data Share LDS (1200)
Bank Conflict Rate
LDS Latency
Bank Conflicts/Access
Scalar L1 Data Cache (1400)
Cache Hit Rate
Vector L1 Data Cache (1600)
Hit rate
Utilization
Coalescing
Stalled on L2 Data
Stalled on L2 Req
Stalled on Address
Stalled on Data
Stalled on Latency FIFO
Stalled on Request FIFO
Stalled on Read Return
Tag RAM Stall (Read)
Tag RAM Stall (Write)
Tag RAM Stall (Atomic)
Cache Hit Rate
Hit Ratio
L2 Cache (1700)
HBM Read Traffic
Remote Read Traffic
Uncached Read Traffic
HBM Write and Atomic Traffic
Remote Write and Atomic Traffic
Atomic Traffic
Uncached Write and Atomic Traffic
Read Latency
Write and Atomic Latency
Atomic Latency
Read Stall
Write Stall
Cache Hit
Read - PCIe Stall
Read - Infinity Fabric Stall
Read - HBM Stall
Write - PCIe Stall
Write - Infinity Fabric Stall
Write - HBM Stall
Write - Credit Starvation
workload_name is missing in sysinfo.csv when using --output-directory#
When you profile with the --output-directory option, the workload_name column in sysinfo.csv might be empty. This can prevent views in the ROCm Compute Profiler analysis database from joining tables based on workload_name, which makes system information unavailable.
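To check whether a given profile is affected, you can inspect sysinfo.csv directly. The sketch below assumes only what is stated above (a CSV file named sysinfo.csv with a workload_name column); the function name and the example path in the comment are hypothetical.

```python
import csv


def workload_name_is_missing(sysinfo_path):
    """Return True if any row of sysinfo.csv lacks a workload_name value.

    Also returns True when the file has no data rows or no workload_name
    column at all, since neither case can support a join on that column.
    """
    with open(sysinfo_path, newline="") as f:
        rows = list(csv.DictReader(f))
    if not rows:
        return True
    return any(not (row.get("workload_name") or "").strip() for row in rows)


# Example call (the path is hypothetical):
# workload_name_is_missing("my_output_dir/sysinfo.csv")
```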
ROCm Optiq (Beta) 0.2.0#
Added#
Summary View: Displays the top ten kernels by execution time using pie charts, bar charts, or tables.
Minimap: Provides a compact overview of event density and counter values across the entire trace, enabling rapid navigation of large datasets.
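The ranking behind the Summary View's top-ten list can be approximated as follows. This is a sketch under stated assumptions, not the tool's implementation: it assumes each dispatch is a (kernel name, duration) pair and that "execution time" means the sum of durations per kernel name. The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def top_kernels(dispatches, n=10):
    """Return the n kernels with the largest total execution time.

    dispatches is an iterable of (kernel_name, duration_ns) pairs.
    The result is a list of (kernel_name, total_duration_ns) tuples,
    sorted from longest to shortest total.
    """
    totals = defaultdict(float)
    for name, duration_ns in dispatches:
        totals[name] += duration_ns
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Hypothetical dispatch records: two kernels, one of them invoked twice.
sample = [("gemm", 5), ("reduce", 10), ("gemm", 7)]
ranking = top_kernels(sample)
```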
Changed#
Timeline View: Improved navigation and selection. Added a context menu option to create a time range filter from one or more selected events.
Advanced Details Panel: The new Aggregate by Column drop-down groups the results by the selected column. The Event Table and Sample Table now have options to size columns to fit. Event Details now shows the function call’s arguments, if available.
Time Range Filtering: Improved time range selection.
Histogram: Shows event density in two display modes: “Normalization: All Tracks” and “Normalization: Visible Tracks”.
Multi-node: Multi-node data and a new multi-database YAML file format are supported.
ROCm Optiq (Beta) 0.1.0#
Initial release of ROCm Optiq (Beta).
Added#
System Topology View: Displays a hierarchical representation of hardware and system components, such as nodes and processes, along with the GPU queues, memory operations, threads, and other items that belong to them.
Timeline View: Shows CPU and GPU activities, events, and performance metrics in chronological order for a detailed temporal analysis. ROCm Optiq allows you to zoom, filter, and bookmark data for fine-grained inspection. You can correlate GPU workloads with in-application CPU events and performance with hardware resource usage, enabling easy identification and remediation of performance blockers.
Advanced Details Panel: Provides an in-depth view of profiling data, enabling you to analyze performance metrics and event-specific information. It offers SQL-like filters and group-by operations.
Histogram: Shows the event density across all visible tracks and highlights the zoomed-in region to quickly identify hotspots.
Time Range Filtering: Select a specific time interval to filter events and counter samples for focused analysis.
Event Search: Quickly locate target events.
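As a rough illustration of what Time Range Filtering does, the sketch below keeps only the events that overlap a selected interval. The event representation (dictionaries with start_ns and end_ns keys), the function name, and the overlap rule are assumptions made for this example; ROCm Optiq may treat events that straddle the interval boundary differently.

```python
def filter_by_time_range(events, start_ns, end_ns):
    """Keep events that overlap the closed interval [start_ns, end_ns].

    Each event is a dict with "start_ns" and "end_ns" keys. An event
    overlaps the interval if it starts no later than the interval ends
    and ends no earlier than the interval starts.
    """
    return [
        event
        for event in events
        if event["start_ns"] <= end_ns and event["end_ns"] >= start_ns
    ]


# Hypothetical events on a single track.
events = [
    {"name": "memcpy", "start_ns": 0, "end_ns": 40},
    {"name": "kernel_a", "start_ns": 50, "end_ns": 120},
    {"name": "kernel_b", "start_ns": 200, "end_ns": 260},
]
selected = filter_by_time_range(events, 100, 210)
```

With this overlap rule, an event that only partly falls inside the selection is still kept; a stricter variant would require the event to lie entirely within the interval.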