Performance analysis glossary
2026-02-20
This section provides brief definitions of performance analysis concepts and optimization techniques.
- Active cycle
An active cycle is a clock cycle in which a compute unit has at least one active wavefront resident. See Wavefront execution states for details.
- Arithmetic bandwidth
Arithmetic bandwidth is the peak rate at which arithmetic work can be performed, defining the compute roof in roofline models. See Compute-bound performance for details.
- Arithmetic intensity
Arithmetic intensity is the ratio of arithmetic operations to memory traffic in a kernel, commonly expressed as floating-point operations per byte transferred. It indicates whether a kernel is more likely to be compute-bound or memory-bound. See Arithmetic intensity for intensity analysis.
- Bank conflict
A bank conflict occurs when multiple threads simultaneously access different addresses in the same LDS bank, serializing accesses. See Bank conflict theory for details.
- Branch efficiency
Branch efficiency measures how often all threads within a wavefront take the same execution path, quantifying control-flow uniformity. See Branch efficiency for branch analysis.
- Compute-bound
Compute-bound kernels are limited by the arithmetic bandwidth of the GPU’s compute units rather than memory bandwidth. See Compute-bound performance for compute-bound analysis.
- CU utilization
CU utilization measures the percentage of time that compute units are actively executing instructions. See CU utilization for utilization analysis.
- Issue efficiency
Issue efficiency measures how effectively the wavefront scheduler keeps execution pipelines busy by issuing instructions. See Issue efficiency for efficiency metrics.
- Latency hiding
Latency hiding masks long-latency operations by running many concurrent threads, keeping execution pipelines busy. See Latency hiding mechanisms for details.
- Little’s Law
Little’s Law states that concurrency equals latency multiplied by throughput. It determines how much independent work must be in flight to hide a given latency at a given throughput. See Little’s Law for latency hiding details.
- Memory bandwidth
Memory bandwidth is the maximum rate at which data can be transferred between memory hierarchy levels, typically measured in bytes per second. See Memory-bound performance for details.
- Memory coalescing
Memory coalescing improves effective memory bandwidth by servicing many logical loads or stores with fewer physical memory transactions. See Memory coalescing theory for coalescing patterns.
- Memory-bound
Memory-bound kernels are limited by memory bandwidth rather than arithmetic bandwidth, typically due to low arithmetic intensity. See Memory-bound performance for memory-bound analysis.
- Occupancy
Occupancy is the ratio of active wavefronts to the maximum number of wavefronts that can be active on a compute unit. See Occupancy theory for occupancy analysis.
- Overhead
Overhead is time during which no useful work is done, often due to CPU-side bottlenecks or kernel launch delays. See Performance bottlenecks for details.
- Peak rate
Peak rate is the theoretical maximum throughput at which a hardware system can complete work under ideal conditions. See Theoretical performance limits for details.
- Pipe utilization
Pipe utilization measures how effectively a kernel uses the execution pipelines within each compute unit. See Pipe utilization for utilization details.
- Register pressure
Register pressure occurs when excessive register demand limits the number of active wavefronts per compute unit, reducing occupancy. See Register pressure theory for details.
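- Roofline model
The roofline model is a visual performance model that plots attainable performance against arithmetic intensity, bounded by a memory bandwidth roof and a compute roof. It shows whether a program is compute-bound or memory-bound. See Roofline model for roofline analysis.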
- Wavefront divergence
Wavefront divergence occurs when threads within a wavefront take different execution paths due to conditional statements. See Branch efficiency for divergence handling details.
- Wavefront execution state
Wavefront execution states (active, stalled, eligible, selected) describe the scheduling status of wavefronts on AMD GPUs. See Wavefront execution states for state definitions.
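Several of the terms above (arithmetic intensity, the roofline model, compute-bound and memory-bound, Little’s Law, and occupancy) reduce to simple arithmetic. The following Python sketch shows one way those relationships might be written down. It is an illustration only: the function names are hypothetical, and the device figures used in the example (peak rate, bandwidth, latency, wavefront counts) are made-up placeholders, not the specifications of any real GPU.

```python
def arithmetic_intensity(flops, bytes_moved):
    """Arithmetic operations performed per byte of memory traffic."""
    return flops / bytes_moved


def attainable_performance(intensity, peak_flops, peak_bandwidth):
    """Roofline bound: the lower of the compute roof and the memory roof."""
    return min(peak_flops, intensity * peak_bandwidth)


def classify_kernel(intensity, peak_flops, peak_bandwidth):
    """Compare intensity with the ridge point where the two roofs meet."""
    ridge_point = peak_flops / peak_bandwidth  # FLOPs per byte
    return "compute-bound" if intensity >= ridge_point else "memory-bound"


def required_concurrency(latency_s, throughput_per_s):
    """Little's Law: work in flight = latency * throughput."""
    return latency_s * throughput_per_s


def occupancy(active_wavefronts, max_wavefronts):
    """Ratio of active wavefronts to the per-compute-unit maximum."""
    return active_wavefronts / max_wavefronts


if __name__ == "__main__":
    # Hypothetical device figures, chosen only to make the arithmetic easy.
    PEAK_FLOPS = 1e14      # assumed arithmetic bandwidth, FLOP/s
    PEAK_BANDWIDTH = 2e12  # assumed memory bandwidth, bytes/s

    # A float32 y = a*x + y update: 2 FLOPs per element and
    # 12 bytes of traffic (read x, read y, write y).
    ai = arithmetic_intensity(flops=2, bytes_moved=12)
    print("arithmetic intensity (FLOP/byte):", ai)
    print("classification:", classify_kernel(ai, PEAK_FLOPS, PEAK_BANDWIDTH))
    print("attainable FLOP/s:",
          attainable_performance(ai, PEAK_FLOPS, PEAK_BANDWIDTH))

    # Bytes that must be in flight to sustain the assumed bandwidth
    # at an assumed 500 ns memory latency.
    print("bytes in flight:", required_concurrency(500e-9, PEAK_BANDWIDTH))

    # 8 active wavefronts out of an assumed maximum of 16.
    print("occupancy:", occupancy(8, 16))
```

Expressing the ridge point as peak arithmetic bandwidth divided by peak memory bandwidth keeps the compute-bound and memory-bound classification consistent with the roofline bound: a kernel whose intensity is below the ridge point is capped by the memory roof.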