Performance analysis glossary
2026-02-20
This section provides brief definitions of performance analysis concepts and optimization techniques.
- Active cycle
An active cycle is a clock cycle in which a compute unit has at least one active wavefront resident. See hip:wavefront_execution for details.
- Arithmetic bandwidth
Arithmetic bandwidth is the peak rate at which arithmetic work can be performed, defining the compute roof in roofline models. See hip:compute_bound for details.
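- Arithmetic intensity
Arithmetic intensity is the ratio of arithmetic operations to memory operations in a kernel, commonly expressed as floating-point operations per byte transferred. It indicates whether a kernel is more likely to be compute-bound or memory-bound. See hip:arithmetic_intensity for intensity analysis.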
- Bank conflict
A bank conflict occurs when multiple threads simultaneously access different addresses in the same LDS bank, serializing accesses. See hip:bank_conflicts_theory for details.
- Branch efficiency
Branch efficiency measures how often all threads within a wavefront take the same execution path, quantifying control-flow uniformity. See hip:branch_efficiency for branch analysis.
- Compute-bound
Compute-bound kernels are limited by the arithmetic bandwidth of the GPU’s compute units rather than memory bandwidth. See hip:compute_bound for compute-bound analysis.
- CU utilization
CU utilization measures the percentage of time that compute units are actively executing instructions. See hip:cu_utilization for utilization analysis.
- Issue efficiency
Issue efficiency measures how effectively the wavefront scheduler keeps execution pipelines busy by issuing instructions. See hip:issue_efficiency for efficiency metrics.
- Latency hiding
Latency hiding masks long-latency operations by running many concurrent threads, keeping execution pipelines busy. See hip:latency_hiding for details.
- Little’s Law
Little’s Law relates concurrency, latency, and throughput (concurrency = latency × throughput), determining how much independent work must be in flight to hide latency. See hip:littles_law for latency hiding details.
- Memory bandwidth
Memory bandwidth is the maximum rate at which data can be transferred between memory hierarchy levels, typically measured in bytes per second. See hip:memory_bound for details.
- Memory coalescing
Memory coalescing improves effective memory bandwidth by servicing many logical loads or stores with fewer physical memory transactions. See hip:memory_coalescing_theory for coalescing patterns.
- Memory-bound
Memory-bound kernels are limited by memory bandwidth rather than arithmetic bandwidth, typically due to low arithmetic intensity. See hip:memory_bound for memory-bound analysis.
- Occupancy
Occupancy is the ratio of active wavefronts to the maximum number of wavefronts that can be active on a compute unit. See hip:occupancy for occupancy analysis.
- Overhead
Overhead latency is the time spent with no useful work being done, often due to CPU-side bottlenecks or kernel launch delays. See hip:performance_bottlenecks for details.
- Peak rate
Peak rate is the theoretical maximum throughput at which a hardware system can complete work under ideal conditions. See hip:theoretical_performance_limits for details.
- Pipe utilization
Pipe utilization measures how effectively a kernel uses the execution pipelines within each compute unit. See hip:pipe_utilization for utilization details.
- Register pressure
Register pressure occurs when excessive register demand limits the number of active wavefronts per compute unit, reducing occupancy. See hip:register_pressure_theory for details.
- Roofline model
The roofline model is a visual performance model that plots attainable performance against arithmetic intensity, showing whether a program is compute-bound or memory-bound. See hip:roofline_model for roofline analysis.
- Wavefront divergence
Wavefront divergence occurs when threads within a wavefront take different execution paths due to conditional statements. See hip:branch_efficiency for divergence handling details.
- Wavefront execution state
Wavefront execution states (active, stalled, eligible, selected) describe the scheduling status of wavefronts on AMD GPUs. See hip:wavefront_execution for state definitions.
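Several of the terms above (arithmetic intensity, compute-bound, memory-bound, roofline model, and Little’s Law) reduce to short formulas. The following is a minimal sketch in plain Python rather than any HIP or profiler API. The function names are hypothetical, and the hardware figures (peak rate, bandwidth, latency) are illustrative assumptions, not values for any specific GPU.

```python
def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """Arithmetic operations per byte of memory traffic (FLOP/byte)."""
    return flops / bytes_moved


def attainable_rate(intensity: float, peak_flops: float, peak_bandwidth: float) -> float:
    """Roofline bound: the lower of the compute roof and the memory roof."""
    return min(peak_flops, intensity * peak_bandwidth)


def classify(intensity: float, peak_flops: float, peak_bandwidth: float) -> str:
    """Compare intensity with the ridge point where the two roofs meet."""
    ridge_point = peak_flops / peak_bandwidth
    return "compute-bound" if intensity >= ridge_point else "memory-bound"


def required_concurrency(latency: float, throughput: float) -> float:
    """Little's Law: work in flight = latency x throughput."""
    return latency * throughput


# Assumed (hypothetical) device limits, for illustration only.
PEAK_FLOPS = 40e12      # arithmetic bandwidth in FLOP/s
PEAK_BANDWIDTH = 1e12   # memory bandwidth in bytes/s

# Single-precision vector add c[i] = a[i] + b[i]:
# 1 FLOP per element, 12 bytes moved (two 4-byte loads, one 4-byte store).
n = 1_000_000
vector_add_intensity = arithmetic_intensity(flops=n, bytes_moved=12 * n)
vector_add_bound = classify(vector_add_intensity, PEAK_FLOPS, PEAK_BANDWIDTH)
vector_add_rate = attainable_rate(vector_add_intensity, PEAK_FLOPS, PEAK_BANDWIDTH)

# Assumed 500-cycle memory latency and 2 requests completed per cycle.
in_flight = required_concurrency(latency=500, throughput=2)

if __name__ == "__main__":
    print(f"intensity: {vector_add_intensity:.4f} FLOP/byte ({vector_add_bound})")
    print(f"attainable rate: {vector_add_rate:.3e} FLOP/s")
    print(f"requests in flight needed: {in_flight}")
```

Under these assumed numbers the vector add sits well below the ridge point, so the sketch classifies it as memory-bound. Measured results on real hardware depend on caching, coalescing, and occupancy, which this model ignores.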