Shader engine
On RDNA3-class GPUs, shader engines (SEs) partition the programmable graphics and compute array into repeating slices. Within each SE, the Workgroup Manager (SPI) accepts dispatched kernels and schedules waves onto Workgroup Processors (WGPs). Each WGP maps to two Compute Units (CUs) that share execution resources and execute the scheduled waves. GL0 and GL1 implement the per-SE vector cache hierarchy feeding those CUs.
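The containment described above (SE, then WGPs, then two CUs per WGP) can be sketched as a small data model. This is an illustrative sketch only: the class names, `build_gpu`, and the round-robin dispatch are hypothetical and do not describe the actual SPI scheduling policy or any profiler API. Only the two-CUs-per-WGP ratio comes from the text.

```python
from __future__ import annotations

from dataclasses import dataclass, field

CUS_PER_WGP = 2  # from the text: each WGP maps to two Compute Units


@dataclass
class WorkgroupProcessor:
    index: int  # position of this WGP inside its shader engine

    def compute_units(self) -> list[int]:
        """Return the SE-local CU indices covered by this WGP."""
        first = self.index * CUS_PER_WGP
        return list(range(first, first + CUS_PER_WGP))


@dataclass
class ShaderEngine:
    index: int
    wgps: list[WorkgroupProcessor] = field(default_factory=list)

    def dispatch_round_robin(self, num_waves: int) -> dict[int, int]:
        """Toy stand-in for the SPI: count waves assigned to each WGP.

        Assumption: plain round-robin, chosen only for illustration.
        """
        counts = {wgp.index: 0 for wgp in self.wgps}
        for wave in range(num_waves):
            counts[wave % len(self.wgps)] += 1
        return counts


def build_gpu(num_ses: int, wgps_per_se: int) -> list[ShaderEngine]:
    """Build a hypothetical GPU; both counts are caller-supplied."""
    return [
        ShaderEngine(se, [WorkgroupProcessor(w) for w in range(wgps_per_se)])
        for se in range(num_ses)
    ]
```

For example, `build_gpu(2, 4)` models two SEs with four WGPs each (made-up counts, not a real gfx115x configuration), so each SE exposes eight CUs. Metrics reported at WGP granularity therefore aggregate over a CU pair.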
Follow the nested chapters under Shader engine in the navigation for gfx115x metric tables:
Workgroup Manager (SPI) - utilization and wave dispatch statistics for the stage that sits between the command processor and WGP execution.
Workgroup processor (WGP) - occupancy, waves, instruction mix, and WGP-local instruction/data caches at CU-pair granularity.
GL0 (TCP Vector Cache) - panels from GL0 utilization through the TCP-GL1 boundary.
GL1 - utilization, requests, cache performance, and the GL1-GL2 interface.
GPU-wide and per-SE utilization summarized through GRBM is documented separately; see Graphics Register Bus Manager (GRBM).
Note
For Instinct-centric shader-engine metric tabs, see Shader engine (SE).