Shader engine

On RDNA3 architecture-based GPUs, shader engines (SEs) partition the programmable graphics and compute array into repeating slices. Within each SE, the Workgroup Manager (SPI) accepts dispatched kernels and schedules waves onto Workgroup Processors (WGPs). Each WGP maps to two Compute Units (CUs), which share execution resources and execute the scheduled waves. The GL0 and GL1 caches form the per-SE vector cache hierarchy that feeds those CUs.
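The hierarchy described above can be pictured with a small toy model. This is only an illustrative sketch: the class names, the `build_gpu` helper, the SE and WGP counts, and the least-loaded placement rule are all assumptions made for the example, not a description of how the hardware SPI actually schedules waves. The only fact taken from the text is that each WGP maps to two CUs.

```python
from dataclasses import dataclass, field

# From the text above: each WGP maps to two CUs.
CUS_PER_WGP = 2


@dataclass
class WorkgroupProcessor:
    """Toy WGP: a pair of CUs that share a queue of scheduled waves."""
    index: int
    waves: list = field(default_factory=list)


@dataclass
class ShaderEngine:
    """Toy SE: a slice of the GPU holding several WGPs."""
    index: int
    wgps: list

    def dispatch(self, wave_id):
        # Stand-in for the SPI. Placing the wave on the least-loaded WGP is
        # an assumed policy for illustration, not the real scheduling logic.
        target = min(self.wgps, key=lambda wgp: len(wgp.waves))
        target.waves.append(wave_id)
        return target.index


def build_gpu(num_se, wgps_per_se):
    """Build a list of SEs; both counts are hypothetical inputs."""
    return [
        ShaderEngine(se, [WorkgroupProcessor(i) for i in range(wgps_per_se)])
        for se in range(num_se)
    ]


def total_cus(gpu):
    """Count CUs across all SEs using the two-CUs-per-WGP mapping."""
    return sum(len(se.wgps) * CUS_PER_WGP for se in gpu)
```

For example, `build_gpu(2, 4)` models a hypothetical part with two SEs of four WGPs each, for which `total_cus` counts sixteen CUs. Metrics reported at WGP granularity therefore cover a CU pair rather than a single CU.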

Follow the nested sections under Shader engine in the navigation for gfx115x metric tables:

Workgroup Manager (SPI): Utilization and wave dispatch statistics that sit between the command processor and WGP execution.
Workgroup processor (WGP): Occupancy, waves, instruction mix, and WGP-local instruction/data caches at CU pair granularity.
GL0 (TCP Vector Cache): Panels from GL0 utilization through the TCP-GL1 boundary.
GL1: Utilization, requests, cache performance, and the GL1-GL2 interface.

GPU-wide and per-SE utilization summarized through GRBM is documented separately; see Graphics Register Bus Manager (GRBM).

Note

For AMD Instinct-centric Shader engine metric tabs, see Shader engine (SE).