GL1

GL1#

The GL1 cache is the shared L1 vector cache in each shader array on gfx115x (one GL1 per shader array). It receives requests from the GL0 (TCP) caches and forwards misses to GL2. For GL0 panels and Memory Chart rows up to the TCP-GL1 boundary, see GL0 (TCP Vector Cache). For the GL2 panels downstream of GL1, see GL2 cache; for the DRAM / GCEA interfaces beyond GL2, see Graphics Core Efficiency Arbiter (GCEA).

Note

The GL1 Cache is also referred to as GL1C in some contexts. Hardware counter names (for example, GL1C_REQ_sum) retain the GL1C prefix.

GL1 Cache panels#

GL1 Cache utilization#

RDNA 3.5 (gfx115x)

Metric	Description	Unit
GL1 Cache Busy	Percentage of cycles the GL1 cache is actively processing requests. The GL1 cache is shared across multiple Workgroup Processors within a shader array. High utilization indicates active memory traffic from the GL0 caches.	Percent
GL1 Cache Starve	Percentage of cycles the GL1 cache had no pending requests. High starvation indicates either compute-bound workloads with minimal memory traffic, or effective GL0 caching reducing traffic to GL1.	Percent

GL1 Cache request statistics#

RDNA 3.5 (gfx115x)

Metric	Description	Unit
Total Requests	Total number of requests received by the GL1 cache from all GL0 caches (TCP instances) within the shader array. This is the aggregated memory traffic from multiple Workgroup Processors.	Count per Normalization Unit
Read Requests	Number of read requests to the GL1 cache. High read counts indicate memory-intensive load operations that missed in GL0. Compare with miss requests to assess GL1 cache effectiveness.	Count per Normalization Unit
Write Requests	Number of write requests to the GL1 cache. Write traffic includes stores that missed in GL0 and cache writebacks. High write counts may indicate write-intensive workloads.	Count per Normalization Unit
Miss Requests	Number of GL1 cache requests that missed and had to be fetched from GL2. High miss counts increase memory latency and traffic to GL2. Consider improving data locality at the shader array level.	Count per Normalization Unit
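
The "Count per Normalization Unit" values above can be read as a raw counter total (for example, GL1C_REQ_sum) divided by a denominator that depends on the selected normalization. The following is a minimal sketch of that division, assuming a per-wave denominator; the function name and the sample numbers are hypothetical and are not taken from the profiler.

```python
from typing import Optional


def per_normalization_unit(raw_count: float, denominator: float) -> Optional[float]:
    """Divide a raw counter total by the normalization denominator.

    Returns None when the denominator is not positive, so callers can
    show "n/a" instead of dividing by zero.
    """
    if denominator <= 0:
        return None
    return raw_count / denominator


# Hypothetical values: 1,200,000 GL1 requests over 4,000 waves.
total_requests = 1_200_000
waves = 4_000
print(per_normalization_unit(total_requests, waves))  # 300.0 requests per wave
```

The same helper applies to the read, write, and miss rows, since they share the unit.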

GL1 Cache performance#

RDNA 3.5 (gfx115x)

Metric	Description	Unit
Hit Rate	Percentage of GL1 cache requests serviced from cache. Higher hit rates reduce traffic to GL2 and improve performance. Low hit rates may indicate working sets exceeding GL1 capacity or poor data locality across Workgroup Processors.	Percent
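
As a rough illustration of how Hit Rate relates to the request statistics, the sketch below derives it from the Total Requests and Miss Requests values. It assumes that every request is either a hit or a miss; the exact expression the profiler uses is not stated here, and the function name is hypothetical.

```python
from typing import Optional


def gl1_hit_rate(total_requests: float, miss_requests: float) -> Optional[float]:
    """Return the GL1 hit rate in percent, or None if there were no requests.

    Assumption: hits = total requests - miss requests.
    """
    if total_requests <= 0:
        return None
    hits = max(total_requests - miss_requests, 0)
    return 100.0 * hits / total_requests


# Hypothetical sample: 1000 requests, 250 of which missed in GL1.
print(gl1_hit_rate(1000, 250))  # 75.0
```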

GL1-GL2 interface#

RDNA 3.5 (gfx115x)

Metric	Description	Unit
GL2 Read Requests	Number of read requests forwarded from GL1 to GL2 cache due to misses. This represents GL1 miss traffic that consumes GL2 bandwidth. High counts may indicate GL1 capacity limitations.	Count per Normalization Unit
GL2 Read 128B Requests	Number of 128-byte read requests forwarded from GL1 to GL2 cache. These are the largest of the read request sizes (32, 64, or 128 bytes) on this interface.	Count per Normalization Unit
GL2 Write Requests	Number of write requests forwarded from GL1 to GL2 cache. This includes writebacks and stores that missed in GL1.	Count per Normalization Unit

GL1 Cache stalls#

RDNA 3.5 (gfx115x)

Metric	Description	Unit
GL2 Stall	Cycles the GL1 cache was stalled waiting for GL2 to accept requests. High stall counts indicate GL2 bandwidth saturation or contention, limiting GL1 throughput.	Cycles per Normalization Unit
LFIFO Full Stall	Cycles the GL1 cache was stalled due to the LFIFO (Load FIFO) being full. High stall counts indicate data return path congestion from GL2 to GL1.	Cycles per Normalization Unit

Memory chart: GL1 cache and GL1-GL2 interface#

The following Memory Chart tables align with the on-screen flow through GL1 and the GL1-GL2 interface.

Memory chart - GL1 cache#

RDNA 3.5 (gfx115x)

Metric	Description	Unit
GL1 Cache Utilization	Percentage of cycles the GL1 cache is actively processing requests. The GL1 cache is shared across multiple workgroup processors within a shader array. High utilization indicates active memory traffic through the GL1 cache.	Percent
GL1 Cache Hit Rate	Percentage of GL1 cache requests that hit in cache. Higher hit rates reduce traffic to the GL2 cache and improve memory access latency. Low hit rates may indicate poor data locality or working sets exceeding GL1 capacity.	Percent

Memory chart - GL1-GL2 interface#

RDNA 3.5 (gfx115x)

Metric	Description	Unit
GL1-GL2 Read Requests	Read requests from GL1C to GL2C per normalization unit (GL1C_GL2_REQ_READ_sum).	Requests per Normalization Unit
GL1-GL2 Write Requests	Write requests from GL1C to GL2C per normalization unit (GL1C_GL2_REQ_WRITE_sum).	Requests per Normalization Unit
GL1-GL2 Read Bandwidth	Bytes per second on the GL1C→GL2C read interface (32/64/128 B request bins).	Bytes/s
GL1-GL2 Write Bandwidth	Bytes per second on the GL1C→GL2C write interface (32/64 B request bins).	Bytes/s
GL1-GL2 Read Latency	Average cycles from GL1C read request to response (GL1C_GL2_REQ_READ_LATENCY_sum / GL1C_GL2_REQ_READ_sum) when the denominator is non-zero.	Cycles
GL1-GL2 Write Latency	Average cycles from GL1C write request to completion (GL1C_GL2_REQ_WRITE_LATENCY_sum / GL1C_GL2_REQ_WRITE_sum) when the denominator is non-zero.	Cycles
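
The latency and bandwidth rows above can be sketched as follows. The latency helper mirrors the stated ratio (for example, GL1C_GL2_REQ_READ_LATENCY_sum / GL1C_GL2_REQ_READ_sum) with the zero-denominator guard. The bandwidth helper assumes that each request-size bin contributes its size times its request count. The per-bin inputs, the function names, and the sample values are hypothetical, since the per-bin counter names are not listed here.

```python
from typing import Mapping, Optional


def average_latency(latency_sum: float, request_count: float) -> Optional[float]:
    """Average cycles per request; None when no requests were issued."""
    if request_count == 0:
        return None
    return latency_sum / request_count


def interface_bandwidth(requests_by_size: Mapping[int, int],
                        elapsed_seconds: float) -> Optional[float]:
    """Bytes per second, given request counts keyed by request size in bytes."""
    if elapsed_seconds <= 0:
        return None
    total_bytes = sum(size * count for size, count in requests_by_size.items())
    return total_bytes / elapsed_seconds


# Hypothetical read-side sample using the 32/64/128 B bins.
read_bins = {32: 1000, 64: 500, 128: 250}      # 96,000 bytes in total
print(interface_bandwidth(read_bins, 0.5))     # 192000.0 bytes/s
print(average_latency(120_000, 400))           # 300.0 cycles
print(average_latency(0, 0))                   # None (denominator is zero)
```

For the write interface, the same bandwidth helper would be called with only the 32 B and 64 B bins.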
