Command processor (CP)#

The command processor (CP) connects the host and kernel driver to on-GPU scheduling. It pulls work from HSA queues, decodes packets, and dispatches kernel launches to the front end (SPI / WGP path). On Instinct GPUs, the profiler often separates the metrics into command processor fetcher (CPF) and command processor compute (CPC). The gfx1151 analysis panels emphasize CPC and ME (Micro Engine) activity, including utilization, interface utilization, stall cycles, memory requests, and instruction cache behavior.

For the complete CDNA architecture overview and the CPF and CPC metric tabs across MI-series GPUs, see Command processor (CP) under CDNA-CDNA4.

Command processor compute (CPC) — gfx1151#

CPC utilization#

| Metric | Description | Unit |
| --- | --- | --- |
| CPC Busy | Percentage of CPC always-count cycles the Command Processor Compute was busy processing commands. Computed as 100 * CPC_STAT_BUSY / CPC_ALWAYS_COUNT. | Percent |
| CPC Idle | Percentage of CPC always-count cycles the Command Processor Compute was idle. Computed as 100 * CPC_STAT_IDLE / CPC_ALWAYS_COUNT. | Percent |
| CPC Stalled | Percentage of CPC always-count cycles the Command Processor Compute was stalled. Computed as 100 * CPC_STAT_STALL / CPC_ALWAYS_COUNT. | Percent |
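The three utilization formulas share the same denominator, so they can be computed together. The following is a minimal sketch, assuming the raw counter values are already available in a plain dictionary keyed by counter name. The function name, the dictionary layout, and the sample numbers are illustrative assumptions, not part of the profiler's API.

```python
def cpc_utilization(counters):
    """Return CPC busy/idle/stalled percentages from raw counter values.

    `counters` is a hypothetical dict of counter name -> raw value;
    the counter names themselves are the ones listed in the table above.
    """
    total = counters["CPC_ALWAYS_COUNT"]
    if total == 0:
        # No cycles counted: report zeros rather than dividing by zero.
        return {"busy": 0.0, "idle": 0.0, "stalled": 0.0}
    return {
        "busy": 100 * counters["CPC_STAT_BUSY"] / total,
        "idle": 100 * counters["CPC_STAT_IDLE"] / total,
        "stalled": 100 * counters["CPC_STAT_STALL"] / total,
    }


# Made-up sample values, for illustration only.
sample = {
    "CPC_ALWAYS_COUNT": 2000,
    "CPC_STAT_BUSY": 500,
    "CPC_STAT_IDLE": 1400,
    "CPC_STAT_STALL": 100,
}
result = cpc_utilization(sample)
print(result)  # {'busy': 25.0, 'idle': 70.0, 'stalled': 5.0}
```

The interface utilization metrics (TCIU, UTCL2, GCRIU) follow the same pattern, with the corresponding busy counter in the numerator.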

CPC interface utilization#

| Metric | Description | Unit |
| --- | --- | --- |
| TCIU Busy | Percentage of CPC always-count cycles the TC (Texture Cache) Interface Unit was busy. Computed as 100 * CPC_TCIU_BUSY / CPC_ALWAYS_COUNT. | Percent |
| UTCL2 Busy | Percentage of CPC always-count cycles the UTCL2 (Unified Translation Cache L2) interface was busy. Computed as 100 * CPC_UTCL2IU_BUSY / CPC_ALWAYS_COUNT. | Percent |
| GCRIU Busy | Percentage of CPC always-count cycles the GCR (Graphics Cache Rinse) Interface Unit was busy. Computed as 100 * CPC_GCRIU_BUSY / CPC_ALWAYS_COUNT. | Percent |

Micro Engine (ME) stall cycles#

| Metric | Description | Unit |
| --- | --- | --- |
| ME1 Stall on RCIU Ready | Cycles ME1 was stalled waiting for RCIU (Register Cache Interface Unit) to be ready (CPC_ME1_STALL_WAIT_ON_RCIU_READY). | Cycles per Normalization Unit |
| ME1 Stall on Memory Read | Cycles ME1 was stalled waiting for memory read completion (CPC_ME1_STALL_WAIT_ON_MEM_READ). | Cycles per Normalization Unit |
| ME1 Stall on Memory Write | Cycles ME1 was stalled waiting for memory write completion (CPC_ME1_STALL_WAIT_ON_MEM_WRITE). | Cycles per Normalization Unit |
| ME1 Stall on ROQ Data | Cycles ME1 was stalled waiting for data from the Ring Output Queue (CPC_ME1_STALL_ON_DATA_FROM_ROQ). | Cycles per Normalization Unit |
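The stall metrics are reported per normalization unit, meaning each raw cycle count is divided by a chosen denominator. The sketch below illustrates that division and picks out the largest stall source. It assumes the denominator is a wave count, which is only one possible choice; the helper name, the denominator, and all sample values are hypothetical.

```python
def per_normalization_unit(raw_cycles, denominator):
    """Divide a raw stall-cycle count by the chosen normalization denominator."""
    if denominator <= 0:
        raise ValueError("normalization denominator must be positive")
    return raw_cycles / denominator


# Made-up raw stall counts keyed by the counter names from the table above.
raw_stalls = {
    "CPC_ME1_STALL_WAIT_ON_RCIU_READY": 128,
    "CPC_ME1_STALL_WAIT_ON_MEM_READ": 640,
    "CPC_ME1_STALL_WAIT_ON_MEM_WRITE": 64,
    "CPC_ME1_STALL_ON_DATA_FROM_ROQ": 32,
}
waves = 64  # assumed denominator (per-wave normalization), illustrative only

normalized = {
    name: per_normalization_unit(cycles, waves)
    for name, cycles in raw_stalls.items()
}
# The counter with the most normalized stall cycles is the first place to look.
dominant = max(normalized, key=normalized.get)
print(dominant, normalized[dominant])
```

Because every counter is divided by the same denominator, normalization changes the scale of the values but not their ranking within a single run.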

CPC memory requests#

| Metric | Description | Unit |
| --- | --- | --- |
| TCIU Read Requests | Number of read requests sent to the TC Interface Unit. | Count per Normalization Unit |
| TCIU Write Requests | Number of write requests sent to the TC Interface Unit. | Count per Normalization Unit |
| GUS Read Requests | Number of read requests sent to the Global Unified Shader memory interface. | Count per Normalization Unit |
| GUS Write Requests | Number of write requests sent to the Global Unified Shader memory interface. | Count per Normalization Unit |

Micro Engine (ME) instruction cache#

| Metric | Description | Unit |
| --- | --- | --- |
| Instruction Cache Hits | Number of MEC instruction cache hits. | Count per Normalization Unit |
| Instruction Cache Misses | Number of MEC instruction cache misses. | Count per Normalization Unit |
| Instruction Cache Hit Rate | Percentage of MEC instruction cache accesses that hit in the cache. | Percent |
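The hit rate can be related to the hit and miss counts with a short calculation. This sketch assumes that total accesses equal hits plus misses, which the table does not state explicitly, so treat the formula as an illustration rather than the profiler's exact definition. The function name and sample values are hypothetical.

```python
def icache_hit_rate(hits, misses):
    """Return the hit rate in percent, assuming accesses = hits + misses."""
    accesses = hits + misses
    if accesses == 0:
        # No accesses recorded: report 0.0 rather than dividing by zero.
        return 0.0
    return 100 * hits / accesses


# Made-up counts, for illustration only.
rate = icache_hit_rate(hits=950, misses=50)
print(rate)  # 95.0
```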