PC Sampling

ROCprofiler-SDK developer API: PC Sampling
ROCprofiler-SDK developer API 1.1.0
ROCm Profiling API and tools

Enabling PC (Program Counter) Sampling for GPU Activity.

Data Structures

struct  rocprofiler_pc_sampling_configuration_t
 (experimental) PC sampling configuration supported by a GPU agent.
 
struct  rocprofiler_pc_sampling_hw_id_v0_t
 (experimental) Information about the GPU part where wave was executing at the moment of sampling.
 
struct  rocprofiler_pc_t
 (experimental) Sampled program counter.
 
struct  rocprofiler_pc_sampling_record_host_trap_v0_t
 (experimental) ROCProfiler Host-Trap PC Sampling Record.
 
struct  rocprofiler_pc_sampling_record_stochastic_header_t
 (experimental) The header of the rocprofiler_pc_sampling_record_stochastic_v0_t, indicating what fields of the rocprofiler_pc_sampling_record_stochastic_v0_t instance are meaningful for the sample.
 
struct  rocprofiler_pc_sampling_snapshot_v0_t
 (experimental) Data provided by stochastic sampling hardware.
 
struct  rocprofiler_pc_sampling_memory_counters_t
 (experimental) Counters of issued but not yet completed instructions.
 
struct  rocprofiler_pc_sampling_record_stochastic_v0_t
 (experimental) ROCProfiler Stochastic PC Sampling Record.
 
struct  rocprofiler_pc_sampling_record_invalid_t
 (experimental) Record representing an invalid PC Sampling Record.
 

Typedefs

typedef rocprofiler_status_t(* rocprofiler_available_pc_sampling_configurations_cb_t) (const rocprofiler_pc_sampling_configuration_t *configs, unsigned long num_config, void *user_data)
 (experimental) Rocprofiler SDK's callback function to deliver the list of available PC sampling configurations upon the call to the rocprofiler_query_pc_sampling_agent_configurations.
 

Enumerations

enum  rocprofiler_pc_sampling_configuration_flags_t {
  ROCPROFILER_PC_SAMPLING_CONFIGURATION_FLAGS_NONE = 0 ,
  ROCPROFILER_PC_SAMPLING_CONFIGURATION_FLAGS_INTERVAL_POW2 ,
  ROCPROFILER_PC_SAMPLING_CONFIGURATION_FLAGS_LAST
}
 (experimental) Enumeration describing values of flags of rocprofiler_pc_sampling_configuration_t.
 
enum  rocprofiler_pc_sampling_instruction_type_t {
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_NONE = 0 ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_VALU ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_MATRIX ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_SCALAR ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_TEX ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_LDS ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_LDS_DIRECT ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_FLAT ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_EXPORT ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_MESSAGE ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_BARRIER ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_BRANCH_NOT_TAKEN ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_BRANCH_TAKEN ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_JUMP ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_OTHER ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_NO_INST ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_DUAL_VALU ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_LAST
}
 (experimental) Enumeration describing type of sampled issued instruction.
 
enum  rocprofiler_pc_sampling_instruction_not_issued_reason_t {
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_NONE = 0 ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_NO_INSTRUCTION_AVAILABLE ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_ALU_DEPENDENCY ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_WAITCNT ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_INTERNAL_INSTRUCTION ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_BARRIER_WAIT ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_ARBITER_NOT_WIN ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_ARBITER_WIN_EX_STALL ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_OTHER_WAIT ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_SLEEP_WAIT ,
  ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_LAST
}
 (experimental) Enumeration describing reason for not issuing an instruction.
 

Functions

rocprofiler_status_t rocprofiler_configure_pc_sampling_service (rocprofiler_context_id_t context_id, rocprofiler_agent_id_t agent_id, rocprofiler_pc_sampling_method_t method, rocprofiler_pc_sampling_unit_t unit, uint64_t interval, rocprofiler_buffer_id_t buffer_id, int flags)
 (experimental) Function used to configure the PC sampling service on the GPU agent with agent_id.
 
rocprofiler_status_t rocprofiler_query_pc_sampling_agent_configurations (rocprofiler_agent_id_t agent_id, rocprofiler_available_pc_sampling_configurations_cb_t cb, void *user_data)
 (experimental) Query PC Sampling Configuration.
 
const char * rocprofiler_get_pc_sampling_instruction_type_name (rocprofiler_pc_sampling_instruction_type_t instruction_type)
 (experimental) Return the string encoding of rocprofiler_pc_sampling_instruction_type_t value
 
const char * rocprofiler_get_pc_sampling_instruction_not_issued_reason_name (rocprofiler_pc_sampling_instruction_not_issued_reason_t not_issued_reason)
 (experimental) Return the string encoding of rocprofiler_pc_sampling_instruction_not_issued_reason_t value
 

Detailed Description

Enabling PC (Program Counter) Sampling for GPU Activity.


Data Structure Documentation

◆ rocprofiler_pc_sampling_configuration_t

struct rocprofiler_pc_sampling_configuration_t

(experimental) PC sampling configuration supported by a GPU agent.

Definition at line 145 of file pc_sampling.h.

Data Fields
uint64_t flags Takes values from rocprofiler_pc_sampling_configuration_flags_t.
unsigned long max_interval The largest allowed interval, that is, the lowest possible frequency for generating samples using method.
rocprofiler_pc_sampling_method_t method Sampling method supported by the GPU agent. Currently, it can take one of the following two values:
unsigned long min_interval The smallest allowed interval, that is, the highest possible frequency for generating samples using method.
uint64_t size Size of this struct.
rocprofiler_pc_sampling_unit_t unit The unit used to specify the interval at which method generates samples.
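
The fields above suggest a simple client-side check before requesting an interval. The following is a minimal sketch, not the SDK's own validation: the `sketch_*` types are local stand-ins that mirror the documented field names so the example compiles on its own, and testing `flags` with a bitwise AND is an assumption (the source only says that `flags` takes values from the flags enumeration). Real code would include `<rocprofiler-sdk/pc_sampling.h>` and use the SDK types directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for ROCPROFILER_PC_SAMPLING_CONFIGURATION_FLAGS_NONE and
 * ..._INTERVAL_POW2. The values follow the enum listing (NONE = 0, next = 1). */
enum
{
    SKETCH_FLAGS_NONE          = 0,
    SKETCH_FLAGS_INTERVAL_POW2 = 1
};

/* Stand-in for a subset of rocprofiler_pc_sampling_configuration_t. */
typedef struct
{
    uint64_t      size;
    unsigned long min_interval;
    unsigned long max_interval;
    uint64_t      flags;
} sketch_pc_sampling_config_t;

static bool
sketch_is_pow2(uint64_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* True when `interval` lies in [min_interval, max_interval] and, if the
 * INTERVAL_POW2 flag is set (assumed to be a bit flag), is a power of two. */
static bool
sketch_interval_is_valid(const sketch_pc_sampling_config_t* cfg, uint64_t interval)
{
    assert(cfg != NULL);
    if(interval < cfg->min_interval || interval > cfg->max_interval) return false;
    if((cfg->flags & SKETCH_FLAGS_INTERVAL_POW2) != 0 && !sketch_is_pow2(interval))
        return false;
    return true;
}
```

A tool could run such a check on each configuration it receives and only then pass the chosen interval to rocprofiler_configure_pc_sampling_service; the SDK still performs its own checks and may reject the request.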

◆ rocprofiler_pc_sampling_hw_id_v0_t

struct rocprofiler_pc_sampling_hw_id_v0_t

(experimental) Information about the GPU part where wave was executing at the moment of sampling.

Definition at line 223 of file pc_sampling.h.

Data Fields
uint64_t chiplet: 6 chiplet index (3 bits allocated by the ROCr runtime)
uint64_t cu_or_wgp_id: 4 Compute unit index on GFX9 or workgroup processor index on GFX10+.
uint64_t microengine_id: 2 ACE (microengine) index.
uint64_t pipe_id: 4 pipe index
uint64_t queue_id: 4 queue id
uint64_t reserved0: 16 Reserved for future use.
uint64_t shader_array_id: 1 Shader array index.
uint64_t shader_engine_id: 5 shader engine index
uint64_t simd_id: 2 SIMD index.
uint64_t vm_id: 6 virtual memory ID
uint64_t wave_id: 7 wave slot index
uint64_t workgroup_id: 7 thread_group index on GFX9, and workgroup index on GFX10+

◆ rocprofiler_pc_t

struct rocprofiler_pc_t

(experimental) Sampled program counter.

Definition at line 245 of file pc_sampling.h.

Data Fields
uint64_t code_object_id ID of the loaded code object instance that contains the sampled PC. This field holds the value ROCPROFILER_CODE_OBJECT_ID_NONE if the code object cannot be determined (e.g., the sampled PC belongs to code generated by self-modifying code).
uint64_t code_object_offset If code_object_id differs from ROCPROFILER_CODE_OBJECT_ID_NONE, this field contains the offset of the sampled PC relative to the rocprofiler_callback_tracing_code_object_load_data_t.load_base of the code object instance with code_object_id. To calculate the original virtual address of the sampled PC, add the value of this field to that load_base. The value of code_object_offset matches the virtual address of the sampled instruction (PC) only if code_object_id is equal to ROCPROFILER_CODE_OBJECT_ID_NONE.
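
The address calculation described for code_object_offset can be sketched as follows. This is an illustrative example under stated assumptions: `sketch_pc_t` mirrors the two documented fields, `sketch_loaded_code_object_t` is a hypothetical table a tool might fill from code object load callbacks (recording each instance's ID and load_base), and `SKETCH_CODE_OBJECT_ID_NONE` is a placeholder whose value is not taken from the SDK headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder for ROCPROFILER_CODE_OBJECT_ID_NONE; the real value is defined
 * by the SDK headers and may differ. */
#define SKETCH_CODE_OBJECT_ID_NONE UINT64_C(0)

/* Stand-in for rocprofiler_pc_t. */
typedef struct
{
    uint64_t code_object_id;
    uint64_t code_object_offset;
} sketch_pc_t;

/* Hypothetical bookkeeping entry kept by the tool per loaded code object. */
typedef struct
{
    uint64_t code_object_id;
    uint64_t load_base;
} sketch_loaded_code_object_t;

/* Writes the virtual address of the sampled PC to *addr and returns 1.
 * Returns 0 when the code object is not present in `table`. */
static int
sketch_pc_to_virtual_address(const sketch_pc_t*                 pc,
                             const sketch_loaded_code_object_t* table,
                             size_t                             count,
                             uint64_t*                          addr)
{
    assert(pc != NULL && addr != NULL);
    if(pc->code_object_id == SKETCH_CODE_OBJECT_ID_NONE)
    {
        /* No code object: the offset field already holds the virtual address. */
        *addr = pc->code_object_offset;
        return 1;
    }
    for(size_t i = 0; i < count; ++i)
    {
        if(table[i].code_object_id == pc->code_object_id)
        {
            *addr = table[i].load_base + pc->code_object_offset;
            return 1;
        }
    }
    return 0;
}
```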

◆ rocprofiler_pc_sampling_record_host_trap_v0_t

struct rocprofiler_pc_sampling_record_host_trap_v0_t

(experimental) ROCProfiler Host-Trap PC Sampling Record.

Definition at line 270 of file pc_sampling.h.

Data Fields
rocprofiler_async_correlation_id_t correlation_id API launch call id that matches dispatch ID.
uint64_t dispatch_id originating kernel dispatch ID
uint64_t exec_mask active SIMD lanes when sampled
rocprofiler_pc_sampling_hw_id_v0_t hw_id See rocprofiler_pc_sampling_hw_id_v0_t.
rocprofiler_pc_t pc information about sampled program counter
uint32_t reserved0: 24 reserved for future use
uint64_t size Size of this struct.
uint64_t timestamp timestamp when sample is generated
uint32_t wave_in_group: 8 wave position within the workgroup (0-31)
rocprofiler_dim3_t workgroup_id wave coordinates within the workgroup

◆ rocprofiler_pc_sampling_record_stochastic_header_t

struct rocprofiler_pc_sampling_record_stochastic_header_t

(experimental) The header of the rocprofiler_pc_sampling_record_stochastic_v0_t, indicating what fields of the rocprofiler_pc_sampling_record_stochastic_v0_t instance are meaningful for the sample.

Definition at line 292 of file pc_sampling.h.

Data Fields
uint8_t has_memory_counter: 1 pc sample provides memory counters information via rocprofiler_pc_sampling_memory_counters_t
uint8_t reserved_type: 7

◆ rocprofiler_pc_sampling_snapshot_v0_t

struct rocprofiler_pc_sampling_snapshot_v0_t

(experimental) Data provided by stochastic sampling hardware.

Definition at line 364 of file pc_sampling.h.

Data Fields
uint32_t arb_state_issue_brmsg: 1 arbiter issued a branch/message instruction
uint32_t arb_state_issue_exp: 1 arbiter issued an export instruction
uint32_t arb_state_issue_flat: 1 arbiter issued a FLAT instruction
uint32_t arb_state_issue_lds: 1 arbiter issued an LDS instruction
uint32_t arb_state_issue_lds_direct: 1 arbiter issued an LDS direct instruction
uint32_t arb_state_issue_matrix: 1 arbiter issued a matrix instruction
uint32_t arb_state_issue_misc: 1 arbiter issued a miscellaneous instruction
uint32_t arb_state_issue_reserved: 1 reserved for the future use
uint32_t arb_state_issue_scalar: 1 arbiter issued a scalar (SALU/SMEM) instruction
uint32_t arb_state_issue_valu: 1 arbiter issued a VALU instruction
uint32_t arb_state_issue_vmem_tex: 1 arbiter issued a texture instruction
uint32_t arb_state_stall_brmsg: 1 branch/message instruction was stalled
uint32_t arb_state_stall_exp: 1 export instruction was stalled
uint32_t arb_state_stall_flat: 1 flat instruction was stalled
uint32_t arb_state_stall_lds: 1 LDS instruction was stalled.
uint32_t arb_state_stall_lds_direct: 1 LDS direct instruction was stalled.
uint32_t arb_state_stall_matrix: 1 matrix instruction was stalled
uint32_t arb_state_stall_misc: 1 miscellaneous instruction was stalled
uint32_t arb_state_stall_scalar: 1 Scalar (SALU/SMEM) instruction was stalled.
uint32_t arb_state_stall_valu: 1 VALU instruction was stalled when a sample was generated.
uint32_t arb_state_stall_vmem_tex: 1 texture instruction was stalled
uint32_t arb_state_state_reserved: 1 reserved for the future use
uint32_t dual_issue_valu: 1 Two VALU instructions were issued for coexecution (MI3xx specific)
uint32_t reason_not_issued: 4 The reason for not issuing an instruction. The field takes one of the values defined in rocprofiler_pc_sampling_instruction_not_issued_reason_t.
uint32_t reserved0: 1 reserved for future use
uint32_t reserved1: 1 reserved for the future use
uint32_t reserved2: 3 reserved for the future use

◆ rocprofiler_pc_sampling_memory_counters_t

struct rocprofiler_pc_sampling_memory_counters_t

(experimental) Counters of issued but not yet completed instructions.

Definition at line 407 of file pc_sampling.h.

Data Fields
uint32_t bvh_cnt: 3 Counts the number of VMEM BVH instructions issued but not yet completed.
uint32_t ds_cnt: 6 Counts the number of LDS instructions issued but not yet completed.
uint32_t km_cnt: 5 Counts the number of scalar memory reads and memory instructions issued but not yet completed.
uint32_t load_cnt: 6 Counts the number of VMEM load instructions issued but not yet completed.
uint32_t sample_cnt: 6 Counts the number of VMEM sample instructions issued but not yet completed.
uint32_t store_cnt: 6 Counts the number of VMEM store instructions issued but not yet completed.

◆ rocprofiler_pc_sampling_record_stochastic_v0_t

struct rocprofiler_pc_sampling_record_stochastic_v0_t

(experimental) ROCProfiler Stochastic PC Sampling Record.

Definition at line 434 of file pc_sampling.h.

Data Fields
rocprofiler_async_correlation_id_t correlation_id API launch call id that matches dispatch ID.
uint64_t dispatch_id originating kernel dispatch ID
uint64_t exec_mask active SIMD lanes at the moment of sampling
rocprofiler_pc_sampling_record_stochastic_header_t flags Defines what fields are meaningful for the sample.
rocprofiler_pc_sampling_hw_id_v0_t hw_id See rocprofiler_pc_sampling_hw_id_v0_t.
uint8_t inst_type: 5 instruction type, takes a value defined in rocprofiler_pc_sampling_instruction_type_t
rocprofiler_pc_sampling_memory_counters_t memory_counters Counters of issued but not yet completed instructions. See rocprofiler_pc_sampling_memory_counters_t.
rocprofiler_pc_t pc information about sampled program counter
uint8_t reserved: 2 reserved; these 2 bits must be zero
uint64_t size Size of this struct.
rocprofiler_pc_sampling_snapshot_v0_t snapshot Data provided by stochastic sampling hardware. See rocprofiler_pc_sampling_snapshot_v0_t.
uint64_t timestamp timestamp when sample is generated
uint32_t wave_count active waves on the CU at the moment of sampling
uint8_t wave_in_group wave position within the workgroup (0-15)
uint8_t wave_issued: 1 wave issued the instruction represented with the PC
rocprofiler_dim3_t workgroup_id wave coordinates within the workgroup
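
Because the header says which parts of a stochastic record are meaningful, a consumer would typically test it before reading the optional data. The sketch below illustrates that pattern with local stand-in types that mirror a subset of the documented fields; the field order and padding are not taken from the real header. Treating inst_type as meaningful when wave_issued is set, and reason_not_issued otherwise, is an assumption based on the field descriptions and is not stated explicitly in the source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for rocprofiler_pc_sampling_record_stochastic_header_t. */
typedef struct
{
    uint8_t has_memory_counter : 1;
    uint8_t reserved_type : 7;
} sketch_stochastic_header_t;

/* Stand-in for rocprofiler_pc_sampling_memory_counters_t (documented widths). */
typedef struct
{
    uint32_t load_cnt : 6;
    uint32_t store_cnt : 6;
    uint32_t bvh_cnt : 3;
    uint32_t sample_cnt : 6;
    uint32_t ds_cnt : 6;
    uint32_t km_cnt : 5;
} sketch_memory_counters_t;

/* Stand-in for a subset of rocprofiler_pc_sampling_record_stochastic_v0_t. */
typedef struct
{
    sketch_stochastic_header_t flags;
    uint8_t                    wave_issued : 1;
    uint8_t                    inst_type : 5;
    uint8_t                    reserved : 2;
    uint32_t                   reason_not_issued; /* snapshot.reason_not_issued in the real record */
    sketch_memory_counters_t   memory_counters;
} sketch_stochastic_record_t;

/* Outstanding VMEM loads plus stores, or -1 when the header marks the
 * memory counters as not meaningful for this sample. */
static int
sketch_outstanding_vmem(const sketch_stochastic_record_t* rec)
{
    assert(rec != NULL);
    if(!rec->flags.has_memory_counter) return -1;
    return (int) rec->memory_counters.load_cnt + (int) rec->memory_counters.store_cnt;
}

/* Assumption: returns inst_type when the wave issued an instruction and
 * reason_not_issued otherwise; *is_inst_type reports which one was returned. */
static uint32_t
sketch_issue_detail(const sketch_stochastic_record_t* rec, int* is_inst_type)
{
    assert(rec != NULL && is_inst_type != NULL);
    *is_inst_type = rec->wave_issued ? 1 : 0;
    return rec->wave_issued ? (uint32_t) rec->inst_type : rec->reason_not_issued;
}
```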

◆ rocprofiler_pc_sampling_record_invalid_t

struct rocprofiler_pc_sampling_record_invalid_t

(experimental) Record representing an invalid PC Sampling Record.

Definition at line 491 of file pc_sampling.h.

Data Fields
uint64_t size Size of the struct.

Typedef Documentation

◆ rocprofiler_available_pc_sampling_configurations_cb_t

typedef rocprofiler_status_t(* rocprofiler_available_pc_sampling_configurations_cb_t) (const rocprofiler_pc_sampling_configuration_t *configs, unsigned long num_config, void *user_data)

#include <rocprofiler-sdk/pc_sampling.h>

(experimental) Rocprofiler SDK's callback function to deliver the list of available PC sampling configurations upon the call to the rocprofiler_query_pc_sampling_agent_configurations.

Parameters
[out]configs- The array of PC sampling configurations supported by the agent at the moment of invoking rocprofiler_query_pc_sampling_agent_configurations.
[out]num_config- The number of configurations contained in the underlying array configs. In case the GPU agent does not support PC sampling, the value is 0.
[in]user_data- client's private data passed via rocprofiler_query_pc_sampling_agent_configurations
Returns
rocprofiler_status_t

Definition at line 185 of file pc_sampling.h.
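
A callback of this shape usually copies what it needs into storage reached through user_data. The following is a minimal sketch with local stand-in types so that it compiles without the SDK; the `sketch_*` names, the fixed capacity, and the use of `int` for the method and unit are illustrative assumptions. Copying the entries rather than keeping the `configs` pointer is a conservative choice, since the source does not state how long the array stays valid after the callback returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the two rocprofiler_status_t outcomes this sketch needs. */
typedef enum
{
    SKETCH_CB_STATUS_SUCCESS = 0,
    SKETCH_CB_STATUS_ERROR   = 1
} sketch_cb_status_t;

/* Stand-in for rocprofiler_pc_sampling_configuration_t. */
typedef struct
{
    uint64_t      size;
    int           method; /* rocprofiler_pc_sampling_method_t in the SDK */
    int           unit;   /* rocprofiler_pc_sampling_unit_t in the SDK */
    unsigned long min_interval;
    unsigned long max_interval;
    uint64_t      flags;
} sketch_cb_config_t;

#define SKETCH_MAX_CONFIGS 8

/* Hypothetical client-side container passed through user_data. */
typedef struct
{
    sketch_cb_config_t items[SKETCH_MAX_CONFIGS];
    size_t             count;
} sketch_config_list_t;

/* Same parameter list as rocprofiler_available_pc_sampling_configurations_cb_t,
 * expressed with the stand-in types. Copies up to SKETCH_MAX_CONFIGS entries. */
static sketch_cb_status_t
sketch_collect_configs(const sketch_cb_config_t* configs,
                       unsigned long             num_config,
                       void*                     user_data)
{
    sketch_config_list_t* out = (sketch_config_list_t*) user_data;
    if(out == NULL) return SKETCH_CB_STATUS_ERROR;
    out->count = 0; /* num_config == 0 means the agent offers no configuration */
    for(unsigned long i = 0; i < num_config && out->count < SKETCH_MAX_CONFIGS; ++i)
        out->items[out->count++] = configs[i];
    return SKETCH_CB_STATUS_SUCCESS;
}
```

With the real SDK types, a function like this would be passed as cb to rocprofiler_query_pc_sampling_agent_configurations together with a pointer to the list as user_data.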

Enumeration Type Documentation

◆ rocprofiler_pc_sampling_configuration_flags_t

#include <rocprofiler-sdk/pc_sampling.h>

(experimental) Enumeration describing values of flags of rocprofiler_pc_sampling_configuration_t.

Definition at line 132 of file pc_sampling.h.

typedef enum rocprofiler_pc_sampling_configuration_flags_t
{
    ROCPROFILER_PC_SAMPLING_CONFIGURATION_FLAGS_NONE = 0,
    ROCPROFILER_PC_SAMPLING_CONFIGURATION_FLAGS_INTERVAL_POW2,
    ROCPROFILER_PC_SAMPLING_CONFIGURATION_FLAGS_LAST

    /// @var ROCPROFILER_PC_SAMPLING_CONFIGURATION_FLAGS_INTERVAL_POW2
    /// @brief The interval value must be a power of 2.
} rocprofiler_pc_sampling_configuration_flags_t;

◆ rocprofiler_pc_sampling_instruction_not_issued_reason_t

#include <rocprofiler-sdk/pc_sampling.h>

(experimental) Enumeration describing reason for not issuing an instruction.

Enumerator
ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_WAITCNT 

waitcnt dependency

ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_BARRIER_WAIT 

waiting on a barrier

ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_SLEEP_WAIT 

wave was sleeping

Definition at line 332 of file pc_sampling.h.

typedef enum rocprofiler_pc_sampling_instruction_not_issued_reason_t
{
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_NONE = 0,
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_NO_INSTRUCTION_AVAILABLE,
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_ALU_DEPENDENCY,
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_WAITCNT,  ///< waitcnt dependency
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_INTERNAL_INSTRUCTION,
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_BARRIER_WAIT,  ///< waiting on a barrier
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_ARBITER_NOT_WIN,
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_ARBITER_WIN_EX_STALL,
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_OTHER_WAIT,
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_SLEEP_WAIT,  ///< wave was sleeping
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_LAST

    /// @var ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_NO_INSTRUCTION_AVAILABLE
    /// @brief No instruction available in the instruction cache.
    /// @var ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_ALU_DEPENDENCY
    /// @brief ALU dependency not resolved.
    /// @var ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_INTERNAL_INSTRUCTION
    /// @brief Wave executes an internal instruction.
    /// @var ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_ARBITER_NOT_WIN
    /// @brief The instruction did not win the arbiter.
    /// @var ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_ARBITER_WIN_EX_STALL
    /// @brief Arbiter issued an instruction, but the execution pipe pushed it back from execution.
    /// @var ROCPROFILER_PC_SAMPLING_INSTRUCTION_NOT_ISSUED_REASON_OTHER_WAIT
    /// @brief Other types of wait (e.g., wait for XNACK acknowledgment).
} rocprofiler_pc_sampling_instruction_not_issued_reason_t;

◆ rocprofiler_pc_sampling_instruction_type_t

#include <rocprofiler-sdk/pc_sampling.h>

(experimental) Enumeration describing type of sampled issued instruction.

Enumerator
ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_VALU 

vector ALU instruction

ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_MATRIX 

matrix instruction

ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_SCALAR 

scalar (memory) instruction

ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_TEX 

texture memory instruction

ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_LDS 

LDS memory instruction.

ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_LDS_DIRECT 

LDS direct memory instruction.

ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_FLAT 

flat memory instruction

ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_EXPORT 

export instruction

ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_MESSAGE 

message instruction

ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_BARRIER 

barrier instruction

ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_JUMP 

jump instruction

ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_OTHER 

other types of instruction

ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_NO_INST 

no instruction issued

Definition at line 302 of file pc_sampling.h.

typedef enum rocprofiler_pc_sampling_instruction_type_t
{
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_NONE = 0,
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_VALU,        ///< vector ALU instruction
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_MATRIX,      ///< matrix instruction
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_SCALAR,      ///< scalar (memory) instruction
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_TEX,         ///< texture memory instruction
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_LDS,         ///< LDS memory instruction
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_LDS_DIRECT,  ///< LDS direct memory instruction
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_FLAT,        ///< flat memory instruction
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_EXPORT,      ///< export instruction
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_MESSAGE,     ///< message instruction
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_BARRIER,     ///< barrier instruction
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_BRANCH_NOT_TAKEN,
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_BRANCH_TAKEN,
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_JUMP,        ///< jump instruction
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_OTHER,       ///< other types of instruction
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_NO_INST,     ///< no instruction issued
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_DUAL_VALU,   ///< dual VALU instruction
    ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_LAST

    /// @var ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_BRANCH_NOT_TAKEN
    /// @brief Instruction representing a branch not being taken.
    /// @var ROCPROFILER_PC_SAMPLING_INSTRUCTION_TYPE_BRANCH_TAKEN
    /// @brief Instruction representing a taken branch.
} rocprofiler_pc_sampling_instruction_type_t;

Function Documentation

◆ rocprofiler_configure_pc_sampling_service()

rocprofiler_status_t rocprofiler_configure_pc_sampling_service ( rocprofiler_context_id_t  context_id,
rocprofiler_agent_id_t  agent_id,
rocprofiler_pc_sampling_method_t  method,
rocprofiler_pc_sampling_unit_t  unit,
uint64_t  interval,
rocprofiler_buffer_id_t  buffer_id,
int  flags 
)

#include <rocprofiler-sdk/pc_sampling.h>

(experimental) Function used to configure the PC sampling service on the GPU agent with agent_id.

Prerequisites are the following:

  • The client must create a context and supply its context_id. By using this context, the client can start/stop PC sampling on the agent. For more information, see rocprofiler_start_context/rocprofiler_stop_context.
  • The user must create a buffer and supply its buffer_id. Rocprofiler-SDK uses the buffer to deliver the PC samples to the client. For more information about the data delivery, see rocprofiler_create_buffer and rocprofiler_buffer_tracing_cb_t.

Before calling this function, we recommend querying the PC sampling configurations supported by the GPU agent via rocprofiler_query_pc_sampling_agent_configurations. The client chooses the method, unit, and interval to match one of the available configurations. Note that the interval must belong to the range [available_config.min_interval, available_config.max_interval], where available_config is the instance of rocprofiler_pc_sampling_configuration_s supported/available at the moment.

Rocprofiler-SDK checks whether the requested configuration is actually supported at the moment of calling this function. If it is, the function returns ROCPROFILER_STATUS_SUCCESS. Otherwise, it notifies the client about the rejection reason via the returned status code. For more information about the status codes, see rocprofiler_status_t.

There are a few constraints a client's code needs to be aware of.

Constraint1: A GPU agent can be configured to support at most one running PC sampling configuration at any time, which implies some of the consequences described below. After the tool configures the PC sampling with one of the available configurations, rocprofiler-SDK guarantees that this configuration will be valid for the tool's lifetime. The tool can start and stop the configured PC sampling service whenever convenient.

Constraint2: Since the same GPU agent can be used by multiple processes concurrently, Rocprofiler-SDK cannot guarantee exclusive access to the PC sampling capability. The consequence is the following scenario. The tool TA, which belongs to the process PA, calls rocprofiler_query_pc_sampling_agent_configurations, which returns the two configurations CA and CB supported by the agent. Then the tool TB of the process PB configures PC sampling on the same agent by using the configuration CB. Subsequently, TA tries configuring CA on the agent, and it fails. To point out that this case happened, we introduce a special status code, ROCPROFILER_STATUS_ERROR_NOT_AVAILABLE. When TA observes this status code, it queries all available configurations again by calling rocprofiler_query_pc_sampling_agent_configurations, which returns only CB this time. The tool TA can choose CB, so that both TA and TB use the PC sampling capability in separate processes. TA and TB each receive the samples generated by the kernels launched by their own processes, PA and PB, respectively.

Constraint3: Rocprofiler-SDK allows only one context to contain the configured PC sampling service within the process, which implies that at most one of the loaded tools can use PC sampling. One context can contain multiple PC sampling services configured for different GPU agents.

Constraint4: PC sampling feature is not available within the ROCgdb.

Constraint5: PC sampling service cannot be used simultaneously with counter collection service.

Parameters
[in]context_id- id of the context used for starting/stopping PC sampling service
[in]agent_id- id of the agent on which caller tries using PC sampling capability
[in]method- the type of PC sampling the caller tries to use on the agent.
[in]unit- The unit appropriate to the PC sampling type/method.
[in]interval- frequency at which PC samples are generated
[in]buffer_id- id of the buffer used for delivering PC samples
[in]flags- for future use
Returns
rocprofiler_status_t
Return values
ROCPROFILER_STATUS_SUCCESS- PC sampling service configured successfully
ROCPROFILER_STATUS_ERROR_NOT_AVAILABLE- One of the scenarios is present:
  1. PC sampling is already configured with configuration different than requested,
  2. PC sampling is requested from a process that runs within the ROCgdb.
  3. HSA runtime does not support PC sampling.
  4. GPU device does not support requested PC sampling method.
ROCPROFILER_STATUS_ERROR_INCOMPATIBLE_KERNEL- the amdgpu driver installed on the system does not support the PC sampling feature
ROCPROFILER_STATUS_ERROR- a general error caused by the amdgpu driver
ROCPROFILER_STATUS_ERROR_CONTEXT_CONFLICT- counter collection service already setup in the context
ROCPROFILER_STATUS_ERROR_INVALID_ARGUMENT- function invoked with an invalid argument
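
One way a tool might act on these return values, including the re-query step described under Constraint2, is sketched below. The status enum is a local stand-in whose names follow the documented ROCPROFILER_STATUS_* codes but whose numeric values are placeholders; the action enum, the function name, and the bounded retry count are illustrative assumptions rather than SDK behavior.

```c
#include <assert.h>

/* Stand-ins for the documented status codes; numeric values are placeholders. */
typedef enum
{
    SKETCH_CFG_SUCCESS = 0,
    SKETCH_CFG_ERROR,
    SKETCH_CFG_ERROR_NOT_AVAILABLE,
    SKETCH_CFG_ERROR_INCOMPATIBLE_KERNEL,
    SKETCH_CFG_ERROR_CONTEXT_CONFLICT,
    SKETCH_CFG_ERROR_INVALID_ARGUMENT
} sketch_cfg_status_t;

/* Hypothetical follow-up actions a tool could take. */
typedef enum
{
    SKETCH_STEP_START_CONTEXT,     /* configuration accepted */
    SKETCH_STEP_REQUERY_AND_RETRY, /* query configurations again, pick one, retry */
    SKETCH_STEP_FIX_CALLER,        /* the tool's own setup or arguments are wrong */
    SKETCH_STEP_GIVE_UP            /* PC sampling cannot be used here */
} sketch_next_step_t;

/* Maps the status returned by the configure call to the tool's next step.
 * NOT_AVAILABLE also covers cases a retry cannot fix (ROCgdb, unsupported
 * runtime or method), so retries are bounded by max_attempts. */
static sketch_next_step_t
sketch_next_step(sketch_cfg_status_t status, unsigned attempts_made, unsigned max_attempts)
{
    assert(attempts_made <= max_attempts);
    switch(status)
    {
        case SKETCH_CFG_SUCCESS: return SKETCH_STEP_START_CONTEXT;
        case SKETCH_CFG_ERROR_NOT_AVAILABLE:
            return attempts_made < max_attempts ? SKETCH_STEP_REQUERY_AND_RETRY
                                                : SKETCH_STEP_GIVE_UP;
        case SKETCH_CFG_ERROR_CONTEXT_CONFLICT:
        case SKETCH_CFG_ERROR_INVALID_ARGUMENT: return SKETCH_STEP_FIX_CALLER;
        case SKETCH_CFG_ERROR_INCOMPATIBLE_KERNEL:
        case SKETCH_CFG_ERROR:
        default: return SKETCH_STEP_GIVE_UP;
    }
}
```

Bounding the retries keeps the tool from looping when another process holds a configuration the tool cannot adopt, or when the rejection has a cause that a new query does not change.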

◆ rocprofiler_get_pc_sampling_instruction_not_issued_reason_name()

const char * rocprofiler_get_pc_sampling_instruction_not_issued_reason_name ( rocprofiler_pc_sampling_instruction_not_issued_reason_t  not_issued_reason)

#include <rocprofiler-sdk/pc_sampling.h>

(experimental) Return the string encoding of rocprofiler_pc_sampling_instruction_not_issued_reason_t value

Parameters
[in]not_issued_reason- no issue reason enum value
Returns
Will return a nullptr if invalid/unsupported rocprofiler_pc_sampling_instruction_not_issued_reason_t value is provided.

◆ rocprofiler_get_pc_sampling_instruction_type_name()

const char * rocprofiler_get_pc_sampling_instruction_type_name ( rocprofiler_pc_sampling_instruction_type_t  instruction_type)

#include <rocprofiler-sdk/pc_sampling.h>

(experimental) Return the string encoding of rocprofiler_pc_sampling_instruction_type_t value

Parameters
[in]instruction_type- instruction type enum value
Returns
Will return a nullptr if invalid/unsupported rocprofiler_pc_sampling_instruction_type_t value is provided.

◆ rocprofiler_query_pc_sampling_agent_configurations()

rocprofiler_status_t rocprofiler_query_pc_sampling_agent_configurations ( rocprofiler_agent_id_t  agent_id,
rocprofiler_available_pc_sampling_configurations_cb_t  cb,
void *  user_data 
)

#include <rocprofiler-sdk/pc_sampling.h>

(experimental) Query PC Sampling Configuration.

Lists the PC sampling configurations that the GPU agent with agent_id supports at the moment of invoking the function, and delivers them via cb. If PC sampling is already configured on the GPU agent, cb delivers information about the active PC sampling configuration. If the GPU agent does not support PC sampling, cb delivers no PC sampling configurations.

Parameters
[in]agent_id- id of the agent for which available configurations will be listed
[in]cb- User callback that delivers the available PC sampling configurations
[in]user_data- passed to the cb
Returns
rocprofiler_status_t
Return values
ROCPROFILER_STATUS_ERROR_NOT_AVAILABLE- One of the scenarios is present:
  1. PC sampling is requested from a process that runs within the ROCgdb.
  2. HSA runtime does not support PC sampling.
ROCPROFILER_STATUS_ERROR_INCOMPATIBLE_KERNEL- the amdgpu driver installed on the system does not support the PC sampling feature.
ROCPROFILER_STATUS_ERROR- a general error caused by the amdgpu driver
ROCPROFILER_STATUS_SUCCESS- cb successfully finished