rocprofiler-sdk/agent.h Source File
ROCprofiler-SDK developer API 1.0.0
ROCm Profiling API and tools
agent.h
101 * @brief Provides an *estimate* about the runtime visibility of an agent based on the environment
102 * variables (ROCR_VISIBLE_DEVICES, HIP_VISIBLE_DEVICES, GPU_DEVICE_ORDINAL, CUDA_VISIBLE_DEVICES).
124 * indexing number. It is equivalent to the HSA-runtime HSA_AMD_AGENT_INFO_DRIVER_NODE_ID property
125 * of a `hsa_agent_t`. The `const char*` fields (`name`, `vendor_name`, etc.) are guaranteed to be
126 * valid pointers to null-terminated strings during tool finalization. Pointers to the agents via
127 * @see ::rocprofiler_query_available_agents are constant and will not be deallocated until after
274 * @retval ::ROCPROFILER_STATUS_ERROR_INCOMPATIBLE_ABI size of the agent struct in application is
276 * @retval ::ROCPROFILER_STATUS_ERROR_INVALID_ARGUMENT Invalid ::rocprofiler_agent_version_t value
uint32_t num_shader_banks
Number of Shader Banks or Shader Engines, typical values are 1 or 2.
Definition agent.h:161
HSA_ENGINE_ID fw_version
GPU only. Identifier (rev) of the GPU uEngine or Firmware, may be 0.
Definition agent.h:186
uint32_t hip
if not visible to HSA, agent not visible to anything built on HSA
Definition agent.h:109
int32_t logical_node_type_id
Logical sequence number with respect to other agents of same type. This will always be [0....
Definition agent.h:210
uint64_t hive_id
XGMI Hive the GPU node belongs to in the system. It is an opaque and static number hash created by th...
Definition agent.h:195
uint32_t workgroup_max_size
GPU only. Maximum total number of work-items in a work-group.
Definition agent.h:191
uint32_t grid_max_size
GPU only. Maximum total number of work-items in a grid.
Definition agent.h:192
uint32_t recommended_transfer_size
recommended transfer size to reach maximum bandwidth in bytes
Definition agent.h:82
uint32_t location_id
GPU BDF (Bus/Device/function number) - identifies the device location in the overall system.
Definition agent.h:173
uint32_t gfx_target_version
major_version=((value / 10000) % 100) minor_version=((value / 100) % 100) patch_version=(value % 100)
Definition agent.h:168
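The decoding formula for `gfx_target_version` can be sketched as a small C helper. This is a minimal illustration, not part of the SDK: the `gfx_version_t` type and the `decode_gfx_target_version` function are hypothetical names introduced here, and only the arithmetic comes from the field description above.

```c
#include <stdint.h>

/* Hypothetical helper type for illustration; not declared by rocprofiler-sdk. */
typedef struct
{
    uint32_t major;
    uint32_t minor;
    uint32_t patch;
} gfx_version_t;

/* Split a packed gfx_target_version value using the documented formula:
 * major = (value / 10000) % 100, minor = (value / 100) % 100, patch = value % 100. */
static gfx_version_t
decode_gfx_target_version(uint32_t value)
{
    gfx_version_t v;
    v.major = (value / 10000) % 100;
    v.minor = (value / 100) % 100;
    v.patch = value % 100;
    return v;
}
```

For example, a packed value of 90010 decodes to major 9, minor 0, patch 10.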
uint32_t max_engine_clk_ccompute
maximum engine clocks for CPU, including any boost capabilities
Definition agent.h:181
uint32_t max_slots_scratch_cu
Number of temp. memory ("scratch") wave slots available to access, may be 0 if HW has no restrictions...
Definition agent.h:166
rocprofiler_dim3_t grid_max_dim
GPU only. Maximum number of work-items of each dimension of a grid.
Definition agent.h:199
uint32_t lds_size_in_kb
Size of Local Data Store in Kilobytes per SIMD Wavefront.
Definition agent.h:152
uint32_t gds_size_in_kb
Size of Global Data Store in Kilobytes shared across SIMD Wavefronts.
Definition agent.h:153
uint32_t cpu_core_id_base
low value of the logical processor ID of the latency (= CPU) cores available on this node
Definition agent.h:146
uint32_t mem_clk_max
clock for the memory, this allows computing the available bandwidth to the memory when needed
Definition agent.h:95
uint32_t max_engine_clk_fcompute
GPU only. Maximum engine clocks for GPU, including any boost capabilities.
Definition agent.h:183
int32_t logical_node_id
Logical sequence number. This will always be [0..N) where N is the total number of agents.
Definition agent.h:209
uint32_t simd_id_base
low value of the logical processor ID of the throughput (= GPU) units available on this node
Definition agent.h:148
uint32_t max_waves_per_simd
This identifies the max. number of launched waves per SIMD. If NumFComputeCores is 0,...
Definition agent.h:150
uint32_t node_id
Node sequence number. This will be equivalent to the HSA-runtime HSA_AMD_AGENT_INFO_DRIVER_NODE_ID pr...
Definition agent.h:208
rocprofiler_dim3_t workgroup_max_dim
GPU only. Maximum number of work-items of each dimension of a work-group.
Definition agent.h:197
uint32_t wave_front_size
Number of SIMD cores per wavefront executed, typically 64, may be 32 or a different value for some HS...
Definition agent.h:156
rocprofiler_agent_type_t type
Enumeration for identifying the agent type (CPU, GPU, etc.)
Definition agent.h:135
uint64_t size
set to sizeof(rocprofiler_agent_t) by rocprofiler. This can be used for versioning and compatibility ...
Definition agent.h:132
rocprofiler_agent_runtime_visiblity_t runtime_visibility
See rocprofiler_runtime_library_t. This is an estimate about whether this agent will be visible for t...
Definition agent.h:211
rocprofiler_agent_version_t
Enumeration ID for version of the rocprofiler_agent_v*_t struct in rocprofiler_i.
Definition agent.h:46
rocprofiler_status_t(* rocprofiler_query_available_agents_cb_t)(rocprofiler_agent_version_t version, const void **agents, unsigned long num_agents, void *user_data)
Callback function type for querying the available agents.
Definition agent.h:278
rocprofiler_status_t rocprofiler_query_available_agents(rocprofiler_agent_version_t version, rocprofiler_query_available_agents_cb_t callback, unsigned long agent_size, void *user_data)
Receive a synchronous callback with an array of available agents at the moment of invocation.
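The callback pattern described here can be sketched in plain C. To keep the sketch compilable without the SDK, it declares stand-in types and a stand-in query function (all `demo_*` / `DEMO_*` names are hypothetical) that mirror only the signatures listed on this page. A real tool would presumably include `rocprofiler-sdk/agent.h` and use `rocprofiler_agent_t` and `rocprofiler_query_available_agents` instead. The stand-in struct layout, the stand-in enumerator values, and the exact size check are assumptions, not the SDK's behavior.

```c
#include <stdint.h>

/* Stand-in declarations (assumption: NOT the real SDK types or layouts). */
typedef enum
{
    DEMO_STATUS_SUCCESS = 0,
    DEMO_STATUS_ERROR_INCOMPATIBLE_ABI,
    DEMO_STATUS_ERROR_INVALID_ARGUMENT
} demo_status_t;

typedef enum { DEMO_AGENT_TYPE_CPU, DEMO_AGENT_TYPE_GPU } demo_agent_type_t;
typedef enum { DEMO_AGENT_INFO_VERSION_0 = 0 } demo_agent_version_t;

typedef struct
{
    uint64_t          size;            /* set to sizeof(struct) by the provider */
    demo_agent_type_t type;            /* CPU, GPU, ... */
    int32_t           logical_node_id; /* [0..N) across all agents */
} demo_agent_t;

/* Same parameter shape as rocprofiler_query_available_agents_cb_t above. */
typedef demo_status_t (*demo_agents_cb_t)(demo_agent_version_t version,
                                          const void**         agents,
                                          unsigned long        num_agents,
                                          void*                user_data);

/* Callback: counts GPU agents into the counter passed through user_data. */
static demo_status_t
count_gpus(demo_agent_version_t version,
           const void**         agents,
           unsigned long        num_agents,
           void*                user_data)
{
    unsigned long* gpu_count = (unsigned long*) user_data;

    /* The version tells the callback which struct the void pointers refer to. */
    if(version != DEMO_AGENT_INFO_VERSION_0) return DEMO_STATUS_ERROR_INVALID_ARGUMENT;

    for(unsigned long i = 0; i < num_agents; ++i)
    {
        const demo_agent_t* agent = (const demo_agent_t*) agents[i];
        if(agent->type == DEMO_AGENT_TYPE_GPU) ++(*gpu_count);
    }
    return DEMO_STATUS_SUCCESS;
}

/* Stand-in query: rejects a mismatched struct size (assumed rule), then
 * invokes the callback synchronously with two fake agents. */
static demo_status_t
demo_query_available_agents(demo_agent_version_t version,
                            demo_agents_cb_t     callback,
                            unsigned long        agent_size,
                            void*                user_data)
{
    static const demo_agent_t cpu = {sizeof(demo_agent_t), DEMO_AGENT_TYPE_CPU, 0};
    static const demo_agent_t gpu = {sizeof(demo_agent_t), DEMO_AGENT_TYPE_GPU, 1};
    const void*               agents[] = {&cpu, &gpu};

    if(agent_size != sizeof(demo_agent_t)) return DEMO_STATUS_ERROR_INCOMPATIBLE_ABI;
    return callback(version, agents, 2, user_data);
}
```

A caller would pass `sizeof(demo_agent_t)` as `agent_size` and the address of an `unsigned long` counter as `user_data`. Because the callback runs before the query returns, the counter can be read immediately afterwards.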
rocprofiler_agent_v0_t rocprofiler_agent_t
Typedef for the current rocprofiler_agent_version_t.
Definition agent.h:262
rocprofiler_agent_runtime_visiblity_t
Provides an estimate about the runtime visibility of an agent based on the environment variables (ROC...
Definition agent.h:106
rocprofiler_dim3_t
Multi-dimensional struct of data used to describe GPU workgroup and grid sizes.
Definition fwd.h:702