Performance Counter Functions


Functions

rsmi_status_t rsmi_dev_counter_group_supported (uint32_t dv_ind, rsmi_event_group_t group)
 Tell if an event group is supported by a given device.

rsmi_status_t rsmi_dev_counter_create (uint32_t dv_ind, rsmi_event_type_t type, rsmi_event_handle_t *evnt_handle)
 Create a performance counter object.

rsmi_status_t rsmi_dev_counter_destroy (rsmi_event_handle_t evnt_handle)
 Deallocate a performance counter object.

rsmi_status_t rsmi_counter_control (rsmi_event_handle_t evt_handle, rsmi_counter_command_t cmd, void *cmd_args)
 Issue performance counter control commands.

rsmi_status_t rsmi_counter_read (rsmi_event_handle_t evt_handle, rsmi_counter_value_t *value)
 Read the current value of a performance counter.

rsmi_status_t rsmi_counter_available_counters_get (uint32_t dv_ind, rsmi_event_group_t grp, uint32_t *available)
 Get the number of currently available counters.

Detailed Description

These functions are used to configure, query and control performance counting.

These functions use the same mechanisms as the "perf" command-line utility. They share the same underlying resources and are used in a similar way. The events supported by this API should have corresponding perf events, which can be observed with "perf stat ...". The events supported by perf can be listed with "perf list".

The types of events available depend on which device is being targeted, and the ability to count those events depends on whether counters are still available for that device. rsmi_dev_counter_group_supported() can be used to see which event groups (rsmi_event_group_t) are supported for a given device. Assuming a device supports a given event group, rsmi_counter_available_counters_get() can then be used to check whether there are counters available for events of that group. Counters may be occupied by other perf-based programs.

Once it is determined that events are supported and counters are available, an event counter can be created/destroyed and controlled.

rsmi_dev_counter_create() allocates the internal data structures that will be used to control the event counter, and returns a handle to them.

Once an event counter handle is obtained, the event counter can be controlled (i.e., started, stopped,...) with rsmi_counter_control() by passing rsmi_counter_command_t commands. RSMI_CNTR_CMD_START starts an event counter and RSMI_CNTR_CMD_STOP stops a counter. rsmi_counter_read() reads an event counter.

Once the counter is no longer needed, the resources it uses should be freed by calling rsmi_dev_counter_destroy().

Important Notes about Counter Values

  • A running "absolute" counter is kept internally. For the discussion that follows, we will call the internal counter value at time t val_t.
  • Issuing RSMI_CNTR_CMD_START or calling rsmi_counter_read() causes RSMI (in the kernel) to internally record the current absolute counter value.
  • rsmi_counter_read() returns the number of events that have occurred since the previously recorded value (i.e., a relative value, val_t - val_{t-1}), where the previous value was recorded by the most recent RSMI_CNTR_CMD_START or rsmi_counter_read() call.
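
The relative-read behavior described in these notes can be modeled without any RSMI calls. The sketch below is an illustration only: counter_model_t, model_start(), model_tick() and model_read() are hypothetical names that are not part of the ROCm SMI API, and the model ignores all kernel-side details. It keeps an absolute count, records a snapshot on start and on every read, and returns the difference between the current value and the last snapshot.

```c
#include <stdint.h>

/* Hypothetical model of the counter semantics; not part of ROCm SMI. */
typedef struct {
    uint64_t absolute;  /* running absolute count (val_t) */
    uint64_t recorded;  /* value recorded at the last start or read (val_{t-1}) */
} counter_model_t;

/* Models RSMI_CNTR_CMD_START: record the current absolute value. */
void model_start(counter_model_t *c) {
    c->recorded = c->absolute;
}

/* Models device activity: n more events occur. */
void model_tick(counter_model_t *c, uint64_t n) {
    c->absolute += n;
}

/* Models rsmi_counter_read(): return the number of events since the last
 * recorded value, then record the current value for the next read. */
uint64_t model_read(counter_model_t *c) {
    uint64_t delta = c->absolute - c->recorded;
    c->recorded = c->absolute;
    return delta;
}
```

In this model, if 40 events occur after the start and 25 more occur after the first read, the two reads return 40 and 25 rather than 40 and 65. A caller that wants a running total has to sum the values returned by successive reads.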

Example of event counting sequence:

// Determine if RSMI_EVNT_GRP_XGMI is supported for device dv_ind
ret = rsmi_dev_counter_group_supported(dv_ind, RSMI_EVNT_GRP_XGMI);
// See if there are counters available for device dv_ind for event
// group RSMI_EVNT_GRP_XGMI
ret = rsmi_counter_available_counters_get(dv_ind,
                             RSMI_EVNT_GRP_XGMI, &counters_available);
// Assuming RSMI_EVNT_GRP_XGMI is supported and there is at least 1
// counter available for RSMI_EVNT_GRP_XGMI on device dv_ind, create
// an event object for an event of group RSMI_EVNT_GRP_XGMI (e.g.,
// RSMI_EVNT_XGMI_0_BEATS_TX) and get the handle
// (rsmi_event_handle_t).
ret = rsmi_dev_counter_create(dv_ind, RSMI_EVNT_XGMI_0_BEATS_TX,
                                                      &evnt_handle);
// A program that generates the events of interest can be started
// immediately before or after starting the counters.
// Start counting:
ret = rsmi_counter_control(evnt_handle, RSMI_CNTR_CMD_START, NULL);
// Wait...
// Get the number of events since RSMI_CNTR_CMD_START was issued:
ret = rsmi_counter_read(evnt_handle, &value);
// Wait...
// Get the number of events since rsmi_counter_read() was last called:
ret = rsmi_counter_read(evnt_handle, &value);
// Stop counting.
ret = rsmi_counter_control(evnt_handle, RSMI_CNTR_CMD_STOP, NULL);
// Release all resources (e.g., counter and memory resources) associated
// with evnt_handle.
ret = rsmi_dev_counter_destroy(evnt_handle);

Identifiers used in the example:

RSMI_EVNT_GRP_XGMI         Data Fabric (XGMI) related events. (rocm_smi.h:215)
RSMI_EVNT_XGMI_0_BEATS_TX  Data beats sent to neighbor 0; each beat represents 32 bytes. (rocm_smi.h:249)
rsmi_event_handle_t        uintptr_t; handle to performance event counter. (rocm_smi.h:206)
RSMI_CNTR_CMD_START        Start the counter. (rocm_smi.h:293)
RSMI_CNTR_CMD_STOP         Stop the counter. (rocm_smi.h:294)
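
Because each XGMI data beat represents 32 bytes, a beat count obtained from rsmi_counter_read() can be converted to bytes and, given a measurement interval, to an approximate transfer rate. The following is a minimal sketch, assuming the caller measures the interval itself; xgmi_beats_to_bytes() and xgmi_bytes_per_second() are hypothetical helper names, not ROCm SMI functions.

```c
#include <stdint.h>

/* Each XGMI data beat represents 32 bytes (per the event description). */
#define XGMI_BYTES_PER_BEAT 32u

/* Hypothetical helper: convert a beat count to a byte count. */
uint64_t xgmi_beats_to_bytes(uint64_t beats) {
    return beats * XGMI_BYTES_PER_BEAT;
}

/* Hypothetical helper: average bytes per second over an interval that the
 * caller measured. Returns 0.0 for a non-positive interval. */
double xgmi_bytes_per_second(uint64_t beats, double interval_seconds) {
    if (interval_seconds <= 0.0) {
        return 0.0;
    }
    return (double)xgmi_beats_to_bytes(beats) / interval_seconds;
}
```

Since rsmi_counter_read() reports events relative to the previous start or read, the beat count passed to these helpers should be paired with the time elapsed over that same span.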

Function Documentation

◆ rsmi_dev_counter_group_supported()

rsmi_status_t rsmi_dev_counter_group_supported ( uint32_t  dv_ind,
rsmi_event_group_t  group 
)

Tell if an event group is supported by a given device.

Given a device index dv_ind and an event group specifier group, tell whether events of type group are supported by the device associated with dv_ind.

Parameters
[in] dv_ind  device index of device being queried
[in] group   rsmi_event_group_t identifier of group for which support is being queried
Return values
RSMI_STATUS_SUCCESS        if the device associated with dv_ind supports counting events of the type indicated by group
RSMI_STATUS_NOT_SUPPORTED  installed software or hardware does not support this function with the given arguments

◆ rsmi_dev_counter_create()

rsmi_status_t rsmi_dev_counter_create ( uint32_t  dv_ind,
rsmi_event_type_t  type,
rsmi_event_handle_t *  evnt_handle 
)

Create a performance counter object.

Create a performance counter object of type type for the device with a device index of dv_ind, and write a handle to the object to the memory location pointed to by evnt_handle. evnt_handle can be used with other performance event operations. The handle should be deallocated with rsmi_dev_counter_destroy() when no longer needed.

Parameters
[in] dv_ind           a device index
[in] type             the rsmi_event_type_t of performance event to create
[in,out] evnt_handle  A pointer to a rsmi_event_handle_t which will be associated with a newly allocated counter. If this parameter is nullptr, this function will return RSMI_STATUS_INVALID_ARGS if the function is supported with the provided arguments and RSMI_STATUS_NOT_SUPPORTED if it is not supported with the provided arguments.
Return values
RSMI_STATUS_SUCCESS           call was successful
RSMI_STATUS_NOT_SUPPORTED     installed software or hardware does not support this function with the given arguments
RSMI_STATUS_INVALID_ARGS      the provided arguments are not valid
RSMI_STATUS_OUT_OF_RESOURCES  unable to allocate memory for counter
RSMI_STATUS_PERMISSION        function requires root access

◆ rsmi_dev_counter_destroy()

rsmi_status_t rsmi_dev_counter_destroy ( rsmi_event_handle_t  evnt_handle)

Deallocate a performance counter object.

Deallocate the performance counter object with the provided rsmi_event_handle_t evnt_handle.

Parameters
[in] evnt_handle  handle to event object to be deallocated
Return values
RSMI_STATUS_SUCCESS       is returned upon successful call
RSMI_STATUS_INVALID_ARGS  the provided arguments are not valid
RSMI_STATUS_PERMISSION    function requires root access

◆ rsmi_counter_control()

rsmi_status_t rsmi_counter_control ( rsmi_event_handle_t  evt_handle,
rsmi_counter_command_t  cmd,
void *  cmd_args 
)

Issue performance counter control commands.

Issue a command cmd on the event counter associated with the provided handle evt_handle.

Parameters
[in] evt_handle    an event handle
[in] cmd           The event counter command to be issued
[in,out] cmd_args  Currently not used. Should be set to NULL.
Return values
RSMI_STATUS_SUCCESS       is returned upon successful call
RSMI_STATUS_INVALID_ARGS  the provided arguments are not valid
RSMI_STATUS_PERMISSION    function requires root access

◆ rsmi_counter_read()

rsmi_status_t rsmi_counter_read ( rsmi_event_handle_t  evt_handle,
rsmi_counter_value_t *  value 
)

Read the current value of a performance counter.

Read the current counter value of the counter associated with the provided handle evt_handle and write the value to the location pointed to by value.

Parameters
[in] evt_handle  an event handle
[in,out] value   pointer to memory the size of a rsmi_counter_value_t, to which the counter value will be written
Return values
RSMI_STATUS_SUCCESS       is returned upon successful call
RSMI_STATUS_INVALID_ARGS  the provided arguments are not valid
RSMI_STATUS_PERMISSION    function requires root access

◆ rsmi_counter_available_counters_get()

rsmi_status_t rsmi_counter_available_counters_get ( uint32_t  dv_ind,
rsmi_event_group_t  grp,
uint32_t *  available 
)

Get the number of currently available counters.

Given a device index dv_ind, a performance event group grp, and a pointer to a uint32_t available, this function will write the number of grp type counters that are available on the device with index dv_ind to the memory that available points to.

Parameters
[in] dv_ind         a device index
[in] grp            a performance event group
[in,out] available  A pointer to a uint32_t to which the number of available counters will be written
Return values
RSMI_STATUS_SUCCESS       is returned upon successful call
RSMI_STATUS_INVALID_ARGS  the provided arguments are not valid