This page contains proposed changes for a future release of ROCm. Read the latest Linux release of ROCm documentation for your production environments.

Device Profiling API

Data Structures

struct  rocprofiler_counter_value_t
 
struct  rocprofiler_device_profile_metric_t
 

Functions

ROCPROFILER_API rocprofiler_status_t rocprofiler_device_profiling_session_create (const char **counter_names, uint64_t num_counters, rocprofiler_session_id_t *session_id, int cpu_index, int gpu_index) ROCPROFILER_VERSION_9_0
 Create a device profiling session.

ROCPROFILER_API rocprofiler_status_t rocprofiler_device_profiling_session_start (rocprofiler_session_id_t session_id) ROCPROFILER_VERSION_9_0
 Start the device profiling session that was created previously.

ROCPROFILER_API rocprofiler_status_t rocprofiler_device_profiling_session_poll (rocprofiler_session_id_t session_id, rocprofiler_device_profile_metric_t *data) ROCPROFILER_VERSION_9_0
 Poll the device profiling session to read counters from the GPU device.

ROCPROFILER_API rocprofiler_status_t rocprofiler_device_profiling_session_stop (rocprofiler_session_id_t session_id) ROCPROFILER_VERSION_9_0
 Stop the device profiling session that was created previously.

ROCPROFILER_API rocprofiler_status_t rocprofiler_device_profiling_session_destroy (rocprofiler_session_id_t session_id) ROCPROFILER_VERSION_9_0
 Destroy the device profiling session that was created previously.

ROCPROFILER_API rocprofiler_status_t rocprofiler_codeobj_capture_create (rocprofiler_record_id_t *id, rocprofiler_codeobj_capture_mode_t mode, uint64_t userdata)
 Creates a codeobj capture record, returned in ID.

ROCPROFILER_API rocprofiler_status_t rocprofiler_codeobj_capture_get (rocprofiler_record_id_t id, rocprofiler_codeobj_symbols_t *capture)
 API to get the captured codeobj.

ROCPROFILER_API rocprofiler_status_t rocprofiler_codeobj_capture_free (rocprofiler_record_id_t id)
 API to delete a record.

ROCPROFILER_API rocprofiler_status_t rocprofiler_codeobj_capture_start (rocprofiler_record_id_t id)
 Records the current loaded codeobjs and any following loads until stop() is called.

ROCPROFILER_API rocprofiler_status_t rocprofiler_codeobj_capture_stop (rocprofiler_record_id_t id)
 Stops recording of future codeobjs, until start() is called again.
 

Function Documentation

◆ rocprofiler_codeobj_capture_create()

ROCPROFILER_API rocprofiler_status_t rocprofiler_codeobj_capture_create ( rocprofiler_record_id_t *  id,
rocprofiler_codeobj_capture_mode_t  mode,
uint64_t  userdata 
)

Creates a codeobj capture record and returns its handle in id.

Parameters
[out]  id  Receives a handle for the created record.
[in]  mode  Capture mode: capture symbols only, make a copy of codeobjs under memory://, or copy all codeobjs.
[in]  userdata  User data to be returned in the record. For ATT records, this is the kernel address.
Return values
ROCPROFILER_STATUS_SUCCESS  The function has been executed successfully.

◆ rocprofiler_codeobj_capture_free()

ROCPROFILER_API rocprofiler_status_t rocprofiler_codeobj_capture_free ( rocprofiler_record_id_t  id)

API to delete a record.

Invalidates the pointer returned from rocprofiler_codeobj_capture_get.

Parameters
[in]  id  Record handle.
Return values
ROCPROFILER_STATUS_SUCCESS  The function has been executed successfully.

◆ rocprofiler_codeobj_capture_get()

ROCPROFILER_API rocprofiler_status_t rocprofiler_codeobj_capture_get ( rocprofiler_record_id_t  id,
rocprofiler_codeobj_symbols_t *  capture 
)

API to get the captured codeobj.

Each call invalidates the previous pointer for the same ID.

Parameters
[in]  id  Record handle.
[out]  capture  Captured code objects.
Return values
ROCPROFILER_STATUS_SUCCESS  The function has been executed successfully.
ROCPROFILER_STATUS_ERROR_INVALID_ARGUMENTS  Invalid ID.

◆ rocprofiler_codeobj_capture_start()

ROCPROFILER_API rocprofiler_status_t rocprofiler_codeobj_capture_start ( rocprofiler_record_id_t  id)

Records the current loaded codeobjs and any following loads until stop() is called.

Parameters
[in]  id  Record handle.
Return values
ROCPROFILER_STATUS_SUCCESS  The function has been executed successfully.
ROCPROFILER_STATUS_ERROR_INVALID_ARGUMENTS  Invalid ID.

◆ rocprofiler_codeobj_capture_stop()

ROCPROFILER_API rocprofiler_status_t rocprofiler_codeobj_capture_stop ( rocprofiler_record_id_t  id)

Stops recording of future codeobjs, until start() is called again.

Calling stop() immediately after a start() snapshots the current state of loaded codeobjs.

Parameters
[in]  id  Record handle.
Return values
ROCPROFILER_STATUS_SUCCESS  The function has been executed successfully.
ROCPROFILER_STATUS_ERROR_INVALID_ARGUMENTS  Invalid ID.
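
The codeobj capture functions above can be combined into a one-shot snapshot: create a record, call start() and then stop() straight away, read the result with get(), and free the record. The following is a minimal sketch of that sequence, not a definitive implementation. So that it compiles without ROCm installed, the types and functions are local stand-ins that mirror the signatures listed on this page; the enum values, the enumerator name ROCPROFILER_CAPTURE_SYMBOLS_ONLY, the fields of rocprofiler_codeobj_symbols_t, and the stub behaviour (two code objects reported as loaded) are all assumptions for illustration. The function name snapshot_codeobjs_sketch is hypothetical.

```c
#include <stdint.h>

/* ---- Stand-ins (assumptions) so the sketch builds without ROCm. ----
 * With ROCm installed, drop this section and include the rocprofiler
 * header instead. Enum values, struct fields and stub bodies are
 * placeholders, not the library's definitions. */
typedef enum {
  ROCPROFILER_STATUS_SUCCESS = 0,
  ROCPROFILER_STATUS_ERROR_INVALID_ARGUMENTS = 1
} rocprofiler_status_t;
typedef enum { ROCPROFILER_CAPTURE_SYMBOLS_ONLY = 0 } rocprofiler_codeobj_capture_mode_t;
typedef struct { uint64_t handle; } rocprofiler_record_id_t;
typedef struct { uint64_t count; uint64_t userdata; } rocprofiler_codeobj_symbols_t;

static struct { int live; rocprofiler_codeobj_symbols_t captured; } stub_record;

static int stub_valid(rocprofiler_record_id_t id) {
  return stub_record.live && id.handle == 1;
}
static rocprofiler_status_t rocprofiler_codeobj_capture_create(
    rocprofiler_record_id_t* id, rocprofiler_codeobj_capture_mode_t mode,
    uint64_t userdata) {
  (void)mode;
  if (!id) return ROCPROFILER_STATUS_ERROR_INVALID_ARGUMENTS;
  stub_record.live = 1;
  stub_record.captured.count = 0;
  stub_record.captured.userdata = userdata;
  id->handle = 1;
  return ROCPROFILER_STATUS_SUCCESS;
}
static rocprofiler_status_t rocprofiler_codeobj_capture_start(rocprofiler_record_id_t id) {
  if (!stub_valid(id)) return ROCPROFILER_STATUS_ERROR_INVALID_ARGUMENTS;
  stub_record.captured.count = 2; /* pretend two code objects are loaded */
  return ROCPROFILER_STATUS_SUCCESS;
}
static rocprofiler_status_t rocprofiler_codeobj_capture_stop(rocprofiler_record_id_t id) {
  if (!stub_valid(id)) return ROCPROFILER_STATUS_ERROR_INVALID_ARGUMENTS;
  return ROCPROFILER_STATUS_SUCCESS;
}
static rocprofiler_status_t rocprofiler_codeobj_capture_get(
    rocprofiler_record_id_t id, rocprofiler_codeobj_symbols_t* capture) {
  if (!stub_valid(id) || !capture) return ROCPROFILER_STATUS_ERROR_INVALID_ARGUMENTS;
  *capture = stub_record.captured;
  return ROCPROFILER_STATUS_SUCCESS;
}
static rocprofiler_status_t rocprofiler_codeobj_capture_free(rocprofiler_record_id_t id) {
  if (!stub_valid(id)) return ROCPROFILER_STATUS_ERROR_INVALID_ARGUMENTS;
  stub_record.live = 0;
  return ROCPROFILER_STATUS_SUCCESS;
}

/* ---- The sketch itself ---- */
/* Snapshot the currently loaded code objects. Returns 0 on success and
 * writes the number of captured code objects to *out_count. */
int snapshot_codeobjs_sketch(uint64_t* out_count) {
  rocprofiler_record_id_t record;
  rocprofiler_codeobj_symbols_t capture;
  int rc = -1;

  if (rocprofiler_codeobj_capture_create(&record, ROCPROFILER_CAPTURE_SYMBOLS_ONLY,
                                         0 /* userdata */) != ROCPROFILER_STATUS_SUCCESS)
    return -1;

  /* start() followed at once by stop() records only what is loaded now. */
  if (rocprofiler_codeobj_capture_start(record) == ROCPROFILER_STATUS_SUCCESS &&
      rocprofiler_codeobj_capture_stop(record) == ROCPROFILER_STATUS_SUCCESS &&
      rocprofiler_codeobj_capture_get(record, &capture) == ROCPROFILER_STATUS_SUCCESS) {
    /* Copy out what is needed: a later get() or free() invalidates it. */
    *out_count = capture.count;
    rc = 0;
  }

  /* Always release the record, even when a step above failed. */
  rocprofiler_codeobj_capture_free(record);
  return rc;
}
```

The sketch copies the data it needs before calling free(), because this page states that free() and any later get() on the same ID invalidate the earlier result.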

◆ rocprofiler_device_profiling_session_create()

ROCPROFILER_API rocprofiler_status_t rocprofiler_device_profiling_session_create ( const char **  counter_names,
uint64_t  num_counters,
rocprofiler_session_id_t *  session_id,
int  cpu_index,
int  gpu_index 
)

Create a device profiling session.

A device profiling session lets the user collect counters from the GPU device regardless of which applications are running on it. Unlike application profiling, a device profiling session is independent of the processes and threads running on the host and provides low-level profiling information directly.

Parameters
[in]  counter_names  The names of the counters to be collected.
[in]  num_counters  The number of counters specified for collection.
[out]  session_id  Pointer that receives the created session id.
[in]  cpu_index  Index of the CPU to be used.
[in]  gpu_index  Index of the GPU to be used.
Return values
ROCPROFILER_STATUS_SUCCESS  The function has been executed successfully.

◆ rocprofiler_device_profiling_session_destroy()

ROCPROFILER_API rocprofiler_status_t rocprofiler_device_profiling_session_destroy ( rocprofiler_session_id_t  session_id)

Destroy the device profiling session that was created previously.

Parameters
[in]  session_id  Session id of the session to destroy.
Return values
ROCPROFILER_STATUS_SUCCESS  The function has been executed successfully.

◆ rocprofiler_device_profiling_session_poll()

ROCPROFILER_API rocprofiler_status_t rocprofiler_device_profiling_session_poll ( rocprofiler_session_id_t  session_id,
rocprofiler_device_profile_metric_t *  data 
)

Poll the device profiling session to read counters from the GPU device.

This reads the counter values from the GPU device at the instant the API is called. The call is blocking: the calling thread waits until the counter values have been read out.

Parameters
[in]  session_id  Session id of the session to poll.
[out]  data  Records of counter data read out from the device.
Return values
ROCPROFILER_STATUS_SUCCESS  The function has been executed successfully.

◆ rocprofiler_device_profiling_session_start()

ROCPROFILER_API rocprofiler_status_t rocprofiler_device_profiling_session_start ( rocprofiler_session_id_t  session_id)

Start the device profiling session that was created previously.

This enables the GPU device to start incrementing counters.

Parameters
[in]  session_id  Session id of the session to start.
Return values
ROCPROFILER_STATUS_SUCCESS  The function has been executed successfully.

◆ rocprofiler_device_profiling_session_stop()

ROCPROFILER_API rocprofiler_status_t rocprofiler_device_profiling_session_stop ( rocprofiler_session_id_t  session_id)

Stop the device profiling session that was created previously.

This tells the GPU device to stop collecting counters.

Parameters
[in]  session_id  Session id of the session to stop.
Return values
ROCPROFILER_STATUS_SUCCESS  The function has been executed successfully.
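
Taken together, the five session functions above suggest a lifecycle of create, start, poll (as often as needed), stop, and destroy. The following is a minimal sketch of that order with a status check after each call, not a definitive implementation. So that it compiles without ROCm installed, the types and functions are local stand-ins that mirror the signatures listed on this page. The enum values, the fields of rocprofiler_counter_value_t and rocprofiler_device_profile_metric_t, the stub behaviour (every counter reads 0), and the requirement that data holds one entry per requested counter are assumptions. The counter names are illustrative and may not exist on a given GPU. CHECK and run_device_profiling_sketch are hypothetical helper names.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* ---- Stand-ins (assumptions) so the sketch builds without ROCm. ----
 * With ROCm installed, drop this section and include the rocprofiler
 * header instead. Enum values, struct fields and stub bodies are
 * placeholders, not the library's definitions. */
typedef enum {
  ROCPROFILER_STATUS_SUCCESS = 0,
  ROCPROFILER_STATUS_ERROR_INVALID_ARGUMENTS = 1
} rocprofiler_status_t;
typedef struct { uint64_t handle; } rocprofiler_session_id_t;
typedef struct { double value; } rocprofiler_counter_value_t;
typedef struct {
  char metric_name[64];
  rocprofiler_counter_value_t value;
} rocprofiler_device_profile_metric_t;

static const char** stub_names; /* remembered by the stubs only */
static uint64_t stub_count;

static rocprofiler_status_t rocprofiler_device_profiling_session_create(
    const char** counter_names, uint64_t num_counters,
    rocprofiler_session_id_t* session_id, int cpu_index, int gpu_index) {
  (void)cpu_index; (void)gpu_index;
  if (!counter_names || !session_id || num_counters == 0)
    return ROCPROFILER_STATUS_ERROR_INVALID_ARGUMENTS;
  stub_names = counter_names;
  stub_count = num_counters;
  session_id->handle = 1;
  return ROCPROFILER_STATUS_SUCCESS;
}
static rocprofiler_status_t rocprofiler_device_profiling_session_start(
    rocprofiler_session_id_t session_id) {
  (void)session_id;
  return ROCPROFILER_STATUS_SUCCESS;
}
static rocprofiler_status_t rocprofiler_device_profiling_session_poll(
    rocprofiler_session_id_t session_id, rocprofiler_device_profile_metric_t* data) {
  (void)session_id;
  if (!data) return ROCPROFILER_STATUS_ERROR_INVALID_ARGUMENTS;
  for (uint64_t i = 0; i < stub_count; ++i) { /* fake readout: all zeros */
    strncpy(data[i].metric_name, stub_names[i], sizeof data[i].metric_name - 1);
    data[i].metric_name[sizeof data[i].metric_name - 1] = '\0';
    data[i].value.value = 0.0;
  }
  return ROCPROFILER_STATUS_SUCCESS;
}
static rocprofiler_status_t rocprofiler_device_profiling_session_stop(
    rocprofiler_session_id_t session_id) {
  (void)session_id;
  return ROCPROFILER_STATUS_SUCCESS;
}
static rocprofiler_status_t rocprofiler_device_profiling_session_destroy(
    rocprofiler_session_id_t session_id) {
  (void)session_id;
  return ROCPROFILER_STATUS_SUCCESS;
}

/* ---- The sketch itself ---- */
/* Hypothetical helper: bail out on the first call that does not succeed.
 * A production version would also destroy the session on this path. */
#define CHECK(call)                                   \
  do {                                                \
    if ((call) != ROCPROFILER_STATUS_SUCCESS) {       \
      fprintf(stderr, "failed: %s\n", #call);         \
      return -1;                                      \
    }                                                 \
  } while (0)

enum { NUM_COUNTERS = 2, NUM_POLLS = 3 };

/* Runs create -> start -> poll x3 -> stop -> destroy. Returns 0 on success. */
int run_device_profiling_sketch(void) {
  /* Illustrative counter names; availability depends on the GPU. */
  const char* counters[NUM_COUNTERS] = {"GRBM_COUNT", "GRBM_GUI_ACTIVE"};
  rocprofiler_session_id_t session;
  /* Assumption: poll fills one entry per requested counter. */
  rocprofiler_device_profile_metric_t data[NUM_COUNTERS];

  CHECK(rocprofiler_device_profiling_session_create(
      counters, NUM_COUNTERS, &session, 0 /* cpu_index */, 0 /* gpu_index */));
  CHECK(rocprofiler_device_profiling_session_start(session));

  for (int poll = 0; poll < NUM_POLLS; ++poll) {
    /* Blocks the calling thread until the values have been read out. */
    CHECK(rocprofiler_device_profiling_session_poll(session, data));
    for (int i = 0; i < NUM_COUNTERS; ++i)
      printf("poll %d: %s = %f\n", poll, data[i].metric_name, data[i].value.value);
  }

  CHECK(rocprofiler_device_profiling_session_stop(session));
  CHECK(rocprofiler_device_profiling_session_destroy(session));
  return 0;
}
```

Because poll is a blocking call, a real program would probably run a loop like this on a dedicated thread and sleep between polls. Any library-wide initialization that rocprofiler may require is outside the scope of this page and is omitted here.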