Error Queries

AMD SMI: Error Queries
Functions

amdsmi_status_t amdsmi_get_gpu_ecc_status (amdsmi_processor_handle processor_handle, amdsmi_gpu_block_t block, amdsmi_ras_err_state_t *state)
 Retrieve the ECC status for a GPU block. Not supported on virtual machine guests.
 
amdsmi_status_t amdsmi_status_code_to_string (amdsmi_status_t status, const char **status_string)
 Get a description of a provided AMDSMI error status.
 

Detailed Description

These functions provide error information about AMDSMI calls as well as device errors.

Function Documentation

◆ amdsmi_get_gpu_ecc_status()

amdsmi_status_t amdsmi_get_gpu_ecc_status ( amdsmi_processor_handle  processor_handle,
amdsmi_gpu_block_t  block,
amdsmi_ras_err_state_t *  state 
)

Retrieve the ECC status for a GPU block. Not supported on virtual machine guests.

See RAS Error Count sysfs Interface (AMDGPU RAS Support - Linux Kernel documentation) to learn how these error counts are accessed.

Platform:
gpu_bm_linux

Given a processor handle processor_handle, an amdsmi_gpu_block_t block and a pointer to an amdsmi_ras_err_state_t state, this function will write the current state for the GPU block indicated by block to memory pointed to by state.

Parameters
[in] processor_handle  A processor handle.
[in] block  The block for which the ECC status should be retrieved.
[in,out] state  A pointer to an amdsmi_ras_err_state_t to which the ECC state should be written. If this parameter is nullptr, this function will return AMDSMI_STATUS_INVAL if the function is supported with the provided arguments, and AMDSMI_STATUS_NOT_SUPPORTED if it is not supported with the provided arguments.
Returns
amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure
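
The nullptr behavior described for state suggests a two-step pattern: probe for support first, then read the state. The following is a minimal sketch of that pattern, not an official example. To keep it compilable without the library, the types and the amdsmi_get_gpu_ecc_status stub below are stand-ins that only mimic the behavior documented on this page; the DEMO_* enumerators, the non-zero status values, and the helpers ecc_query_supported and read_ecc_state are hypothetical. Real code would include the AMD SMI header and delete the stand-in section.

```c
#include <stddef.h>
#include <stdio.h>

/* ---- Stand-ins (assumptions): remove when building against AMD SMI. ---- */
typedef enum {
    AMDSMI_STATUS_SUCCESS = 0,      /* zero on success, per this page */
    AMDSMI_STATUS_INVAL,            /* numeric value is an assumption */
    AMDSMI_STATUS_NOT_SUPPORTED     /* numeric value is an assumption */
} amdsmi_status_t;

typedef void *amdsmi_processor_handle;

/* Hypothetical block and state values, for illustration only. */
typedef enum { DEMO_BLOCK_SUPPORTED = 1, DEMO_BLOCK_UNSUPPORTED = 2 } amdsmi_gpu_block_t;
typedef enum { DEMO_STATE_UNKNOWN = 0, DEMO_STATE_ENABLED = 1 } amdsmi_ras_err_state_t;

/* Stub with the documented signature. It mimics the documented nullptr
 * behavior: NOT_SUPPORTED for an unsupported block, INVAL when state is
 * NULL but the query is otherwise supported. */
static amdsmi_status_t amdsmi_get_gpu_ecc_status(amdsmi_processor_handle processor_handle,
                                                 amdsmi_gpu_block_t block,
                                                 amdsmi_ras_err_state_t *state)
{
    (void)processor_handle;
    if (block != DEMO_BLOCK_SUPPORTED)
        return AMDSMI_STATUS_NOT_SUPPORTED;
    if (state == NULL)
        return AMDSMI_STATUS_INVAL;
    *state = DEMO_STATE_ENABLED;
    return AMDSMI_STATUS_SUCCESS;
}
/* ---- End of stand-ins. ---- */

/* Hypothetical helper: returns 1 if the ECC status query is supported for
 * this handle/block pair. Passing NULL for state is the documented probe. */
static int ecc_query_supported(amdsmi_processor_handle handle, amdsmi_gpu_block_t block)
{
    return amdsmi_get_gpu_ecc_status(handle, block, NULL) == AMDSMI_STATUS_INVAL;
}

/* Hypothetical helper: returns 1 and fills *out on success, 0 on failure. */
static int read_ecc_state(amdsmi_processor_handle handle,
                          amdsmi_gpu_block_t block,
                          amdsmi_ras_err_state_t *out)
{
    amdsmi_status_t rc = amdsmi_get_gpu_ecc_status(handle, block, out);
    if (rc != AMDSMI_STATUS_SUCCESS) {
        fprintf(stderr, "ECC status query failed (status %d)\n", (int)rc);
        return 0;
    }
    return 1;
}
```

A caller would typically check ecc_query_supported once per block and only then call read_ecc_state, so that an unsupported block is distinguished from a genuine query failure.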

◆ amdsmi_status_code_to_string()

amdsmi_status_t amdsmi_status_code_to_string ( amdsmi_status_t  status,
const char **  status_string 
)

Get a description of a provided AMDSMI error status.

Platform:
gpu_bm_linux
host
cpu_bm
guest_1vf
guest_mvf

Sets status_string, the provided pointer to a const char *, to point to a string describing the provided error code status.

Parameters
[in] status  The error status for which a description is desired.
[in,out] status_string  A pointer to a const char * which will be made to point to a description of the provided error code.
Returns
amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure
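
Because this function itself returns a status, a small wrapper with a fallback string can make it easier to use in log messages. The sketch below illustrates one way to do that and is not taken from the library. The enum and the amdsmi_status_code_to_string stub are stand-ins so the block compiles without AMD SMI; the "demo: ..." strings are placeholders rather than the library's real messages, the stub's handling of unknown codes is an assumption, and describe_status is a hypothetical helper.

```c
#include <stddef.h>

/* ---- Stand-ins (assumptions): remove when building against AMD SMI. ---- */
typedef enum {
    AMDSMI_STATUS_SUCCESS = 0,      /* zero on success, per this page */
    AMDSMI_STATUS_INVAL,            /* numeric value is an assumption */
    AMDSMI_STATUS_NOT_SUPPORTED     /* numeric value is an assumption */
} amdsmi_status_t;

/* Stub with the documented signature. The strings are placeholders, and
 * returning INVAL for an unknown code is an assumption of this sketch. */
static amdsmi_status_t amdsmi_status_code_to_string(amdsmi_status_t status,
                                                    const char **status_string)
{
    switch (status) {
    case AMDSMI_STATUS_SUCCESS:
        *status_string = "demo: success";
        break;
    case AMDSMI_STATUS_INVAL:
        *status_string = "demo: invalid arguments";
        break;
    case AMDSMI_STATUS_NOT_SUPPORTED:
        *status_string = "demo: not supported";
        break;
    default:
        return AMDSMI_STATUS_INVAL;
    }
    return AMDSMI_STATUS_SUCCESS;
}
/* ---- End of stand-ins. ---- */

/* Hypothetical helper: always returns a printable string, falling back to
 * a fixed message if the lookup fails or leaves the pointer unset. */
static const char *describe_status(amdsmi_status_t status)
{
    const char *text = NULL;
    if (amdsmi_status_code_to_string(status, &text) != AMDSMI_STATUS_SUCCESS || text == NULL)
        return "unknown AMDSMI status";
    return text;
}
```

With a helper like this, the result of any other AMD SMI call can be passed straight to describe_status when reporting an error, without a second status check at each call site.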