Error Queries
Functions | |
| amdsmi_status_t | amdsmi_get_gpu_ecc_count (amdsmi_processor_handle processor_handle, amdsmi_gpu_block_t block, amdsmi_error_count_t *ec) |
| Retrieve the error counts for a GPU block. Not supported on virtual machine guests. | |
| amdsmi_status_t | amdsmi_get_gpu_ecc_enabled (amdsmi_processor_handle processor_handle, uint64_t *enabled_blocks) |
| Retrieve the enabled ECC bit-mask. Not supported on virtual machine guests. | |
| amdsmi_status_t | amdsmi_get_gpu_ecc_status (amdsmi_processor_handle processor_handle, amdsmi_gpu_block_t block, amdsmi_ras_err_state_t *state) |
| Retrieve the ECC status for a GPU block. Not supported on virtual machine guests. | |
| amdsmi_status_t | amdsmi_status_code_to_string (amdsmi_status_t status, const char **status_string) |
| Get a description of a provided AMDSMI error status. | |
Detailed Description
These functions provide error information about AMDSMI calls as well as device errors.
Function Documentation
◆ amdsmi_get_gpu_ecc_count()
| amdsmi_status_t amdsmi_get_gpu_ecc_count | ( | amdsmi_processor_handle | processor_handle, |
| amdsmi_gpu_block_t | block, | ||
| amdsmi_error_count_t * | ec | ||
| ) |
Retrieve the error counts for a GPU block. Not supported on virtual machine guests.
Given a processor handle processor_handle, an amdsmi_gpu_block_t block and a pointer to an amdsmi_error_count_t ec, this function will write the error count values for the GPU block indicated by block to memory pointed to by ec.
- Parameters
-
| [in] | processor_handle | A processor handle. |
| [in] | block | The block for which error counts should be retrieved. |
| [in,out] | ec | A pointer to an amdsmi_error_count_t to which the error counts should be written. If this parameter is nullptr, this function returns AMDSMI_STATUS_INVAL if the function is supported with the provided arguments, and AMDSMI_STATUS_NOT_SUPPORTED if it is not. |
- Returns
- amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure
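The nullptr behavior described for ec can double as a support probe before any counts are read. The following is a minimal sketch under stated assumptions, not a definitive implementation: the typedefs, enumerator values, and the body of amdsmi_get_gpu_ecc_count below are stand-ins written so the example builds without the library, and the field names correctable_count and uncorrectable_count are assumed rather than taken from this page. The helpers ecc_count_supported and report_ecc_counts are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* ---- Stand-ins (assumptions) so the sketch builds without AMD SMI. ----
 * With the real library, include its header instead and delete this part. */
typedef void *amdsmi_processor_handle;
typedef enum {
    AMDSMI_STATUS_SUCCESS = 0,
    AMDSMI_STATUS_INVAL,           /* numeric values are placeholders */
    AMDSMI_STATUS_NOT_SUPPORTED
} amdsmi_status_t;
typedef enum {
    AMDSMI_GPU_BLOCK_UMC  = 0x1,   /* bit values are placeholders */
    AMDSMI_GPU_BLOCK_SDMA = 0x2
} amdsmi_gpu_block_t;
typedef struct {
    uint64_t correctable_count;    /* assumed field name */
    uint64_t uncorrectable_count;  /* assumed field name */
} amdsmi_error_count_t;

/* Mock that mimics the documented contract and returns made-up counts.
 * Only the UMC block is treated as supported here. */
static amdsmi_status_t amdsmi_get_gpu_ecc_count(amdsmi_processor_handle h,
                                                amdsmi_gpu_block_t block,
                                                amdsmi_error_count_t *ec)
{
    (void)h;
    if (block != AMDSMI_GPU_BLOCK_UMC) return AMDSMI_STATUS_NOT_SUPPORTED;
    if (ec == NULL) return AMDSMI_STATUS_INVAL;
    ec->correctable_count = 3;
    ec->uncorrectable_count = 0;
    return AMDSMI_STATUS_SUCCESS;
}
/* ---- End of stand-ins. ---- */

/* Probe: a NULL output pointer yields AMDSMI_STATUS_INVAL when the query
 * is supported for these arguments. Returns 1 if supported, else 0. */
static int ecc_count_supported(amdsmi_processor_handle h,
                               amdsmi_gpu_block_t block)
{
    return amdsmi_get_gpu_ecc_count(h, block, NULL) == AMDSMI_STATUS_INVAL;
}

/* Reads the counts for `block` into `out` and prints them on success.
 * Returns the status of the probe or of the read. */
static amdsmi_status_t report_ecc_counts(amdsmi_processor_handle h,
                                         amdsmi_gpu_block_t block,
                                         amdsmi_error_count_t *out)
{
    assert(out != NULL);
    if (!ecc_count_supported(h, block))
        return AMDSMI_STATUS_NOT_SUPPORTED;

    amdsmi_status_t st = amdsmi_get_gpu_ecc_count(h, block, out);
    if (st == AMDSMI_STATUS_SUCCESS)
        printf("correctable=%llu uncorrectable=%llu\n",
               (unsigned long long)out->correctable_count,
               (unsigned long long)out->uncorrectable_count);
    return st;
}
```

Probing first keeps a "not supported" result distinct from a genuine read failure, at the cost of one extra call per block.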
◆ amdsmi_get_gpu_ecc_enabled()
| amdsmi_status_t amdsmi_get_gpu_ecc_enabled | ( | amdsmi_processor_handle | processor_handle, |
| uint64_t * | enabled_blocks | ||
| ) |
Retrieve the enabled ECC bit-mask. Not supported on virtual machine guests.
Given a processor handle processor_handle and a pointer to a uint64_t enabled_blocks, this function will write bits to memory pointed to by enabled_blocks. Upon a successful call, enabled_blocks can then be AND'd with elements of the amdsmi_gpu_block_t enumeration to determine if the corresponding block has ECC enabled. Note that whether a block has ECC enabled in the device is independent of whether there is kernel support for error counting for that block: a block may have ECC enabled even though the kernel does not support reading its error counters.
- Parameters
-
| [in] | processor_handle | A processor handle. |
| [in,out] | enabled_blocks | A pointer to a uint64_t to which the enabled blocks bits will be written. If this parameter is nullptr, this function returns AMDSMI_STATUS_INVAL if the function is supported with the provided arguments, and AMDSMI_STATUS_NOT_SUPPORTED if it is not. |
- Returns
- amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure
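The AND test on the returned bit-mask can be sketched as follows. This is an illustration only: the example_block enumerators and their bit values are assumptions standing in for amdsmi_gpu_block_t, and the helper names ecc_enabled_for_block and print_enabled_blocks are hypothetical. The mask passed in is expected to be the value written to enabled_blocks by a successful call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed block bits, for illustration only. In real code, use the
 * amdsmi_gpu_block_t enumerators from the AMD SMI header instead. */
enum example_block {
    EXAMPLE_BLOCK_UMC  = 0x1,
    EXAMPLE_BLOCK_SDMA = 0x2,
    EXAMPLE_BLOCK_GFX  = 0x4
};

/* Returns 1 if the bit for `block` is set in `enabled_blocks`, else 0. */
static int ecc_enabled_for_block(uint64_t enabled_blocks, uint64_t block)
{
    assert(block != 0);  /* a zero bit could never match */
    return (enabled_blocks & block) != 0;
}

/* Prints one line per example block and returns how many are enabled. */
static int print_enabled_blocks(uint64_t enabled_blocks)
{
    static const struct { const char *name; uint64_t bit; } blocks[] = {
        { "UMC",  EXAMPLE_BLOCK_UMC  },
        { "SDMA", EXAMPLE_BLOCK_SDMA },
        { "GFX",  EXAMPLE_BLOCK_GFX  }
    };
    int enabled = 0;

    for (size_t i = 0; i < sizeof blocks / sizeof blocks[0]; ++i) {
        int on = ecc_enabled_for_block(enabled_blocks, blocks[i].bit);
        printf("%-4s ECC %s\n", blocks[i].name, on ? "enabled" : "disabled");
        enabled += on;
    }
    return enabled;
}
```

A set bit only says that ECC is enabled for that block; whether its error counters can be read is a separate question.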
◆ amdsmi_get_gpu_ecc_status()
| amdsmi_status_t amdsmi_get_gpu_ecc_status | ( | amdsmi_processor_handle | processor_handle, |
| amdsmi_gpu_block_t | block, | ||
| amdsmi_ras_err_state_t * | state | ||
| ) |
Retrieve the ECC status for a GPU block. Not supported on virtual machine guests.
Given a processor handle processor_handle, an amdsmi_gpu_block_t block and a pointer to an amdsmi_ras_err_state_t state, this function will write the current state for the GPU block indicated by block to memory pointed to by state.
- Parameters
-
| [in] | processor_handle | A processor handle. |
| [in] | block | The block for which the ECC status should be retrieved. |
| [in,out] | state | A pointer to an amdsmi_ras_err_state_t to which the ECC state should be written. If this parameter is nullptr, this function returns AMDSMI_STATUS_INVAL if the function is supported with the provided arguments, and AMDSMI_STATUS_NOT_SUPPORTED if it is not. |
- Returns
- amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure
◆ amdsmi_status_code_to_string()
| amdsmi_status_t amdsmi_status_code_to_string | ( | amdsmi_status_t | status, |
| const char ** | status_string | ||
| ) |
Get a description of a provided AMDSMI error status.
Sets the const char * pointed to by status_string so that it points to a string describing the provided error code status.
- Parameters
-
| [in] | status | The error status for which a description is desired. |
| [in,out] | status_string | A pointer to a const char * which will be made to point to a description of the provided error code. |
- Returns
- amdsmi_status_t | AMDSMI_STATUS_SUCCESS on success, non-zero on failure
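A common use of this function is turning a failing status into a readable log line. The sketch below is one possible pattern, not the library's prescribed usage: the amdsmi_status_t values and the body of amdsmi_status_code_to_string are stand-ins (with made-up "mock:" strings) so the example builds without the library, and status_text and check_status are hypothetical helper names.

```c
#include <stddef.h>
#include <stdio.h>

/* ---- Stand-ins (assumptions) so the sketch builds without AMD SMI. ----
 * With the real library, include its header instead and delete this part. */
typedef enum {
    AMDSMI_STATUS_SUCCESS = 0,
    AMDSMI_STATUS_INVAL,           /* numeric values are placeholders */
    AMDSMI_STATUS_NOT_SUPPORTED
} amdsmi_status_t;

/* Mock with invented strings; the real library supplies its own text. */
static amdsmi_status_t amdsmi_status_code_to_string(amdsmi_status_t status,
                                                    const char **status_string)
{
    if (status_string == NULL) return AMDSMI_STATUS_INVAL;
    switch (status) {
    case AMDSMI_STATUS_SUCCESS:       *status_string = "mock: success"; break;
    case AMDSMI_STATUS_INVAL:         *status_string = "mock: invalid argument"; break;
    case AMDSMI_STATUS_NOT_SUPPORTED: *status_string = "mock: not supported"; break;
    default:                          *status_string = "mock: unknown"; break;
    }
    return AMDSMI_STATUS_SUCCESS;
}
/* ---- End of stand-ins. ---- */

/* Returns a description of `status`, or a fixed fallback if the lookup
 * itself fails, so callers never receive a NULL pointer. */
static const char *status_text(amdsmi_status_t status)
{
    const char *text = NULL;

    if (amdsmi_status_code_to_string(status, &text) != AMDSMI_STATUS_SUCCESS
        || text == NULL)
        return "unknown AMDSMI status";
    return text;
}

/* Logs a failing call to stderr. Returns 1 on success, 0 on failure. */
static int check_status(const char *what, amdsmi_status_t status)
{
    if (status == AMDSMI_STATUS_SUCCESS)
        return 1;
    fprintf(stderr, "%s failed: %s (code %d)\n",
            what, status_text(status), (int)status);
    return 0;
}
```

Wrapping the lookup in a helper with a fallback string means the error path cannot itself dereference an unset pointer.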