hipBLASLt API Reference
hipblasLtCreate()
hipblasStatus_t hipblasLtCreate(hipblasLtHandle_t *handle)
- Create a hipBLASLt handle.
- This function initializes the hipBLASLt library and creates a handle to an opaque structure holding the hipBLASLt library context. It allocates light hardware resources on the host and device, and must be called before any other hipBLASLt library call. The hipBLASLt library context is tied to the current HIP device. To use the library on multiple devices, create one hipBLASLt handle for each device.
- Parameters:
- handle – [out] Pointer to the allocated hipBLASLt handle for the created hipBLASLt context. 
- Return values:
- HIPBLAS_STATUS_SUCCESS – The allocation completed successfully. 
- HIPBLAS_STATUS_INVALID_VALUE – handle == NULL.
 
 
hipblasLtDestroy()
hipblasStatus_t hipblasLtDestroy(const hipblasLtHandle_t handle)
- Destroy a hipBLASLt handle.
- This function releases the hardware resources used by the hipBLASLt library. It is usually the last call made to the hipBLASLt library with a particular handle. Because hipblasLtCreate() allocates internal resources, and releasing them through hipblasLtDestroy() implicitly calls hipDeviceSynchronize(), it is recommended to minimize the number of hipblasLtCreate()/hipblasLtDestroy() calls.
- Parameters:
- handle – [in] Pointer to the hipBLASLt handle to be destroyed. 
- Return values:
- HIPBLAS_STATUS_SUCCESS – The hipBLASLt context was successfully destroyed. 
- HIPBLAS_STATUS_NOT_INITIALIZED – The hipBLASLt library was not initialized. 
- HIPBLAS_STATUS_INVALID_VALUE – handle == NULL.
 
 
hipblasLtMatrixLayoutCreate()
hipblasStatus_t hipblasLtMatrixLayoutCreate(hipblasLtMatrixLayout_t *matLayout, hipDataType type, uint64_t rows, uint64_t cols, int64_t ld)
- Create a matrix layout descriptor.
- This function creates a matrix layout descriptor by allocating the memory needed to hold its opaque structure.
- Parameters:
- matLayout – [out] Pointer to the structure holding the matrix layout descriptor created by this function. See hipblasLtMatrixLayout_t.
- type – [in] Enumerant that specifies the data precision for the matrix layout descriptor this function creates. See hipDataType. 
- rows – [in] Number of rows of the matrix. 
- cols – [in] Number of columns of the matrix. 
- ld – [in] The leading dimension of the matrix. In column-major layout, this is the number of elements to skip to reach the next column, so ld >= rows.
 
- Return values:
- HIPBLAS_STATUS_SUCCESS – If the descriptor was created successfully. 
- HIPBLAS_STATUS_ALLOC_FAILED – If the memory could not be allocated. 
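The role of ld in a column-major layout can be illustrated with a small host-side sketch. It does not call hipBLASLt; the helper names colMajorOffset and colMajorElementCount are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: linear offset of element (row, col) in a column-major
// matrix whose consecutive columns start `ld` elements apart in memory.
inline std::size_t colMajorOffset(std::uint64_t row, std::uint64_t col, std::int64_t ld)
{
    return static_cast<std::size_t>(col) * static_cast<std::size_t>(ld)
           + static_cast<std::size_t>(row);
}

// Hypothetical helper: minimum number of elements a buffer must hold for a
// rows x cols column-major matrix with leading dimension ld.
// Assumes cols >= 1 and ld >= rows.
inline std::size_t colMajorElementCount(std::uint64_t rows, std::uint64_t cols, std::int64_t ld)
{
    assert(cols >= 1 && ld >= static_cast<std::int64_t>(rows));
    return static_cast<std::size_t>(ld) * static_cast<std::size_t>(cols - 1)
           + static_cast<std::size_t>(rows);
}
```

With ld larger than rows, the gap between columns is padding: a 3 x 2 matrix with ld = 4 spans 7 elements rather than 6.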
 
 
hipblasLtMatrixLayoutDestroy()
hipblasStatus_t hipblasLtMatrixLayoutDestroy(const hipblasLtMatrixLayout_t matLayout)
- Destroy a matrix layout descriptor.
- This function destroys a previously created matrix layout descriptor object.
- Parameters:
- matLayout – [in] Pointer to the structure holding the matrix layout descriptor that should be destroyed by this function. See hipblasLtMatrixLayout_t.
- Return values:
- HIPBLAS_STATUS_SUCCESS – If the operation was successful. 
 
hipblasLtMatrixLayoutSetAttribute()
hipblasStatus_t hipblasLtMatrixLayoutSetAttribute(hipblasLtMatrixLayout_t matLayout, hipblasLtMatrixLayoutAttribute_t attr, const void *buf, size_t sizeInBytes)
- Set an attribute of a matrix descriptor.
- This function sets the value of the specified attribute belonging to a previously created matrix descriptor.
- Parameters:
- matLayout – [in] Pointer to the previously created structure holding the matrix descriptor modified by this function. See hipblasLtMatrixLayout_t.
- attr – [in] The attribute that will be set by this function. See hipblasLtMatrixLayoutAttribute_t. 
- buf – [in] The value to which the specified attribute should be set. 
- sizeInBytes – [in] Size of buf buffer (in bytes) for verification. 
 
- Return values:
- HIPBLAS_STATUS_SUCCESS – If the attribute was set successfully.
- HIPBLAS_STATUS_INVALID_VALUE – If buf is NULL or sizeInBytes doesn’t match the size of the internal storage for the selected attribute.
 
 
hipblasLtMatrixLayoutGetAttribute()
hipblasStatus_t hipblasLtMatrixLayoutGetAttribute(hipblasLtMatrixLayout_t matLayout, hipblasLtMatrixLayoutAttribute_t attr, void *buf, size_t sizeInBytes, size_t *sizeWritten)
- Query an attribute of a matrix descriptor.
- This function returns the value of the queried attribute belonging to a previously created matrix descriptor.
- Parameters:
- matLayout – [in] Pointer to the previously created structure holding the matrix descriptor queried by this function. See hipblasLtMatrixLayout_t. 
- attr – [in] The attribute that will be retrieved by this function. See hipblasLtMatrixLayoutAttribute_t. 
- buf – [out] Memory address containing the attribute value retrieved by this function. 
- sizeInBytes – [in] Size of buf buffer (in bytes) for verification.
- sizeWritten – [out] Valid only when the return value is HIPBLAS_STATUS_SUCCESS. If sizeInBytes is non-zero: then sizeWritten is the number of bytes actually written; if sizeInBytes is 0: then sizeWritten is the number of bytes needed to write full contents. 
 
- Return values:
- HIPBLAS_STATUS_SUCCESS – If attribute’s value was successfully written to user memory. 
- HIPBLAS_STATUS_INVALID_VALUE – If sizeInBytes is 0 and sizeWritten is NULL, or if sizeInBytes is non-zero and buf is NULL, or if sizeInBytes doesn’t match the size of the internal storage for the selected attribute.
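The sizeInBytes/sizeWritten contract described above suggests a two-step read: first query the required size, then fetch the value. The sketch below models that contract on the host with a hypothetical stand-in, mockGetAttribute, for an attribute assumed to be stored as a 64-bit integer. It is not the library implementation, and MockStatus merely stands in for hipblasStatus_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical status codes standing in for hipblasStatus_t values.
enum MockStatus { MOCK_SUCCESS = 0, MOCK_INVALID_VALUE = 1 };

// Hypothetical getter that follows the documented contract for an attribute
// assumed to be stored internally as a 64-bit integer.
inline MockStatus mockGetAttribute(std::int64_t stored, void* buf,
                                   std::size_t sizeInBytes, std::size_t* sizeWritten)
{
    if (sizeInBytes == 0) {
        if (sizeWritten == nullptr) return MOCK_INVALID_VALUE;
        *sizeWritten = sizeof(stored); // size query only, nothing is copied
        return MOCK_SUCCESS;
    }
    if (buf == nullptr || sizeInBytes != sizeof(stored)) return MOCK_INVALID_VALUE;
    std::memcpy(buf, &stored, sizeof(stored));
    if (sizeWritten != nullptr) *sizeWritten = sizeof(stored);
    return MOCK_SUCCESS;
}

// Two-step read: ask for the required size, then fetch the value.
inline bool readAttribute(std::int64_t stored, std::vector<unsigned char>& out)
{
    std::size_t needed = 0;
    if (mockGetAttribute(stored, nullptr, 0, &needed) != MOCK_SUCCESS) return false;
    out.resize(needed);
    std::size_t written = 0;
    return mockGetAttribute(stored, out.data(), out.size(), &written) == MOCK_SUCCESS
           && written == needed;
}
```

When the attribute type is known in advance, the size query can be skipped by passing sizeof the matching type directly.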
 
 
hipblasLtMatmulDescCreate()
hipblasStatus_t hipblasLtMatmulDescCreate(hipblasLtMatmulDesc_t *matmulDesc, hipblasComputeType_t computeType, hipDataType scaleType)
- Create a matrix multiply descriptor.
- This function creates a matrix multiply descriptor by allocating the memory needed to hold its opaque structure.
- Parameters:
- matmulDesc – [out] Pointer to the structure holding the matrix multiply descriptor created by this function. See hipblasLtMatmulDesc_t.
- computeType – [in] Enumerant that specifies the compute precision for the matrix multiply descriptor this function creates. See hipblasComputeType_t.
- scaleType – [in] Enumerant that specifies the data precision of the scaling factors (alpha and beta) for the matrix multiply descriptor this function creates. See hipDataType.
 
- Return values:
- HIPBLAS_STATUS_SUCCESS – If the descriptor was created successfully. 
- HIPBLAS_STATUS_ALLOC_FAILED – If the memory could not be allocated. 
 
 
hipblasLtMatmulDescDestroy()
hipblasStatus_t hipblasLtMatmulDescDestroy(const hipblasLtMatmulDesc_t matmulDesc)
- Destroy a matrix multiply descriptor.
- This function destroys a previously created matrix multiply descriptor object.
- Parameters:
- matmulDesc – [in] Pointer to the structure holding the matrix multiply descriptor that should be destroyed by this function. See hipblasLtMatmulDesc_t.
- Return values:
- HIPBLAS_STATUS_SUCCESS – If operation was successful. 
 
hipblasLtMatmulDescSetAttribute()
hipblasStatus_t hipblasLtMatmulDescSetAttribute(hipblasLtMatmulDesc_t matmulDesc, hipblasLtMatmulDescAttributes_t attr, const void *buf, size_t sizeInBytes)
- Set an attribute of a matrix multiply descriptor.
- This function sets the value of the specified attribute belonging to a previously created matrix multiply descriptor.
- Parameters:
- matmulDesc – [in] Pointer to the previously created structure holding the matrix multiply descriptor modified by this function. See hipblasLtMatmulDesc_t.
- attr – [in] The attribute that will be set by this function. See hipblasLtMatmulDescAttributes_t. 
- buf – [in] The value to which the specified attribute should be set. 
- sizeInBytes – [in] Size of buf buffer (in bytes) for verification. 
 
- Return values:
- HIPBLAS_STATUS_SUCCESS – If the attribute was set successfully.
- HIPBLAS_STATUS_INVALID_VALUE – If buf is NULL or sizeInBytes doesn’t match the size of the internal storage for the selected attribute.
 
 
hipblasLtMatmulDescGetAttribute()
hipblasStatus_t hipblasLtMatmulDescGetAttribute(hipblasLtMatmulDesc_t matmulDesc, hipblasLtMatmulDescAttributes_t attr, void *buf, size_t sizeInBytes, size_t *sizeWritten)
- Query an attribute of a matrix multiply descriptor.
- This function returns the value of the queried attribute belonging to a previously created matrix multiply descriptor.
- Parameters:
- matmulDesc – [in] Pointer to the previously created structure holding the matrix multiply descriptor queried by this function. See hipblasLtMatmulDesc_t. 
- attr – [in] The attribute that will be retrieved by this function. See hipblasLtMatmulDescAttributes_t. 
- buf – [out] Memory address containing the attribute value retrieved by this function. 
- sizeInBytes – [in] Size of buf buffer (in bytes) for verification.
- sizeWritten – [out] Valid only when the return value is HIPBLAS_STATUS_SUCCESS. If sizeInBytes is non-zero: then sizeWritten is the number of bytes actually written; if sizeInBytes is 0: then sizeWritten is the number of bytes needed to write full contents. 
 
- Return values:
- HIPBLAS_STATUS_SUCCESS – If attribute’s value was successfully written to user memory. 
- HIPBLAS_STATUS_INVALID_VALUE – If sizeInBytes is 0 and sizeWritten is NULL, or if sizeInBytes is non-zero and buf is NULL, or if sizeInBytes doesn’t match the size of the internal storage for the selected attribute.
 
 
hipblasLtMatmulPreferenceCreate()
hipblasStatus_t hipblasLtMatmulPreferenceCreate(hipblasLtMatmulPreference_t *pref)
- Create a preference descriptor.
- This function creates a matrix multiply heuristic search preferences descriptor by allocating the memory needed to hold its opaque structure.
- Parameters:
- pref – [out] Pointer to the structure holding the matrix multiply preferences descriptor created by this function. See hipblasLtMatmulPreference_t.
- Return values:
- HIPBLAS_STATUS_SUCCESS – If the descriptor was created successfully. 
- HIPBLAS_STATUS_ALLOC_FAILED – If memory could not be allocated. 
 
 
hipblasLtMatmulPreferenceDestroy()
hipblasStatus_t hipblasLtMatmulPreferenceDestroy(const hipblasLtMatmulPreference_t pref)
- Destroy a preferences descriptor.
- This function destroys a previously created matrix multiply preferences descriptor object.
- Parameters:
- pref – [in] Pointer to the structure holding the matrix multiply preferences descriptor that should be destroyed by this function. See hipblasLtMatmulPreference_t.
- Return values:
- HIPBLAS_STATUS_SUCCESS – If operation was successful. 
 
hipblasLtMatmulPreferenceSetAttribute()
hipblasStatus_t hipblasLtMatmulPreferenceSetAttribute(hipblasLtMatmulPreference_t pref, hipblasLtMatmulPreferenceAttributes_t attr, const void *buf, size_t sizeInBytes)
- Set an attribute of a preference descriptor.
- This function sets the value of the specified attribute belonging to a previously created matrix multiply preferences descriptor.
- Parameters:
- pref – [in] Pointer to the previously created structure holding the matrix multiply preferences descriptor modified by this function. See hipblasLtMatmulPreference_t.
- attr – [in] The attribute that will be set by this function. See hipblasLtMatmulPreferenceAttributes_t. 
- buf – [in] The value to which the specified attribute should be set. 
- sizeInBytes – [in] Size of buf buffer (in bytes) for verification.
 
- Return values:
- HIPBLAS_STATUS_SUCCESS – If the attribute was set successfully.
- HIPBLAS_STATUS_INVALID_VALUE – If buf is NULL or sizeInBytes doesn’t match the size of the internal storage for the selected attribute.
 
 
hipblasLtMatmulPreferenceGetAttribute()
hipblasStatus_t hipblasLtMatmulPreferenceGetAttribute(hipblasLtMatmulPreference_t pref, hipblasLtMatmulPreferenceAttributes_t attr, void *buf, size_t sizeInBytes, size_t *sizeWritten)
- Query an attribute of a preference descriptor.
- This function returns the value of the queried attribute belonging to a previously created matrix multiply heuristic search preferences descriptor.
- Parameters:
- pref – [in] Pointer to the previously created structure holding the matrix multiply heuristic search preferences descriptor queried by this function. See hipblasLtMatmulPreference_t. 
- attr – [in] The attribute that will be retrieved by this function. See hipblasLtMatmulPreferenceAttributes_t. 
- buf – [out] Memory address containing the attribute value retrieved by this function. 
- sizeInBytes – [in] Size of buf buffer (in bytes) for verification.
- sizeWritten – [out] Valid only when the return value is HIPBLAS_STATUS_SUCCESS. If sizeInBytes is non-zero: then sizeWritten is the number of bytes actually written; if sizeInBytes is 0: then sizeWritten is the number of bytes needed to write full contents. 
 
- Return values:
- HIPBLAS_STATUS_SUCCESS – If attribute’s value was successfully written to user memory. 
- HIPBLAS_STATUS_INVALID_VALUE – If sizeInBytes is 0 and sizeWritten is NULL, or if sizeInBytes is non-zero and buf is NULL, or if sizeInBytes doesn’t match the size of the internal storage for the selected attribute.
 
 
hipblasLtMatmulAlgoGetHeuristic()
hipblasStatus_t hipblasLtMatmulAlgoGetHeuristic(hipblasLtHandle_t handle, hipblasLtMatmulDesc_t matmulDesc, hipblasLtMatrixLayout_t Adesc, hipblasLtMatrixLayout_t Bdesc, hipblasLtMatrixLayout_t Cdesc, hipblasLtMatrixLayout_t Ddesc, hipblasLtMatmulPreference_t pref, int requestedAlgoCount, hipblasLtMatmulHeuristicResult_t heuristicResultsArray[], int *returnAlgoCount)
- Retrieve the possible algorithms.
- This function retrieves the possible algorithms for the matrix multiply operation performed by hipblasLtMatmul() with the given input matrices A, B, and C, and the output matrix D. The output is placed in heuristicResultsArray[] in order of increasing estimated compute time.
- Parameters:
- handle – [in] Pointer to the allocated hipBLASLt handle for the hipBLASLt context. See hipblasLtHandle_t.
- matmulDesc – [in] Handle to a previously created matrix multiplication descriptor of type hipblasLtMatmulDesc_t.
- Adesc, Bdesc, Cdesc, Ddesc – [in] Handles to the previously created matrix layout descriptors of type hipblasLtMatrixLayout_t.
- pref – [in] Pointer to the structure holding the heuristic search preferences descriptor. See hipblasLtMatmulPreference_t.
- requestedAlgoCount – [in] Size of heuristicResultsArray (in elements). This is the requested maximum number of algorithms to return.
- heuristicResultsArray[] – [out] Array containing the algorithm heuristics and associated runtime characteristics returned by this function, in order of increasing estimated compute time.
- returnAlgoCount – [out] Number of algorithms returned by this function. This is the number of heuristicResultsArray elements written.
 
- Return values:
- HIPBLAS_STATUS_SUCCESS – If the query was successful. Inspect heuristicResultsArray[0 to (returnAlgoCount - 1)].state for the status of each result.
- HIPBLAS_STATUS_NOT_SUPPORTED – If no heuristic function is available for the current configuration.
- HIPBLAS_STATUS_INVALID_VALUE – If requestedAlgoCount is less than or equal to zero.
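Because results arrive sorted by increasing estimated compute time and each one carries its own state, a caller would typically take the first entry whose state indicates success. The sketch below illustrates that selection on the host. MockHeuristicResult is a hypothetical stand-in for hipblasLtMatmulHeuristicResult_t; the workspaceSize field, the workspace budget check, and the convention that state 0 means success are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one heuristic result entry.
// Assumption: state == 0 mirrors HIPBLAS_STATUS_SUCCESS, and workspaceSize
// is the workspace (in bytes) that the algorithm would need.
struct MockHeuristicResult {
    int         state;
    std::size_t workspaceSize;
};

// Return the index of the first usable result, or -1 if there is none.
// Only the first returnAlgoCount entries are considered valid.
inline int pickFirstUsable(const std::vector<MockHeuristicResult>& results,
                           int returnAlgoCount, std::size_t workspaceBudget)
{
    const int limit = returnAlgoCount < static_cast<int>(results.size())
                          ? returnAlgoCount
                          : static_cast<int>(results.size());
    for (int i = 0; i < limit; ++i) {
        if (results[i].state == 0 && results[i].workspaceSize <= workspaceBudget) {
            return i; // earliest entry has the lowest estimated compute time
        }
    }
    return -1;
}
```

Entries past returnAlgoCount are never read, since the function only writes that many elements.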
 
 
hipblasLtMatmul()
hipblasStatus_t hipblasLtMatmul(hipblasLtHandle_t handle, hipblasLtMatmulDesc_t matmulDesc, const void *alpha, const void *A, hipblasLtMatrixLayout_t Adesc, const void *B, hipblasLtMatrixLayout_t Bdesc, const void *beta, const void *C, hipblasLtMatrixLayout_t Cdesc, void *D, hipblasLtMatrixLayout_t Ddesc, const hipblasLtMatmulAlgo_t *algo, void *workspace, size_t workspaceSizeInBytes, hipStream_t stream)
- Compute a matrix multiplication.
- This function computes the matrix multiplication of matrices A and B to produce the output matrix D, according to the following operation: D = alpha * (A * B) + beta * C, where A, B, and C are input matrices, and alpha and beta are input scalars.
- Note: This function supports both in-place matrix multiplication (C == D and Cdesc == Ddesc) and out-of-place matrix multiplication (C != D; both matrices must have the same data type, number of rows, number of columns, batch size, and memory order). In the out-of-place case, the leading dimension of C can differ from the leading dimension of D. Specifically, the leading dimension of C can be 0 to achieve row or column broadcast. If Cdesc is omitted, this function assumes it to be equal to Ddesc.
- Parameters:
- handle – [in] Pointer to the allocated hipBLASLt handle for the hipBLASLt context. See hipblasLtHandle_t.
- matmulDesc – [in] Handle to a previously created matrix multiplication descriptor of type hipblasLtMatmulDesc_t.
- alpha, beta – [in] Pointers to the scalars used in the multiplication.
- Adesc, Bdesc, Cdesc, Ddesc – [in] Handles to the previously created matrix layout descriptors of type hipblasLtMatrixLayout_t.
- A, B, C – [in] Pointers to the GPU memory associated with the corresponding descriptors Adesc, Bdesc, and Cdesc.
- D – [out] Pointer to the GPU memory associated with the descriptor Ddesc.
- algo – [in] Handle for the matrix multiplication algorithm to be used. See hipblasLtMatmulAlgo_t. When NULL, an implicit heuristics query with default search preferences is performed to determine the actual algorithm to use.
- workspace – [in] Pointer to the workspace buffer allocated in the GPU memory. Pointer must be 16B aligned (that is, lowest 4 bits of address must be 0). 
- workspaceSizeInBytes – [in] Size of the workspace. 
- stream – [in] The HIP stream where all the GPU work will be submitted. 
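The 16-byte alignment requirement on workspace can be checked on the host before the call. This is a minimal sketch; isAligned16 is a hypothetical helper name, not part of hipBLASLt.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when the lowest 4 bits of the address are 0,
// that is, when the pointer is 16-byte aligned.
inline bool isAligned16(const void* ptr)
{
    return (reinterpret_cast<std::uintptr_t>(ptr) & 0xF) == 0;
}
```

The check only inspects the address value, so it applies equally to a device pointer held in a host variable.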
 
- Return values:
- HIPBLAS_STATUS_SUCCESS – If the operation completed successfully. 
- HIPBLAS_STATUS_EXECUTION_FAILED – If HIP reported an execution error from the device. 
- HIPBLAS_STATUS_ARCH_MISMATCH – If the configured operation cannot be run using the selected device. 
- HIPBLAS_STATUS_NOT_SUPPORTED – If the current implementation on the selected device doesn’t support the configured operation. 
- HIPBLAS_STATUS_INVALID_VALUE – If the parameters are unexpectedly NULL, in conflict or in an impossible configuration. For example, when workspaceSizeInBytes is less than workspace required by the configured algo. 
- HIPBLAS_STATUS_NOT_INITIALIZED – If the hipBLASLt handle has not been initialized.
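The operation D = alpha * (A * B) + beta * C can be spelled out as a host-side reference loop, which is handy for validating GPU results on small inputs. The sketch below assumes column-major float matrices without transposition or batching; referenceMatmul is a hypothetical name and the code does not call hipBLASLt. In this column-major sketch, passing ldc = 0 makes every column of C alias the first one, which models the column broadcast mentioned in the note.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host reference for D = alpha * (A * B) + beta * C.
// All matrices are column-major: A is m x k, B is k x n, C and D are m x n.
// ldc may be 0, in which case the first column of C is reused for every column.
inline void referenceMatmul(int m, int n, int k, float alpha,
                            const std::vector<float>& A, int lda,
                            const std::vector<float>& B, int ldb,
                            float beta,
                            const std::vector<float>& C, int ldc,
                            std::vector<float>& D, int ldd)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                acc += A[i + p * lda] * B[p + j * ldb];
            }
            // Each element of C is read before the matching element of D is
            // written, so C and D may refer to the same buffer (in-place use).
            D[i + j * ldd] = alpha * acc + beta * C[i + j * ldc];
        }
    }
}
```

For example, with A = [[1, 3], [2, 4]], B the 2 x 2 identity, C = [10, 20] with ldc = 0, alpha = 2 and beta = 1, the column-major result is D = [12, 24, 16, 28].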
 
 
Datatypes Supported:
hipblasLtMatmul() supports the following combinations of computeType, scaleType, Atype/Btype, Ctype/Dtype, and Bias Type:
| computeType | scaleType/Bias Type | Atype/Btype | Ctype/Dtype | 
|---|---|---|---|
| HIPBLASLT_COMPUTE_F32 | HIPBLASLT_R_32F | HIPBLASLT_R_32F | HIPBLASLT_R_32F | 
| HIPBLASLT_COMPUTE_F32 | HIPBLASLT_R_32F | HIPBLASLT_R_16F | HIPBLASLT_R_16F | 
| HIPBLASLT_COMPUTE_F32 | HIPBLASLT_R_32F | HIPBLASLT_R_16F | HIPBLASLT_R_32F | 
| HIPBLASLT_COMPUTE_F32 | HIPBLASLT_R_32F | HIPBLASLT_R_16B | HIPBLASLT_R_16B | 
hipblasLtMatrixTransformDescCreate()
hipblasStatus_t hipblasLtMatrixTransformDescCreate(hipblasLtMatrixTransformDesc_t *transformDesc, hipDataType scaleType)
- Create a new matrix transform operation descriptor.
- Return values:
- HIPBLAS_STATUS_ALLOC_FAILED – If the memory could not be allocated.
- HIPBLAS_STATUS_SUCCESS – If the descriptor was created successfully.
 
 
hipblasLtMatrixTransformDescDestroy()
hipblasStatus_t hipblasLtMatrixTransformDescDestroy(hipblasLtMatrixTransformDesc_t transformDesc)
- Destroy a matrix transform operation descriptor.
- Return values:
- HIPBLAS_STATUS_SUCCESS – If the operation was successful.
 
hipblasLtMatrixTransformDescSetAttribute()
hipblasStatus_t hipblasLtMatrixTransformDescSetAttribute(hipblasLtMatrixTransformDesc_t transformDesc, hipblasLtMatrixTransformDescAttributes_t attr, const void *buf, size_t sizeInBytes)
- Set an attribute of a matrix transform operation descriptor.
- Parameters:
- transformDesc – [in] The descriptor 
- attr – [in] The attribute 
- buf – [in] memory address containing the new value 
- sizeInBytes – [in] size of buf buffer for verification (in bytes) 
 
- Return values:
- HIPBLAS_STATUS_INVALID_VALUE – If buf is NULL or sizeInBytes doesn’t match the size of the internal storage for the selected attribute.
- HIPBLAS_STATUS_SUCCESS – If the attribute was set successfully.
 
 
hipblasLtMatrixTransformDescGetAttribute()
hipblasStatus_t hipblasLtMatrixTransformDescGetAttribute(hipblasLtMatrixTransformDesc_t transformDesc, hipblasLtMatrixTransformDescAttributes_t attr, void *buf, size_t sizeInBytes, size_t *sizeWritten)
- Query an attribute of a matrix transform operation descriptor.
- Parameters:
- transformDesc – [in] The descriptor.
- attr – [in] The attribute.
- buf – [out] Memory address that receives the attribute value.
- sizeInBytes – [in] Size of buf buffer (in bytes) for verification.
- sizeWritten – [out] Valid only when the return value is HIPBLAS_STATUS_SUCCESS. If sizeInBytes is non-zero, sizeWritten is the number of bytes actually written; if sizeInBytes is 0, sizeWritten is the number of bytes needed to write the full contents.
 
- Return values:
- HIPBLAS_STATUS_INVALID_VALUE – If sizeInBytes is 0 and sizeWritten is NULL, or if sizeInBytes is non-zero and buf is NULL, or if sizeInBytes doesn’t match the size of the internal storage for the selected attribute.
- HIPBLAS_STATUS_SUCCESS – If the attribute’s value was successfully written to user memory.
 
 
hipblasLtMatrixTransform()
hipblasStatus_t hipblasLtMatrixTransform(hipblasLtHandle_t lightHandle, hipblasLtMatrixTransformDesc_t transformDesc, const void *alpha, const void *A, hipblasLtMatrixLayout_t Adesc, const void *beta, const void *B, hipblasLtMatrixLayout_t Bdesc, void *C, hipblasLtMatrixLayout_t Cdesc, hipStream_t stream)
- Matrix layout conversion helper.
- This function computes C = alpha * op(A) + beta * op(B). It can be used to change the memory order of data or to scale and shift the values.
- Parameters:
- lightHandle – [in] Pointer to the allocated hipBLASLt handle for the hipBLASLt context. See hipblasLtHandle_t.
- transformDesc – [in] Pointer to the allocated matrix transform descriptor.
- alpha – [in] Pointer to scalar alpha, either a host or a device address.
- A – [in] Pointer to matrix A; must be a device address.
- Adesc – [in] Pointer to the layout for input matrix A.
- beta – [in] Pointer to scalar beta, either a host or a device address.
- B – [in] Pointer to matrix B; must be a device address.
- Bdesc – [in] Pointer to the layout for input matrix B.
- C – [out] Pointer to matrix C; must be a device address.
- Cdesc – [in] Pointer to the layout for output matrix C.
- stream – [in] The HIP stream where all the GPU work will be submitted. 
 
- Return values:
- HIPBLAS_STATUS_NOT_INITIALIZED – if hipBLASLt handle has not been initialized 
- HIPBLAS_STATUS_INVALID_VALUE – if parameters are in conflict or in an impossible configuration; e.g. when A is not NULL, but Adesc is NULL 
- HIPBLAS_STATUS_NOT_SUPPORTED – if current implementation on selected device doesn’t support configured operation 
- HIPBLAS_STATUS_ARCH_MISMATCH – if configured operation cannot be run using selected device 
- HIPBLAS_STATUS_EXECUTION_FAILED – if HIP reported execution error from the device 
- HIPBLAS_STATUS_SUCCESS – if the operation completed successfully 
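The formula C = alpha * op(A) + beta * op(B) can be written as a host-side reference to make the role of op() concrete. The sketch below assumes column-major float matrices and models op() as an optional transpose selected by a boolean flag. Both the flags and the name referenceTransform are illustrative assumptions, not part of hipBLASLt.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host reference for C = alpha * op(A) + beta * op(B).
// C is m x n, column-major. When transA is true, A is stored as n x m and
// op(A) is its transpose; otherwise A is stored as m x n. Same for B.
inline void referenceTransform(int m, int n,
                               float alpha, const std::vector<float>& A, int lda, bool transA,
                               float beta, const std::vector<float>& B, int ldb, bool transB,
                               std::vector<float>& C, int ldc)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            const float a = transA ? A[j + i * lda] : A[i + j * lda];
            const float b = transB ? B[j + i * ldb] : B[i + j * ldb];
            C[i + j * ldc] = alpha * a + beta * b;
        }
    }
}
```

With beta = 0 and transA set, the loop reduces to a scaled transpose, which is the memory-order change the description refers to.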
 
 
hipblasLtMatrixTransform supports the following Atype/Btype/Ctype and scaleType:
| Atype/Btype/Ctype | scaleType | 
|---|---|
| HIP_R_32F | HIP_R_32F | 
| HIP_R_16F | HIP_R_32F/HIP_R_16F | 
| HIP_R_16BF | HIP_R_32F | 
| HIP_R_8I | HIP_R_32F | 
| HIP_R_32I | HIP_R_32F |