hipBLASLt API reference
hipblasLtCreate()
hipblasStatus_t hipblasLtCreate(hipblasLtHandle_t *handle)
Create a hipBLASLt handle.
This function initializes the hipBLASLt library and creates a handle to an opaque structure holding the hipBLASLt library context. It allocates light hardware resources on the host and device and must be called prior to making any other hipBLASLt library calls. The hipBLASLt library context is tied to the current ROCm device. To use the library on multiple devices, one hipBLASLt handle should be created for each device.
- Parameters:
handle – [out] Pointer to the allocated hipBLASLt handle for the created hipBLASLt context.
- Return values:
HIPBLAS_STATUS_SUCCESS – The allocation completed successfully.
HIPBLAS_STATUS_INVALID_VALUE – handle == NULL.
hipblasLtDestroy()
hipblasStatus_t hipblasLtDestroy(const hipblasLtHandle_t handle)
Destroy a hipBLASLt handle.
This function releases hardware resources used by the hipBLASLt library. It is usually the last call with a particular handle to the hipBLASLt library. Because hipblasLtCreate() allocates some internal resources, and releasing them through hipblasLtDestroy() implicitly calls hipDeviceSynchronize, it is recommended to minimize the number of hipblasLtCreate() / hipblasLtDestroy() calls.
- Parameters:
handle – [in] Pointer to the hipBLASLt handle to be destroyed.
- Return values:
HIPBLAS_STATUS_SUCCESS – The hipBLASLt context was successfully destroyed.
HIPBLAS_STATUS_NOT_INITIALIZED – The hipBLASLt library was not initialized.
HIPBLAS_STATUS_INVALID_VALUE – handle == NULL.
hipblasLtMatrixLayoutCreate()
hipblasStatus_t hipblasLtMatrixLayoutCreate(hipblasLtMatrixLayout_t *matLayout, hipDataType type, uint64_t rows, uint64_t cols, int64_t ld)
Create a matrix layout descriptor.
This function creates a matrix layout descriptor by allocating the memory needed to hold its opaque structure.
- Parameters:
matLayout – [out] Pointer to the structure holding the matrix layout descriptor created by this function. See hipblasLtMatrixLayout_t.
type – [in] Enumerant that specifies the data precision for the matrix layout descriptor created by this function. See hipDataType.
rows – [in] Number of rows of the matrix.
cols – [in] Number of columns of the matrix.
ld – [in] The leading dimension of the matrix. In column-major layout, this is the number of elements to skip to reach the next column, so ld >= rows.
- Return values:
HIPBLAS_STATUS_SUCCESS – If the descriptor was created successfully.
HIPBLAS_STATUS_ALLOC_FAILED – If the memory could not be allocated.
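As a rough illustration of the ld parameter, the following sketch computes where an element lives in a column-major buffer. The helper names columnMajorOffset and columnMajorElementCount are hypothetical and are not part of hipBLASLt; they only mirror the rule above that the leading dimension must be at least the number of rows.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a hipBLASLt API): linear element offset of
// (row, col) in a column-major matrix with leading dimension ld.
// Consecutive columns start ld elements apart, so ld must be >= rows.
inline std::int64_t columnMajorOffset(std::int64_t row, std::int64_t col,
                                      std::int64_t rows, std::int64_t ld)
{
    assert(ld >= rows);              // padding between columns is allowed
    assert(row >= 0 && row < rows);  // row index must lie inside the matrix
    assert(col >= 0);
    return row + col * ld;
}

// Hypothetical helper: minimum number of elements a buffer must hold
// for a rows x cols column-major matrix with leading dimension ld.
inline std::int64_t columnMajorElementCount(std::int64_t rows, std::int64_t cols,
                                            std::int64_t ld)
{
    assert(ld >= rows);
    if (rows == 0 || cols == 0) return 0;
    // The last column only needs `rows` elements, not a full ld stride.
    return (cols - 1) * ld + rows;
}
```

For example, with rows = 4 and ld = 6, element (2, 3) sits at offset 2 + 3 * 6 = 20, and a 4 x 5 matrix needs at least 4 * 6 + 4 = 28 elements.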
hipblasLtMatrixLayoutDestroy()
hipblasStatus_t hipblasLtMatrixLayoutDestroy(const hipblasLtMatrixLayout_t matLayout)
Destroy a matrix layout descriptor.
This function destroys a previously created matrix layout descriptor object.
- Parameters:
matLayout – [in] Pointer to the structure holding the matrix layout descriptor to be destroyed by this function. See hipblasLtMatrixLayout_t.
- Return values:
HIPBLAS_STATUS_SUCCESS – If the operation was successful.
hipblasLtMatrixLayoutSetAttribute()
hipblasStatus_t hipblasLtMatrixLayoutSetAttribute(hipblasLtMatrixLayout_t matLayout, hipblasLtMatrixLayoutAttribute_t attr, const void *buf, size_t sizeInBytes)
Set an attribute for a matrix descriptor.
This function sets the value of the specified attribute belonging to a previously created matrix descriptor.
- Parameters:
matLayout – [in] Pointer to the previously created structure holding the matrix descriptor modified by this function. See hipblasLtMatrixLayout_t.
attr – [in] The attribute that will be set by this function. See hipblasLtMatrixLayoutAttribute_t.
buf – [in] The value to which the specified attribute should be set.
sizeInBytes – [in] Size of the buf buffer (in bytes) for verification.
- Return values:
HIPBLAS_STATUS_SUCCESS – If the attribute was set successfully.
HIPBLAS_STATUS_INVALID_VALUE – If buf is NULL or sizeInBytes doesn’t match the size of the internal storage for the selected attribute.
hipblasLtMatrixLayoutGetAttribute()
hipblasStatus_t hipblasLtMatrixLayoutGetAttribute(hipblasLtMatrixLayout_t matLayout, hipblasLtMatrixLayoutAttribute_t attr, void *buf, size_t sizeInBytes, size_t *sizeWritten)
Query an attribute from a matrix descriptor.
This function returns the value of the queried attribute belonging to a previously created matrix descriptor.
- Parameters:
matLayout – [in] Pointer to the previously created structure holding the matrix descriptor queried by this function. See hipblasLtMatrixLayout_t.
attr – [in] The attribute that will be retrieved by this function. See hipblasLtMatrixLayoutAttribute_t.
buf – [out] Memory address containing the attribute value retrieved by this function.
sizeInBytes – [in] Size of the buf buffer (in bytes) for verification.
sizeWritten – [out] Valid only when the return value is HIPBLAS_STATUS_SUCCESS. If sizeInBytes is non-zero, then sizeWritten is the number of bytes actually written. If sizeInBytes is 0, then sizeWritten is the number of bytes needed to write the full contents.
- Return values:
HIPBLAS_STATUS_SUCCESS – If the attribute’s value was successfully written to user memory.
HIPBLAS_STATUS_INVALID_VALUE – If sizeInBytes is 0 and sizeWritten is NULL, or if sizeInBytes is non-zero and buf is NULL, or sizeInBytes doesn’t match the size of the internal storage for the selected attribute.
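The buffer-size rules shared by the GetAttribute functions can be sketched with a stand-in function. queryAttributeModel below is a hypothetical model of the documented contract, written only for illustration; it is not the hipBLASLt implementation, and the ModelStatus enum is likewise invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical status codes for this sketch only (not hipblasStatus_t).
enum class ModelStatus { Success, InvalidValue };

// Hypothetical model of the documented Get-attribute contract for an
// attribute whose internal storage is `stored` (storedSize bytes).
inline ModelStatus queryAttributeModel(const void* stored, std::size_t storedSize,
                                       void* buf, std::size_t sizeInBytes,
                                       std::size_t* sizeWritten)
{
    assert(stored != nullptr);  // the model always has a value to report
    if (sizeInBytes == 0) {
        // Size query: report how many bytes the full value needs.
        if (sizeWritten == nullptr) return ModelStatus::InvalidValue;
        *sizeWritten = storedSize;
        return ModelStatus::Success;
    }
    // A non-zero size needs a destination buffer of exactly the stored size.
    if (buf == nullptr || sizeInBytes != storedSize) {
        return ModelStatus::InvalidValue;
    }
    std::memcpy(buf, stored, storedSize);
    if (sizeWritten != nullptr) *sizeWritten = storedSize;
    return ModelStatus::Success;
}
```

Under this reading, a caller can first pass sizeInBytes = 0 to learn the required size, then call again with a buffer of exactly that size.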
hipblasLtMatmulDescCreate()
hipblasStatus_t hipblasLtMatmulDescCreate(hipblasLtMatmulDesc_t *matmulDesc, hipblasComputeType_t computeType, hipDataType scaleType)
Create a matrix multiply descriptor.
This function creates a matrix multiply descriptor by allocating the memory needed to hold its opaque structure.
- Parameters:
matmulDesc – [out] Pointer to the structure holding the matrix multiply descriptor created by this function. See hipblasLtMatmulDesc_t.
computeType – [in] Enumerant that specifies the data precision for the matrix multiply descriptor this function creates. See hipblasComputeType_t.
scaleType – [in] Enumerant that specifies the data precision of the scaling factors (alpha and beta) for the matrix multiply descriptor this function creates. See hipDataType.
- Return values:
HIPBLAS_STATUS_SUCCESS – If the descriptor was created successfully.
HIPBLAS_STATUS_ALLOC_FAILED – If the memory could not be allocated.
hipblasLtMatmulDescDestroy()
hipblasStatus_t hipblasLtMatmulDescDestroy(const hipblasLtMatmulDesc_t matmulDesc)
Destroy a matrix multiply descriptor.
This function destroys a previously created matrix multiply descriptor object.
- Parameters:
matmulDesc – [in] Pointer to the structure holding the matrix multiply descriptor to be destroyed by this function. See hipblasLtMatmulDesc_t.
- Return values:
HIPBLAS_STATUS_SUCCESS – If the operation was successful.
hipblasLtMatmulDescSetAttribute()
hipblasStatus_t hipblasLtMatmulDescSetAttribute(hipblasLtMatmulDesc_t matmulDesc, hipblasLtMatmulDescAttributes_t attr, const void *buf, size_t sizeInBytes)
Set an attribute for a matrix multiply descriptor.
This function sets the value of the specified attribute belonging to a previously created matrix multiply descriptor.
- Parameters:
matmulDesc – [in] Pointer to the previously created structure holding the matrix multiply descriptor modified by this function. See hipblasLtMatmulDesc_t.
attr – [in] The attribute that will be set by this function. See hipblasLtMatmulDescAttributes_t.
buf – [in] The value to which the specified attribute should be set.
sizeInBytes – [in] Size of the buf buffer (in bytes) for verification.
- Return values:
HIPBLAS_STATUS_SUCCESS – If the attribute was set successfully.
HIPBLAS_STATUS_INVALID_VALUE – If buf is NULL or sizeInBytes doesn’t match the size of the internal storage for the selected attribute.
hipblasLtMatmulDescGetAttribute()
hipblasStatus_t hipblasLtMatmulDescGetAttribute(hipblasLtMatmulDesc_t matmulDesc, hipblasLtMatmulDescAttributes_t attr, void *buf, size_t sizeInBytes, size_t *sizeWritten)
Query an attribute from a matrix multiply descriptor.
This function returns the value of the queried attribute belonging to a previously created matrix multiply descriptor.
- Parameters:
matmulDesc – [in] Pointer to the previously created structure holding the matrix multiply descriptor queried by this function. See hipblasLtMatmulDesc_t.
attr – [in] The attribute that will be retrieved by this function. See hipblasLtMatmulDescAttributes_t.
buf – [out] Memory address containing the attribute value retrieved by this function.
sizeInBytes – [in] Size of the buf buffer (in bytes) for verification.
sizeWritten – [out] Valid only when the return value is HIPBLAS_STATUS_SUCCESS. If sizeInBytes is non-zero, then sizeWritten is the number of bytes actually written. If sizeInBytes is 0, then sizeWritten is the number of bytes needed to write the full contents.
- Return values:
HIPBLAS_STATUS_SUCCESS – If the attribute’s value was successfully written to user memory.
HIPBLAS_STATUS_INVALID_VALUE – If sizeInBytes is 0 and sizeWritten is NULL, or if sizeInBytes is non-zero and buf is NULL, or sizeInBytes doesn’t match the size of the internal storage for the selected attribute.
hipblasLtMatmulPreferenceCreate()
hipblasStatus_t hipblasLtMatmulPreferenceCreate(hipblasLtMatmulPreference_t *pref)
Create a preference descriptor.
This function creates a matrix multiply heuristic search preferences descriptor by allocating the memory needed to hold its opaque structure.
- Parameters:
pref – [out] Pointer to the structure holding the matrix multiply preferences descriptor created by this function. See hipblasLtMatmulPreference_t.
- Return values:
HIPBLAS_STATUS_SUCCESS – If the descriptor was created successfully.
HIPBLAS_STATUS_ALLOC_FAILED – If the memory could not be allocated.
hipblasLtMatmulPreferenceDestroy()
hipblasStatus_t hipblasLtMatmulPreferenceDestroy(const hipblasLtMatmulPreference_t pref)
Destroy a preference descriptor.
This function destroys a previously created matrix multiply preferences descriptor object.
- Parameters:
pref – [in] Pointer to the structure holding the matrix multiply preferences descriptor to be destroyed by this function. See hipblasLtMatmulPreference_t.
- Return values:
HIPBLAS_STATUS_SUCCESS – If the operation was successful.
hipblasLtMatmulPreferenceSetAttribute()
hipblasStatus_t hipblasLtMatmulPreferenceSetAttribute(hipblasLtMatmulPreference_t pref, hipblasLtMatmulPreferenceAttributes_t attr, const void *buf, size_t sizeInBytes)
Set an attribute in a preference descriptor.
This function sets the value of the specified attribute belonging to a previously created matrix multiply preferences descriptor.
- Parameters:
pref – [in] Pointer to the previously created structure holding the matrix multiply preferences descriptor modified by this function. See hipblasLtMatmulPreference_t.
attr – [in] The attribute that will be set by this function. See hipblasLtMatmulPreferenceAttributes_t.
buf – [in] The value to which the specified attribute should be set.
sizeInBytes – [in] Size of the buf buffer (in bytes) for verification.
- Return values:
HIPBLAS_STATUS_SUCCESS – If the attribute was set successfully.
HIPBLAS_STATUS_INVALID_VALUE – If buf is NULL or sizeInBytes doesn’t match the size of the internal storage for the selected attribute.
hipblasLtMatmulPreferenceGetAttribute()
hipblasStatus_t hipblasLtMatmulPreferenceGetAttribute(hipblasLtMatmulPreference_t pref, hipblasLtMatmulPreferenceAttributes_t attr, void *buf, size_t sizeInBytes, size_t *sizeWritten)
Query an attribute from a preference descriptor.
This function returns the value of the queried attribute belonging to a previously created matrix multiply heuristic search preferences descriptor.
- Parameters:
pref – [in] Pointer to the previously created structure holding the matrix multiply heuristic search preferences descriptor queried by this function. See hipblasLtMatmulPreference_t.
attr – [in] The attribute that will be retrieved by this function. See hipblasLtMatmulPreferenceAttributes_t.
buf – [out] Memory address containing the attribute value retrieved by this function.
sizeInBytes – [in] Size of the buf buffer (in bytes) for verification.
sizeWritten – [out] Valid only when the return value is HIPBLAS_STATUS_SUCCESS. If sizeInBytes is non-zero, then sizeWritten is the number of bytes actually written. If sizeInBytes is 0, then sizeWritten is the number of bytes needed to write the full contents.
- Return values:
HIPBLAS_STATUS_SUCCESS – If the attribute’s value was successfully written to user memory.
HIPBLAS_STATUS_INVALID_VALUE – If sizeInBytes is 0 and sizeWritten is NULL, or if sizeInBytes is non-zero and buf is NULL, or sizeInBytes doesn’t match the size of the internal storage for the selected attribute.
hipblasLtMatmulAlgoGetHeuristic()
hipblasStatus_t hipblasLtMatmulAlgoGetHeuristic(hipblasLtHandle_t handle, hipblasLtMatmulDesc_t matmulDesc, hipblasLtMatrixLayout_t Adesc, hipblasLtMatrixLayout_t Bdesc, hipblasLtMatrixLayout_t Cdesc, hipblasLtMatrixLayout_t Ddesc, hipblasLtMatmulPreference_t pref, int requestedAlgoCount, hipblasLtMatmulHeuristicResult_t heuristicResultsArray[], int *returnAlgoCount)
Retrieve the possible algorithms.
This function retrieves the possible algorithms for the matrix multiply operation hipblasLtMatmul() with the given input matrices A, B, and C, and the output matrix D. The output is placed in heuristicResultsArray in order of increasing estimated compute time. Note that the wall-clock time of this call increases as requestedAlgoCount increases.
- Parameters:
handle – [in] Pointer to the allocated hipBLASLt handle for the hipBLASLt context. See hipblasLtHandle_t.
matmulDesc – [in] Handle to a previously created matrix multiplication descriptor of type hipblasLtMatmulDesc_t.
Adesc, Bdesc, Cdesc, Ddesc – [in] Handles to the previously created matrix layout descriptors of the type hipblasLtMatrixLayout_t.
pref – [in] Pointer to the structure holding the heuristic search preferences descriptor. See hipblasLtMatmulPreference_t.
requestedAlgoCount – [in] Size of the heuristicResultsArray (in elements). This is the requested maximum number of algorithms to return.
heuristicResultsArray[] – [out] Array containing the algorithm heuristics and associated runtime characteristics returned by this function, in order of increasing estimated compute time.
returnAlgoCount – [out] Number of algorithms returned by this function. This is the number of heuristicResultsArray elements written.
- Return values:
HIPBLAS_STATUS_SUCCESS – If the query was successful. Inspect heuristicResultsArray[0 to (returnAlgoCount - 1)].state for the status of the results.
HIPBLAS_STATUS_NOT_SUPPORTED – If no heuristic function is available for the current configuration.
HIPBLAS_STATUS_INVALID_VALUE – If requestedAlgoCount is less than or equal to zero.
hipblasLtMatmul()
hipblasStatus_t hipblasLtMatmul(hipblasLtHandle_t handle, hipblasLtMatmulDesc_t matmulDesc, const void *alpha, const void *A, hipblasLtMatrixLayout_t Adesc, const void *B, hipblasLtMatrixLayout_t Bdesc, const void *beta, const void *C, hipblasLtMatrixLayout_t Cdesc, void *D, hipblasLtMatrixLayout_t Ddesc, const hipblasLtMatmulAlgo_t *algo, void *workspace, size_t workspaceSizeInBytes, hipStream_t stream)
Compute a matrix multiplication.
This function computes the matrix multiplication of matrices A and B to produce the output matrix D, according to the following operation:
D = alpha * (A * B) + beta * (C), where A, B, and C are input matrices, and alpha and beta are input scalars.
Note: This function supports both in-place matrix multiplication (C == D and Cdesc == Ddesc) and out-of-place matrix multiplication (C != D; both matrices must have the same data type, number of rows, number of columns, batch size, and memory order). In the out-of-place case, the leading dimension of C can be different from the leading dimension of D. Specifically, the leading dimension of C can be 0 to achieve row or column broadcast. If Cdesc is omitted, this function assumes it to be equal to Ddesc.
- Parameters:
handle – [in] Pointer to the allocated hipBLASLt handle for the hipBLASLt context. See hipblasLtHandle_t.
matmulDesc – [in] Handle to a previously created matrix multiplication descriptor of type hipblasLtMatmulDesc_t.
alpha, beta – [in] Pointers to the scalars used in the multiplication.
Adesc, Bdesc, Cdesc, Ddesc – [in] Handles to the previously created matrix layout descriptors of the type hipblasLtMatrixLayout_t.
A, B, C – [in] Pointers to the GPU memory associated with the corresponding descriptors Adesc, Bdesc, and Cdesc.
D – [out] Pointer to the GPU memory associated with the descriptor Ddesc.
algo – [in] Handle for the matrix multiplication algorithm to be used. See hipblasLtMatmulAlgo_t. When NULL, an implicit heuristics query with default search preferences is performed to determine the actual algorithm to use.
workspace – [in] Pointer to the workspace buffer allocated in the GPU memory. The pointer must be 16-byte aligned (that is, the lowest 4 bits of the address must be 0).
workspaceSizeInBytes – [in] Size of the workspace.
stream – [in] The HIP stream where all GPU work is submitted.
- Return values:
HIPBLAS_STATUS_SUCCESS – If the operation completed successfully.
HIPBLAS_STATUS_EXECUTION_FAILED – If HIP reported an execution error from the device.
HIPBLAS_STATUS_ARCH_MISMATCH – If the configured operation cannot be run using the selected device.
HIPBLAS_STATUS_NOT_SUPPORTED – If the current implementation on the selected device doesn’t support the configured operation.
HIPBLAS_STATUS_INVALID_VALUE – If the parameters are unexpectedly NULL, in conflict, or in an impossible configuration. For example, when workspaceSizeInBytes is less than the workspace required by the configured algorithm.
HIPBLAS_STATUS_NOT_INITIALIZED – If the hipBLASLt handle has not been initialized.
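As a plain-CPU illustration of the operation computed by hipblasLtMatmul (not the library's implementation), the sketch below evaluates D = alpha * (A * B) + beta * C for column-major float matrices. The function name referenceMatmul is hypothetical. It assumes no transposes and a single batch, and it treats a leading dimension of 0 for C as broadcasting one column across all columns, which is one way to read the broadcast note in the description.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not a hipBLASLt API). Computes
// D = alpha * (A * B) + beta * C for column-major, non-transposed matrices:
// A is m x k (lda), B is k x n (ldb), C and D are m x n (ldc, ldd).
// Assumption for this sketch: ldc == 0 makes every column of C alias the
// same m elements, i.e. a column vector broadcast across all n columns.
inline void referenceMatmul(std::size_t m, std::size_t n, std::size_t k,
                            float alpha, const std::vector<float>& A, std::size_t lda,
                            const std::vector<float>& B, std::size_t ldb,
                            float beta, const std::vector<float>& C, std::size_t ldc,
                            std::vector<float>& D, std::size_t ldd)
{
    assert(lda >= m && ldb >= k && ldd >= m);
    assert(ldc == 0 || ldc >= m);
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                acc += A[i + p * lda] * B[p + j * ldb];  // (A * B)(i, j)
            }
            D[i + j * ldd] = alpha * acc + beta * C[i + j * ldc];
        }
    }
}
```

A reference like this can be useful for checking GPU results on small problems, within a tolerance appropriate to the data types involved.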
Supported data types
hipblasLtMatmul supports the following computeType, scaleType, Bias type, Atype/Btype, and Ctype/Dtype:
| computeType | scaleType/Bias type | Atype/Btype | Ctype/Dtype |
|---|---|---|---|
| HIPBLAS_COMPUTE_32F | HIP_R_32F | HIP_R_32F | HIP_R_32F |
| HIPBLAS_COMPUTE_32F_FAST_TF32 | HIP_R_32F | HIP_R_32F | HIP_R_32F |
| HIPBLAS_COMPUTE_32F | HIP_R_32F | HIP_R_16F | HIP_R_16F |
| HIPBLAS_COMPUTE_32F | HIP_R_32F | HIP_R_16F | HIP_R_32F |
| HIPBLAS_COMPUTE_32F | HIP_R_32F | HIP_R_16BF | HIP_R_16BF |
For FP8 type Matmul, hipBLASLt supports the type combinations shown in the following table:
This table uses simpler abbreviations:
FP16 means HIP_R_16F
BF16 means HIP_R_16BF
FP32 means HIP_R_32F
FP8 means HIP_R_8F_E4M3
BF8 means HIP_R_8F_E5M2
FP8_FNUZ means HIP_R_8F_E4M3_FNUZ and
BF8_FNUZ means HIP_R_8F_E5M2_FNUZ
The table applies to all transpose types (NN/NT/TT/TN).
Default bias type indicates the type when the bias type is not explicitly specified.
| Atype | Btype | Ctype | Dtype | computeType | scaleA,B | scaleC,D | Bias type | Default bias type |
|---|---|---|---|---|---|---|---|---|
| FP8 | FP8 | FP16 | FP16 | FP32 | Yes | No | FP32, FP16 | FP16 |
| FP8 | FP8 | BF16 | BF16 | FP32 | Yes | No | FP32, BF16 | BF16 |
| FP8 | FP8 | FP32 | FP32 | FP32 | Yes | No | FP32, BF16 | BF16 |
| FP8 | FP8 | FP8 | FP8 | FP32 | Yes | Yes | FP32, FP16 | FP16 |
| FP8 | FP8 | BF8 | BF8 | FP32 | Yes | Yes | FP32, FP16 | FP16 |
| FP8 | BF8 | FP16 | FP16 | FP32 | Yes | No | FP32, FP16 | FP16 |
| FP8 | BF8 | BF16 | BF16 | FP32 | Yes | No | FP32, BF16 | BF16 |
| FP8 | BF8 | FP32 | FP32 | FP32 | Yes | No | FP32, BF16 | BF16 |
| FP8 | BF8 | FP8 | FP8 | FP32 | Yes | Yes | FP32, FP16 | FP16 |
| FP8 | BF8 | BF8 | BF8 | FP32 | Yes | Yes | FP32, FP16 | FP16 |
| BF8 | FP8 | FP16 | FP16 | FP32 | Yes | No | FP32, FP16 | FP16 |
| BF8 | FP8 | BF16 | BF16 | FP32 | Yes | No | FP32, BF16 | BF16 |
| BF8 | FP8 | FP32 | FP32 | FP32 | Yes | No | FP32, BF16 | BF16 |
| BF8 | FP8 | FP8 | FP8 | FP32 | Yes | Yes | FP32, FP16 | FP16 |
| BF8 | FP8 | BF8 | BF8 | FP32 | Yes | Yes | FP32, FP16 | FP16 |
| BF8 | BF8 | FP16 | FP16 | FP32 | Yes | No | FP32, FP16 | FP16 |
| BF8 | BF8 | BF16 | BF16 | FP32 | Yes | No | FP32, BF16 | BF16 |
| BF8 | BF8 | FP32 | FP32 | FP32 | Yes | No | FP32, BF16 | BF16 |
| BF8 | BF8 | FP8 | FP8 | FP32 | Yes | Yes | FP32, FP16 | FP16 |
| BF8 | BF8 | BF8 | BF8 | FP32 | Yes | Yes | FP32, FP16 | FP16 |
To use special data ordering for HIPBLASLT_ORDER_COL16_4R8 and HIPBLASLT_ORDER_COL16_4R16 in hipblasLtMatmul for the gfx94x architecture, choose one of these valid combinations of transposes and orders of input and output matrices:
| Atype | Btype | Ctype | Dtype | opA | opB | orderA | orderB | orderC | orderD |
|---|---|---|---|---|---|---|---|---|---|
| FP8 | FP8 | FP16 | FP16 | T | N | HIPBLASLT_ORDER_COL16_4R16 | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL |
| FP8 | FP8 | BF16 | BF16 | T | N | HIPBLASLT_ORDER_COL16_4R16 | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL |
| FP16 | FP16 | FP32 | FP32 | T | N | HIPBLASLT_ORDER_COL16_4R8 | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL |
| FP16 | FP16 | FP16 | FP16 | T | N | HIPBLASLT_ORDER_COL16_4R8 | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL |
| BF16 | BF16 | BF16 | BF16 | T | N | HIPBLASLT_ORDER_COL16_4R8 | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL |
| FP8 | FP8 | FP16 | FP16 | T | N | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL16_4R16 | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL |
| FP8 | FP8 | BF16 | BF16 | T | N | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL16_4R16 | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL |
| FP16 | FP16 | FP32 | FP32 | T | N | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL16_4R8 | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL |
| FP16 | FP16 | FP16 | FP16 | T | N | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL16_4R8 | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL |
| BF16 | BF16 | BF16 | BF16 | T | N | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL16_4R8 | HIPBLASLT_ORDER_COL | HIPBLASLT_ORDER_COL |
There are restrictions on the supported problem sizes for the HIP_R_4F_E2M1, HIP_R_6F_E2M3, HIP_R_6F_E3M2, HIP_R_8F_E4M3, and HIP_R_8F_E5M2 data types.
When HIPBLASLT_MATMUL_DESC_A_SCALE_MODE and HIPBLASLT_MATMUL_DESC_B_SCALE_MODE are both set to HIPBLASLT_MATMUL_MATRIX_SCALE_VEC32_UE8M0, the following restrictions apply:
- Atype and Btype can be any combination of: HIP_R_8F_E4M3, HIP_R_8F_E5M2, HIP_R_6F_E2M3, HIP_R_6F_E3M2, and HIP_R_4F_E2M1.
- Ctype must be the same as Dtype.
- Dtype can be any of the following: HIP_R_32F, HIP_R_16F, or HIP_R_16BF.
- m % 16 must be equal to 0.
- n % 16 must be equal to 0.
- K % 128 must be equal to 0.
- B must be equal to 1.
- opA must be equal to T.
- opB must be equal to N.
- Epilogues are not supported.
- The scaling data pointed to by HIPBLASLT_MATMUL_DESC_A_SCALE_POINTER must be stored in the same order as A.
- The scaling data pointed to by HIPBLASLT_MATMUL_DESC_B_SCALE_POINTER must be stored in the same order as B.
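The size and transpose restrictions for HIPBLASLT_MATMUL_MATRIX_SCALE_VEC32_UE8M0 lend themselves to a host-side pre-check before a matmul is configured. The sketch below is hypothetical: the Vec32ScaleProblem struct and satisfiesVec32Ue8m0Restrictions function are invented for illustration, the batchCount field assumes that "B" in the restrictions refers to the batch count, and the data type and scale-ordering rules are not checked.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a problem, for this sketch only.
struct Vec32ScaleProblem {
    std::int64_t m;
    std::int64_t n;
    std::int64_t k;
    std::int64_t batchCount;  // assumed to be the "B" in the restrictions
    bool opAIsTranspose;      // true when opA == T
    bool opBIsTranspose;      // true when opB == T
    bool hasEpilogue;
};

// Hypothetical pre-check (not a hipBLASLt API) of the documented size and
// transpose restrictions. Data type restrictions are not checked here.
inline bool satisfiesVec32Ue8m0Restrictions(const Vec32ScaleProblem& p)
{
    assert(p.m >= 0 && p.n >= 0 && p.k >= 0);  // sizes cannot be negative
    return p.m % 16 == 0
        && p.n % 16 == 0
        && p.k % 128 == 0
        && p.batchCount == 1
        && p.opAIsTranspose    // opA must be T
        && !p.opBIsTranspose   // opB must be N
        && !p.hasEpilogue;     // epilogues are not supported
}
```

For instance, m = 256, n = 128, K = 512 with opA = T and opB = N passes this check, while K = 192 fails because 192 % 128 is 64.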
hipblasLtMatrixTransformDescCreate()
hipblasStatus_t hipblasLtMatrixTransformDescCreate(hipblasLtMatrixTransformDesc_t *transformDesc, hipDataType scaleType)
Create a new matrix transform operation descriptor.
- Return values:
HIPBLAS_STATUS_ALLOC_FAILED – If memory could not be allocated.
HIPBLAS_STATUS_SUCCESS – If the descriptor was created successfully.
hipblasLtMatrixTransformDescDestroy()
hipblasStatus_t hipblasLtMatrixTransformDescDestroy(hipblasLtMatrixTransformDesc_t transformDesc)
Destroy a matrix transform operation descriptor.
- Return values:
HIPBLAS_STATUS_SUCCESS – If the operation was successful.
hipblasLtMatrixTransformDescSetAttribute()
hipblasStatus_t hipblasLtMatrixTransformDescSetAttribute(hipblasLtMatrixTransformDesc_t transformDesc, hipblasLtMatrixTransformDescAttributes_t attr, const void *buf, size_t sizeInBytes)
Set a matrix transform operation descriptor attribute.
- Parameters:
transformDesc – [in] The descriptor.
attr – [in] The attribute.
buf – [in] Memory address containing the new value.
sizeInBytes – [in] Size of the buf buffer for verification (in bytes).
- Return values:
HIPBLAS_STATUS_INVALID_VALUE – If buf is NULL or sizeInBytes doesn’t match the size of the internal storage for the selected attribute.
HIPBLAS_STATUS_SUCCESS – If the attribute was set successfully.
hipblasLtMatrixTransformDescGetAttribute()
hipblasStatus_t hipblasLtMatrixTransformDescGetAttribute(hipblasLtMatrixTransformDesc_t transformDesc, hipblasLtMatrixTransformDescAttributes_t attr, void *buf, size_t sizeInBytes, size_t *sizeWritten)
Get a matrix transform operation descriptor attribute.
This function retrieves the value of the specified attribute from a matrix transform operation descriptor.
- Parameters:
transformDesc – [in] The descriptor.
attr – [in] The attribute.
buf – [out] Memory address containing the attribute value retrieved by this function.
sizeInBytes – [in] Size of the buf buffer for verification (in bytes).
sizeWritten – [out] Only valid when return value is HIPBLAS_STATUS_SUCCESS. If sizeInBytes is non-zero, the number of bytes actually written. If sizeInBytes is 0, the number of bytes needed to write the full contents.
- Return values:
HIPBLAS_STATUS_INVALID_VALUE – If sizeInBytes is 0 and sizeWritten is NULL, or sizeInBytes is non-zero and buf is NULL, or sizeInBytes doesn’t match the size of the internal storage for the selected attribute.
HIPBLAS_STATUS_SUCCESS – If the attribute’s value was successfully written to user memory.
hipblasLtMatrixTransform()
hipblasStatus_t hipblasLtMatrixTransform(hipblasLtHandle_t lightHandle, hipblasLtMatrixTransformDesc_t transformDesc, const void *alpha, const void *A, hipblasLtMatrixLayout_t Adesc, const void *beta, const void *B, hipblasLtMatrixLayout_t Bdesc, void *C, hipblasLtMatrixLayout_t Cdesc, hipStream_t stream)
Matrix layout conversion helper.
The matrix layout conversion helper (C = alpha * op(A) + beta * op(B)) can be used to change the memory order of the data or to scale and shift the values.
- Parameters:
lightHandle – [in] Pointer to the allocated hipBLASLt handle for the hipBLASLt context. See hipblasLtHandle_t.
transformDesc – [in] Pointer to the allocated matrix transform descriptor.
alpha – [in] Pointer to scalar alpha. Pointer to either the host or device address.
A – [in] Pointer to matrix A. Must be a pointer to the device address.
Adesc – [in] Pointer to the layout for input matrix A.
beta – [in] Pointer to scalar beta. Pointer to either the host or device address.
B – [in] Pointer to matrix B. Must be a pointer to the device address.
Bdesc – [in] Pointer to the layout for input matrix B.
C – [out] Pointer to matrix C. Must be a pointer to the device address.
Cdesc – [in] Pointer to the layout for output matrix C.
stream – [in] The HIP stream where all the GPU work will be submitted.
- Return values:
HIPBLAS_STATUS_NOT_INITIALIZED – If the hipBLASLt handle has not been initialized.
HIPBLAS_STATUS_INVALID_VALUE – If the parameters are in conflict or in an impossible configuration, for example, when A is not NULL but Adesc is NULL.
HIPBLAS_STATUS_NOT_SUPPORTED – If the current implementation on the selected device doesn’t support the configured operation.
HIPBLAS_STATUS_ARCH_MISMATCH – If the configured operation cannot be run using the selected device.
HIPBLAS_STATUS_EXECUTION_FAILED – If HIP reported an execution error from the device.
HIPBLAS_STATUS_SUCCESS – If the operation completed successfully.
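To make the transform formula concrete, the following CPU sketch evaluates C = alpha * op(A) + beta * op(B) element by element for column-major float matrices. It is a hypothetical reference, not the library's implementation: the name referenceTransform and the boolean transpose flags are assumptions for this example, and memory-order conversion is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not a hipBLASLt API). Computes
// C = alpha * op(A) + beta * op(B) for column-major float matrices, where
// op(X) is X or its transpose. C is rows x cols with leading dimension ldc.
inline void referenceTransform(std::size_t rows, std::size_t cols,
                               float alpha, const std::vector<float>& A,
                               std::size_t lda, bool transA,
                               float beta, const std::vector<float>& B,
                               std::size_t ldb, bool transB,
                               std::vector<float>& C, std::size_t ldc)
{
    assert(ldc >= rows);
    for (std::size_t j = 0; j < cols; ++j) {
        for (std::size_t i = 0; i < rows; ++i) {
            // Element (i, j) of op(X) is X(j, i) when X is transposed.
            const float a = transA ? A[j + i * lda] : A[i + j * lda];
            const float b = transB ? B[j + i * ldb] : B[i + j * ldb];
            C[i + j * ldc] = alpha * a + beta * b;
        }
    }
}
```

With alpha = 1 and beta = 0 this reduces to a copy or transpose of A, and non-trivial alpha and beta give the scale-and-shift behaviour described above.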
hipblasLtMatrixTransform supports the following Atype/Btype/Ctype and scaleType:
| Atype/Btype/Ctype | scaleType |
|---|---|
| HIP_R_32F | HIP_R_32F |
| HIP_R_16F | HIP_R_32F/HIP_R_16F |
| HIP_R_16BF | HIP_R_32F |
| HIP_R_8I | HIP_R_32F |
| HIP_R_32I | HIP_R_32F |