hipBLASLt datatypes reference

hipBLASLt datatypes reference#

hipblasLtEpilogue_t#

enum hipblasLtEpilogue_t#

Specifies the enumeration type to set the postprocessing options for the epilogue.

Values:

enumerator HIPBLASLT_EPILOGUE_DEFAULT#: No special postprocessing. Scale and quantize the results if necessary.

enumerator HIPBLASLT_EPILOGUE_RELU#: Apply ReLU pointwise transform to the results (x:=max(x, 0))

enumerator HIPBLASLT_EPILOGUE_BIAS#: Apply (broadcast) bias from the bias vector. The bias vector length must match the number of rows in matrix D, and it must be packed (so the stride between vector elements is one). The bias vector is broadcast to all columns and added before applying the final postprocessing.

enumerator HIPBLASLT_EPILOGUE_RELU_BIAS#: Apply bias and then ReLU transform.

enumerator HIPBLASLT_EPILOGUE_GELU#: Apply GELU pointwise transform to the results (x:=GELU(x)).

enumerator HIPBLASLT_EPILOGUE_GELU_BIAS#: Apply Bias and then GELU transform.

enumerator HIPBLASLT_EPILOGUE_RELU_AUX#: Output GEMM results before applying RELU transform.

enumerator HIPBLASLT_EPILOGUE_RELU_AUX_BIAS#: Output GEMM results after applying bias but before applying RELU transform.

enumerator HIPBLASLT_EPILOGUE_DRELU#

enumerator HIPBLASLT_EPILOGUE_DRELU_BGRAD#: Apply gradient RELU transform and bias gradient to the results. Requires additional auxiliary input. Apply gradient RELU transform. Requires additional auxiliary input.

enumerator HIPBLASLT_EPILOGUE_GELU_AUX#: Output GEMM results before applying GELU transform.

enumerator HIPBLASLT_EPILOGUE_GELU_AUX_BIAS#: Output GEMM results after applying bias but before applying GELU transform.

enumerator HIPBLASLT_EPILOGUE_DGELU#: Apply gradient GELU transform. Requires additional auxiliary input.

enumerator HIPBLASLT_EPILOGUE_DGELU_BGRAD#: Apply gradient GELU transform and bias gradient to the results. Requires additional auxiliary input.

enumerator HIPBLASLT_EPILOGUE_BGRADA#: Apply bias gradient to A and output GEMM result.

enumerator HIPBLASLT_EPILOGUE_BGRADB#: Apply bias gradient to B and output GEMM result.

enumerator HIPBLASLT_EPILOGUE_SIGMOID#: Apply sigmoid activation function pointwise.

enumerator HIPBLASLT_EPILOGUE_SWISH_EXT#: Apply Swish pointwise transform to the results (x:=Swish(x, 1)).

enumerator HIPBLASLT_EPILOGUE_SWISH_BIAS_EXT#: Apply Bias and then Swish transform.

enumerator HIPBLASLT_EPILOGUE_CLAMP_EXT#: Apply pointwise clamp to the results (x:=max(alpha, min(x, beta))).

enumerator HIPBLASLT_EPILOGUE_CLAMP_BIAS_EXT#: Apply Bias and then clamp.

enumerator HIPBLASLT_EPILOGUE_CLAMP_AUX_EXT#: Output GEMM results before applying clamp transform.

enumerator HIPBLASLT_EPILOGUE_CLAMP_AUX_BIAS_EXT#: Output GEMM results after applying bias but before applying clamp transform.

hipblasLtPointerMode_t#

enum hipblasLtPointerMode_t#

Pointer mode to use for alpha.

Values:

enumerator HIPBLASLT_POINTER_MODE_HOST#: Targets host memory.

enumerator HIPBLASLT_POINTER_MODE_DEVICE#: Targets device memory.

enumerator HIPBLASLT_POINTER_MODE_ALPHA_DEVICE_VECTOR_BETA_HOST#: Alpha pointer targets a device memory vector of length equal to the number of rows of matrix D. Beta is a single value in host memory.

hipblasLtHandle_t#

typedef void *hipblasLtHandle_t#

Handle to the hipBLASLt library context queue.

The hipblasLtHandle_t type is a pointer type to an opaque structure holding the hipBLASLt library context. A handle encapsulates the execution state and manages device-side resources associated with the submitted operations.

A hipBLASLt handle is not safe for concurrent use across multiple HIP streams. Applications must ensure any previously submitted work associated with a handle has completed before reusing that handle on a different stream. For multi-stream execution, create one handle per stream.

Use the following functions to manipulate this library context:

hipblasLtCreate(): To initialize the hipBLASLt library context and return a handle to an opaque structure holding the hipBLASLt library context.

hipblasLtDestroy(): To destroy a previously created hipBLASLt library context descriptor and release the resources.

hipblasLtMatmulAlgo_t#

struct hipblasLtMatmulAlgo_t#

Description of the matrix multiplication algorithm.

This is an opaque structure holding the description of the matrix multiplication algorithm. This structure can be trivially serialized and later restored for use with the same version of the hipBLASLt library to save compute time when selecting the right configuration again.

hipblasLtMatmulDesc_t#

typedef hipblasLtMatmulDescOpaque_t *hipblasLtMatmulDesc_t#

Descriptor of the matrix multiplication operation.

This is a pointer to an opaque structure holding the description of the matrix multiplication operation hipblasLtMatmul(). Use the following functions to manipulate this descriptor:

hipblasLtMatmulDescCreate(): To create one instance of the descriptor.

hipblasLtMatmulDescDestroy(): To destroy a previously created descriptor and release the resources.

hipblasLtOrder_t#

enum hipblasLtOrder_t#

Enumeration for data ordering.

Values:

enumerator HIPBLASLT_ORDER_COL#

Column-major.

Leading dimension is the stride (in elements) to the beginning of the next column in memory.

enumerator HIPBLASLT_ORDER_ROW#

Row-major.

Leading dimension is the stride (in elements) to the beginning of the next row in memory.

enumerator HIPBLASLT_ORDER_COL16_4R32#: Data is ordered in column-major ordered tiles of composite tiles with a total of 32 columns and 128 rows. A tile is composed of 4 inner tiles in column-major with a total of 32 rows and 128 columns. The element offset within the tile is calculated as row%32+32*col+(row/32)*32*32. Note that for this order, the number of columns (rows) of the tensor has to be a multiple of 32(128) or pre-padded to 32(128).

enumerator HIPBLASLT_ORDER_COL16_4R16#: Data is ordered in column-major ordered tiles of composite tiles with a total of 16 columns and 64 rows. A tile is composed of 4 inner tiles in column-major with a total of 16 rows and 16 columns. The element offset within the tile is calculated as row%16+16*col+(row/16)*16*16. Note that for this order, the number of columns (rows) of the tensor has to be a multiple of 16(64) or pre-padded to 16(64).

enumerator HIPBLASLT_ORDER_COL16_4R8#: Data is ordered in column-major ordered tiles of composite tiles with a total of 16 columns and 32 rows. A tile is composed of 4 inner tiles in column-major with a total of 8 rows and 16 columns. Element offset within the tile is calculated as row%8+8*col+(row/8)*16*8. Note that for this order, the number of columns (rows) of the tensor has to be a multiple of 16(32) or pre-padded to 16(32).

enumerator HIPBLASLT_ORDER_COL16_4R4#

enumerator HIPBLASLT_ORDER_COL16_4R2#

hipblasLtMatmulDescAttributes_t#

enum hipblasLtMatmulDescAttributes_t#

Specifies the attributes that define the specifics of the matrix multiply operation.

Values:

enumerator HIPBLASLT_MATMUL_DESC_TRANSA#: Specifies the type of transformation operation that should be performed on matrix A. Default value is HIPBLAS_OP_N (for example, non-transpose operation). See hipblasOperation_t. Data type: int32_t.

enumerator HIPBLASLT_MATMUL_DESC_TRANSB#: Specifies the type of transformation operation that should be performed on matrix B. Default value is HIPBLAS_OP_N (for example, non-transpose operation). See hipblasOperation_t. Data type: int32_t.

enumerator HIPBLASLT_MATMUL_DESC_EPILOGUE#: Epilogue function. See hipblasLtEpilogue_t. Default value is HIPBLASLT_EPILOGUE_DEFAULT. Data type: uint32_t.

enumerator HIPBLASLT_MATMUL_DESC_BIAS_POINTER#: Bias or bias gradient vector pointer in the device memory. Data type: void* / const void*.

enumerator HIPBLASLT_MATMUL_DESC_BIAS_DATA_TYPE#: Type of the bias vector in the device memory. Can be set the same as the D matrix type or Scale type. Bias case: see HIPBLASLT_EPILOGUE_BIAS. Data type: int32_t based on hipDataType.

enumerator HIPBLASLT_MATMUL_DESC_A_SCALE_POINTER#: Device pointer to the scale factor value that converts data in matrix A to the compute data type range. The scaling factor must have the same type as the compute type. If not specified, or set to NULL, the scaling factor is assumed to be 1. If set for an unsupported matrix data, scale, and compute type combination, calling hipblasLtMatmul() will return HIPBLAS_INVALID_VALUE. Default value: NULL. Data type: void* /const void*.

enumerator HIPBLASLT_MATMUL_DESC_B_SCALE_POINTER#: Equivalent to HIPBLASLT_MATMUL_DESC_A_SCALE_POINTER for matrix B. Default value: NULL. Data type: void* / const void*.

enumerator HIPBLASLT_MATMUL_DESC_C_SCALE_POINTER#: Equivalent to HIPBLASLT_MATMUL_DESC_A_SCALE_POINTER for matrix C. Default value: NULL. Data type: void* / const void*.

enumerator HIPBLASLT_MATMUL_DESC_D_SCALE_POINTER#: Equivalent to HIPBLASLT_MATMUL_DESC_A_SCALE_POINTER for matrix D. Default value: NULL. Data type: void* / const void*.

enumerator HIPBLASLT_MATMUL_DESC_EPILOGUE_AUX_SCALE_POINTER#: Equivalent to HIPBLASLT_MATMUL_DESC_A_SCALE_POINTER for matrix AUX. Default value: NULL. Data type: void* / const void*.

enumerator HIPBLASLT_MATMUL_DESC_EPILOGUE_AUX_POINTER#: Epilogue auxiliary buffer pointer in the device memory. Data type: void* / const void*.

enumerator HIPBLASLT_MATMUL_DESC_EPILOGUE_AUX_LD#: The leading dimension of the epilogue auxiliary buffer pointer in the device memory. Data type: int64_t.

enumerator HIPBLASLT_MATMUL_DESC_EPILOGUE_AUX_BATCH_STRIDE#: The batch stride of the epilogue auxiliary buffer pointer in the device memory. Data type: int64_t.

enumerator HIPBLASLT_MATMUL_DESC_POINTER_MODE#: Specifies that alpha and beta are passed by reference, whether they are scalars on the host or on the device, or device vectors. Default value is: HIPBLASLT_POINTER_MODE_HOST (on the host). Data type: int32_t based on hipblasLtPointerMode_t.

enumerator HIPBLASLT_MATMUL_DESC_AMAX_D_POINTER#: Device pointer to the memory location that on completion will be set to the maximum of the absolute values in the output matrix. Data type: void* / const void*.

enumerator HIPBLASLT_MATMUL_DESC_EPILOGUE_AUX_DATA_TYPE#: Type of the auxiliary vector in the device memory. Default value is: HIPBLASLT_DATATYPE_INVALID (using D matrix type). Data type: int32_t based on hipDataType.

enumerator HIPBLASLT_MATMUL_DESC_A_SCALE_MODE#: Scaling mode that defines how the matrix scaling factor for matrix A is interpreted. See hipblasLtMatmulMatrixScale_t.

enumerator HIPBLASLT_MATMUL_DESC_B_SCALE_MODE#: Scaling mode that defines how the matrix scaling factor for matrix B is interpreted. See hipblasLtMatmulMatrixScale_t.

enumerator HIPBLASLT_MATMUL_DESC_COMPUTE_INPUT_TYPE_A_EXT#: Compute input A types. Defines the data type used for the input A of a matrix multiply.

enumerator HIPBLASLT_MATMUL_DESC_COMPUTE_INPUT_TYPE_B_EXT#: Compute input B types. Defines the data type used for the input B of a matrix multiply.

enumerator HIPBLASLT_MATMUL_DESC_EPILOGUE_ACT_ARG0_EXT#: First extra argument for the activation function. Data type: float.

enumerator HIPBLASLT_MATMUL_DESC_EPILOGUE_ACT_ARG1_EXT#: Second extra argument for the activation function. Data type: float.

enumerator HIPBLASLT_MATMUL_DESC_MAX#

hipblasLtMatmulHeuristicResult_t#

struct hipblasLtMatmulHeuristicResult_t#

Description of the matrix multiplication algorithm.

This is a descriptor that holds the configured matrix multiplication algorithm descriptor and its runtime properties. This structure can be trivially serialized and later restored for use with the same version of the hipBLASLt library to save compute time when selecting the right configuration again.

Param algo:: hipblasLtMatmulAlgo_t struct.
Param workspaceSize:: Actual size of workspace memory required.
Param state:: Result status. The other fields are valid only if, after a call to hipblasLtMatmulAlgoGetHeuristic(), this member is set to HIPBLAS_STATUS_SUCCESS.

hipblasLtMatmulPreference_t#

typedef hipblasLtMatmulPreferenceOpaque_t *hipblasLtMatmulPreference_t#

Descriptor of the matrix multiplication preference.

This is a pointer to an opaque structure holding the description of the preferences for hipblasLtMatmulAlgoGetHeuristic() configuration. Use the following functions to manipulate this descriptor:

hipblasLtMatmulPreferenceCreate(): To create one instance of the descriptor.

hipblasLtMatmulPreferenceDestroy(): To destroy a previously created descriptor and release the resources.

hipblasLtMatmulPreferenceAttributes_t#

enum hipblasLtMatmulPreferenceAttributes_t#

This is an enumerated type used to apply algorithm search preferences while fine-tuning the heuristic function.

Values:

enumerator HIPBLASLT_MATMUL_PREF_SEARCH_MODE#: Search mode. Data type: uint32_t.

enumerator HIPBLASLT_MATMUL_PREF_MAX_WORKSPACE_BYTES#: Maximum allowed workspace memory. Default is 0 (no workspace memory allowed). Data type: uint64_t.

enumerator HIPBLASLT_MATMUL_PREF_MAX#

hipblasLtMatmulMatrixScale_t#

enum hipblasLtMatmulMatrixScale_t#

Block scale mode for A and B.

Values:

enumerator HIPBLASLT_MATMUL_MATRIX_SCALE_SCALAR_32F#: Scaling factors are single-precision scalars applied to the whole tensors (this mode is the default for fp8).

enumerator HIPBLASLT_MATMUL_MATRIX_SCALE_VEC16_UE4M3#: Not supported yet. Scaling factors are tensors that contain a dedicated scaling factor stored as an 8-bit HIP_R_8F_E4M3 value for each 16-element block in the innermost dimension of the corresponding data tensor.

enumerator HIPBLASLT_MATMUL_MATRIX_SCALE_VEC32_UE8M0#: Scaling factors are tensors that contain a dedicated scaling factor stored as an 8-bit R_8F_UE8M0 value for each 32-element block in the innermost dimension of the corresponding data tensor.

enumerator HIPBLASLT_MATMUL_MATRIX_SCALE_OUTER_VEC_32F#: Scaling factors are single-precision vectors. This mode is only applicable to matrices A and B, in which case the vectors are expected to have M and N elements respectively, and each (i, j)-th element of product of A and B is multiplied by i-th element of A scale and j-th element of B scale.

enumerator HIPBLASLT_MATMUL_MATRIX_SCALE_VEC128_32F#: Not supported yet. Scaling factors are tensors that contain a dedicated FP32 scaling factor for each 128-element block in the innermost dimension of the corresponding data tensor.

enumerator HIPBLASLT_MATMUL_MATRIX_SCALE_BLK128x128_32F#: Not supported yet. Scaling factors are tensors that contain a dedicated FP32 scaling factor for each 128x128-element block in the corresponding data tensor.

enumerator HIPBLASLT_MATMUL_MATRIX_SCALE_BLK32_UE8M0_32_8_EXT#: Scaling factors are tensors that contain a dedicated 8-bit R_8F_UE8M0 value for each 32-element block in the innermost dimension of the corresponding data tensor. The scale data is pre-swizzled to match the memory access pattern expected by the kernel.

enumerator HIPBLASLT_MATMUL_MATRIX_SCALE_END#

hipblasLtMatrixLayout_t#

typedef hipblasLtMatrixLayoutOpaque_t *hipblasLtMatrixLayout_t#

Descriptor of the matrix layout.

This is a pointer to an opaque structure holding the description of a matrix layout. Use the following functions to manipulate this descriptor:

hipblasLtMatrixLayoutCreate(): To create one instance of the descriptor.

hipblasLtMatrixLayoutDestroy(): To destroy a previously created descriptor and release the resources.

hipblasLtMatrixLayoutAttribute_t#

enum hipblasLtMatrixLayoutAttribute_t#

Specifies the attributes that define the details of the matrix.

Values:

enumerator HIPBLASLT_MATRIX_LAYOUT_BATCH_COUNT#: Number of batches of this matrix. Default value is 1. Data type: int32_t.

enumerator HIPBLASLT_MATRIX_LAYOUT_STRIDED_BATCH_OFFSET#: Stride (in elements) to the next matrix for the strided batch operation. Default value is 0. Data type: int64_t.

enumerator HIPBLASLT_MATRIX_LAYOUT_TYPE#

Data type. See hipDataType.

uint32_t

enumerator HIPBLASLT_MATRIX_LAYOUT_ORDER#

Memory order of the data. See hipblasLtOrder_t.

int32_t, default: HIPBLASLT_ORDER_COL.

enumerator HIPBLASLT_MATRIX_LAYOUT_ROWS#

Number of rows.

Typically only values that can be expressed as int32_t are supported.

uint64_t

enumerator HIPBLASLT_MATRIX_LAYOUT_COLS#

Number of columns.

Typically only values that can be expressed as int32_t are supported.

uint64_t

enumerator HIPBLASLT_MATRIX_LAYOUT_LD#

Matrix leading dimension.

For HIPBLASLT_ORDER_COL, this is the stride (in elements) of the matrix column. For more details and documentation for other memory orders, see the documentation for hipblasLtOrder_t values.

Currently only non-negative values are supported. The value must be large enough so that matrix memory locations are not overlapping (that is, greater or equal to HIPBLASLT_MATRIX_LAYOUT_ROWS in the case of HIPBLASLT_ORDER_COL).

int64_t

hipblasLtMatrixTransformDesc_t#

typedef hipblasLtMatrixTransformDescOpaque_t *hipblasLtMatrixTransformDesc_t#

Opaque descriptor for hipblasLtMatrixTransform() operation details.

The hipblasLtMatrixTransformDesc_t is a pointer to an opaque structure holding the description of a matrix transformation operation.

hipblasLtMatrixTransformDescCreate(): To create one instance of the descriptor.

hipblasLtMatrixTransformDescDestroy(): To destroy a previously created descriptor and release the resources.

hipBLASLt datatypes reference

Contents

hipBLASLt datatypes reference#

hipblasLtEpilogue_t#

hipblasLtPointerMode_t#

hipblasLtHandle_t#

hipblasLtMatmulAlgo_t#

hipblasLtMatmulDesc_t#

hipblasLtOrder_t#

hipblasLtMatmulDescAttributes_t#

hipblasLtMatmulHeuristicResult_t#

hipblasLtMatmulPreference_t#

hipblasLtMatmulPreferenceAttributes_t#

hipblasLtMatmulMatrixScale_t#

hipblasLtMatrixLayout_t#

hipblasLtMatrixLayoutAttribute_t#

hipblasLtMatrixTransformDesc_t#