hipBLASLtExt operation API reference

hipBLASLt provides the following extension operation APIs, which are independent of GEMM operations:

  • hipblasltExtSoftmax

    Computes softmax for a 2D tensor. It performs softmax along the second dimension of the input tensor and assumes the input is contiguous along that dimension. For sample code, see Softmax for a 2D tensor.

  • hipblasltExtLayerNorm

    Applies LayerNorm to a 2D tensor to generate a new, normalized 2D tensor. This is a standalone function: call it directly to get the result. For sample code, see Converting a 2D tensor using LayerNorm.

  • hipblasltExtAMax

    Determines the absolute maximum value of a 2D tensor. This is a standalone function: call it directly to get the result. For sample code, see Absolute maximum value of a 2D Tensor.

These APIs are explained in detail below.

hipblasltExtSoftmax()

hipblasStatus_t hipblasltExtSoftmax(hipDataType datatype, uint32_t m, uint32_t n, uint32_t dim, void *output, void *input, hipStream_t stream)

Perform softmax on a given tensor.

This function computes softmax on a given 2D tensor along a specified dimension.

Parameters:
  • datatype[in] Data type of the input and output tensors. Only supports HIP_R_32F.

  • m[in] The first dimension of the input and output tensors.

  • n[in] The second dimension of the input and output tensors. Only supports values less than or equal to 256.

  • dim[in] Specified dimension to perform softmax on. Currently 1 is the only valid value.

  • input[in] Input tensor buffer.

  • stream[in] The HIP stream where all the GPU work will be submitted.

  • output[out] Output tensor buffer.

Return values:
  • HIPBLAS_STATUS_SUCCESS – If it runs successfully.

  • HIPBLAS_STATUS_INVALID_VALUE – If n is greater than 256.

  • HIPBLAS_STATUS_NOT_SUPPORTED – If dim is not 1 or datatype is not HIP_R_32F.
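The computation described above can be illustrated with a minimal CPU sketch, which may be useful for checking the GPU output on small inputs. It assumes a row-major m x n float tensor (so the second dimension is contiguous) and uses the common max-subtraction trick for numerical stability. The helper name softmaxRowsReference is hypothetical and is not part of hipBLASLt; the library's internal implementation may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference (not a hipBLASLt API): softmax along the second
// dimension of a row-major m x n float tensor.
std::vector<float> softmaxRowsReference(const std::vector<float>& input,
                                        std::uint32_t m, std::uint32_t n)
{
    assert(n > 0);
    assert(input.size() == static_cast<std::size_t>(m) * n);

    std::vector<float> output(input.size());
    for (std::uint32_t row = 0; row < m; ++row)
    {
        const float* in  = input.data() + static_cast<std::size_t>(row) * n;
        float*       out = output.data() + static_cast<std::size_t>(row) * n;

        // Subtract the row maximum before exponentiating to avoid overflow.
        const float rowMax = *std::max_element(in, in + n);

        float sum = 0.0f;
        for (std::uint32_t col = 0; col < n; ++col)
        {
            out[col] = std::exp(in[col] - rowMax);
            sum += out[col];
        }

        // Normalize so that each row sums to 1.
        for (std::uint32_t col = 0; col < n; ++col)
            out[col] /= sum;
    }
    return output;
}
```

Each output row sums to 1, so comparing this result element by element against the buffer written by hipblasltExtSoftmax (within a small floating-point tolerance) is one way to sanity-check a call with dim set to 1.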

hipblasltExtLayerNorm()

hipblasStatus_t hipblasltExtLayerNorm(hipDataType datatype, void *output, void *mean, void *invvar, void *input, uint32_t m, uint32_t n, float eps, void *gamma, void *beta, hipStream_t stream)

Perform 2D LayerNorm on a source input tensor, with the result placed in the output tensor.

This function computes LayerNorm on a given 2D tensor.

Parameters:
  • datatype[in] Data type of the input and output tensors. Only supports HIP_R_32F.

  • output[out] Output tensor buffer. Can’t be a nullptr.

  • mean[out] Output tensor buffer for the computed mean. Can’t be a nullptr.

  • invvar[out] Output tensor buffer for the inverse standard deviation, 1 / sqrt(variance + eps). Can’t be a nullptr.

  • input[in] Input tensor buffer. Can’t be a nullptr.

  • m[in] The first dimension of the input and output tensors.

  • n[in] The second dimension of the input and output tensors.

  • eps[in] Small value added inside the square root so that the normalization does not divide by zero and produce an inf value.

  • gamma[in] Input tensor buffer for the gamma (scale) values. If nullptr, gamma is not applied.

  • beta[in] Input tensor buffer for the beta (shift) values. If nullptr, beta is not applied.

  • stream[in] The HIP stream where all the GPU work will be submitted.

Return values:
  • HIPBLAS_STATUS_SUCCESS – If it runs successfully.

  • HIPBLAS_STATUS_INVALID_VALUE – If m is greater than 4096.

  • HIPBLAS_STATUS_NOT_SUPPORTED – If datatype is not HIP_R_32F.
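As a rough illustration of how the parameters above relate to each other, the following CPU sketch applies the standard LayerNorm definition to each row. Several points are assumptions rather than documented behavior: the tensor is row-major m x n, mean and invvar hold one value per row, the variance is the population (biased) variance, invvar is 1 / sqrt(variance + eps), and gamma and beta hold n values each. The names LayerNormResult and layerNormReference are hypothetical and are not part of hipBLASLt.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result bundle mirroring the three output buffers of the API.
struct LayerNormResult
{
    std::vector<float> output; // m x n normalized values
    std::vector<float> mean;   // assumed: one mean per row (m values)
    std::vector<float> invvar; // assumed: one 1 / sqrt(var + eps) per row
};

// Hypothetical CPU reference (not a hipBLASLt API): standard LayerNorm over
// the second dimension. gamma and beta may be nullptr to skip scale or shift.
LayerNormResult layerNormReference(const std::vector<float>& input,
                                   std::uint32_t m, std::uint32_t n, float eps,
                                   const float* gamma, const float* beta)
{
    assert(n > 0);
    assert(input.size() == static_cast<std::size_t>(m) * n);

    LayerNormResult r{std::vector<float>(input.size()),
                      std::vector<float>(m),
                      std::vector<float>(m)};
    for (std::uint32_t row = 0; row < m; ++row)
    {
        const float* in  = input.data() + static_cast<std::size_t>(row) * n;
        float*       out = r.output.data() + static_cast<std::size_t>(row) * n;

        float sum = 0.0f;
        for (std::uint32_t col = 0; col < n; ++col)
            sum += in[col];
        const float mean = sum / static_cast<float>(n);

        float sqDiff = 0.0f;
        for (std::uint32_t col = 0; col < n; ++col)
            sqDiff += (in[col] - mean) * (in[col] - mean);
        // eps keeps the denominator non-zero when a row is constant.
        const float invvar = 1.0f / std::sqrt(sqDiff / static_cast<float>(n) + eps);

        for (std::uint32_t col = 0; col < n; ++col)
        {
            float v = (in[col] - mean) * invvar;
            if (gamma != nullptr) v *= gamma[col];
            if (beta != nullptr) v += beta[col];
            out[col] = v;
        }
        r.mean[row]   = mean;
        r.invvar[row] = invvar;
    }
    return r;
}
```

Passing nullptr for gamma and beta mirrors the documented behavior of skipping those terms. When comparing against the GPU result, a small tolerance is advisable because the summation order on the device will generally differ.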

hipblasltExtAMax()

hipblasStatus_t hipblasltExtAMax(const hipDataType datatype, const hipDataType outDatatype, void *output, void *input, uint32_t m, uint32_t n, hipStream_t stream)

Perform absmax on a given 2D tensor and output a single absmax(tensor) value.

This function computes amax on a given 2D tensor.

Parameters:
  • datatype[in] Data type of the input tensor. Only supports HIP_R_32F and HIP_R_16F.

  • outDatatype[in] Data type of the output tensor. Only supports HIP_R_32F and HIP_R_16F.

  • output[out] Amax tensor buffer. Can’t be a nullptr.

  • input[in] 2-D tensor buffer. Can’t be a nullptr.

  • m[in] The first dimension of the input tensor.

  • n[in] The second dimension of the input tensor.

  • stream[in] The HIP stream where all the GPU work will be submitted.

Return values:
  • HIPBLAS_STATUS_SUCCESS – If it runs successfully.

  • HIPBLAS_STATUS_INVALID_VALUE – If m or n is 0, or input or output is nullptr.

  • HIPBLAS_STATUS_NOT_SUPPORTED – If datatype is neither HIP_R_32F nor HIP_R_16F.
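The reduction described above can be sketched on the CPU as follows, which may help when validating the single value written to the output buffer. The sketch assumes float (HIP_R_32F) input and output and an m x n tensor stored contiguously. The helper name amaxReference is hypothetical and is not part of hipBLASLt.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference (not a hipBLASLt API): largest absolute value
// over all m x n elements of the input tensor.
float amaxReference(const std::vector<float>& input,
                    std::uint32_t m, std::uint32_t n)
{
    // The documented API rejects m == 0 or n == 0; the sketch asserts instead.
    assert(m > 0 && n > 0);
    assert(input.size() == static_cast<std::size_t>(m) * n);

    float result = 0.0f;
    for (float value : input)
        result = std::max(result, std::fabs(value));
    return result;
}
```

Because the result is a maximum rather than a sum, it does not depend on the reduction order, so for HIP_R_32F data the GPU value can reasonably be expected to match this reference exactly.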