This page contains changes for a test release of ROCm. Read the latest Linux release of ROCm documentation for your production environments.

API reference guide

Contents

API reference guide#

This document provides information about hipTensor APIs, data types, and other programming constructs.

Supported GPU architectures#

List of supported CDNA architectures:

  • gfx908

  • gfx90a

  • gfx940

  • gfx941

  • gfx942

Note

gfx9 = gfx908, gfx90a, gfx940, gfx941, gfx942

gfx940+ = gfx940, gfx941, gfx942

Supported data types#

hipTensor supports the following datatype combinations in API functionality.

Data Types <Ti / To / Tc> = <Input type / Output Type / Compute Type>, where:

  • Input Type = Matrix A / B

  • Output Type = Matrix C / D

  • Compute Type = Math / accumulation type

  • f16 = half-precision floating point

  • bf16 = half-precision brain floating point

  • f32 = single-precision floating point

  • cf32 = complex single-precision floating point

  • f64 = double-precision floating point

  • cf64 = complex double-precision floating point

Note

f16 represents equivalent support for both _Float16 and __half types.

API context

Datatype Support <Ti / To / Tc>

GPU Support

Tensor Rank Support

Contraction (Scale, bilinear)

f16 / f16 / f32

gfx908 gfx90a gfx940+

2m2n2k (Rank4)

3m3n3k (Rank6)

4m4n4k (Rank8)

5m5n5k (Rank10)

6m6n6k (Rank12)

bf16 / bf16 / f32

f32 / f32 / f32

f32 / f32 / f16

f32 / f32 / bf16

cf32 / cf32 / cf32

f64 / f64 / f64

gfx940+

f64 / f64 / f32

cf64 / cf64 / cf64

Permutation

f16 / f16 / -

gfx908 gfx90a gfx940+

Rank2 - Rank6

f16 / f32 / -

f32 / f32 / -

Limitations#

  • hipTensor currently supports tensors up to 2GB in size due to backend address-space limitations.

hipTensor API objects#

hiptensorStatus_t#

enum hiptensorStatus_t#

hipTensor status type enumeration

The type is used to indicate the resulting status of hipTensor library function calls

Values:

enumerator HIPTENSOR_STATUS_SUCCESS#

The operation is successful.

enumerator HIPTENSOR_STATUS_NOT_INITIALIZED#

The handle was not initialized.

enumerator HIPTENSOR_STATUS_ALLOC_FAILED#

Resource allocation failed inside the hipTensor library.

enumerator HIPTENSOR_STATUS_INVALID_VALUE#

Invalid value or parameter was passed to the function (indicates a user error).

enumerator HIPTENSOR_STATUS_ARCH_MISMATCH#

Indicates that the target architecure is not supported, or the device is not ready.

enumerator HIPTENSOR_STATUS_EXECUTION_FAILED#

Indicates the failure of a GPU program or a kernel, which can be caused by multiple reasons.

enumerator HIPTENSOR_STATUS_INTERNAL_ERROR#

An internal error has occurred.

enumerator HIPTENSOR_STATUS_NOT_SUPPORTED#

The requested operation is not supported.

enumerator HIPTENSOR_STATUS_CK_ERROR#

A call to Composable Kernels did not succeed.

enumerator HIPTENSOR_STATUS_HIP_ERROR#

Unknown hipTensor error has occurred.

enumerator HIPTENSOR_STATUS_INSUFFICIENT_WORKSPACE#

The provided workspace was insufficient.

enumerator HIPTENSOR_STATUS_INSUFFICIENT_DRIVER#

Indicates that the driver version is insufficient.

enumerator HIPTENSOR_STATUS_IO_ERROR#

Indicates an error related to file I/O.

hiptensorComputeType_t#

enum hiptensorComputeType_t#

hipTensor compute type enumeration

Values:

enumerator HIPTENSOR_COMPUTE_32F#

Single precision floating point.

enumerator HIPTENSOR_COMPUTE_64F#

Double precision floating point.

enumerator HIPTENSOR_COMPUTE_16F#

Half precision floating point.

enumerator HIPTENSOR_COMPUTE_16BF#

Brain float half precision floating point.

enumerator HIPTENSOR_COMPUTE_C32F#

Complex single precision floating point.

enumerator HIPTENSOR_COMPUTE_C64F#

Complex double precision floating point.

enumerator HIPTENSOR_COMPUTE_NONE#

No type.

hiptensorOperator_t#

enum hiptensorOperator_t#

Element-wise operations.

Values:

enumerator HIPTENSOR_OP_IDENTITY#

Identity operator.

enumerator HIPTENSOR_OP_SQRT#

Square root operator.

enumerator HIPTENSOR_OP_UNKNOWN#

Reserved.

hiptensorAlgo_t#

enum hiptensorAlgo_t#

Tensor contraction kernel selection algorithm.

Values:

enumerator HIPTENSOR_ALGO_ACTOR_CRITIC#

Uses novel actor-critic selection model.

enumerator HIPTENSOR_ALGO_DEFAULT#

Lets the internal heuristic choose.

enumerator HIPTENSOR_ALGO_DEFAULT_PATIENT#

Uses the more accurate and time-consuming model.

hiptensorWorksizePreference_t#

enum hiptensorWorksizePreference_t#

Workspace size selection.

Values:

enumerator HIPTENSOR_WORKSPACE_MIN#

At least one algorithm will be available.

enumerator HIPTENSOR_WORKSPACE_RECOMMENDED#

The most suitable algorithm will be available.

enumerator HIPTENSOR_WORKSPACE_MAX#

All algorithms will be available.

hiptensorLogLevel_t#

enum hiptensorLogLevel_t#

Logging context.

The logger output of certain contexts maybe constrained to these levels

Values:

enumerator HIPTENSOR_LOG_LEVEL_OFF#

No logging.

enumerator HIPTENSOR_LOG_LEVEL_ERROR#

Log errors.

enumerator HIPTENSOR_LOG_LEVEL_PERF_TRACE#

Log performance messages.

enumerator HIPTENSOR_LOG_LEVEL_PERF_HINT#

Log performance hints.

enumerator HIPTENSOR_LOG_LEVEL_HEURISTICS_TRACE#

Log selection messages.

enumerator HIPTENSOR_LOG_LEVEL_API_TRACE#

Log a trace of API calls.

hiptensorHandle_t#

struct hiptensorHandle_t#

hipTensor’s library context

hiptensorTensorDescriptor_t#

struct hiptensorTensorDescriptor_t#

Structure representing a tensor descriptor.

Represents a descriptor for the tensor with the given properties of data type, lengths, strides and element-wise unary operation. Constructed with hiptensorInitTensorDescriptor() function.

Public Members

hipDataType mType#

Data type of the tensors enum selection.

std::vector<std::size_t> mLengths#

Lengths of the tensor.

std::vector<std::size_t> mStrides#

Strides of the tensor.

hiptensorOperator_t mUnaryOp#

Unary operator applied to the tensor.

hiptensorContractionDescriptor_t#

struct hiptensorContractionDescriptor_t#

Structure representing a tensor contraction descriptor.

Represents contraction descriptor with the given properties of internal contraction op (either scale or bilinear), the internal compute type, as well as all of the input tensor descriptors, their alignment requirements and modes. Constructed with hiptensorInitContractionDescriptor() function.

Public Members

int32_t mContractionOpId#

Enum that differentiates the internal contraction operation.

hiptensorComputeType_t mComputeType#

Compute type for the contraction.

std::vector<hiptensorTensorDescriptor_t> mTensorDesc#

Cache of tensor descriptors.

std::vector<uint32_t> mAlignmentReq#

Cache of alignment requirements.

std::vector<std::vector<int32_t>> mTensorMode#

Tensor modes.

hiptensorContractionFind_t#

struct hiptensorContractionFind_t#

hipTensor structure representing the contraction selection algorithm and candidates.

Public Members

hiptensorAlgo_t mSelectionAlgorithm#

Id of the selection algorithm.

std::vector<void*> mCandidates#

A vector of the solver candidates.

hiptensorContractionPlan_t#

struct hiptensorContractionPlan_t#

hipTensor structure representing a contraction plan. Constructed with the hiptensorInitContractionPlan() function.

Public Members

void *mSolution#

Final solution candidate.

hiptensorContractionDescriptor_t mContractionDesc#

Contraction parameters.

Helper functions#

hiptensorCreate#

hiptensorStatus_t hiptensorCreate(hiptensorHandle_t **handle)#

Allocates an instance of hiptensorHandle_t on the heap and updates the handle pointer.

Creates hipTensor handle for the associated device. In order for the hipTensor library to use a different device, set the new device to be used by calling hipInit(0) and then create another hipTensor handle, which will be associated with the new device, by calling hiptensorCreate().

Parameters:

handle[out] Pointer to hiptensorHandle_t pointer

Returns:

HIPTENSOR_STATUS_SUCCESS on success and an error code otherwise

hiptensorDestroy#

hiptensorStatus_t hiptensorDestroy(hiptensorHandle_t *handle)#

De-allocates the instance of hiptensorHandle_t.

Parameters:

handle[out] Pointer to hiptensorHandle_t

Returns:

HIPTENSOR_STATUS_SUCCESS on success and an error code otherwise

hiptensorInitTensorDescriptor#

hiptensorStatus_t hiptensorInitTensorDescriptor(const hiptensorHandle_t *handle, hiptensorTensorDescriptor_t *desc, const uint32_t numModes, const int64_t lens[], const int64_t strides[], hipDataType dataType, hiptensorOperator_t unaryOp)#

Initializes a tensor descriptor.

Parameters:
  • handle[in] Opaque handle holding hipTensor’s library context.

  • desc[out] Pointer to the allocated tensor descriptor object.

  • numModes[in] Number of modes.

  • lens[in] Extent of each mode(lengths) (must be larger than zero).

  • strides[in] stride[i] denotes the displacement (stride) between two consecutive elements in the ith-mode. If stride is NULL, generalized packed column-major memory layout is assumed (i.e., the strides increase monotonically from left to right).

  • dataType[in] Data type of the stored entries.

  • unaryOp[in] Unary operator that will be applied to the tensor.

Return values:
  • HIPTENSOR_STATUS_SUCCESS – The operation completed successfully.

  • HIPTENSOR_STATUS_NOT_INITIALIZED – if the handle is not initialized.

hiptensorGetAlignmentRequirement#

hiptensorStatus_t hiptensorGetAlignmentRequirement(const hiptensorHandle_t *handle, const void *ptr, const hiptensorTensorDescriptor_t *desc, uint32_t *alignmentRequirement)#

Computes the alignment requirement for a given pointer and descriptor.

Parameters:
  • handle[in] Opaque handle holding hipTensor’s library context.

  • ptr[in] Pointer to the respective tensor data.

  • desc[in] Tensor descriptor for ptr data.

  • alignmentRequirement[out] Largest alignment requirement that ptr can fulfill (in bytes).

Return values:
  • HIPTENSOR_STATUS_SUCCESS – The operation completed successfully.

  • HIPTENSOR_STATUS_NOT_INITIALIZED – if the handle is not initialized.

  • HIPTENSOR_STATUS_INVALID_VALUE – if the unsupported parameter is passed.

hiptensorGetErrorString#

const char *hiptensorGetErrorString(const hiptensorStatus_t error)#

Returns the description string for an error code.

Parameters:

error[in] Error code to convert to string.

Return values:

the – error string.

Contraction operations#

hiptensorInitContractionDescriptor#

hiptensorStatus_t hiptensorInitContractionDescriptor(const hiptensorHandle_t *handle, hiptensorContractionDescriptor_t *desc, const hiptensorTensorDescriptor_t *descA, const int32_t modeA[], const uint32_t alignmentRequirementA, const hiptensorTensorDescriptor_t *descB, const int32_t modeB[], const uint32_t alignmentRequirementB, const hiptensorTensorDescriptor_t *descC, const int32_t modeC[], const uint32_t alignmentRequirementC, const hiptensorTensorDescriptor_t *descD, const int32_t modeD[], const uint32_t alignmentRequirementD, hiptensorComputeType_t typeCompute)#

Initializes a contraction descriptor for the tensor contraction problem.

Parameters:
  • handle[in] Opaque handle holding hipTensor’s library context.

  • desc[out] Tensor contraction problem descriptor.

  • descA[in] A descriptor that holds information about tensor A.

  • modeA[in] Array with ‘nmodeA’ entries that represent the modes of A.

  • alignmentRequirementA[in] Alignment reqirement for A’s pointer (in bytes);

  • descB[in] A descriptor that holds information about tensor B.

  • modeB[in] Array with ‘nmodeB’ entries that represent the modes of B.

  • alignmentRequirementB[in] Alignment reqirement for B’s pointer (in bytes);

  • modeC[in] Array with ‘nmodeC’ entries that represent the modes of C.

  • descC[in] A descriptor that holds information about tensor C.

  • alignmentRequirementC[in] Alignment requirement for C’s pointer (in bytes);

  • modeD[in] Array with ‘nmodeD’ entries that represent the modes of D (must be identical to modeC).

  • descD[in] A descriptor that holds information about tensor D (must be identical to descC).

  • alignmentRequirementD[in] Alignment requirement for D’s pointer (in bytes);

  • typeCompute[in] Datatype for the intermediate computation T = A * B.

Return values:
  • HIPTENSOR_STATUS_SUCCESS – Successful completion of the operation.

  • HIPTENSOR_STATUS_NOT_INITIALIZED – if the handle or tensor descriptors are not initialized.

hiptensorInitContractionFind#

hiptensorStatus_t hiptensorInitContractionFind(const hiptensorHandle_t *handle, hiptensorContractionFind_t *find, const hiptensorAlgo_t algo)#

Narrows down the candidates for the contraction problem.

This function gives the user finer control over the candidates that the subsequent call to hiptensorInitContractionPlan is allowed to evaluate. Currently, the backend provides few set of algorithms(DEFAULT).

Parameters:
  • handle[in] Opaque handle holding hipTensor’s library context.

  • find[out] Narrowed set of candidates for the contraction problem.

  • algo[in] Allows users to select a specific algorithm.

Return values:
  • HIPTENSOR_STATUS_SUCCESS – The operation completed successfully.

  • HIPTENSOR_STATUS_NOT_SUPPORTED – If a specified algorithm is not supported

  • HIPTENSOR_STATUS_NOT_INITIALIZED – if the handle or find is not initialized.

hiptensorInitContractionPlan#

hiptensorStatus_t hiptensorInitContractionPlan(const hiptensorHandle_t *handle, hiptensorContractionPlan_t *plan, const hiptensorContractionDescriptor_t *desc, const hiptensorContractionFind_t *find, const uint64_t workspaceSize)#

Initializes the contraction plan for a given tensor contraction problem.

This function creates a contraction plan for the problem by applying hipTensor’s heuristics to select a candidate. The creaated plan can be reused multiple times for the same tensor contraction problem. The plan is created for the active HIP device.

Parameters:
  • handle[in] Opaque handle holding hipTensor’s library context.

  • plan[out] Opaque handle holding the contraction plan (i.e., the algorithm that will be executed, its runtime parameters for the given tensor contraction problem).

  • desc[in] Tensor contraction descriptor.

  • find[in] Narrows down the candidates for the contraction problem.

  • workspaceSize[in] Available workspace size (in bytes).

Return values:
  • HIPTENSOR_STATUS_SUCCESS – If a viable candidate has been found.

  • HIPTENSOR_STATUS_NOT_INITIALIZED – if the handle or find or desc is not initialized.

hiptensorContraction#

hiptensorStatus_t hiptensorContraction(const hiptensorHandle_t *handle, const hiptensorContractionPlan_t *plan, const void *alpha, const void *A, const void *B, const void *beta, const void *C, void *D, void *workspace, uint64_t workspaceSize, hipStream_t stream)#

Computes the tensor contraction.

\[ D = alpha * A * B + beta * C \]

Parameters:
  • handle[in] Opaque handle holding hipTensor’s library context. HIP Device associated with the handle must be same/active at the time,0 the plan was created.

  • plan[in] Opaque handle holding the contraction plan (i.e., the algorithm that will be executed, its runtime parameters for the given tensor contraction problem).

  • alpha[in] Scaling parameter for A*B of data type ‘typeCompute’.

  • A[in] Pointer to A’s data in device memory.

  • B[in] Pointer to B’s data in device memory.

  • beta[in] Scaling parameter for C of data type ‘typeCompute’.

  • C[in] Pointer to C’s data in device memory.

  • D[out] Pointer to D’s data in device memory.

  • workspace[out] Workspace pointer in device memory

  • workspaceSize[in] Available workspace size.

  • stream[in] HIP stream to perform all operations.

Return values:
  • HIPTENSOR_STATUS_SUCCESS – Successful completion of the operation.

  • HIPTENSOR_STATUS_NOT_INITIALIZED – if the handle or pointers are not initialized.

  • HIPTENSOR_STATUS_CK_ERROR – if some unknown composable_kernel (CK) error has occurred (e.g., no instance supported by inputs).

hiptensorContractionGetWorkspaceSize#

hiptensorStatus_t hiptensorContractionGetWorkspaceSize(const hiptensorHandle_t *handle, const hiptensorContractionDescriptor_t *desc, const hiptensorContractionFind_t *find, const hiptensorWorksizePreference_t pref, uint64_t *workspaceSize)#

Computes the size of workspace for a given tensor contraction.

Parameters:
  • handle[in] Opaque handle holding hipTensor’s library context.

  • desc[in] Tensor contraction descriptor.

  • find[in] Narrowed set of candidates for the contraction problem.

  • pref[in] Preference to choose the workspace size.

  • workspaceSize[out] Size of the workspace (in bytes).

Return values:
  • HIPTENSOR_STATUS_SUCCESS – Successful completion of the operation.

  • HIPTENSOR_STATUS_NOT_INITIALIZED – if the handle is not initialized.

  • HIPTENSOR_STATUS_INVALID_VALUE – if some input data is invalid (this typically indicates an user error).

Logging functions#

hiptensorLoggerSetCallback#

hiptensorStatus_t hiptensorLoggerSetCallback(hiptensorLoggerCallback_t callback)#

Registers a callback function that will be invoked by logger calls.

Parameters:

callback[in] This parameter is the callback function pointer provided to the logger.

Return values:
  • HIPTENSOR_STATUS_SUCCESS – if the operation completed successfully.

  • HIPTENSOR_STATUS_INVALID_VALUE – if the given callback is invalid.

hiptensorLoggerSetFile#

hiptensorStatus_t hiptensorLoggerSetFile(FILE *file)#

Registers a file output stream to redirect logging output to.

Note

File stream must be open and writable in text mode.

Parameters:

file[in] This parameter is a file stream pointer provided to the logger.

Return values:
  • HIPTENSOR_STATUS_SUCCESS – if the operation completed successfully.

  • HIPTENSOR_STATUS_IO_ERROR – if the output file is not valid (defaults back to stdout).

hiptensorLoggerOpenFile#

hiptensorStatus_t hiptensorLoggerOpenFile(const char *logFile)#

Redirects log output to a file given by the user.

Parameters:

logFile[in] This parameter is a file name (relative to binary) or full path to redirect logger output.

Return values:
  • HIPTENSOR_STATUS_SUCCESS – if the operation completed successfully.

  • HIPTENSOR_STATUS_IO_ERROR – if the output file is not valid (defaults back to stdout).

hiptensorLoggerSetLevel#

hiptensorStatus_t hiptensorLoggerSetLevel(hiptensorLogLevel_t level)#

User-specified logging level. Logs in other contexts will not be recorded.

Parameters:

level[in] This parameter is the logging level to be enforced.

Return values:
  • HIPTENSOR_STATUS_SUCCESS – if the operation completed successfully.

  • HIPTENSOR_STATUS_INVALID_VALUE – if the given log level is invalid.

hiptensorLoggerSetMask#

hiptensorStatus_t hiptensorLoggerSetMask(int32_t mask)#

User-specified logging mask. A mask may be a binary OR combination of several log levels together. Logs in other contexts will not be recorded.

Parameters:

mask[in] This parameter is the logging mask to be enforced.

Return values:
  • HIPTENSOR_STATUS_SUCCESS – if the operation completed successfully.

  • HIPTENSOR_STATUS_INVALID_VALUE – if the given log mask is invalid.

hiptensorLoggerForceDisable#

hiptensorStatus_t hiptensorLoggerForceDisable()#

Disables logging.

Return values:

HIPTENSOR_STATUS_SUCCESS – if the operation completed successfully.