API reference guide#
This document provides information about hipTensor APIs, data types, and other programming constructs.
Supported GPU architectures#
List of supported CDNA architectures:
gfx908
gfx90a
gfx942
gfx950
Note
gfx9 = gfx908, gfx90a, gfx942, gfx950
gfx942+ = gfx942, gfx950
Supported data types#
hipTensor supports the following data type combinations for each API context.
Data Types <Ti / To / Tc> = <Input type / Output Type / Compute Type>, where:
Input Type = Matrix A / B
Output Type = Matrix C / D
Compute Type = Math / accumulation type
f16 = half-precision floating point
bf16 = half-precision brain floating point
f32 = single-precision floating point
cf32 = complex single-precision floating point
f64 = double-precision floating point
cf64 = complex double-precision floating point
Note
f16 represents equivalent support for both _Float16 and __half types.
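API context | Datatype Support <Ti / To / Tc> | GPU Support | Tensor Rank Support |
---|---|---|---|
Contraction (Scale, bilinear) | f16 / f16 / f32 | gfx908 gfx90a gfx942+ | 2m2n2k (Rank4) 3m3n3k (Rank6) 4m4n4k (Rank8) 5m5n5k (Rank10) 6m6n6k (Rank12) |
Contraction (Scale, bilinear) | bf16 / bf16 / f32 | gfx908 gfx90a gfx942+ | 2m2n2k (Rank4) 3m3n3k (Rank6) 4m4n4k (Rank8) 5m5n5k (Rank10) 6m6n6k (Rank12) |
Contraction (Scale, bilinear) | f32 / f32 / f32 | gfx908 gfx90a gfx942+ | 2m2n2k (Rank4) 3m3n3k (Rank6) 4m4n4k (Rank8) 5m5n5k (Rank10) 6m6n6k (Rank12) |
Contraction (Scale, bilinear) | f32 / f32 / f16 | gfx908 gfx90a gfx942+ | 2m2n2k (Rank4) 3m3n3k (Rank6) 4m4n4k (Rank8) 5m5n5k (Rank10) 6m6n6k (Rank12) |
Contraction (Scale, bilinear) | f32 / f32 / bf16 | gfx908 gfx90a gfx942+ | 2m2n2k (Rank4) 3m3n3k (Rank6) 4m4n4k (Rank8) 5m5n5k (Rank10) 6m6n6k (Rank12) |
Contraction (Scale, bilinear) | cf32 / cf32 / cf32 | gfx908 gfx90a gfx942+ | 2m2n2k (Rank4) 3m3n3k (Rank6) 4m4n4k (Rank8) 5m5n5k (Rank10) 6m6n6k (Rank12) |
Contraction (Scale, bilinear) | f64 / f64 / f64 | gfx942+ | 2m2n2k (Rank4) 3m3n3k (Rank6) 4m4n4k (Rank8) 5m5n5k (Rank10) 6m6n6k (Rank12) |
Contraction (Scale, bilinear) | f64 / f64 / f32 | gfx942+ | 2m2n2k (Rank4) 3m3n3k (Rank6) 4m4n4k (Rank8) 5m5n5k (Rank10) 6m6n6k (Rank12) |
Contraction (Scale, bilinear) | cf64 / cf64 / cf64 | gfx942+ | 2m2n2k (Rank4) 3m3n3k (Rank6) 4m4n4k (Rank8) 5m5n5k (Rank10) 6m6n6k (Rank12) |
Element-wise Operations | f16 / f16 / - | gfx908 gfx90a gfx942+ | Rank2 - Rank6 |
Element-wise Operations | f16 / f32 / - | gfx908 gfx90a gfx942+ | Rank2 - Rank6 |
Element-wise Operations | f32 / f32 / - | gfx908 gfx90a gfx942+ | Rank2 - Rank6 |
Reduction | f16 / f16 / f16 | gfx908 gfx90a gfx942+ | Rank2 - Rank6 |
Reduction | f16 / f16 / f32 | gfx908 gfx90a gfx942+ | Rank2 - Rank6 |
Reduction | bf16 / bf16 / bf16 | gfx908 gfx90a gfx942+ | Rank2 - Rank6 |
Reduction | bf16 / bf16 / f32 | gfx908 gfx90a gfx942+ | Rank2 - Rank6 |
Reduction | f32 / f32 / f32 | gfx908 gfx90a gfx942+ | Rank2 - Rank6 |
Reduction | f64 / f64 / f64 | gfx942+ | Rank2 - Rank6 |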
Limitations#
hipTensor currently supports tensors up to 2GB in size due to backend address-space limitations.
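The size limit can be checked on the host before any descriptor is created. The following is a minimal sketch, not part of the hipTensor API: the helper names are hypothetical, the tensor is assumed to be densely packed, and "2GB" is read as 2^31 bytes, which is an assumption about the exact threshold.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: "2GB" is interpreted as 2^31 bytes; the exact limit may differ.
constexpr std::uint64_t kAssumedMaxTensorBytes = 1ull << 31;

// Size in bytes of a densely packed tensor with the given extents.
inline std::uint64_t packedTensorBytes(const std::vector<std::int64_t>& lens,
                                       std::size_t elementSize)
{
    std::uint64_t elements = 1;
    for (std::int64_t len : lens)
    {
        assert(len > 0); // extents must be positive
        elements *= static_cast<std::uint64_t>(len);
    }
    return elements * elementSize;
}

// True if the packed tensor fits under the assumed limit.
inline bool fitsAssumedLimit(const std::vector<std::int64_t>& lens, std::size_t elementSize)
{
    return packedTensorBytes(lens, elementSize) <= kAssumedMaxTensorBytes;
}
```

For example, a 1024 x 1024 x 256 tensor of f32 elements occupies 2^30 bytes and passes the check, while 1024 x 1024 x 1024 f32 elements (2^32 bytes) does not.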
hipTensor API objects#
hiptensorDataType_t#
-
enum hiptensorDataType_t#
hipTensor data types
Values:
-
enumerator HIPTENSOR_R_32F#
-
enumerator HIPTENSOR_R_64F#
-
enumerator HIPTENSOR_R_16F#
-
enumerator HIPTENSOR_R_8I#
-
enumerator HIPTENSOR_C_32F#
-
enumerator HIPTENSOR_C_64F#
-
enumerator HIPTENSOR_C_16F#
-
enumerator HIPTENSOR_C_8I#
-
enumerator HIPTENSOR_R_8U#
-
enumerator HIPTENSOR_C_8U#
-
enumerator HIPTENSOR_R_32I#
-
enumerator HIPTENSOR_C_32I#
-
enumerator HIPTENSOR_R_32U#
-
enumerator HIPTENSOR_C_32U#
-
enumerator HIPTENSOR_R_16BF#
-
enumerator HIPTENSOR_C_16BF#
-
enumerator HIPTENSOR_R_4I#
-
enumerator HIPTENSOR_C_4I#
-
enumerator HIPTENSOR_R_4U#
-
enumerator HIPTENSOR_C_4U#
-
enumerator HIPTENSOR_R_16I#
-
enumerator HIPTENSOR_C_16I#
-
enumerator HIPTENSOR_R_16U#
-
enumerator HIPTENSOR_C_16U#
-
enumerator HIPTENSOR_R_64I#
-
enumerator HIPTENSOR_C_64I#
-
enumerator HIPTENSOR_R_64U#
-
enumerator HIPTENSOR_C_64U#
hiptensorStatus_t#
-
enum hiptensorStatus_t#
hipTensor status type enumeration
This type indicates the resulting status of hipTensor library function calls.
Values:
-
enumerator HIPTENSOR_STATUS_SUCCESS#
The operation is successful.
-
enumerator HIPTENSOR_STATUS_NOT_INITIALIZED#
The handle was not initialized.
-
enumerator HIPTENSOR_STATUS_ALLOC_FAILED#
Resource allocation failed inside the hipTensor library.
-
enumerator HIPTENSOR_STATUS_INVALID_VALUE#
Invalid value or parameter was passed to the function (indicates a user error).
-
enumerator HIPTENSOR_STATUS_ARCH_MISMATCH#
Indicates that the target architecture is not supported, or the device is not ready.
-
enumerator HIPTENSOR_STATUS_EXECUTION_FAILED#
Indicates the failure of a GPU program or a kernel, which can be caused by multiple reasons.
-
enumerator HIPTENSOR_STATUS_INTERNAL_ERROR#
An internal error has occurred.
-
enumerator HIPTENSOR_STATUS_NOT_SUPPORTED#
The requested operation is not supported.
-
enumerator HIPTENSOR_STATUS_CK_ERROR#
A call to Composable Kernels did not succeed.
-
enumerator HIPTENSOR_STATUS_HIP_ERROR#
An unknown hipTensor error has occurred.
-
enumerator HIPTENSOR_STATUS_INSUFFICIENT_WORKSPACE#
The provided workspace was insufficient.
-
enumerator HIPTENSOR_STATUS_INSUFFICIENT_DRIVER#
Indicates that the driver version is insufficient.
-
enumerator HIPTENSOR_STATUS_IO_ERROR#
Indicates an error related to file I/O.
hiptensorComputeDescriptor_t#
-
enum hiptensorComputeDescriptor_t#
hipTensor compute type enumeration
Values:
-
enumerator HIPTENSOR_COMPUTE_DESC_32F#
Single precision floating point.
-
enumerator HIPTENSOR_COMPUTE_DESC_64F#
Double precision floating point.
-
enumerator HIPTENSOR_COMPUTE_DESC_16F#
Half precision floating point.
-
enumerator HIPTENSOR_COMPUTE_DESC_16BF#
Brain float half precision floating point.
-
enumerator HIPTENSOR_COMPUTE_DESC_C32F#
Complex single precision floating point.
-
enumerator HIPTENSOR_COMPUTE_DESC_C64F#
Complex double precision floating point.
-
enumerator HIPTENSOR_COMPUTE_DESC_NONE#
No type.
hiptensorOperator_t#
-
enum hiptensorOperator_t#
Element-wise operations.
Values:
-
enumerator HIPTENSOR_OP_IDENTITY#
Identity operator (i.e., elements are not changed)
-
enumerator HIPTENSOR_OP_SQRT#
Square root.
-
enumerator HIPTENSOR_OP_RELU#
Rectified linear unit.
-
enumerator HIPTENSOR_OP_CONJ#
Complex conjugate.
-
enumerator HIPTENSOR_OP_RCP#
Reciprocal.
-
enumerator HIPTENSOR_OP_SIGMOID#
y=1/(1+exp(-x))
-
enumerator HIPTENSOR_OP_TANH#
y=tanh(x)
-
enumerator HIPTENSOR_OP_EXP#
Exponentiation.
-
enumerator HIPTENSOR_OP_LOG#
Log (base e).
-
enumerator HIPTENSOR_OP_ABS#
Absolute value.
-
enumerator HIPTENSOR_OP_NEG#
Negation.
-
enumerator HIPTENSOR_OP_SIN#
Sine.
-
enumerator HIPTENSOR_OP_COS#
Cosine.
-
enumerator HIPTENSOR_OP_TAN#
Tangent.
-
enumerator HIPTENSOR_OP_SINH#
Hyperbolic sine.
-
enumerator HIPTENSOR_OP_COSH#
Hyperbolic cosine.
-
enumerator HIPTENSOR_OP_ASIN#
Inverse sine.
-
enumerator HIPTENSOR_OP_ACOS#
Inverse cosine.
-
enumerator HIPTENSOR_OP_ATAN#
Inverse tangent.
-
enumerator HIPTENSOR_OP_ASINH#
Inverse hyperbolic sine.
-
enumerator HIPTENSOR_OP_ACOSH#
Inverse hyperbolic cosine.
-
enumerator HIPTENSOR_OP_ATANH#
Inverse hyperbolic tangent.
-
enumerator HIPTENSOR_OP_CEIL#
Ceiling.
-
enumerator HIPTENSOR_OP_FLOOR#
Floor.
-
enumerator HIPTENSOR_OP_ADD#
Addition of two elements.
-
enumerator HIPTENSOR_OP_MUL#
Multiplication of two elements.
-
enumerator HIPTENSOR_OP_MAX#
Maximum of two elements.
-
enumerator HIPTENSOR_OP_MIN#
Minimum of two elements.
-
enumerator HIPTENSOR_OP_UNKNOWN#
Reserved for internal use only.
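To make the unary operators above concrete, here is a host-side sketch of what a few of them compute per element. `SketchOp` and `applyUnary` are hypothetical stand-ins written for illustration; they are not the library's enum or implementation, and ReLU is assumed to be the usual max(x, 0).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a few hiptensorOperator_t values (not the library enum).
enum class SketchOp { Identity, Sqrt, Relu, Rcp, Sigmoid, Tanh };

// Applies one unary operator to a single element.
inline float applyUnary(SketchOp op, float x)
{
    switch (op)
    {
    case SketchOp::Identity: return x;
    case SketchOp::Sqrt:     return std::sqrt(x);
    case SketchOp::Relu:     return std::max(x, 0.0f);             // assumed definition
    case SketchOp::Rcp:      return 1.0f / x;
    case SketchOp::Sigmoid:  return 1.0f / (1.0f + std::exp(-x));  // y=1/(1+exp(-x))
    case SketchOp::Tanh:     return std::tanh(x);
    }
    assert(false && "unhandled operator");
    return x;
}
```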
hiptensorAlgo_t#
-
enum hiptensorAlgo_t#
Tensor contraction kernel selection algorithm.
Values:
-
enumerator HIPTENSOR_ALGO_ACTOR_CRITIC#
Uses novel actor-critic selection model.
-
enumerator HIPTENSOR_ALGO_DEFAULT#
Lets the internal heuristic choose.
-
enumerator HIPTENSOR_ALGO_DEFAULT_PATIENT#
Uses the more accurate and time-consuming model.
hiptensorWorksizePreference_t#
-
enum hiptensorWorksizePreference_t#
Workspace size selection.
Values:
-
enumerator HIPTENSOR_WORKSPACE_MIN#
At least one algorithm will be available.
-
enumerator HIPTENSOR_WORKSPACE_DEFAULT#
The most suitable algorithm will be available.
-
enumerator HIPTENSOR_WORKSPACE_MAX#
All algorithms will be available.
hiptensorLogLevel_t#
-
enum hiptensorLogLevel_t#
Logging context.
The logger output of certain contexts may be constrained to these levels.
Values:
-
enumerator HIPTENSOR_LOG_LEVEL_OFF#
No logging.
-
enumerator HIPTENSOR_LOG_LEVEL_ERROR#
Log errors.
-
enumerator HIPTENSOR_LOG_LEVEL_PERF_TRACE#
Log performance messages.
-
enumerator HIPTENSOR_LOG_LEVEL_PERF_HINT#
Log performance hints.
-
enumerator HIPTENSOR_LOG_LEVEL_HEURISTICS_TRACE#
Log selection messages.
-
enumerator HIPTENSOR_LOG_LEVEL_API_TRACE#
Log a trace of API calls.
hiptensorOperationDescriptorAttribute_t#
-
enum hiptensorOperationDescriptorAttribute_t#
Values:
-
enumerator HIPTENSOR_OPERATION_DESCRIPTOR_TAG#
-
enumerator HIPTENSOR_OPERATION_DESCRIPTOR_SCALAR_TYPE#
-
enumerator HIPTENSOR_OPERATION_DESCRIPTOR_FLOPS#
-
enumerator HIPTENSOR_OPERATION_DESCRIPTOR_MOVED_BYTES#
-
enumerator HIPTENSOR_OPERATION_DESCRIPTOR_PADDING_LEFT#
-
enumerator HIPTENSOR_OPERATION_DESCRIPTOR_PADDING_RIGHT#
-
enumerator HIPTENSOR_OPERATION_DESCRIPTOR_PADDING_VALUE#
hiptensorPlanPreferenceAttribute_t#
-
enum hiptensorPlanPreferenceAttribute_t#
Values:
-
enumerator HIPTENSOR_PLAN_PREFERENCE_AUTOTUNE_MODE#
-
enumerator HIPTENSOR_PLAN_PREFERENCE_CACHE_MODE#
-
enumerator HIPTENSOR_PLAN_PREFERENCE_INCREMENTAL_COUNT#
-
enumerator HIPTENSOR_PLAN_PREFERENCE_ALGO#
-
enumerator HIPTENSOR_PLAN_PREFERENCE_KERNEL_RANK#
-
enumerator HIPTENSOR_PLAN_PREFERENCE_JIT#
hiptensorPlanAttribute_t#
hiptensorAutotuneMode_t#
hiptensorCacheMode_t#
hiptensorJitMode_t#
hiptensorLoggerCallback_t#
-
typedef void (*hiptensorLoggerCallback_t)(int32_t logContext, const char *funcName, const char *msg)#
Logging callback. The specified callback is invoked whenever logging is enabled and a message is generated.
- Param logContext:
The logging context enum
- Param funcName:
A string holding the function name where the logging message was generated
- Param msg:
A string holding the logging message
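A function with the documented signature might look like the sketch below. The typedef is redeclared under a different name so the example compiles without the hipTensor headers; `captureLog` and `lastLogLine` are hypothetical names, and how the callback is registered with the library is not shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Same signature as hiptensorLoggerCallback_t, redeclared so the sketch is standalone.
typedef void (*loggerCallbackSketch_t)(std::int32_t logContext,
                                       const char*  funcName,
                                       const char*  msg);

// Hypothetical sink: holds the most recent formatted log line.
inline std::string& lastLogLine()
{
    static std::string line;
    return line;
}

// Example callback: formats "[context] function: message" into the sink.
void captureLog(std::int32_t logContext, const char* funcName, const char* msg)
{
    assert(funcName != nullptr && msg != nullptr); // sketch assumes non-null strings
    lastLogLine() = "[" + std::to_string(logContext) + "] " + funcName + ": " + msg;
}
```

The sketch stores the line in a static string for simplicity; a real callback would more likely forward to an application logger and should be safe to call from whichever thread emits the message.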
hiptensorTensorDescriptor#
-
struct hiptensorTensorDescriptor#
Structure representing a tensor descriptor.
Pointer to hiptensorTensorDescriptor.
Represents a descriptor for the tensor with the given properties of data type, lengths, strides and element-wise unary operation. Constructed with the hiptensorCreateTensorDescriptor() function.
hiptensorOperationDescriptor#
-
struct hiptensorOperationDescriptor#
Pointer to hiptensorOperationDescriptor.
hiptensorPlanPreference#
-
struct hiptensorPlanPreference#
Pointer to hiptensorPlanPreference.
Helper functions#
hiptensorCreate#
-
hiptensorStatus_t hiptensorCreate(hiptensorHandle_t *handle)#
Allocates and initializes a hipTensor library handle.
This function creates a hipTensor handle associated with the current device. To use a different device, call `hipInit(0)` to set the new device, then create another hipTensor handle with `hiptensorCreate()`.
- Parameters:
handle – [out] A pointer to the `hiptensorHandle_t` pointer that will store the newly created handle.
- Returns:
`HIPTENSOR_STATUS_SUCCESS` if the handle is created successfully, otherwise an error code.
hiptensorDestroy#
-
hiptensorStatus_t hiptensorDestroy(hiptensorHandle_t handle)#
Deallocates a hipTensor library handle.
- Parameters:
handle – [in] The `hiptensorHandle_t` to be deallocated.
- Returns:
`HIPTENSOR_STATUS_SUCCESS` on successful deallocation, otherwise an error code.
hiptensorHandleResizePlanCache#
-
hiptensorStatus_t hiptensorHandleResizePlanCache(hiptensorHandle_t handle, const uint32_t numEntries)#
Resizes the plan cache associated with a hipTensor handle.
- Parameters:
handle – [in] The hipTensor handle.
numEntries – [in] Number of entries the cache will support.
- Returns:
`HIPTENSOR_STATUS_SUCCESS` on success, or an error code otherwise.
hiptensorHandleWritePlanCacheToFile#
-
hiptensorStatus_t hiptensorHandleWritePlanCacheToFile(const hiptensorHandle_t handle, const char filename[])#
Writes the plan cache of a hipTensor handle to a file.
- Parameters:
handle – [in] The hipTensor handle whose plan cache will be written.
filename – [in] The name of the file to write the cache to.
- Returns:
`HIPTENSOR_STATUS_SUCCESS` on success, or an error code otherwise.
hiptensorHandleReadPlanCacheFromFile#
-
hiptensorStatus_t hiptensorHandleReadPlanCacheFromFile(hiptensorHandle_t handle, const char filename[], uint32_t *numCachelinesRead)#
Reads a plan cache from a file into a hipTensor handle.
- Parameters:
handle – [in] The hipTensor handle to populate with the plan cache.
filename – [in] The name of the file to read the cache from.
numCachelinesRead – [out] On exit, this variable will hold the number of successfully-read cachelines.
- Returns:
`HIPTENSOR_STATUS_SUCCESS` on success, or an error code otherwise.
hiptensorWriteKernelCacheToFile#
-
hiptensorStatus_t hiptensorWriteKernelCacheToFile(const hiptensorHandle_t handle, const char filename[])#
Writes the kernel cache of a hipTensor handle to a file.
- Parameters:
handle – [in] The hipTensor handle whose kernel cache will be written.
filename – [in] The name of the file to write the cache to.
- Returns:
`HIPTENSOR_STATUS_SUCCESS` on success, or an error code otherwise.
hiptensorReadKernelCacheFromFile#
-
hiptensorStatus_t hiptensorReadKernelCacheFromFile(hiptensorHandle_t handle, const char filename[])#
Reads a kernel cache from a file into a hipTensor handle.
- Parameters:
handle – [in] The hipTensor handle to populate with the kernel cache.
filename – [in] The name of the file to read the cache from.
- Returns:
`HIPTENSOR_STATUS_SUCCESS` on success, or an error code otherwise.
hiptensorCreateTensorDescriptor#
-
hiptensorStatus_t hiptensorCreateTensorDescriptor(const hiptensorHandle_t handle, hiptensorTensorDescriptor_t *desc, const uint32_t numModes, const int64_t lens[], const int64_t strides[], hiptensorDataType_t dataType, uint32_t alignmentRequirement)#
Creates and initializes a tensor descriptor.
This function allocates an instance of `hiptensorTensorDescriptor_t`. Call `hiptensorDestroyTensorDescriptor()` to free this instance.
- Parameters:
handle – [in] An opaque handle representing the hipTensor library context.
desc – [out] A pointer to the `hiptensorTensorDescriptor_t` object to be allocated.
numModes – [in] The number of modes (dimensions) for the tensor.
lens – [in] An array specifying the extent (length) of each mode; all values must be greater than zero.
strides – [in] An array where `strides[i]` is the displacement between consecutive elements in the i-th mode. If `NULL`, a generalized packed column-major memory layout is assumed (strides increase monotonically from left to right).
dataType – [in] The data type of the tensor elements.
alignmentRequirement – [in] The memory alignment requirement for the tensor.
- Return values:
`HIPTENSOR_STATUS_SUCCESS` – if the operation completes successfully.
`HIPTENSOR_STATUS_NOT_INITIALIZED` – if the handle is not initialized.
`HIPTENSOR_STATUS_ARCH_MISMATCH` – if the data type is not supported.
`HIPTENSOR_STATUS_INVALID_VALUE` – if any parameters are invalid.
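The packed column-major layout assumed when `strides` is `NULL` can be written out explicitly. The sketch below is a host-side illustration under the usual reading of that layout (the first mode has stride 1 and each later stride is the product of the preceding extents); `packedColumnMajorStrides` is a hypothetical helper, not a hipTensor function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Strides of a packed column-major tensor: strides[0] = 1 and
// strides[i] = strides[i - 1] * lens[i - 1].
inline std::vector<std::int64_t> packedColumnMajorStrides(const std::vector<std::int64_t>& lens)
{
    std::vector<std::int64_t> strides(lens.size());
    std::int64_t running = 1;
    for (std::size_t i = 0; i < lens.size(); ++i)
    {
        assert(lens[i] > 0); // all extents must be greater than zero
        strides[i] = running;
        running *= lens[i];
    }
    return strides;
}
```

For `lens = {4, 3, 2}` this yields strides `{1, 4, 12}`, which increase monotonically from left to right as described above.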
hiptensorDestroyTensorDescriptor#
-
hiptensorStatus_t hiptensorDestroyTensorDescriptor(hiptensorTensorDescriptor_t desc)#
Destroys a tensor descriptor.
- Parameters:
desc – [in] A pointer to the tensor descriptor object to be deallocated.
- Return values:
`HIPTENSOR_STATUS_SUCCESS` – if the operation completes successfully.
hiptensorDestroyOperationDescriptor#
-
hiptensorStatus_t hiptensorDestroyOperationDescriptor(hiptensorOperationDescriptor_t desc)#
Releases all resources linked to a `hiptensorOperationDescriptor` object.
- Parameters:
desc – [inout] The `hiptensorOperationDescriptor_t` object to deallocate.
- Return values:
`HIPTENSOR_STATUS_SUCCESS` – when successful, otherwise returns an error code.
hiptensorOperationDescriptorSetAttribute#
-
hiptensorStatus_t hiptensorOperationDescriptorSetAttribute(const hiptensorHandle_t handle, hiptensorOperationDescriptor_t desc, hiptensorOperationDescriptorAttribute_t attr, const void *buf, size_t sizeInBytes)#
Configures an attribute in a `hiptensorOperationDescriptor_t` object.
- Parameters:
handle – [in] Opaque handle for the hipTensor library context.
desc – [in] The `hiptensorOperationDescriptor_t` object being modified.
attr – [in] The attribute to configure.
buf – [in] Pointer to the buffer containing the attribute’s new value.
sizeInBytes – [in] Size of `buf` in bytes.
- Returns:
`HIPTENSOR_STATUS_SUCCESS` when successful, otherwise returns an error code.
hiptensorOperationDescriptorGetAttribute#
-
hiptensorStatus_t hiptensorOperationDescriptorGetAttribute(const hiptensorHandle_t handle, hiptensorOperationDescriptor_t desc, hiptensorOperationDescriptorAttribute_t attr, void *buf, size_t sizeInBytes)#
Extracts an attribute from a `hiptensorOperationDescriptor_t` object.
- Parameters:
handle – [in] Opaque handle representing the hipTensor library context.
desc – [in] The `hiptensorOperationDescriptor_t` object to examine.
attr – [in] The attribute to extract.
buf – [out] Pointer to the buffer where the attribute value will be written.
sizeInBytes – [in] The buffer size in bytes.
- Returns:
`HIPTENSOR_STATUS_SUCCESS` when successful, otherwise returns an error code.
hiptensorCreatePlanPreference#
-
hiptensorStatus_t hiptensorCreatePlanPreference(const hiptensorHandle_t handle, hiptensorPlanPreference_t *pref, hiptensorAlgo_t algo, hiptensorJitMode_t jitMode)#
Creates a `hiptensorPlanPreference_t` object that lets users limit kernel options for a plan/operation.
- Parameters:
handle – [in] Opaque handle representing the hipTensor library context.
pref – [out] Pointer to the new `hiptensorPlanPreference_t` structure.
algo – [in] Controls algorithm selection. Use `HIPTENSOR_ALGO_DEFAULT` to let the heuristic choose. Returns `HIPTENSOR_STATUS_NOT_SUPPORTED` if the specified algorithm isn’t available.
jitMode – [in] Controls whether hipTensor can use JIT-compiled kernels.
- Return values:
HIPTENSOR_STATUS_SUCCESS – When the operation completes successfully.
hiptensorDestroyPlanPreference#
-
hiptensorStatus_t hiptensorDestroyPlanPreference(hiptensorPlanPreference_t pref)#
Releases all resources associated with a `hiptensorPlanPreference_t` object.
- Parameters:
pref – [inout] The `hiptensorPlanPreference_t` object to deallocate.
- Return values:
`HIPTENSOR_STATUS_SUCCESS` – when successful, otherwise returns an error code.
hiptensorPlanPreferenceSetAttribute#
-
hiptensorStatus_t hiptensorPlanPreferenceSetAttribute(const hiptensorHandle_t handle, hiptensorPlanPreference_t pref, hiptensorPlanPreferenceAttribute_t attr, const void *buf, size_t sizeInBytes)#
Configures an attribute in a `hiptensorPlanPreference_t` object.
- Parameters:
handle – [in] Opaque handle representing the hipTensor library context.
pref – [inout] Opaque structure that narrows the search space for viable kernel candidates.
attr – [in] The attribute to configure.
buf – [in] Buffer (of size `sizeInBytes`) containing the new value for `attr`.
sizeInBytes – [in] Size of `buf` in bytes.
- Return values:
`HIPTENSOR_STATUS_SUCCESS` – When the operation completes successfully.
`HIPTENSOR_STATUS_NOT_INITIALIZED` – When the handle isn’t initialized.
`HIPTENSOR_STATUS_INVALID_VALUE` – When input data is invalid (typically user error).
hiptensorPlanGetAttribute#
-
hiptensorStatus_t hiptensorPlanGetAttribute(const hiptensorHandle_t handle, const hiptensorPlan_t plan, hiptensorPlanAttribute_t attr, void *buf, size_t sizeInBytes)#
Fetches information from an existing plan.
- Parameters:
handle – [in] Opaque handle representing the hipTensor library context.
plan – [in] The existing plan (created via `hiptensorCreatePlan` or `hiptensorCreatePlanAutotuned`).
attr – [in] The attribute to retrieve.
buf – [out] Buffer that will contain the requested attribute information upon successful return.
sizeInBytes – [in] Size of `buf` in bytes.
- Return values:
`HIPTENSOR_STATUS_SUCCESS` – When the operation completes successfully.
`HIPTENSOR_STATUS_INVALID_VALUE` – When input data is invalid (typically user error).
hiptensorEstimateWorkspaceSize#
-
hiptensorStatus_t hiptensorEstimateWorkspaceSize(const hiptensorHandle_t handle, const hiptensorOperationDescriptor_t desc, const hiptensorPlanPreference_t planPref, const hiptensorWorksizePreference_t workspacePref, uint64_t *workspaceSizeEstimate)#
Calculates the workspace size needed for a specific operation.
- Parameters:
handle – [in] Opaque handle representing the hipTensor library context.
desc – [in] Opaque structure encoding the operation details.
planPref – [in] Opaque structure limiting the viable candidate space.
workspacePref – [in] Parameter that affects workspace size calculation.
workspaceSizeEstimate – [out] The estimated workspace size in bytes needed for the operation.
- Return values:
`HIPTENSOR_STATUS_SUCCESS` – When the operation completes successfully.
`HIPTENSOR_STATUS_NOT_INITIALIZED` – When the handle isn’t initialized.
`HIPTENSOR_STATUS_INVALID_VALUE` – When input data is invalid (typically user error).
hiptensorCreatePlan#
-
hiptensorStatus_t hiptensorCreatePlan(const hiptensorHandle_t handle, hiptensorPlan_t *plan, const hiptensorOperationDescriptor_t desc, const hiptensorPlanPreference_t pref, uint64_t workspaceSizeLimit)#
Creates a `hiptensorPlan_t` object that selects an appropriate kernel for an operation and prepares execution.
Uses hipTensor’s heuristic to select a kernel for operations created by functions like `hiptensorCreateContraction`, `hiptensorCreateReduction`, `hiptensorCreatePermutation`, `hiptensorCreateElementwiseBinary`, or `hiptensorCreateElementwiseTrinary`. The resulting plan can be passed to the corresponding `hiptensor*Execute` function to perform the operation. The plan is created for the currently active HIP device.
- Parameters:
handle – [in] Opaque handle representing the hipTensor library context.
plan – [out] Pointer to the `hiptensorPlan_t` structure that will contain all execution information (including selected kernel).
desc – [in] Opaque structure encoding the operation details.
pref – [in] Opaque structure limiting the applicable kernels. May be `nullptr` to use defaults.
workspaceSizeLimit – [in] Maximum workspace size in bytes that the operation may use.
- Return values:
`HIPTENSOR_STATUS_SUCCESS` – When a viable kernel is found.
`HIPTENSOR_STATUS_NOT_SUPPORTED` – When no viable kernel can be found.
`HIPTENSOR_STATUS_NOT_INITIALIZED` – When the handle isn’t initialized.
`HIPTENSOR_STATUS_INSUFFICIENT_WORKSPACE` – When the provided workspace is too small.
`HIPTENSOR_STATUS_INVALID_VALUE` – When input data is invalid (typically user error).
hiptensorDestroyPlan#
-
hiptensorStatus_t hiptensorDestroyPlan(hiptensorPlan_t plan)#
Releases all resources associated with a plan.
- Parameters:
plan – [inout] The `hiptensorPlan_t` object to deallocate.
- Return values:
`HIPTENSOR_STATUS_SUCCESS` – when successful, otherwise returns an error code.
hiptensorGetErrorString#
-
const char *hiptensorGetErrorString(const hiptensorStatus_t error)#
Returns a descriptive string for a given error code.
- Parameters:
error – [in] The error code to convert to a string.
- Return values:
ErrorString – A string describing the error.
hiptensorGetVersion#
-
inline size_t hiptensorGetVersion()#
Returns the version number of hipTensor.
Return the version with three least significant digits for patch version, the next three digits for minor version, and the most significant digits for major version.
- Returns:
The version number calculated as major * 10000 + minor * 100 + patch.
hiptensorGetHiprtVersion#
-
int hiptensorGetHiprtVersion()#
Queries the HIP runtime version.
- Return values:
-1 – If the operation failed.
Integer – An integer representing the HIP runtime version if the operation succeeded.
Contraction operations#
hiptensorCreateContraction#
-
hiptensorStatus_t hiptensorCreateContraction(const hiptensorHandle_t handle, hiptensorOperationDescriptor_t *desc, const hiptensorTensorDescriptor_t descA, const int32_t modeA[], hiptensorOperator_t opA, const hiptensorTensorDescriptor_t descB, const int32_t modeB[], hiptensorOperator_t opB, const hiptensorTensorDescriptor_t descC, const int32_t modeC[], hiptensorOperator_t opC, const hiptensorTensorDescriptor_t descD, const int32_t modeD[], const hiptensorComputeDescriptor_t descCompute)#
Allocates and initializes a `hiptensorOperationDescriptor` object for a tensor contraction of the form \(\mathcal{D} = \alpha \mathcal{A} \mathcal{B} + \beta \mathcal{C}\).
Free this object by calling `hiptensorDestroyOperationDescriptor()`.
- Parameters:
handle – [in] Opaque handle representing the hipTensor library context.
desc – [out] Pointer to the `hiptensorOperationDescriptor_t` object that will be allocated and populated with contraction operation details.
descA – [in] Tensor descriptor for A, specifying data type, modes, and strides.
modeA – [in] Array with `nmodeA` entries representing tensor A’s modes. Each `modeA[i]` corresponds to the `extent[i]` and `stride[i]` passed to `hiptensorCreateTensorDescriptor`.
opA – [in] Unary operator applied to each element of A before processing. A’s original data remains unchanged.
descB – [in] Tensor descriptor for B.
modeB – [in] Array with `nmodeB` entries representing tensor B’s modes.
opB – [in] Unary operator applied to each element of B.
descC – [in] Tensor descriptor for C.
modeC – [in] Array with `nmodeC` entries representing tensor C’s modes.
opC – [in] Unary operator applied to each element of C.
descD – [in] Tensor descriptor for D (must match `descC`).
modeD – [in] Array with `nmodeD` entries representing tensor D’s modes (must match `modeC`).
descCompute – [in] Data type used for intermediate computation of \(T = A * B\).
- Return values:
`HIPTENSOR_STATUS_NOT_SUPPORTED` – When data type combinations or operations aren’t supported.
`HIPTENSOR_STATUS_INVALID_VALUE` – When tensor dimensions or modes contain illegal values.
`HIPTENSOR_STATUS_SUCCESS` – When the operation completes successfully.
`HIPTENSOR_STATUS_NOT_INITIALIZED` – When the handle isn’t initialized.
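The mode arrays determine which modes are contracted and which survive into the output. The sketch below classifies them under a common reading of the "XmXnXk" notation used in the support table, which is an assumption here: modes shared by A and C are "m", modes shared by B and C are "n", and modes of A that are absent from C are contracted "k" modes. `ModeSplit` and `splitContractionModes` are hypothetical names, and batched modes (present in all three tensors) are not handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using ModeList = std::vector<std::int32_t>;

// Free modes of A (m), free modes of B (n), and contracted modes (k).
struct ModeSplit
{
    ModeList m, n, k;
};

inline bool hasMode(const ModeList& modes, std::int32_t mode)
{
    return std::find(modes.begin(), modes.end(), mode) != modes.end();
}

// Classifies modes by which tensors they appear in (assumed convention, see lead-in).
inline ModeSplit splitContractionModes(const ModeList& modeA,
                                       const ModeList& modeB,
                                       const ModeList& modeC)
{
    ModeSplit split;
    for (std::int32_t mode : modeA)
    {
        if (hasMode(modeC, mode))
        {
            split.m.push_back(mode);
        }
        else
        {
            assert(hasMode(modeB, mode)); // a mode missing from C must be shared with B
            split.k.push_back(mode);
        }
    }
    for (std::int32_t mode : modeB)
    {
        if (hasMode(modeC, mode))
            split.n.push_back(mode);
    }
    return split;
}
```

With `modeA = {'m','n','h','k'}`, `modeB = {'u','v','h','k'}` and `modeC = {'m','n','u','v'}`, the split has two modes in each group, which corresponds to a 2m2n2k contraction with rank-4 tensors.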
hiptensorContract#
-
hiptensorStatus_t hiptensorContract(const hiptensorHandle_t handle, const hiptensorPlan_t plan, const void *alpha, const void *A, const void *B, const void *beta, const void *C, void *D, void *workspace, uint64_t workspaceSize, hipStream_t stream)#
Performs tensor contraction \(\mathcal{D} = \alpha \mathcal{A} \mathcal{B} + \beta \mathcal{C}\).
Computes: \(\mathcal{D}_{{modes}_\mathcal{D}} \gets \alpha \mathcal{A}_{{modes}_\mathcal{A}} \mathcal{B}_{{modes}_\mathcal{B}} + \beta \mathcal{C}_{{modes}_\mathcal{C}}\). The active HIP device must match the device that was active during plan creation.
- Parameters:
handle – [in] Opaque handle representing the hipTensor library context.
plan – [in] Opaque handle containing the contraction execution plan.
alpha – [in] Scaling factor for \(A*B\). Data type determined by `descCompute`. Pointer to host memory.
A – [in] Pointer to tensor A data in GPU-accessible memory. Must not overlap with elements written to D.
B – [in] Pointer to tensor B data in GPU-accessible memory. Must not overlap with elements written to D.
beta – [in] Scaling factor for C. Data type determined by `descCompute`. Pointer to host memory.
C – [in] Pointer to tensor C data in GPU-accessible memory.
D – [out] Pointer to tensor D data in GPU-accessible memory.
workspace – [out] Optional parameter (can be `NULL`). Additional device memory workspace for optimizations.
workspaceSize – [in] Size of the `workspace` array in bytes.
stream – [in] HIP stream for all computations.
- Return values:
`HIPTENSOR_STATUS_NOT_SUPPORTED` – When the operation isn’t supported.
`HIPTENSOR_STATUS_INVALID_VALUE` – When input data is invalid (typically user error).
`HIPTENSOR_STATUS_SUCCESS` – When the operation completes successfully.
`HIPTENSOR_STATUS_NOT_INITIALIZED` – When the handle isn’t initialized.
`HIPTENSOR_STATUS_ARCH_MISMATCH` – When the plan was created for a different device than the currently active one.
`HIPTENSOR_STATUS_INSUFFICIENT_DRIVER` – When the driver is insufficient.
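A slow host-side reference can help when validating results of the contraction formula above. The following sketch is not how hipTensor computes the result: it assumes f32 data, identity unary operators, packed column-major layouts, and that D shares C's modes. All names (`Modes`, `Extents`, `packedOffset`, `contractReference`) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using Modes   = std::vector<std::int32_t>;
using Extents = std::map<std::int32_t, std::int64_t>; // mode label -> extent
using Coords  = std::map<std::int32_t, std::int64_t>; // mode label -> coordinate

// Linear offset of `coord` in a packed column-major tensor with the given modes.
inline std::size_t packedOffset(const Modes& modes, const Extents& extent, const Coords& coord)
{
    std::int64_t offset = 0;
    std::int64_t stride = 1;
    for (std::int32_t mode : modes)
    {
        offset += coord.at(mode) * stride;
        stride *= extent.at(mode);
    }
    return static_cast<std::size_t>(offset);
}

// D = alpha * A * B + beta * C, summing over every mode that is absent from C.
inline void contractReference(float alpha, const std::vector<float>& A, const Modes& modeA,
                              const std::vector<float>& B, const Modes& modeB,
                              float beta, const std::vector<float>& C, const Modes& modeC,
                              std::vector<float>& D, const Extents& extent)
{
    assert(!extent.empty() && D.size() == C.size());

    for (std::size_t i = 0; i < D.size(); ++i)
        D[i] = beta * C[i];

    Coords coord;
    for (const auto& entry : extent)
        coord[entry.first] = 0;

    bool done = false;
    while (!done)
    {
        D[packedOffset(modeC, extent, coord)]
            += alpha * A[packedOffset(modeA, extent, coord)] * B[packedOffset(modeB, extent, coord)];

        // Advance the coordinates like an odometer over all modes.
        done = true;
        for (auto& entry : coord)
        {
            if (++entry.second < extent.at(entry.first))
            {
                done = false;
                break;
            }
            entry.second = 0;
        }
    }
}
```

Iterating over the joint index space of all modes keeps the sketch short at the cost of speed; it is intended for small test shapes only.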
Element-wise operations#
hiptensorCreatePermutation#
-
hiptensorStatus_t hiptensorCreatePermutation(const hiptensorHandle_t handle, hiptensorOperationDescriptor_t *desc, const hiptensorTensorDescriptor_t descA, const int32_t modeA[], hiptensorOperator_t opA, const hiptensorTensorDescriptor_t descB, const int32_t modeB[], const hiptensorComputeDescriptor_t descCompute)#
Creates an operation descriptor for tensor permutation.
- Parameters:
handle – [in] Opaque handle containing the hipTensor library context.
desc – [out] Opaque structure that will be allocated and filled with the encoded permutation information.
descA – [in] Descriptor containing information about A’s data type, modes, and strides.
modeA – [in] Array of size descA->numModes containing the mode names of A.
opA – [in] Unary operator applied to each element of A before further processing. The original tensor data remains unchanged.
descB – [in] Descriptor containing information about B’s data type, modes, and strides.
modeB – [in] Array of size descB->numModes containing the mode names of B.
descCompute – [in] Determines the precision used for this operation.
- Return values:
HIPTENSOR_STATUS_NOT_SUPPORTED – When data type combinations or operations aren’t supported
HIPTENSOR_STATUS_INVALID_VALUE – When tensor dimensions or modes contain illegal values
HIPTENSOR_STATUS_SUCCESS – When the operation completes successfully
HIPTENSOR_STATUS_NOT_INITIALIZED – When the handle isn’t initialized.
hiptensorPermute#
-
hiptensorStatus_t hiptensorPermute(const hiptensorHandle_t handle, const hiptensorPlan_t plan, const void *alpha, const void *A, void *B, const hipStream_t stream)#
Executes tensor permutation.
Computes the permutation operation:
\[ B_{\Pi^B(i_0,i_1,...,i_n)} = \alpha \Psi(A_{\Pi^A(i_0,i_1,...,i_n)}) \]
- Parameters:
handle – [in] Opaque handle containing hipTensor’s library context.
plan – [in] Opaque handle with permutation information.
alpha – [in] Scaling factor for A (typeScalar type). Pointer to host memory.
A – [in] Multi-mode tensor (typeA type) with nmodeA modes. Pointer to GPU-accessible memory.
B – [inout] Multi-mode tensor (typeB type) with nmodeB modes. Pointer to GPU-accessible memory.
stream – [in] HIP stream for all operations.
- Return values:
HIPTENSOR_STATUS_NOT_SUPPORTED – When data type combinations or operations aren’t supported
HIPTENSOR_STATUS_INVALID_VALUE – When tensor dimensions or modes contain illegal values
HIPTENSOR_STATUS_SUCCESS – When the operation completes successfully
HIPTENSOR_STATUS_NOT_INITIALIZED – When the handle isn’t initialized.
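The permutation formula can be mirrored on the host for small tensors. This sketch assumes f32 data and packed column-major layouts for both A and B; `permuteReference` and `identityOp` are hypothetical names, and the unary operator \(\Psi\) is passed as a plain function pointer rather than a `hiptensorOperator_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

inline float identityOp(float x) { return x; }

// B[perm(i)] = alpha * psi(A[i]) for packed column-major A and B.
inline void permuteReference(float alpha, const std::vector<float>& A,
                             const std::vector<std::int32_t>& modeA,
                             const std::vector<std::int64_t>& lensA,
                             std::vector<float>& B,
                             const std::vector<std::int32_t>& modeB,
                             float (*psi)(float))
{
    assert(modeA.size() == lensA.size() && modeA.size() == modeB.size());
    assert(A.size() == B.size());

    // Extent of each mode, taken from A's descriptor data.
    std::map<std::int32_t, std::int64_t> extentOf;
    for (std::size_t i = 0; i < modeA.size(); ++i)
        extentOf[modeA[i]] = lensA[i];

    // Packed column-major stride of each mode in B's ordering.
    std::map<std::int32_t, std::int64_t> strideB;
    std::int64_t running = 1;
    for (std::int32_t mode : modeB)
    {
        strideB[mode] = running;
        running *= extentOf.at(mode);
    }
    assert(static_cast<std::size_t>(running) == A.size());

    for (std::size_t linear = 0; linear < A.size(); ++linear)
    {
        // Decode A's linear index into per-mode coordinates and re-encode for B.
        std::int64_t rest    = static_cast<std::int64_t>(linear);
        std::int64_t offsetB = 0;
        for (std::size_t i = 0; i < modeA.size(); ++i)
        {
            offsetB += (rest % lensA[i]) * strideB.at(modeA[i]);
            rest /= lensA[i];
        }
        B[static_cast<std::size_t>(offsetB)] = alpha * psi(A[linear]);
    }
}
```

Swapping the two modes of a 2 x 3 tensor with `alpha = 2` is a scaled transpose: the input `{1, 2, 3, 4, 5, 6}` becomes `{2, 6, 10, 4, 8, 12}`.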
hiptensorCreateElementwiseBinary#
-
hiptensorStatus_t hiptensorCreateElementwiseBinary(const hiptensorHandle_t handle, hiptensorOperationDescriptor_t *desc, const hiptensorTensorDescriptor_t descA, const int32_t modeA[], hiptensorOperator_t opA, const hiptensorTensorDescriptor_t descC, const int32_t modeC[], hiptensorOperator_t opC, const hiptensorTensorDescriptor_t descD, const int32_t modeD[], hiptensorOperator_t opAC, const hiptensorComputeDescriptor_t descCompute)#
Creates an operation descriptor for elementwise binary operations.
- Parameters:
handle – [in] Opaque handle containing hipTensor’s library context.
desc – [out] Opaque structure allocated and filled with the elementwise operation information.
descA – [in] Descriptor containing A’s data type, modes, and strides.
modeA – [in] Host memory array of size descA->numModes with A’s mode names.
opA – [in] Unary operator applied to each element of A before processing. A’s original data remains unchanged.
descC – [in] Descriptor containing C’s data type, modes, and strides.
modeC – [in] Host memory array of size descC->numModes with C’s mode names.
opC – [in] Unary operator applied to each element of C before processing. C’s original data remains unchanged.
descD – [in] Descriptor containing D’s data type, modes, and strides. Currently must be identical to descC.
modeD – [in] Host memory array of size descD->numModes with D’s mode names.
opAC – [in] Element-wise binary operator.
descCompute – [in] Determines the precision for this operation.
- Return values:
HIPTENSOR_STATUS_NOT_SUPPORTED – When data type combinations or operations aren’t supported
HIPTENSOR_STATUS_INVALID_VALUE – When tensor dimensions or modes contain illegal values
HIPTENSOR_STATUS_SUCCESS – When the operation completes successfully
HIPTENSOR_STATUS_NOT_INITIALIZED – When the handle isn’t initialized.
hiptensorElementwiseBinaryExecute#
-
hiptensorStatus_t hiptensorElementwiseBinaryExecute(const hiptensorHandle_t handle, const hiptensorPlan_t plan, const void *alpha, const void *A, const void *gamma, const void *C, void *D, hipStream_t stream)#
Executes element-wise tensor operation on two input tensors.
This function computes the element-wise operation:
\[ D_{\Pi^C(i_0,i_1,...,i_n)} = \Phi_{AC}(\alpha \Psi_A(A_{\Pi^A(i_0,i_1,...,i_n)}), \gamma \Psi_C(C_{\Pi^C(i_0,i_1,...,i_n)})) \]
where:
\(D\) is the output tensor.
\(A\) and \(C\) are the input tensors.
\(\alpha\) and \(\gamma\) are scalar scaling factors.
\(\Psi_A\) and \(\Psi_C\) are unary operators (applied only if \(\alpha\) and \(\gamma\) are non-zero).
\(\Phi_{AC}\) is a binary element-wise operator.
\(\Pi^A\) and \(\Pi^C\) represent mode permutations.
- Parameters:
handle – [in] Opaque handle containing hipTensor’s library context.
plan – [in] Opaque handle with elementwise operation information.
alpha – [in] Scaling factor for A. Host memory pointer.
A – [in] Multi-mode tensor in GPU-accessible memory. Must not overlap with elements written to D.
gamma – [in] Scaling factor for C. Host memory pointer.
C – [in] Multi-mode tensor in GPU-accessible memory. Must not overlap with elements written to D.
D – [out] Multi-mode tensor in GPU-accessible memory. C and D may be identical only if descC == descD.
stream – [in] Stream for performing the operation.
- Return values:
HIPTENSOR_STATUS_NOT_SUPPORTED – When data type combinations or operations aren’t supported.
HIPTENSOR_STATUS_INVALID_VALUE – When tensor dimensions or modes contain illegal values.
HIPTENSOR_STATUS_SUCCESS – When the operation completes successfully.
HIPTENSOR_STATUS_NOT_INITIALIZED – When the handle isn’t initialized.
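The formula above can be illustrated with a small CPU reference. This is a minimal sketch, not hipTensor code: the names referenceElementwiseBinary, exampleElementwiseBinary, UnaryOp, and BinaryOp are hypothetical, and the sketch assumes packed float tensors in which the first listed mode varies fastest. It always applies the unary operators, so it does not model the special handling of zero scaling factors.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Illustrative CPU reference only. None of these names belong to hipTensor.
using UnaryOp  = std::function<float(float)>;
using BinaryOp = std::function<float(float, float)>;

// Computes D(i,j) = opAC(alpha * opA(A(j,i)), gamma * opC(C(i,j))).
// A is stored with modes {j, i}; C and D are stored with modes {i, j}.
// Assumed layout: packed storage, first listed mode varies fastest.
std::vector<float> referenceElementwiseBinary(float alpha, const std::vector<float>& A,
                                              float gamma, const std::vector<float>& C,
                                              std::size_t extentI, std::size_t extentJ,
                                              const UnaryOp& opA, const UnaryOp& opC,
                                              const BinaryOp& opAC)
{
    assert(A.size() == extentI * extentJ);
    assert(C.size() == extentI * extentJ);

    std::vector<float> D(extentI * extentJ);
    for(std::size_t j = 0; j < extentJ; ++j)
    {
        for(std::size_t i = 0; i < extentI; ++i)
        {
            const float a = alpha * opA(A[j + i * extentJ]); // A(j, i), permuted read
            const float c = gamma * opC(C[i + j * extentI]); // C(i, j)
            D[i + j * extentI] = opAC(a, c);                 // D(i, j)
        }
    }
    return D;
}

// Example: D = 2 * transpose(A) + C, with identity unary operators and addition.
std::vector<float> exampleElementwiseBinary()
{
    const std::vector<float> A = {1, 2, 3, 4, 5, 6};       // modes {j, i}, extents 3 x 2
    const std::vector<float> C = {10, 20, 30, 40, 50, 60}; // modes {i, j}, extents 2 x 3
    const UnaryOp  identity = [](float x) { return x; };
    const BinaryOp add      = [](float x, float y) { return x + y; };
    return referenceElementwiseBinary(2.0f, A, 1.0f, C, 2, 3, identity, identity, add);
}
```

A reference like this can be useful for checking device results on small tensors before scaling up.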
hiptensorCreateElementwiseTrinary#
-
hiptensorStatus_t hiptensorCreateElementwiseTrinary(const hiptensorHandle_t handle, hiptensorOperationDescriptor_t *desc, const hiptensorTensorDescriptor_t descA, const int32_t modeA[], hiptensorOperator_t opA, const hiptensorTensorDescriptor_t descB, const int32_t modeB[], hiptensorOperator_t opB, const hiptensorTensorDescriptor_t descC, const int32_t modeC[], hiptensorOperator_t opC, const hiptensorTensorDescriptor_t descD, const int32_t modeD[], hiptensorOperator_t opAB, hiptensorOperator_t opABC, const hiptensorComputeDescriptor_t descCompute)#
Creates an operation descriptor for elementwise trinary operations.
- Parameters:
handle – [in] Opaque handle containing hipTensor’s library context.
desc – [out] Opaque structure allocated and filled with the elementwise operation information.
descA – [in] Descriptor containing A’s data type, modes, and strides.
modeA – [in] Host memory array of size descA->numModes with A’s mode names.
opA – [in] Unary operator applied to each element of A before processing. A’s original data remains unchanged.
descB – [in] Descriptor containing B’s data type, modes, and strides.
modeB – [in] Host memory array of size descB->numModes with B’s mode names.
opB – [in] Unary operator applied to each element of B before processing. B’s original data remains unchanged.
descC – [in] Descriptor containing C’s data type, modes, and strides.
modeC – [in] Host memory array of size descC->numModes with C’s mode names.
opC – [in] Unary operator applied to each element of C before processing. C’s original data remains unchanged.
descD – [in] Descriptor containing D’s data type, modes, and strides. Currently must be identical to descC.
modeD – [in] Host memory array of size descD->numModes with D’s mode names.
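opAB – [in] Element-wise binary operator that combines the processed elements of A and B.
opABC – [in] Element-wise binary operator that combines the result of opAB with the processed elements of C.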
descCompute – [in] Determines the precision for this operation.
- Return values:
HIPTENSOR_STATUS_SUCCESS – When the operation completes successfully.
HIPTENSOR_STATUS_NOT_INITIALIZED – When the handle isn’t initialized.
HIPTENSOR_STATUS_INVALID_VALUE – When input data is invalid (typically user error).
HIPTENSOR_STATUS_ARCH_MISMATCH – When the device isn’t ready or the target architecture isn’t supported.
hiptensorElementwiseTrinaryExecute#
-
hiptensorStatus_t hiptensorElementwiseTrinaryExecute(const hiptensorHandle_t handle, const hiptensorPlan_t plan, const void *alpha, const void *A, const void *beta, const void *B, const void *gamma, const void *C, void *D, hipStream_t stream)#
Executes element-wise tensor operation on three input tensors.
This function computes the element-wise operation:
\[ D_{\Pi^C(i_0,i_1,...,i_n)} = \Phi_{ABC}(\Phi_{AB}(\alpha \Psi_A(A_{\Pi^A(i_0,i_1,...,i_n)}), \beta \Psi_B(B_{\Pi^B(i_0,i_1,...,i_n)})), \gamma \Psi_C(C_{\Pi^C(i_0,i_1,...,i_n)})) \]
Tensor modes can appear in any order, providing flexibility. However, the following restrictions apply:
Modes present in \(A\) or \(B\) must also be present in the output tensor \(D\). Modes that appear only in the inputs would imply a contraction, which is handled by hiptensorContract or hiptensorReduce.
Each mode can appear at most once in each tensor.
- Parameters:
handle – [in] Opaque handle containing hipTensor’s library context.
plan – [in] Opaque handle with elementwise operation information.
alpha – [in] Scaling factor for A. Host memory pointer.
A – [in] Multi-mode tensor in GPU-accessible memory. Must not overlap with elements written to D.
beta – [in] Scaling factor for B. Host memory pointer.
B – [in] Multi-mode tensor in GPU-accessible memory. Must not overlap with elements written to D.
gamma – [in] Scaling factor for C. Host memory pointer.
C – [in] Multi-mode tensor in GPU-accessible memory. Must not overlap with elements written to D.
D – [out] Multi-mode tensor in GPU-accessible memory. C and D may be identical only if descC == descD.
stream – [in] Stream for performing the operation.
- Return values:
HIPTENSOR_STATUS_NOT_SUPPORTED – When data type combinations or operations aren’t supported.
HIPTENSOR_STATUS_INVALID_VALUE – When tensor dimensions or modes contain illegal values.
HIPTENSOR_STATUS_SUCCESS – When the operation completes successfully.
HIPTENSOR_STATUS_NOT_INITIALIZED – When the handle isn’t initialized.
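The mode restrictions and the nested operator order described above can be checked on the host before any descriptor is created. The following is a rough sketch under stated assumptions: hasUniqueModes, modesAppearIn, trinaryModesAreValid, trinaryElement, and exampleTrinaryModes are hypothetical helpers, not hipTensor functions, and trinaryElement hard-codes identity unary operators with addition for \(\Phi_{AB}\) and multiplication for \(\Phi_{ABC}\) purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical helpers for illustration. They are not part of hipTensor.

// True when no mode label is repeated within a single tensor.
bool hasUniqueModes(const std::vector<std::int32_t>& modes)
{
    const std::set<std::int32_t> unique(modes.begin(), modes.end());
    return unique.size() == modes.size();
}

// True when every mode of the input tensor also appears in the output tensor.
bool modesAppearIn(const std::vector<std::int32_t>& input,
                   const std::vector<std::int32_t>& output)
{
    return std::all_of(input.begin(), input.end(), [&output](std::int32_t mode) {
        return std::find(output.begin(), output.end(), mode) != output.end();
    });
}

// Checks the two documented restrictions for a trinary operation.
bool trinaryModesAreValid(const std::vector<std::int32_t>& modeA,
                          const std::vector<std::int32_t>& modeB,
                          const std::vector<std::int32_t>& modeC,
                          const std::vector<std::int32_t>& modeD)
{
    return hasUniqueModes(modeA) && hasUniqueModes(modeB) && hasUniqueModes(modeC)
           && hasUniqueModes(modeD) && modesAppearIn(modeA, modeD)
           && modesAppearIn(modeB, modeD);
}

// One output element with identity unary operators,
// Phi_AB = addition and Phi_ABC = multiplication (choices made for this sketch).
float trinaryElement(float alpha, float a, float beta, float b, float gamma, float c)
{
    const float ab = (alpha * a) + (beta * b); // Phi_AB(alpha * A, beta * B)
    return ab * (gamma * c);                   // Phi_ABC(ab, gamma * C)
}

// Usage example: a permuted input is fine, an input-only mode is not.
void exampleTrinaryModes()
{
    assert(trinaryModesAreValid({'a', 'b'}, {'b', 'a'}, {'a', 'b'}, {'a', 'b'}));
    assert(!trinaryModesAreValid({'a', 'k'}, {'b', 'a'}, {'a', 'b'}, {'a', 'b'}));
}
```

The second call in exampleTrinaryModes fails validation because mode k exists only in A, which would require a contraction rather than an element-wise operation.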
Reduction operations#
hiptensorCreateReduction#
-
hiptensorStatus_t hiptensorCreateReduction(const hiptensorHandle_t handle, hiptensorOperationDescriptor_t *desc, const hiptensorTensorDescriptor_t descA, const int32_t modeA[], hiptensorOperator_t opA, const hiptensorTensorDescriptor_t descC, const int32_t modeC[], hiptensorOperator_t opC, const hiptensorTensorDescriptor_t descD, const int32_t modeD[], hiptensorOperator_t opReduce, const hiptensorComputeDescriptor_t descCompute)#
Creates a hiptensorOperationDescriptor_t object that encodes a tensor reduction of the form \( D = \alpha \cdot \mathrm{opReduce}(\mathrm{opA}(A)) + \beta \cdot \mathrm{opC}(C) \).
- Parameters:
handle – [in] Opaque handle holding hipTensor’s library context.
desc – [out] This opaque struct gets allocated and filled with the information that encodes the requested tensor reduction operation.
descA – [in] The descriptor that holds the information about the data type, modes and strides of A.
modeA – [in] Host memory array of size descA->numModes with A’s mode names.
opA – [in] Unary operator that will be applied to each element of A before it is further processed. The original data of this tensor remains unchanged.
descC – [in] The descriptor that holds the information about the data type, modes and strides of C.
modeC – [in] Host memory array of size descC->numModes with C’s mode names.
opC – [in] Unary operator that will be applied to each element of C before it is further processed. The original data of this tensor remains unchanged.
descD – [in] Must be identical to descC for now.
modeD – [in] Must be identical to modeC for now.
opReduce – [in] Binary operator used to reduce elements of A.
descCompute – [in] All arithmetic is performed using this data type.
- Return values:
HIPTENSOR_STATUS_NOT_SUPPORTED – If the operation is not supported.
HIPTENSOR_STATUS_INVALID_VALUE – If some input data is invalid (this typically indicates a user error).
HIPTENSOR_STATUS_SUCCESS – If the operation completed successfully.
HIPTENSOR_STATUS_NOT_INITIALIZED – If the handle is not initialized.
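To make the reduction form concrete, here is a minimal CPU sketch that reduces a 2-mode tensor A(i, k) over mode k into D(i). It is not the library's implementation: referenceReduce, exampleReduceSum, exampleReduceMax, UnaryOp, and BinaryOp are hypothetical names, and the sketch assumes packed float data with the first listed mode varying fastest.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Illustrative CPU reference only. None of these names belong to hipTensor.
using UnaryOp  = std::function<float(float)>;
using BinaryOp = std::function<float(float, float)>;

// Computes D(i) = alpha * opReduce_k(opA(A(i,k))) + beta * opC(C(i)).
// A has modes {i, k}; C and D have mode {i}. First listed mode varies fastest.
std::vector<float> referenceReduce(float alpha, const std::vector<float>& A,
                                   std::size_t extentI, std::size_t extentK,
                                   float beta, const std::vector<float>& C,
                                   const UnaryOp& opA, const UnaryOp& opC,
                                   const BinaryOp& opReduce)
{
    assert(extentK > 0);
    assert(A.size() == extentI * extentK);
    assert(C.size() == extentI);

    std::vector<float> D(extentI);
    for(std::size_t i = 0; i < extentI; ++i)
    {
        // Seed the accumulator with the k = 0 element so that no
        // operator-specific identity value is needed.
        float acc = opA(A[i]);
        for(std::size_t k = 1; k < extentK; ++k)
        {
            acc = opReduce(acc, opA(A[i + k * extentI]));
        }
        D[i] = alpha * acc + beta * opC(C[i]);
    }
    return D;
}

const UnaryOp kIdentity = [](float x) { return x; };

// Sum over k, scaled by 2, plus C.
std::vector<float> exampleReduceSum()
{
    const std::vector<float> A = {1, 2, 3, 4, 5, 6}; // modes {i, k}, extents 2 x 3
    const std::vector<float> C = {100, 200};
    const BinaryOp add = [](float x, float y) { return x + y; };
    return referenceReduce(2.0f, A, 2, 3, 1.0f, C, kIdentity, kIdentity, add);
}

// Maximum over k, with beta = 0 so that C does not contribute.
std::vector<float> exampleReduceMax()
{
    const std::vector<float> A = {1, 2, 3, 4, 5, 6};
    const std::vector<float> C = {100, 200};
    const BinaryOp maxOp = [](float x, float y) { return std::max(x, y); };
    return referenceReduce(1.0f, A, 2, 3, 0.0f, C, kIdentity, kIdentity, maxOp);
}
```

Seeding the accumulator from the first element keeps the sketch independent of the reduction operator, at the cost of requiring a non-empty reduced mode.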
hiptensorReduce#
-
hiptensorStatus_t hiptensorReduce(const hiptensorHandle_t handle, const hiptensorPlan_t plan, const void *alpha, const void *A, const void *beta, const void *C, void *D, void *workspace, uint64_t workspaceSize, hipStream_t stream)#
Performs the tensor reduction that is encoded by plan.
- Parameters:
handle – [in] An opaque handle representing the hipTensor library context.
plan – [in] Opaque handle with reduction operation information.
alpha – [in] Scaling factor for A. Its data type is determined by descCompute. Host memory pointer.
A – [in] Pointer to A’s data in GPU-accessible memory. The data accessed through this pointer must not overlap with the elements written to D.
beta – [in] Scaling factor for C. Its data type is determined by descCompute. Host memory pointer.
C – [in] Pointer to C’s data in GPU-accessible memory.
D – [out] Pointer to D’s data in GPU-accessible memory.
workspace – [out] Scratchpad (device) memory of at least workspaceSize bytes.
workspaceSize – [in] Size of the workspace in bytes. Use hiptensorEstimateWorkspaceSize() to query the required size.
stream – [in] The stream in which all the computation is performed.
- Return values:
HIPTENSOR_STATUS_SUCCESS – The operation completed successfully.
Logging functions#
hiptensorLoggerSetCallback#
-
hiptensorStatus_t hiptensorLoggerSetCallback(hiptensorLoggerCallback_t callback)#
Registers a callback function to be invoked by logger calls.
- Parameters:
callback – [in] The callback function pointer to provide to the logger.
- Return values:
`HIPTENSOR_STATUS_SUCCESS` – if the operation completed successfully.
`HIPTENSOR_STATUS_INVALID_VALUE` – if the given callback is invalid.
hiptensorLoggerSetFile#
-
hiptensorStatus_t hiptensorLoggerSetFile(FILE *file)#
Registers a file output stream to redirect logging output.
Note
The file stream must be open and writable in text mode.
- Parameters:
file – [in] A file stream pointer to provide to the logger.
- Return values:
`HIPTENSOR_STATUS_SUCCESS` – if the operation completed successfully.
`HIPTENSOR_STATUS_IO_ERROR` – if the output file is not valid (defaults back to stdout).
hiptensorLoggerOpenFile#
-
hiptensorStatus_t hiptensorLoggerOpenFile(const char *logFile)#
Redirects log output to a user-specified file.
- Parameters:
logFile – [in] The file name (relative to the binary) or full path to which logger output is redirected.
- Return values:
`HIPTENSOR_STATUS_SUCCESS` – if the operation completed successfully.
`HIPTENSOR_STATUS_IO_ERROR` – if the output file is not valid (defaults back to stdout).
hiptensorLoggerSetLevel#
-
hiptensorStatus_t hiptensorLoggerSetLevel(hiptensorLogLevel_t level)#
Sets the user-specified logging level. Logs in other contexts will not be recorded.
- Parameters:
level – [in] The logging level to enforce.
- Return values:
`HIPTENSOR_STATUS_SUCCESS` – if the operation completed successfully.
`HIPTENSOR_STATUS_INVALID_VALUE` – if the given log level is invalid.
hiptensorLoggerSetMask#
-
hiptensorStatus_t hiptensorLoggerSetMask(int32_t mask)#
Sets the user-specified logging mask. A mask can be a binary OR combination of several log levels. Logs in other contexts will not be recorded.
- Parameters:
mask – [in] The logging mask to enforce.
- Return values:
`HIPTENSOR_STATUS_SUCCESS` – if the operation completed successfully.
`HIPTENSOR_STATUS_INVALID_VALUE` – if the given log mask is invalid.
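As an illustration of how a mask built from several levels behaves, the sketch below combines two levels with a bitwise OR and tests membership with a bitwise AND. The enumerators and their values (SKETCH_LOG_ERROR and so on) are stand-ins invented for this example; real code would use the enumerators of hiptensorLogLevel_t, whose values are not shown in this section. combineLevels, isLevelEnabled, and exampleMask are likewise hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in log levels for this sketch only. Each level occupies a distinct bit
// so that levels can be combined into a mask. The real enumerators and values
// are defined by hiptensorLogLevel_t.
enum SketchLogLevel : std::int32_t
{
    SKETCH_LOG_ERROR      = 1,
    SKETCH_LOG_PERF_TRACE = 2,
    SKETCH_LOG_API_TRACE  = 16
};

// Builds a mask that enables both levels.
constexpr std::int32_t combineLevels(SketchLogLevel a, SketchLogLevel b)
{
    return static_cast<std::int32_t>(a) | static_cast<std::int32_t>(b);
}

// Returns true when the given level is enabled in the mask.
constexpr bool isLevelEnabled(std::int32_t mask, SketchLogLevel level)
{
    return (mask & static_cast<std::int32_t>(level)) != 0;
}

// Usage example: enable error and API trace output, leave perf trace disabled.
void exampleMask()
{
    const std::int32_t mask = combineLevels(SKETCH_LOG_ERROR, SKETCH_LOG_API_TRACE);
    assert(isLevelEnabled(mask, SKETCH_LOG_ERROR));
    assert(isLevelEnabled(mask, SKETCH_LOG_API_TRACE));
    assert(!isLevelEnabled(mask, SKETCH_LOG_PERF_TRACE));
}
```

With the real enumerators, a mask built this way would be the value passed to hiptensorLoggerSetMask.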
hiptensorLoggerForceDisable#
-
hiptensorStatus_t hiptensorLoggerForceDisable()#
Disables logging.
- Return values:
`HIPTENSOR_STATUS_SUCCESS` – if the operation completed successfully.