Precision support#

Tensile supports a rich variety of data types for matrix multiplication operations, enabling optimized performance across different precision requirements. This topic outlines the supported data types and precision formats used in Tensile’s GEMM implementations.

Data types#

Tensile represents data types using character codes in configuration files. The following table provides a comprehensive overview of all supported data types:

| Character Code | HIP C++ Type | Bit Width | Description |
|----------------|--------------|-----------|-------------|
| D | double | 64-bit | Standard IEEE 754 double precision format with 11 exponent bits, 52 mantissa bits, and 1 sign bit. |
| S | float | 32-bit | Standard IEEE 754 single precision format with 8 exponent bits, 23 mantissa bits, and 1 sign bit. |
| H | half | 16-bit | IEEE 754-2008 half precision format with 5 exponent bits, 10 mantissa bits, and 1 sign bit. Trades reduced precision for lower memory bandwidth requirements. |
| B | bfloat16 | 16-bit | Brain floating-point format with 8 exponent bits, 7 mantissa bits, and 1 sign bit. Maintains the same dynamic range as float32 but with reduced precision, making it suitable for deep learning applications. |
| F8 | __hip_fp8_e4m3 / __hip_fp8_e4m3_fnuz | 8-bit | E4M3 float8 format with 4 exponent bits, 3 mantissa bits, and 1 sign bit. Designed for ultra-low precision operations while maintaining numerical stability in neural network workloads. |
| B8 | __hip_fp8_e5m2 / __hip_fp8_e5m2_fnuz | 8-bit | Brain float8 (E5M2) format with 5 exponent bits, 2 mantissa bits, and 1 sign bit. Provides greater dynamic range than F8 at the cost of reduced precision. |
| X | N/A | 32-bit | TensorFloat-style format with a custom bit distribution. Used for enhanced precision in computation patterns common in neural networks. |
| Z | hipDoubleComplex | 128-bit | Double precision complex format consisting of two 64-bit double precision values representing the real and imaginary components. |
| C | hipFloatComplex | 64-bit | Single precision complex format consisting of two 32-bit single precision values representing the real and imaginary components. |
| I | int32_t | 32-bit | Standard signed 32-bit integer. Often used for accumulation in integer operations. |
| I8 | int8_t | 8-bit | Standard signed 8-bit integer. Commonly used in quantized neural network operations. |
| 4xi8 | int32_t | 32-bit | Four 8-bit signed integers packed into a single 32-bit value, providing efficient memory access and higher computational throughput for 8-bit integer operations. Deprecated; use I8 instead. |
| F8B8 | __hip_fp8_e4m3 + __hip_fp8_e5m2 | 8-bit | Mixed precision format where Matrix A uses float8 (E4M3) and Matrix B uses bfloat8 (E5M2), balancing the precision needs of the two inputs to the matrix multiplication. |
| B8F8 | __hip_fp8_e5m2 + __hip_fp8_e4m3 | 8-bit | Mixed precision format where Matrix A uses bfloat8 (E5M2) and Matrix B uses float8 (E4M3). This is the inverse of F8B8, allowing flexible precision allocation. |

Supported GEMM configurations#

In Tensile’s GEMM implementations, data types are specified using the following terminology (see the example after this list):

  • Ti = The data type of input matrices (A/B)

  • To = The data type of output matrices (C/D)

  • Tc = The data type used for computation (alpha/beta)
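
For example, a problem definition expresses these terms through the dataType (Ti), destDataType (To), and computeDataType (Tc) keys used in the example configurations later in this topic. The sketch below is illustrative only, using the HSS configuration with arbitrary sizes:

- # GEMM_EX (HSS): Ti = H, To = S, Tc = S
  - {M: 1024, N: 1024, K: 1024, transposeA: false, transposeB: true, dataType: H, destDataType: S, computeDataType: S}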

Standard precision configurations#

The following operations use the same data type for input, output, and computation:

| GEMM Type | Input (Ti) | Output (To) | Computation (Tc) | Description |
|-----------|------------|-------------|------------------|-------------|
| DGEMM | D | D | D | Double precision GEMM |
| SGEMM | S | S | S | Single precision GEMM |
| ZGEMM | Z | Z | Z | Double precision complex GEMM |
| CGEMM | C | C | C | Single precision complex GEMM |
| HGEMM | H | H | H | Half precision GEMM |
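
As an illustration (not taken from a shipped configuration), a DGEMM problem can be sketched in the same style as the SGEMM example later in this topic, which specifies only dataType when the input, output, and computation types match. The sizes below are arbitrary:

- # DGEMM
  - {M: 2048, N: 2048, K: 2048, transposeA: false, transposeB: true, dataType: D}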

High-Precision Accumulation (HPA) configurations#

The following operations perform the computation at a higher precision than the input type, the output type, or both:

| GEMM Type | Input (Ti) | Output (To) | Computation (Tc) | Description |
|-----------|------------|-------------|------------------|-------------|
| GEMM_EX (HHS) | H | H | S | Half precision input and output with single precision computation |
| GEMM_EX (HSS) | H | S | S | Half precision input with single precision output and computation |
| GEMM_EX (BBS) | B | B | S | BFloat16 input and output with single precision computation |
| GEMM_EX (BSS) | B | S | S | BFloat16 input with single precision output and computation |
| GEMM_EX (I8II) | I8 | I | I | 8-bit integer input with 32-bit integer output and computation |
| GEMM_EX (4xi8II) | 4xi8 | I | I | Packed 8-bit integer input with 32-bit integer output and computation |
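
The example configurations later in this topic cover the HHS, BSS, and I8II cases; a BBS problem would follow the same key pattern. The sketch below is illustrative only, with arbitrary sizes:

- # GEMM_EX (BBS)
  - {M: 4096, N: 4096, K: 4096, transposeA: false, transposeB: true, dataType: B, destDataType: B, computeDataType: S}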

8-bit floating-point configurations#

Tensile supports the following combinations with newer 8-bit floating-point formats:

| GEMM Type | Input (Ti) | Output (To) | Computation (Tc) | Description |
|-----------|------------|-------------|------------------|-------------|
| GEMM_EX | F8 | S | S | Float8 input with single precision output and computation |
| GEMM_EX | B8 | S | S | BFloat8 input with single precision output and computation |
| GEMM_EX | F8 | F8 | S | Float8 input and output with single precision computation |
| GEMM_EX | B8 | B8 | S | BFloat8 input and output with single precision computation |
| GEMM_EX | F8 | H | S | Float8 input with half precision output and single precision computation |
| GEMM_EX | B8 | H | S | BFloat8 input with half precision output and single precision computation |
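
For illustration, assuming the same keys as the mixed F8B8 example later in this topic, a float8-input problem with half precision output could be written as follows (arbitrary sizes):

- # GEMM_EX (F8 input, H output, S computation)
  - {M: 4096, N: 4096, K: 4096, transposeA: false, transposeB: true, dataType: F8, destDataType: H, computeDataType: S}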

Mixed input type configurations#

Tensile supports GEMM operations in which matrices A and B use different 8-bit floating-point input types, as listed below:

| GEMM Type | Input A/B (Ti) | Output (To) | Computation (Tc) | Description |
|-----------|----------------|-------------|------------------|-------------|
| GEMM_EX | F8B8 | S | S | Matrix A is float8, Matrix B is bfloat8, with single precision output |
| GEMM_EX | B8F8 | S | S | Matrix A is bfloat8, Matrix B is float8, with single precision output |
| GEMM_EX | F8B8 | B8 | S | Matrix A is float8, Matrix B is bfloat8, with bfloat8 output |
| GEMM_EX | B8F8 | B8 | S | Matrix A is bfloat8, Matrix B is float8, with bfloat8 output |
| GEMM_EX | F8B8 | H | S | Matrix A is float8, Matrix B is bfloat8, with half precision output |
| GEMM_EX | B8F8 | H | S | Matrix A is bfloat8, Matrix B is float8, with half precision output |
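
The example configurations later in this topic include the F8B8 case with half precision output. A B8F8 problem with bfloat8 output would presumably use the same keys; the following is an illustrative sketch with arbitrary sizes:

- # GEMM_EX (B8F8 input, B8 output, S computation)
  - {M: 5504, N: 5504, K: 5504, transposeA: false, transposeB: true, dataType: B8F8, destDataType: B8, computeDataType: S}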

Data types in configuration files#

In Tensile’s configuration files, data types are specified as part of the problem definition, as shown in the following examples:

Example configurations#

Standard single-precision GEMM

- # SGEMM
  - {M: 5504, N: 5504, K: 5504, transposeA: false, transposeB: true, dataType: S}

Half-precision with single-precision accumulation

- # GEMM_EX (HHS)
  - {M: 5504, N: 5504, K: 5504, transposeA: false, transposeB: true, dataType: H, destDataType: H, computeDataType: S}

BFloat16 input with float32 output

- # GEMM_EX (BSS)
  - {M: 4096, N: 4096, K: 4096, transposeA: false, transposeB: true, dataType: B, destDataType: S, computeDataType: S}

8-bit integer operations

- # GEMM_EX (I8II)
  - {M: 4096, N: 4096, K: 4096, transposeA: false, transposeB: true, dataType: I8, destDataType: I, computeDataType: I}

Mixed F8/B8 input with half precision output

- # GEMM_EX
  - {M: 5504, N: 5504, K: 5504, transposeA: false, transposeB: true, dataType: F8B8, destDataType: H, computeDataType: S}

Library logic file naming#

Tensile uses specific naming conventions for library logic files based on the precision types:

  • For standard GEMM types (non-HPA): *_TiB*.yaml

  • For HPA types: *_TiToTc_BH*.yaml
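
For example, substituting into these patterns, an SGEMM logic file (Ti = S) matches *_SB*.yaml, while an HHS logic file (Ti = H, To = H, Tc = S) matches *_HHS_BH*.yaml.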