Precision support
Tensile supports a range of data types for matrix multiplication, so GEMM operations can be matched to different precision requirements. This topic outlines the supported data types and precision formats used in Tensile’s GEMM implementations.
Data types
Tensile represents data types using character codes in configuration files. The following table provides a comprehensive overview of all supported data types:
Character Code | HIP C++ Type | Bit Width | Description |
---|---|---|---|
D |  | 64-bit | Standard IEEE 754 double precision format with 11 exponent bits, 52 mantissa bits, and 1 sign bit. |
S |  | 32-bit | Standard IEEE 754 single precision format with 8 exponent bits, 23 mantissa bits, and 1 sign bit. |
H |  | 16-bit | IEEE 754-2008 half precision format with 5 exponent bits, 10 mantissa bits, and 1 sign bit. Provides reduced precision but lower memory bandwidth requirements. |
B |  | 16-bit | Brain floating-point format with 8 exponent bits, 7 mantissa bits, and 1 sign bit. Maintains the same dynamic range as float32 but with reduced precision, making it suitable for deep learning applications. |
F8 |  | 8-bit | E4M3 float8 format with 4 exponent bits, 3 mantissa bits, and 1 sign bit. Designed for ultra-low precision operations while maintaining numerical stability in neural network operations. |
B8 |  | 8-bit | Brain float8 (E5M2) format with 5 exponent bits, 2 mantissa bits, and 1 sign bit. Provides greater dynamic range than F8 at the cost of reduced precision. |
X | N/A | 32-bit | TensorFloat-equivalent format with a custom bit distribution. Used for enhanced precision in specific computation patterns that are common in neural networks. |
Z |  | 128-bit | Double precision complex number format consisting of two 64-bit double precision values representing real and imaginary components. |
C |  | 64-bit | Single precision complex number format consisting of two 32-bit single precision values representing real and imaginary components. |
I |  | 32-bit | Standard signed 32-bit integer. Often used for accumulation in integer operations. |
I8 |  | 8-bit | Standard signed 8-bit integer. Commonly used in quantized neural network operations. |
4xi8 |  | 32-bit | Four 8-bit signed integers packed into a single 32-bit value. This format provides efficient memory access and higher computational throughput in 8-bit integer operations. This format is deprecated; use I8 instead. |
F8B8 |  | 8-bit | Mixed precision format where Matrix A uses float8 (E4M3) and Matrix B uses bfloat8 (E5M2). This combination balances precision needs for different inputs in matrix multiplication. |
B8F8 |  | 8-bit | Mixed precision format where Matrix A uses bfloat8 (E5M2) and Matrix B uses float8 (E4M3). This is the inverse of F8B8, allowing flexible precision allocation. |
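The precision trade-offs in the table follow directly from the mantissa widths. As a rough illustration, the sketch below copies the exponent and mantissa bit counts for the floating-point codes from the table and derives the machine epsilon (the gap between 1.0 and the next representable value, 2^-mantissa_bits). The names `FLOAT_LAYOUTS`, `machine_epsilon`, and `layout_is_consistent` are hypothetical helpers for this example and are not part of Tensile.

```python
# Bit layouts copied from the data type table: (exponent bits, mantissa bits, total bits).
# Only the floating-point codes with a stated layout are included.
FLOAT_LAYOUTS = {
    "D": (11, 52, 64),
    "S": (8, 23, 32),
    "H": (5, 10, 16),
    "B": (8, 7, 16),
    "F8": (4, 3, 8),
    "B8": (5, 2, 8),
}


def machine_epsilon(code: str) -> float:
    """Gap between 1.0 and the next representable value: 2**-mantissa_bits."""
    _, mantissa_bits, _ = FLOAT_LAYOUTS[code]
    return 2.0 ** -mantissa_bits


def layout_is_consistent(code: str) -> bool:
    """Check that 1 sign bit + exponent bits + mantissa bits fills the stated width."""
    exponent_bits, mantissa_bits, total_bits = FLOAT_LAYOUTS[code]
    return 1 + exponent_bits + mantissa_bits == total_bits


for code in FLOAT_LAYOUTS:
    print(code, machine_epsilon(code), layout_is_consistent(code))
```

Comparing H and B shows the trade-off described in the table: both are 16 bits wide, but B spends three more bits on the exponent and therefore has a coarser epsilon.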
Supported GEMM configurations
In Tensile’s GEMM implementations, data types are specified using the following terminology:
- Ti = The data type of input matrices (A/B)
- To = The data type of output matrices (C/D)
- Tc = The data type used for computation (alpha/beta)
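One way to keep the three roles apart in tooling is a small record type. The sketch below is only an illustration: `GemmPrecision` and its members are hypothetical names, not a Tensile API. It assumes that the short labels used in the tables of this topic (for example HHS or I8II) are the Ti, To, and Tc codes concatenated in that order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GemmPrecision:
    """Hypothetical container for the three type roles of a GEMM."""

    ti: str  # character code of the input matrices (A/B)
    to: str  # character code of the output matrices (C/D)
    tc: str  # character code used for computation (alpha/beta)

    @property
    def is_uniform(self) -> bool:
        """True when input, output, and computation share one data type."""
        return self.ti == self.to == self.tc

    @property
    def label(self) -> str:
        """Single code for uniform configurations, otherwise Ti+To+Tc."""
        if self.is_uniform:
            return self.ti
        return f"{self.ti}{self.to}{self.tc}"


print(GemmPrecision("S", "S", "S").label)
print(GemmPrecision("H", "H", "S").label)
print(GemmPrecision("I8", "I", "I").label)
```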
Standard precision configurations
The following operations use the same data type for input, output, and computation:
GEMM Type | Input (Ti) | Output (To) | Computation (Tc) | Description |
---|---|---|---|---|
DGEMM | D | D | D | Double precision GEMM |
SGEMM | S | S | S | Single precision GEMM |
ZGEMM | Z | Z | Z | Double precision complex GEMM |
CGEMM | C | C | C | Single precision complex GEMM |
HGEMM | H | H | H | Half precision GEMM |
High-Precision Accumulation (HPA) configurations
The following operations use higher precision for computation than for inputs, outputs, or both:
GEMM Type | Input (Ti) | Output (To) | Computation (Tc) | Description |
---|---|---|---|---|
GEMM_EX (HHS) | H | H | S | Half precision with single precision computation |
GEMM_EX (HSS) | H | S | S | Half precision input with single precision output and computation |
GEMM_EX (BBS) | B | B | S | BFloat16 with single precision computation |
GEMM_EX (BSS) | B | S | S | BFloat16 input with single precision output and computation |
GEMM_EX (I8II) | I8 | I | I | 8-bit integer input with 32-bit integer output and computation |
GEMM_EX (4xi8II) | 4xi8 | I | I | Packed 8-bit integer input with 32-bit integer output and computation |
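To see why a wider computation type matters, the following sketch emulates a single dot product (one output element of a GEMM) on the CPU. It is a numerical illustration only and says nothing about how Tensile kernels are implemented. The helpers `round_to` and `dot` are hypothetical; they use Python's `struct` format codes `e` (IEEE half) and `f` (IEEE single) to round after every multiply-add.

```python
import struct


def round_to(value: float, fmt: str) -> float:
    """Round a Python float to IEEE half ('e') or single ('f') precision."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def dot(a, b, compute_fmt: str, output_fmt: str) -> float:
    """Dot product that rounds each multiply-add to compute_fmt,
    then rounds the final result to output_fmt."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = round_to(acc + x * y, compute_fmt)
    return round_to(acc, output_fmt)


K = 4096
a = [round_to(0.1, "e")] * K  # half precision inputs (Ti = H)
b = [1.0] * K

reference = sum(x * y for x, y in zip(a, b))   # double precision reference
hhh = dot(a, b, compute_fmt="e", output_fmt="e")  # Ti = To = Tc = H
hhs = dot(a, b, compute_fmt="f", output_fmt="e")  # Ti = To = H, Tc = S

print("reference:", reference)
print("HHH:", hhh, "error:", abs(hhh - reference))
print("HHS:", hhs, "error:", abs(hhs - reference))
```

With half precision accumulation the running sum stops growing once each addend falls below half the spacing between neighboring half values, so the HHH result ends up far from the reference. Accumulating in single precision and rounding only once at the end keeps the HHS result close to it.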
8-bit floating-point configurations
Tensile supports the following combinations with newer 8-bit floating-point formats:
GEMM Type | Input (Ti) | Output (To) | Computation (Tc) | Description |
---|---|---|---|---|
GEMM_EX | F8 | S | S | Float8 input with single precision output and computation |
GEMM_EX | B8 | S | S | BFloat8 input with single precision output and computation |
GEMM_EX | F8 | F8 | S | Float8 input and output with single precision computation |
GEMM_EX | B8 | B8 | S | BFloat8 input and output with single precision computation |
GEMM_EX | F8 | H | S | Float8 input with half precision output and single precision computation |
GEMM_EX | B8 | H | S | BFloat8 input with half precision output and single precision computation |
Mixed input type configurations
Tensile supports GEMM operations in which matrices A and B use different 8-bit floating-point input types:
GEMM Type | Input A/B (Ti) | Output (To) | Computation (Tc) | Description |
---|---|---|---|---|
GEMM_EX | F8B8 | S | S | Matrix A is float8, Matrix B is bfloat8, with single precision output |
GEMM_EX | B8F8 | S | S | Matrix A is bfloat8, Matrix B is float8, with single precision output |
GEMM_EX | F8B8 | B8 | S | Matrix A is float8, Matrix B is bfloat8, with bfloat8 output |
GEMM_EX | B8F8 | B8 | S | Matrix A is bfloat8, Matrix B is float8, with bfloat8 output |
GEMM_EX | F8B8 | H | S | Matrix A is float8, Matrix B is bfloat8, with half precision output |
GEMM_EX | B8F8 | H | S | Matrix A is bfloat8, Matrix B is float8, with half precision output |
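Data types in configuration files
In Tensile’s configuration files, data types are specified as part of the problem definition using the dataType (Ti), destDataType (To), and computeDataType (Tc) keys. The SGEMM example below specifies only dataType; the other examples set all three keys.
Example configurations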
Standard single-precision GEMM
- # SGEMM
- {M: 5504, N: 5504, K: 5504, transposeA: false, transposeB: true, dataType: S}
Half-precision with single-precision accumulation
- # GEMM_EX (HHS)
- {M: 5504, N: 5504, K: 5504, transposeA: false, transposeB: true, dataType: H, destDataType: H, computeDataType: S}
BFloat16 input with float32 output
- # GEMM_EX (BSS)
- {M: 4096, N: 4096, K: 4096, transposeA: false, transposeB: true, dataType: B, destDataType: S, computeDataType: S}
8-bit integer operations
- # GEMM_EX (I8II)
- {M: 4096, N: 4096, K: 4096, transposeA: false, transposeB: true, dataType: I8, destDataType: I, computeDataType: I}
Mixed F8/B8 input with half precision output
- # GEMM_EX
- {M: 5504, N: 5504, K: 5504, transposeA: false, transposeB: true, dataType: F8B8, destDataType: H, computeDataType: S}
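When many problem sizes need to be generated, entries in the format shown above can be produced programmatically. The sketch below is a hypothetical helper (`problem_entry` is not part of Tensile) that formats one entry. It assumes, based only on the SGEMM example, that destDataType and computeDataType can be left out when all three types are the same.

```python
def problem_entry(m, n, k, ti, to=None, tc=None,
                  transpose_a=False, transpose_b=True):
    """Format one problem entry in the style of the examples above.

    ti, to, tc are character codes such as "S", "H", or "I8".
    to and tc fall back to ti when not given (an assumption of this sketch).
    """
    to = to or ti
    tc = tc or ti
    fields = [
        f"M: {m}",
        f"N: {n}",
        f"K: {k}",
        f"transposeA: {str(transpose_a).lower()}",
        f"transposeB: {str(transpose_b).lower()}",
        f"dataType: {ti}",
    ]
    # Only spell out the output and compute types when they differ from Ti.
    if not (ti == to == tc):
        fields.append(f"destDataType: {to}")
        fields.append(f"computeDataType: {tc}")
    return "- {" + ", ".join(fields) + "}"


print(problem_entry(5504, 5504, 5504, "S"))
print(problem_entry(5504, 5504, 5504, "H", "H", "S"))
print(problem_entry(4096, 4096, 4096, "I8", "I", "I"))
```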
Library logic file naming
Tensile uses specific naming conventions for library logic files based on the precision types:
For standard GEMM types (non-HPA):
*_TiB*.yaml
For HPA types:
*_TiToTc_BH*.yaml
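The two patterns can be turned into concrete glob expressions by substituting the character codes. The sketch below assumes that Ti, To, and Tc in the patterns stand for the character codes from the data type table; `logic_file_pattern` is a hypothetical helper, and the file names in the usage lines are made up for illustration.

```python
from fnmatch import fnmatchcase


def logic_file_pattern(ti, to=None, tc=None):
    """Build a glob pattern for library logic files of one precision.

    Uses *_TiB*.yaml when all three types match (non-HPA) and
    *_TiToTc_BH*.yaml otherwise (HPA).
    """
    to = to or ti
    tc = tc or ti
    if ti == to == tc:
        return f"*_{ti}B*.yaml"
    return f"*_{ti}{to}{tc}_BH*.yaml"


sgemm = logic_file_pattern("S")
hhs = logic_file_pattern("H", "H", "S")
print(sgemm, hhs)

# Illustrative (made-up) file names:
print(fnmatchcase("example_arch_SB.yaml", sgemm))
print(fnmatchcase("example_arch_HHS_BH.yaml", hhs))
```

`fnmatchcase` is used instead of `fnmatch` so that matching stays case-sensitive on every platform, which matters because the type codes are uppercase letters.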