MI200 matrix fused multiply-add (MFMA) behavior specifics
The MI200 MFMA_F16, MFMA_BF16, and MFMA_BF16_1K instructions flush subnormal input/output data to
zero. This behavior might affect the convolution operation in certain workloads due to the limited
exponent range of the half-precision floating-point datatypes.
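To make the effect concrete, the following is a minimal software sketch of flush-to-zero for half-precision values. It is only an emulation using Python's standard library, not a description of how the hardware implements the behavior; the helper names `to_fp16` and `flush_subnormal_fp16` and the sample values are hypothetical and chosen for illustration.

```python
import math
import struct

# Smallest positive normal FP16 value; anything smaller in magnitude
# (and non-zero) is subnormal in half precision.
FP16_MIN_NORMAL = 2.0 ** -14


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def flush_subnormal_fp16(x: float) -> float:
    """Emulate flush-to-zero: subnormal FP16 values become (signed) zero."""
    h = to_fp16(x)
    if h != 0.0 and abs(h) < FP16_MIN_NORMAL:
        return math.copysign(0.0, h)
    return h


small_value = 1.0e-5   # below 2**-14, so it is subnormal in FP16
normal_value = 1.0e-3  # inside the normal FP16 range

print(to_fp16(small_value))               # non-zero when subnormals are kept
print(flush_subnormal_fp16(small_value))  # zero once subnormals are flushed
print(flush_subnormal_fp16(normal_value)) # normal values pass through unchanged
```

Small magnitudes of this kind (for example, tiny gradient values) are where a workload could observe a difference when subnormals are flushed.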
MIOpen offers an alternate implementation for the half-precision datatype that uses conversion instructions to take advantage of the BFloat16 datatype’s larger exponent range, albeit with reduced accuracy. The following points apply to this alternate implementation:
It’s disabled by default in the forward convolution operations.
It’s enabled by default in the backward data and backward weights convolution operations.
You can override the default MIOpen behaviors by using the miopenSetConvolutionAttribute API call: pass the convolution descriptor for the appropriate convolution operation and the MIOPEN_CONVOLUTION_ATTRIB_FP16_ALT_IMPL convolution attribute (with a non-zero value) to engage the alternate implementation.
You can also override the behavior using the MIOPEN_DEBUG_CONVOLUTION_ATTRIB_FP16_ALT_IMPL environment variable. When set to 1, MIOPEN_DEBUG_CONVOLUTION_ATTRIB_FP16_ALT_IMPL engages the alternate implementation; when set to 0, the alternate implementation is disabled. Keep in mind that the environment variable affects the convolution operation in all directions.
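As a minimal sketch of the environment-variable route, the snippet below sets the variable from Python before any MIOpen-backed work starts. The helper `set_fp16_alt_impl` is a hypothetical name introduced here. Setting the variable before the framework or library is loaded is an assumption made to stay on the safe side, since the text above does not say when MIOpen reads it.

```python
import os

ENV_VAR = "MIOPEN_DEBUG_CONVOLUTION_ATTRIB_FP16_ALT_IMPL"


def set_fp16_alt_impl(enabled: bool) -> str:
    """Set the MIOpen override: "1" engages the alternate implementation, "0" disables it."""
    value = "1" if enabled else "0"
    os.environ[ENV_VAR] = value
    return value


# Engage the alternate implementation for convolutions in all directions.
set_fp16_alt_impl(True)

# Assumption: import or initialize the MIOpen-backed framework only after
# this point, so the library sees the value when it starts up.
print(f"{ENV_VAR}={os.environ[ENV_VAR]}")
```

Because the variable applies to every convolution direction, it is a coarser control than the per-descriptor miopenSetConvolutionAttribute call.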