MI200 matrix fused multiply-add (MFMA) behavior specifics
On the MI200, the MFMA_F16, MFMA_BF16, and MFMA_BF16_1K instructions flush subnormal input and output
data to zero. This behavior might affect the convolution operation in certain workloads due to the limited
exponent range of the half-precision floating-point data types.
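To make the exponent-range limitation concrete, the following C sketch models which values are affected for the IEEE half-precision (FP16) format, whose smallest positive normal value is 2^-14 (about 6.10e-5). It is a simplified illustration, not the hardware's actual logic: the helper names are hypothetical, and the model only checks magnitude without rounding the value to FP16 precision.

```c
#include <stdbool.h>

/* Smallest positive normal FP16 value: 2^-14 (about 6.10e-5). */
#define FP16_MIN_NORMAL (1.0f / 16384.0f)

/* Hypothetical helper: true if x is nonzero but smaller in magnitude
 * than the smallest normal FP16 value, so FP16 could represent it
 * only as a subnormal (or not at all). */
bool is_below_fp16_normal_range(float x)
{
    float mag = x < 0.0f ? -x : x;
    return mag > 0.0f && mag < FP16_MIN_NORMAL;
}

/* Simplified model of flush-to-zero: values below the normal FP16
 * range become zero, everything else passes through unchanged.
 * Assumption: rounding to FP16 precision is deliberately ignored. */
float flush_fp16_subnormal(float x)
{
    return is_below_fp16_normal_range(x) ? 0.0f : x;
}
```

Under this model, a small gradient such as 1.0e-5 is flushed to zero, while 1.0e-3 is unaffected.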
MIOpen offers an alternate implementation for the half-precision data type that uses conversion instructions
to take advantage of the larger exponent range of the BFloat16 data type, at the cost of reduced accuracy. The following
caveats apply to this alternate implementation:
- It’s disabled by default in the forward convolution operations. 
- It’s enabled by default in the backward data and backward weights convolution operations. 
- You can override the default MIOpen behavior by using the miopenSetConvolutionAttribute API call. To use the alternate implementation, pass the convolution descriptor for the appropriate convolution operation and the MIOPEN_CONVOLUTION_ATTRIB_FP16_ALT_IMPL convolution attribute (with a non-zero value).
- You can also override the behavior using the MIOPEN_DEBUG_CONVOLUTION_ATTRIB_FP16_ALT_IMPL environment variable. When set to 1, MIOPEN_DEBUG_CONVOLUTION_ATTRIB_FP16_ALT_IMPL engages the alternate implementation. When set to 0, it’s disabled. This environment variable impacts the convolution operation in all directions.
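As a minimal sketch of the environment-variable override, the following C helper sets the variable from inside a host program on a POSIX system. The helper name set_fp16_alt_impl_env is hypothetical, and the sketch assumes the variable must be set before the first MIOpen convolution call for it to take effect. The per-descriptor route through miopenSetConvolutionAttribute is not shown here because it requires the MIOpen headers and library.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>

/* Environment variable name as given in the text above. */
#define FP16_ALT_IMPL_ENV "MIOPEN_DEBUG_CONVOLUTION_ATTRIB_FP16_ALT_IMPL"

/* Hypothetical helper: request (enable != 0) or disable (enable == 0)
 * the alternate FP16 implementation for all convolution directions.
 * Assumption: call this before any MIOpen convolution is set up.
 * Returns 0 on success and -1 on failure, as reported by setenv(). */
int set_fp16_alt_impl_env(int enable)
{
    /* The final argument (1) overwrites any value already set. */
    return setenv(FP16_ALT_IMPL_ENV, enable ? "1" : "0", 1);
}
```

Because the environment variable applies to every direction, use the attribute on a specific convolution descriptor instead when only one operation should change behavior.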