Deep learning compilation with MIGraphX

Deep learning compilation with MIGraphX#

2025-10-28

2 min read time

Applies to Linux

The MIGraphX deep learning (DL) compiler improves inference by analyzing a model’s compute graph and applying program transformations. After optimization, the compiler lowers graph operations to device kernels (from libraries or via code generation) for efficient execution.

A common transformation is kernel fusion; where compatible operations are merged into a single kernel launch. Fusion reduces launch overhead and avoids extra reads or writes between host and device, which typically improves latency and throughput. By applying graph-level optimizations and choosing or generating efficient device kernels, MIGraphX delivers high-performance over uncompiled models and less optimized compiled solutions.

An overview of the compilation process for MIGraphX is shown below in compilation-label. One type of optimization that MIGraphX performs are kernel fusions such as the Attention fusion seen in attention-label.

Simplified overview of the MIGraphX compilation process Attention operations fused into a single kernel

What MIGraphX provides#

  • End-to-end: compilation and execution of DL models on AMD GPUs

  • C++ implementation: with Python and C++ APIs

  • Model inputs:
  • Hardware targets: AMD Navi (consumer) and MI (server) GPUs

  • Supported data types: FP16, BF16, OCP FP8, INT8, INT4