Install Triton for ROCm

OpenAI has developed Triton, a GPU-focused programming language and compiler that works seamlessly with AMD GPUs. The goal of Triton is to enable AI engineers and scientists to write high-performance GPU code with minimal expertise.

Triton kernels are performant because of their blocked program representation, which allows them to be compiled into highly optimized binary code. Triton also uses Python for kernel development, making it both familiar and accessible.

Kernels are compiled by applying the @triton.jit Python decorator to the kernel function.
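For illustration, the decorator can be applied as in the following minimal vector-add sketch. This is not the GELU kernel discussed later; `add_kernel` and `BLOCK_SIZE` are example names chosen here, and running it requires a GPU with the libraries installed as described below.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one block of BLOCK_SIZE elements.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard against out-of-bounds accesses
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

# Launch one program per block of elements.
# On AMD GPUs, the "cuda" device string maps to the ROCm backend.
x = torch.rand(4096, device="cuda")
y = torch.rand(4096, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
```

The decorator compiles the function the first time it is launched; the blocked indexing (`pid * BLOCK_SIZE + tl.arange(...)`) is the program representation that Triton optimizes.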

Prerequisites

  • A compatible AMD GPU

  • Linux with ROCm 5.7 or later installed

See Compatibility matrices for support information.

Install libraries

If ROCm 6.0 and the latest version of PyTorch are not already installed, first install the required libraries. If you encounter issues running any of the commands, we recommend updating to the nightly wheels. This also installs the version of Triton that is compatible with PyTorch for ROCm.

  1. Enter the following command to install the libraries.

    pip install matplotlib pandas -q
    pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.0/ -q
    
  2. Enter the following command to import the libraries.

    import torch
    import triton
    import triton.language as tl
    

Now you can develop a Triton kernel that approximates the GELU (Gaussian Error Linear Unit) activation function using tanh.
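Before writing the kernel itself, it helps to see the math it will implement. The sketch below, in plain Python, compares the tanh-based approximation against the exact GELU defined via the Gaussian CDF; the function names `gelu_tanh` and `gelu_exact` are chosen here for illustration.

```python
import math

def gelu_tanh(x: float) -> float:
    # tanh approximation of GELU:
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x**3)))
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))

def gelu_exact(x: float) -> float:
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF,
    # expressed via the error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))
```

The two agree closely over typical activation ranges, which is why the tanh form is a common kernel target: it avoids `erf` while staying within a small absolute error.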

For more information on how to develop a kernel for GELU and benchmark its performance with its PyTorch analogues, see Developing Triton Kernels on AMD GPUs.