ROCm-LLMExt 26.04 release notes#
This is the fifth release of the AMD ROCm LLMExt toolkit (ROCm-LLMExt), an open-source software toolkit built on the ROCm platform for large language model (LLM) extensions, integrations, and performance enablement on AMD GPUs. The toolkit brings together training, post-training, inference, and orchestration components to make modern LLM stacks practical and reproducible on AMD hardware.
Release highlights#
Note
ROCm-LLMExt 26.04 adds two agentic libraries, ComfyUI and ROCm-RAG, to the toolkit. The other components (FlashInfer, llama.cpp, Ray, Triton Inference Server, and verl) remain unchanged.
This release introduces the following component with support for ROCm 7.2.0:
ComfyUI is an open-source, node-based interface for building and running image generation workflows with diffusion models such as Stable Diffusion.
This release introduces the following component with support for ROCm 6.4.1:
ROCm-RAG implements Retrieval-Augmented Generation (RAG), a machine learning architecture that enhances LLMs by combining generation with information retrieval from external sources.
System requirements#
ROCm version requirements vary by ROCm-LLMExt component. Follow the installation instructions for each component, which list the exact ROCm dependencies, or refer to the Compatibility matrix to verify the supported ROCm versions.
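As a rough illustration of checking an installed ROCm version against a component's requirement, the sketch below hard-codes only the two requirements stated in these release notes (ComfyUI on ROCm 7.2.0, ROCm-RAG on ROCm 6.4.1). The names `REQUIRED_ROCM`, `parse_version`, and `is_supported` are hypothetical helpers, not part of ROCm-LLMExt. Treating support as an exact version match is an assumption, so the Compatibility matrix remains the authoritative source.

```python
# Hypothetical helper: ROCm requirements copied from these release notes.
# Components whose requirement is not stated here are intentionally omitted.
REQUIRED_ROCM = {
    "ComfyUI": "7.2.0",
    "ROCm-RAG": "6.4.1",
}


def parse_version(text):
    """Turn a dotted version string such as '7.2.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_supported(component, installed_rocm):
    """Return True if the installed ROCm version equals the listed one.

    Assumption: support means an exact version match. The real
    compatibility matrix may allow other versions.
    """
    if component not in REQUIRED_ROCM:
        raise KeyError(f"No ROCm requirement recorded for {component!r}")
    return parse_version(installed_rocm) == parse_version(REQUIRED_ROCM[component])
```

For example, `is_supported("ComfyUI", "7.2.0")` evaluates to `True`, while `is_supported("ROCm-RAG", "7.2.0")` evaluates to `False` under the exact-match assumption.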
ROCm-LLMExt components#
The following table lists ROCm-LLMExt component versions for the 26.04 release. Click to go to the component’s source on GitHub.
| Name | Version | Source |
|---|---|---|
| verl | 0.6.0 | |
| Ray | 2.51.1 | |
| llama.cpp | b6652 | |
| FlashInfer | 0.5.3 | |
| Triton Inference Server | 25.12 | |
| ComfyUI | 0.18.2 | |
| ROCm-RAG | 1.0.0 | |
Detailed component changelogs#
ComfyUI 0.18.2#
ComfyUI is a newly supported component of the ROCm-LLMExt toolkit. It is a graphical, node-based interface that lets you create images, videos, and audio with minimal coding, and build diffusion workflows by dragging and dropping nodes. This release is supported with ROCm 7.2.0 on AMD Instinct MI355X GPUs.
ROCm-RAG 1.0.0#
ROCm-RAG is a newly supported component of the ROCm-LLMExt toolkit. It lets you build and deploy end-to-end AI pipelines with Retrieval-Augmented Generation (RAG), a machine learning architecture that enhances LLMs by combining generation with information retrieval from external sources. This release is supported with ROCm 6.4.1 on AMD Instinct MI300X GPUs.
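The retrieve-then-generate idea behind RAG can be sketched in a few lines. The toy example below is not the ROCm-RAG API: every function name is hypothetical, the retriever is a simple bag-of-words cosine similarity rather than a real embedding model or vector store, and the generation step is reduced to assembling the prompt that would be passed to an LLM.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[token] * b[token] for token in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Prepend retrieved context to the question (input to a generator)."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Toy corpus drawn from the descriptions in these release notes.
DOCS = [
    "ComfyUI runs image generation workflows with diffusion models.",
    "ROCm-RAG combines generation with information retrieval.",
]
PROMPT = build_prompt("Which component runs diffusion workflows?", DOCS)
```

A production pipeline would typically replace the bag-of-words retriever with learned embeddings and pass the assembled prompt to an LLM served on the GPU. This sketch only shows how retrieval output feeds the generation input.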