AMD ROCm LLMExt documentation
2026-05-04
AMD ROCm LLMExt (ROCm-LLMExt) is an open-source software toolkit built on the ROCm platform for large language model (LLM) extensions, integrations, and performance enablement on AMD GPUs. The toolkit brings together training, post-training, inference, and orchestration components to make modern LLM stacks practical and reproducible on AMD hardware.
| LLM Task | Features |
|---|---|
| Training | |
| Post-training and alignment | |
| Inference and serving | |
| Distributed execution | |
The ROCm-LLMExt source code is hosted on GitHub at ROCm/ROCm-LLMExt.
Note
ROCm-LLMExt 26.04 adds two agentic libraries, ComfyUI and ROCm-RAG, to the toolkit; the other components (FlashInfer, llama.cpp, Ray, Triton Inference Server, and verl) are unchanged.
ROCm-LLMExt documentation is organized into the following categories:
To contribute to the documentation, see Contributing to ROCm.
You can find licensing information on the Licensing page.