ROCm support for SPIR-V (beta)
ROCm 6.4 introduces beta support for generating portable, AMD GPU target-agnostic SPIR-V from HIP source by selecting the amdgcnspirv offload architecture. For example:
clang++ -x hip --offload-arch=amdgcnspirv main.cpp -o main
# or
clang++ -target spirv64-amd-amdhsa -x hip main.cpp -o main
The amdgcnspirv offload architecture represents AMDGCN SPIR-V, which enables the following AMD-specific features on top of the baseline SPIR-V 1.6 capabilities:
AMDGCN inline ASM is supported
AMDGCN target-specific builtins are supported
The feature set matches the union of AMDGCN targets’ features
The LLVM documentation provides additional details.
Abstract target versus concrete gfx targets
The amdgcnspirv target is abstract. It is not tied to a specific GPU, but can stand in for any AMD GPU. A consequence of the abstract nature of the target is that some information only becomes available at run time, when SPIR-V gets lowered to native code for a concrete GPU:
The concrete GPU architecture is not established at compile time:
The __<ArchName>__, __<GFXN>__, __amdgcn_processor__ and __amdgcn_target_id__ macros are not defined at compilation
The physical wavefront size is not available at compile time:
The warpSize constant value is not constexpr/consteval when targeting amdgcnspirv
The __AMDGCN_WAVEFRONT_SIZE and __AMDGCN_WAVEFRONT_SIZE__ macros are not defined at compilation, but these macros are deprecated and should no longer be used.
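Code that previously branched on architecture macros or used the wavefront size as a compile-time constant can instead obtain both values on the host at run time. The following is a minimal sketch of that pattern, not a prescribed approach. hipGetDeviceProperties and the gcnArchName and warpSize fields of hipDeviceProp_t are part of the HIP runtime API; the names DeviceInfo, query_device and wavefronts_per_block are hypothetical, and the fallback values exist only so that the sketch also builds as plain C++ without HIP.

```cpp
#include <string>

#if defined(__HIP__)
#include <hip/hip_runtime.h>
#endif

// Run-time description of a device, filled in on the host.
// (Hypothetical helper type, introduced for illustration only.)
struct DeviceInfo {
    std::string arch_name;    // architecture string reported by the runtime
    unsigned wavefront_size;  // physical wavefront size reported by the runtime
};

// Number of wavefronts needed to cover `block_size` threads.
// The wavefront size is an ordinary argument here, so the same code works
// whether the value is known at compile time or only at run time.
constexpr unsigned wavefronts_per_block(unsigned block_size,
                                        unsigned wavefront_size) {
    return (block_size + wavefront_size - 1) / wavefront_size;
}

// Query the device at run time instead of relying on architecture macros.
inline DeviceInfo query_device(int device_id) {
#if defined(__HIP__)
    hipDeviceProp_t prop{};
    if (hipGetDeviceProperties(&prop, device_id) == hipSuccess) {
        return {prop.gcnArchName, static_cast<unsigned>(prop.warpSize)};
    }
#endif
    (void)device_id;
    // Assumption: placeholder values used when HIP is unavailable or the
    // query fails. A real application would report an error instead.
    return {"unknown", 64u};
}
```

A caller might use `query_device(0).wavefront_size` to size per-wavefront scratch buffers or to choose launch parameters, passing the result to kernels as a regular argument rather than as a template parameter.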
Because the SPIR-V workflow requires an additional compilation step at run time, extra run time overhead might be observed. You should account for this overhead when measuring the timing and performance of this workflow.
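One way to keep this overhead from distorting a benchmark is to time the first invocation separately from later ones. The sketch below assumes that the run time lowering cost is paid on or before the first launch, which the text above does not state and which should be verified for a given setup. Timing, mean_ms and time_first_and_steady are hypothetical names, and the callable passed in is expected to block until the GPU work has finished (for example, by calling hipDeviceSynchronize()).

```cpp
#include <chrono>

// Result of a measurement (hypothetical helper type).
struct Timing {
    double first_call_ms;    // may include one-time run time compilation cost
    double steady_state_ms;  // mean duration of the remaining calls
};

// Mean duration per call; returns 0 when there are no repeats.
constexpr double mean_ms(double total_ms, int repeats) {
    return repeats > 0 ? total_ms / repeats : 0.0;
}

// Times `work` once on its own, then averages `repeats` further calls.
// `work` must not return before the work being measured has completed.
template <typename F>
Timing time_first_and_steady(F&& work, int repeats) {
    using clock = std::chrono::steady_clock;
    using ms = std::chrono::duration<double, std::milli>;

    const auto t0 = clock::now();
    work();  // first call: reported separately
    const auto t1 = clock::now();

    for (int i = 0; i < repeats; ++i) {
        work();  // later calls: averaged
    }
    const auto t2 = clock::now();

    return {ms(t1 - t0).count(), mean_ms(ms(t2 - t1).count(), repeats)};
}
```

Reporting both numbers makes it visible how much of the total time is a one-time cost and how much is steady-state kernel performance.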
Compatibility with precompiled ROCm libraries
A client application or library that targets SPIR-V can work with precompiled ROCm components that were built for concrete targets. For example, a program P that has both its own HIP kernels (__global__ functions) and calls to the rocBLAS library can use --offload-arch=amdgcnspirv without any additional changes to its compilation flow or its set of library dependencies. However, the general effect is that the program can run only on the concrete targets that the precompiled ROCm library was built for.