C++ language support#
The ROCm platform supports combining C++ and HIP (Heterogeneous-computing Interface for
Portability) code. This code is compiled with the clang or clang++ compiler. The official
compilers support the HIP platform, or you can use the amdclang or amdclang++ compilers
included in the ROCm installation, which are wrappers for the official versions.
The source code is compiled according to the C++03, C++11, C++14, C++17, and C++20
standards, along with HIP-specific extensions, but is subject to restrictions. The key
restriction is the reduced support for the standard library in device code. This is
because, by default, a function is considered to run on the host, except for constexpr
functions, which can run on both host and device.
Modern C++ support#
C++ is considered a modern programming language as of C++11. This section describes how HIP supports these new C++ features.
C++11 support#
The C++11 standard introduced many new features. These features are supported in HIP host
code, with some notable omissions on the device side. As a rule of thumb, constexpr
functions work on the device and the rest don’t. This means that some important
functionality, such as std::function, is missing on the device. The standard library
wasn’t designed with HIP in mind, so device-side support is provided on a “works as-is”
basis.
Certain features have restrictions and clarifications. For example, any functions using
the constexpr qualifier or the new initializer lists, std::move or std::forward features
are implicitly considered to have the __host__ and __device__ execution space specifiers.
Also, constexpr variables that are static members or namespace scoped can be used from
both host and device, but only for read access. Dereferencing a static constexpr outside
its specified execution space causes an error.
Lambdas are supported, but there are some extensions and restrictions on their usage. For more information, see the Extended lambdas section below.
C++14 support#
The C++14 language features are supported.
C++17 support#
All C++17 language features are supported.
C++20 support#
Most C++20 language features are supported, but extensions and restrictions apply. C++20
introduced coroutines and modules, which fundamentally changed how programs are written.
HIP doesn’t support these two features. However, consteval functions can be called from
host and device, even if specified for host use only.
The three-way comparison operator (spaceship operator <=>) works with host and device
code.
Extensions and restrictions#
In addition to the deviations from the standard, there are some general extensions and restrictions to consider.
Global functions#
Functions that serve as an entry point for device execution are called kernels and are
specified with the __global__ qualifier. To call a kernel function, use the triple
chevron operator: <<< >>>. Kernel functions must have a void return type. These
functions can’t:
have a constexpr specifier
have a parameter of type std::initializer_list or va_list
use an rvalue reference as a parameter
use parameters having different sizes in host and device code, for example long double arguments or structs containing long double members
use struct-type arguments that have a different layout in host and device code
Kernels can have variadic template parameters, but only one parameter pack, which must be the last item in the template parameter list.
Device space memory specifiers#
HIP includes device space memory specifiers to indicate whether a variable is allocated
in host or device memory and how its memory should be allocated. HIP supports the
__device__, __shared__, __managed__, and __constant__ specifiers.
The __device__ and __constant__ specifiers define global variables, which are allocated
within global memory on the HIP devices. The only difference is that __constant__
variables can’t be modified by device code after allocation. The __shared__ specifier
allocates the variable within shared memory, which is available to all threads in a
block.
The __managed__ variable specifier creates global variables that are initially undefined
and unaddressed within the global symbol table. The HIP runtime allocates managed memory
and defines the symbol when it loads the device binary. A managed variable can be
accessed in both device and host code.
It’s important to know where a variable is stored because it is only available from
certain locations. Generally, variables allocated in host memory are not accessible from
device code, while variables allocated in device memory are not directly accessible from
host code. Dereferencing a pointer to device memory on the host results in a segmentation
fault. Accessing device variables in host code should be done through kernel execution or
HIP functions like hipMemcpyToSymbol.
Exception handling#
An important difference between the host and device code is exception handling. In device code, this control flow isn’t available due to the hardware architecture. The device code must use return codes to handle errors.
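One way to apply the return-code approach is sketched below. Status, safe_divide, and divide_kernel are hypothetical names. The preprocessor guard at the top is an assumption added so that the sketch also builds with a plain C++ compiler; with a HIP compiler the specifiers come from the HIP runtime header.

```cpp
#if defined(__HIPCC__)
#include <hip/hip_runtime.h>
#else
// Assumption: empty fallbacks so this sketch also builds as host-only C++.
#define __host__
#define __device__
#endif

// Hypothetical status codes standing in for exceptions.
enum class Status { Ok, DivideByZero };

// Reports failure through the return value instead of throwing.
// On failure, *out is left untouched.
__host__ __device__ inline Status safe_divide(int num, int den, int* out) {
    if (den == 0) return Status::DivideByZero;
    *out = num / den;
    return Status::Ok;
}

#if defined(__HIPCC__)
// A kernel returns void, so it stores the status for the host to inspect
// after copying it back.
__global__ void divide_kernel(int num, int den, int* out, Status* status) {
    *status = safe_divide(num, den, out);
}
#endif
```

On the host, the same function can be checked directly: safe_divide(1, 0, &q) yields Status::DivideByZero and leaves q unchanged.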
Kernel parameters#
There are some restrictions on kernel function parameters. They cannot be passed by reference, because these functions are called from the host but run on the device. Also, a variable number of arguments is not allowed.
Classes#
Classes work on both the host and device side, but there are some constraints. Static
member functions can’t be __global__. Virtual member functions work, but a virtual
function must not be called from the host if the object it belongs to was created on the
device, or the other way around, because this behavior is undefined. Another minor
restriction is that __device__ variables with global scope must have trivial
constructors.
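A class that respects these constraints might look like the following sketch. Vec2 and g_origin are hypothetical names, and the preprocessor guard is an assumption that lets the class also compile as host-only C++.

```cpp
#if defined(__HIPCC__)
#include <hip/hip_runtime.h>
#else
// Assumption: empty fallbacks so this sketch also builds as host-only C++.
#define __host__
#define __device__
#endif

struct Vec2 {
    float x;
    float y;

    // Non-static member function usable from host and device code.
    __host__ __device__ float dot(const Vec2& other) const {
        return x * other.x + y * other.y;
    }

    // A static member function may be __host__ and __device__, but not __global__.
    __host__ __device__ static Vec2 unit_x() { return Vec2{1.0f, 0.0f}; }
};

#if defined(__HIPCC__)
// Vec2 has a trivial default constructor, so a global-scope __device__
// instance is permitted.
__device__ Vec2 g_origin;
#endif
```

The class deliberately has no virtual functions, which avoids the question of which side created the object.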
Polymorphic function wrappers#
HIP doesn’t support the polymorphic function wrapper std::function, which was introduced
in C++11, in device code.
Extended lambdas#
HIP supports lambdas, which by default work as expected.
Lambdas have implicit __host__ and __device__ attributes. This means that they can be
executed by both host and device code and work the way you would expect. To make a lambda
callable only by host code or only by device code, add the __host__ or __device__
attribute. The only restriction is that device code can access host variables only
through copies captured by value. Accessing them through a reference causes undefined
behavior.
Inline namespaces#
Inline namespaces are supported, but with a few exceptions. The following entities can’t be declared in namespace scope within an inline unnamed namespace:
__managed__, __device__, __shared__ and __constant__ variables
__global__ functions and function templates
variables with surface or texture type