Build ROCm Systems Profiler from source#

To build ROCm Systems Profiler as part of the ROCm Core SDK, see TheRock build instructions. TheRock is the recommended way to build ROCm components from source.

To build ROCm Systems Profiler standalone, without TheRock, use the instructions on this page.

See also

If you encounter problems after installation, consult the Troubleshooting section.

Operating system support#

ROCm Systems Profiler is only supported on Linux. For more information, see ROCm Core SDK components.

Identifying the operating system#

If you are unsure of the Linux distribution and version, the /etc/os-release and /usr/lib/os-release files contain this information.

$ cat /etc/os-release
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.3 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu

The relevant fields are ID and VERSION_ID.
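
Because os-release files use shell-compatible variable assignments, the two fields can also be read from a script. The following is a minimal sketch; os_release_summary is a hypothetical helper name, not something shipped with ROCm Systems Profiler.

```shell
# Print "<ID> <VERSION_ID>" from an os-release style file.
# os_release_summary is a hypothetical helper name used for illustration.
os_release_summary() {
    file="${1:-/etc/os-release}"
    # Source the file in a subshell so its variables do not leak into the caller.
    (
        . "$file"
        printf '%s %s\n' "${ID:-unknown}" "${VERSION_ID:-unknown}"
    )
}

# Report the running system only if the file is readable.
if [ -r /etc/os-release ]; then
    os_release_summary /etc/os-release
fi
```

On the Ubuntu system shown above, this prints the ID followed by the VERSION_ID.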

Build ROCm Systems Profiler from source#

ROCm Systems Profiler requires CMake v3.25 or later and a GCC compiler with full C++17 support. Clang can be used instead of GCC if Dyninst is already installed.

Build requirements#

  • GCC compiler v10+

    • Older GCC compilers may be supported but are not tested

    • Clang compilers are generally supported for ROCm Systems Profiler but not Dyninst

  • CMake v3.25 or later

    Note

    If the CMake version installed on the system is too old, you can install a newer one in several ways. One of the easiest is to use pip, which installs CMake from PyPI.

    pip install --user 'cmake==3.25.0'
    export PATH=${HOME}/.local/bin:${PATH}
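
To see whether the installed tools already meet the requirements listed above, a quick check along the following lines may help. This is a sketch only: version_ge and report are hypothetical helper names, and the comparison relies on sort -V from GNU coreutils.

```shell
# version_ge A B: succeed when version A >= version B.
# (version_ge is a hypothetical helper; sort -V requires GNU coreutils.)
version_ge() {
    # The smaller version sorts first, so A >= B when B comes out on top.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# report TOOL FOUND REQUIRED: print whether FOUND satisfies REQUIRED.
report() {
    if version_ge "$2" "$3"; then
        echo "$1 $2: OK (>= $3)"
    else
        echo "$1 $2: too old (need >= $3)"
    fi
}

# "cmake --version" prints "cmake version X.Y.Z" on its first line.
if command -v cmake >/dev/null 2>&1; then
    report cmake "$(cmake --version | awk 'NR==1 {print $3}')" 3.25
else
    echo "cmake: not found on PATH"
fi

# "gcc -dumpversion" prints the major or the full version, depending on the build.
if command -v gcc >/dev/null 2>&1; then
    report gcc "$(gcc -dumpversion)" 10
else
    echo "gcc: not found on PATH"
fi
```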
    

Required third-party packages#

  • Dyninst for dynamic or static instrumentation. Dyninst uses the following required and optional components.

    The Dyninst sources bundled with ROCm Systems Profiler do not use Boost. If you build against an external, older Dyninst install instead, that layout may still require Boost development packages.

  • libunwind for call-stack sampling

  • SQLite for database output

  • spdlog for logging

Any of the third-party packages required by Dyninst, along with Dyninst itself, can be built and installed during the ROCm Systems Profiler build. The following table lists each package, its minimum version, the application that requires it (for example, ROCm Systems Profiler requires Dyninst while Dyninst requires TBB), and the CMake option that builds the package alongside ROCm Systems Profiler:

Third-Party Library | Minimum Version | Required By           | CMake Option
------------------- | --------------- | --------------------- | --------------------------------------------
Dyninst             | 13.0            | ROCm Systems Profiler | ROCPROFSYS_BUILD_DYNINST (default: OFF)
Libunwind           |                 | ROCm Systems Profiler | ROCPROFSYS_BUILD_LIBUNWIND (default: ON)
Nlohmann/JSON       |                 | ROCm Systems Profiler | ROCPROFSYS_BUILD_NLOHMANN_JSON (default: ON)
spdlog              |                 | ROCm Systems Profiler | ROCPROFSYS_BUILD_SPDLOG (default: ON)
SQLite              |                 | ROCm Systems Profiler | ROCPROFSYS_BUILD_SQLITE (default: OFF)
TBB                 | 2018.6          | Dyninst               | ROCPROFSYS_BUILD_TBB (default: OFF)
ElfUtils            | 0.178           | Dyninst               | ROCPROFSYS_BUILD_ELFUTILS (default: OFF)
LibIberty           |                 | Dyninst               | ROCPROFSYS_BUILD_LIBIBERTY (default: OFF)
OpenMP              | 4.x             | Dyninst               |

ROCm dependencies#

ROCm is required for GPU profiling features such as GPU hardware counter collection, tracing, and GPU and AI NIC monitoring.

  • ROCm

    • AMD SMI library for GPU and AI NIC monitoring

    • ROCprofiler-SDK for GPU hardware counters and ROCm tracing

Optional third-party packages#

The following packages are optional and can be enabled via the corresponding CMake options.

  • Python

    • ROCPROFSYS_USE_PYTHON enables Python support.

  • PAPI

  • MPI

    • ROCPROFSYS_USE_MPI enables full MPI support

    • ROCPROFSYS_USE_MPI_HEADERS enables wrapping of the dynamically-linked MPI C function calls. (By default, if ROCm Systems Profiler cannot find an OpenMPI MPI distribution, it uses a local copy of the OpenMPI mpi.h.)

Third-Party Library | CMake Enable Option
------------------- | ----------------------------------------
PAPI                | ROCPROFSYS_USE_PAPI (default: ON)
MPI                 | ROCPROFSYS_USE_MPI (default: OFF)
MPI (header-only)   | ROCPROFSYS_USE_MPI_HEADERS (default: ON)
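
As an illustration of how these options reach CMake, the sketch below assembles a configure command with one possible selection of optional features. The selection is an arbitrary example rather than a recommendation, and the build and source directories are the ones used elsewhere on this page. The command is only printed, not executed.

```shell
# Illustrative selection of optional features; adjust to your needs.
opts="-D ROCPROFSYS_USE_PYTHON=ON -D ROCPROFSYS_USE_PAPI=OFF"
opts="$opts -D ROCPROFSYS_USE_MPI=OFF -D ROCPROFSYS_USE_MPI_HEADERS=ON"

# Dry run: build the configure command as a string and print it.
configure_cmd="cmake -B rocprof-sys-build $opts -S rocm-systems/projects/rocprofiler-systems"
echo "$configure_cmd"
```

Running the printed command from the directory that contains the rocm-systems checkout performs the actual configuration.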

Installing Dyninst#

The easiest way to install Dyninst is alongside ROCm Systems Profiler.

Building Dyninst alongside ROCm Systems Profiler#

To install Dyninst alongside ROCm Systems Profiler, configure ROCm Systems Profiler with ROCPROFSYS_BUILD_DYNINST=ON. Depending on the Ubuntu version, the apt package manager might provide sufficiently recent versions of the Dyninst dependencies TBB and LibIberty (apt-get install libtbb-dev libiberty-dev). Alternatively, you can build and install the Dyninst dependencies with ROCPROFSYS_BUILD_<DEP>=ON, as follows:

git clone --filter=blob:none --sparse https://github.com/ROCm/rocm-systems.git
git -C rocm-systems sparse-checkout set projects/rocprofiler-systems

cmake -B rocprof-sys-build -DROCPROFSYS_BUILD_DYNINST=ON \
      -DROCPROFSYS_BUILD_{TBB,ELFUTILS,LIBIBERTY}=ON \
      -S rocm-systems/projects/rocprofiler-systems

Here, a shell that supports brace expansion, such as Bash, expands -DROCPROFSYS_BUILD_{TBB,ELFUTILS,LIBIBERTY}=ON to -DROCPROFSYS_BUILD_TBB=ON -DROCPROFSYS_BUILD_ELFUTILS=ON ...
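
Brace expansion is not part of POSIX sh, so the shorthand does not work in every shell. A portable sketch that produces the same set of flags with a loop might look like this; the variable names are illustrative.

```shell
# Build one -DROCPROFSYS_BUILD_<DEP>=ON flag per dependency with a POSIX loop.
flags=""
for dep in TBB ELFUTILS LIBIBERTY; do
    flags="$flags -DROCPROFSYS_BUILD_${dep}=ON"
done
flags="${flags# }"   # drop the leading space left by the first iteration

echo "$flags"

# The flags can then be passed unquoted so that the shell splits them into words:
#   cmake -B rocprof-sys-build -DROCPROFSYS_BUILD_DYNINST=ON $flags \
#         -S rocm-systems/projects/rocprofiler-systems
```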

Building and installing ROCm Systems Profiler#

ROCm Systems Profiler has CMake configuration options for MPI support (ROCPROFSYS_USE_MPI or ROCPROFSYS_USE_MPI_HEADERS), OpenMP-Tools (ROCPROFSYS_USE_OMPT), and hardware counters via PAPI (ROCPROFSYS_USE_PAPI), among other features. ROCm support is always enabled. Various additional features can be enabled via the TIMEMORY_USE_* CMake options.

When a ROCPROFSYS_USE_<VAL> option has a corresponding TIMEMORY_USE_<VAL> option, the Timemory support for that feature has been integrated into the Perfetto support in ROCm Systems Profiler. For example, ROCPROFSYS_USE_PAPI=<VAL> also configures TIMEMORY_USE_PAPI=<VAL>. The data that Timemory collects via this package is then passed along to Perfetto and displayed when the .proto file is visualized in the Perfetto UI.

git clone --filter=blob:none --sparse https://github.com/ROCm/rocm-systems.git
git -C rocm-systems sparse-checkout set projects/rocprofiler-systems

cmake                                                 \
    -B rocprof-sys-build                              \
    -D CMAKE_INSTALL_PREFIX=/opt/rocprofiler-systems  \
    -D ROCPROFSYS_USE_PYTHON=ON                       \
    -D ROCPROFSYS_BUILD_DYNINST=ON                    \
    -D ROCPROFSYS_BUILD_TBB=ON                        \
    -D ROCPROFSYS_BUILD_ELFUTILS=ON                   \
    -D ROCPROFSYS_BUILD_LIBIBERTY=ON                  \
    -S rocm-systems/projects/rocprofiler-systems
cmake --build rocprof-sys-build --target all --parallel 8
cmake --build rocprof-sys-build --target install
source /opt/rocprofiler-systems/share/rocprofiler-systems/setup-env.sh
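
The commands above hard-code eight build jobs and assume the install step succeeded. A slightly more defensive variant, sketched below, derives the job count from the machine and checks for setup-env.sh before sourcing it. The njobs and setup variable names are illustrative, and nproc is assumed to come from GNU coreutils.

```shell
# Use every available core instead of a fixed job count; fall back to 1.
njobs=$(nproc 2>/dev/null || echo 1)
# Dry run: print the build command rather than executing it.
echo "cmake --build rocprof-sys-build --target all --parallel ${njobs}"

# Source the environment script only if the install step produced it.
setup=/opt/rocprofiler-systems/share/rocprofiler-systems/setup-env.sh
if [ -r "$setup" ]; then
    . "$setup"
else
    echo "setup-env.sh not found under /opt/rocprofiler-systems; was the install step run?"
fi
```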

Using the build script#

This method automates the CMake process with a script that wraps the CMake commands and handles build logic, environment variables, and packaging. Run ./scripts/build-release.sh with your desired options to generate packages.

Use ./scripts/build-release.sh --help for more information.

./scripts/build-release.sh --help
Options:
    --core       [+nopython] [+python]                    Core (Use '+nopython' to build w/o python, use '+python' to python build with python)
    --mpi        [+nopython] [+python]                    MPI (Use '+nopython' to build w/o python, use '+python' to python build with python)
    --rocm       [+nopython] [+python]                    ROCm (Use '+nopython' to build w/o python, use '+python' to python build with python)
    --rocm-mpi   [+nopython] [+python]                    ROCm + MPI (Use '+nopython' to build w/o python, use '+python' to python build with python)
    --mpi-impl   [openmpi|mpich]                          MPI implementation

    --lto                  [on|off]                       Enable LTO (default: off)
    --strip                [on|off]                       Strip libraries (default: off)
    --perfetto-tools       [on|off]                       Install perfetto tools (default: on)
    --static-libgcc        [on|off]                       Build with static libgcc (default: on)
    --static-libstdcxx     [on|off]                       Build with static libstdc++ (default: on)
    --hidden-visibility    [on|off]                       Build with hidden visibility (default: on)
    --max-threads          N                              Max number of threads supported (default: 2048)
    --parallel             N                              Number of parallel build jobs (default: 12)
    --generators           [STGZ][DEB][RPM][+others]      CPack generators (default: stgz deb rpm)

MPI support within ROCm Systems Profiler#

ROCm Systems Profiler can have full (ROCPROFSYS_USE_MPI=ON) or partial (ROCPROFSYS_USE_MPI_HEADERS=ON) MPI support. The only difference between these two modes is whether or not the results collected via Timemory and/or Perfetto can be aggregated into a single output file during finalization. When full MPI support is enabled, combining the Timemory results always occurs, whereas combining the Perfetto results is configurable via the ROCPROFSYS_PERFETTO_COMBINE_TRACES setting.

The primary benefits of partial or full MPI support are the automatic wrapping of MPI functions and the ability to label output with suffixes that correspond to the MPI_COMM_WORLD rank ID instead of the system process identifier (PID). In general, partial MPI support with the OpenMPI headers is recommended because it is the most portable configuration.

If full MPI support is selected, make sure your target application is built against the same MPI distribution as ROCm Systems Profiler. For example, do not build ROCm Systems Profiler with MPICH and use it on a target application built against OpenMPI.

If partial support is selected, the OpenMPI headers are recommended over the MPICH headers because MPI_COMM_WORLD in OpenMPI is a pointer to ompi_communicator_t (8 bytes), whereas MPI_COMM_WORLD in MPICH is an int (4 bytes). Building ROCm Systems Profiler with partial MPI support and the MPICH headers and then using it on an application built against OpenMPI causes a segmentation fault. This happens because the value of MPI_COMM_WORLD is truncated during the function wrapping before being passed along to the underlying MPI function.
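
The size mismatch can be illustrated with plain shell arithmetic. The address below is a made-up value standing in for an 8-byte OpenMPI communicator pointer. Masking it to 32 bits mimics what happens when such a value is squeezed into a 4-byte int. This sketch only demonstrates the loss of the upper bits and is not the wrapper code itself.

```shell
# Hypothetical 64-bit address standing in for OpenMPI's MPI_COMM_WORLD pointer.
ptr=$(( 0x7f1234567890 ))

# Keeping only the low 32 bits mimics storing the handle in a 4-byte int.
truncated=$(( ptr & 0xffffffff ))

printf 'original:  0x%x\n' "$ptr"
printf 'truncated: 0x%x\n' "$truncated"
```

The truncated value no longer refers to the original object, which is why dereferencing it inside the underlying MPI function fails.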

Python support within ROCm Systems Profiler#

ROCm Systems Profiler supports profiling Python code. To enable it, set the ROCPROFSYS_USE_PYTHON CMake option and select the Python version with ROCPROFSYS_PYTHON_VERSIONS="<MAJOR>.<MINOR>". To build for multiple Python versions, set ROCPROFSYS_PYTHON_VERSIONS="<MAJOR>.<MINOR>;[<MAJOR>.<MINOR>]" together with ROCPROFSYS_PYTHON_ROOT_DIRS="/path/to/version;[/path/to/version]". When building for multiple Python versions, the ROCPROFSYS_PYTHON_VERSIONS and ROCPROFSYS_PYTHON_ROOT_DIRS lists must have the same length.

cmake --preset release -D ROCPROFSYS_PYTHON_ROOT_DIRS="/usr/bin;/usr/bin" -D ROCPROFSYS_PYTHON_VERSIONS="3.10;3.12"
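
Since the two lists must line up entry for entry, it can be worth checking their lengths before invoking CMake. The following sketch counts the semicolon-separated entries of each list; count_items is a hypothetical helper, and the values are the ones from the example above.

```shell
# count_items LIST: print the number of semicolon-separated entries in LIST.
# (count_items is a hypothetical helper used for illustration.)
count_items() {
    printf '%s\n' "$1" | awk -F';' '{ print NF }'
}

versions="3.10;3.12"
root_dirs="/usr/bin;/usr/bin"

if [ "$(count_items "$versions")" -eq "$(count_items "$root_dirs")" ]; then
    echo "ROCPROFSYS_PYTHON_VERSIONS and ROCPROFSYS_PYTHON_ROOT_DIRS have the same length"
else
    echo "list length mismatch: fix the lists before running cmake" >&2
fi
```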

Post-installation#

See Post-installation steps and Troubleshooting for more information.