Installing and deploying ROCm Compute Profiler#

The ROCm Compute Profiler core installation:

  • Provides the core application profiling capability.

  • Allows the collection of performance counters, filtering by hardware block, dispatch, kernel, and more.

  • Provides a CLI-based analysis mode.

  • Provides a standalone web interface for importing analysis metrics.

Core installation#

The core ROCm Compute Profiler application requires the following basic software dependencies. As of ROCm 6.2, the core ROCm Compute Profiler is included with your ROCm installation.

  • Python >= 3.8

  • CMake >= 3.19

  • ROCm >= 5.7.1

Note

ROCm Compute Profiler will use the first version of python3 found in your system’s PATH. If the default version of Python is older than 3.8, you may need to update your system’s PATH to point to a newer version.
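As a quick check before installing, the sketch below reports which python3 and cmake would be picked up from PATH and compares their versions against the minimums listed above. The version_ge helper is a hypothetical name introduced here for illustration, and it assumes GNU sort with the -V (version sort) option is available.

```shell
# Succeeds if version $1 is greater than or equal to version $2.
# Assumption: GNU sort with -V (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Check the first python3 found in PATH against the 3.8 minimum.
if command -v python3 >/dev/null 2>&1; then
    py_version=$(python3 -c 'import sys; print("%d.%d.%d" % sys.version_info[:3])')
    if version_ge "$py_version" 3.8; then
        echo "python3 $py_version ($(command -v python3)): OK"
    else
        echo "python3 $py_version ($(command -v python3)): older than 3.8"
    fi
else
    echo "python3 not found in PATH"
fi

# Check the first cmake found in PATH against the 3.19 minimum.
if command -v cmake >/dev/null 2>&1; then
    # The first line of the output looks like "cmake version X.Y.Z".
    cmake_version=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$cmake_version" 3.19; then
        echo "cmake $cmake_version ($(command -v cmake)): OK"
    else
        echo "cmake $cmake_version ($(command -v cmake)): older than 3.19"
    fi
else
    echo "cmake not found in PATH"
fi
```

If the reported python3 is too old, placing a newer interpreter's directory earlier in PATH changes which one is found first.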

ROCm Compute Profiler depends on a number of Python packages documented in the top-level requirements.txt file. Install these before configuring ROCm Compute Profiler.

Tip

If you plan to build ROCm Compute Profiler as a developer, also consider these additional requirements files:

  • docs/sphinx/requirements.txt: Python packages required to build this documentation from source.

  • requirements-test.txt: Python packages required to run ROCm Compute Profiler’s CI suite using PyTest.

The recommended approach is to install ROCm Compute Profiler on a shared file system so that multiple users can access the installation. The following steps show how to use pip to install the required Python dependencies, and then install ROCm Compute Profiler itself, into a shared location controlled by the INSTALL_DIR environment variable.

Configuration variables#

The following installation steps leverage several CMake project variables defined as follows.

| CMake variable | Description |
|---|---|
| CMAKE_INSTALL_PREFIX | Controls the install path for ROCm Compute Profiler files. |
| PYTHON_DEPS | Specifies an optional path to resolve Python package dependencies. |
| MOD_INSTALL_PATH | Specifies an optional path for separate ROCm Compute Profiler modulefile installation. |
| rocprofiler-sdk_DIR | Specifies the path to the ROCprofiler-SDK CMake package configuration directory used to build the rocprofiler-compute counter collection tool. This directory should contain rocprofiler-sdkConfig.cmake (for example, <rocprofiler-sdk-install-path>/lib/cmake/rocprofiler-sdk). |
| STANDALONEBINARY_EXTRACT_DIR | Specifies an optional temporary path to be used for extraction by the ROCm Compute Profiler standalone binary. |
| STANDALONEBINARY | Should be ON to enable the build of a standalone binary for ROCm Compute Profiler. |
| TEST_FROM_INSTALL | Should be ON to enable testing from the installation location without dependency on the source directory. |
| SKIP_NATIVE_TOOL_BUILD | Should be ON to skip building the native profiling tool. When enabled, the native tool will be compiled at runtime instead of build time. This is useful when ROCprofiler-SDK is not available during build time. |
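As an illustration of how these variables might be combined, the sketch below assembles a configure command that points CMake at ROCprofiler-SDK when its package directory is present and otherwise defers the native tool build to runtime. The default INSTALL_DIR and the /opt/rocm location are assumptions made for this example, not values required by the project, and the command is only printed so it can be reviewed before use.

```shell
# Assumed locations for illustration; adjust both for your system.
INSTALL_DIR="${INSTALL_DIR:-/opt/apps/rocprofiler-compute}"
ROCM_PATH="${ROCM_PATH:-/opt/rocm}"   # assumption: ROCprofiler-SDK lives under this prefix

# Use rocprofiler-sdk_DIR when the CMake package directory exists;
# otherwise skip the native tool build so it is compiled at runtime.
sdk_cmake_dir="$ROCM_PATH/lib/cmake/rocprofiler-sdk"
if [ -d "$sdk_cmake_dir" ]; then
    native_tool_arg="-Drocprofiler-sdk_DIR=$sdk_cmake_dir"
else
    native_tool_arg="-DSKIP_NATIVE_TOOL_BUILD=ON"
fi

# Print the configure command rather than running it.
echo cmake "-DCMAKE_INSTALL_PREFIX=$INSTALL_DIR/3.6.0" \
    "-DPYTHON_DEPS=$INSTALL_DIR/python-libs" \
    "$native_tool_arg" ..
```

Printing first keeps the example safe to run anywhere; removing the leading echo from inside a build directory would run the configuration step itself.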

Install from TheRock nightly releases#

  1. For detailed instructions on installing TheRock nightly release artifacts, refer to TheRock/Release.

Install from the source#

  1. Sparse clone the repository ROCm/rocm-systems to get the ROCm Compute Profiler source code.

    git clone --no-checkout --filter=blob:none https://github.com/ROCm/rocm-systems.git
    cd rocm-systems
    git sparse-checkout init --cone
    git sparse-checkout set projects/rocprofiler-compute
    git checkout develop
    
  2. Navigate to the rocprofiler-compute project root.

    cd projects/rocprofiler-compute
    
  3. Install the Python dependencies into the shared location, then configure and install ROCm Compute Profiler.

    # define top-level install path
    export INSTALL_DIR=<your-top-level-desired-install-path>
    
    # install python deps
    python3 -m pip install -t ${INSTALL_DIR}/python-libs -r requirements.txt
    
    # configure ROCm Compute Profiler for shared install
    mkdir build
    cd build
    cmake -DCMAKE_INSTALL_PREFIX=${INSTALL_DIR}/3.6.0 \
            -DPYTHON_DEPS=${INSTALL_DIR}/python-libs \
            -DMOD_INSTALL_PATH=${INSTALL_DIR}/modulefiles/rocprofiler-compute ..
    
    # install
    make -j$(nproc) install
    

    Tip

    You might need to run the final installation step with sudo if you don’t have write access to the chosen installation path.

  4. Upon successful installation, your top-level installation directory should look like this.

    $ ls $INSTALL_DIR
    modulefiles  3.6.0  python-libs
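
To confirm this layout without inspecting it by hand, a small check along the following lines could be used. The check_install_layout function is a hypothetical helper written for this example; the version directory name is passed as an argument because it depends on the CMAKE_INSTALL_PREFIX chosen at configure time.

```shell
# check_install_layout TOP_DIR VERSION
# Succeeds only if TOP_DIR contains the three directories created by the
# shared install steps: modulefiles, the version directory, and python-libs.
check_install_layout() {
    top=$1
    version=$2
    missing=0
    for entry in modulefiles "$version" python-libs; do
        if [ ! -d "$top/$entry" ]; then
            echo "missing: $top/$entry" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For the installation shown above, it would be called as check_install_layout "$INSTALL_DIR" 3.6.0, and any missing directory is reported on standard error.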
    

Install from the tarball#

  1. Download the rocprofiler-compute specific tarball for the latest release from ROCm/rocm-systems.

  2. Untar the downloaded tarball and navigate to the rocprofiler-compute directory.

  3. Follow the installation steps under Install from the source.

Execution using modulefiles#

The installation process includes the creation of an environment modulefile for use with Lmod. On systems that support Lmod, you can register the ROCm Compute Profiler modulefile directory and set up your environment to run ROCm Compute Profiler as follows.

$ module use $INSTALL_DIR/modulefiles
$ module load rocprofiler-compute
$ which rocprof-compute
/opt/apps/rocprofiler-compute/3.6.0/bin/rocprof-compute

$ rocprof-compute --version
----------------------------------------
rocprofiler-compute version: 3.6.0 (release)
Git revision:     abc1234
----------------------------------------

Tip

If you’re relying on an Lmod Python module locally, you may wish to customize the resulting ROCm Compute Profiler modulefile post-installation to include extra module dependencies.

Execution without modulefiles#

To use ROCm Compute Profiler without the companion modulefile, update your PATH settings to enable access to the command line binary. If you installed Python dependencies in a shared location, also update your PYTHONPATH configuration.

export PATH=$INSTALL_DIR/3.6.0/bin:$PATH
export PYTHONPATH=$INSTALL_DIR/python-libs
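Note that the second export replaces any PYTHONPATH that is already set. If other tools in your environment rely on that variable, a variant that prepends instead might look like the sketch below. The default INSTALL_DIR is only a placeholder for this example.

```shell
# Placeholder default for illustration; set INSTALL_DIR to your own install path.
INSTALL_DIR="${INSTALL_DIR:-/opt/apps/rocprofiler-compute}"

# Put the profiler's bin directory ahead of the existing PATH.
export PATH="$INSTALL_DIR/3.6.0/bin:$PATH"

# Prepend python-libs, keeping any existing PYTHONPATH entries.
# The ${VAR:+...} form adds the separator only when PYTHONPATH is non-empty,
# which avoids a trailing ":" entry.
export PYTHONPATH="$INSTALL_DIR/python-libs${PYTHONPATH:+:$PYTHONPATH}"
```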

Tip

To always run ROCm Compute Profiler with a particular version of Python, you can create a bash alias. For example, to run ROCm Compute Profiler with Python 3.8, you can run the following command:

alias rocprof-compute-mypython="/usr/bin/python3.8 /opt/rocm/bin/rocprof-compute"
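Bash does not expand aliases in non-interactive shells by default, so an alias like this is generally unavailable inside scripts. A shell function is one possible alternative. In the sketch below, the function name and the ROCPROF_COMPUTE_PYTHON variable are hypothetical names chosen for this example and are not part of ROCm Compute Profiler.

```shell
# Run rocprof-compute with a specific Python interpreter.
# ROCPROF_COMPUTE_PYTHON is a hypothetical override introduced for this
# example; when unset, the interpreter from the alias example is used.
rocprof_compute_mypython() {
    "${ROCPROF_COMPUTE_PYTHON:-/usr/bin/python3.8}" /opt/rocm/bin/rocprof-compute "$@"
}
```

Because "$@" forwards every argument unchanged, the function accepts the same arguments as rocprof-compute itself.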

Configuring the environment for profiling#

ROCm Compute Profiler supports two profiling backends, selectable via the ROCPROF environment variable.

| Backend | How it is selected | How it works |
|---|---|---|
| rocprofiler-sdk (default) | ROCPROF unset, or ROCPROF=rocprofiler-sdk | Injects librocprofiler-sdk-tool.so into the target application process via LD_PRELOAD. The application runs directly; profiling is configured through environment variables. |
| rocprofv3 | ROCPROF=rocprofv3 or ROCPROF=<path-to-rocprofv3> | Launches the rocprofv3 binary as a wrapper process around the target application. Profiling is configured via rocprofv3 command-line arguments. |

Both backends build on the same underlying ROCprofiler-SDK infrastructure. The rocprofiler-sdk backend is recommended because it supports the full feature set, including iteration multiplexing.
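The selection rules can be summarized with a small shell helper. This is only an illustration of the documented behavior, not the profiler's own logic. The resolve_backend function name is hypothetical, and treating any value that ends in /rocprofv3 as a path to the binary is an assumption made for this sketch.

```shell
# resolve_backend VALUE
# Prints the backend that a given ROCPROF value would select, following the
# rules described above. Illustrative only.
resolve_backend() {
    case "${1:-}" in
        ""|rocprofiler-sdk)
            echo "rocprofiler-sdk" ;;   # default when ROCPROF is unset
        rocprofv3|*/rocprofv3)
            echo "rocprofv3" ;;         # binary name or a path to it (assumed pattern)
        *)
            echo "unrecognized ROCPROF value: $1" >&2
            return 1 ;;
    esac
}

# Report the backend implied by the current environment.
resolve_backend "${ROCPROF:-}" || true
```

To select a backend for the current shell session, export the variable before profiling, for example export ROCPROF=rocprofiler-sdk.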

Native counter collection tool#

When using the rocprofiler-sdk backend on ROCm 7.0 or later, ROCm Compute Profiler also injects a native counter collection tool (librocprofiler-compute-tool.so) alongside the SDK tool via LD_PRELOAD. This tool is a shared library built as part of ROCm Compute Profiler that directly uses the ROCprofiler-SDK public C API to collect hardware performance counter data per kernel dispatch.

The division of responsibility between the two injected libraries is:

  • Native tool (librocprofiler-compute-tool.so): collects hardware performance counters per dispatch.

  • SDK tool (librocprofiler-sdk-tool.so): handles kernel tracing and output database generation.

The native tool is required for iteration multiplexing. Use --no-native-tool to disable it, but note that doing so also disables iteration multiplexing. The native tool is not used in dynamic process attachment mode or with the rocprofv3 backend.
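The conditions described in this section can be collected into one predicate. The sketch below restates the text as a hypothetical helper named native_tool_expected; it is not how ROCm Compute Profiler makes the decision internally, and the argument order is an invention of this example.

```shell
# native_tool_expected BACKEND ROCM_MAJOR [ATTACH] [DISABLED]
# Succeeds when, per the description above, the native tool would be used:
#   - the backend is rocprofiler-sdk,
#   - the ROCm major version is 7 or later,
#   - the run is not in dynamic process attachment mode (ATTACH=no),
#   - --no-native-tool was not passed (DISABLED=no).
native_tool_expected() {
    backend=${1:-rocprofiler-sdk}
    rocm_major=$2
    attach=${3:-no}
    disabled=${4:-no}
    [ "$backend" = "rocprofiler-sdk" ] &&
        [ "$rocm_major" -ge 7 ] &&
        [ "$attach" = "no" ] &&
        [ "$disabled" = "no" ]
}

# Example: default backend on ROCm 7, no attachment, native tool not disabled.
if native_tool_expected rocprofiler-sdk 7; then
    echo "native tool expected; iteration multiplexing available"
fi
```

Since iteration multiplexing requires the native tool, any case where this predicate fails is also a case where iteration multiplexing is unavailable.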