Basic usage#
The following section outlines basic ROCm Compute Profiler workflows, modes, options, and operations.
Command line profiler#
Launch and profile the target application using the command line profiler.
The command line profiler launches the target application, calls the
ROCProfiler API via the rocprof
binary, and collects profile results for
the specified kernels, dispatches, and hardware components. If not
specified, ROCm Compute Profiler defaults to collecting all available counters for all
kernels and dispatches launched by the your executable.
To collect the default set of data for all kernels in the target application, launch, for example:
$ rocprof-compute profile -n vcopy_data -- ./vcopy -n 1048576 -b 256
This runs the app, launches each kernel, and generates profiling results. By
default, results are written to a subdirectory with your accelerator’s name;
for example, ./workloads/vcopy_data/MI200/
, where name is configurable
via the -n
argument.
Note
To collect all requested profile information, ROCm Compute Profiler might replay kernels multiple times.
Customize data collection#
Options are available to specify for which kernels and metrics data should be collected. Note that you can apply filtering in either the profiling or analysis stage. Filtering at profiling collection often speeds up your aggregate profiling run time.
Common filters to customize data collection include:
-k
,--kernel
Enables filtering kernels by name.
-d
,--dispatch
Enables filtering based on dispatch ID.
-b
,--block
Enables collection metrics for only the specified (one or more) hardware component blocks.
See Filtering for an in-depth walkthrough.
To view available metrics by hardware block, use the --list-metrics
argument:
$ rocprof-compute analyze --list-metrics <sys_arch>
Analyze in the command line#
After generating a local output folder (for example,
./workloads/vcopy_data/MI200
), use the command line tool to quickly
interface with profiling results. View different metrics derived from your
profiled results and get immediate access all metrics organized by hardware
blocks.
If you don’t apply kernel, dispatch, or hardware block filters at this stage, analysis is reflective of the entirety of the profiling data.
To interact with profiling results from a different session, provide the workload path.
-p
,--path
Enables you to analyze existing profiling data in the ROCm Compute Profiler CLI.
See CLI analysis for more detailed information.
Analyze in the Grafana GUI#
To conduct a more in-depth analysis of profiling results, it’s suggested to use a Grafana GUI with ROCm Compute Profiler. To interact with profiling results, import your data to the MongoDB instance included in the ROCm Compute Profiler Dockerfile. See Setting up Grafana server for ROCm Compute Profiler.
To interact with Grafana data, stored in the ROCm Compute Profiler database, enter
database
mode; for example:
$ rocprof-compute database --import [CONNECTION OPTIONS]
See Grafana GUI analysis for more detailed information.
Modes#
Modes change the fundamental behavior of the ROCm Compute Profiler command line tool. Depending on which mode you choose, different command line options become available.
Profile mode#
profile
Launches the target application on the local system using ROCProfiler. Depending on the profiling options chosen, selected kernels, dispatches, and or hardware components used by the application are profiled. It stores results locally in an output folder:
./workloads/\<name>
.$ rocprof-compute profile --help
See Profile mode to learn about this mode in depth and to get started profiling with ROCm Compute Profiler.
Analyze mode#
analyze
Loads profiling data from the
--path
(-p
) directory into the ROCm Compute Profiler CLI analyzer where you have immediate access to profiling results and generated metrics. It generates metrics from the entirety of your profiled application or a subset identified through the ROCm Compute Profiler CLI analysis filters.To generate a lightweight GUI interface, you can add the
--gui
flag to your analysis command.This mode is a middle ground to the highly detailed ROCm Compute Profiler Grafana GUI and is great if you want immediate access to a hardware component you’re already familiar with.
$ rocprof-compute analyze --help
See Analyze mode to learn about this mode in depth and to get started with analysis using ROCm Compute Profiler.
Database mode#
database
The Grafana analyzer GUI is built on a MongoDB database.
--import
profiling results to the DB to interact with the workload in Grafana or--remove
the workload from the DB.Connection options need to be specified. See Grafana GUI analysis for more details.
$ rocprof-compute database --help
See Setting up Grafana server for ROCm Compute Profiler to learn about setting up a Grafana server and database instance to make your profiling data more digestible and shareable.
Global options#
The ROCm Compute Profiler command line tool has a set of global utility options that are available across all modes.
-v
,--version
Prints the ROCm Compute Profiler version and exits.
-V
,--verbose
Increases output verbosity. Use multiple times for higher levels of verbosity.
-q
,--quiet
Reduces output verbosity and runs quietly.
-s
,--specs
Prints system specs and exits.
Note
ROCm Compute Profiler also recognizes the project variable, ROCPROFCOMPUTE_COLOR
should you
choose to disable colorful output. To disable default colorful behavior, set
this variable to 0
.
Basic operations#
The following table lists ROCm Compute Profiler’s basic operations, their modes, and required arguments.
Operation description |
Mode |
Required arguments |
---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|