What is Omniperf?
Omniperf is a kernel-level profiling tool for machine learning and high performance computing (HPC) workloads running on AMD Instinct™ accelerators.
AMD Instinct MI-series accelerators are data center-class GPUs designed for compute and have some graphics capabilities disabled or removed. Omniperf primarily targets use with accelerators in the MI300, MI200, and MI100 families. Development is in progress to support Radeon™ (RDNA) GPUs.
Omniperf is built on top of ROCProfiler to monitor hardware performance counters.
High-level design of Omniperf
The architecture of Omniperf consists of three major components shown in the following diagram.
Core Omniperf profiler
Acquires raw performance counters via application replay using rocprof. Counters are stored in comma-separated values (CSV) format for further analysis. The profiler also runs a set of accelerator-specific micro-benchmarks to acquire hierarchical roofline data. The roofline model is not available on pre-MI200 accelerators.
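The roofline idea mentioned above can be illustrated with a minimal sketch. It assumes the standard roofline bound, where attainable performance is the lesser of the compute peak and arithmetic intensity multiplied by memory bandwidth, applied once per memory level. The functions `attainable_gflops` and `hierarchical_roofline` are hypothetical helpers and not part of Omniperf, and the peak and bandwidth values are placeholders rather than measurements from any accelerator.

```python
def attainable_gflops(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Standard roofline bound: min(compute peak, AI * memory bandwidth)."""
    return min(peak_gflops, arithmetic_intensity * bandwidth_gbs)


# Placeholder ceilings for illustration only (not real hardware numbers).
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBS = {"HBM": 100.0, "L2": 400.0, "L1": 800.0}


def hierarchical_roofline(flops, bytes_by_level):
    """Return the attainable GFLOP/s bound at each memory level.

    Each level has its own arithmetic intensity, computed from the
    bytes moved at that level (FLOPs per byte).
    """
    bounds = {}
    for level, nbytes in bytes_by_level.items():
        intensity = flops / nbytes
        bounds[level] = attainable_gflops(
            intensity, PEAK_GFLOPS, BANDWIDTH_GBS[level]
        )
    return bounds


if __name__ == "__main__":
    # Hypothetical kernel: 2000 FLOPs with different traffic per level.
    traffic = {"HBM": 100.0, "L2": 2000.0, "L1": 4000.0}
    for level, bound in hierarchical_roofline(2000.0, traffic).items():
        print(f"{level}: bounded at {bound:.1f} GFLOP/s")
```

In this sketch, a kernel whose bound at a given level equals the compute peak is compute-bound with respect to that level. A lower bound indicates that the bandwidth of that level is the limiting factor.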
Grafana server for Omniperf
Grafana database import: All raw performance counters are imported into a backend MongoDB database to support analysis and visualization in the Grafana GUI. Compatibility with data generated by older Omniperf versions is not guaranteed.
Grafana analysis dashboard GUI: The Grafana dashboard retrieves the raw counter information from the backend database and displays the relevant performance metrics and visualizations.
Omniperf standalone GUI analyzer
Omniperf provides a standalone GUI to enable basic performance analysis without the need to import data into a database instance. Find setup instructions in Setting up a Grafana server for Omniperf.
Omniperf features
Omniperf offers comprehensive profiling based on all available hardware counters for the target accelerator. It delivers advanced performance analysis features, such as system Speed-of-Light (SOL) and hardware block-level SOL evaluations. Additionally, Omniperf provides in-depth memory chart analysis, roofline analysis, baseline comparisons, and more, ensuring a thorough understanding of system performance.
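As a rough illustration of what a Speed-of-Light figure expresses, the sketch below assumes the common definition: an achieved value reported as a percentage of its theoretical peak. The helper names, block names, and numbers are hypothetical and chosen only for the example. They do not reflect how Omniperf computes or labels its metrics.

```python
def speed_of_light(achieved, peak):
    """Return an achieved value as a percentage of its theoretical peak."""
    if peak <= 0:
        raise ValueError("peak must be positive")
    return 100.0 * achieved / peak


def block_level_sol(measurements):
    """Map each block name to its percent-of-peak value.

    `measurements` maps a block name to an (achieved, peak) pair.
    """
    return {
        name: speed_of_light(achieved, peak)
        for name, (achieved, peak) in measurements.items()
    }


if __name__ == "__main__":
    # Hypothetical per-block measurements: (achieved, theoretical peak).
    sample = {
        "L2 cache bandwidth": (250.0, 1000.0),
        "LDS bandwidth": (30.0, 120.0),
    }
    for name, pct in block_level_sol(sample).items():
        print(f"{name}: {pct:.1f}% of peak")
```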
Omniperf supports analysis through either the command line or a GUI. The following list describes Omniperf’s features at a high level.
Support for AMD Instinct MI300, MI200, and MI100 accelerators
GUI analyzer via Grafana and MongoDB
Roofline Analysis panel (supported on MI200 only; Ubuntu 20.04, SLES 15 SP3, or RHEL8)
L1 Address Processing Unit, or Texture Addresser (TA), and L1 Backend Data Processing Unit, or Texture Data (TD), panels
Filtering to reduce profiling time
Filtering by dispatch
Filtering by kernel
Filtering by GPU ID
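The three filtering criteria in the list above (dispatch, kernel, and GPU ID) can be sketched as a selection over counter rows stored in CSV form. This is a conceptual illustration only: the column names `dispatch_id`, `gpu_id`, and `kernel_name`, the sample data, and the helpers `load_rows` and `filter_rows` are all assumptions made for the example. It does not show Omniperf's actual file layout or how Omniperf applies filters during profiling.

```python
import csv
import io

# Hypothetical counter data; column names are assumed for illustration.
SAMPLE_CSV = """dispatch_id,gpu_id,kernel_name,value
0,0,vecadd,10
1,0,matmul,42
2,1,vecadd,11
3,1,matmul,40
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_rows(rows, kernel=None, dispatch_ids=None, gpu_ids=None):
    """Keep rows that match every filter that was supplied.

    kernel:       substring that must appear in the kernel name
    dispatch_ids: set of dispatch indices to keep
    gpu_ids:      set of GPU IDs to keep
    """
    selected = []
    for row in rows:
        if kernel is not None and kernel not in row["kernel_name"]:
            continue
        if dispatch_ids is not None and int(row["dispatch_id"]) not in dispatch_ids:
            continue
        if gpu_ids is not None and int(row["gpu_id"]) not in gpu_ids:
            continue
        selected.append(row)
    return selected


if __name__ == "__main__":
    rows = load_rows(SAMPLE_CSV)
    # Keep only "vecadd" dispatches that ran on GPU 1.
    for row in filter_rows(rows, kernel="vecadd", gpu_ids={1}):
        print(row["dispatch_id"], row["kernel_name"], row["value"])
```

Narrowing the selection in this way reduces the amount of data to collect or inspect, which is the motivation the feature list gives for filtering.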