Installation
Install ROCm
To begin, install ROCm for your platform. For installation instructions, refer to the Linux or Windows installation guide.
Tip
If using Bash, we recommend setting PATH=/opt/rocm/bin/:$PATH in your ~/.bashrc and refreshing your shell using source ~/.bashrc.
Alternatively, export the path for your current shell session only, using export PATH=/opt/rocm/bin/:$PATH.
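If ~/.bashrc is sourced more than once, a plain assignment prepends the same directory repeatedly. The sketch below shows one way to avoid that by adding the directory only when it is missing. The helper name prepend_path is hypothetical and not part of ROCm.

```shell
# Prepend a directory to PATH only if it is not already present.
# prepend_path is a hypothetical helper name, not part of ROCm.
prepend_path() {
    local dir="${1%/}"    # drop one trailing slash before comparing
    case ":${PATH}:" in
        *":${dir}:"*|*":${dir}/:"*) ;;    # already on PATH; nothing to do
        *) PATH="${dir}:${PATH}" ;;
    esac
}

prepend_path /opt/rocm/bin
export PATH
```

Placing these lines in ~/.bashrc has the same effect as the assignment in the tip, without growing PATH on every reload.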
Install OS dependencies
Note
The steps below are for Ubuntu. For other distributions, use the appropriate package manager.
Install dependencies:
apt-get install libyaml python3-yaml \
    libomp-dev libboost-program-options-dev libboost-filesystem-dev
Install one of the following, depending on your preferred Tensile data format. If both are installed, msgpack is preferred:
apt-get install libmsgpack-dev # If using the msgpack backend
# OR
apt-get install libtinfo-dev # If using the YAML backend
Install build tools. For additional installation methods for the latest versions of CMake, see the CMake installation page.
apt-get install build-essential cmake
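To confirm that the build tools are available before continuing, a quick check along these lines may help. It is a minimal sketch: missing_tools is a hypothetical helper, and the tool list assumes that build-essential supplies gcc, g++, and make, as it does on Ubuntu.

```shell
# Print every command from the argument list that is not found on PATH.
# missing_tools is a hypothetical helper name used only for this sketch.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# build-essential provides gcc, g++ and make; cmake comes from its own package.
missing="$(missing_tools gcc g++ make cmake)"
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing build tools:" $missing
else
    echo "All build tools found."
fi
```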
Install Tensile from source
To install Tensile from source, it is recommended to create a virtual environment first:
python3 -m venv .venv
source .venv/bin/activate
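Before installing, you may want to confirm that the environment is actually active in the current shell. The sketch below relies on the VIRTUAL_ENV variable that the activate script exports. The helper name in_venv is hypothetical.

```shell
# Return success only when a virtual environment is active.
# The activate script exports VIRTUAL_ENV; in_venv is a hypothetical helper.
in_venv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_venv; then
    echo "Using virtual environment: $VIRTUAL_ENV"
else
    echo "No virtual environment active; run: source .venv/bin/activate"
fi
```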
Then, you can install Tensile using pip or git.
Option 1: Install with pip
pip3 install git+https://github.com/ROCmSoftwarePlatform/Tensile.git@develop
Option 2: Install with git
git clone git@github.com:ROCm/Tensile.git && cd Tensile
pip3 install .
You can now run Tensile’s Python applications.
Running benchmark
To run a benchmark, pass a tuning config to the Tensile program located in Tensile/bin.
For demonstration purposes, we use the sample tuning file available in Tensile/Configs/rocblas_sgemm_example.yaml.
The sample tuning file allows you to specify the target architecture for which the benchmark will generate a library.
To find your device architecture, run:
rocminfo | grep gfx
Specify the device architecture in the sample tuning file using the ArchitectureName: key. Depending on your device, set, for example, ArchitectureName: "gfx90a" or ArchitectureName: "gfx1030".
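The two steps above (finding the architecture and writing it into the tuning file) can be scripted. The following is a minimal sketch under stated assumptions: first_gfx_arch and set_arch are hypothetical helper names, my_config.yaml is an illustrative file name, the ArchitectureName: key is assumed to sit on its own line, and GNU sed is assumed for in-place editing.

```shell
# Extract the first gfx architecture name from rocminfo-style text on stdin.
# first_gfx_arch and set_arch are hypothetical helper names.
first_gfx_arch() {
    grep -o 'gfx[0-9a-f][0-9a-f]*' | head -n 1
}

# Rewrite the value of every ArchitectureName: key in a YAML file, in place.
# Assumes the key sits on its own line and that GNU sed is available.
# $1 is the architecture name, $2 is the path to the YAML file.
set_arch() {
    sed -i "s/^\([[:space:]]*ArchitectureName:\).*/\1 \"$1\"/" "$2"
}

# Usage (commented out because it needs a ROCm device and a Tensile checkout):
#   arch="$(rocminfo | first_gfx_arch)"
#   cp Tensile/Configs/rocblas_sgemm_example.yaml my_config.yaml
#   set_arch "$arch" my_config.yaml
```

Working on a copy keeps the sample file in the repository unchanged. On systems with more than one GPU, the first match may not be the device you intend to tune for, so check the rocminfo output by hand in that case.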
You can now run benchmarks using Tensile. From the top-level directory, run:
mkdir build && cd build
../Tensile/bin/Tensile ../Tensile/Configs/rocblas_sgemm_example.yaml ./
After the benchmark completes, Tensile creates the following directories:
0_Build: Contains a client executable. Use this to launch Tensile from a library viewpoint.
1_BenchmarkProblems: Contains all the problem descriptions and executables generated during benchmarking. Use the run.sh script to reproduce results.
2_BenchmarkData: Contains the raw performance results of all kernels in CSV and YAML formats.
3_LibraryLogic: Contains the winning (optimal) kernel configurations in YAML format. Typically, rocBLAS takes the YAML files from this folder.
4_LibraryClient: Contains the code objects, kernels, and library code. This is the output of running TensileCreateLibrary using the 3_LibraryLogic directory as an input.
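To check that a run produced all of the directories listed above, a small script such as the following could be used. It is a sketch only: check_outputs is a hypothetical helper, and it tests for the presence of the directories without inspecting their contents.

```shell
# Report which of the expected Tensile output directories exist under a build dir.
# check_outputs is a hypothetical helper; directory names come from the list above.
# Returns 0 if all five directories exist, 1 otherwise.
check_outputs() {
    build_dir="${1:-.}"
    status=0
    for d in 0_Build 1_BenchmarkProblems 2_BenchmarkData 3_LibraryLogic 4_LibraryClient; do
        if [ -d "$build_dir/$d" ]; then
            echo "found:   $d"
        else
            echo "missing: $d"
            status=1
        fi
    done
    return "$status"
}

# Usage, from the top-level directory after the benchmark completes:
#   check_outputs ./build
```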