Profiling Python scripts

ROCm Systems Profiler supports profiling Python code at the source level and the script level. Python support is enabled via the ROCPROFSYS_USE_PYTHON and ROCPROFSYS_PYTHON_VERSIONS="<MAJOR>.<MINOR>" CMake options. Alternatively, to build multiple Python versions, use ROCPROFSYS_PYTHON_VERSIONS="<MAJOR>.<MINOR>;[<MAJOR>.<MINOR>]" and ROCPROFSYS_PYTHON_ROOT_DIRS="/path/to/version;[/path/to/version]" instead of ROCPROFSYS_PYTHON_VERSION. When building multiple Python versions, the ROCPROFSYS_PYTHON_VERSIONS and ROCPROFSYS_PYTHON_ROOT_DIRS lists must have the same length.

Note

When using ROCm Systems Profiler with Python programs, the major and minor version of the Python interpreter (for example, 3.7) must match the major and minor version of the interpreter used when compiling the Python bindings. Building ROCm Systems Profiler generates the shared object file libpyrocprofsys.<IMPL>-<VERSION>-<ARCH>-<OS>-<ABI>.so, where IMPL is the Python implementation, VERSION is the major and minor version, ARCH is the architecture, OS is the operating system, and ABI is the application binary interface. An example file name is libpyrocprofsys.cpython-38-x86_64-linux-gnu.so.
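The suffix in that file name follows the extension-module naming that the interpreter itself reports, so one way to see which bindings a given interpreter needs is to ask it. The following is a minimal sketch using only the standard library. It assumes the bindings file is named libpyrocprofsys followed by the interpreter's extension suffix, as in the example file name, and the helper name expected_bindings_filename is hypothetical.

```python
import sys
import sysconfig
from importlib.machinery import EXTENSION_SUFFIXES


def expected_bindings_filename(prefix="libpyrocprofsys"):
    """Return the bindings file name matching the running interpreter.

    Assumption: the file is named <prefix><EXT_SUFFIX>, which is the
    pattern of libpyrocprofsys.cpython-38-x86_64-linux-gnu.so.
    """
    # EXT_SUFFIX encodes implementation, version, architecture, OS and ABI;
    # fall back to the first known extension suffix if it is unset.
    suffix = sysconfig.get_config_var("EXT_SUFFIX") or EXTENSION_SUFFIXES[0]
    return prefix + suffix


print("interpreter version:", "{}.{}".format(*sys.version_info[:2]))
print("expected bindings  :", expected_bindings_filename())
```

If the printed name does not match any libpyrocprofsys file in the installation, the interpreter and the bindings were most likely built for different Python versions.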

Getting started

The ROCm Systems Profiler Python package is installed in lib/pythonX.Y/site-packages/rocprofsys. To ensure the Python interpreter can find the ROCm Systems Profiler package, add this path to the PYTHONPATH environment variable, as in the following example:

export PYTHONPATH=/opt/rocprofiler-systems/lib/python3.8/site-packages:${PYTHONPATH}

Both the share/rocprofiler-systems/setup-env.sh script and the module file in share/modulefiles/rocprofiler-systems automatically prepend this path to the PYTHONPATH environment variable.
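To confirm that the interpreter can actually see the package after PYTHONPATH is set, the import machinery can be queried without importing anything. This is a minimal sketch using only the standard library; the helper name locate_package is hypothetical.

```python
import importlib.util
import sys


def locate_package(name="rocprofsys"):
    """Return the path the interpreter would import `name` from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Packages expose their directory through submodule_search_locations;
    # plain modules only have an origin.
    locations = list(spec.submodule_search_locations or [])
    return locations[0] if locations else spec.origin


location = locate_package()
if location is None:
    print("rocprofsys not found; check PYTHONPATH for Python "
          "{}.{}".format(*sys.version_info[:2]))
else:
    print("rocprofsys will be imported from", location)
```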

Running ROCm Systems Profiler on a Python script

ROCm Systems Profiler provides a rocprof-sys-python helper Bash script, which ensures that PYTHONPATH is properly set and that the correct Python interpreter is used. This means the following commands are effectively equivalent:

rocprof-sys-python --help

and

export PYTHONPATH=/opt/rocprofiler-systems/lib/python3.8/site-packages:${PYTHONPATH}
python3.8 -m rocprofsys --help

Note

rocprof-sys-python and python -m rocprofsys use the same command-line syntax as the other rocprof-sys executables (rocprof-sys-python <ROCPROFSYS_ARGS> -- <SCRIPT> <SCRIPT_ARGS>) and have similar options.
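When launching the profiler from another Python program, the same argument order applies: profiler options first, then the -- separator, then the script and its own arguments. The sketch below only assembles the argument list and does not run anything; the helper name build_command is hypothetical.

```python
import shlex


def build_command(profiler_args, script, script_args=(),
                  launcher="rocprof-sys-python"):
    """Assemble <launcher> <ROCPROFSYS_ARGS> -- <SCRIPT> <SCRIPT_ARGS>."""
    # Everything before "--" is for the profiler; everything after it
    # is the script to profile followed by the script's own arguments.
    return [launcher, *profiler_args, "--", script, *script_args]


cmd = build_command(["-b"], "./example.py")
print(shlex.join(cmd))  # prints: rocprof-sys-python -b -- ./example.py
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a system where the launcher is on PATH.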

Command line options

Use rocprof-sys-python --help to view the available options:

usage: rocprof-sys [-h] [-v VERBOSITY] [-b] [-c FILE] [-s FILE] [-F [BOOL]] [--label [{args,file,line} [{args,file,line} ...]]] [-I FUNC [FUNC ...]] [-E FUNC [FUNC ...]] [-R FUNC [FUNC ...]] [-MI FILE [FILE ...]] [-ME FILE [FILE ...]] [-MR FILE [FILE ...]] [--trace-c [BOOL]]

optional arguments:
-h, --help            show this help message and exit
-v VERBOSITY, --verbosity VERBOSITY
                        Logging verbosity
-b, --builtin         Put 'profile' in the builtins. Use '@profile' to decorate a single function, or 'with profile:' to profile a single section of code.
-c FILE, --config FILE
                        ROCm Systems Profiler configuration file
-s FILE, --setup FILE
                        Code to execute before the code to profile
-F [BOOL], --full-filepath [BOOL]
                        Encode the full function filename (instead of basename)
--label [{args,file,line} [{args,file,line} ...]]
                        Encode the function arguments, filename, and/or line number into the profiling function label
-I FUNC [FUNC ...], --function-include FUNC [FUNC ...]
                        Include any entries with these function names
-E FUNC [FUNC ...], --function-exclude FUNC [FUNC ...]
                        Filter out any entries with these function names
-R FUNC [FUNC ...], --function-restrict FUNC [FUNC ...]
                        Select only entries with these function names
-MI FILE [FILE ...], --module-include FILE [FILE ...]
                        Include any entries from these files
-ME FILE [FILE ...], --module-exclude FILE [FILE ...]
                        Filter out any entries from these files
-MR FILE [FILE ...], --module-restrict FILE [FILE ...]
                        Select only entries from these files
--trace-c [BOOL]      Enable profiling C functions

usage: python3 -m rocprofsys <ROCPROFSYS_ARGS> -- <SCRIPT> <SCRIPT_ARGS>

Note

The --trace-c option does not incorporate ROCm Systems Profiler’s dynamic instrumentation support. It only enables profiling the underlying C function call within the Python interpreter.

Selective instrumentation

Similar to the rocprof-sys-instrument executable, command-line options exist for restricting, including, and excluding certain functions and modules, for example, --function-exclude "^__init__$". Alternatively, add the @profile decorator to the primary function of interest in your program and use the -b / --builtin command-line option to narrow the scope of the instrumentation to this function and its children.
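The anchors in "^__init__$" suggest that the filter values are regular expressions. Under that assumption, the following sketch shows which names such a pattern would exclude. It uses Python's re module on a made-up list of names and does not call the profiler; how the profiler applies the pattern internally is not stated here, so treating it as a search is also an assumption.

```python
import re

# The pattern from the --function-exclude example.
pattern = re.compile(r"^__init__$")

# Hypothetical function names, for illustration only.
names = ["__init__", "__init__subclass", "my__init__", "run"]

# Assumption: a name is excluded when the pattern is found in it.
kept = [name for name in names if not pattern.search(name)]
print(kept)  # prints: ['__init__subclass', 'my__init__', 'run']
```

Because of the ^ and $ anchors, only the exact name __init__ is filtered out. Without them, every name containing __init__ would match.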

Consider the following Python code (example.py):

import sys

def fib(n):
   return n if n < 2 else (fib(n - 1) + fib(n - 2))


def inefficient(n):
   a = 0
   for i in range(n):
      a += i
      for j in range(n):
         a += j
   return a


def run(n):
   return fib(n) + inefficient(n)


if __name__ == "__main__":
   run(20)

Running rocprof-sys-python ./example.py with ROCPROFSYS_PROFILE=ON and ROCPROFSYS_TIMEMORY_COMPONENTS=trip_count produces the following:

|-------------------------------------------------------------------------------------------|
|                                COUNTS NUMBER OF INVOCATIONS                               |
|-------------------------------------------------------------------------------------------|
|                      LABEL                        | COUNT  | DEPTH  |   METRIC   |  SUM   |
|---------------------------------------------------|--------|--------|------------|--------|
| |0>>> run                                         |      1 |      0 | trip_count |      1 |
| |0>>> |_fib                                       |      1 |      1 | trip_count |      1 |
| |0>>>   |_fib                                     |      2 |      2 | trip_count |      2 |
| |0>>>     |_fib                                   |      4 |      3 | trip_count |      4 |
| |0>>>       |_fib                                 |      8 |      4 | trip_count |      8 |
| |0>>>         |_fib                               |     16 |      5 | trip_count |     16 |
| |0>>>           |_fib                             |     32 |      6 | trip_count |     32 |
| |0>>>             |_fib                           |     64 |      7 | trip_count |     64 |
| |0>>>               |_fib                         |    128 |      8 | trip_count |    128 |
| |0>>>                 |_fib                       |    256 |      9 | trip_count |    256 |
| |0>>>                   |_fib                     |    512 |     10 | trip_count |    512 |
| |0>>>                     |_fib                   |   1024 |     11 | trip_count |   1024 |
| |0>>>                       |_fib                 |   2026 |     12 | trip_count |   2026 |
| |0>>>                         |_fib               |   3632 |     13 | trip_count |   3632 |
| |0>>>                           |_fib             |   5020 |     14 | trip_count |   5020 |
| |0>>>                             |_fib           |   4760 |     15 | trip_count |   4760 |
| |0>>>                               |_fib         |   2942 |     16 | trip_count |   2942 |
| |0>>>                                 |_fib       |   1152 |     17 | trip_count |   1152 |
| |0>>>                                   |_fib     |    274 |     18 | trip_count |    274 |
| |0>>>                                     |_fib   |     36 |     19 | trip_count |     36 |
| |0>>>                                       |_fib |      2 |     20 | trip_count |      2 |
| |0>>> |_inefficient                               |      1 |      1 | trip_count |      1 |
|-------------------------------------------------------------------------------------------|
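The COUNT column for the nested fib entries can be cross-checked without the profiler by counting how many calls occur at each recursion depth. This is a plain-Python sketch, not profiler output; the helper name fib_calls_per_depth is hypothetical, and depth 1 is used for the first fib call to match the DEPTH column, where run is at depth 0.

```python
from collections import Counter


def fib_calls_per_depth(n, depth=1, counts=None):
    """Count fib invocations per call depth for the recursive fib(n)."""
    if counts is None:
        counts = Counter()
    counts[depth] += 1
    if n >= 2:
        # Mirrors fib(n - 1) + fib(n - 2) in example.py.
        fib_calls_per_depth(n - 1, depth + 1, counts)
        fib_calls_per_depth(n - 2, depth + 1, counts)
    return counts


counts = fib_calls_per_depth(20)
print(counts[11])            # prints: 1024
print(counts[12])            # prints: 2026
print(sum(counts.values()))  # prints: 21891
```

The counts double at each level until the shorter branches start reaching fib(1) and fib(0), which is why the values stop being powers of two from depth 12 onward.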

If the inefficient function is decorated with @profile as follows:

@profile
def inefficient(n):
   # ...

and the script is then run with the command rocprof-sys-python -b -- ./example.py, ROCm Systems Profiler produces this output:

|-----------------------------------------------------------|
|                COUNTS NUMBER OF INVOCATIONS               |
|-----------------------------------------------------------|
|      LABEL        | COUNT  | DEPTH  |   METRIC   |  SUM   |
|-------------------|--------|--------|------------|--------|
| |0>>> inefficient |      1 |      0 | trip_count |      1 |
|-----------------------------------------------------------|
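A script decorated with a bare @profile raises a NameError when it is run without -b, because the name only exists once it has been placed in the builtins. One common workaround is to define a no-op fallback. The sketch below assumes that -b / --builtin makes profile available through the builtins module, as the option's help text describes; the fallback covers only the decorator form, not with profile:.

```python
import builtins

if not hasattr(builtins, "profile"):
    # Fallback used when the script is run without -b / --builtin:
    # a decorator that returns the function unchanged.
    def profile(func):
        return func


@profile
def inefficient(n):
    a = 0
    for i in range(n):
        a += i
        for j in range(n):
            a += j
    return a


print(inefficient(20))  # prints: 3990
```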

ROCm Systems Profiler Python source instrumentation

Starting with the unmodified example.py script above, import the rocprofsys module:

import sys
import rocprofsys  # provides rocprofsys.profile()

def fib(n):
   # ... etc. ...

Next, add @rocprofsys.profile() to the run function:

@rocprofsys.profile()
def run(n):
   # ...

Alternatively, use rocprofsys.profile() as a context manager around run(20):

if __name__ == "__main__":
   with rocprofsys.profile():
      run(20)

The results for both source-level instrumentation modes are identical to the original rocprof-sys-python ./example.py results:

|-------------------------------------------------------------------------------------------|
|                                COUNTS NUMBER OF INVOCATIONS                               |
|-------------------------------------------------------------------------------------------|
|                      LABEL                        | COUNT  | DEPTH  |   METRIC   |  SUM   |
|---------------------------------------------------|--------|--------|------------|--------|
| |0>>> run                                         |      1 |      0 | trip_count |      1 |
| |0>>> |_fib                                       |      1 |      1 | trip_count |      1 |
| |0>>>   |_fib                                     |      2 |      2 | trip_count |      2 |
| |0>>>     |_fib                                   |      4 |      3 | trip_count |      4 |
| |0>>>       |_fib                                 |      8 |      4 | trip_count |      8 |
| |0>>>         |_fib                               |     16 |      5 | trip_count |     16 |
| |0>>>           |_fib                             |     32 |      6 | trip_count |     32 |
| |0>>>             |_fib                           |     64 |      7 | trip_count |     64 |
| |0>>>               |_fib                         |    128 |      8 | trip_count |    128 |
| |0>>>                 |_fib                       |    256 |      9 | trip_count |    256 |
| |0>>>                   |_fib                     |    512 |     10 | trip_count |    512 |
| |0>>>                     |_fib                   |   1024 |     11 | trip_count |   1024 |
| |0>>>                       |_fib                 |   2026 |     12 | trip_count |   2026 |
| |0>>>                         |_fib               |   3632 |     13 | trip_count |   3632 |
| |0>>>                           |_fib             |   5020 |     14 | trip_count |   5020 |
| |0>>>                             |_fib           |   4760 |     15 | trip_count |   4760 |
| |0>>>                               |_fib         |   2942 |     16 | trip_count |   2942 |
| |0>>>                                 |_fib       |   1152 |     17 | trip_count |   1152 |
| |0>>>                                   |_fib     |    274 |     18 | trip_count |    274 |
| |0>>>                                     |_fib   |     36 |     19 | trip_count |     36 |
| |0>>>                                       |_fib |      2 |     20 | trip_count |      2 |
| |0>>> |_inefficient                               |      1 |      1 | trip_count |      1 |
|-------------------------------------------------------------------------------------------|
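Source-level instrumentation makes the script depend on the rocprofsys package being importable. Where the same script must also run on machines without the package, a guarded import with a no-op stand-in is one option. This is a minimal sketch, not part of ROCm Systems Profiler: the stand-in class is hypothetical and only mimics the two usages shown above, @profile() and with profile():.

```python
import contextlib

try:
    # Real profiler, when the package and matching bindings are importable.
    from rocprofsys import profile
except ImportError:
    class profile(contextlib.ContextDecorator):
        """No-op stand-in usable as @profile() and as `with profile():`."""

        def __enter__(self):
            return self

        def __exit__(self, *exc_info):
            return False  # never swallow exceptions


def fib(n):
    return n if n < 2 else (fib(n - 1) + fib(n - 2))


@profile()
def run(n):
    return fib(n)


decorated = run(10)      # decorator form

with profile():          # context-manager form
    managed = fib(10)

print(decorated, managed)
```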

Note

When rocprof-sys-python is used without the -b / --builtin option, the profiling results can be cluttered by the numerous functions called when more complex modules are imported, for example, by import numpy.

ROCm Systems Profiler Python source instrumentation configuration

Within the Python source code, the profiler can be configured by directly modifying the rocprofsys.profiler.config data fields.

import sys

def fib(n):
   return n if n < 2 else (fib(n - 1) + fib(n - 2))


def inefficient(n):
   a = 0
   for i in range(n):
      a += i
      for j in range(n):
         a += j
   return a


def run(n):
   return fib(n) + inefficient(n)


if __name__ == "__main__":
   from rocprofsys.profiler import config
   from rocprofsys import profile

   config.include_args = True
   config.include_filename = False
   config.include_line = False
   config.restrict_functions += ["fib", "run"]

   with profile():
      run(5)

Executing this script produces the following:

|------------------------------------------------------------------|
|                   COUNTS NUMBER OF INVOCATIONS                   |
|------------------------------------------------------------------|
|          LABEL           | COUNT  | DEPTH  |   METRIC   |  SUM   |
|--------------------------|--------|--------|------------|--------|
| |0>>> run(n=5)           |      1 |      0 | trip_count |      1 |
| |0>>> |_fib(n=5)         |      1 |      1 | trip_count |      1 |
| |0>>>   |_fib(n=4)       |      1 |      2 | trip_count |      1 |
| |0>>>     |_fib(n=3)     |      1 |      3 | trip_count |      1 |
| |0>>>       |_fib(n=2)   |      1 |      4 | trip_count |      1 |
| |0>>>         |_fib(n=1) |      1 |      5 | trip_count |      1 |
| |0>>>         |_fib(n=0) |      1 |      5 | trip_count |      1 |
| |0>>>       |_fib(n=1)   |      1 |      4 | trip_count |      1 |
| |0>>>     |_fib(n=2)     |      1 |      3 | trip_count |      1 |
| |0>>>       |_fib(n=1)   |      1 |      4 | trip_count |      1 |
| |0>>>       |_fib(n=0)   |      1 |      4 | trip_count |      1 |
| |0>>>   |_fib(n=3)       |      1 |      2 | trip_count |      1 |
| |0>>>     |_fib(n=2)     |      1 |      3 | trip_count |      1 |
| |0>>>       |_fib(n=1)   |      1 |      4 | trip_count |      1 |
| |0>>>       |_fib(n=0)   |      1 |      4 | trip_count |      1 |
| |0>>>     |_fib(n=1)     |      1 |      3 | trip_count |      1 |
|------------------------------------------------------------------|
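With the arguments encoded in the label, each call appears as its own row with a count of 1, and the rows follow the order in which the calls are made. The sketch below reproduces that order and the DEPTH values for fib(5) in plain Python, without the profiler; the generator name call_tree is hypothetical, and depth 1 corresponds to the first fib call below run.

```python
def call_tree(n, depth=1):
    """Yield (depth, label) for every fib call, in call order."""
    yield depth, "fib(n={})".format(n)
    if n >= 2:
        # Same recursion as fib in the script: fib(n - 1), then fib(n - 2).
        yield from call_tree(n - 1, depth + 1)
        yield from call_tree(n - 2, depth + 1)


rows = list(call_tree(5))
for depth, label in rows:
    # Indent by depth to mimic the nesting in the LABEL column.
    print("{}|_{}  (depth {})".format("  " * (depth - 1), label, depth))

print(len(rows))  # prints: 15
```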