Using rocsys
rocsys is a command-line utility used to invoke and control a
profiling session (launch/start/stop/exit) on an application being
traced or profiled. It is especially useful for selective profiling
of applications with long-running workloads (such as DNN training), as it
allows you to profile and control the application while it is running.
You can also launch the session from one terminal and control the
application using rocsys from another terminal.
To see all the rocsys options, run:
rocsys -help
rocsys: launch must be preceded by --session <name>
e.g. rocsys --session <SESSION_NAME> launch <MPI_COMMAND> <MPI_ARGUMENTS> rocprofv2
<ROCPROFV2_OPTIONS> <APP_EXEC>
where all mpiexec options must come before rocsys
rocsys: start must be preceded by --session <name>
rocsys --session <name> start
rocsys: stop must be preceded by --session <name>
rocsys --session <name> stop
rocsys: exit must be preceded by --session <name>
rocsys --session <name> exit
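Every usage message above has the same shape: `--session <name>` first, then one of the four commands. As a minimal sketch, a small shell wrapper can enforce that ordering before calling rocsys. The function name `rocsys_session` and the `ROCSYS_BIN` variable are hypothetical conveniences, not part of rocsys, and the default path assumes the `/opt/rocm/bin` location used in the examples on this page.

```shell
#!/bin/sh
# Hypothetical helper; not shipped with rocsys.
# ROCSYS_BIN is an assumed override; the default matches the path used on this page.
ROCSYS_BIN="${ROCSYS_BIN:-/opt/rocm/bin/rocsys}"

# rocsys_session <session_name> <launch|start|stop|exit> [args...]
rocsys_session() {
    if [ "$#" -lt 2 ]; then
        echo "usage: rocsys_session <session_name> <launch|start|stop|exit> [args...]" >&2
        return 2
    fi
    session_name="$1"
    subcommand="$2"
    shift 2
    # Accept only the four session commands listed in the usage text.
    case "$subcommand" in
        launch|start|stop|exit) ;;
        *)
            echo "rocsys_session: unknown command '$subcommand'" >&2
            return 2
            ;;
    esac
    # --session <name> always precedes the command, as the usage text requires.
    "$ROCSYS_BIN" --session "$session_name" "$subcommand" "$@"
}
```

With this sketch, `rocsys_session session1 start` would run `/opt/rocm/bin/rocsys --session session1 start`, and any arguments after `launch` are passed through unchanged.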
The following session management options are used with rocsys, in the given
order, to achieve selective profiling of a rocprofv2 run:
Launch - Creates a session. After launch, the application halts until the session is started, as shown in step 2 (Start).
/opt/rocm/bin/rocsys --session session1 launch rocprofv2 -i ../samples/input.txt <long_running_app>
ROCSYS:: Session ID: 2109
ROCSYS Session Created!
ROCProfilerV2: Collecting the following counters:
- SQ_WAVES
- GRBM_COUNT
- GRBM_GUI_ACTIVE
- SQ_INSTS_VALU
- FETCH_SIZE
Enabling Counter Collection
Start - Starts the session that was halted after launch, from the same or another terminal, and begins dumping kernel profiling information. The start command also triggers the halted application to run.
/opt/rocm/bin/rocsys --session session1 start
ROCSYS:: Starting Tools Session...
Dispatch_ID(1), GPU_ID(1), ... // All the metrics of a kernel
Dispatch_ID(2), GPU_ID(1), ... // All the metrics of a kernel
Dispatch_ID(3), GPU_ID(1), ... // All the metrics of a kernel
Stop - Stops the session. The information displayed on the terminal is the result of kernel profiling between the current and the previous rocsys command. Note that this command stops only the profiling session; the running application is not affected.
/opt/rocm/bin/rocsys --session session1 stop
ROCSYS:: Stopping Tools Session...
Dispatch_ID(22397), GPU_ID(1), ... // All the metrics of a kernel
Dispatch_ID(22398), GPU_ID(1), ... // All the metrics of a kernel
Dispatch_ID(22399), GPU_ID(1), ... // All the metrics of a kernel
Start (to restart) - Once the session is created, rocsys allows you to start and stop it any number of times. This helps in analyzing kernel profiling information in batches.
/opt/rocm/bin/rocsys --session session1 start
ROCSYS:: Starting Tools Session...
Dispatch_ID(22400), GPU_ID(1), ... // All the metrics of a kernel
Dispatch_ID(22401), GPU_ID(1), ... // All the metrics of a kernel
Exit - Exits the profiling session. Once the session is exited, it cannot be restarted.
/opt/rocm/bin/rocsys --session session1 exit
Dispatch_ID(16828), GPU_ID(1), ... // All the metrics of a kernel
Dispatch_ID(16829), GPU_ID(1), ... // All the metrics of a kernel
ROCSYS:: Exiting Tools Session...Application might still be finishing up..
Note
Exiting the session only stops profiling. The application could
continue running to completion in the background. If you don’t want to
wait for the application to finish, use CTRL+C to stop the
application after exit.
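The start/stop/exit sequence described on this page can also be scripted from a second terminal. The following is a minimal sketch, assuming a session has already been created with `launch` in another terminal. The function `profile_windows`, its arguments, and the `ROCSYS_BIN` variable are hypothetical, and the durations are arbitrary illustrative values.

```shell
#!/bin/sh
# Hypothetical helper; not shipped with rocsys.
# ROCSYS_BIN is an assumed override; the default matches the path used on this page.
ROCSYS_BIN="${ROCSYS_BIN:-/opt/rocm/bin/rocsys}"

# profile_windows <session_name> <window_count> <window_seconds> <gap_seconds>
# Profiles <window_count> windows of <window_seconds> each, leaving the
# application running unprofiled for <gap_seconds> between windows,
# then exits the session.
profile_windows() {
    if [ "$#" -ne 4 ]; then
        echo "usage: profile_windows <session_name> <window_count> <window_seconds> <gap_seconds>" >&2
        return 2
    fi
    session_name="$1"
    window_count="$2"
    window_seconds="$3"
    gap_seconds="$4"
    i=1
    while [ "$i" -le "$window_count" ]; do
        # The first start also releases the application halted at launch.
        "$ROCSYS_BIN" --session "$session_name" start || return 1
        sleep "$window_seconds"
        # stop ends profiling only; the application keeps running.
        "$ROCSYS_BIN" --session "$session_name" stop || return 1
        if [ "$i" -lt "$window_count" ]; then
            sleep "$gap_seconds"
        fi
        i=$((i + 1))
    done
    # After exit, the session cannot be restarted.
    "$ROCSYS_BIN" --session "$session_name" exit
}
```

For example, `profile_windows session1 3 10 60` would request three 10-second profiling windows, one minute apart, and then exit the session. The application itself may still run to completion afterwards, as the note above explains.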