Enabling logging in rocFFT#
rocFFT can write a variety of log messages to aid troubleshooting. Here are the different logs that rocFFT supports.
Trace logging: Logs the library entry points (for example,
rocfft_plan_create
orrocfft_execute
) and their parameter values when they are called. Error messages during plan creation and execution are also logged here.Benchmark logging: Logs the
rocfft-bench
command line when a plan is created. You can use this command to rerun the same transform later.Profile logging: Logs a message for each kernel launched during plan execution. This message contains the following elements:
Kernel duration
The size of the user data buffers seen by the kernel
Estimates for the observed memory bandwidth and bandwidth efficiency
Note
To provide the kernel duration, rocFFT must use
hipEvents
and wait for each kernel to complete. This might interfere with time measurement at higher levels, for example, forrocfft-bench
.Plan logging: Logs the plan details when a transform is executed, including the following:
Each TreeNode in the plan
The work buffer size required by the plan
The kernel grid and block dimensions
The kernel maximum occupancy (estimated by HIP)
Kernel I/O logging: Logs the kernel details during plan execution, including the input to each kernel (the data provided by the user) and the final output of the transform.
Note
The amount of data logged can become very large, particularly for 2D and 3D transforms, so logging it to a file instead of stderr is usually a good idea. See the next section for more details.
Writing the data involves extra
hipMemcpy
operations and serializing the data to the log can also take a significant amount of time. Both of these factors affect performance.Runtime compilation logging: Logs details about runtime compilation during plan creation, including the following:
The source code
Messages indicating a kernel was found in a cache, and did not need to be compiled at runtime
Compilation errors (if any)
Duration measurements indicating the time it took to generate source code for the kernel and compile the kernel
The source code for the kernels is delimited by lines containing the strings
ROCFFT_RTC_BEGIN
andROCFFT_RTC_END
. This lets you isolate the source code for each kernel if a single log contains code for multiple kernels.Note
All non-code messages (except for compile errors) are written as C++ comments, so you can pass the whole file to clang-format to inspect the source code.
The source code details for the runtime compilation can be very large, so consider writing this log to a file instead of stderr.
Tuning logging: Logs details about any kernels that are tried and rejected while tuning is running. It also logs messages when tuned solutions are used during plan building.
Graph logging: Logs the graph of subplans during multi-GPU or multi-process plan execution. Subplans include FFT plans, transpose plans (to reshape data for communication), and communication steps. This is written as Graphviz data. The view of the global graph might be slightly different from different nodes. This is because the current node has more visibility into subplans that run locally than those that run on other nodes.
Configuring the logging output#
The logging output can be controlled using the ROCFFT_LAYER
environment variable.
ROCFFT_LAYER
is a numerical bitmask, where zero or more bits can be set to enable one or more logging layers.
The log output is written to stderr by default.
The following table maps the different logging layers to a ROCFFT_LAYER
bit field value.
To determine what value to set for ROCFFT_LAYER
, add up the values of all the layers you want to see.
For example, to see the output for trace, profile, and plan logging, set ROCFFT_LAYER
to 13
(1
+ 4
+ 8
).
Log type |
ROCFFT_LAYER bit field value |
---|---|
Trace logging |
1 |
Benchmark logging |
2 |
Profile logging |
4 |
Plan logging |
8 |
Kernel I/O logging |
16 |
Runtime compilation logging |
32 |
Tuning logging |
64 |
Graph logging |
128 |
Logging to a file#
By default, messages are written to stderr, but they can be redirected to
output files using the environment variables described in this section.
Each type of log can be redirected separately using a unique environment variable.
The corresponding log must be enabled using the ROCFFT_LAYER
variable
before any details can be logged to the file.
For example, to redirect the trace log to a file, trace logging must
also be enabled in the ROCFFT_LAYER
bit field.
Note
Some log types, such as kernel I/O logging and runtime compilation logging, can generate a large number of log entries, so redirecting their output to a file is recommended.
The following table lists the environment variable to redirect logging for each
log type. Set this variable to a valid file path to redirect the output of the corresponding log type.
For example, to send the trace logging output to a file, enable the trace log, then set the
ROCFFT_LOG_TRACE_PATH
variable to the name of the destination file.
Log type |
File redirection variable |
---|---|
Trace logging |
|
Benchmark logging |
|
Profile logging |
|
Plan logging |
|
Kernel I/O logging |
|
Runtime compilation logging |
|
Tuning logging |
|
Graph logging |
|