hipBLASLt environment variables#
This section describes the important hipBLASLt environment variables, which are grouped by functionality.
Logging and debugging#
The logging and debugging environment variables for hipBLASLt are collected in the following table. For more information, see Use logging and heuristics.
| Environment variable | Value | 
|---|---|
| HIPBLASLT_LOG_LEVELControls the verbosity level of hipBLASLt logging output. | 0: Off (logging disabled, default) 1: Error (only errors are logged) 2: Trace (API calls with kernel launches log parameters) 3: Hints (performance improvement suggestions) 4: Info (general library execution information) 5: API trace (detailed API call parameters) | 
| HIPBLASLT_LOG_MASKControls logging output using bit mask flags (can be combined). | 0: Off 1: Error 2: Trace 4: Hints 8: Info 16: API trace 32: Bench 64: Profile 128: Extended profile | 
| HIPBLASLT_LOG_FILESpecifies path to logging file. Can contain  %ifor process ID replacement. | Path to log file (for example,  logfile_%i.log)If not defined: log messages printed to stdout | 
| HIPBLASLT_ENABLE_MARKEREnables marker trace for ROCProfiler profiling. | 0 or unset: Disable marker trace 1: Enable marker trace | 
Offline tuning#
The offline tuning environment variables for hipBLASLt are collected in the following table. For more information, see Use hipBLASLt offline tuning.
| Environment variable | Value | 
|---|---|
| HIPBLASLT_TUNING_FILESpecifies file to store tuning results with best solution indices for GEMM problems. | Path to tuning file (for example,  tuning.txt)File stores optimal kernel indices for reuse | 
| HIPBLASLT_TUNING_OVERRIDE_FILESpecifies file to load tuning results and override default kernel selection. | Path to tuning file (for example,  tuning.txt)Loads previously saved optimal kernel choices | 
| HIPBLASLT_TUNING_USER_MAX_WORKSPACESets maximum workspace size constraint during tuning stage. | Integer value in bytes (default: 128 * 1024 * 1024) Limits workspace size for solution selection | 
Stream-K configuration#
The Stream-K configuration environment variables for hipBLASLt are collected in the following table. For more information, see Use Stream-K with hipBLASLt.
| Environment variable | Value | 
|---|---|
| TENSILE_SOLUTION_SELECTION_METHODControls hipBLASLt kernel selection strategy for GEMM operations. | 0: Default (standard tuned libraries, no Stream-K) 2: Stream-K (enables Stream-K library for consistent performance) This variable has no effect on the AMD Instinctâ„¢ MI350 series. Stream-K is always used. | 
| TENSILE_STREAMK_DYNAMIC_GRIDControls Stream-K dynamic grid size selection behavior. | 0: Disable dynamic grid (use all available compute units) 3: Default (automatically pick optimal workgroup count) | 
| TENSILE_STREAMK_FIXED_GRIDOverrides default grid size with specified number of workgroups for Stream-K kernels. | Integer value specifying number of workgroups Example: 64 (limits GEMM kernels to 64 workgroups) | 
| TENSILE_STREAMK_MAX_CUSSets maximum number of compute units for Stream-K kernels. | Integer value specifying maximum compute units Example: 32 (limits GEMM kernels to 32 compute units) Default: All available compute units | 
Type overrides#
Overrides for specific types.
| Environment variable | Value | 
|---|---|
| HIPBLASLT_OVERRIDE_COMPUTE_TYPE_XF32Overrides the compute type used for GEMMs which specify a compute type of  XF32. | -1: Off (Default) 0: F32 1: XF32(eg TF32) 2: F32_BF16 |