HIP environment variables
This section lists the most important HIP environment variables on AMD platforms, grouped by functionality.
GPU isolation variables
Restricting the access of applications to a subset of GPUs, also known as GPU isolation, allows users to hide GPU resources from programs. The GPU isolation environment variables in HIP are collected in the following table.
| Environment variable | Links | Value |
|---|---|---|
| `ROCR_VISIBLE_DEVICES`: A list of device indices or UUIDs that will be exposed to applications. | Example: `0,GPU-4b2c1a9f-8d3e-6f7a-b5c9-2e4d8a1f6c3b` | |
| `GPU_DEVICE_ORDINAL`: Device indices exposed to OpenCL and HIP applications. | Example: | |
| `HIP_VISIBLE_DEVICES` or `CUDA_VISIBLE_DEVICES`: Device indices exposed to HIP applications. | Example: | |
Recommendation
On Linux, use `ROCR_VISIBLE_DEVICES`. On Windows, use `HIP_VISIBLE_DEVICES`. For portability across different vendors, use `CUDA_VISIBLE_DEVICES`.
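The recommendation can be sketched as a small launcher helper. This is a minimal illustration in Python, not part of HIP: the function name `isolation_env` and its `portable` flag are hypothetical, and treating every non-Windows platform as Linux is an assumption made for brevity.

```python
import os
import sys


def isolation_env(device_indices, platform=sys.platform, portable=False):
    """Return a copy of the current environment with one GPU isolation variable set.

    Follows the recommendation: ROCR_VISIBLE_DEVICES on Linux,
    HIP_VISIBLE_DEVICES on Windows, CUDA_VISIBLE_DEVICES for portability.
    """
    if portable:
        name = "CUDA_VISIBLE_DEVICES"
    elif platform.startswith("win"):
        name = "HIP_VISIBLE_DEVICES"
    else:
        # Assumption: any non-Windows platform is treated as Linux here.
        name = "ROCR_VISIBLE_DEVICES"
    env = dict(os.environ)  # copy, so the parent process is not modified
    env[name] = ",".join(str(i) for i in device_indices)
    return env


# Example: expose only devices 0 and 2 to a child process on Linux.
env = isolation_env([0, 2], platform="linux")
print(env["ROCR_VISIBLE_DEVICES"])  # prints 0,2
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when launching a HIP application, so only the child process sees the restricted device list.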
Profiling variables
The profiling environment variables in HIP are collected in the following table. For more information, see the Setting the number of CUs page.
| Environment variable | Value |
|---|---|
| `HSA_CU_MASK`: Sets the mask at a lower level of queue creation in the driver. This mask is also set for queues being profiled. | Example: |
| `ROC_GLOBAL_CU_MASK`: Sets the mask on queues created by the HIP or OpenCL runtimes. This mask is also set for queues being profiled. | Example: |
| `HIP_FORCE_QUEUE_PROFILING`: Runs the application as if it were run in rocprof. Forces command queue profiling on by default. | 0: Disable<br>1: Enable |
Debug variables
The debugging environment variables in HIP are collected in the following table. For more information, see Logging HIP activity, Debugging with HIP, and GPU isolation.
| Environment variable | Default value | Value |
|---|---|---|
| `AMD_LOG_LEVEL`: Enables HIP logging at various levels. | | 0: Disable log.<br>1: Enables error logs.<br>2: Enables warning logs in addition to lower-level logs.<br>3: Enables information logs in addition to lower-level logs.<br>4: Enables debug logs in addition to lower-level logs.<br>5: Enables extra debug logs in addition to lower-level logs. |
| `AMD_LOG_LEVEL_FILE`: Sets the output file for `AMD_LOG_LEVEL`. | stderr output | |
| `AMD_LOG_MASK`: Specifies HIP log filters. See the complete list of log masks in ROCm/clr. | | 0x1: Log API calls.<br>0x2: Kernel and copy commands and barriers.<br>0x4: Synchronization and waiting for commands to finish.<br>0x8: Decode and display AQL packets.<br>0x10: Queue commands and queue contents.<br>0x20: Signal creation, allocation, pool.<br>0x40: Locks and thread-safety code.<br>0x80: Kernel creations and arguments, etc.<br>0x100: Copy debug.<br>0x200: Detailed copy debug.<br>0x400: Resource allocation, performance-impacting events.<br>0x800: Initialization and shutdown.<br>0x1000: Misc debug, not yet classified.<br>0x2000: Show raw bytes of AQL packet.<br>0x4000: Show code creation debug.<br>0x8000: More detailed command info, including barrier commands.<br>0x10000: Log message location.<br>0x20000: Memory allocation.<br>0x40000: Memory pool allocation, including memory in graphs.<br>0x80000: Timestamp details.<br>0xFFFFFFFF: Log always, even if the mask flag is zero. |
| `HIP_FORCE_DEV_KERNARG`: Forces kernel arguments to be stored in device memory to reduce latency. Can improve performance by 2-3 µs for some kernels. | | 0: Disable<br>1: Enable |
| `HIP_LAUNCH_BLOCKING`: Serializes kernel execution. | | 0: Disable. Kernel executes normally.<br>1: Enable. Serializes kernel enqueue; behaves the same as `AMD_SERIALIZE_KERNEL`. |
| `HIP_VISIBLE_DEVICES` (or `CUDA_VISIBLE_DEVICES`): Only devices whose index is present in the sequence are visible to HIP. | Unset by default. | 0,1,2: Depending on the number of devices on the system. |
| `GPU_DUMP_CODE_OBJECT`: Dumps the code object. | | 0: Disable<br>1: Enable |
| `AMD_SERIALIZE_KERNEL`: Serializes kernel enqueue. | | 0: Disable<br>1: Wait for completion before enqueue.<br>2: Wait for completion after enqueue.<br>3: Both |
| `AMD_SERIALIZE_COPY`: Serializes copies. | | 0: Disable<br>1: Wait for completion before enqueue.<br>2: Wait for completion after enqueue.<br>3: Both |
| `AMD_DIRECT_DISPATCH`: Enables direct kernel dispatch (currently for Linux; under development for Windows). | | 0: Disable<br>1: Enable |
| `GPU_MAX_HW_QUEUES`: The maximum number of hardware queues allocated per device. | | Controls how many independent hardware queues the HIP runtime can create per process, per device. If an application allocates more HIP streams than this number, the HIP runtime reuses the same hardware queues for the new streams in a round-robin manner. This maximum does not apply to hardware queues created for CU-masked HIP streams, or to cooperative queues for HIP Cooperative Groups (single queue per device). |
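
Because the `AMD_LOG_MASK` values are individual bits, several filters can be combined with a bitwise OR. The following Python sketch shows the arithmetic for a few of the bits listed in the table. The constant names and the `debug_env` helper are hypothetical, and emitting the mask as a hexadecimal string is an assumption; check the logging documentation for the number format the runtime accepts.

```python
# Selected AMD_LOG_MASK bits, copied from the table (names are illustrative).
LOG_API = 0x1       # Log API calls
LOG_CMD = 0x2       # Kernel and copy commands and barriers
LOG_WAIT = 0x4      # Synchronization and waiting for commands to finish
LOG_MEM = 0x20000   # Memory allocation


def debug_env(level, *mask_bits):
    """Build AMD_LOG_LEVEL and AMD_LOG_MASK settings as strings."""
    if not 0 <= level <= 5:
        raise ValueError("AMD_LOG_LEVEL must be between 0 and 5")
    mask = 0
    for bit in mask_bits:
        mask |= bit  # OR the individual filter bits together
    # Assumption: the mask is written in hexadecimal, matching the table.
    return {"AMD_LOG_LEVEL": str(level), "AMD_LOG_MASK": hex(mask)}


# Information-level logs, filtered to API calls, commands, and memory allocation.
settings = debug_env(3, LOG_API, LOG_CMD, LOG_MEM)
print(settings["AMD_LOG_MASK"])  # prints 0x20003
```

Merging `settings` into a copy of `os.environ` and passing it to a child process keeps the logging configuration local to that run.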
Other useful variables
The following table lists environment variables that are useful but relate to different features in HIP.
| Environment variable | Default value | Value |
|---|---|---|
| `HIPRTC_COMPILE_OPTIONS_APPEND`: Sets compile options needed for hiprtc compilation. | Unset by default. | |
| `AMD_COMGR_SAVE_TEMPS`: Controls the deletion of temporary files generated during Comgr compilation. These files do not appear in the current working directory, but are instead left in a platform-specific temporary directory. | Unset by default. | 0: Temporary files are deleted automatically.<br>Non-zero integer: Temporary files are not deleted. |
| `AMD_COMGR_EMIT_VERBOSE_LOGS`: Makes Comgr logging include additional Comgr-specific informational messages. | Unset by default. | 0: Verbose log disabled.<br>Non-zero integer: Verbose log enabled. |
| `AMD_COMGR_REDIRECT_LOGS`: Controls where Comgr logs are redirected. | Unset by default. | stdout / -: Redirected to the standard output.<br>stderr: Redirected to the error stream. |
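
As an illustration of how the Comgr variables fit together, the sketch below builds an environment for a compilation debugging run in Python. The helper name `comgr_debug_env` and its parameters are hypothetical; only the variable names and accepted values come from the table.

```python
import os


def comgr_debug_env(save_temps=True, verbose=True, redirect="stderr"):
    """Return a copy of the environment configured for Comgr debugging."""
    if redirect not in ("stdout", "-", "stderr"):
        raise ValueError("redirect must be 'stdout', '-' or 'stderr'")
    env = dict(os.environ)  # copy, so the parent process is not modified
    # Non-zero keeps temporary files; 0 deletes them automatically.
    env["AMD_COMGR_SAVE_TEMPS"] = "1" if save_temps else "0"
    # Non-zero enables the additional Comgr-specific messages.
    env["AMD_COMGR_EMIT_VERBOSE_LOGS"] = "1" if verbose else "0"
    env["AMD_COMGR_REDIRECT_LOGS"] = redirect
    return env


env = comgr_debug_env(redirect="stdout")
print(env["AMD_COMGR_REDIRECT_LOGS"])  # prints stdout
```

Passing the result as the `env` argument of `subprocess.run` limits these settings to the launched application.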