HIP API 7.0 changes#

To improve code portability between AMD and NVIDIA GPU programming models, changes were made to the HIP API in ROCm 7.0 to simplify cross-platform programming. These changes align HIP C++ even more closely with NVIDIA CUDA. These changes are incompatible with prior releases, and might require recompiling existing HIP applications for use with ROCm 7.0, or editing and recompiling code in some cases. In the best case, the change requires no modification of existing applications. These changes were made available in a preview release based on ROCm 6.4.1 to help you prepare.

Behavior changes in HIP Runtime API#

Update hipGetLastError#

Prior to the 7.0 release of the HIP API, hipGetLastError() was not fully compliant with CUDA’s behavior. The purpose of this change is to have hipGetLastError return the last actual error caught in the current thread during the application execution. Neither hipSuccess nor hipErrorNotReady is considered an error. Take the following code as an example:

1: hipError_t err = hipMalloc(...); // returns hipOutOfMemory
2: err = hipSetDevice(0); // returns hipSuccess
3: err = hipGetLastError();

The prior behavior was for hipGetLastError at line 3 to return hipSuccess from line 2. In the 7.0 release, the value of err at line 3 is hipOutOfMemory which is the error returned in Line 1, rather than simply the result returned in line 2. This matches CUDA behavior.

You can still use the prior functionality by using the hipExtGetLastError function. Notice that the function begins with hipExt which denotes a function call that is unique to HIP, without correlation to CUDA. This function was introduced with the 6.0 release.

Cooperative groups changes#

For hipLaunchCooperativeKernelMultiDevice() function, HIP now includes additional input parameter validation checks.

  • If the input launch stream is a NULLPTR or it is hipStreamLegacy, the function now returns hipErrorInvalidResourceHandle.

  • If the stream capturing is active, the function returns the error code hipErrorStreamCaptureUnsupported.

  • If the stream capture status is invalidated, the function returns the error hipErrorStreamCaptureInvalidated.

The hipLaunchCooperativeKernel() function now checks the input stream handle. If it’s invalid, the returned error is changed to hipErrorInvalidHandle from hipErrorContextIsDestroyed.

Update hipPointerGetAttributes#

hipPointerGetAttributes() now matches the functionality of cudaPointerGetAttributes which changed in CUDA 11. If a NULL host or attribute pointer is passed as input parameter, hipPointerGetAttributes now returns hipSuccess instead of the error code hipErrorInvalidValue.

Any application which is expecting the API to return an error instead of success could be impacted and a code change may need to handle the error properly.

Update hipFree#

hipFree() previously had an implicit wait for synchronization purpose which is applicable for all memory allocations. This wait has been disabled in the HIP 7.0 runtime for allocations made with hipMallocAsync and hipMallocFromPoolAsync to match the behavior of CUDA API cudaFree

HIP runtime compiler (hipRTC) changes#

Runtime compilation for HIP is available through the hipRTC library as described in Programming for HIP runtime compiler (RTC). The library grew organically within the main HIP runtime code. However, segregation of the hipRTC code is now needed to ensure better compatibility and easier code portability.

Removal of hipRTC symbols from HIP Runtime Library#

hipRTC has been an independent library since the 6.0 release, but the hipRTC symbols were still available in the HIP runtime library. Starting with the 7.0 release hipRTC is no longer included in the HIP runtime, and any application using hipRTC APIs should link explicitly with the hipRTC library.

This change makes the usage of hipRTC library on Linux the same as on Windows and matches the behavior of CUDA nvRTC.

hipRTC compilation#

The device code compilation via hipRTC now uses namespace __hip_internal, instead of the standard headers std, to avoid namespace collision. These changes are made in the HIP header files.

No code change is required in any application, but rebuilding is necessary.

Removal of datatypes from hipRTC#

In hipRTC, datatype definitions such as int64_t, uint64_t, int32_t, and uint32_t could result in conflicts in some applications, as they use their own definitions for these types. nvRTC doesn’t define these datatypes either. These datatypes are removed and replaced by HIP internal datatypes prefixed with __hip, for example, __hip_int64_t.

Any application relying on HIP internal datatypes during hipRTC compilation might be affected. These changes have no impact on any application if it compiles as expected using nvRTC.

HIP header clean up#

HIP header files previously included unnecessary Standard Template Libraries (STL) headers. With the 7.0 release, unnecessary STL headers are no longer included, and only the required STL headers are included.

Applications relying on HIP runtime header files might need to be updated to include STL header files that have been removed in 7.0.

API signature and struct changes#

API signature changes#

Signatures in some APIs have been modified to match corresponding CUDA APIs, as described below.

The RTC method definition is changed in the following hipRTC APIs:

In these APIs, the input parameter type changes from const char** to const char* const*.

In addition, the following APIs have signature changes:

  • hipMemcpyHtoD(), the type of the second argument pointer changes from const void* to void*.

  • hipCtxGetApiVersion(), the type of second argument is changed from int* to unsigned int*.

These signature changes do not require code modifications but do require rebuilding the application.

Deprecated struct HIP_MEMSET_NODE_PARAMS#

The deprecated structure HIP_MEMSET_NODE_PARAMS is removed. You can use the definition hipMemsetParams instead, as input parameter, while using these two APIs:

hipMemsetParams struct change#

The struct hipMemsetParams is updated to be compatible with CUDA. The change is from the old struct definition shown below:

typedef struct hipMemsetParams {
  void* dst;
  unsigned int elementSize;
  size_t height;
  size_t pitch;
  unsigned int value;
  size_t width;
} hipMemsetParams;

To the new struct definition as follows:

typedef struct hipMemsetParams {
  void* dst;
  size_t pitch;
  unsigned int value;
  unsigned int elementSize;
  size_t width;
  size_t height;
} hipMemsetParams;

No code change is required in any application using this structure, but rebuilding is necessary.

HIP vector constructor change#

Changes have been made to HIP vector constructors for hipComplex initialization to generate values in alignment with CUDA. The affected constructors are small vector types such as float2 and int4 for example. If your code previously relied on a single value to initialize all components within a vector or complex type, you might need to update your code. Otherwise, rebuilding the application is necessary but no code change is required in any application using these constructors.

Stream capture updates#

Restrict stream capture modes#

Stream capture mode has been restricted in the following APIs to relaxed (hipStreamCaptureModeRelaxed) mode:

These APIs are allowed only in relaxed stream capture mode. If the functions are used with stream capture, the HIP runtime the will return hipErrorStreamCaptureUnsupported on unsupported stream capture modes.

Check stream capture mode#

The following APIs will check the stream capture mode and return error codes to match the behavior of CUDA. No impact if stream capture is working correctly on CUDA. Otherwise, the application would need to modify the graph being captured.

  • hipLaunchCooperativeKernelMultiDevice() - Returns error code while stream capture status is active. The usage is restricted during stream capture

  • hipEventQuery() - Returns an error hipErrorStreamCaptureUnsupported in global capture mode

  • hipStreamAddCallback() - The stream capture behavior is updated. The function now checks if any of the blocking streams are capturing. If so, it returns an error and invalidates all capturing streams. The usage of this API is restricted during stream capture to match CUDA.

Stream capture error return#

During stream capture, the following HIP APIs return the hipErrorStreamCaptureUnsupported error on the HIP runtime, but not always hipSuccess, to match behavior with CUDA.

The usage of these APIs is restricted during stream capture. No impact if stream capture is working fine on CUDA.

Error code changes#

The following HIP APIs have been updated to return new or additional error codes to match the corresponding CUDA APIs. Most existing applications just check if hipSuccess is returned and no change is needed. However, if an application checks for a specific error code, the application code may need to be updated to match/handle the new error code accordingly.

Invalid stream input parameter handling matches CUDA#

In order to match the CUDA runtime behavior more closely, HIP APIs with streams passed as input parameters no longer check the stream validity. Prior to the 7.0 release, the HIP runtime returns an error code hipErrorContextIsDestroyed if the stream is invalid. In CUDA 12 and later, the equivalent behavior is to raise a segmentation fault. With HIP 7.0, the HIP runtime matches CUDA by causing a segmentation fault. The list of APIs impacted by this change are as follows:

Developers porting CUDA code to HIP no longer need to modify their error handling code. However, if you have come to expect the HIP runtime to return the error code hipErrorContextIsDestroyed, you might need to adjust your code.

warpSize Change#

To match the CUDA specification, warpSize is no longer a constexpr. In general, this should be a transparent change. However, if an application was using warpSize as a compile-time constant, it will have to be updated to handle the new definition. For more information, see warpSize in HIP C++ language extensions.