Introduction
Overview
hipCUB is a thin, header-only wrapper library on top of rocPRIM or CUB. It enables developers to port projects that use the CUB library to the HIP layer and run them on AMD hardware. In the ROCm environment, hipCUB uses the rocPRIM library as its backend; on CUDA platforms, it uses CUB instead.
- When using hipCUB, include only the <hipcub/hipcub.hpp> header.
- When rocPRIM is used as the backend, HIPCUB_ROCPRIM_API is defined.
- When CUB is used as the backend, HIPCUB_CUB_API is defined.
- Backends are automatically selected based on the platform detected by the HIP layer (__HIP_PLATFORM_AMD__ or __HIP_PLATFORM_NVIDIA__).
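As a minimal sketch of how the backend macros above might be used, the snippet below reports which backend was selected at compile time. The helper names hipcub_backend_name and print_hipcub_backend are hypothetical and not part of hipCUB. The __has_include guard is an assumption added only so that the snippet also compiles on a machine without hipCUB installed; a real project would include the header unconditionally.

```cpp
// Sketch: report which hipCUB backend was selected at compile time.
#include <cstdio>

// Portability guard for this snippet only (an assumption, not a hipCUB
// requirement): skip the include when the header is not installed.
#if defined(__has_include)
#  if __has_include(<hipcub/hipcub.hpp>)
#    include <hipcub/hipcub.hpp>  // the only hipCUB header to include
#  endif
#endif

// Hypothetical helper, not part of hipCUB.
constexpr const char* hipcub_backend_name() {
#if defined(HIPCUB_ROCPRIM_API)
    return "rocPRIM";  // selected when HIP detects __HIP_PLATFORM_AMD__
#elif defined(HIPCUB_CUB_API)
    return "CUB";      // selected when HIP detects __HIP_PLATFORM_NVIDIA__
#else
    return "none";     // hipCUB header was not found or not included
#endif
}

// Hypothetical helper: print the detected backend to stdout.
inline void print_hipcub_backend() {
    std::printf("hipCUB backend: %s\n", hipcub_backend_name());
}
```

The same #if defined(HIPCUB_ROCPRIM_API) / #elif defined(HIPCUB_CUB_API) pattern can wrap code paths that depend on features only one backend provides.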
rocPRIM backend
hipCUB with the rocPRIM backend may not support all the functions and features that CUB has, because of differences between the ROCm (HIP) platform and the CUDA platform.
Unsupported features and differences:
- Functions, classes and macros which are not in the public API or not documented are not supported. 
- Device-wide primitives can’t be called from kernels (dynamic parallelism is not supported in HIP on ROCm). 
- DeviceSpmv is not supported.
- Fancy iterators: CacheModifiedInputIterator, CacheModifiedOutputIterator, and TexRefInputIterator are not supported.
- Thread I/O:
  - The CacheLoadModifier and CacheStoreModifier cache modifiers are not supported.
  - The ThreadLoad and ThreadStore functions are not supported.
- Storage management and debug functions: the Debug, PtxVersion, and SmVersion functions and the CubDebug, CubDebugExit, and _CubLog macros are not supported.
- Intrinsics:
  - ThreadExit and ThreadTrap are not supported.
  - Warp thread masks (when used) are 64-bit unsigned integers.
  - The member_mask input argument is ignored in WARP_* functions.
  - The first_lane, last_lane, and member_mask arguments are ignored in Shuffle* functions.
- Utilities: SwizzleScanOp, ReduceBySegmentOp, ReduceByKeyOp, and CastOp are not supported.
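The 64-bit warp thread masks noted in the list above can matter when porting CUDA code that hard-codes 32-bit masks such as 0xffffffff. The following host-only sketch shows one way to build masks that stay correct for lanes 32 to 63. The WarpMask alias and the helper functions are hypothetical names used for illustration and are not part of hipCUB or rocPRIM; the sketch assumes lanes are numbered 0 to 63.

```cpp
// Sketch: building 64-bit lane masks without 32-bit overflow.
#include <cstdint>

// Hypothetical alias, not part of hipCUB: one bit per lane, lanes 0..63.
using WarpMask = std::uint64_t;

// Mask with only bit `lane` set (lane in 0..63).
// The shift is done on a 64-bit value; `1u << lane` would be undefined
// behavior for lane >= 32.
constexpr WarpMask lane_bit(unsigned lane) {
    return WarpMask{1} << lane;
}

// Mask of all lanes strictly below `lane` (lane in 0..63).
constexpr WarpMask lanes_below(unsigned lane) {
    return lane_bit(lane) - 1;
}

// Mask covering the first `width` lanes (width in 1..64).
// width == 64 is handled separately because shifting a 64-bit value
// by 64 is undefined.
constexpr WarpMask first_lanes(unsigned width) {
    return width >= 64 ? ~WarpMask{0} : lane_bit(width) - 1;
}
```

For example, first_lanes(32) yields the familiar 0xffffffff, while first_lanes(64) sets all 64 bits.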