Occupancy
- 
hipError_t hipModuleOccupancyMaxPotentialBlockSize(int *gridSize, int *blockSize, hipFunction_t f, size_t dynSharedMemPerBlk, int blockSizeLimit)
- Determines the grid and block sizes that achieve maximum occupancy for a kernel.
- Note: HIP does not support kernel launches in which the total number of work items in a dimension (gridDim x blockDim) is 2^32 or more.
- Parameters:
- gridSize – [out] minimum grid size for maximum potential occupancy 
- blockSize – [out] block size for maximum potential occupancy 
- f – [in] kernel function for which occupancy is calculated 
- dynSharedMemPerBlk – [in] dynamic shared memory usage (in bytes) intended for each block 
- blockSizeLimit – [in] the maximum block size for the kernel, use 0 for no limit 
 
- Returns:
- hipSuccess, hipErrorInvalidValue 
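The launch-size restriction in the note above can be checked on the host before a launch. The following is a minimal sketch in plain C++ and is not part of the HIP API: the helper name `launchFitsWorkItemLimit` is hypothetical, and it assumes a one-dimensional launch whose grid and block sizes were either suggested by `hipModuleOccupancyMaxPotentialBlockSize` or chosen by hand.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a HIP function): returns true when a launch of
// gridSize blocks with blockSize threads each stays below 2^32 work items
// in one dimension, i.e. gridSize * blockSize < 2^32.
inline bool launchFitsWorkItemLimit(std::uint64_t gridSize, int blockSize) {
    assert(blockSize > 0);
    const std::uint64_t limit = std::uint64_t{1} << 32;
    // Compare by division so the product itself can never overflow.
    return gridSize <= (limit - 1) / static_cast<std::uint64_t>(blockSize);
}
```

A caller might evaluate this check after computing the grid size and fall back to a smaller grid if it returns false.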
 
- 
hipError_t hipModuleOccupancyMaxPotentialBlockSizeWithFlags(int *gridSize, int *blockSize, hipFunction_t f, size_t dynSharedMemPerBlk, int blockSizeLimit, unsigned int flags)
- Determines the grid and block sizes that achieve maximum occupancy for a kernel.
- Note: HIP does not support kernel launches in which the total number of work items in a dimension (gridDim x blockDim) is 2^32 or more.
- Parameters:
- gridSize – [out] minimum grid size for maximum potential occupancy 
- blockSize – [out] block size for maximum potential occupancy 
- f – [in] kernel function for which occupancy is calculated 
- dynSharedMemPerBlk – [in] dynamic shared memory usage (in bytes) intended for each block 
- blockSizeLimit – [in] the maximum block size for the kernel, use 0 for no limit 
- flags – [in] Extra flags for occupancy calculation (only default supported) 
 
- Returns:
- hipSuccess, hipErrorInvalidValue 
 
- 
hipError_t hipModuleOccupancyMaxActiveBlocksPerMultiprocessor(int *numBlocks, hipFunction_t f, int blockSize, size_t dynSharedMemPerBlk)
- Returns occupancy for a device function.
- Parameters:
- numBlocks – [out] Returned occupancy, as the maximum number of active blocks per multiprocessor 
- f – [in] Kernel function (hipFunction_t) for which occupancy is calculated 
- blockSize – [in] Block size the kernel is intended to be launched with 
- dynSharedMemPerBlk – [in] Dynamic shared memory usage (in bytes) intended for each block 
 
- Returns:
- hipSuccess, hipErrorInvalidValue 
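The value returned in `numBlocks` is a count of blocks rather than a percentage. One common way to turn it into an occupancy fraction is to compare the resulting resident threads with the multiprocessor's thread capacity. The sketch below assumes that capacity is read from `hipDeviceProp_t::maxThreadsPerMultiProcessor`; the helper name `occupancyFraction` is hypothetical and not part of HIP.

```cpp
#include <cassert>

// Hypothetical helper (not a HIP function): converts the block count reported
// by an occupancy query into a fraction of the multiprocessor's thread capacity.
// maxThreadsPerMultiProcessor is assumed to come from hipDeviceProp_t.
inline double occupancyFraction(int numBlocks, int blockSize,
                                int maxThreadsPerMultiProcessor) {
    assert(maxThreadsPerMultiProcessor > 0);
    return static_cast<double>(numBlocks) * blockSize / maxThreadsPerMultiProcessor;
}
```

For example, 4 active blocks of 256 threads on a multiprocessor that can hold 2048 threads gives a fraction of 0.5.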
 
- 
hipError_t hipModuleOccupancyMaxActiveBlocksPerMultiprocessorWithFlags(int *numBlocks, hipFunction_t f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags)
- Returns occupancy for a device function.
- Parameters:
- numBlocks – [out] Returned occupancy, as the maximum number of active blocks per multiprocessor 
- f – [in] Kernel function (hipFunction_t) for which occupancy is calculated 
- blockSize – [in] Block size the kernel is intended to be launched with 
- dynSharedMemPerBlk – [in] Dynamic shared memory usage (in bytes) intended for each block 
- flags – [in] Extra flags for occupancy calculation (only default supported) 
 
- Returns:
- hipSuccess, hipErrorInvalidValue 
 
- 
hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessor(int *numBlocks, const void *f, int blockSize, size_t dynSharedMemPerBlk)
- Returns occupancy for a device function.
- Parameters:
- numBlocks – [out] Returned occupancy, as the maximum number of active blocks per multiprocessor 
- f – [in] Kernel function for which occupancy is calculated 
- blockSize – [in] Block size the kernel is intended to be launched with 
- dynSharedMemPerBlk – [in] Dynamic shared memory usage (in bytes) intended for each block 
 
- Returns:
- hipSuccess, hipErrorInvalidDeviceFunction, hipErrorInvalidValue 
 
- 
hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags(int *numBlocks, const void *f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags)
- Returns occupancy for a device function.
- Parameters:
- numBlocks – [out] Returned occupancy, as the maximum number of active blocks per multiprocessor 
- f – [in] Kernel function for which occupancy is calculated 
- blockSize – [in] Block size the kernel is intended to be launched with 
- dynSharedMemPerBlk – [in] Dynamic shared memory usage (in bytes) intended for each block 
- flags – [in] Extra flags for occupancy calculation (currently ignored) 
 
- Returns:
- hipSuccess, hipErrorInvalidDeviceFunction, hipErrorInvalidValue 
 
- 
hipError_t hipOccupancyMaxPotentialBlockSize(int *gridSize, int *blockSize, const void *f, size_t dynSharedMemPerBlk, int blockSizeLimit)
- Determines the grid and block sizes that achieve maximum occupancy for a kernel.
- Note: HIP does not support kernel launches in which the total number of work items in a dimension (gridDim x blockDim) is 2^32 or more.
- Parameters:
- gridSize – [out] minimum grid size for maximum potential occupancy 
- blockSize – [out] block size for maximum potential occupancy 
- f – [in] kernel function for which occupancy is calculated 
- dynSharedMemPerBlk – [in] dynamic shared memory usage (in bytes) intended for each block 
- blockSizeLimit – [in] the maximum block size for the kernel, use 0 for no limit 
 
- Returns:
- hipSuccess, hipErrorInvalidValue 
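Because `gridSize` is the minimum grid that can reach maximum potential occupancy, a small problem may need fewer blocks than that. One common pattern, shown here as a sketch rather than anything HIP prescribes, is to cap the launch grid at the number of blocks needed to cover the input. The names `launchGridFor` and `n` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a HIP function): picks a launch grid for n work
// items, given the minGridSize and blockSize suggested by an occupancy query.
// The result never exceeds minGridSize and is at least one block.
inline std::uint64_t launchGridFor(std::uint64_t n, int minGridSize, int blockSize) {
    assert(minGridSize > 0 && blockSize > 0);
    const std::uint64_t bs = static_cast<std::uint64_t>(blockSize);
    // Ceiling division without risking overflow in n + bs - 1.
    const std::uint64_t blocksToCover = n / bs + (n % bs != 0 ? 1 : 0);
    return std::min<std::uint64_t>(static_cast<std::uint64_t>(minGridSize),
                                   std::max<std::uint64_t>(blocksToCover, 1));
}
```

When `n` exceeds the number of threads launched, this sketch assumes the kernel iterates over the remaining items itself, for instance with a grid-stride loop.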
 
- 
template<class T>
 inline hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessor(int *numBlocks, T f, int blockSize, size_t dynSharedMemPerBlk)
- Returns occupancy for a kernel function.
- Parameters:
- numBlocks – [out] Pointer to the returned occupancy, in number of blocks. 
- f – [in] The kernel function to launch on the device. 
- blockSize – [in] The block size the kernel is launched with. 
- dynSharedMemPerBlk – [in] Dynamic shared memory per block, in bytes. 
 
- Returns:
- hipSuccess, hipErrorInvalidValue 
 
- 
template<class T>
 inline hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags(int *numBlocks, T f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags)
- Returns occupancy for a device function with the specified flags.
- Parameters:
- numBlocks – [out] Pointer to the returned occupancy, in number of blocks. 
- f – [in] The kernel function to launch on the device. 
- blockSize – [in] The block size the kernel is launched with. 
- dynSharedMemPerBlk – [in] Dynamic shared memory per block, in bytes. 
- flags – [in] Flags that control the behavior of the occupancy calculator. 
 
- Returns:
- hipSuccess, hipErrorInvalidValue 
 
- 
template<typename UnaryFunction, class T>
 static inline hipError_t hipOccupancyMaxPotentialBlockSizeVariableSMemWithFlags(int *min_grid_size, int *block_size, T func, UnaryFunction block_size_to_dynamic_smem_size, int block_size_limit = 0, unsigned int flags = 0)
- Returns the grid and block sizes that achieve maximum potential occupancy for a device function.
- Returns in *min_grid_size and *block_size a suggested grid/block size pair that achieves the best potential occupancy (i.e. the maximum number of active warps on the current device with the smallest number of blocks for a particular function).
- Parameters:
- min_grid_size – [out] minimum grid size needed to achieve the best potential occupancy 
- block_size – [out] block size required for the best potential occupancy 
- func – [in] device function symbol 
- block_size_to_dynamic_smem_size – [in] a unary function/functor that takes the block size and returns the size, in bytes, of dynamic shared memory needed for a block 
- block_size_limit – [in] the maximum block size func is designed to work with. 0 means no limit.
- flags – [in] reserved 
 
- Returns:
- hipSuccess, hipErrorInvalidDevice, hipErrorInvalidDeviceFunction, hipErrorInvalidValue, hipErrorUnknown 
 
- 
template<typename UnaryFunction, class T>
 static inline hipError_t hipOccupancyMaxPotentialBlockSizeVariableSMem(int *min_grid_size, int *block_size, T func, UnaryFunction block_size_to_dynamic_smem_size, int block_size_limit = 0)
- Returns the grid and block sizes that achieve maximum potential occupancy for a device function.
- Returns in *min_grid_size and *block_size a suggested grid/block size pair that achieves the best potential occupancy (i.e. the maximum number of active warps on the current device with the smallest number of blocks for a particular function).
- Parameters:
- min_grid_size – [out] minimum grid size needed to achieve the best potential occupancy 
- block_size – [out] block size required for the best potential occupancy 
- func – [in] device function symbol 
- block_size_to_dynamic_smem_size – [in] a unary function/functor that takes the block size and returns the size, in bytes, of dynamic shared memory needed for a block 
- block_size_limit – [in] the maximum block size func is designed to work with. 0 means no limit.
 
- Returns:
- hipSuccess, hipErrorInvalidDevice, hipErrorInvalidDeviceFunction, hipErrorInvalidValue, hipErrorUnknown 
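The `block_size_to_dynamic_smem_size` argument can be any callable that maps a block size to a byte count. As an illustration only, the functor below assumes a hypothetical kernel that uses one `float` of dynamic shared memory per thread; `OneFloatPerThread` is not part of HIP.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical functor (not part of HIP): maps a candidate block size to the
// dynamic shared memory, in bytes, needed by a kernel that is assumed to use
// one float per thread.
struct OneFloatPerThread {
    std::size_t operator()(int blockSize) const {
        assert(blockSize >= 0);
        return static_cast<std::size_t>(blockSize) * sizeof(float);
    }
};
```

An instance such as `OneFloatPerThread{}` would be passed as the fourth argument of `hipOccupancyMaxPotentialBlockSizeVariableSMem`, which evaluates it for the block sizes it considers.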
 
- 
template<typename F>
 inline hipError_t hipOccupancyMaxPotentialBlockSize(int *gridSize, int *blockSize, F kernel, size_t dynSharedMemPerBlk, uint32_t blockSizeLimit)
- Returns the grid and block sizes that achieve maximum potential occupancy for a device function.
- Returns in *gridSize and *blockSize a suggested grid/block size pair that achieves the best potential occupancy (i.e. the maximum number of active warps on the current device with the smallest number of blocks for a particular function).
- Returns:
- hipSuccess, hipErrorInvalidDevice, hipErrorInvalidValue