Occupancy
-
hipError_t hipModuleOccupancyMaxPotentialBlockSize(int *gridSize, int *blockSize, hipFunction_t f, size_t dynSharedMemPerBlk, int blockSizeLimit)
Determines the grid and block sizes that achieve maximum occupancy for a kernel.
Note: HIP does not support launching a kernel whose total number of work items in any one dimension, gridDim x blockDim, is 2^32 or greater.
- Parameters:
gridSize – [out] minimum grid size for maximum potential occupancy
blockSize – [out] block size for maximum potential occupancy
f – [in] kernel function for which occupancy is calculated
dynSharedMemPerBlk – [in] dynamic shared memory usage (in bytes) intended for each block
blockSizeLimit – [in] the maximum block size for the kernel, use 0 for no limit
- Returns:
hipSuccess, hipErrorInvalidValue
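The launch-size note above can be checked on the host before a launch. The following is a minimal sketch in plain C++ and is not part of the HIP API: the helper name `fitsLaunchLimit` is hypothetical, and it assumes the limit applies to the product gridDim x blockDim in each dimension separately, as the note states.

```cpp
#include <cstdint>

// Hypothetical helper (not a HIP function): returns true when the number of
// work items in one dimension, gridDim * blockDim, stays below 2^32.
// The product is formed in 64 bits so that it cannot wrap around.
inline bool fitsLaunchLimit(std::uint32_t gridDim, std::uint32_t blockDim) {
    const std::uint64_t workItems =
        static_cast<std::uint64_t>(gridDim) * static_cast<std::uint64_t>(blockDim);
    return workItems < (std::uint64_t{1} << 32);
}
```

A caller would apply this check to each of the x, y and z dimensions of the intended launch configuration.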
-
hipError_t hipModuleOccupancyMaxPotentialBlockSizeWithFlags(int *gridSize, int *blockSize, hipFunction_t f, size_t dynSharedMemPerBlk, int blockSizeLimit, unsigned int flags)
Determines the grid and block sizes that achieve maximum occupancy for a kernel.
Note: HIP does not support launching a kernel whose total number of work items in any one dimension, gridDim x blockDim, is 2^32 or greater.
- Parameters:
gridSize – [out] minimum grid size for maximum potential occupancy
blockSize – [out] block size for maximum potential occupancy
f – [in] kernel function for which occupancy is calculated
dynSharedMemPerBlk – [in] dynamic shared memory usage (in bytes) intended for each block
blockSizeLimit – [in] the maximum block size for the kernel, use 0 for no limit
flags – [in] Extra flags for occupancy calculation (only default supported)
- Returns:
hipSuccess, hipErrorInvalidValue
-
hipError_t hipModuleOccupancyMaxActiveBlocksPerMultiprocessor(int *numBlocks, hipFunction_t f, int blockSize, size_t dynSharedMemPerBlk)
Returns occupancy for a device function.
- Parameters:
numBlocks – [out] Returned occupancy
f – [in] Kernel function (hipFunction_t) for which occupancy is calculated
blockSize – [in] Block size the kernel is intended to be launched with
dynSharedMemPerBlk – [in] Dynamic shared memory usage (in bytes) intended for each block
- Returns:
hipSuccess, hipErrorInvalidValue
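The value returned in `*numBlocks` is a count of resident blocks, not a percentage. As a rough sketch, it can be turned into a fraction of the multiprocessor's thread capacity. The helper `occupancyFraction` below is hypothetical and not part of HIP; it assumes the per-multiprocessor thread limit is obtained separately from the device properties (for example the `maxThreadsPerMultiProcessor` field of `hipDeviceProp_t`).

```cpp
#include <cassert>

// Hypothetical helper (not a HIP function): converts the block count returned
// in *numBlocks into the fraction of a multiprocessor's thread capacity that
// those blocks would occupy. maxThreadsPerMultiprocessor is an assumed input
// taken from the device properties.
inline double occupancyFraction(int numBlocks, int blockSize,
                                int maxThreadsPerMultiprocessor) {
    assert(numBlocks >= 0 && blockSize > 0 && maxThreadsPerMultiprocessor > 0);
    const double residentThreads =
        static_cast<double>(numBlocks) * static_cast<double>(blockSize);
    return residentThreads / static_cast<double>(maxThreadsPerMultiprocessor);
}
```

For instance, 4 resident blocks of 256 threads on a multiprocessor that holds 2048 threads give a fraction of 0.5.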
-
hipError_t hipModuleOccupancyMaxActiveBlocksPerMultiprocessorWithFlags(int *numBlocks, hipFunction_t f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags)
Returns occupancy for a device function.
- Parameters:
numBlocks – [out] Returned occupancy
f – [in] Kernel function (hipFunction_t) for which occupancy is calculated
blockSize – [in] Block size the kernel is intended to be launched with
dynSharedMemPerBlk – [in] Dynamic shared memory usage (in bytes) intended for each block
flags – [in] Extra flags for occupancy calculation (only default supported)
- Returns:
hipSuccess, hipErrorInvalidValue
-
hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessor(int *numBlocks, const void *f, int blockSize, size_t dynSharedMemPerBlk)
Returns occupancy for a device function.
- Parameters:
numBlocks – [out] Returned occupancy
f – [in] Kernel function for which occupancy is calculated
blockSize – [in] Block size the kernel is intended to be launched with
dynSharedMemPerBlk – [in] Dynamic shared memory usage (in bytes) intended for each block
- Returns:
hipSuccess, hipErrorInvalidDeviceFunction, hipErrorInvalidValue
-
hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags(int *numBlocks, const void *f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags)
Returns occupancy for a device function.
- Parameters:
numBlocks – [out] Returned occupancy
f – [in] Kernel function for which occupancy is calculated
blockSize – [in] Block size the kernel is intended to be launched with
dynSharedMemPerBlk – [in] Dynamic shared memory usage (in bytes) intended for each block
flags – [in] Extra flags for occupancy calculation (currently ignored)
- Returns:
hipSuccess, hipErrorInvalidDeviceFunction, hipErrorInvalidValue
-
hipError_t hipOccupancyMaxPotentialBlockSize(int *gridSize, int *blockSize, const void *f, size_t dynSharedMemPerBlk, int blockSizeLimit)
Determines the grid and block sizes that achieve maximum occupancy for a kernel.
Note: HIP does not support launching a kernel whose total number of work items in any one dimension, gridDim x blockDim, is 2^32 or greater.
- Parameters:
gridSize – [out] minimum grid size for maximum potential occupancy
blockSize – [out] block size for maximum potential occupancy
f – [in] kernel function for which occupancy is calculated
dynSharedMemPerBlk – [in] dynamic shared memory usage (in bytes) intended for each block
blockSizeLimit – [in] the maximum block size for the kernel, use 0 for no limit
- Returns:
hipSuccess, hipErrorInvalidValue
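The `gridSize` output is described above as a minimum grid size for maximum potential occupancy, so it is not necessarily large enough to cover a given problem. A common pattern, sketched here under that assumption, is to keep the suggested block size and derive the grid from the element count. The helper name `blocksToCover` is hypothetical and not part of HIP.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a HIP function): number of blocks of blockSize
// threads needed so that every one of numElements items gets a thread.
// Written as quotient plus remainder test to avoid overflow in the addition
// used by the usual (n + b - 1) / b form.
inline std::size_t blocksToCover(std::size_t numElements, int blockSize) {
    assert(blockSize > 0);
    const std::size_t bs = static_cast<std::size_t>(blockSize);
    return numElements / bs + (numElements % bs != 0 ? 1 : 0);
}
```

With a suggested block size of 256, covering 1000 elements takes 4 blocks and covering 1025 elements takes 5.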
-
template<class T>
inline hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessor(int *numBlocks, T f, int blockSize, size_t dynSharedMemPerBlk)
Returns occupancy for a kernel function.
- Parameters:
numBlocks – [out] Pointer to the returned occupancy, in number of blocks.
f – [in] The kernel function to launch on the device.
blockSize – [in] The block size the kernel is launched with.
dynSharedMemPerBlk – [in] Dynamic shared memory per block, in bytes.
- Returns:
hipSuccess, hipErrorInvalidValue
-
template<class T>
inline hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags(int *numBlocks, T f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags)
Returns occupancy for a device function with the specified flags.
- Parameters:
numBlocks – [out] Pointer to the returned occupancy, in number of blocks.
f – [in] The kernel function to launch on the device.
blockSize – [in] The block size the kernel is launched with.
dynSharedMemPerBlk – [in] Dynamic shared memory per block, in bytes.
flags – [in] Flags that control the behavior of the occupancy calculator.
- Returns:
hipSuccess, hipErrorInvalidValue
-
template<typename UnaryFunction, class T>
static inline hipError_t hipOccupancyMaxPotentialBlockSizeVariableSMemWithFlags(int *min_grid_size, int *block_size, T func, UnaryFunction block_size_to_dynamic_smem_size, int block_size_limit = 0, unsigned int flags = 0)
Returns the grid and block sizes that achieve maximum potential occupancy for a device function.
Returns in *min_grid_size and *block_size a suggested grid / block size pair that achieves the best potential occupancy (i.e. the maximum number of active warps on the current device with the smallest number of blocks for a particular function).
- Parameters:
min_grid_size – [out] minimum grid size needed to achieve the best potential occupancy
block_size – [out] block size required for the best potential occupancy
func – [in] device function symbol
block_size_to_dynamic_smem_size – [in] a unary function/functor that takes a block size and returns the size, in bytes, of dynamic shared memory needed for a block
block_size_limit – [in] the maximum block size func is designed to work with. 0 means no limit.
flags – [in] reserved
- Returns:
hipSuccess, hipErrorInvalidDevice, hipErrorInvalidDeviceFunction, hipErrorInvalidValue, hipErrorUnknown
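The `block_size_to_dynamic_smem_size` parameter is a callable that maps a candidate block size to a byte count. The following sketch shows one possible shape for such a functor. The type name `OneFloatPerThread` and the "one float per thread" sizing rule are illustrative assumptions and do not come from the HIP headers.

```cpp
#include <cstddef>

// Hypothetical functor (not part of HIP): models a kernel that needs one
// float of dynamic shared memory for each thread in the block. It takes the
// block size and returns the dynamic shared memory requirement in bytes.
struct OneFloatPerThread {
    std::size_t operator()(int blockSize) const {
        return static_cast<std::size_t>(blockSize) * sizeof(float);
    }
};
```

An instance such as `OneFloatPerThread{}` would be passed as the `block_size_to_dynamic_smem_size` argument; a lambda with the same call signature would serve equally well.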
-
template<typename UnaryFunction, class T>
static inline hipError_t hipOccupancyMaxPotentialBlockSizeVariableSMem(int *min_grid_size, int *block_size, T func, UnaryFunction block_size_to_dynamic_smem_size, int block_size_limit = 0)
Returns the grid and block sizes that achieve maximum potential occupancy for a device function.
Returns in *min_grid_size and *block_size a suggested grid / block size pair that achieves the best potential occupancy (i.e. the maximum number of active warps on the current device with the smallest number of blocks for a particular function).
- Parameters:
min_grid_size – [out] minimum grid size needed to achieve the best potential occupancy
block_size – [out] block size required for the best potential occupancy
func – [in] device function symbol
block_size_to_dynamic_smem_size – [in] a unary function/functor that takes a block size and returns the size, in bytes, of dynamic shared memory needed for a block
block_size_limit – [in] the maximum block size func is designed to work with. 0 means no limit.
- Returns:
hipSuccess, hipErrorInvalidDevice, hipErrorInvalidDeviceFunction, hipErrorInvalidValue, hipErrorUnknown
-
template<typename F>
inline hipError_t hipOccupancyMaxPotentialBlockSize(int *gridSize, int *blockSize, F kernel, size_t dynSharedMemPerBlk, uint32_t blockSizeLimit)
Returns the grid and block sizes that achieve maximum potential occupancy for a device function.
Returns in *gridSize and *blockSize a suggested grid / block size pair that achieves the best potential occupancy (i.e. the maximum number of active warps on the current device with the smallest number of blocks for a particular function).
- Returns:
hipSuccess, hipErrorInvalidDevice, hipErrorInvalidValue