GridEvenShare< OffsetT > Struct Template Reference
hipcub::GridEvenShare< OffsetT > Struct Template Reference
GridEvenShare is a descriptor utility for distributing input among CUDA thread blocks in an "even-share" fashion. Each thread block gets roughly the same number of input tiles.
#include <grid_even_share.hpp>
Public Member Functions | |
__host__ __device__ __forceinline__ | GridEvenShare () |
Constructor. | |
__host__ __device__ __forceinline__ void | DispatchInit (OffsetT num_items_, int max_grid_size, int tile_items) |
Dispatch initializer. To be called prior to kernel launch. | |
template<int TILE_ITEMS> | |
__device__ __forceinline__ void | BlockInit (int block_id, Int2Type< GRID_MAPPING_RAKE >) |
Initializes ranges for the specified thread block index. Specialized for a "raking" access pattern in which each thread block is assigned a consecutive sequence of input tiles. | |
template<int TILE_ITEMS> | |
__device__ __forceinline__ void | BlockInit (int block_id, Int2Type< GRID_MAPPING_STRIP_MINE >) |
Block-initialization, specialized for a "strip mining" access pattern in which the input tiles assigned to each thread block are separated by a stride equal to the extent of the grid. | |
template<int TILE_ITEMS, GridMappingStrategy STRATEGY> | |
__device__ __forceinline__ void | BlockInit () |
Block-initialization for the calling thread block, using the access pattern ("raking" or "strip mining") selected by the STRATEGY template parameter. | |
template<int TILE_ITEMS> | |
__device__ __forceinline__ void | BlockInit (OffsetT block_offset, OffsetT block_end) |
Block-initialization, specialized for a "raking" access pattern in which each thread block is assigned a consecutive sequence of input tiles. | |
Public Attributes | |
OffsetT | num_items |
Total number of input items. | |
int | grid_size |
Grid size in thread blocks. | |
OffsetT | block_offset |
Offset into the input marking the beginning of the owning thread block's segment of input tiles. | |
OffsetT | block_end |
Offset into the input marking the end (one past the last item) of the owning thread block's segment of input tiles. | |
OffsetT | block_stride |
Stride between input tiles. | |
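
The following host-only sketch illustrates how the three per-block attributes could drive a tile loop under each access pattern. It is an illustration written for this page, not hipCUB code: `BlockSegment`, `tile_starts`, `rake_segment`, and `strip_mine_segment` are hypothetical names, and the stride values (one tile for raking, one grid extent for strip mining) are assumptions derived from the member descriptions above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the three per-block attributes; not the hipCUB struct.
struct BlockSegment {
    std::int64_t block_offset;  // first item of the block's first tile
    std::int64_t block_end;     // one past the block's last item
    std::int64_t block_stride;  // distance between consecutive tiles of this block
};

// Returns the starting offset of every tile the block would visit.
std::vector<std::int64_t> tile_starts(const BlockSegment& seg)
{
    assert(seg.block_stride > 0);
    std::vector<std::int64_t> starts;
    for (std::int64_t t = seg.block_offset; t < seg.block_end; t += seg.block_stride) {
        starts.push_back(t);
    }
    return starts;
}

// Raking (assumed): the block's tiles are consecutive, so the stride is one tile.
BlockSegment rake_segment(std::int64_t begin, std::int64_t end, int tile_items)
{
    return {begin, end, tile_items};
}

// Strip mining (assumed): block b starts at tile b and advances by grid_size tiles.
BlockSegment strip_mine_segment(int block_id, int grid_size,
                                std::int64_t num_items, int tile_items)
{
    return {static_cast<std::int64_t>(block_id) * tile_items,
            num_items,
            static_cast<std::int64_t>(grid_size) * tile_items};
}
```

With 128-item tiles, a raking block that owns items 256 to 511 visits the tiles starting at 256 and 384, whereas block 1 of a 4-block strip-mined grid over 1000 items visits the tiles starting at 128 and 640.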
Detailed Description
template<typename OffsetT>
struct hipcub::GridEvenShare< OffsetT >
GridEvenShare is a descriptor utility for distributing input among CUDA thread blocks in an "even-share" fashion. Each thread block gets roughly the same number of input tiles.
- Overview
- Each thread block is assigned a consecutive sequence of input tiles. To help preserve alignment and eliminate the overhead of guarded loads for all but the last thread block, GridEvenShare assigns one of three different amounts of work to a given thread block: "big", "normal", or "last". The "big" workloads are one scheduling grain larger than "normal". The "last" work unit, assigned to the last thread block, may be only partially full if the input size is not an exact multiple of the scheduling grain size.
- Before invoking a child grid, a parent thread will typically construct an instance of GridEvenShare. The instance can be passed to child thread blocks, which can initialize their per-thread-block offsets using BlockInit().
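
The "big", "normal", and "last" workloads described above can be sketched on the host as follows. This is a minimal illustration under stated assumptions, not the hipCUB implementation: the helper `even_share_ranges` and the type `BlockRange` are hypothetical, the scheduling grain is assumed to be one tile of `tile_items` items, and the grid size is assumed to be the smaller of `max_grid_size` and the number of tiles.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper type: half-open range [begin, end) of item offsets.
struct BlockRange {
    std::int64_t begin;
    std::int64_t end;
};

// Host-only illustration of an even-share split (not hipCUB code).
// Blocks with index < big_shares receive one extra tile; the range of each
// remaining block is clamped to num_items, so only the last block can end
// on a partially full tile.
std::vector<BlockRange> even_share_ranges(std::int64_t num_items,
                                          int max_grid_size,
                                          int tile_items)
{
    assert(num_items > 0 && max_grid_size > 0 && tile_items > 0);

    const std::int64_t total_tiles  = (num_items + tile_items - 1) / tile_items;
    const std::int64_t grid_size    = std::min<std::int64_t>(total_tiles, max_grid_size);
    const std::int64_t avg_tiles    = total_tiles / grid_size;
    const std::int64_t big_shares   = total_tiles - avg_tiles * grid_size;
    const std::int64_t normal_items = avg_tiles * tile_items;
    const std::int64_t big_items    = normal_items + tile_items;

    std::vector<BlockRange> ranges;
    for (std::int64_t b = 0; b < grid_size; ++b) {
        std::int64_t begin = 0;
        std::int64_t end = 0;
        if (b < big_shares) {
            // "big" workload: one tile more than normal, never clamped
            begin = b * big_items;
            end = begin + big_items;
        } else {
            // "normal" workload, or "last" when the clamp takes effect
            begin = big_shares * tile_items + b * normal_items;
            end = std::min(num_items, begin + normal_items);
        }
        ranges.push_back({begin, end});
    }
    return ranges;
}
```

For 1100 items, 128-item tiles, and at most 4 blocks, this sketch yields one "big" block covering [0, 384), two "normal" blocks covering [384, 640) and [640, 896), and a "last" block covering [896, 1100).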
Member Function Documentation
◆ DispatchInit()
template<typename OffsetT >
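__host__ __device__ __forceinline__ void hipcub::GridEvenShare< OffsetT >::DispatchInit (OffsetT num_items_, int max_grid_size, int tile_items) | inline |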
Dispatch initializer. To be called prior to kernel launch.
- Parameters
-
num_items_ | Total number of input items |
max_grid_size | Maximum grid size allowable (the actual grid size may be smaller if the number of input items does not warrant the maximum) |
tile_items | Number of data items per input tile |
◆ BlockInit()
template<typename OffsetT >
template<int TILE_ITEMS>
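__device__ __forceinline__ void hipcub::GridEvenShare< OffsetT >::BlockInit (OffsetT block_offset, OffsetT block_end) | inline |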
Block-initialization, specialized for a "raking" access pattern in which each thread block is assigned a consecutive sequence of input tiles.
- Parameters
-
[in] block_offset | Thread block begin offset (inclusive) |
[in] block_end | Thread block end offset (exclusive) |
The documentation for this struct was generated from the following file:
- /home/docs/checkouts/readthedocs.org/user_builds/advanced-micro-devices-hipcub/checkouts/docs-5.4.2/hipcub/include/hipcub/backend/rocprim/grid/grid_even_share.hpp