GridModule

Classes

class  hipcub::GridBarrier
 GridBarrier implements a software global barrier among thread blocks within a HIP grid. More...
 
class  hipcub::GridBarrierLifetime
 GridBarrierLifetime extends GridBarrier to provide lifetime management of the temporary device storage needed for cooperation. More...
 
struct  hipcub::GridEvenShare< OffsetT >
 GridEvenShare is a descriptor utility for distributing input among thread blocks in an "even-share" fashion. Each thread block gets roughly the same number of input tiles. More...
 
class  hipcub::GridQueue< OffsetT >
 GridQueue is a descriptor utility for dynamic queue management. More...
 

Typedefs

typedef unsigned int hipcub::GridBarrier::SyncFlag
 Synchronization flag type.
 

Enumerations

enum  hipcub::GridMappingStrategy { hipcub::GRID_MAPPING_RAKE , hipcub::GRID_MAPPING_STRIP_MINE , hipcub::GRID_MAPPING_DYNAMIC }
 hipcub::GridMappingStrategy enumerates alternative strategies for mapping constant-sized tiles of device-wide data onto a grid of thread blocks. More...
 

Functions

 hipcub::GridBarrier::GridBarrier ()
 
__device__ __forceinline__ void hipcub::GridBarrier::Sync () const
 
 hipcub::GridBarrierLifetime::GridBarrierLifetime ()
 
hipError_t hipcub::GridBarrierLifetime::HostReset ()
 
virtual hipcub::GridBarrierLifetime::~GridBarrierLifetime ()
 
hipError_t hipcub::GridBarrierLifetime::Setup (int sweep_grid_size)
 

Variables

SyncFlag * hipcub::GridBarrier::d_sync
 
size_t hipcub::GridBarrierLifetime::sync_bytes
 

Detailed Description

Enumeration Type Documentation

◆ GridMappingStrategy

hipcub::GridMappingStrategy enumerates alternative strategies for mapping constant-sized tiles of device-wide data onto a grid of thread blocks.

Enumerator
GRID_MAPPING_RAKE 

An a "raking" access pattern in which each thread block is assigned a consecutive sequence of input tiles.

Overview
The input is evenly partitioned into p segments, where p is constant and corresponds loosely to the number of thread blocks that may actively reside on the target device. Each segment is comprised of consecutive tiles, where a tile is a small, constant-sized unit of input to be processed to completion before the thread block terminates or obtains more work. The kernel invokes p thread blocks, each of which iteratively consumes a segment of n/p elements in tile-size increments.
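
The following is a minimal HIP sketch of the raking pattern, not hipCUB API: the kernel name rake_kernel, the TILE_ITEMS constant, and the per-element doubling are illustrative, and the kernel is assumed to be launched with TILE_ITEMS threads per block.

#include <hip/hip_runtime.h>

constexpr int TILE_ITEMS = 256; // assumed tile size; launch with TILE_ITEMS threads per block

__global__ void rake_kernel(const int* in, int* out, int num_items)
{
    // Evenly partition the tiles among the p = gridDim.x thread blocks.
    int num_tiles     = (num_items + TILE_ITEMS - 1) / TILE_ITEMS;
    int tiles_per_seg = (num_tiles + gridDim.x - 1) / gridDim.x;
    int first_tile    = blockIdx.x * tiles_per_seg;
    int end_tile      = min(first_tile + tiles_per_seg, num_tiles);

    // Consume the segment's consecutive tiles in tile-size increments.
    for (int tile = first_tile; tile < end_tile; ++tile)
    {
        int idx = tile * TILE_ITEMS + threadIdx.x;
        if (idx < num_items)
            out[idx] = in[idx] * 2; // placeholder per-element work
    }
}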
GRID_MAPPING_STRIP_MINE 

An a "strip mining" access pattern in which the input tiles assigned to each thread block are separated by a stride equal to the the extent of the grid.

Overview
The input is evenly partitioned into p sets, where p is constant and corresponds loosely to the number of thread blocks that may actively reside on the target device. Each set is comprised of data tiles separated by a stride equal to the extent of the grid, where a tile is a small, constant-sized unit of input to be processed to completion before the thread block terminates or obtains more work. The kernel invokes p thread blocks, each of which iteratively consumes a segment of n/p elements in tile-size increments.
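
A matching HIP sketch of the strip-mining pattern (again illustrative, not hipCUB API): block b consumes tiles b, b + gridDim.x, b + 2*gridDim.x, and so on.

#include <hip/hip_runtime.h>

constexpr int TILE_ITEMS = 256; // assumed tile size; launch with TILE_ITEMS threads per block

__global__ void strip_mine_kernel(const int* in, int* out, int num_items)
{
    int num_tiles = (num_items + TILE_ITEMS - 1) / TILE_ITEMS;

    // Tiles taken by this block are strided by the extent of the grid.
    for (int tile = blockIdx.x; tile < num_tiles; tile += gridDim.x)
    {
        int idx = tile * TILE_ITEMS + threadIdx.x;
        if (idx < num_items)
            out[idx] = in[idx] * 2; // placeholder per-element work
    }
}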
GRID_MAPPING_DYNAMIC 

A dynamic "queue-based" strategy for assigning input tiles to thread blocks.

Overview
The input is treated as a queue to be dynamically consumed by a grid of thread blocks. Work is atomically dequeued in tiles, where a tile is a unit of input to be processed to completion before the thread block terminates or obtains more work. The grid size p is constant, loosely corresponding to the number of thread blocks that may actively reside on the target device.
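
A minimal HIP sketch of the queue-based strategy; a plain global counter stands in for hipcub::GridQueue here, and all names are illustrative rather than hipCUB API.

#include <hip/hip_runtime.h>

constexpr int TILE_ITEMS = 256; // assumed tile size; launch with TILE_ITEMS threads per block

__global__ void dynamic_kernel(const int* in, int* out, int num_items,
                               unsigned int* tile_counter) // zero-initialized before launch
{
    __shared__ unsigned int tile;
    unsigned int num_tiles = (num_items + TILE_ITEMS - 1) / TILE_ITEMS;

    while (true)
    {
        // One thread dequeues the next unprocessed tile for the whole block.
        if (threadIdx.x == 0)
            tile = atomicAdd(tile_counter, 1u);
        __syncthreads();

        if (tile >= num_tiles)
            break; // queue drained: terminate

        int idx = tile * TILE_ITEMS + threadIdx.x;
        if (idx < num_items)
            out[idx] = in[idx] * 2; // placeholder per-element work

        __syncthreads(); // all threads done reading 'tile' before it is overwritten
    }
}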

Function Documentation

◆ GridBarrier()

hipcub::GridBarrier::GridBarrier ( )
inline

Constructor

◆ GridBarrierLifetime()

hipcub::GridBarrierLifetime::GridBarrierLifetime ( )
inline

Constructor

◆ HostReset()

hipError_t hipcub::GridBarrierLifetime::HostReset ( )
inline

Frees the device storage and resets the progress counters.

◆ ~GridBarrierLifetime()

virtual hipcub::GridBarrierLifetime::~GridBarrierLifetime ( )
inline virtual

Destructor

◆ Setup()

hipError_t hipcub::GridBarrierLifetime::Setup ( int sweep_grid_size )
inline

Sets up the progress counters for the next kernel launch (lazily allocating and initializing them if necessary).
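
A hedged usage sketch, assuming grid_size does not exceed the number of thread blocks that can be co-resident on the device (a requirement for any software global barrier) and that d_data holds at least grid_size * block_size elements. The kernel and function names are illustrative; GridBarrierLifetime owns the device counters on the host side, and its GridBarrier base sub-object is passed to the kernel by value.

#include <hipcub/hipcub.hpp>

__global__ void two_phase_kernel(float* data, hipcub::GridBarrier barrier)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;

    data[idx] += 1.0f;  // phase 1
    barrier.Sync();     // wait until every block in the grid reaches this point
    data[idx] *= 2.0f;  // phase 2: all phase-1 writes are now visible
}

hipError_t launch_two_phase(float* d_data, int grid_size, int block_size)
{
    hipcub::GridBarrierLifetime barrier;

    // Lazily allocates and initializes the device progress counters.
    hipError_t error = barrier.Setup(grid_size);
    if (error != hipSuccess) return error;

    two_phase_kernel<<<grid_size, block_size>>>(d_data, barrier);
    return hipDeviceSynchronize();
}

The temporary device storage is freed when barrier goes out of scope (or explicitly via HostReset()), which is the lifetime management GridBarrierLifetime adds over GridBarrier.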