BlockShuffle< T, BLOCK_DIM_X, BLOCK_DIM_Y, BLOCK_DIM_Z, ARCH > Class Template Reference

hipcub::BlockShuffle< T, BLOCK_DIM_X, BLOCK_DIM_Y, BLOCK_DIM_Z, ARCH > Class Template Reference

Public Types

using TempStorage = typename base_type::storage_type
 

Public Member Functions

__device__ BlockShuffle (TempStorage &temp_storage)
 
__device__ void Offset (T input, T &output, int distance=1)
 Each thread i obtains the input provided by thread i+distance. The offset distance may be negative.
 
__device__ void Rotate (T input, T &output, unsigned int distance=1)
 Each thread i obtains the input provided by thread (i+distance) % BLOCK_THREADS.
 
template<int ITEMS_PER_THREAD>
__device__ void Up (T(&input)[ITEMS_PER_THREAD], T(&prev)[ITEMS_PER_THREAD])
 The thread block rotates its blocked arrangement of input items, shifting it up by one item.
 
template<int ITEMS_PER_THREAD>
__device__ void Up (T(&input)[ITEMS_PER_THREAD], T(&prev)[ITEMS_PER_THREAD], T &block_suffix)
 The thread block rotates its blocked arrangement of input items, shifting it up by one item. All threads receive the input provided by thread BLOCK_THREADS-1.
 
template<int ITEMS_PER_THREAD>
__device__ void Down (T(&input)[ITEMS_PER_THREAD], T(&next)[ITEMS_PER_THREAD])
 The thread block rotates its blocked arrangement of input items, shifting it down by one item.
 
template<int ITEMS_PER_THREAD>
__device__ void Down (T(&input)[ITEMS_PER_THREAD], T(&next)[ITEMS_PER_THREAD], T &block_prefix)
 The thread block rotates its blocked arrangement of input items, shifting it down by one item. All threads receive input[0] provided by thread 0.
 

Constructor & Destructor Documentation

◆ BlockShuffle()

template<typename T , int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
__device__ hipcub::BlockShuffle< T, BLOCK_DIM_X, BLOCK_DIM_Y, BLOCK_DIM_Z, ARCH >::BlockShuffle ( TempStorage &  temp_storage)
inline
Parameters
[in] temp_storage: Reference to memory allocation having layout type TempStorage

Member Function Documentation

◆ Offset()

template<typename T , int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
__device__ void hipcub::BlockShuffle< T, BLOCK_DIM_X, BLOCK_DIM_Y, BLOCK_DIM_Z, ARCH >::Offset ( T  input,
T &  output,
int  distance = 1 
)
inline

Each thread i obtains the input provided by thread i+distance. The offset distance may be negative.

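  • A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the collective's temporary storage (e.g., temp_storage) is to be reused or repurposed.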
Parameters
[in] input: The input item from the calling thread (thread i)
[out] output: The input item from the successor (or predecessor) thread i+distance (may be aliased to input). This value is only updated for thread i when 0 <= (i + distance) < BLOCK_THREADS
[in] distance: Offset distance (may be negative)
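
The update rule for output can be pictured with a small host-side model. This is only a sketch in plain C++, not the hipCUB implementation: the function name offset_model is hypothetical, each vector index stands in for a thread, and it assumes a source is valid whenever i + distance names a thread in the block.

```cpp
#include <cassert>
#include <vector>

// Host-side reference model (hypothetical helper, not part of hipCUB).
// inputs[i]  : the value thread i passes as `input`
// outputs[i] : the contents of thread i's `output` before the call
// Returns the per-thread `output` values after the call.
std::vector<int> offset_model(const std::vector<int>& inputs,
                              std::vector<int> outputs,
                              int distance)
{
    assert(inputs.size() == outputs.size());
    const int block_threads = static_cast<int>(inputs.size());
    for (int i = 0; i < block_threads; ++i) {
        const int src = i + distance;
        // Assumption: the source is valid when it names a thread in the block.
        if (src >= 0 && src < block_threads) {
            outputs[i] = inputs[src];
        }
        // Otherwise thread i keeps whatever `output` held before.
    }
    return outputs;
}
```

With inputs {10, 11, 12, 13} and distance 1, threads 0 to 2 in this model receive 11, 12 and 13, while thread 3 keeps its previous output value.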

◆ Rotate()

template<typename T , int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
__device__ void hipcub::BlockShuffle< T, BLOCK_DIM_X, BLOCK_DIM_Y, BLOCK_DIM_Z, ARCH >::Rotate ( T  input,
T &  output,
unsigned int  distance = 1 
)
inline

Each thread i obtains the input provided by thread (i+distance) % BLOCK_THREADS.

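  • A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the collective's temporary storage (e.g., temp_storage) is to be reused or repurposed.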
Parameters
[in] input: The calling thread's input item
[out] output: The input item from thread (i + distance) % BLOCK_THREADS (may be aliased to input)
[in] distance: Offset distance (0 < distance < BLOCK_THREADS)
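
The wrap-around indexing can be sketched on the host as follows. This is a plain C++ model, not the hipCUB implementation: rotate_model is a hypothetical name, each vector index stands in for a thread, and the sketch assumes every thread receives a value because the modulo always names a valid source thread.

```cpp
#include <cassert>
#include <vector>

// Host-side reference model (hypothetical helper, not part of hipCUB).
// inputs[i] is the value thread i passes as `input`; the returned vector
// holds each thread's `output` after the call.
std::vector<int> rotate_model(const std::vector<int>& inputs,
                              unsigned int distance)
{
    const unsigned int block_threads =
        static_cast<unsigned int>(inputs.size());
    // Documented range for the rotation distance.
    assert(distance > 0 && distance < block_threads);

    std::vector<int> outputs(inputs.size());
    for (unsigned int i = 0; i < block_threads; ++i) {
        // The source index wraps around the end of the block.
        outputs[i] = inputs[(i + distance) % block_threads];
    }
    return outputs;
}
```

For inputs {10, 11, 12, 13} and distance 1 the model yields {11, 12, 13, 10}: the last thread wraps around and reads from thread 0.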

◆ Up() [1/2]

template<typename T , int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
template<int ITEMS_PER_THREAD>
__device__ void hipcub::BlockShuffle< T, BLOCK_DIM_X, BLOCK_DIM_Y, BLOCK_DIM_Z, ARCH >::Up ( T(&)  input[ITEMS_PER_THREAD],
T(&)  prev[ITEMS_PER_THREAD] 
)
inline

The thread block rotates its blocked arrangement of input items, shifting it up by one item.

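  • Assumes a blocked arrangement of (block-threads * items-per-thread) items across the thread block, where thread i owns the i-th range of items-per-thread contiguous items. For multi-dimensional thread blocks, a row-major thread ordering is assumed.
  • Efficiency is increased with increased granularity ITEMS_PER_THREAD.
  • A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the collective's temporary storage (e.g., temp_storage) is to be reused or repurposed.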
Parameters
[in] input: The calling thread's input items
[out] prev: The corresponding predecessor items (may be aliased to input). The item prev[0] is not updated for thread 0.
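
As an illustration of what "shifting up by one item" means for a blocked arrangement, here is a host-side model in plain C++. It is a sketch, not the hipCUB implementation: the Block alias and up_model are hypothetical names, and the outer vector index stands in for the thread index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Block[thread][item]: a blocked arrangement held on the host (hypothetical).
using Block = std::vector<std::vector<int>>;

// Host-side reference model (not part of hipCUB). `prev` is taken by value
// so that slots the operation does not write keep their old contents.
Block up_model(const Block& input, Block prev)
{
    assert(input.size() == prev.size());
    for (std::size_t t = 0; t < input.size(); ++t) {
        const std::size_t items = input[t].size();
        assert(items > 0 && prev[t].size() == items);
        // Items 1..items-1 take the preceding item of the same thread.
        for (std::size_t j = items - 1; j > 0; --j) {
            prev[t][j] = input[t][j - 1];
        }
        // Item 0 takes the last item of the preceding thread.
        // Thread 0 has no predecessor, so its prev[0] is left untouched.
        if (t > 0) {
            prev[t][0] = input[t - 1].back();
        }
    }
    return prev;
}
```

With three threads holding {1, 2}, {3, 4} and {5, 6}, the model produces {old, 1}, {2, 3} and {4, 5}, where old is whatever thread 0 had in prev[0] before the call.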

◆ Up() [2/2]

template<typename T , int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
template<int ITEMS_PER_THREAD>
__device__ void hipcub::BlockShuffle< T, BLOCK_DIM_X, BLOCK_DIM_Y, BLOCK_DIM_Z, ARCH >::Up ( T(&)  input[ITEMS_PER_THREAD],
T(&)  prev[ITEMS_PER_THREAD],
T &  block_suffix 
)
inline

The thread block rotates its blocked arrangement of input items, shifting it up by one item. All threads receive the input provided by thread BLOCK_THREADS-1.

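  • Assumes a blocked arrangement of (block-threads * items-per-thread) items across the thread block, where thread i owns the i-th range of items-per-thread contiguous items. For multi-dimensional thread blocks, a row-major thread ordering is assumed.
  • Efficiency is increased with increased granularity ITEMS_PER_THREAD.
  • A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the collective's temporary storage (e.g., temp_storage) is to be reused or repurposed.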
Parameters
[in] input: The calling thread's input items
[out] prev: The corresponding predecessor items (may be aliased to input). The item prev[0] is not updated for thread 0.
[out] block_suffix: The item input[ITEMS_PER_THREAD-1] from thread BLOCK_THREADS-1, provided to all threads
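
Viewed as one flat sequence in thread-major order, this overload amounts to a shift by one position plus a broadcast of the final element. The following host-side sketch in plain C++ illustrates that view; it is not the hipCUB implementation, and UpResult and up_with_suffix_model are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type for the host-side model.
struct UpResult {
    std::vector<int> prev;  // flattened `prev` items, thread-major order
    int block_suffix;       // the single value every thread receives
};

// flat_input[t * ITEMS_PER_THREAD + j] plays the role of input[j] in thread t.
// flat_prev is taken by value so the unwritten slot keeps its old contents.
UpResult up_with_suffix_model(const std::vector<int>& flat_input,
                              std::vector<int> flat_prev)
{
    assert(!flat_input.empty() && flat_input.size() == flat_prev.size());
    for (std::size_t k = 1; k < flat_input.size(); ++k) {
        flat_prev[k] = flat_input[k - 1];
    }
    // flat_prev[0], i.e. prev[0] of thread 0, is left untouched.
    // The suffix is the last item of the last thread.
    return UpResult{flat_prev, flat_input.back()};
}
```

For the flattened input {1, 2, 3, 4, 5, 6} the model returns prev = {old, 1, 2, 3, 4, 5} and block_suffix = 6.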

◆ Down() [1/2]

template<typename T , int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
template<int ITEMS_PER_THREAD>
__device__ void hipcub::BlockShuffle< T, BLOCK_DIM_X, BLOCK_DIM_Y, BLOCK_DIM_Z, ARCH >::Down ( T(&)  input[ITEMS_PER_THREAD],
T(&)  next[ITEMS_PER_THREAD] 
)
inline

The thread block rotates its blocked arrangement of input items, shifting it down by one item.

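  • Assumes a blocked arrangement of (block-threads * items-per-thread) items across the thread block, where thread i owns the i-th range of items-per-thread contiguous items. For multi-dimensional thread blocks, a row-major thread ordering is assumed.
  • Efficiency is increased with increased granularity ITEMS_PER_THREAD.
  • A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the collective's temporary storage (e.g., temp_storage) is to be reused or repurposed.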
Parameters
[in] input: The calling thread's input items
[out] next: The corresponding successor items (may be aliased to input). The item next[ITEMS_PER_THREAD-1] is not updated for thread BLOCK_THREADS-1.
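
A host-side model of the downward shift, mirroring the blocked layout, may help. This is a plain C++ sketch, not the hipCUB implementation: Block and down_model are hypothetical names, and the sketch assumes the only slot left unwritten is the last item of the last thread, which has no successor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Block[thread][item]: a blocked arrangement held on the host (hypothetical).
using Block = std::vector<std::vector<int>>;

// Host-side reference model (not part of hipCUB). `next` is taken by value
// so that slots the operation does not write keep their old contents.
Block down_model(const Block& input, Block next)
{
    assert(input.size() == next.size());
    for (std::size_t t = 0; t < input.size(); ++t) {
        const std::size_t items = input[t].size();
        assert(items > 0 && next[t].size() == items);
        // Items 0..items-2 take the following item of the same thread.
        for (std::size_t j = 0; j + 1 < items; ++j) {
            next[t][j] = input[t][j + 1];
        }
        // The last item takes the first item of the following thread.
        // Assumption: the last thread leaves this slot untouched.
        if (t + 1 < input.size()) {
            assert(input[t + 1].size() == items);
            next[t][items - 1] = input[t + 1].front();
        }
    }
    return next;
}
```

With three threads holding {1, 2}, {3, 4} and {5, 6}, the model produces {2, 3}, {4, 5} and {6, old}.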

◆ Down() [2/2]

template<typename T , int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
template<int ITEMS_PER_THREAD>
__device__ void hipcub::BlockShuffle< T, BLOCK_DIM_X, BLOCK_DIM_Y, BLOCK_DIM_Z, ARCH >::Down ( T(&)  input[ITEMS_PER_THREAD],
T(&)  next[ITEMS_PER_THREAD],
T &  block_prefix 
)
inline

The thread block rotates its blocked arrangement of input items, shifting it down by one item. All threads receive input[0] provided by thread 0.

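  • Assumes a blocked arrangement of (block-threads * items-per-thread) items across the thread block, where thread i owns the i-th range of items-per-thread contiguous items. For multi-dimensional thread blocks, a row-major thread ordering is assumed.
  • Efficiency is increased with increased granularity ITEMS_PER_THREAD.
  • A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the collective's temporary storage (e.g., temp_storage) is to be reused or repurposed.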
Parameters
[in] input: The calling thread's input items
[out] next: The corresponding successor items (may be aliased to input). The item next[ITEMS_PER_THREAD-1] is not updated for thread BLOCK_THREADS-1.
[out] block_prefix: The item input[0] from thread 0, provided to all threads
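
In flattened, thread-major order this overload behaves like a shift toward lower indices plus a broadcast of the very first element. The host-side sketch below, in plain C++, illustrates that reading. It is not the hipCUB implementation: DownResult and down_with_prefix_model are hypothetical names, and the sketch assumes the final slot (the last item of the last thread) is the one left unwritten.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type for the host-side model.
struct DownResult {
    std::vector<int> next;  // flattened `next` items, thread-major order
    int block_prefix;       // the single value every thread receives
};

// flat_input[t * ITEMS_PER_THREAD + j] plays the role of input[j] in thread t.
// flat_next is taken by value so the unwritten slot keeps its old contents.
DownResult down_with_prefix_model(const std::vector<int>& flat_input,
                                  std::vector<int> flat_next)
{
    assert(!flat_input.empty() && flat_input.size() == flat_next.size());
    for (std::size_t k = 0; k + 1 < flat_input.size(); ++k) {
        flat_next[k] = flat_input[k + 1];
    }
    // Assumption: the final slot has no successor and is left untouched.
    // The prefix is input[0] of thread 0.
    return DownResult{flat_next, flat_input.front()};
}
```

For the flattened input {1, 2, 3, 4, 5, 6} the model returns next = {2, 3, 4, 5, 6, old} and block_prefix = 1.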

The documentation for this class was generated from the following file:
  • hipcub/include/hipcub/backend/rocprim/block/block_shuffle.hpp