hipcub::BlockShuffle< T, BLOCK_DIM_X, BLOCK_DIM_Y, BLOCK_DIM_Z, ARCH > Class Template Reference
Public Types
| using | TempStorage = typename base_type::storage_type |
Public Member Functions
| __device__ | BlockShuffle (TempStorage &temp_storage) |
| __device__ void | Offset (T input, T &output, int distance=1) |
| | Each thread i obtains the input provided by thread i+distance. The offset distance may be negative. |
| __device__ void | Rotate (T input, T &output, unsigned int distance=1) |
| | Each thread i obtains the input provided by thread i+distance. |
| template<int ITEMS_PER_THREAD> __device__ void | Up (T(&input)[ITEMS_PER_THREAD], T(&prev)[ITEMS_PER_THREAD]) |
| | The thread block rotates its blocked arrangement of input items, shifting it up by one item. |
| template<int ITEMS_PER_THREAD> __device__ void | Up (T(&input)[ITEMS_PER_THREAD], T(&prev)[ITEMS_PER_THREAD], T &block_suffix) |
| | The thread block rotates its blocked arrangement of input items, shifting it up by one item. All threads receive the input provided by thread BLOCK_THREADS-1. |
| template<int ITEMS_PER_THREAD> __device__ void | Down (T(&input)[ITEMS_PER_THREAD], T(&next)[ITEMS_PER_THREAD]) |
| | The thread block rotates its blocked arrangement of input items, shifting it down by one item. |
| template<int ITEMS_PER_THREAD> __device__ void | Down (T(&input)[ITEMS_PER_THREAD], T(&next)[ITEMS_PER_THREAD], T &block_prefix) |
| | The thread block rotates its blocked arrangement of input items, shifting it down by one item. All threads receive input[0] provided by thread 0. |
Constructor & Destructor Documentation
◆ BlockShuffle()
template<typename T, int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
| __device__ BlockShuffle (TempStorage &temp_storage) | inline |
- Parameters
  - [in] temp_storage: Reference to memory allocation having layout type TempStorage
Member Function Documentation
◆ Offset()
template<typename T, int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
| __device__ void Offset (T input, T &output, int distance=1) | inline |
Each thread i obtains the input provided by thread i+distance. The offset distance may be negative.
- \smemreuse
- Parameters
  - [in] input: The input item from the calling thread (thread i)
  - [out] output: The input item from the successor (or predecessor) thread i+distance (may be aliased to input). This value is only updated for thread i when 0 <= (i + distance) < BLOCK_THREADS-1
  - [in] distance: Offset distance (may be negative)
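The exchange pattern of Offset() can be illustrated with a small host-side model. This is only a sketch for reasoning about which thread reads from which: `offset_model` is a hypothetical helper that runs on the CPU and is not part of hipCUB. It assumes every thread index in the block is a valid source, so treat the exact upper bound as something to verify against the implementation.

```cpp
#include <array>
#include <cstddef>

// Hypothetical host-side model of the Offset exchange; not part of hipCUB.
// in[i]  : value thread i passes as `input`.
// out[i] : value thread i already holds in `output`; it is overwritten only
//          when thread i+distance exists.
template <typename T, std::size_t BLOCK_THREADS>
std::array<T, BLOCK_THREADS> offset_model(const std::array<T, BLOCK_THREADS>& in,
                                          std::array<T, BLOCK_THREADS> out,
                                          int distance = 1)
{
    for (std::size_t i = 0; i < BLOCK_THREADS; ++i) {
        const int src = static_cast<int>(i) + distance;
        // Assumption: any thread index in [0, BLOCK_THREADS) is a valid source.
        if (src >= 0 && src < static_cast<int>(BLOCK_THREADS)) {
            out[i] = in[static_cast<std::size_t>(src)];
        }
    }
    return out;
}
```

Under that assumption, inputs {10, 11, 12, 13} with distance 1 and outputs preset to -1 give {11, 12, 13, -1}: the last thread has no successor and keeps its previous output value.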
◆ Rotate()
template<typename T, int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
| __device__ void Rotate (T input, T &output, unsigned int distance=1) | inline |
Each thread i obtains the input provided by thread i+distance.
- \smemreuse
- Parameters
  - [in] input: The calling thread's input item
  - [out] output: The input item from thread (i + distance) % BLOCK_THREADS (may be aliased to input). This value is not updated for thread BLOCK_THREADS-1
  - [in] distance: Offset distance (0 < distance < BLOCK_THREADS)
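The wrap-around indexing of Rotate() can be sketched on the host as follows. `rotate_model` is a hypothetical CPU helper, not a hipCUB function, and it applies the modular index (i + distance) % BLOCK_THREADS to every thread. That uniform treatment is an assumption of the sketch; it makes no attempt to model any thread whose output the library might leave untouched.

```cpp
#include <array>
#include <cstddef>

// Hypothetical host-side model of the Rotate exchange; not part of hipCUB.
// in[i] is the value thread i passes as `input`; the result holds the value
// each thread would read from thread (i + distance) % BLOCK_THREADS.
template <typename T, std::size_t BLOCK_THREADS>
std::array<T, BLOCK_THREADS> rotate_model(const std::array<T, BLOCK_THREADS>& in,
                                          unsigned int distance = 1)
{
    std::array<T, BLOCK_THREADS> out{};
    for (std::size_t i = 0; i < BLOCK_THREADS; ++i) {
        // Assumption: the modular index is applied to every thread alike.
        out[i] = in[(i + distance) % BLOCK_THREADS];
    }
    return out;
}
```

In this model, inputs {10, 11, 12, 13} rotated by a distance of 1 yield {11, 12, 13, 10}.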
◆ Up() [1/2]
template<typename T, int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
template<int ITEMS_PER_THREAD>
| __device__ void Up (T(&input)[ITEMS_PER_THREAD], T(&prev)[ITEMS_PER_THREAD]) | inline |
The thread block rotates its blocked arrangement of input items, shifting it up by one item.
- \blocked
- \granularity
- \smemreuse
- Parameters
  - [in] input: The calling thread's input items
  - [out] prev: The corresponding predecessor items (may be aliased to input). The item prev[0] is not updated for thread 0.
◆ Up() [2/2]
template<typename T, int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
template<int ITEMS_PER_THREAD>
| __device__ void Up (T(&input)[ITEMS_PER_THREAD], T(&prev)[ITEMS_PER_THREAD], T &block_suffix) | inline |
The thread block rotates its blocked arrangement of input items, shifting it up by one item. All threads receive the input provided by thread BLOCK_THREADS-1.
- \blocked
- \granularity
- \smemreuse
- Parameters
  - [in] input: The calling thread's input items
  - [out] prev: The corresponding predecessor items (may be aliased to input). The item prev[0] is not updated for thread 0.
  - [out] block_suffix: The item input[ITEMS_PER_THREAD-1] from thread BLOCK_THREADS-1, provided to all threads
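To make the blocked shift concrete, the following host-side sketch models both Up() overloads on the CPU. `up_model` is a hypothetical helper, not part of hipCUB. Row t of each array stands for the items held by thread t, and the return value plays the role of `block_suffix`. It follows the parameter descriptions above: each item receives its predecessor in the blocked arrangement, and prev[0] of thread 0 is left unchanged.

```cpp
#include <array>
#include <cstddef>

// Hypothetical host-side model of Up(); not part of hipCUB.
// input[t][k] is item k of thread t in a blocked arrangement.
// Returns the value every thread would receive as `block_suffix`.
template <typename T, std::size_t BLOCK_THREADS, std::size_t ITEMS_PER_THREAD>
T up_model(const std::array<std::array<T, ITEMS_PER_THREAD>, BLOCK_THREADS>& input,
           std::array<std::array<T, ITEMS_PER_THREAD>, BLOCK_THREADS>& prev)
{
    // Copy first so that `prev` may refer to the same object as `input`.
    const auto snapshot = input;
    for (std::size_t t = 0; t < BLOCK_THREADS; ++t) {
        for (std::size_t k = 0; k < ITEMS_PER_THREAD; ++k) {
            if (k > 0) {
                prev[t][k] = snapshot[t][k - 1];  // predecessor in the same thread
            } else if (t > 0) {
                prev[t][0] = snapshot[t - 1][ITEMS_PER_THREAD - 1];  // last item of thread t-1
            }
            // prev[0][0] has no predecessor and is left untouched.
        }
    }
    return snapshot[BLOCK_THREADS - 1][ITEMS_PER_THREAD - 1];
}
```

For three threads holding {1, 2}, {3, 4}, and {5, 6}, with `prev` preset to -1, the model produces {-1, 1}, {2, 3}, {4, 5} and returns 6 as the suffix.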
◆ Down() [1/2]
template<typename T, int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
template<int ITEMS_PER_THREAD>
| __device__ void Down (T(&input)[ITEMS_PER_THREAD], T(&next)[ITEMS_PER_THREAD]) | inline |
The thread block rotates its blocked arrangement of input items, shifting it down by one item.
- \blocked
- \granularity
- \smemreuse
- Parameters
  - [in] input: The calling thread's input items
  - [out] next: The corresponding successor items (may be aliased to input). The value next[0] is not updated for thread BLOCK_THREADS-1.
◆ Down() [2/2]
template<typename T, int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
template<int ITEMS_PER_THREAD>
| __device__ void Down (T(&input)[ITEMS_PER_THREAD], T(&next)[ITEMS_PER_THREAD], T &block_prefix) | inline |
The thread block rotates its blocked arrangement of input items, shifting it down by one item. All threads receive input[0] provided by thread 0.
- \blocked
- \granularity
- \smemreuse
- Parameters
  - [in] input: The calling thread's input items
  - [out] next: The corresponding successor items (may be aliased to input). The value next[0] is not updated for thread BLOCK_THREADS-1.
  - [out] block_prefix: The item input[0] from thread 0, provided to all threads
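A host-side sketch of the Down() shift, mirroring the blocked layout used by the library, may help when reasoning about which slot receives which item. `down_model` is a hypothetical CPU helper, not part of hipCUB. It assumes that each slot receives the next item in the blocked arrangement and that the one slot with no successor, the final item of the last thread, is left unchanged. Which slot is skipped is an assumption derived from the shift direction and should be checked against the implementation.

```cpp
#include <array>
#include <cstddef>

// Hypothetical host-side model of Down(); not part of hipCUB.
// input[t][k] is item k of thread t in a blocked arrangement.
// Returns the value every thread would receive as `block_prefix`.
template <typename T, std::size_t BLOCK_THREADS, std::size_t ITEMS_PER_THREAD>
T down_model(const std::array<std::array<T, ITEMS_PER_THREAD>, BLOCK_THREADS>& input,
             std::array<std::array<T, ITEMS_PER_THREAD>, BLOCK_THREADS>& next)
{
    // Copy first so that `next` may refer to the same object as `input`.
    const auto snapshot = input;
    for (std::size_t t = 0; t < BLOCK_THREADS; ++t) {
        for (std::size_t k = 0; k < ITEMS_PER_THREAD; ++k) {
            if (k + 1 < ITEMS_PER_THREAD) {
                next[t][k] = snapshot[t][k + 1];  // successor in the same thread
            } else if (t + 1 < BLOCK_THREADS) {
                next[t][k] = snapshot[t + 1][0];  // first item of thread t+1
            }
            // Assumption: the final slot of the last thread is left untouched.
        }
    }
    return snapshot[0][0];
}
```

For three threads holding {1, 2}, {3, 4}, and {5, 6}, with `next` preset to -1, the model produces {2, 3}, {4, 5}, {6, -1} and returns 1 as the prefix.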
The documentation for this class was generated from the following file:
- hipcub/include/hipcub/backend/rocprim/block/block_shuffle.hpp