hipcub::BlockShuffle< T, BLOCK_DIM_X, BLOCK_DIM_Y, BLOCK_DIM_Z, ARCH > Class Template Reference
Public Types
using TempStorage = typename base_type::storage_type
Public Member Functions
__device__ BlockShuffle (TempStorage &temp_storage)
__device__ void Offset (T input, T &output, int distance=1)
    Each thread i obtains the input provided by thread i + distance. The offset distance may be negative.
__device__ void Rotate (T input, T &output, unsigned int distance=1)
    Each thread i obtains the input provided by thread (i + distance) % BLOCK_THREADS.
template<int ITEMS_PER_THREAD> __device__ void Up (T(&input)[ITEMS_PER_THREAD], T(&prev)[ITEMS_PER_THREAD])
    The thread block rotates its blocked arrangement of input items, shifting it up by one item.
template<int ITEMS_PER_THREAD> __device__ void Up (T(&input)[ITEMS_PER_THREAD], T(&prev)[ITEMS_PER_THREAD], T &block_suffix)
    The thread block rotates its blocked arrangement of input items, shifting it up by one item. All threads receive input[ITEMS_PER_THREAD-1] provided by thread BLOCK_THREADS-1.
template<int ITEMS_PER_THREAD> __device__ void Down (T(&input)[ITEMS_PER_THREAD], T(&next)[ITEMS_PER_THREAD])
    The thread block rotates its blocked arrangement of input items, shifting it down by one item.
template<int ITEMS_PER_THREAD> __device__ void Down (T(&input)[ITEMS_PER_THREAD], T(&next)[ITEMS_PER_THREAD], T &block_prefix)
    The thread block rotates its blocked arrangement of input items, shifting it down by one item. All threads receive input[0] provided by thread 0.
Constructor & Destructor Documentation
◆ BlockShuffle()
template<typename T , int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
__device__ BlockShuffle (TempStorage &temp_storage) [inline]
- Parameters
  - [in] temp_storage: Reference to memory allocation having layout type TempStorage.
Member Function Documentation
◆ Offset()
template<typename T , int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
__device__ void Offset (T input, T &output, int distance = 1) [inline]
Each thread i obtains the input provided by thread i + distance. The offset distance may be negative.
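- A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the collective's temporary storage (e.g., temp_storage) is to be reused or repurposed.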
- Parameters
  - [in] input: The input item from the calling thread (thread i).
  - [out] output: The input item from the successor (or predecessor) thread i + distance (may be aliased to input). This value is only updated for thread i when 0 <= (i + distance) < BLOCK_THREADS.
  - [in] distance: Offset distance (may be negative).
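The update rule for Offset can be illustrated with a small host-side reference model. This is only a sketch of the semantics described above, not the hipCUB implementation: `offset_model` is a hypothetical helper, a `std::vector` stands in for the per-thread values of one block, and the code assumes that thread i is updated exactly when 0 <= i + distance < BLOCK_THREADS.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side model, not part of hipCUB.
// inputs[i]  : the value thread i passes as `input`.
// outputs[i] : the value thread i holds in `output` before the call.
// Returns the per-thread `output` values after the call.
template <typename T>
std::vector<T> offset_model(const std::vector<T>& inputs,
                            std::vector<T> outputs,
                            int distance = 1)
{
    assert(inputs.size() == outputs.size());
    const int block_threads = static_cast<int>(inputs.size());
    for (int i = 0; i < block_threads; ++i) {
        const int src = i + distance;
        // Assumption: a thread whose source index falls outside the block
        // keeps its previous output value.
        if (src >= 0 && src < block_threads) {
            outputs[i] = inputs[src];
        }
    }
    return outputs;
}
```

With four threads holding 10, 11, 12, 13 and outputs preset to -1, a distance of 1 leaves the last thread at -1, while a distance of -2 leaves the first two threads at -1.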
◆ Rotate()
template<typename T , int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
__device__ void Rotate (T input, T &output, unsigned int distance = 1) [inline]
Each thread i obtains the input provided by thread (i + distance) % BLOCK_THREADS.
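- A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the collective's temporary storage (e.g., temp_storage) is to be reused or repurposed.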
- Parameters
  - [in] input: The calling thread's input item.
  - [out] output: The input item from thread (i + distance) % BLOCK_THREADS (may be aliased to input).
  - [in] distance: Offset distance (0 < distance < BLOCK_THREADS).
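The wrap-around indexing of Rotate can be sketched on the host as follows. `rotate_model` is a hypothetical helper, not a hipCUB function, and the sketch assumes that every thread's output is written, as the modulo formula suggests.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side model, not part of hipCUB.
// inputs[i] is the value thread i passes as `input`; the result holds the
// value each thread would see in `output`.
template <typename T>
std::vector<T> rotate_model(const std::vector<T>& inputs,
                            unsigned int distance = 1)
{
    const unsigned int block_threads = static_cast<unsigned int>(inputs.size());
    // Documented precondition: 0 < distance < BLOCK_THREADS.
    assert(distance > 0 && distance < block_threads);
    std::vector<T> outputs(inputs.size());
    for (unsigned int i = 0; i < block_threads; ++i) {
        // Source index wraps around the end of the block.
        outputs[i] = inputs[(i + distance) % block_threads];
    }
    return outputs;
}
```

For four threads holding 10, 11, 12, 13, a distance of 1 yields 11, 12, 13, 10 under this model.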
◆ Up() [1/2]
template<typename T , int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
template<int ITEMS_PER_THREAD>
__device__ void Up (T(&input)[ITEMS_PER_THREAD], T(&prev)[ITEMS_PER_THREAD]) [inline]
The thread block rotates its blocked arrangement of input items, shifting it up by one item.
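- Assumes a blocked arrangement of (block-threads * items-per-thread) items across the thread block, where thread i owns the i-th range of items-per-thread contiguous items. For multi-dimensional thread blocks, a row-major thread ordering is assumed.
- Efficiency is increased with increased granularity ITEMS_PER_THREAD. Performance is also typically increased until the additional register pressure or shared memory allocation size causes occupancy to fall too low.
- A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the collective's temporary storage (e.g., temp_storage) is to be reused or repurposed.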
- Parameters
  - [in] input: The calling thread's input items.
  - [out] prev: The corresponding predecessor items (may be aliased to input). The item prev[0] is not updated for thread 0.
◆ Up() [2/2]
template<typename T , int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
template<int ITEMS_PER_THREAD>
__device__ void Up (T(&input)[ITEMS_PER_THREAD], T(&prev)[ITEMS_PER_THREAD], T &block_suffix) [inline]
The thread block rotates its blocked arrangement of input items, shifting it up by one item. All threads receive the input provided by thread BLOCK_THREADS-1.
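- Assumes a blocked arrangement of (block-threads * items-per-thread) items across the thread block, where thread i owns the i-th range of items-per-thread contiguous items. For multi-dimensional thread blocks, a row-major thread ordering is assumed.
- Efficiency is increased with increased granularity ITEMS_PER_THREAD. Performance is also typically increased until the additional register pressure or shared memory allocation size causes occupancy to fall too low.
- A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the collective's temporary storage (e.g., temp_storage) is to be reused or repurposed.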
- Parameters
  - [in] input: The calling thread's input items.
  - [out] prev: The corresponding predecessor items (may be aliased to input). The item prev[0] is not updated for thread 0.
  - [out] block_suffix: The item input[ITEMS_PER_THREAD-1] from thread BLOCK_THREADS-1, provided to all threads.
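As a rough illustration of the Up shift, the blocked arrangement can be flattened into one array in which thread i owns elements [i * ITEMS_PER_THREAD, (i + 1) * ITEMS_PER_THREAD). In that view every element takes the value of its left neighbor, and the very first element (prev[0] of thread 0) is left alone. `up_model` below is a hypothetical host-side helper under that assumption, not the hipCUB implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side model, not part of hipCUB.
// `input` and `prev` hold the flattened blocked arrangement of the whole block.
// Returns the value every thread would receive as `block_suffix`.
template <typename T>
T up_model(const std::vector<T>& input, std::vector<T>& prev)
{
    assert(!input.empty() && input.size() == prev.size());
    // Capture the suffix first so the model also works when prev aliases input.
    const T block_suffix = input.back();
    // Walk from the back so an aliased input is read before it is overwritten.
    for (std::size_t j = input.size() - 1; j > 0; --j) {
        prev[j] = input[j - 1];
    }
    // prev[0] (thread 0, item 0) is intentionally not written.
    return block_suffix;
}
```

With three threads of two items each holding 1 through 6, and prev preset to zeros, the model leaves prev as 0, 1, 2, 3, 4, 5 and returns 6 as the block suffix.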
◆ Down() [1/2]
template<typename T , int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
template<int ITEMS_PER_THREAD>
__device__ void Down (T(&input)[ITEMS_PER_THREAD], T(&next)[ITEMS_PER_THREAD]) [inline]
The thread block rotates its blocked arrangement of input items, shifting it down by one item.
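- Assumes a blocked arrangement of (block-threads * items-per-thread) items across the thread block, where thread i owns the i-th range of items-per-thread contiguous items. For multi-dimensional thread blocks, a row-major thread ordering is assumed.
- Efficiency is increased with increased granularity ITEMS_PER_THREAD. Performance is also typically increased until the additional register pressure or shared memory allocation size causes occupancy to fall too low.
- A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the collective's temporary storage (e.g., temp_storage) is to be reused or repurposed.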
- Parameters
  - [in] input: The calling thread's input items.
  - [out] next: The corresponding successor items (may be aliased to input). The value next[ITEMS_PER_THREAD-1] is not updated for thread BLOCK_THREADS-1.
◆ Down() [2/2]
template<typename T , int BLOCK_DIM_X, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1, int ARCH = 1>
template<int ITEMS_PER_THREAD>
__device__ void Down (T(&input)[ITEMS_PER_THREAD], T(&next)[ITEMS_PER_THREAD], T &block_prefix) [inline]
The thread block rotates its blocked arrangement of input items, shifting it down by one item. All threads receive input[0] provided by thread 0.
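- Assumes a blocked arrangement of (block-threads * items-per-thread) items across the thread block, where thread i owns the i-th range of items-per-thread contiguous items. For multi-dimensional thread blocks, a row-major thread ordering is assumed.
- Efficiency is increased with increased granularity ITEMS_PER_THREAD. Performance is also typically increased until the additional register pressure or shared memory allocation size causes occupancy to fall too low.
- A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the collective's temporary storage (e.g., temp_storage) is to be reused or repurposed.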
- Parameters
  - [in] input: The calling thread's input items.
  - [out] next: The corresponding successor items (may be aliased to input). The value next[ITEMS_PER_THREAD-1] is not updated for thread BLOCK_THREADS-1.
  - [out] block_prefix: The item input[0] from thread 0, provided to all threads.
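The Down shift can be sketched per thread on the host. `down_model` is a hypothetical helper, not a hipCUB function; each `std::array` stands in for one thread's items. The sketch assumes that the slot without a successor is the last item of the last thread (next[ITEMS_PER_THREAD-1] of thread BLOCK_THREADS-1), which is therefore left untouched.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side model, not part of hipCUB.
// input[tid] / next[tid] hold the items owned by thread tid.
// Returns the value every thread would receive as `block_prefix`.
template <typename T, std::size_t ITEMS_PER_THREAD>
T down_model(const std::vector<std::array<T, ITEMS_PER_THREAD>>& input,
             std::vector<std::array<T, ITEMS_PER_THREAD>>& next)
{
    static_assert(ITEMS_PER_THREAD > 0, "each thread needs at least one item");
    assert(!input.empty() && input.size() == next.size());
    // Capture the prefix first so the model also works when next aliases input.
    const T block_prefix = input[0][0];
    const std::size_t block_threads = input.size();
    for (std::size_t tid = 0; tid < block_threads; ++tid) {
        // Items that move within the same thread.
        for (std::size_t k = 0; k + 1 < ITEMS_PER_THREAD; ++k) {
            next[tid][k] = input[tid][k + 1];
        }
        // The last slot takes the first item of the following thread.
        // Assumption: the last thread has no successor, so its slot is kept.
        if (tid + 1 < block_threads) {
            next[tid][ITEMS_PER_THREAD - 1] = input[tid + 1][0];
        }
    }
    return block_prefix;
}
```

With three threads holding {1, 2}, {3, 4}, {5, 6} and next preset to zeros, the model produces {2, 3}, {4, 5}, {6, 0} and returns 1 as the block prefix.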
The documentation for this class was generated from the following file:
- hipcub/include/hipcub/backend/rocprim/block/block_shuffle.hpp