BlockMergeSortStrategy< KeyT, ValueT, NUM_THREADS, ITEMS_PER_THREAD, SynchronizationPolicy > Class Template Reference#
Generalized merge sort algorithm. More...
#include <block_merge_sort.hpp>
Classes | |
struct | TempStorage |
\smemstorage{BlockMergeSort} More... | |
Public Member Functions | |
__device__ __forceinline__ | BlockMergeSortStrategy (unsigned int linear_tid) |
__device__ __forceinline__ | BlockMergeSortStrategy (TempStorage &temp_storage, unsigned int linear_tid) |
__device__ __forceinline__ unsigned int | get_linear_tid () const |
template<typename CompareOp > | |
__device__ __forceinline__ void | Sort (KeyT(&keys)[ITEMS_PER_THREAD], CompareOp compare_op) |
Sorts items partitioned across a CUDA thread block using a merge sorting method. More... | |
template<typename CompareOp > | |
__device__ __forceinline__ void | Sort (KeyT(&keys)[ITEMS_PER_THREAD], CompareOp compare_op, int valid_items, KeyT oob_default) |
Sorts items partitioned across a CUDA thread block using a merge sorting method. More... | |
template<typename CompareOp > | |
__device__ __forceinline__ void | Sort (KeyT(&keys)[ITEMS_PER_THREAD], ValueT(&items)[ITEMS_PER_THREAD], CompareOp compare_op) |
Sorts items partitioned across a CUDA thread block using a merge sorting method. More... | |
template<typename CompareOp , bool IS_LAST_TILE = true> | |
__device__ __forceinline__ void | Sort (KeyT(&keys)[ITEMS_PER_THREAD], ValueT(&items)[ITEMS_PER_THREAD], CompareOp compare_op, int valid_items, KeyT oob_default) |
Sorts items partitioned across a CUDA thread block using a merge sorting method. More... | |
template<typename CompareOp > | |
__device__ __forceinline__ void | StableSort (KeyT(&keys)[ITEMS_PER_THREAD], CompareOp compare_op) |
Sorts items partitioned across a CUDA thread block using a merge sorting method. More... | |
template<typename CompareOp > | |
__device__ __forceinline__ void | StableSort (KeyT(&keys)[ITEMS_PER_THREAD], ValueT(&items)[ITEMS_PER_THREAD], CompareOp compare_op) |
Sorts items partitioned across a CUDA thread block using a merge sorting method. More... | |
template<typename CompareOp > | |
__device__ __forceinline__ void | StableSort (KeyT(&keys)[ITEMS_PER_THREAD], CompareOp compare_op, int valid_items, KeyT oob_default) |
Sorts items partitioned across a CUDA thread block using a merge sorting method. More... | |
template<typename CompareOp , bool IS_LAST_TILE = true> | |
__device__ __forceinline__ void | StableSort (KeyT(&keys)[ITEMS_PER_THREAD], ValueT(&items)[ITEMS_PER_THREAD], CompareOp compare_op, int valid_items, KeyT oob_default) |
Sorts items partitioned across a CUDA thread block using a merge sorting method. More... | |
Detailed Description
template<typename KeyT, typename ValueT, int NUM_THREADS, int ITEMS_PER_THREAD, typename SynchronizationPolicy>
class hipcub::BlockMergeSortStrategy< KeyT, ValueT, NUM_THREADS, ITEMS_PER_THREAD, SynchronizationPolicy >
Generalized merge sort algorithm.
This class is used to reduce code duplication. Warp and Block merge sort differ only in how they compute thread index and how they synchronize threads. Since synchronization might require access to custom data (like member mask), CRTP is used.
- The code snippet below illustrates the way this class can be used.
- #include <hipcub/hipcub.hpp> // or equivalently <hipcub/block/block_merge_sort.hpp>constexpr int BLOCK_THREADS = 256;constexpr int ITEMS_PER_THREAD = 9;class BlockMergeSort : public BlockMergeSortStrategy<int,hipcub::NullType,BLOCK_THREADS,ITEMS_PER_THREAD,BlockMergeSort>{using BlockMergeSortStrategyT =BlockMergeSortStrategy<int,hipcub::NullType,BLOCK_THREADS,ITEMS_PER_THREAD,BlockMergeSort>;public:__device__ __forceinline__ explicit BlockMergeSort(typename BlockMergeSortStrategyT::TempStorage &temp_storage): BlockMergeSortStrategyT(temp_storage, threadIdx.x){}__device__ __forceinline__ void SyncImplementation() const{__syncthreads();}};
- Template Parameters
-
KeyT KeyT type ValueT ValueT type. hipcub::NullType indicates a keys-only sort SynchronizationPolicy Provides a way of synchronizing threads. Should be derived from BlockMergeSortStrategy
.
Member Function Documentation
◆ Sort() [1/4]
|
inline |
Sorts items partitioned across a CUDA thread block using a merge sorting method.
- Sort is not guaranteed to be stable. That is, suppose that i and j are equivalent: neither one is less than the other. It is not guaranteed that the relative order of these two elements will be preserved by sort.
- Template Parameters
-
CompareOp functor type having member bool operator()(KeyT lhs, KeyT rhs)
.CompareOp
is a model of Strict Weak Ordering.
- Parameters
-
[in,out] keys Keys to sort [in] compare_op Comparison function object which returns true if the first argument is ordered before the second
◆ Sort() [2/4]
|
inline |
Sorts items partitioned across a CUDA thread block using a merge sorting method.
- Sort is not guaranteed to be stable. That is, suppose that
i
andj
are equivalent: neither one is less than the other. It is not guaranteed that the relative order of these two elements will be preserved by sort. - The value of
oob_default
is assigned to all elements that are out ofvalid_items
boundaries. It's expected thatoob_default
is ordered after any value in thevalid_items
boundaries. The algorithm always sorts a fixed amount of elements, which is equal toITEMS_PER_THREAD * BLOCK_THREADS
. If there is a value that is ordered afteroob_default
, it won't be placed withinvalid_items
boundaries.
- Sort is not guaranteed to be stable. That is, suppose that
- Template Parameters
-
CompareOp functor type having member bool operator()(KeyT lhs, KeyT rhs)
.CompareOp
is a model of Strict Weak Ordering.
- Parameters
-
[in,out] keys Keys to sort [in] compare_op Comparison function object which returns true if the first argument is ordered before the second [in] valid_items Number of valid items to sort [in] oob_default Default value to assign out-of-bound items
◆ Sort() [3/4]
|
inline |
Sorts items partitioned across a CUDA thread block using a merge sorting method.
- Sort is not guaranteed to be stable. That is, suppose that
i
andj
are equivalent: neither one is less than the other. It is not guaranteed that the relative order of these two elements will be preserved by sort.
- Template Parameters
-
CompareOp functor type having member bool operator()(KeyT lhs, KeyT rhs)
.CompareOp
is a model of Strict Weak Ordering.
- Parameters
-
[in,out] keys Keys to sort [in,out] items Values to sort [in] compare_op Comparison function object which returns true if the first argument is ordered before the second
◆ Sort() [4/4]
|
inline |
Sorts items partitioned across a CUDA thread block using a merge sorting method.
- Sort is not guaranteed to be stable. That is, suppose that
i
andj
are equivalent: neither one is less than the other. It is not guaranteed that the relative order of these two elements will be preserved by sort. - The value of
oob_default
is assigned to all elements that are out ofvalid_items
boundaries. It's expected thatoob_default
is ordered after any value in thevalid_items
boundaries. The algorithm always sorts a fixed amount of elements, which is equal toITEMS_PER_THREAD * BLOCK_THREADS
. If there is a value that is ordered afteroob_default
, it won't be placed withinvalid_items
boundaries.
- Sort is not guaranteed to be stable. That is, suppose that
- Template Parameters
-
CompareOp functor type having member bool operator()(KeyT lhs, KeyT rhs)
CompareOp
is a model of Strict Weak Ordering.IS_LAST_TILE True if valid_items
isn't equal to theITEMS_PER_TILE
- Parameters
-
[in,out] keys Keys to sort [in,out] items Values to sort [in] compare_op Comparison function object which returns true if the first argument is ordered before the second [in] valid_items Number of valid items to sort [in] oob_default Default value to assign out-of-bound items
◆ StableSort() [1/4]
|
inline |
Sorts items partitioned across a CUDA thread block using a merge sorting method.
- StableSort is stable: it preserves the relative ordering of equivalent elements. That is, if
x
andy
are elements such thatx
precedesy
, and if the two elements are equivalent (neitherx < y
nory < x
) then a postcondition of StableSort is thatx
still precedesy
.
- Template Parameters
-
CompareOp functor type having member bool operator()(KeyT lhs, KeyT rhs)
.CompareOp
is a model of Strict Weak Ordering.
- Parameters
-
[in,out] keys Keys to sort [in] compare_op Comparison function object which returns true if the first argument is ordered before the second
◆ StableSort() [2/4]
|
inline |
Sorts items partitioned across a CUDA thread block using a merge sorting method.
- StableSort is stable: it preserves the relative ordering of equivalent elements. That is, if
x
andy
are elements such thatx
precedesy
, and if the two elements are equivalent (neitherx < y
nory < x
) then a postcondition of StableSort is thatx
still precedesy
.
- Template Parameters
-
CompareOp functor type having member bool operator()(KeyT lhs, KeyT rhs)
.CompareOp
is a model of Strict Weak Ordering.
- Parameters
-
[in,out] keys Keys to sort [in,out] items Values to sort [in] compare_op Comparison function object which returns true if the first argument is ordered before the second
◆ StableSort() [3/4]
|
inline |
Sorts items partitioned across a CUDA thread block using a merge sorting method.
- StableSort is stable: it preserves the relative ordering of equivalent elements. That is, if
x
andy
are elements such thatx
precedesy
, and if the two elements are equivalent (neitherx < y
nory < x
) then a postcondition of StableSort is thatx
still precedesy
. - The value of
oob_default
is assigned to all elements that are out ofvalid_items
boundaries. It's expected thatoob_default
is ordered after any value in thevalid_items
boundaries. The algorithm always sorts a fixed amount of elements, which is equal toITEMS_PER_THREAD * BLOCK_THREADS
. If there is a value that is ordered afteroob_default
, it won't be placed withinvalid_items
boundaries.
- StableSort is stable: it preserves the relative ordering of equivalent elements. That is, if
- Template Parameters
-
CompareOp functor type having member bool operator()(KeyT lhs, KeyT rhs)
.CompareOp
is a model of Strict Weak Ordering.
- Parameters
-
[in,out] keys Keys to sort [in] compare_op Comparison function object which returns true if the first argument is ordered before the second [in] valid_items Number of valid items to sort [in] oob_default Default value to assign out-of-bound items
◆ StableSort() [4/4]
|
inline |
Sorts items partitioned across a CUDA thread block using a merge sorting method.
- StableSort is stable: it preserves the relative ordering of equivalent elements. That is, if
x
andy
are elements such thatx
precedesy
, and if the two elements are equivalent (neitherx < y
nory < x
) then a postcondition of StableSort is thatx
still precedesy
. - The value of
oob_default
is assigned to all elements that are out ofvalid_items
boundaries. It's expected thatoob_default
is ordered after any value in thevalid_items
boundaries. The algorithm always sorts a fixed amount of elements, which is equal toITEMS_PER_THREAD * BLOCK_THREADS
. If there is a value that is ordered afteroob_default
, it won't be placed withinvalid_items
boundaries.
- StableSort is stable: it preserves the relative ordering of equivalent elements. That is, if
- Template Parameters
-
CompareOp functor type having member bool operator()(KeyT lhs, KeyT rhs)
.CompareOp
is a model of Strict Weak Ordering.IS_LAST_TILE True if valid_items
isn't equal to theITEMS_PER_TILE
- Parameters
-
[in,out] keys Keys to sort [in,out] items Values to sort [in] compare_op Comparison function object which returns true if the first argument is ordered before the second [in] valid_items Number of valid items to sort [in] oob_default Default value to assign out-of-bound items
The documentation for this class was generated from the following file:
- /home/docs/checkouts/readthedocs.org/user_builds/advanced-micro-devices-hipcub/checkouts/docs-5.3.3/hipcub/include/hipcub/backend/rocprim/block/block_merge_sort.hpp