BlockMergeSort< KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, BLOCK_DIM_Y, BLOCK_DIM_Z > Class Template Reference#
The BlockMergeSort class provides methods for sorting items partitioned across a CUDA thread block using a merge sorting method. More...
#include <block_merge_sort.hpp>
Public Member Functions | |
| __device__ __forceinline__ | BlockMergeSort (typename BlockMergeSortStrategyT::TempStorage &temp_storage) |
| __device__ __forceinline__ unsigned int | get_linear_tid () const |
| __device__ __forceinline__ void | Sort (KeyT(&keys)[ITEMS_PER_THREAD], CompareOp compare_op) |
| Sorts items partitioned across a CUDA thread block using a merge sorting method. More... | |
| __device__ __forceinline__ void | Sort (KeyT(&keys)[ITEMS_PER_THREAD], CompareOp compare_op, int valid_items, KeyT oob_default) |
| Sorts items partitioned across a CUDA thread block using a merge sorting method. More... | |
| __device__ __forceinline__ void | Sort (KeyT(&keys)[ITEMS_PER_THREAD], NullType(&items)[ITEMS_PER_THREAD], CompareOp compare_op) |
| Sorts items partitioned across a CUDA thread block using a merge sorting method. More... | |
| __device__ __forceinline__ void | Sort (KeyT(&keys)[ITEMS_PER_THREAD], NullType(&items)[ITEMS_PER_THREAD], CompareOp compare_op, int valid_items, KeyT oob_default) |
| Sorts items partitioned across a CUDA thread block using a merge sorting method. More... | |
| __device__ __forceinline__ void | StableSort (KeyT(&keys)[ITEMS_PER_THREAD], CompareOp compare_op) |
| Sorts items partitioned across a CUDA thread block using a merge sorting method. More... | |
| __device__ __forceinline__ void | StableSort (KeyT(&keys)[ITEMS_PER_THREAD], NullType(&items)[ITEMS_PER_THREAD], CompareOp compare_op) |
| Sorts items partitioned across a CUDA thread block using a merge sorting method. More... | |
| __device__ __forceinline__ void | StableSort (KeyT(&keys)[ITEMS_PER_THREAD], CompareOp compare_op, int valid_items, KeyT oob_default) |
| Sorts items partitioned across a CUDA thread block using a merge sorting method. More... | |
| __device__ __forceinline__ void | StableSort (KeyT(&keys)[ITEMS_PER_THREAD], NullType(&items)[ITEMS_PER_THREAD], CompareOp compare_op, int valid_items, KeyT oob_default) |
| Sorts items partitioned across a CUDA thread block using a merge sorting method. More... | |
Detailed Description
template<typename KeyT, int BLOCK_DIM_X, int ITEMS_PER_THREAD, typename ValueT = NullType, int BLOCK_DIM_Y = 1, int BLOCK_DIM_Z = 1>
class hipcub::BlockMergeSort< KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, BLOCK_DIM_Y, BLOCK_DIM_Z >
The BlockMergeSort class provides methods for sorting items partitioned across a CUDA thread block using a merge sorting method.
- Template Parameters
-
KeyT KeyT type BLOCK_DIM_X The thread block length in threads along the X dimension ITEMS_PER_THREAD The number of items per thread ValueT [optional] ValueT type (default: hipcub::NullType, which indicates a keys-only sort)BLOCK_DIM_Y [optional] The thread block length in threads along the Y dimension (default: 1) BLOCK_DIM_Z [optional] The thread block length in threads along the Z dimension (default: 1)
- Overview
- BlockMergeSort arranges items into ascending order using a comparison functor with less-than semantics. Merge sort can handle arbitrary types and comparison functors, but is slower than BlockRadixSort when sorting arithmetic types into ascending/descending order.
- A Simple Example
- @blockcollective{BlockMergeSort}
- The code snippet below illustrates a sort of 512 integer keys that are partitioned across 128 threads * where each thread owns 4 consecutive items.
- #include <hipcub/hipcub.hpp> // or equivalently <hipcub/block/block_merge_sort.hpp>struct CustomLess{template <typename DataType>__device__ bool operator()(const DataType &lhs, const DataType &rhs){return lhs < rhs;}};__global__ void ExampleKernel(...){// Specialize BlockMergeSort for a 1D block of 128 threads owning 4 integer items eachtypedef hipcub::BlockMergeSort<int, 128, 4> BlockMergeSort;// Allocate shared memory for BlockMergeSort__shared__ typename BlockMergeSort::TempStorage temp_storage_shuffle;// Obtain a segment of consecutive items that are blocked across threadsint thread_keys[4];...BlockMergeSort(temp_storage_shuffle).Sort(thread_keys, CustomLess());...}The BlockMergeSort class provides methods for sorting items partitioned across a CUDA thread block us...Definition: block_merge_sort.hpp:771
- Suppose the set of input
thread_keysacross the block of threads is{ [0,511,1,510], [2,509,3,508], [4,507,5,506], ..., [254,257,255,256] }. The corresponding outputthread_keysin those threads will be{ [0,1,2,3], [4,5,6,7], [8,9,10,11], ..., [508,509,510,511] }.
- Re-using dynamically allocating shared memory
- The following example under the examples/block folder illustrates usage of dynamically shared memory with BlockReduce and how to re-purpose the same memory region: example_block_reduce_dyn_smem.cu
This example can be easily adapted to the storage required by BlockMergeSort.
Member Function Documentation
◆ Sort() [1/4]
|
inlineinherited |
Sorts items partitioned across a CUDA thread block using a merge sorting method.
- Sort is not guaranteed to be stable. That is, suppose that i and j are equivalent: neither one is less than the other. It is not guaranteed that the relative order of these two elements will be preserved by sort.
- Template Parameters
-
CompareOp functor type having member bool operator()(KeyT lhs, KeyT rhs).CompareOpis a model of Strict Weak Ordering.
- Parameters
-
[in,out] keys Keys to sort [in] compare_op Comparison function object which returns true if the first argument is ordered before the second
◆ Sort() [2/4]
|
inlineinherited |
Sorts items partitioned across a CUDA thread block using a merge sorting method.
- Sort is not guaranteed to be stable. That is, suppose that
iandjare equivalent: neither one is less than the other. It is not guaranteed that the relative order of these two elements will be preserved by sort. - The value of
oob_defaultis assigned to all elements that are out ofvalid_itemsboundaries. It's expected thatoob_defaultis ordered after any value in thevalid_itemsboundaries. The algorithm always sorts a fixed amount of elements, which is equal toITEMS_PER_THREAD * BLOCK_THREADS. If there is a value that is ordered afteroob_default, it won't be placed withinvalid_itemsboundaries.
- Sort is not guaranteed to be stable. That is, suppose that
- Template Parameters
-
CompareOp functor type having member bool operator()(KeyT lhs, KeyT rhs).CompareOpis a model of Strict Weak Ordering.
- Parameters
-
[in,out] keys Keys to sort [in] compare_op Comparison function object which returns true if the first argument is ordered before the second [in] valid_items Number of valid items to sort [in] oob_default Default value to assign out-of-bound items
◆ Sort() [3/4]
|
inlineinherited |
Sorts items partitioned across a CUDA thread block using a merge sorting method.
- Sort is not guaranteed to be stable. That is, suppose that
iandjare equivalent: neither one is less than the other. It is not guaranteed that the relative order of these two elements will be preserved by sort.
- Template Parameters
-
CompareOp functor type having member bool operator()(KeyT lhs, KeyT rhs).CompareOpis a model of Strict Weak Ordering.
- Parameters
-
[in,out] keys Keys to sort [in,out] items Values to sort [in] compare_op Comparison function object which returns true if the first argument is ordered before the second
◆ Sort() [4/4]
|
inlineinherited |
Sorts items partitioned across a CUDA thread block using a merge sorting method.
- Sort is not guaranteed to be stable. That is, suppose that
iandjare equivalent: neither one is less than the other. It is not guaranteed that the relative order of these two elements will be preserved by sort. - The value of
oob_defaultis assigned to all elements that are out ofvalid_itemsboundaries. It's expected thatoob_defaultis ordered after any value in thevalid_itemsboundaries. The algorithm always sorts a fixed amount of elements, which is equal toITEMS_PER_THREAD * BLOCK_THREADS. If there is a value that is ordered afteroob_default, it won't be placed withinvalid_itemsboundaries.
- Sort is not guaranteed to be stable. That is, suppose that
- Template Parameters
-
CompareOp functor type having member bool operator()(KeyT lhs, KeyT rhs)CompareOpis a model of Strict Weak Ordering.IS_LAST_TILE True if valid_itemsisn't equal to theITEMS_PER_TILE
- Parameters
-
[in,out] keys Keys to sort [in,out] items Values to sort [in] compare_op Comparison function object which returns true if the first argument is ordered before the second [in] valid_items Number of valid items to sort [in] oob_default Default value to assign out-of-bound items
◆ StableSort() [1/4]
|
inlineinherited |
Sorts items partitioned across a CUDA thread block using a merge sorting method.
- StableSort is stable: it preserves the relative ordering of equivalent elements. That is, if
xandyare elements such thatxprecedesy, and if the two elements are equivalent (neitherx < ynory < x) then a postcondition of StableSort is thatxstill precedesy.
- Template Parameters
-
CompareOp functor type having member bool operator()(KeyT lhs, KeyT rhs).CompareOpis a model of Strict Weak Ordering.
- Parameters
-
[in,out] keys Keys to sort [in] compare_op Comparison function object which returns true if the first argument is ordered before the second
◆ StableSort() [2/4]
|
inlineinherited |
Sorts items partitioned across a CUDA thread block using a merge sorting method.
- StableSort is stable: it preserves the relative ordering of equivalent elements. That is, if
xandyare elements such thatxprecedesy, and if the two elements are equivalent (neitherx < ynory < x) then a postcondition of StableSort is thatxstill precedesy.
- Template Parameters
-
CompareOp functor type having member bool operator()(KeyT lhs, KeyT rhs).CompareOpis a model of Strict Weak Ordering.
- Parameters
-
[in,out] keys Keys to sort [in,out] items Values to sort [in] compare_op Comparison function object which returns true if the first argument is ordered before the second
◆ StableSort() [3/4]
|
inlineinherited |
Sorts items partitioned across a CUDA thread block using a merge sorting method.
- StableSort is stable: it preserves the relative ordering of equivalent elements. That is, if
xandyare elements such thatxprecedesy, and if the two elements are equivalent (neitherx < ynory < x) then a postcondition of StableSort is thatxstill precedesy. - The value of
oob_defaultis assigned to all elements that are out ofvalid_itemsboundaries. It's expected thatoob_defaultis ordered after any value in thevalid_itemsboundaries. The algorithm always sorts a fixed amount of elements, which is equal toITEMS_PER_THREAD * BLOCK_THREADS. If there is a value that is ordered afteroob_default, it won't be placed withinvalid_itemsboundaries.
- StableSort is stable: it preserves the relative ordering of equivalent elements. That is, if
- Template Parameters
-
CompareOp functor type having member bool operator()(KeyT lhs, KeyT rhs).CompareOpis a model of Strict Weak Ordering.
- Parameters
-
[in,out] keys Keys to sort [in] compare_op Comparison function object which returns true if the first argument is ordered before the second [in] valid_items Number of valid items to sort [in] oob_default Default value to assign out-of-bound items
◆ StableSort() [4/4]
|
inlineinherited |
Sorts items partitioned across a CUDA thread block using a merge sorting method.
- StableSort is stable: it preserves the relative ordering of equivalent elements. That is, if
xandyare elements such thatxprecedesy, and if the two elements are equivalent (neitherx < ynory < x) then a postcondition of StableSort is thatxstill precedesy. - The value of
oob_defaultis assigned to all elements that are out ofvalid_itemsboundaries. It's expected thatoob_defaultis ordered after any value in thevalid_itemsboundaries. The algorithm always sorts a fixed amount of elements, which is equal toITEMS_PER_THREAD * BLOCK_THREADS. If there is a value that is ordered afteroob_default, it won't be placed withinvalid_itemsboundaries.
- StableSort is stable: it preserves the relative ordering of equivalent elements. That is, if
- Template Parameters
-
CompareOp functor type having member bool operator()(KeyT lhs, KeyT rhs).CompareOpis a model of Strict Weak Ordering.IS_LAST_TILE True if valid_itemsisn't equal to theITEMS_PER_TILE
- Parameters
-
[in,out] keys Keys to sort [in,out] items Values to sort [in] compare_op Comparison function object which returns true if the first argument is ordered before the second [in] valid_items Number of valid items to sort [in] oob_default Default value to assign out-of-bound items
The documentation for this class was generated from the following file:
- /home/docs/checkouts/readthedocs.org/user_builds/advanced-micro-devices-hipcub/checkouts/docs-5.5.0/hipcub/include/hipcub/backend/rocprim/block/block_merge_sort.hpp