BlockMergeSortStrategy< KeyT, ValueT, NUM_THREADS, ITEMS_PER_THREAD, SynchronizationPolicy > Class Template Reference

BlockMergeSortStrategy&lt; KeyT, ValueT, NUM_THREADS, ITEMS_PER_THREAD, SynchronizationPolicy &gt; Class Template Reference#

hipCUB: hipcub::BlockMergeSortStrategy< KeyT, ValueT, NUM_THREADS, ITEMS_PER_THREAD, SynchronizationPolicy > Class Template Reference
hipcub::BlockMergeSortStrategy< KeyT, ValueT, NUM_THREADS, ITEMS_PER_THREAD, SynchronizationPolicy > Class Template Reference

Generalized merge sort algorithm. More...

#include <block_merge_sort.hpp>

Classes

struct  TempStorage
 \smemstorage{BlockMergeSort} More...
 

Public Member Functions

__device__ __forceinline__ BlockMergeSortStrategy (unsigned int linear_tid)
 
__device__ __forceinline__ BlockMergeSortStrategy (TempStorage &temp_storage, unsigned int linear_tid)
 
__device__ __forceinline__ unsigned int get_linear_tid () const
 
template<typename CompareOp >
__device__ __forceinline__ void Sort (KeyT(&keys)[ITEMS_PER_THREAD], CompareOp compare_op)
 Sorts items partitioned across a CUDA thread block using a merge sorting method. More...
 
template<typename CompareOp >
__device__ __forceinline__ void Sort (KeyT(&keys)[ITEMS_PER_THREAD], CompareOp compare_op, int valid_items, KeyT oob_default)
 Sorts items partitioned across a CUDA thread block using a merge sorting method. More...
 
template<typename CompareOp >
__device__ __forceinline__ void Sort (KeyT(&keys)[ITEMS_PER_THREAD], ValueT(&items)[ITEMS_PER_THREAD], CompareOp compare_op)
 Sorts items partitioned across a CUDA thread block using a merge sorting method. More...
 
template<typename CompareOp , bool IS_LAST_TILE = true>
__device__ __forceinline__ void Sort (KeyT(&keys)[ITEMS_PER_THREAD], ValueT(&items)[ITEMS_PER_THREAD], CompareOp compare_op, int valid_items, KeyT oob_default)
 Sorts items partitioned across a CUDA thread block using a merge sorting method. More...
 
template<typename CompareOp >
__device__ __forceinline__ void StableSort (KeyT(&keys)[ITEMS_PER_THREAD], CompareOp compare_op)
 Sorts items partitioned across a CUDA thread block using a merge sorting method. More...
 
template<typename CompareOp >
__device__ __forceinline__ void StableSort (KeyT(&keys)[ITEMS_PER_THREAD], ValueT(&items)[ITEMS_PER_THREAD], CompareOp compare_op)
 Sorts items partitioned across a CUDA thread block using a merge sorting method. More...
 
template<typename CompareOp >
__device__ __forceinline__ void StableSort (KeyT(&keys)[ITEMS_PER_THREAD], CompareOp compare_op, int valid_items, KeyT oob_default)
 Sorts items partitioned across a CUDA thread block using a merge sorting method. More...
 
template<typename CompareOp , bool IS_LAST_TILE = true>
__device__ __forceinline__ void StableSort (KeyT(&keys)[ITEMS_PER_THREAD], ValueT(&items)[ITEMS_PER_THREAD], CompareOp compare_op, int valid_items, KeyT oob_default)
 Sorts items partitioned across a CUDA thread block using a merge sorting method. More...
 

Detailed Description

template<typename KeyT, typename ValueT, int NUM_THREADS, int ITEMS_PER_THREAD, typename SynchronizationPolicy>
class hipcub::BlockMergeSortStrategy< KeyT, ValueT, NUM_THREADS, ITEMS_PER_THREAD, SynchronizationPolicy >

Generalized merge sort algorithm.

This class is used to reduce code duplication. Warp and Block merge sort differ only in how they compute thread index and how they synchronize threads. Since synchronization might require access to custom data (like member mask), CRTP is used.

The code snippet below illustrates the way this class can be used.
#include <hipcub/hipcub.hpp> // or equivalently <hipcub/block/block_merge_sort.hpp>
constexpr int BLOCK_THREADS = 256;
constexpr int ITEMS_PER_THREAD = 9;
class BlockMergeSort : public BlockMergeSortStrategy<int,
hipcub::NullType,
BLOCK_THREADS,
ITEMS_PER_THREAD,
BlockMergeSort>
{
using BlockMergeSortStrategyT =
BlockMergeSortStrategy<int,
hipcub::NullType,
BLOCK_THREADS,
ITEMS_PER_THREAD,
BlockMergeSort>;
public:
__device__ __forceinline__ explicit BlockMergeSort(
typename BlockMergeSortStrategyT::TempStorage &temp_storage)
: BlockMergeSortStrategyT(temp_storage, threadIdx.x)
{}
__device__ __forceinline__ void SyncImplementation() const
{
__syncthreads();
}
};
Template Parameters
KeyTKeyT type
ValueTValueT type. hipcub::NullType indicates a keys-only sort
SynchronizationPolicyProvides a way of synchronizing threads. Should be derived from BlockMergeSortStrategy.

Member Function Documentation

◆ Sort() [1/4]

template<typename KeyT , typename ValueT , int NUM_THREADS, int ITEMS_PER_THREAD, typename SynchronizationPolicy >
template<typename CompareOp >
__device__ __forceinline__ void hipcub::BlockMergeSortStrategy< KeyT, ValueT, NUM_THREADS, ITEMS_PER_THREAD, SynchronizationPolicy >::Sort ( KeyT(&)  keys[ITEMS_PER_THREAD],
CompareOp  compare_op 
)
inline

Sorts items partitioned across a CUDA thread block using a merge sorting method.

Sort is not guaranteed to be stable. That is, suppose that i and j are equivalent: neither one is less than the other. It is not guaranteed that the relative order of these two elements will be preserved by sort.
Template Parameters
CompareOpfunctor type having member bool operator()(KeyT lhs, KeyT rhs). CompareOp is a model of Strict Weak Ordering.
Parameters
[in,out]keysKeys to sort
[in]compare_opComparison function object which returns true if the first argument is ordered before the second

◆ Sort() [2/4]

template<typename KeyT , typename ValueT , int NUM_THREADS, int ITEMS_PER_THREAD, typename SynchronizationPolicy >
template<typename CompareOp >
__device__ __forceinline__ void hipcub::BlockMergeSortStrategy< KeyT, ValueT, NUM_THREADS, ITEMS_PER_THREAD, SynchronizationPolicy >::Sort ( KeyT(&)  keys[ITEMS_PER_THREAD],
CompareOp  compare_op,
int  valid_items,
KeyT  oob_default 
)
inline

Sorts items partitioned across a CUDA thread block using a merge sorting method.

  • Sort is not guaranteed to be stable. That is, suppose that i and j are equivalent: neither one is less than the other. It is not guaranteed that the relative order of these two elements will be preserved by sort.
  • The value of oob_default is assigned to all elements that are out of valid_items boundaries. It's expected that oob_default is ordered after any value in the valid_items boundaries. The algorithm always sorts a fixed amount of elements, which is equal to ITEMS_PER_THREAD * BLOCK_THREADS. If there is a value that is ordered after oob_default, it won't be placed within valid_items boundaries.
Template Parameters
CompareOpfunctor type having member bool operator()(KeyT lhs, KeyT rhs). CompareOp is a model of Strict Weak Ordering.
Parameters
[in,out]keysKeys to sort
[in]compare_opComparison function object which returns true if the first argument is ordered before the second
[in]valid_itemsNumber of valid items to sort
[in]oob_defaultDefault value to assign out-of-bound items

◆ Sort() [3/4]

template<typename KeyT , typename ValueT , int NUM_THREADS, int ITEMS_PER_THREAD, typename SynchronizationPolicy >
template<typename CompareOp >
__device__ __forceinline__ void hipcub::BlockMergeSortStrategy< KeyT, ValueT, NUM_THREADS, ITEMS_PER_THREAD, SynchronizationPolicy >::Sort ( KeyT(&)  keys[ITEMS_PER_THREAD],
ValueT(&)  items[ITEMS_PER_THREAD],
CompareOp  compare_op 
)
inline

Sorts items partitioned across a CUDA thread block using a merge sorting method.

Sort is not guaranteed to be stable. That is, suppose that i and j are equivalent: neither one is less than the other. It is not guaranteed that the relative order of these two elements will be preserved by sort.
Template Parameters
CompareOpfunctor type having member bool operator()(KeyT lhs, KeyT rhs). CompareOp is a model of Strict Weak Ordering.
Parameters
[in,out]keysKeys to sort
[in,out]itemsValues to sort
[in]compare_opComparison function object which returns true if the first argument is ordered before the second

◆ Sort() [4/4]

template<typename KeyT , typename ValueT , int NUM_THREADS, int ITEMS_PER_THREAD, typename SynchronizationPolicy >
template<typename CompareOp , bool IS_LAST_TILE = true>
__device__ __forceinline__ void hipcub::BlockMergeSortStrategy< KeyT, ValueT, NUM_THREADS, ITEMS_PER_THREAD, SynchronizationPolicy >::Sort ( KeyT(&)  keys[ITEMS_PER_THREAD],
ValueT(&)  items[ITEMS_PER_THREAD],
CompareOp  compare_op,
int  valid_items,
KeyT  oob_default 
)
inline

Sorts items partitioned across a CUDA thread block using a merge sorting method.

  • Sort is not guaranteed to be stable. That is, suppose that i and j are equivalent: neither one is less than the other. It is not guaranteed that the relative order of these two elements will be preserved by sort.
  • The value of oob_default is assigned to all elements that are out of valid_items boundaries. It's expected that oob_default is ordered after any value in the valid_items boundaries. The algorithm always sorts a fixed amount of elements, which is equal to ITEMS_PER_THREAD * BLOCK_THREADS. If there is a value that is ordered after oob_default, it won't be placed within valid_items boundaries.
Template Parameters
CompareOpfunctor type having member bool operator()(KeyT lhs, KeyT rhs) CompareOp is a model of Strict Weak Ordering.
IS_LAST_TILETrue if valid_items isn't equal to the ITEMS_PER_TILE
Parameters
[in,out]keysKeys to sort
[in,out]itemsValues to sort
[in]compare_opComparison function object which returns true if the first argument is ordered before the second
[in]valid_itemsNumber of valid items to sort
[in]oob_defaultDefault value to assign out-of-bound items

◆ StableSort() [1/4]

template<typename KeyT , typename ValueT , int NUM_THREADS, int ITEMS_PER_THREAD, typename SynchronizationPolicy >
template<typename CompareOp >
__device__ __forceinline__ void hipcub::BlockMergeSortStrategy< KeyT, ValueT, NUM_THREADS, ITEMS_PER_THREAD, SynchronizationPolicy >::StableSort ( KeyT(&)  keys[ITEMS_PER_THREAD],
CompareOp  compare_op 
)
inline

Sorts items partitioned across a CUDA thread block using a merge sorting method.

StableSort is stable: it preserves the relative ordering of equivalent elements. That is, if x and y are elements such that x precedes y, and if the two elements are equivalent (neither x < y nor y < x) then a postcondition of StableSort is that x still precedes y.
Template Parameters
CompareOpfunctor type having member bool operator()(KeyT lhs, KeyT rhs). CompareOp is a model of Strict Weak Ordering.
Parameters
[in,out]keysKeys to sort
[in]compare_opComparison function object which returns true if the first argument is ordered before the second

◆ StableSort() [2/4]

template<typename KeyT , typename ValueT , int NUM_THREADS, int ITEMS_PER_THREAD, typename SynchronizationPolicy >
template<typename CompareOp >
__device__ __forceinline__ void hipcub::BlockMergeSortStrategy< KeyT, ValueT, NUM_THREADS, ITEMS_PER_THREAD, SynchronizationPolicy >::StableSort ( KeyT(&)  keys[ITEMS_PER_THREAD],
ValueT(&)  items[ITEMS_PER_THREAD],
CompareOp  compare_op 
)
inline

Sorts items partitioned across a CUDA thread block using a merge sorting method.

StableSort is stable: it preserves the relative ordering of equivalent elements. That is, if x and y are elements such that x precedes y, and if the two elements are equivalent (neither x < y nor y < x) then a postcondition of StableSort is that x still precedes y.
Template Parameters
CompareOpfunctor type having member bool operator()(KeyT lhs, KeyT rhs). CompareOp is a model of Strict Weak Ordering.
Parameters
[in,out]keysKeys to sort
[in,out]itemsValues to sort
[in]compare_opComparison function object which returns true if the first argument is ordered before the second

◆ StableSort() [3/4]

template<typename KeyT , typename ValueT , int NUM_THREADS, int ITEMS_PER_THREAD, typename SynchronizationPolicy >
template<typename CompareOp >
__device__ __forceinline__ void hipcub::BlockMergeSortStrategy< KeyT, ValueT, NUM_THREADS, ITEMS_PER_THREAD, SynchronizationPolicy >::StableSort ( KeyT(&)  keys[ITEMS_PER_THREAD],
CompareOp  compare_op,
int  valid_items,
KeyT  oob_default 
)
inline

Sorts items partitioned across a CUDA thread block using a merge sorting method.

  • StableSort is stable: it preserves the relative ordering of equivalent elements. That is, if x and y are elements such that x precedes y, and if the two elements are equivalent (neither x < y nor y < x) then a postcondition of StableSort is that x still precedes y.
  • The value of oob_default is assigned to all elements that are out of valid_items boundaries. It's expected that oob_default is ordered after any value in the valid_items boundaries. The algorithm always sorts a fixed amount of elements, which is equal to ITEMS_PER_THREAD * BLOCK_THREADS. If there is a value that is ordered after oob_default, it won't be placed within valid_items boundaries.
Template Parameters
CompareOpfunctor type having member bool operator()(KeyT lhs, KeyT rhs). CompareOp is a model of Strict Weak Ordering.
Parameters
[in,out]keysKeys to sort
[in]compare_opComparison function object which returns true if the first argument is ordered before the second
[in]valid_itemsNumber of valid items to sort
[in]oob_defaultDefault value to assign out-of-bound items

◆ StableSort() [4/4]

template<typename KeyT , typename ValueT , int NUM_THREADS, int ITEMS_PER_THREAD, typename SynchronizationPolicy >
template<typename CompareOp , bool IS_LAST_TILE = true>
__device__ __forceinline__ void hipcub::BlockMergeSortStrategy< KeyT, ValueT, NUM_THREADS, ITEMS_PER_THREAD, SynchronizationPolicy >::StableSort ( KeyT(&)  keys[ITEMS_PER_THREAD],
ValueT(&)  items[ITEMS_PER_THREAD],
CompareOp  compare_op,
int  valid_items,
KeyT  oob_default 
)
inline

Sorts items partitioned across a CUDA thread block using a merge sorting method.

  • StableSort is stable: it preserves the relative ordering of equivalent elements. That is, if x and y are elements such that x precedes y, and if the two elements are equivalent (neither x < y nor y < x) then a postcondition of StableSort is that x still precedes y.
  • The value of oob_default is assigned to all elements that are out of valid_items boundaries. It's expected that oob_default is ordered after any value in the valid_items boundaries. The algorithm always sorts a fixed amount of elements, which is equal to ITEMS_PER_THREAD * BLOCK_THREADS. If there is a value that is ordered after oob_default, it won't be placed within valid_items boundaries.
Template Parameters
CompareOpfunctor type having member bool operator()(KeyT lhs, KeyT rhs). CompareOp is a model of Strict Weak Ordering.
IS_LAST_TILETrue if valid_items isn't equal to the ITEMS_PER_TILE
Parameters
[in,out]keysKeys to sort
[in,out]itemsValues to sort
[in]compare_opComparison function object which returns true if the first argument is ordered before the second
[in]valid_itemsNumber of valid items to sort
[in]oob_defaultDefault value to assign out-of-bound items

The documentation for this class was generated from the following file:
  • /home/docs/checkouts/readthedocs.org/user_builds/advanced-micro-devices-hipcub/checkouts/docs-5.4.3/hipcub/include/hipcub/backend/rocprim/block/block_merge_sort.hpp