Partial Sort#
Configuring the kernel#
-
template<class NthElementConfig, class MergeSortConfig = default_config>
struct partial_sort_config# Configuration of device-level partial sort.
- Template Parameters:
NthElementConfig – - configuration of device-level nth element operation. Must be
nth_element_config
ordefault_config
.MergeSortConfig – - configuration of device-level merge sort operation. Must be
merge_sort_config
ordefault_config
.
partial_sort#
-
template<class Config = default_config, class KeysIterator, class BinaryFunction = ::rocprim::less<typename std::iterator_traits<KeysIterator>::value_type>>
hipError_t rocprim::partial_sort(void *temporary_storage, size_t &storage_size, KeysIterator keys, size_t middle, size_t size, BinaryFunction compare_function = BinaryFunction(), hipStream_t stream = 0, bool debug_synchronous = false)# Rearranges elements such that the range [0, middle) contains the sorted middle smallest elements in the range [0, size).
- Overview
The contents of the inputs are not altered by the function.
Returns the required size of
temporary_storage
instorage_size
iftemporary_storage
is a null pointer.Accepts custom compare_functions for partial_sort across the device.
Streams in graph capture mode are not supported
- Stability
partial_sort
is not stable: it doesn’t necessarily preserve the relative ordering of equivalent keys. That is, given two keysa
andb
and a binary boolean operationop
such that:a
precedesb
in the input keys, andop(a, b) and op(b, a) are both false, then it is not guaranteed that
a
will precedeb
as well in the output (ordered) keys.
- Example
In this example a device-level partial_sort is performed where input keys are represented by an array of unsigned integers.
#include <rocprim/rocprim.hpp> // Prepare input and output (declare pointers, allocate device memory etc.) size_t input_size; // e.g., 8 size_t middle; // e.g., 4 unsigned int * keys; // e.g., [ 6, 3, 5, 4, 1, 8, 2, 7 ] size_t temporary_storage_size_bytes; void * temporary_storage_ptr = nullptr; // Get required size of the temporary storage rocprim::partial_sort( temporary_storage_ptr, temporary_storage_size_bytes, keys, middle, input_size ); // allocate temporary storage hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes); // perform partial_sort rocprim::partial_sort( temporary_storage_ptr, temporary_storage_size_bytes, keys, middle, input_size ); // possible keys: [ 1, 2, 3, 4, 5, 8, 7, 6 ]
- Template Parameters:
Config – [optional] configuration of the primitive. It has to be
partial_sort_config
.KeysIterator – [inferred] random-access iterator type of the input range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type.
CompareFunction – [inferred] Type of binary function that accepts two arguments of the type
KeysIterator
and returns a value convertible to bool. Default type isrocprim::less<>.
- Parameters:
temporary_storage – [in] pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to
storage_size
and function returns without performing the partial_sort rearrangement.storage_size – [inout] reference to a size (in bytes) of
temporary_storage
.keys – [inout] iterator to the input range.
middle – [in] The index of the point till where it is sorted in the input range.
size – [in] number of element in the input range.
compare_function – [in] binary operation function object that will be used for comparison. The signature of the function should be equivalent to the following:
bool f(const T &a, const T &b);
. The signature does not need to haveconst &
, but function object must not modify the objects passed to it. The comparator must meet the C++ named requirement Compare. The default value isBinaryFunction()
.stream – [in] [optional] HIP stream object. Default is
0
(default stream).debug_synchronous – [in] [optional] If true, synchronization after every kernel launch is forced in order to check for errors. Default value is
false
.
- Returns:
hipSuccess
(0
) after successful rearrangement; otherwise a HIP runtime error of typehipError_t
.
-
template<class Config = default_config, class KeysInputIterator, class KeysOutputIterator, class BinaryFunction = ::rocprim::less<typename std::iterator_traits<KeysInputIterator>::value_type>>
hipError_t rocprim::partial_sort_copy(void *temporary_storage, size_t &storage_size, KeysInputIterator keys_input, KeysOutputIterator keys_output, size_t middle, size_t size, BinaryFunction compare_function = BinaryFunction(), hipStream_t stream = 0, bool debug_synchronous = false)# Rearranges elements such that the range [0, middle) contains the sorted middle smallest elements in the range [0, size).
- Overview
The contents of the inputs are not altered by the function.
Returns the required size of
temporary_storage
instorage_size
iftemporary_storage
is a null pointer.Accepts custom compare_functions for partial_sort_copy across the device.
Streams in graph capture mode are not supported
- Stability
partial_sort_copy
is not stable: it doesn’t necessarily preserve the relative ordering of equivalent keys. That is, given two keysa
andb
and a binary boolean operationop
such that:a
precedesb
in the input keys, andop(a, b) and op(b, a) are both false, then it is not guaranteed that
a
will precedeb
as well in the output (ordered) keys.
- Example
In this example a device-level partial_sort_copy is performed where input keys are represented by an array of unsigned integers.
#include <rocprim/rocprim.hpp> // Prepare input and output (declare pointers, allocate device memory etc.) size_t input_size; // e.g., 8 size_t middle; // e.g., 4 unsigned int * keys_input; // e.g., [ 6, 3, 5, 4, 1, 8, 2, 7 ] unsigned int * keys_output; // e.g., [ 9, 9, 9, 9, 9, 9, 9, 9 ] size_t temporary_storage_size_bytes; void * temporary_storage_ptr = nullptr; // Get required size of the temporary storage rocprim::partial_sort_copy( temporary_storage_ptr, temporary_storage_size_bytes, keys_input, keys_output, middle, input_size ); // allocate temporary storage hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes); // perform partial_sort rocprim::partial_sort_copy( temporary_storage_ptr, temporary_storage_size_bytes, keys_input, keys_output, middle, input_size ); // possible keys_output: [ 1, 2, 3, 4, 5, 9, 9, 9 ]
- Template Parameters:
Config – [optional] configuration of the primitive. It has to be
partial_sort_config
.KeysInputIterator – [inferred] random-access iterator type of the input range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type.
KeysOutputIterator – [inferred] random-access iterator type of the output range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type.
CompareFunction – [inferred] Type of binary function that accepts two arguments of the type
KeysIterator
and returns a value convertible to bool. Default type isrocprim::less<>.
- Parameters:
temporary_storage – [in] pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to
storage_size
and function returns without performing the partial_sort_copy rearrangement.storage_size – [inout] reference to a size (in bytes) of
temporary_storage
.keys_input – [in] iterator to the input range.
keys_output – [out] iterator to the output range. No overlap at all is allowed between
keys_input
andkeys_output
.keys_output
should be able to be write to at leastmiddle
+ 1 elements.middle – [in] The index of the point till where it is sorted in the input range.
size – [in] number of element in the input range.
compare_function – [in] binary operation function object that will be used for comparison. The signature of the function should be equivalent to the following:
bool f(const T &a, const T &b);
. The signature does not need to haveconst &
, but function object must not modify the objects passed to it. The comparator must meet the C++ named requirement Compare. The default value isBinaryFunction()
.stream – [in] [optional] HIP stream object. Default is
0
(default stream).debug_synchronous – [in] [optional] If true, synchronization after every kernel launch is forced in order to check for errors. Default value is
false
.
- Returns:
hipSuccess
(0
) after successful rearrangement; otherwise a HIP runtime error of typehipError_t
.