Nth Element#
Configuring the kernel#
-
template<unsigned int BlockSize, unsigned int ItemsPerThread, unsigned int StopRecursionSize, unsigned int NumberOfBuckets, block_radix_rank_algorithm RadixRankAlgorithm>
struct nth_element_config : public rocprim::detail::nth_element_config_params# Configuration of device-level nth_element.
- Template Parameters:
BlockSize – number of threads in a block.
ItemsPerThread – number of items processed by each thread.
StopRecursionSize – the size from where recursion is stopped to do a block sort
NumberOfBuckets – the number of buckets that are used in the algorithm
RadixRankAlgorithm – algorithm for radix rank
nth_element#
-
template<class Config = default_config, class KeysIterator, class BinaryFunction = ::rocprim::less<typename std::iterator_traits<KeysIterator>::value_type>>
inline hipError_t rocprim::nth_element(void *temporary_storage, size_t &storage_size, KeysIterator keys, size_t nth, size_t size, BinaryFunction compare_function = BinaryFunction(), hipStream_t stream = 0, bool debug_synchronous = false)# Rearrange elements smaller than the n-th before and bigger than n-th after the n-th element.
The element at index
n
is set to the element that would be at the n-th position if the input was sorted. Additionally the other elements are rearranged such that for all values ofi
in [keys_output, keys_output + n) and all values ofj
in [keys_output + n, keys_output + size):comp(*i, *j)
is false. Smaller elements than the n-th will be arranged before, and bigger ones after the n-th element.- Overview
The contents of the inputs are not altered by the function.
Returns the required size of
temporary_storage
instorage_size
iftemporary_storage
is a null pointer.Accepts custom compare_functions for nth_element across the device.
Streams in graph capture mode are not supported
- Example
In this example a device-level nth_element is performed where input keys are represented by an array of unsigned integers.
#include <rocprim/rocprim.hpp> // Prepare input and output (declare pointers, allocate device memory etc.) size_t input_size; // e.g., 8 size_t nth; // e.g., 4 unsigned int * keys; // e.g., [ 6, 3, 5, 4, 1, 8, 2, 7 ] size_t temporary_storage_size_bytes; void * temporary_storage_ptr = nullptr; // Get required size of the temporary storage rocprim::nth_element( temporary_storage_ptr, temporary_storage_size_bytes, keys, nth, input_size ); // allocate temporary storage hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes); // perform nth_element rocprim::nth_element( temporary_storage_ptr, temporary_storage_size_bytes, keys, nth, input_size ); // possible keys: [ 1, 3, 4, 2, 5, 8, 7, 6 ]
- Template Parameters:
Config – [optional] configuration of the primitive. It has to be
nth_element_config
.KeysIterator – [inferred] random-access iterator type of the input range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type.
CompareFunction – [inferred] Type of binary function that accepts two arguments of the type
KeysIterator
and returns a value convertible to bool. Default type isrocprim::less<>.
- Parameters:
temporary_storage – [in] pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to
storage_size
and function returns without performing the nth_element rearrangement.storage_size – [inout] reference to a size (in bytes) of
temporary_storage
.keys – [inout] iterator to the input range.
nth – [in] The index of the nth_element in the input range.
size – [in] number of element in the input range.
compare_function – [in] binary operation function object that will be used for comparison. The signature of the function should be equivalent to the following:
bool f(const T &a, const T &b);
. The signature does not need to haveconst &
, but function object must not modify the objects passed to it. The comparator must meet the C++ named requirement Compare. The default value isBinaryFunction()
.stream – [in] [optional] HIP stream object. Default is
0
(default stream).debug_synchronous – [in] [optional] If true, synchronization after every kernel launch is forced in order to check for errors. Default value is
false
.
- Returns:
hipSuccess
(0
) after successful rearrangement; otherwise a HIP runtime error of typehipError_t
.
-
template<class Config = default_config, class KeysInputIterator, class KeysOutputIterator, class BinaryFunction = ::rocprim::less<typename std::iterator_traits<KeysInputIterator>::value_type>>
inline hipError_t rocprim::nth_element(void *temporary_storage, size_t &storage_size, KeysInputIterator keys_input, KeysOutputIterator keys_output, size_t nth, size_t size, BinaryFunction compare_function = BinaryFunction(), hipStream_t stream = 0, bool debug_synchronous = false)# Rearrange elements smaller than the n-th before and bigger than n-th after the n-th element.
The element at index
n
is set to the element that would be at the n-th position if the input was sorted. Additionally the other elements are rearranged such that for all values ofi
in [keys_output, keys_output + n) and all values ofj
in [keys_output + n, keys_output + size):comp(*i, *j)
is false. Smaller elements than the n-th will be arranged before, and bigger ones after the n-th element.- Overview
The contents of the inputs are not altered by the function.
Returns the required size of
temporary_storage
instorage_size
iftemporary_storage
is a null pointer.Accepts custom compare_functions for nth_element across the device.
Streams in graph capture mode are not supported
- Example
In this example a device-level nth_element is performed where input keys are represented by an array of unsigned integers.
#include <rocprim/rocprim.hpp> // Prepare input and output (declare pointers, allocate device memory etc.) size_t input_size; // e.g., 8 size_t nth; // e.g., 4 unsigned int * keys_input; // e.g., [ 6, 3, 5, 4, 1, 8, 2, 7 ] unsigned int * keys_output; // empty array of 8 elements size_t temporary_storage_size_bytes; void * temporary_storage_ptr = nullptr; // Get required size of the temporary storage rocprim::nth_element( temporary_storage_ptr, temporary_storage_size_bytes, keys_input, keys_output, nth, input_size ); // allocate temporary storage hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes); // perform nth_element rocprim::nth_element( temporary_storage_ptr, temporary_storage_size_bytes, keys_input, keys_output, nth, input_size ); // possible keys_output: [ 1, 3, 4, 2, 5, 8, 7, 6 ]
- Template Parameters:
Config – [optional] configuration of the primitive. It has to be
nth_element_config
.KeysInputIterator – [inferred] random-access iterator type of the input range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type.
KeysOutputIterator – [inferred] random-access iterator type of the output range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type.
CompareFunction – [inferred] Type of binary function that accepts two arguments of the type
KeysIterator
and returns a value convertible to bool. Default type isrocprim::less<>.
- Parameters:
temporary_storage – [in] pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to
storage_size
and function returns without performing the nth_element rearrangement.storage_size – [inout] reference to a size (in bytes) of
temporary_storage
.keys_input – [in] iterator to the input range.
keys_output – [out] iterator to the output range. No overlap at all is allowed between
keys_input
andkeys_output
.keys_output
should be able to be written and read from forsize
elements.nth – [in] The index of the nth_element in the input range.
size – [in] number of element in the input range.
compare_function – [in] binary operation function object that will be used for comparison. The signature of the function should be equivalent to the following:
bool f(const T &a, const T &b);
. The signature does not need to haveconst &
, but function object must not modify the objects passed to it. The comparator must meet the C++ named requirement Compare. The default value isBinaryFunction()
.stream – [in] [optional] HIP stream object. Default is
0
(default stream).debug_synchronous – [in] [optional] If true, synchronization after every kernel launch is forced in order to check for errors. Default value is
false
.
- Returns:
hipSuccess
(0
) after successful rearrangement; otherwise a HIP runtime error of typehipError_t
.