Partition#
partition#
-
template<class Config = default_config, class InputIterator, class OutputIterator, class SelectedCountOutputIterator, class UnaryPredicate, bool UsingOrderedBlockId = false>
inline hipError_t rocprim::partition(void *temporary_storage, size_t &storage_size, InputIterator input, OutputIterator output, SelectedCountOutputIterator selected_count_output, const size_t size, UnaryPredicate predicate, const hipStream_t stream = 0, const bool debug_synchronous = false)# Parallel select primitive for device level using selection predicate.
Performs a device-wide partition using selection predicate. Partition copies the values from
inputtooutputin such a way that all values for which thepredicatereturnstrueprecede the elements for which it returnsfalse.- Overview
Returns the required size of
temporary_storageinstorage_sizeiftemporary_storagein a null pointer.Range specified by
selected_count_outputmust have at least 1 element.Relative order is preserved for the elements for which the
predicatereturnstrue. Other elements are copied in reverse order.
- Example
In this example a device-level partition operation is performed on an array of integer values, even values are copied before odd values.
#include <rocprim/rocprim.hpp> auto predicate = [] (int a) -> bool { return (a%2) == 0; }; // Prepare input and output (declare pointers, allocate device memory etc.) size_t input_size; // e.g., 8 int * input; // e.g., [1, 2, 3, 4, 5, 6, 7, 8] int * output; // empty array of 8 elements size_t * output_count; // empty array of 1 element size_t temporary_storage_size_bytes; void * temporary_storage_ptr = nullptr; // Get required size of the temporary storage rocprim::partition( temporary_storage_ptr, temporary_storage_size_bytes, input, output, output_count, input_size, predicate ); // allocate temporary storage hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes); // perform partition rocprim::partition( temporary_storage_ptr, temporary_storage_size_bytes, input, output, output_count, input_size, predicate ); // output: [2, 4, 6, 8, 7, 5, 3, 1] // output_count: 4
- Template Parameters:
Config – [optional] Configuration of the primitive, must be
default_configorselect_config.InputIterator – random-access iterator type of the input range. It can be a simple pointer type.
OutputIterator – random-access iterator type of the output range. It can be a simple pointer type.
SelectedCountOutputIterator – random-access iterator type of the selected_count_output value. It can be a simple pointer type.
UnaryPredicate – type of a unary selection predicate.
UsingOrderedBlockId – If true, uses an atomic counter to assign block id instead of natural blockIdx-based ordering. Can increase performance on MI3xx architectures when using streams. The default is false.
- Parameters:
temporary_storage – [in] pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to
storage_sizeand function returns without performing the select operation.storage_size – [inout] reference to a size (in bytes) of
temporary_storage.input – [in] iterator to the first element in the range to select values from.
output – [out] iterator to the first element in the output range.
selected_count_output – [out] iterator to the total number of selected values (length of
output).size – [in] number of elements in the input range.
predicate – [in] unary function object which returns
trueif the element should be ordered before other elements. The signature of the function should be equivalent to the following:bool f(const T &a);. The signature does not need to haveconst &, but function object must not modify the object passed to it.stream – [in] [optional] HIP stream object. The default is
0(default stream).debug_synchronous – [in] [optional] If true, synchronization after every kernel launch is forced in order to check for errors. The default value is
false.
partition_two_way#
-
template<class Config = default_config, class InputIterator, class SelectedOutputIterator, class RejectedOutputIterator, class SelectedCountOutputIterator, class Predicate, bool UsingOrderedBlockId = false>
inline hipError_t rocprim::partition_two_way(void *temporary_storage, size_t &storage_size, InputIterator input, SelectedOutputIterator output_selected, RejectedOutputIterator output_rejected, SelectedCountOutputIterator selected_count_output, const size_t size, Predicate predicate, const hipStream_t stream = 0, const bool debug_synchronous = false)# Two-way parallel select primitive for device level using selection predicate.
Performs a device-wide partition using selection predicate. Partition copies the values from
inputtooutput_selectedandoutput_rejectedfor all values for which thepredicatereturnstrueandfalserespectively.- Overview
Returns the required size of
temporary_storageinstorage_sizeiftemporary_storagein a null pointer.The number of elements written to
output_selectedis equal to the number elements in the input for whichpredicatereturnstrue.The number of elements written to
output_rejectedis equal to the number elements in the input for whichpredicatereturnsfalse.Range specified by
selected_count_outputmust have at least 1 element.Relative order is preserved.
- Example
In this example a device-level two-way partition operation is performed on an array of integer values, even values are copied into the selected output and odd values are copied into rejected output.
The full example is on GitHub.
#include <rocprim/rocprim.hpp> auto predicate = [] (int a) -> bool { return (a%2) == 0; }; // Prepare input and output (declare pointers, allocate device memory etc.) size_t input_size; // e.g., 8 int * input; // e.g., [1, 2, 3, 4, 5, 6, 7, 8] int * selected_output; // empty array of at least 4 elements int * rejected_output; // empty array of at least 4 elements size_t * output_count; // empty array of 1 element size_t temporary_storage_size_bytes; void * temporary_storage_ptr = nullptr; // Get required size of the temporary storage rocprim::partition_two_way( temporary_storage_ptr, temporary_storage_size_bytes, input, selected_output, rejected_output, selected_output_count, input_size, predicate ); // allocate temporary storage hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes); // perform partition rocprim::partition_two_way( temporary_storage_ptr, temporary_storage_size_bytes, input, selected_output, rejected_output, selected_output_count, input_size, predicate ); // output_selected: [2, 4, 6, 8] // output_rejected: [1, 3, 5, 7] // selected_output_count: 4
- Template Parameters:
Config – [optional] Configuration of the primitive, must be
default_configorselect_config.InputIterator – random-access iterator type of the input range. It can be a simple pointer type.
SelectedOutputIterator – random-access iterator type of the selected output range. It can be a simple pointer type.
RejectedOutputIterator – random-access iterator type of the rejected output range. It can be a simple pointer type.
SelectedCountOutputIterator – random-access iterator type of the selected_count_output value. It can be a simple pointer type.
Predicate – type of the selection predicate.
UsingOrderedBlockId – If true, uses an atomic counter to assign block id instead of natural blockIdx-based ordering. Can increase performance on MI3xx architectures when using streams. The default is false.
- Parameters:
temporary_storage – [in] pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to
storage_sizeand function returns without performing the select operation.storage_size – [inout] reference to a size (in bytes) of
temporary_storage.input – [in] iterator to the first element in the range to select values from.
output_selected – [out] iterator to the first element in the selected output range.
output_rejected – [out] iterator to the first element in the rejected output range.
selected_count_output – [out] iterator to the total number of selected values (length of
output_selected).size – [in] number of elements in the input range.
predicate – [in] the unary selection predicate to select values into the select and reject outputs.
stream – [in] [optional] HIP stream object. The default is
0(default stream).debug_synchronous – [in] [optional] If true, synchronization after every kernel launch is forced in order to check for errors. The default value is
false.
partition_three_way#
-
template<class Config = default_config, typename InputIterator, typename FirstOutputIterator, typename SecondOutputIterator, typename UnselectedOutputIterator, typename SelectedCountOutputIterator, typename FirstUnaryPredicate, typename SecondUnaryPredicate, bool UsingOrderedBlockId = false>
inline hipError_t rocprim::partition_three_way(void *temporary_storage, size_t &storage_size, InputIterator input, FirstOutputIterator output_first_part, SecondOutputIterator output_second_part, UnselectedOutputIterator output_unselected, SelectedCountOutputIterator selected_count_output, const size_t size, FirstUnaryPredicate select_first_part_op, SecondUnaryPredicate select_second_part_op, const hipStream_t stream = 0, const bool debug_synchronous = false)# Parallel select primitive for device level using two selection predicates.
Performs a device-wide three-way partition using two selection predicates. Partition copies the values from
inputto eitheroutput_first_partoroutput_second_partoroutput_unselectedaccording to the following criteria: The value is copied tooutput_first_partif the predicateselect_first_part_opinvoked with the value returnstrue. It is copied tooutput_second_partifselect_first_part_opreturnsfalseandselect_second_part_opreturnstrue, and it is copied tooutput_unselectedotherwise.- Overview
Returns the required size of
temporary_storageinstorage_sizeiftemporary_storageis a null pointer.Range specified by
selected_count_outputmust have at least 2 elements.Relative order is preserved for the elements.
The number of elements written to
output_first_partis equal to the number of elements in the input for whichselect_first_part_opreturnedtrue.The number of elements written to
output_second_partis equal to the number of elements in the input for whichselect_first_part_opreturnedfalseandselect_second_part_opreturnedtrue.The number of elements written to
output_unselectedis equal to the number of input elements minus the number of elements written tooutput_first_partminus the number of elements written tooutput_second_part.
- Example
In this example a device-level three-way partition operation is performed on an array of integer values, even values are copied to the first partition, odd and 3-divisible values are copied to the second partition, and the rest of the values are copied to the unselected partition
#include <rocprim/rocprim.hpp> auto first_predicate = [] (int a) -> bool { return (a%2) == 0; }; auto second_predicate = [] (int a) -> bool { return (a%3) == 0; }; // Prepare input and output (declare pointers, allocate device memory etc.) size_t input_size; // e.g., 8 int * input; // e.g., [1, 2, 3, 4, 5, 6, 7, 8] int * output_first_part; // array of 8 elements int * output_second_part; // array of 8 elements int * output_unselected; // array of 8 elements size_t * output_count; // array of 2 elements size_t temporary_storage_size_bytes; void * temporary_storage_ptr = nullptr; // Get required size of the temporary storage rocprim::partition_three_way( temporary_storage_ptr, temporary_storage_size_bytes, input, output_first_part, output_second_part, output_unselected, output_count, input_size, first_predicate, second_predicate ); // allocate temporary storage hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes); // perform partition rocprim::partition_three_way( temporary_storage_ptr, temporary_storage_size_bytes, input, output_first_part, output_second_part, output_unselected, output_count, input_size, first_predicate, second_predicate ); // elements denoted by '*' were not modified // output_first_part: [2, 4, 6, 8, *, *, *, *] // output_second_part: [3, *, *, *, *, *, *, *] // output_unselected: [1, 5, 7, *, *, *, *, *] // output_count: [4, 1]
- Template Parameters:
Config – [optional] Configuration of the primitive, must be
default_configorselect_config.InputIterator – random-access iterator type of the input range. It can be a simple pointer type.
FirstOutputIterator – random-access iterator type of the first output range. It can be a simple pointer type.
SecondOutputIterator – random-access iterator type of the second output range. It can be a simple pointer type.
UnselectedOutputIterator – random-access iterator type of the unselected output range. It can be a simple pointer type.
SelectedCountOutputIterator – random-access iterator type of the selected_count_output value. It can be a simple pointer type.
FirstUnaryPredicate – type of the first unary selection predicate.
SecondUnaryPredicate – type of the second unary selection predicate.
UsingOrderedBlockId – If true, uses an atomic counter to assign block id instead of natural blockIdx-based ordering. Can increase performance on MI3xx architectures when using streams. The default is false.
- Parameters:
temporary_storage – [in] pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to
storage_sizeand function returns without performing the select operation.storage_size – [inout] reference to a size (in bytes) of
temporary_storage.input – [in] iterator to the first element in the range to select values from.
output_first_part – [out] iterator to the first element in the first output range.
output_second_part – [out] iterator to the first element in the second output range.
output_unselected – [out] iterator to the first element in the unselected output range.
selected_count_output – [out] iterator to the total number of selected values in
output_first_partandoutput_second_partrespectively.size – [in] number of elements in the input range.
select_first_part_op – [in] unary function object which returns
trueif the element should be inoutput_first_partrange The signature of the function should be equivalent to the following:bool f(const T &a);. The signature does not need to haveconst &, but function object must not modify the object passed to it.select_second_part_op – [in] unary function object which returns
trueif the element should be inoutput_second_partrange (given thatselect_first_part_opreturnedfalse) The signature of the function should be equivalent to the following:bool f(const T &a);. The signature does not need to haveconst &, but function object must not modify the object passed to it.stream – [in] [optional] HIP stream object. The default is
0(default stream).debug_synchronous – [in] [optional] If true, synchronization after every kernel launch is forced in order to check for errors. The default value is
false.