Partition#
partition#
-
template<class Config = default_config, class InputIterator, class OutputIterator, class SelectedCountOutputIterator, class UnaryPredicate>
inline hipError_t rocprim::partition(void *temporary_storage, size_t &storage_size, InputIterator input, OutputIterator output, SelectedCountOutputIterator selected_count_output, const size_t size, UnaryPredicate predicate, const hipStream_t stream = 0, const bool debug_synchronous = false)# Parallel select primitive for device level using selection predicate.
Performs a device-wide partition using selection predicate. Partition copies the values from
input
tooutput
in such a way that all values for which thepredicate
returnstrue
precede the elements for which it returnsfalse
.- Overview
Returns the required size of
temporary_storage
instorage_size
iftemporary_storage
in a null pointer.Range specified by
selected_count_output
must have at least 1 element.Relative order is preserved for the elements for which the
predicate
returnstrue
. Other elements are copied in reverse order.
- Example
In this example a device-level partition operation is performed on an array of integer values, even values are copied before odd values.
#include <rocprim/rocprim.hpp> auto predicate = [] __device__ (int a) -> bool { return (a%2) == 0; }; // Prepare input and output (declare pointers, allocate device memory etc.) size_t input_size; // e.g., 8 int * input; // e.g., [1, 2, 3, 4, 5, 6, 7, 8] int * output; // empty array of 8 elements size_t * output_count; // empty array of 1 element size_t temporary_storage_size_bytes; void * temporary_storage_ptr = nullptr; // Get required size of the temporary storage rocprim::partition( temporary_storage_ptr, temporary_storage_size_bytes, input, output, output_count, input_size, predicate ); // allocate temporary storage hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes); // perform partition rocprim::partition( temporary_storage_ptr, temporary_storage_size_bytes, input, output, output_count, input_size, predicate ); // output: [2, 4, 6, 8, 7, 5, 3, 1] // output_count: 4
- Template Parameters:
Config – - [optional] configuration of the primitive. It has to be
select_config
or a class derived from it.InputIterator – - random-access iterator type of the input range. It can be a simple pointer type.
OutputIterator – - random-access iterator type of the output range. It can be a simple pointer type.
SelectedCountOutputIterator – - random-access iterator type of the selected_count_output value. It can be a simple pointer type.
UnaryPredicate – - type of a unary selection predicate.
- Parameters:
temporary_storage – [in] - pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to
storage_size
and function returns without performing the select operation.storage_size – [inout] - reference to a size (in bytes) of
temporary_storage
.input – [in] - iterator to the first element in the range to select values from.
output – [out] - iterator to the first element in the output range.
selected_count_output – [out] - iterator to the total number of selected values (length of
output
).size – [in] - number of elements in the input range.
predicate – [in] - unary function object which returns
true
if the element should be ordered before other elements. The signature of the function should be equivalent to the following:bool f(const T &a);
. The signature does not need to haveconst &
, but function object must not modify the object passed to it.stream – [in] - [optional] HIP stream object. The default is
0
(default stream).debug_synchronous – [in] - [optional] If true, synchronization after every kernel launch is forced in order to check for errors. The default value is
false
.
partition_two_way#
-
template<class Config = default_config, class InputIterator, class SelectedOutputIterator, class RejectedOutputIterator, class SelectedCountOutputIterator, class Predicate>
inline hipError_t rocprim::partition_two_way(void *temporary_storage, size_t &storage_size, InputIterator input, SelectedOutputIterator output_selected, RejectedOutputIterator output_rejected, SelectedCountOutputIterator selected_count_output, const size_t size, Predicate predicate, const hipStream_t stream = 0, const bool debug_synchronous = false)# Two-way parallel select primitive for device level using selection predicate.
Performs a device-wide partition using selection predicate. Partition copies the values from
input
tooutput_selected
andoutput_rejected
for all values for which thepredicate
returnstrue
andfalse
respectively.- Overview
Returns the required size of
temporary_storage
instorage_size
iftemporary_storage
in a null pointer.The number of elements written to
output_selected
is equal to the number elements in the input for whichpredicate
returnstrue
.The number of elements written to
output_rejected
is equal to the number elements in the input for whichpredicate
returnsfalse
.Range specified by
selected_count_output
must have at least 1 element.Relative order is preserved.
- Example
In this example a device-level two-way partition operation is performed on an array of integer values, even values are copied into the selected output and odd values are copied into rejected output.
#include <rocprim/rocprim.hpp> auto predicate = [] __device__ (int a) -> bool { return (a%2) == 0; }; // Prepare input and output (declare pointers, allocate device memory etc.) size_t input_size; // e.g., 8 int * input; // e.g., [1, 2, 3, 4, 5, 6, 7, 8] int * selected_output; // empty array of at least 4 elements int * rejected_output; // empty array of at least 4 elements size_t * output_count; // empty array of 1 element size_t temporary_storage_size_bytes; void * temporary_storage_ptr = nullptr; // Get required size of the temporary storage rocprim::partition_two_way( temporary_storage_ptr, temporary_storage_size_bytes, input, selected_output, rejected_output, selected_output_count, input_size, predicate ); // allocate temporary storage hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes); // perform partition rocprim::partition_two_way( temporary_storage_ptr, temporary_storage_size_bytes, input, selected_output, rejected_output, selected_output_count, input_size, predicate ); // output_selected: [2, 4, 6, 8] // output_rejected: [1, 3, 5, 7] // selected_output_count: 4
- Template Parameters:
Config – - [optional] configuration of the primitive. If has to be an instance of
select_config
or a class derived from it.InputIterator – - random-access iterator type of the input range. It can be a simple pointer type.
SelectedOutputIterator – - random-access iterator type of the selected output range. It can be a simple pointer type.
RejectedOutputIterator – - random-access iterator type of the rejected output range. It can be a simple pointer type.
SelectedCountOutputIterator – - random-access iterator type of the selected_count_output value. It can be a simple pointer type.
Predicate – - type of the selection predicate.
- Parameters:
temporary_storage – [in] - pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to
storage_size
and function returns without performing the select operation.storage_size – [inout] - reference to a size (in bytes) of
temporary_storage
.input – [in] - iterator to the first element in the range to select values from.
output_selected – [out] - iterator to the first element in the selected output range.
output_rejected – [out] - iterator to the first element in the rejected output range.
selected_count_output – [out] - iterator to the total number of selected values (length of
output_selected
).size – [in] - number of elements in the input range.
predicate – [in] - the unary selection predicate to select values into the select and reject outputs.
stream – [in] - [optional] HIP stream object. The default is
0
(default stream).debug_synchronous – [in] - [optional] If true, synchronization after every kernel launch is forced in order to check for errors. The default value is
false
.
partition_three_way#
-
template<class Config = default_config, typename InputIterator, typename FirstOutputIterator, typename SecondOutputIterator, typename UnselectedOutputIterator, typename SelectedCountOutputIterator, typename FirstUnaryPredicate, typename SecondUnaryPredicate>
inline hipError_t rocprim::partition_three_way(void *temporary_storage, size_t &storage_size, InputIterator input, FirstOutputIterator output_first_part, SecondOutputIterator output_second_part, UnselectedOutputIterator output_unselected, SelectedCountOutputIterator selected_count_output, const size_t size, FirstUnaryPredicate select_first_part_op, SecondUnaryPredicate select_second_part_op, const hipStream_t stream = 0, const bool debug_synchronous = false)# Parallel select primitive for device level using two selection predicates.
Performs a device-wide three-way partition using two selection predicates. Partition copies the values from
input
to eitheroutput_first_part
oroutput_second_part
oroutput_unselected
according to the following criteria: The value is copied tooutput_first_part
if the predicateselect_first_part_op
invoked with the value returnstrue
. It is copied tooutput_second_part
ifselect_first_part_op
returnsfalse
andselect_second_part_op
returnstrue
, and it is copied tooutput_unselected
otherwise.- Overview
Returns the required size of
temporary_storage
instorage_size
iftemporary_storage
is a null pointer.Range specified by
selected_count_output
must have at least 2 elements.Relative order is preserved for the elements.
The number of elements written to
output_first_part
is equal to the number of elements in the input for whichselect_first_part_op
returnedtrue
.The number of elements written to
output_second_part
is equal to the number of elements in the input for whichselect_first_part_op
returnedfalse
andselect_second_part_op
returnedtrue
.The number of elements written to
output_unselected
is equal to the number of input elements minus the number of elements written tooutput_first_part
minus the number of elements written tooutput_second_part
.
- Example
In this example a device-level three-way partition operation is performed on an array of integer values, even values are copied to the first partition, odd and 3-divisible values are copied to the second partition, and the rest of the values are copied to the unselected partition
#include <rocprim/rocprim.hpp> auto first_predicate = [] __device__ (int a) -> bool { return (a%2) == 0; }; auto second_predicate = [] __device__ (int a) -> bool { return (a%3) == 0; }; // Prepare input and output (declare pointers, allocate device memory etc.) size_t input_size; // e.g., 8 int * input; // e.g., [1, 2, 3, 4, 5, 6, 7, 8] int * output_first_part; // array of 8 elements int * output_second_part; // array of 8 elements int * output_unselected; // array of 8 elements size_t * output_count; // array of 2 elements size_t temporary_storage_size_bytes; void * temporary_storage_ptr = nullptr; // Get required size of the temporary storage rocprim::partition_three_way( temporary_storage_ptr, temporary_storage_size_bytes, input, output_first_part, output_second_part, output_unselected, output_count, input_size, first_predicate, second_predicate ); // allocate temporary storage hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes); // perform partition rocprim::partition_three_way( temporary_storage_ptr, temporary_storage_size_bytes, input, output_first_part, output_second_part, output_unselected, output_count, input_size, first_predicate, second_predicate ); // elements denoted by '*' were not modified // output_first_part: [2, 4, 6, 8, *, *, *, *] // output_second_part: [3, *, *, *, *, *, *, *] // output_unselected: [1, 5, 7, *, *, *, *, *] // output_count: [4, 1]
- Template Parameters:
Config – - [optional] configuration of the primitive. It has to be
select_config
or a class derived from it.InputIterator – - random-access iterator type of the input range. It can be a simple pointer type.
FirstOutputIterator – - random-access iterator type of the first output range. It can be a simple pointer type.
SecondOutputIterator – - random-access iterator type of the second output range. It can be a simple pointer type.
UnselectedOutputIterator – - random-access iterator type of the unselected output range. It can be a simple pointer type.
SelectedCountOutputIterator – - random-access iterator type of the selected_count_output value. It can be a simple pointer type.
FirstUnaryPredicate – - type of the first unary selection predicate.
SecondUnaryPredicate – - type of the second unary selection predicate.
- Parameters:
temporary_storage – [in] - pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to
storage_size
and function returns without performing the select operation.storage_size – [inout] - reference to a size (in bytes) of
temporary_storage
.input – [in] - iterator to the first element in the range to select values from.
output_first_part – [out] - iterator to the first element in the first output range.
output_second_part – [out] - iterator to the first element in the second output range.
output_unselected – [out] - iterator to the first element in the unselected output range.
selected_count_output – [out] - iterator to the total number of selected values in
output_first_part
andoutput_second_part
respectively.size – [in] - number of elements in the input range.
select_first_part_op – [in] - unary function object which returns
true
if the element should be inoutput_first_part
range The signature of the function should be equivalent to the following:bool f(const T &a);
. The signature does not need to haveconst &
, but function object must not modify the object passed to it.select_second_part_op – [in] - unary function object which returns
true
if the element should be inoutput_second_part
range (given thatselect_first_part_op
returnedfalse
) The signature of the function should be equivalent to the following:bool f(const T &a);
. The signature does not need to haveconst &
, but function object must not modify the object passed to it.stream – [in] - [optional] HIP stream object. The default is
0
(default stream).debug_synchronous – [in] - [optional] If true, synchronization after every kernel launch is forced in order to check for errors. The default value is
false
.