Run Length Encode#
Configuring the kernel#
-
template<class ReduceByKeyConfig, class SelectConfig = default_config>
struct run_length_encode_config# Configuration of device-level run-length encoding operation.
- Template Parameters:
ReduceByKeyConfig – - configuration of device-level reduce-by-key operation. Must be
reduce_by_key_config
ordefault_config
.SelectConfig – - configuration of device-level select operation. Must be
select_config
ordefault_config
.
run_length_encode#
-
template<class Config = default_config, class InputIterator, class UniqueOutputIterator, class CountsOutputIterator, class RunsCountOutputIterator>
inline hipError_t rocprim::run_length_encode(void *temporary_storage, size_t &storage_size, InputIterator input, unsigned int size, UniqueOutputIterator unique_output, CountsOutputIterator counts_output, RunsCountOutputIterator runs_count_output, hipStream_t stream = 0, bool debug_synchronous = false)# Parallel run-length encoding for device level.
run_length_encode function performs a device-wide run-length encoding of runs (groups) of consecutive values. The first value of each run is copied to
unique_output
and the length of the run is written tocounts_output
. The total number of runs is written toruns_count_output
.- Overview
Returns the required size of
temporary_storage
instorage_size
iftemporary_storage
in a null pointer.Range specified by
input
must have at leastsize
elements.Range specified by
runs_count_output
must have at least 1 element.Ranges specified by
unique_output
andcounts_output
must have at least*runs_count_output
(i.e. the number of runs) elements.
- Example
In this example a device-level run-length encoding operation is performed on an array of integer values.
#include <rocprim/rocprim.hpp> // Prepare input and output (declare pointers, allocate device memory etc.) size_t input_size; // e.g., 8 int * input; // e.g., [1, 1, 1, 2, 10, 10, 10, 88] int * unique_output; // empty array of at least 4 elements int * counts_output; // empty array of at least 4 elements int * runs_count_output; // empty array of 1 element size_t temporary_storage_size_bytes; void * temporary_storage_ptr = nullptr; // Get required size of the temporary storage rocprim::run_length_encode( temporary_storage_ptr, temporary_storage_size_bytes, input, input_size, unique_output, counts_output, runs_count_output ); // allocate temporary storage hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes); // perform encoding rocprim::run_length_encode( temporary_storage_ptr, temporary_storage_size_bytes, input, input_size, unique_output, counts_output, runs_count_output ); // unique_output: [1, 2, 10, 88] // counts_output: [3, 1, 3, 1] // runs_count_output: [4]
- Template Parameters:
Config – - [optional] Configuration of the primitive, must be
default_config
orrun_length_encode_config
.InputIterator – - random-access iterator type of the input range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type.
UniqueOutputIterator – - random-access iterator type of the output range. Must meet the requirements of a C++ OutputIterator concept. It can be a simple pointer type.
CountsOutputIterator – - random-access iterator type of the output range. Must meet the requirements of a C++ OutputIterator concept. It can be a simple pointer type.
RunsCountOutputIterator – - random-access iterator type of the output range. Must meet the requirements of a C++ OutputIterator concept. It can be a simple pointer type.
- Parameters:
temporary_storage – [in] - pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to
storage_size
and function returns without performing the operation.storage_size – [inout] - reference to a size (in bytes) of
temporary_storage
.input – [in] - iterator to the first element in the range of values.
size – [in] - number of element in the input range.
unique_output – [out] - iterator to the first element in the output range of unique values.
counts_output – [out] - iterator to the first element in the output range of lenghts.
runs_count_output – [out] - iterator to total number of runs.
stream – [in] - [optional] HIP stream object. Default is
0
(default stream).debug_synchronous – [in] - [optional] If true, synchronization after every kernel launch is forced in order to check for errors. Default value is
false
.
- Returns:
hipSuccess
(0
) after successful operation; otherwise a HIP runtime error of typehipError_t
.
run_length_encode_non_trivial_runs#
-
template<class Config = default_config, class InputIterator, class OffsetsOutputIterator, class CountsOutputIterator, class RunsCountOutputIterator>
inline hipError_t rocprim::run_length_encode_non_trivial_runs(void *temporary_storage, size_t &storage_size, InputIterator input, unsigned int size, OffsetsOutputIterator offsets_output, CountsOutputIterator counts_output, RunsCountOutputIterator runs_count_output, hipStream_t stream = 0, bool debug_synchronous = false)# Parallel run-length encoding of non-trivial runs for device level.
run_length_encode_non_trivial_runs function performs a device-wide run-length encoding of non-trivial runs (groups) of consecutive values (groups of more than one element). The offset of the first value of each non-trivial run is copied to
offsets_output
and the length of the run (the count of elements) is written tocounts_output
. The total number of non-trivial runs is written toruns_count_output
.- Overview
Returns the required size of
temporary_storage
instorage_size
iftemporary_storage
in a null pointer.Range specified by
input
must have at leastsize
elements.Range specified by
runs_count_output
must have at least 1 element.Ranges specified by
offsets_output
andcounts_output
must have at least*runs_count_output
(i.e. the number of non-trivial runs) elements.
- Example
In this example a device-level run-length encoding of non-trivial runs is performed on an array of integer values.
#include <rocprim/rocprim.hpp> // Prepare input and output (declare pointers, allocate device memory etc.) size_t input_size; // e.g., 8 int * input; // e.g., [1, 1, 1, 2, 10, 10, 10, 88] int * offsets_output; // empty array of at least 2 elements int * counts_output; // empty array of at least 2 elements int * runs_count_output; // empty array of 1 element size_t temporary_storage_size_bytes; void * temporary_storage_ptr = nullptr; // Get required size of the temporary storage rocprim::run_length_encode_non_trivial_runs( temporary_storage_ptr, temporary_storage_size_bytes, input, input_size, offsets_output, counts_output, runs_count_output ); // allocate temporary storage hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes); // perform encoding rocprim::run_length_encode_non_trivial_runs( temporary_storage_ptr, temporary_storage_size_bytes, input, input_size, offsets_output, counts_output, runs_count_output ); // offsets_output: [0, 4] // counts_output: [3, 3] // runs_count_output: [2]
- Template Parameters:
Config – - [optional] Configuration of the primitive, must be
default_config
orrun_length_encode_config
.InputIterator – - random-access iterator type of the input range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type.
OffsetsOutputIterator – - random-access iterator type of the output range. Must meet the requirements of a C++ OutputIterator concept. It can be a simple pointer type.
CountsOutputIterator – - random-access iterator type of the output range. Must meet the requirements of a C++ OutputIterator concept. It can be a simple pointer type.
RunsCountOutputIterator – - random-access iterator type of the output range. Must meet the requirements of a C++ OutputIterator concept. It can be a simple pointer type.
- Parameters:
temporary_storage – [in] - pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to
storage_size
and function returns without performing the operation.storage_size – [inout] - reference to a size (in bytes) of
temporary_storage
.input – [in] - iterator to the first element in the range of values.
size – [in] - number of element in the input range.
offsets_output – [out] - iterator to the first element in the output range of offsets.
counts_output – [out] - iterator to the first element in the output range of lenghts.
runs_count_output – [out] - iterator to total number of runs.
stream – [in] - [optional] HIP stream object. Default is
0
(default stream).debug_synchronous – [in] - [optional] If true, synchronization after every kernel launch is forced in order to check for errors. Default value is
false
.
- Returns:
hipSuccess
(0
) after successful operation; otherwise a HIP runtime error of typehipError_t
.