Scan#
Configuring the kernel#
scan#
- 
template<unsigned int BlockSize, unsigned int ItemsPerThread, ::rocprim::block_load_method BlockLoadMethod, ::rocprim::block_store_method BlockStoreMethod, ::rocprim::block_scan_algorithm BlockScanMethod, unsigned int SizeLimit = std::numeric_limits<unsigned int>::max()>
 struct scan_config : public rocprim::detail::scan_config_params#
- Configuration of device-level scan primitives. - Template Parameters:
- BlockSize – - number of threads in a block. 
- ItemsPerThread – - number of items processed by each thread. 
- BlockLoadMethod – - method for loading input values. 
- BlockStoreMethod – - method for storing values. 
- BlockScanMethod – - algorithm for block scan. 
- SizeLimit – - limit on the number of items for a single scan kernel launch. 
 
 - Subclassed by rocprim::detail::default_scan_config< arch, value_type, enable > 
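To make the role of BlockSize and ItemsPerThread concrete, the sketch below does the tile arithmetic on the host. It assumes each block handles one tile of BlockSize * ItemsPerThread items; tile_shape, example_shape and blocks_needed are hypothetical names for illustration only and are not part of rocPRIM.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical host-side mirror of two tuning parameters; not a rocPRIM type.
template <unsigned int BlockSize, unsigned int ItemsPerThread>
struct tile_shape
{
    static constexpr unsigned int block_size       = BlockSize;
    static constexpr unsigned int items_per_thread = ItemsPerThread;
    // Items covered by one block, assuming one tile per block.
    static constexpr unsigned int items_per_block = BlockSize * ItemsPerThread;
};

// Example shape: 256 threads, 4 items per thread, 1024 items per block.
using example_shape = tile_shape<256, 4>;

// Number of blocks needed to cover `size` items (ceiling division).
template <class Shape>
constexpr std::size_t blocks_needed(std::size_t size)
{
    return (size + Shape::items_per_block - 1) / Shape::items_per_block;
}
```

Under this assumption, raising ItemsPerThread reduces the number of blocks for a given input size while increasing the work (and register use) per thread.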
scan_by_key#
- 
template<unsigned int BlockSize, unsigned int ItemsPerThread, ::rocprim::block_load_method BlockLoadMethod, ::rocprim::block_store_method BlockStoreMethod, ::rocprim::block_scan_algorithm BlockScanMethod, unsigned int SizeLimit = std::numeric_limits<unsigned int>::max()>
 struct scan_by_key_config : public rocprim::detail::scan_by_key_config_params#
- Configuration of device-level scan-by-key operation. - Template Parameters:
- BlockSize – - number of threads in a block. 
- ItemsPerThread – - number of items processed by each thread. 
- BlockLoadMethod – - method for loading input values. 
- BlockStoreMethod – - method for storing values. 
- BlockScanMethod – - algorithm for block scan. 
- SizeLimit – - limit on the number of items for a single scan kernel launch. 
 
 - Subclassed by rocprim::detail::default_scan_by_key_config< arch, key_type, value_type, enable > 
scan#
inclusive#
- 
template<class Config = default_config, class InputIterator, class OutputIterator, class BinaryFunction = ::rocprim::plus<typename std::iterator_traits<InputIterator>::value_type>, class AccType = typename std::iterator_traits<InputIterator>::value_type>
 inline hipError_t rocprim::inclusive_scan(void *temporary_storage, size_t &storage_size, InputIterator input, OutputIterator output, const size_t size, BinaryFunction scan_op = BinaryFunction(), const hipStream_t stream = 0, bool debug_synchronous = false)#
- Parallel inclusive scan primitive for device level. The inclusive_scan function performs a device-wide inclusive prefix scan operation using the binary scan_op operator.
- Overview
- Supports non-commutative scan operators. However, a scan operator should be associative. When used with non-associative functions the results may be non-deterministic and/or vary in precision. 
- Returns the required size of temporary_storage in storage_size if temporary_storage is a null pointer.
- Ranges specified by input and output must have at least size elements.
- By default, the input type is used for accumulation. A custom type can be specified using the AccType type parameter; see the example below.
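The associativity requirement can be illustrated on the host with float addition, which is not associative. The sketch below compares a left-to-right total with a total built from two partial results, which is one way a parallel scan may group the work. The helper names fold, sequential_total and blocked_total are hypothetical, and the split point is chosen by hand; this is not how rocPRIM partitions its input.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Left-to-right fold of v[first, last); requires first < last.
inline float fold(const std::vector<float>& v, std::size_t first, std::size_t last)
{
    assert(first < last && last <= v.size());
    float acc = v[first];
    for(std::size_t i = first + 1; i < last; ++i)
    {
        acc = acc + v[i];
    }
    return acc;
}

// Total as a purely sequential scan would produce it for the last element.
inline float sequential_total(const std::vector<float>& v)
{
    return fold(v, 0, v.size());
}

// Total when the input is split into two partial results that are combined
// afterwards. The split point is supplied by the caller.
inline float blocked_total(const std::vector<float>& v, std::size_t split)
{
    return fold(v, 0, split) + fold(v, split, v.size());
}
```

For the input {1e20f, -1e20f, 1.0f}, the sequential grouping yields 1 while splitting after the first element yields 0, because 1.0f is absorbed when added to -1e20f.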
 
- Example
- In this example a device-level inclusive sum operation is performed on an array of integer values (shorts are scanned into ints).

    #include <rocprim/rocprim.hpp>

    // Prepare input and output (declare pointers, allocate device memory etc.)
    size_t input_size;    // e.g., 8
    short * input;        // e.g., [1, 2, 3, 4, 5, 6, 7, 8]
    int * output;         // empty array of 8 elements

    size_t temporary_storage_size_bytes;
    void * temporary_storage_ptr = nullptr;
    // Get required size of the temporary storage
    rocprim::inclusive_scan(
        temporary_storage_ptr, temporary_storage_size_bytes,
        input, output, input_size, rocprim::plus<int>()
    );

    // allocate temporary storage
    hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes);

    // perform scan
    rocprim::inclusive_scan(
        temporary_storage_ptr, temporary_storage_size_bytes,
        input, output, input_size, rocprim::plus<int>()
    );
    // output: [1, 3, 6, 10, 15, 21, 28, 36]

- The same example as above, but now a custom accumulator type is specified.

    #include <rocprim/rocprim.hpp>

    size_t input_size;
    short * input;
    int * output;

    size_t temporary_storage_size_bytes;
    void * temporary_storage_ptr = nullptr;
    // Query the storage size with the same template arguments as the scan itself
    rocprim::inclusive_scan<rocprim::default_config, short*, int*, rocprim::plus<int>, int>(
        temporary_storage_ptr, temporary_storage_size_bytes,
        input, output, input_size, rocprim::plus<int>()
    );

    hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes);

    // Use the last type parameter to set a custom accumulator type
    rocprim::inclusive_scan<rocprim::default_config, short*, int*, rocprim::plus<int>, int>(
        temporary_storage_ptr, temporary_storage_size_bytes,
        input, output, input_size, rocprim::plus<int>()
    );
 - Template Parameters:
- Config – - [optional] configuration of the primitive, has to be scan_config or a class derived from it.
- InputIterator – - random-access iterator type of the input range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type. 
- OutputIterator – - random-access iterator type of the output range. Must meet the requirements of a C++ OutputIterator concept. It can be a simple pointer type. 
- BinaryFunction – - type of binary function used for scan. Default type is rocprim::plus<T>, where T is the value_type of InputIterator.
- AccType – - accumulator type used to propagate the scanned values. Default type is value type of the input iterator. 
 
- Parameters:
- temporary_storage – [in] - pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to storage_size and the function returns without performing the scan operation.
- storage_size – [inout] - reference to a size (in bytes) of temporary_storage.
- input – [in] - iterator to the first element in the range to scan. 
- output – [out] - iterator to the first element in the output range. It can be the same as input.
- size – [in] - number of elements in the input range. 
- scan_op – [in] - binary operation function object that will be used for scan. The signature of the function should be equivalent to the following: T f(const T &a, const T &b);. The signature does not need to have const &, but the function object must not modify the objects passed to it. Default is BinaryFunction().
- stream – [in] - [optional] HIP stream object. Default is 0 (default stream).
- debug_synchronous – [in] - [optional] If true, synchronization after every kernel launch is forced in order to check for errors. Default value is false.
 
- Returns:
- hipSuccess (0) after successful scan; otherwise a HIP runtime error of type hipError_t.
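As a host-side illustration of what the inclusive scan computes and of the role of AccType, here is a minimal sequential sketch. It is a reference for the expected values only, not rocPRIM's implementation; reference_inclusive_scan and plus_int are hypothetical names, with plus_int standing in for rocprim::plus<int> on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential reference: output[i] = input[0] op input[1] op ... op input[i],
// with the running value held in AccType.
template <class AccType, class In, class Out, class BinaryFunction>
void reference_inclusive_scan(const std::vector<In>& input,
                              std::vector<Out>&      output,
                              BinaryFunction         scan_op)
{
    assert(output.size() >= input.size());
    if(input.empty())
    {
        return;
    }
    AccType acc = static_cast<AccType>(input[0]);
    output[0]   = static_cast<Out>(acc);
    for(std::size_t i = 1; i < input.size(); ++i)
    {
        acc       = static_cast<AccType>(scan_op(acc, static_cast<AccType>(input[i])));
        output[i] = static_cast<Out>(acc);
    }
}

// Host stand-in for a sum operator on int (hypothetical helper).
struct plus_int
{
    int operator()(int a, int b) const { return a + b; }
};
```

Calling reference_inclusive_scan<int>(in, out, plus_int{}) on shorts [1, 2, 3, 4, 5, 6, 7, 8] fills out with [1, 3, 6, 10, 15, 21, 28, 36]. With inputs such as [30000, 30000], an int accumulator holds 60000, a value a 16-bit short accumulator could not represent.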
 
exclusive#
- 
template<class Config = default_config, class InputIterator, class OutputIterator, class InitValueType, class BinaryFunction = ::rocprim::plus<typename std::iterator_traits<InputIterator>::value_type>, class AccType = detail::input_type_t<InitValueType>>
 inline hipError_t rocprim::exclusive_scan(void *temporary_storage, size_t &storage_size, InputIterator input, OutputIterator output, const InitValueType initial_value, const size_t size, BinaryFunction scan_op = BinaryFunction(), const hipStream_t stream = 0, bool debug_synchronous = false)#
- Parallel exclusive scan primitive for device level. The exclusive_scan function performs a device-wide exclusive prefix scan operation using the binary scan_op operator.
- Overview
- Supports non-commutative scan operators. However, a scan operator should be associative. When used with non-associative functions the results may be non-deterministic and/or vary in precision. 
- Returns the required size of temporary_storage in storage_size if temporary_storage is a null pointer.
- Ranges specified by input and output must have at least size elements.
 
- Example
- In this example a device-level exclusive min-scan operation is performed on an array of integer values (shorts are scanned into ints) using a custom operator.

    #include <rocprim/rocprim.hpp>

    // custom scan function
    auto min_op =
        [] __device__ (int a, int b) -> int
        {
            return a < b ? a : b;
        };

    // Prepare input and output (declare pointers, allocate device memory etc.)
    size_t input_size;    // e.g., 8
    short * input;        // e.g., [4, 7, 6, 2, 5, 1, 3, 8]
    int * output;         // empty array of 8 elements
    int start_value;      // e.g., 9

    size_t temporary_storage_size_bytes;
    void * temporary_storage_ptr = nullptr;
    // Get required size of the temporary storage
    rocprim::exclusive_scan(
        temporary_storage_ptr, temporary_storage_size_bytes,
        input, output, start_value, input_size, min_op
    );

    // allocate temporary storage
    hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes);

    // perform scan
    rocprim::exclusive_scan(
        temporary_storage_ptr, temporary_storage_size_bytes,
        input, output, start_value, input_size, min_op
    );
    // output: [9, 4, 4, 4, 2, 2, 1, 1]
 - Template Parameters:
- Config – - [optional] configuration of the primitive, has to be scan_config or a class derived from it.
- InputIterator – - random-access iterator type of the input range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type. 
- OutputIterator – - random-access iterator type of the output range. Must meet the requirements of a C++ OutputIterator concept. It can be a simple pointer type. 
- InitValueType – - type of the initial value. 
- BinaryFunction – - type of binary function used for scan. Default type is rocprim::plus<T>, where T is the value_type of InputIterator.
- AccType – - accumulator type used to propagate the scanned values. Default type is InitValueType, unless that is a rocprim::future_value, in which case it is the wrapped input type. 
 
- Parameters:
- temporary_storage – [in] - pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to storage_size and the function returns without performing the scan operation.
- storage_size – [inout] - reference to a size (in bytes) of temporary_storage.
- input – [in] - iterator to the first element in the range to scan. 
- output – [out] - iterator to the first element in the output range. It can be the same as input.
- initial_value – [in] - initial value to start the scan. A rocprim::future_value may be passed to use a value that will be computed later. 
- size – [in] - number of elements in the input range. 
- scan_op – [in] - binary operation function object that will be used for scan. The signature of the function should be equivalent to the following: T f(const T &a, const T &b);. The signature does not need to have const &, but the function object must not modify the objects passed to it. The default value is BinaryFunction().
- stream – [in] - [optional] HIP stream object. The default is 0 (default stream).
- debug_synchronous – [in] - [optional] If true, synchronization after every kernel launch is forced in order to check for errors. The default value is false.
 
- Returns:
- hipSuccess (0) after successful scan; otherwise a HIP runtime error of type hipError_t.
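The difference from the inclusive variant is that each output element holds the running value before the matching input is applied, starting from initial_value. A minimal sequential host sketch follows, as a reference for expected values rather than rocPRIM's implementation. The names reference_exclusive_scan and min_int are hypothetical, and the running value is kept in the initial value's type as a simplifying assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential reference: output[0] = initial_value and
// output[i] = initial_value op input[0] op ... op input[i - 1].
template <class In, class Out, class Init, class BinaryFunction>
void reference_exclusive_scan(const std::vector<In>& input,
                              std::vector<Out>&      output,
                              Init                   initial_value,
                              BinaryFunction         scan_op)
{
    assert(output.size() >= input.size());
    Init acc = initial_value;
    for(std::size_t i = 0; i < input.size(); ++i)
    {
        output[i] = static_cast<Out>(acc); // value before input[i] is applied
        acc       = static_cast<Init>(scan_op(acc, static_cast<Init>(input[i])));
    }
}

// Host stand-in for the min operator used in the documented example.
struct min_int
{
    int operator()(int a, int b) const { return a < b ? a : b; }
};
```

With input [4, 7, 6, 2, 5, 1, 3, 8] and an initial value of 9, this yields [9, 4, 4, 4, 2, 2, 1, 1], matching the documented example.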
 
segmented, inclusive#
- 
template<class Config = default_config, class InputIterator, class OutputIterator, class OffsetIterator, class BinaryFunction = ::rocprim::plus<typename std::iterator_traits<InputIterator>::value_type>>
 inline hipError_t rocprim::segmented_inclusive_scan(void *temporary_storage, size_t &storage_size, InputIterator input, OutputIterator output, unsigned int segments, OffsetIterator begin_offsets, OffsetIterator end_offsets, BinaryFunction scan_op = BinaryFunction(), hipStream_t stream = 0, bool debug_synchronous = false)#
- Parallel segmented inclusive scan primitive for device level. The segmented_inclusive_scan function performs a device-wide inclusive scan operation across multiple sequences from input using the binary scan_op operator.
- Overview
- Returns the required size of temporary_storage in storage_size if temporary_storage is a null pointer.
- Ranges specified by input and output must cover every segment, that is, hold at least as many elements as the largest offset in end_offsets.
- Ranges specified by begin_offsets and end_offsets must have at least segments elements. They may use the same sequence offsets of at least segments + 1 elements: offsets for begin_offsets and offsets + 1 for end_offsets.
 
- Example
- In this example a device-level segmented inclusive min-scan operation is performed on an array of integer values (shorts are scanned into ints) using a custom operator.

    #include <rocprim/rocprim.hpp>

    // custom scan function
    auto min_op =
        [] __device__ (int a, int b) -> int
        {
            return a < b ? a : b;
        };

    // Prepare input and output (declare pointers, allocate device memory etc.)
    short * input;        // e.g., [4, 7, 6, 2, 5, 1, 3, 8]
    int * output;         // empty array of 8 elements
    size_t segments;      // e.g., 3
    int * offsets;        // e.g., [0, 2, 4, 8]

    size_t temporary_storage_size_bytes;
    void * temporary_storage_ptr = nullptr;
    // Get required size of the temporary storage
    rocprim::segmented_inclusive_scan(
        temporary_storage_ptr, temporary_storage_size_bytes,
        input, output, segments, offsets, offsets + 1, min_op
    );

    // allocate temporary storage
    hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes);

    // perform scan
    rocprim::segmented_inclusive_scan(
        temporary_storage_ptr, temporary_storage_size_bytes,
        input, output, segments, offsets, offsets + 1, min_op
    );
    // output: [4, 4, 6, 2, 5, 1, 1, 1]
 - Template Parameters:
- Config – - [optional] configuration of the primitive. It has to be scan_config or a class derived from it.
- InputIterator – - random-access iterator type of the input range. Must meet the requirements of a C++ RandomAccessIterator concept. It can be a simple pointer type. 
- OutputIterator – - random-access iterator type of the output range. Must meet the requirements of a C++ RandomAccessIterator concept. It can be a simple pointer type. 
- OffsetIterator – - random-access iterator type of segment offsets. Must meet the requirements of a C++ RandomAccessIterator concept. It can be a simple pointer type. 
- BinaryFunction – - type of binary function used for scan operation. Default type is rocprim::plus<T>, where T is the value_type of InputIterator.
 
- Parameters:
- temporary_storage – [in] - pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to storage_size and the function returns without performing the scan operation.
- storage_size – [inout] - reference to a size (in bytes) of temporary_storage.
- input – [in] - iterator to the first element in the range to scan. 
- output – [out] - iterator to the first element in the output range. 
- segments – [in] - number of segments in the input range. 
- begin_offsets – [in] - iterator to the first element in the range of beginning offsets. 
- end_offsets – [in] - iterator to the first element in the range of ending offsets. 
- scan_op – [in] - binary operation function object that will be used for scan. The signature of the function should be equivalent to the following: T f(const T &a, const T &b);. The signature does not need to have const &, but the function object must not modify the objects passed to it. The default value is BinaryFunction().
- stream – [in] - [optional] HIP stream object. The default is 0 (default stream).
- debug_synchronous – [in] - [optional] If true, synchronization after every kernel launch is forced in order to check for errors. The default value is false.
 
- Returns:
- hipSuccess (0) after successful scan; otherwise a HIP runtime error of type hipError_t.
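To show how begin_offsets and end_offsets delimit independent scans, here is a minimal sequential host sketch. It is a reference for expected values only; reference_segmented_inclusive_scan and min_int are hypothetical names, and accumulating in the output type is an assumption of this sketch rather than a statement about rocPRIM's internal accumulator type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scans each half-open segment [begin_offsets[s], end_offsets[s]) on its own.
template <class In, class Out, class OffsetIterator, class BinaryFunction>
void reference_segmented_inclusive_scan(const std::vector<In>& input,
                                        std::vector<Out>&      output,
                                        unsigned int           segments,
                                        OffsetIterator         begin_offsets,
                                        OffsetIterator         end_offsets,
                                        BinaryFunction         scan_op)
{
    for(unsigned int s = 0; s < segments; ++s)
    {
        const std::size_t begin = static_cast<std::size_t>(begin_offsets[s]);
        const std::size_t end   = static_cast<std::size_t>(end_offsets[s]);
        assert(end <= input.size() && end <= output.size());
        if(begin >= end)
        {
            continue; // empty segment: nothing to write
        }
        Out acc       = static_cast<Out>(input[begin]);
        output[begin] = acc;
        for(std::size_t i = begin + 1; i < end; ++i)
        {
            acc       = static_cast<Out>(scan_op(acc, static_cast<Out>(input[i])));
            output[i] = acc;
        }
    }
}

// Host stand-in for the min operator used in the documented example.
struct min_int
{
    int operator()(int a, int b) const { return a < b ? a : b; }
};
```

With a single offsets array [0, 2, 4, 8], passing offsets as begin_offsets and offsets + 1 as end_offsets describes the three segments [0, 2), [2, 4) and [4, 8), and the input [4, 7, 6, 2, 5, 1, 3, 8] produces [4, 4, 6, 2, 5, 1, 1, 1].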
 
segmented, exclusive#
- 
template<class Config = default_config, class InputIterator, class OutputIterator, class OffsetIterator, class InitValueType, class BinaryFunction = ::rocprim::plus<typename std::iterator_traits<InputIterator>::value_type>>
 inline hipError_t rocprim::segmented_exclusive_scan(void *temporary_storage, size_t &storage_size, InputIterator input, OutputIterator output, unsigned int segments, OffsetIterator begin_offsets, OffsetIterator end_offsets, const InitValueType initial_value, BinaryFunction scan_op = BinaryFunction(), hipStream_t stream = 0, bool debug_synchronous = false)#
- Parallel segmented exclusive scan primitive for device level. The segmented_exclusive_scan function performs a device-wide exclusive scan operation across multiple sequences from input using the binary scan_op operator.
- Overview
- Returns the required size of temporary_storage in storage_size if temporary_storage is a null pointer.
- Ranges specified by input and output must cover every segment, that is, hold at least as many elements as the largest offset in end_offsets.
- Ranges specified by begin_offsets and end_offsets must have at least segments elements. They may use the same sequence offsets of at least segments + 1 elements: offsets for begin_offsets and offsets + 1 for end_offsets.
 
- Example
- In this example a device-level segmented exclusive min-scan operation is performed on an array of integer values (shorts are scanned into ints) using a custom operator.

    #include <rocprim/rocprim.hpp>

    // custom scan function
    auto min_op =
        [] __device__ (int a, int b) -> int
        {
            return a < b ? a : b;
        };

    // Prepare input and output (declare pointers, allocate device memory etc.)
    int start_value;      // e.g., 9
    short * input;        // e.g., [4, 7, 6, 2, 5, 1, 3, 8]
    int * output;         // empty array of 8 elements
    size_t segments;      // e.g., 3
    int * offsets;        // e.g., [0, 2, 4, 8]

    size_t temporary_storage_size_bytes;
    void * temporary_storage_ptr = nullptr;
    // Get required size of the temporary storage
    rocprim::segmented_exclusive_scan(
        temporary_storage_ptr, temporary_storage_size_bytes,
        input, output, segments, offsets, offsets + 1,
        start_value, min_op
    );

    // allocate temporary storage
    hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes);

    // perform scan
    rocprim::segmented_exclusive_scan(
        temporary_storage_ptr, temporary_storage_size_bytes,
        input, output, segments, offsets, offsets + 1,
        start_value, min_op
    );
    // output: [9, 4, 9, 6, 9, 5, 1, 1]
 - Template Parameters:
- Config – - [optional] configuration of the primitive. It has to be scan_config or a class derived from it.
- InputIterator – - random-access iterator type of the input range. Must meet the requirements of a C++ RandomAccessIterator concept. It can be a simple pointer type. 
- OutputIterator – - random-access iterator type of the output range. Must meet the requirements of a C++ RandomAccessIterator concept. It can be a simple pointer type. 
- OffsetIterator – - random-access iterator type of segment offsets. Must meet the requirements of a C++ RandomAccessIterator concept. It can be a simple pointer type. 
- InitValueType – - type of the initial value. 
- BinaryFunction – - type of binary function used for scan operation. Default type is rocprim::plus<T>, where T is the value_type of InputIterator.
 
- Parameters:
- temporary_storage – [in] - pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to storage_size and the function returns without performing the scan operation.
- storage_size – [inout] - reference to a size (in bytes) of temporary_storage.
- input – [in] - iterator to the first element in the range to scan. 
- output – [out] - iterator to the first element in the output range. 
- segments – [in] - number of segments in the input range. 
- begin_offsets – [in] - iterator to the first element in the range of beginning offsets. 
- end_offsets – [in] - iterator to the first element in the range of ending offsets. 
- initial_value – [in] - initial value to start the scan. 
- scan_op – [in] - binary operation function object that will be used for scan. The signature of the function should be equivalent to the following: T f(const T &a, const T &b);. The signature does not need to have const &, but the function object must not modify the objects passed to it. The default value is BinaryFunction().
- stream – [in] - [optional] HIP stream object. The default is 0 (default stream).
- debug_synchronous – [in] - [optional] If true, synchronization after every kernel launch is forced in order to check for errors. The default value is false.
 
- Returns:
- hipSuccess (0) after successful scan; otherwise a HIP runtime error of type hipError_t.
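In the exclusive variant every segment restarts from initial_value. The sequential host sketch below illustrates this as a reference for expected values; it is not rocPRIM's implementation. The names reference_segmented_exclusive_scan and min_int are hypothetical, and the running value is kept in the initial value's type as a simplifying assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// For each half-open segment [begin_offsets[s], end_offsets[s]) the running
// value restarts from initial_value and is written before each input is applied.
template <class In, class Out, class OffsetIterator, class Init, class BinaryFunction>
void reference_segmented_exclusive_scan(const std::vector<In>& input,
                                        std::vector<Out>&      output,
                                        unsigned int           segments,
                                        OffsetIterator         begin_offsets,
                                        OffsetIterator         end_offsets,
                                        Init                   initial_value,
                                        BinaryFunction         scan_op)
{
    for(unsigned int s = 0; s < segments; ++s)
    {
        const std::size_t begin = static_cast<std::size_t>(begin_offsets[s]);
        const std::size_t end   = static_cast<std::size_t>(end_offsets[s]);
        assert(end <= input.size() && end <= output.size());
        Init acc = initial_value; // restart at every segment
        for(std::size_t i = begin; i < end; ++i)
        {
            output[i] = static_cast<Out>(acc);
            acc       = static_cast<Init>(scan_op(acc, static_cast<Init>(input[i])));
        }
    }
}

// Host stand-in for the min operator used in the documented example.
struct min_int
{
    int operator()(int a, int b) const { return a < b ? a : b; }
};
```

For input [4, 7, 6, 2, 5, 1, 3, 8], offsets [0, 2, 4, 8] and an initial value of 9, the first element of each segment is 9 and the full result is [9, 4, 9, 6, 9, 5, 1, 1].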
 
scan_by_key#
inclusive#
- 
template<typename Config = default_config, typename KeysInputIterator, typename ValuesInputIterator, typename ValuesOutputIterator, typename BinaryFunction = ::rocprim::plus<typename std::iterator_traits<ValuesInputIterator>::value_type>, typename KeyCompareFunction = ::rocprim::equal_to<typename std::iterator_traits<KeysInputIterator>::value_type>, typename AccType = typename std::iterator_traits<ValuesInputIterator>::value_type>
 inline hipError_t rocprim::inclusive_scan_by_key(void *const temporary_storage, size_t &storage_size, const KeysInputIterator keys_input, const ValuesInputIterator values_input, const ValuesOutputIterator values_output, const size_t size, const BinaryFunction scan_op = BinaryFunction(), const KeyCompareFunction key_compare_op = KeyCompareFunction(), const hipStream_t stream = 0, const bool debug_synchronous = false)#
- Parallel inclusive scan-by-key primitive for device level. The inclusive_scan_by_key function performs a device-wide inclusive prefix scan-by-key operation using the binary scan_op operator.
- Overview
- Supports non-commutative scan operators. However, a scan operator should be associative. When used with non-associative functions the results may be non-deterministic and/or vary in precision. 
- Returns the required size of temporary_storage in storage_size if temporary_storage is a null pointer.
- Ranges specified by keys_input, values_input, and values_output must have at least size elements.
 
- Example
- In this example a device-level inclusive sum-by-key operation is performed on an array of integer values (shorts are scanned into ints).

    #include <rocprim/rocprim.hpp>

    // Prepare input and output (declare pointers, allocate device memory etc.)
    size_t size;           // e.g., 8
    int * keys_input;      // e.g., [1, 1, 2, 2, 3, 3, 3, 5]
    short * values_input;  // e.g., [1, 2, 3, 4, 5, 6, 7, 8]
    int * values_output;   // empty array of 8 elements

    size_t temporary_storage_size_bytes;
    void * temporary_storage_ptr = nullptr;
    // Get required size of the temporary storage
    rocprim::inclusive_scan_by_key(
        temporary_storage_ptr, temporary_storage_size_bytes,
        keys_input, values_input, values_output,
        size, rocprim::plus<int>()
    );

    // allocate temporary storage
    hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes);

    // perform scan-by-key
    rocprim::inclusive_scan_by_key(
        temporary_storage_ptr, temporary_storage_size_bytes,
        keys_input, values_input, values_output,
        size, rocprim::plus<int>()
    );
    // values_output: [1, 3, 3, 7, 5, 11, 18, 8]
 - Template Parameters:
- Config – - [optional] configuration of the primitive, has to be scan_by_key_config or a class derived from it.
- KeysInputIterator – - random-access iterator type of the input range. It can be a simple pointer type. 
- ValuesInputIterator – - random-access iterator type of the input range. It can be a simple pointer type. 
- ValuesOutputIterator – - random-access iterator type of the output range. It can be a simple pointer type. 
- BinaryFunction – - type of binary function used for scan. Default type is rocprim::plus<T>, where T is the value_type of ValuesInputIterator.
- KeyCompareFunction – - type of binary function used to determine keys equality. Default type is rocprim::equal_to<T>, where T is the value_type of KeysInputIterator.
- AccType – - accumulator type used to propagate the scanned values. Default type is the value type of ValuesInputIterator. 
 
- Parameters:
- temporary_storage – [in] - pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to storage_size and the function returns without performing the scan operation.
- storage_size – [inout] - reference to a size (in bytes) of temporary_storage.
- keys_input – [in] - iterator to the first element in the range of keys. 
- values_input – [in] - iterator to the first element in the range of values to scan. 
- values_output – [out] - iterator to the first element in the output value range. 
- size – [in] - number of elements in the input range. 
- scan_op – [in] - binary operation function object that will be used for scanning input values. The signature of the function should be equivalent to the following: T f(const T &a, const T &b);. The signature does not need to have const &, but the function object must not modify the objects passed to it. Default is BinaryFunction().
- key_compare_op – [in] - binary operation function object that will be used to determine keys equality. The signature of the function should be equivalent to the following: bool f(const T &a, const T &b);. The signature does not need to have const &, but the function object must not modify the objects passed to it. Default is KeyCompareFunction().
- stream – [in] - [optional] HIP stream object. Default is 0 (default stream).
- debug_synchronous – [in] - [optional] If true, synchronization after every kernel launch is forced in order to check for errors. Default value is false.
 
- Returns:
- hipSuccess (0) after successful scan; otherwise a HIP runtime error of type hipError_t.
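The sketch below is a sequential host reference for the scan-by-key semantics. It assumes, as the documented example suggests, that a run is a sequence of consecutive elements whose keys compare equal under key_compare_op, and that the scan restarts whenever the key changes. reference_inclusive_scan_by_key is a hypothetical name, std::plus and std::equal_to stand in for the rocPRIM functors, and the running value is kept in the output type for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Restarts the running value whenever the current key differs from the previous one.
template <class Key, class In, class Out, class BinaryFunction, class KeyCompareFunction>
void reference_inclusive_scan_by_key(const std::vector<Key>& keys,
                                     const std::vector<In>&  values,
                                     std::vector<Out>&       output,
                                     BinaryFunction          scan_op,
                                     KeyCompareFunction      key_compare_op)
{
    assert(keys.size() == values.size() && output.size() >= values.size());
    Out acc{};
    for(std::size_t i = 0; i < values.size(); ++i)
    {
        const bool new_run = (i == 0) || !key_compare_op(keys[i - 1], keys[i]);
        const Out  value   = static_cast<Out>(values[i]);
        acc                = new_run ? value : static_cast<Out>(scan_op(acc, value));
        output[i]          = acc;
    }
}
```

With keys [1, 1, 2, 2, 3, 3, 3, 5] and values [1, 2, 3, 4, 5, 6, 7, 8], a sum yields [1, 3, 3, 7, 5, 11, 18, 8]. Under the stated assumption, equal keys that are not adjacent, such as [1, 2, 1], form separate runs.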
 
exclusive#
- 
template<typename Config = default_config, typename KeysInputIterator, typename ValuesInputIterator, typename ValuesOutputIterator, typename InitialValueType, typename BinaryFunction = ::rocprim::plus<typename std::iterator_traits<ValuesInputIterator>::value_type>, typename KeyCompareFunction = ::rocprim::equal_to<typename std::iterator_traits<KeysInputIterator>::value_type>, typename AccType = detail::input_type_t<InitialValueType>>
 inline hipError_t rocprim::exclusive_scan_by_key(void *const temporary_storage, size_t &storage_size, const KeysInputIterator keys_input, const ValuesInputIterator values_input, const ValuesOutputIterator values_output, const InitialValueType initial_value, const size_t size, const BinaryFunction scan_op = BinaryFunction(), const KeyCompareFunction key_compare_op = KeyCompareFunction(), const hipStream_t stream = 0, const bool debug_synchronous = false)#
- Parallel exclusive scan-by-key primitive for device level. The exclusive_scan_by_key function performs a device-wide exclusive prefix scan-by-key operation using the binary scan_op operator.
- Overview
- Supports non-commutative scan operators. However, a scan operator should be associative. When used with non-associative functions the results may be non-deterministic and/or vary in precision. 
- Returns the required size of temporary_storage in storage_size if temporary_storage is a null pointer.
- Ranges specified by keys_input, values_input, and values_output must have at least size elements.
 
- Example
- In this example a device-level exclusive sum-by-key operation is performed on an array of integer values (shorts are scanned into ints).

    #include <rocprim/rocprim.hpp>

    // Prepare input and output (declare pointers, allocate device memory etc.)
    size_t size;           // e.g., 8
    int * keys_input;      // e.g., [1, 1, 1, 2, 2, 3, 3, 4]
    short * values_input;  // e.g., [1, 2, 3, 4, 5, 6, 7, 8]
    int start_value;       // e.g., 9
    int * values_output;   // empty array of 8 elements

    size_t temporary_storage_size_bytes;
    void * temporary_storage_ptr = nullptr;
    // Get required size of the temporary storage
    rocprim::exclusive_scan_by_key(
        temporary_storage_ptr, temporary_storage_size_bytes,
        keys_input, values_input, values_output, start_value,
        size, rocprim::plus<int>()
    );

    // allocate temporary storage
    hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes);

    // perform scan-by-key
    rocprim::exclusive_scan_by_key(
        temporary_storage_ptr, temporary_storage_size_bytes,
        keys_input, values_input, values_output, start_value,
        size, rocprim::plus<int>()
    );
    // values_output: [9, 10, 12, 9, 13, 9, 15, 9]
 - Template Parameters:
- Config – - [optional] configuration of the primitive, has to be scan_by_key_config or a class derived from it.
- KeysInputIterator – - random-access iterator type of the input range. It can be a simple pointer type. 
- ValuesInputIterator – - random-access iterator type of the input range. It can be a simple pointer type. 
- ValuesOutputIterator – - random-access iterator type of the output range. It can be a simple pointer type. 
- InitialValueType – - type of the initial value. 
- BinaryFunction – - type of binary function used for scan. Default type is rocprim::plus<T>, where T is the value_type of ValuesInputIterator.
- KeyCompareFunction – - type of binary function used to determine keys equality. Default type is rocprim::equal_to<T>, where T is the value_type of KeysInputIterator.
- AccType – - accumulator type used to propagate the scanned values. Default type is InitialValueType, unless that is a rocprim::future_value, in which case it is the wrapped input type. 
 
- Parameters:
- temporary_storage – [in] - pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to storage_size and the function returns without performing the scan operation.
- storage_size – [inout] - reference to a size (in bytes) of temporary_storage.
- keys_input – [in] - iterator to the first element in the range of keys. 
- values_input – [in] - iterator to the first element in the range of values to scan. 
- values_output – [out] - iterator to the first element in the output value range. 
- initial_value – [in] - initial value to start the scan. A rocprim::future_value may be passed to use a value that will be computed later. 
- size – [in] - number of elements in the input range. 
- scan_op – [in] - binary operation function object that will be used for scanning input values. The signature of the function should be equivalent to the following: T f(const T &a, const T &b);. The signature does not need to have const &, but the function object must not modify the objects passed to it. Default is BinaryFunction().
- key_compare_op – [in] - binary operation function object that will be used to determine keys equality. The signature of the function should be equivalent to the following: bool f(const T &a, const T &b);. The signature does not need to have const &, but the function object must not modify the objects passed to it. Default is KeyCompareFunction().
- stream – [in] - [optional] HIP stream object. Default is 0 (default stream).
- debug_synchronous – [in] - [optional] If true, synchronization after every kernel launch is forced in order to check for errors. Default value is false.
 
- Returns:
- hipSuccess (0) after successful scan; otherwise a HIP runtime error of type hipError_t.
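For the exclusive variant, a sequential host sketch can serve as a reference for expected values. It assumes that a run consists of consecutive elements with equal keys and that each run restarts from initial_value, which is consistent with the documented example. reference_exclusive_scan_by_key is a hypothetical name, std::plus and std::equal_to stand in for the rocPRIM functors, and the running value is kept in the initial value's type for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Each run of equal consecutive keys starts again from initial_value; the
// running value is written before the current input is applied.
template <class Key, class In, class Out, class Init, class BinaryFunction, class KeyCompareFunction>
void reference_exclusive_scan_by_key(const std::vector<Key>& keys,
                                     const std::vector<In>&  values,
                                     std::vector<Out>&       output,
                                     Init                    initial_value,
                                     BinaryFunction          scan_op,
                                     KeyCompareFunction      key_compare_op)
{
    assert(keys.size() == values.size() && output.size() >= values.size());
    Init acc = initial_value;
    for(std::size_t i = 0; i < values.size(); ++i)
    {
        if(i == 0 || !key_compare_op(keys[i - 1], keys[i]))
        {
            acc = initial_value; // key changed: start a new run
        }
        output[i] = static_cast<Out>(acc);
        acc       = static_cast<Init>(scan_op(acc, static_cast<Init>(values[i])));
    }
}
```

With keys [1, 1, 1, 2, 2, 3, 3, 4], values [1, 2, 3, 4, 5, 6, 7, 8] and an initial value of 9, a sum yields [9, 10, 12, 9, 13, 9, 15, 9], matching the documented example.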