Transform
Configuring the kernel
template<unsigned int BlockSize, unsigned int ItemsPerThread, unsigned int SizeLimit = std::numeric_limits<unsigned int>::max()>
struct transform_config : public rocprim::detail::transform_config_params
Configuration for the device-level transform operation.
- Template Parameters:
BlockSize – Number of threads in a block.
ItemsPerThread – Number of items processed by each thread.
SizeLimit – Limit on the number of items for a single kernel launch.
Subclassed by rocprim::detail::default_transform_config< arch, value_type >
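To make the interaction of the three template parameters concrete, the following host-only sketch estimates how an input might be split into blocks and kernel launches. It rests on two assumptions that the text above does not state: that each block processes BlockSize * ItemsPerThread items, and that SizeLimit is rounded down to a whole number of blocks (but never below one block). The names launch_plan and plan_launches are hypothetical and are not part of rocPRIM; the library's actual launch logic may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helper type, not part of rocPRIM.
struct launch_plan
{
    std::size_t items_per_block;  // BlockSize * ItemsPerThread
    std::size_t items_per_launch; // SizeLimit rounded down to whole blocks
    std::size_t launches;         // kernel launches needed for `size` items
    std::size_t total_blocks;     // blocks summed over all launches
};

// Hypothetical planning function mirroring the transform_config parameters.
template<unsigned int BlockSize,
         unsigned int ItemsPerThread,
         unsigned int SizeLimit = std::numeric_limits<unsigned int>::max()>
launch_plan plan_launches(const std::size_t size)
{
    static_assert(BlockSize > 0 && ItemsPerThread > 0, "a block must process at least one item");

    const std::size_t items_per_block = std::size_t{BlockSize} * ItemsPerThread;

    // Assumption: one launch covers a whole number of blocks, at least one.
    const std::size_t blocks_per_launch
        = std::max<std::size_t>(SizeLimit / items_per_block, 1);
    const std::size_t items_per_launch = blocks_per_launch * items_per_block;

    // Ceiling divisions: a partially filled launch or block still counts.
    const std::size_t launches     = (size + items_per_launch - 1) / items_per_launch;
    const std::size_t total_blocks = (size + items_per_block - 1) / items_per_block;

    // The planned launches must cover the whole input.
    assert(launches * items_per_launch >= size);

    return {items_per_block, items_per_launch, launches, total_blocks};
}
```

Under these assumptions, plan_launches<256, 4, 4096>(10000) yields 1024 items per block, 4096 items per launch, 3 launches and 10 blocks in total.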
transform
template<class Config = default_config, class InputIterator, class OutputIterator, class UnaryFunction>
inline hipError_t rocprim::transform(InputIterator input, OutputIterator output, const size_t size, UnaryFunction transform_op, const hipStream_t stream = 0, bool debug_synchronous = false)
Parallel transform primitive for device level.
The transform function performs a device-wide transformation operation using the unary transform_op operator.
- Overview
Ranges specified by input and output must have at least size elements.
- Example
In this example a device-level transform operation is performed on an array of integer values (shorts are transformed into ints).
    #include <rocprim/rocprim.hpp>

    // custom transform function
    auto transform_op =
        [] __device__ (int a) -> int
        {
            return a + 5;
        };

    // Prepare input and output (declare pointers, allocate device memory etc.)
    size_t input_size; // e.g., 8
    short * input;     // e.g., [1, 2, 3, 4, 5, 6, 7, 8]
    int * output;      // empty array of 8 elements

    // perform transform
    rocprim::transform(
        input, output, input_size, transform_op
    );
    // output: [6, 7, 8, 9, 10, 11, 12, 13]
- Template Parameters:
Config – [optional] Configuration of the primitive. It has to be transform_config or a class derived from it.
InputIterator – Random-access iterator type of the input range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type.
OutputIterator – Random-access iterator type of the output range. Must meet the requirements of a C++ OutputIterator concept. It can be a simple pointer type.
UnaryFunction – Type of unary function used for transform.
- Parameters:
input – [in] Iterator to the first element in the range to transform.
output – [out] Iterator to the first element in the output range.
size – [in] Number of elements in the input range.
transform_op – [in] Unary operation function object that will be used for the transform. The signature of the function should be equivalent to the following: U f(const T &a); The signature does not need to have const &, but the function object must not modify the object passed to it.
stream – [in] [optional] HIP stream object. The default is 0 (default stream).
debug_synchronous – [in] [optional] If true, synchronization after every kernel launch is forced in order to check for errors. The default value is false.
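The element-wise contract described above (output[i] = transform_op(input[i]) for the first size elements) can be sketched on the host with plain C++. This is only a sequential reference for checking results, not rocPRIM's implementation; transform_reference, add_five_on_host and check_unary_example are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference (not a rocPRIM function): applies
// transform_op to each of the first `size` input elements, in order.
template<class InputIterator, class OutputIterator, class UnaryFunction>
void transform_reference(InputIterator     input,
                         OutputIterator    output,
                         const std::size_t size,
                         UnaryFunction     transform_op)
{
    for(std::size_t i = 0; i < size; ++i)
    {
        output[i] = transform_op(input[i]);
    }
}

// Reproduces the documented example on the host: shorts in, ints out.
inline std::vector<int> add_five_on_host(const std::vector<short>& input)
{
    std::vector<int> output(input.size());
    transform_reference(input.begin(),
                        output.begin(),
                        input.size(),
                        [](int a) -> int { return a + 5; });
    return output;
}

// Self-check using the values from the example above.
inline void check_unary_example()
{
    const std::vector<short> input{1, 2, 3, 4, 5, 6, 7, 8};
    const std::vector<int>   expected{6, 7, 8, 9, 10, 11, 12, 13};
    assert(add_five_on_host(input) == expected);
}
```

A reference like this can be compared against the device output after copying it back to the host, which is a convenient way to validate a custom transform_op.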
template<class Config = default_config, class InputIterator1, class InputIterator2, class OutputIterator, class BinaryFunction>
inline hipError_t rocprim::transform(InputIterator1 input1, InputIterator2 input2, OutputIterator output, const size_t size, BinaryFunction transform_op, const hipStream_t stream = 0, bool debug_synchronous = false)
Parallel device-level transform primitive for two inputs.
The transform function performs a device-wide transformation operation on two input ranges using the binary transform_op operator.
- Overview
Ranges specified by input1, input2, and output must have at least size elements.
- Example
In this example a device-level transform operation is performed on two arrays of integer values (element-wise sum is performed).
    #include <rocprim/rocprim.hpp>

    // custom transform function
    auto transform_op =
        [] __device__ (int a, int b) -> int
        {
            return a + b;
        };

    // Prepare input and output (declare pointers, allocate device memory etc.)
    size_t size;  // e.g., 8
    int* input1;  // e.g., [1, 2, 3, 4, 5, 6, 7, 8]
    int* input2;  // e.g., [1, 2, 3, 4, 5, 6, 7, 8]
    int* output;  // empty array of 8 elements

    // perform transform (pass the element count; raw pointers have no size() member)
    rocprim::transform(
        input1, input2, output, size, transform_op
    );
    // output: [2, 4, 6, 8, 10, 12, 14, 16]
- Template Parameters:
Config – [optional] Configuration of the primitive. It has to be transform_config or a class derived from it.
InputIterator1 – Random-access iterator type of the input range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type.
InputIterator2 – Random-access iterator type of the input range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type.
OutputIterator – Random-access iterator type of the output range. Must meet the requirements of a C++ OutputIterator concept. It can be a simple pointer type.
BinaryFunction – Type of binary function used for transform.
- Parameters:
input1 – [in] Iterator to the first element in the 1st range to transform.
input2 – [in] Iterator to the first element in the 2nd range to transform.
output – [out] Iterator to the first element in the output range.
size – [in] Number of elements in each input range.
transform_op – [in] Binary operation function object that will be used for the transform. The signature of the function should be equivalent to the following: U f(const T1& a, const T2& b); The signature does not need to have const &, but the function object must not modify the objects passed to it.
stream – [in] [optional] HIP stream object. The default is 0 (default stream).
debug_synchronous – [in] [optional] If true, synchronization after every kernel launch is forced. The default value is false.
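The two-input overload follows the same element-wise contract: output[i] = transform_op(input1[i], input2[i]) for the first size elements. The host-only sketch below is a sequential reference for checking results, not rocPRIM's implementation; transform_reference and sum_on_host are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference (not a rocPRIM function): combines the
// first `size` elements of two input ranges pairwise with transform_op.
template<class InputIterator1, class InputIterator2, class OutputIterator, class BinaryFunction>
void transform_reference(InputIterator1    input1,
                         InputIterator2    input2,
                         OutputIterator    output,
                         const std::size_t size,
                         BinaryFunction    transform_op)
{
    for(std::size_t i = 0; i < size; ++i)
    {
        output[i] = transform_op(input1[i], input2[i]);
    }
}

// Reproduces the documented element-wise sum on the host.
inline std::vector<int> sum_on_host(const std::vector<int>& a, const std::vector<int>& b)
{
    // Both input ranges must provide at least `size` elements.
    assert(a.size() == b.size());

    std::vector<int> output(a.size());
    transform_reference(a.begin(),
                        b.begin(),
                        output.begin(),
                        a.size(),
                        [](int x, int y) -> int { return x + y; });
    return output;
}
```

With both inputs set to [1, 2, 3, 4, 5, 6, 7, 8], sum_on_host returns [2, 4, 6, 8, 10, 12, 14, 16], matching the expected device output in the example.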