Transform

Configuring the kernel

template<unsigned int BlockSize, unsigned int ItemsPerThread, unsigned int SizeLimit = std::numeric_limits<unsigned int>::max()>
struct transform_config : public rocprim::detail::transform_config_params

Configuration for the device-level transform operation.

Template Parameters:
  • BlockSize – Number of threads in a block.

  • ItemsPerThread – Number of items processed by each thread.

  • SizeLimit – Limit on the number of items for a single kernel launch.

Subclassed by rocprim::detail::default_transform_config< arch, data_type, enable >
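
To make the role of the three parameters concrete, the sketch below works through the arithmetic they imply. It is an illustration under stated assumptions, not rocPRIM's implementation: launch_plan and plan_launches are hypothetical names, and the sketch assumes that each block handles BlockSize * ItemsPerThread items and that a problem larger than SizeLimit is split across several launches, each covering a whole number of blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helper, not part of rocPRIM: describes how a problem of
// `size` items could be split given the three configuration values.
struct launch_plan
{
    std::size_t items_per_block;  // BlockSize * ItemsPerThread
    std::size_t items_per_launch; // whole number of blocks that fits in SizeLimit
    std::size_t launches;         // kernel launches needed to cover `size`
    std::size_t total_blocks;     // blocks summed over all launches
};

template<unsigned int BlockSize,
         unsigned int ItemsPerThread,
         unsigned int SizeLimit = std::numeric_limits<unsigned int>::max()>
launch_plan plan_launches(std::size_t size)
{
    static_assert(BlockSize > 0 && ItemsPerThread > 0, "a block must process at least one item");

    const std::size_t items_per_block = std::size_t{BlockSize} * ItemsPerThread;

    // Assumption: a launch always contains at least one block,
    // even when SizeLimit is smaller than one block's worth of items.
    std::size_t blocks_per_launch = SizeLimit / items_per_block;
    if(blocks_per_launch == 0)
    {
        blocks_per_launch = 1;
    }
    const std::size_t items_per_launch = blocks_per_launch * items_per_block;
    assert(items_per_launch >= items_per_block);

    // Ceiling divisions: a partially filled block or launch still counts.
    const std::size_t launches     = (size + items_per_launch - 1) / items_per_launch;
    const std::size_t total_blocks = (size + items_per_block - 1) / items_per_block;

    return {items_per_block, items_per_launch, launches, total_blocks};
}
```

Under these assumptions, plan_launches<256, 4, 4096>(10000) describes 1024 items per block, 10 blocks in total, and 3 launches. In rocPRIM itself, a custom configuration is selected by passing it as the Config template argument, for example rocprim::transform<rocprim::transform_config<256, 4>>(input, output, size, transform_op).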

transform

template<class Config = default_config, class InputIterator, class OutputIterator, class UnaryFunction>
inline hipError_t rocprim::transform(InputIterator input, OutputIterator output, const size_t size, UnaryFunction transform_op, const hipStream_t stream = 0, bool debug_synchronous = false)

Parallel device-level transform primitive.

The transform function performs a device-wide transformation, applying the unary transform_op operator to each element of the input range and writing the result to the output range.

Overview

  • Ranges specified by input and output must have at least size elements.

Example

In this example a device-level transform operation is performed on an array of integer values (shorts are transformed into ints).

#include <rocprim/rocprim.hpp>

// custom transform function
auto transform_op =
    [] __device__ (int a) -> int
    {
        return a + 5;
    };

// Prepare input and output (declare pointers, allocate device memory etc.)
size_t input_size;    // e.g., 8
short * input;        // e.g., [1, 2, 3, 4, 5, 6, 7, 8]
int * output;         // empty array of 8 elements

// perform transform
rocprim::transform(
    input, output, input_size, transform_op
);
// output: [6, 7, 8, 9, 10, 11, 12, 13]

Template Parameters:
  • Config – [optional] configuration of the primitive. It has to be transform_config or a class derived from it.

  • InputIterator – random-access iterator type of the input range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type.

  • OutputIterator – random-access iterator type of the output range. Must meet the requirements of a C++ OutputIterator concept. It can be a simple pointer type.

  • UnaryFunction – type of unary function used for transform.

Parameters:
  • input[in] - iterator to the first element in the range to transform.

  • output[out] - iterator to the first element in the output range.

  • size[in] - number of elements in the input range.

  • transform_op[in] - unary operation function object that will be used for transform. The signature of the function should be equivalent to the following: U f(const T &a). The signature does not need to have const &, but the function object must not modify the object passed to it.

  • stream[in] - [optional] HIP stream object. The default is 0 (default stream).

  • debug_synchronous[in] - [optional] If true, synchronization after every kernel launch is forced in order to check for errors. The default value is false.
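
When testing, it can help to compare device results against a host-side reference. The following is a minimal sketch of the same element-wise semantics in plain C++ (no HIP required), assuming host-resident std::vector storage. transform_reference and add_five are hypothetical names, not part of rocPRIM. The function object follows the documented shape U f(const T &a) with T = short and U = int.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Function object with the documented shape U f(const T& a),
// here with T = short and U = int. It does not modify its argument.
struct add_five
{
    int operator()(const short& a) const
    {
        return a + 5;
    }
};

// Hypothetical host-side reference (not rocPRIM code): applies transform_op
// to the first `size` elements of `input` and stores the results in `output`.
template<class T, class U, class UnaryFunction>
void transform_reference(const std::vector<T>& input,
                         std::vector<U>&       output,
                         std::size_t           size,
                         UnaryFunction         transform_op)
{
    // Both ranges must hold at least `size` elements.
    assert(input.size() >= size && output.size() >= size);

    const auto count = static_cast<std::ptrdiff_t>(size);
    std::transform(input.begin(), input.begin() + count, output.begin(), transform_op);
}
```

With an input of [1, 2, 3, 4, 5, 6, 7, 8] and add_five as the operator, the reference fills the output with [6, 7, 8, 9, 10, 11, 12, 13], matching the device example above. Elements past size are left untouched.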

template<class Config = default_config, class InputIterator1, class InputIterator2, class OutputIterator, class BinaryFunction>
inline hipError_t rocprim::transform(InputIterator1 input1, InputIterator2 input2, OutputIterator output, const size_t size, BinaryFunction transform_op, const hipStream_t stream = 0, bool debug_synchronous = false)

Parallel device-level transform primitive for two inputs.

The transform function performs a device-wide transformation on two input ranges, applying the binary transform_op operator to each pair of corresponding elements.

Overview

  • Ranges specified by input1, input2, and output must have at least size elements.

Example

In this example a device-level transform operation is performed on two arrays of integer values (element-wise sum is performed).

#include <rocprim/rocprim.hpp>

// custom transform function
auto transform_op =
    [] __device__ (int a, int b) -> int
    {
        return a + b;
    };

// Prepare input and output (declare pointers, allocate device memory etc.)
size_t size;   // e.g., 8
int* input1;   // e.g., [1, 2, 3, 4, 5, 6, 7, 8]
int* input2;   // e.g., [1, 2, 3, 4, 5, 6, 7, 8]
int* output;   // empty array of 8 elements

// perform transform
rocprim::transform(
    input1, input2, output, size, transform_op
);
// output: [2, 4, 6, 8, 10, 12, 14, 16]

Template Parameters:
  • Config – [optional] configuration of the primitive. It must be default_config or transform_config.

  • InputIterator1 – random-access iterator type of the first input range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type.

  • InputIterator2 – random-access iterator type of the second input range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type.

  • OutputIterator – random-access iterator type of the output range. Must meet the requirements of a C++ OutputIterator concept. It can be a simple pointer type.

  • BinaryFunction – type of binary function used for transform.

Parameters:
  • input1[in] - iterator to the first element in the 1st range to transform.

  • input2[in] - iterator to the first element in the 2nd range to transform.

  • output[out] - iterator to the first element in the output range.

  • size[in] - number of elements in each input range.

  • transform_op[in] - binary operation function object that will be used for transform. The signature of the function should be equivalent to the following: U f(const T1& a, const T2& b). The signature does not need to have const &, but the function object must not modify the objects passed to it.

  • stream[in] - [optional] HIP stream object. The default is 0 (default stream).

  • debug_synchronous[in] - [optional] If true, synchronization after every kernel launch is forced. Default value is false.
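
As with the unary overload, a host-side reference can be useful for checking results. The sketch below mirrors the two-input semantics in plain C++ (no HIP required), assuming host-resident std::vector storage. binary_transform_reference and sum_op are hypothetical names, not part of rocPRIM. The function object follows the documented shape U f(const T1& a, const T2& b).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Function object with the documented shape U f(const T1& a, const T2& b),
// here with T1 = T2 = U = int. It does not modify its arguments.
struct sum_op
{
    int operator()(const int& a, const int& b) const
    {
        return a + b;
    }
};

// Hypothetical host-side reference (not rocPRIM code): combines the first
// `size` elements of `input1` and `input2` pairwise and stores the results
// in `output`.
template<class T1, class T2, class U, class BinaryFunction>
void binary_transform_reference(const std::vector<T1>& input1,
                                const std::vector<T2>& input2,
                                std::vector<U>&        output,
                                std::size_t            size,
                                BinaryFunction         transform_op)
{
    // All three ranges must hold at least `size` elements.
    assert(input1.size() >= size && input2.size() >= size && output.size() >= size);

    const auto count = static_cast<std::ptrdiff_t>(size);
    std::transform(input1.begin(),
                   input1.begin() + count,
                   input2.begin(),
                   output.begin(),
                   transform_op);
}
```

Calling it with two copies of [1, 2, 3, 4, 5, 6, 7, 8] and sum_op yields [2, 4, 6, 8, 10, 12, 14, 16], the same result as the device example above.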