Summary of the Operations
Basics
transform
applies a function to each element of the sequence, equivalent to the functional operation map
select
takes the first N elements of the sequence satisfying a condition (via a selection mask or a predicate function)
unique
returns the unique elements within a sequence
histogram
generates a summary of the statistical distribution of the sequence
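The four basic operations above can be illustrated with a small host-side sketch. It uses the C++ standard library as a stand-in and is not this library's device API; the function names (`square_all`, `select_even`, `unique_runs`, `histogram_even`) are hypothetical. Two assumptions are made: the select sketch keeps every element that satisfies the predicate, and the unique sketch only collapses equal neighbours (the behaviour of `std::unique`).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// transform: apply a function to every element (the functional "map").
std::vector<int> square_all(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(),
                   [](int x) { return x * x; });
    return out;
}

// select: keep the elements that satisfy a predicate, in their original order.
std::vector<int> select_even(const std::vector<int>& in) {
    std::vector<int> out;
    std::copy_if(in.begin(), in.end(), std::back_inserter(out),
                 [](int x) { return x % 2 == 0; });
    return out;
}

// unique: collapse each run of equal consecutive elements into one element.
// (Assumption: only adjacent duplicates are removed.)
std::vector<int> unique_runs(std::vector<int> v) {
    v.erase(std::unique(v.begin(), v.end()), v.end());
    return v;
}

// histogram: count samples in `bins` equal-width bins covering [lo, hi).
// Samples outside the range are ignored.
std::vector<std::size_t> histogram_even(const std::vector<float>& samples,
                                        std::size_t bins, float lo, float hi) {
    assert(bins > 0 && hi > lo);
    std::vector<std::size_t> counts(bins, 0);
    for (float s : samples) {
        if (s < lo || s >= hi) continue;
        auto b = static_cast<std::size_t>((s - lo) / (hi - lo) * bins);
        if (b >= bins) b = bins - 1;  // guard against floating-point rounding
        ++counts[b];
    }
    return counts;
}
```

For example, `unique_runs({1, 1, 2, 2, 2, 3, 1})` yields `{1, 2, 3, 1}`: the trailing `1` survives because it is not adjacent to the first run of ones.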
Aggregation
reduce
traverses the sequence while accumulating some data, equivalent to the functional operation fold_left
scan
is the cumulative version of reduce, which returns the sequence of the intermediate values taken by the accumulator
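The relationship between the two aggregation operations can be sketched on the host with the C++ standard library (a stand-in, not this library's device API). The names `sum_reduce` and `inclusive_sum_scan` are hypothetical, and addition is chosen as the accumulation operator purely for illustration.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// reduce: fold the sequence from the left into a single accumulator value.
int sum_reduce(const std::vector<int>& in) {
    return std::accumulate(in.begin(), in.end(), 0);
}

// scan: the same traversal, but every intermediate accumulator value is kept.
std::vector<int> inclusive_sum_scan(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    std::partial_sum(in.begin(), in.end(), out.begin());
    // The last value of an inclusive scan equals the full reduction.
    assert(out.empty() || out.back() == sum_reduce(in));
    return out;
}
```

With the input `{3, 1, 4, 1, 5}`, the reduction is `14` and the inclusive scan is `{3, 4, 8, 9, 14}`.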
Differentiation
adjacent_difference
computes the difference between the current element and the previous or next one in the sequence
discontinuity
detects value change between the current element and the previous or next one in the sequence
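A minimal host-side sketch of both differentiation operations, using the "previous element" variant. It relies on the C++ standard library rather than this library's device API; the names `diff_with_previous` and `flag_heads` are hypothetical, and treating the first element as a change is an assumption of this sketch.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// adjacent_difference: out[i] = in[i] - in[i - 1].
// std::adjacent_difference copies the first element unchanged.
std::vector<int> diff_with_previous(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    std::adjacent_difference(in.begin(), in.end(), out.begin());
    return out;
}

// discontinuity: flag every position whose value differs from the previous one.
// (Assumption: the first element is always flagged.)
std::vector<bool> flag_heads(const std::vector<int>& in) {
    std::vector<bool> flags(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        flags[i] = (i == 0) || (in[i] != in[i - 1]);
    }
    return flags;
}
```

The flags produced by the second function mark where each run of equal values starts, which is the information that operations such as run-length encoding build on.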
Rearrangement
sort
rearranges the sequence by sorting it, either according to a comparison operator or by value using a radix approach
partial_sort
rearranges the sequence by sorting it up to and including a given index, according to a comparison operator
nth_element
places the nth element in its sorted position, with lesser elements before it and greater elements after it, according to a comparison operator
exchange
rearranges the elements according to a different stride configuration, which is equivalent to a tensor axis transposition
shuffle
rotates the elements
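The difference between the three sorting operations above is easiest to see side by side. The following host-side sketch uses the C++ standard library as a stand-in (comparison-based only; the radix approach is not shown), and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// sort: fully order the sequence with a comparison operator (descending here).
std::vector<int> sorted_descending(std::vector<int> v) {
    std::sort(v.begin(), v.end(), std::greater<int>());
    return v;
}

// partial_sort: positions 0..index (inclusive) hold the smallest elements in
// sorted order; the remaining elements are left in unspecified order.
std::vector<int> sorted_up_to(std::vector<int> v, std::size_t index) {
    assert(index < v.size());
    auto middle = v.begin() + static_cast<std::ptrdiff_t>(index) + 1;
    std::partial_sort(v.begin(), middle, v.end());
    return v;
}

// nth_element: only position n is guaranteed to hold its sorted value;
// elements before it are not greater, elements after it are not smaller.
std::vector<int> place_nth(std::vector<int> v, std::size_t n) {
    assert(n < v.size());
    auto nth = v.begin() + static_cast<std::ptrdiff_t>(n);
    std::nth_element(v.begin(), nth, v.end());
    return v;
}
```

For the input `{5, 2, 9, 1, 7}`, `sorted_up_to(input, 1)` starts with `1, 2`, and `place_nth(input, 2)` has `5` at index 2; the order of the other elements is not specified in either case.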
Partition/Merge
partition
divides the sequence into two or more sequences according to a predicate while preserving some ordering properties
merge
merges two ordered sequences into one while preserving the order
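As a rough illustration of these two operations, the sketch below uses the C++ standard library on the host (not this library's device API). The names `negatives_first` and `merge_sorted` are hypothetical, the two-way split is just one case of partitioning, and the choice of a stable partition is an assumption about which ordering property is preserved.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// partition: elements satisfying the predicate come first. The stable variant
// keeps the original relative order inside each of the two groups.
std::vector<int> negatives_first(std::vector<int> v) {
    std::stable_partition(v.begin(), v.end(), [](int x) { return x < 0; });
    return v;
}

// merge: combine two sorted sequences into one sorted sequence.
std::vector<int> merge_sorted(const std::vector<int>& a,
                              const std::vector<int>& b) {
    // Both inputs must already be ordered.
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));
    std::vector<int> out;
    out.reserve(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(out));
    return out;
}
```

For example, `negatives_first({3, -1, 4, -5, 2})` gives `{-1, -5, 3, 4, 2}`, and `merge_sorted({1, 4, 6}, {2, 3, 7})` gives `{1, 2, 3, 4, 6, 7}`.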
Data Movement
store
stores the sequence to a contiguous memory zone. There are variations that use an optimized path or that specify how to store the sequence to better fit the access patterns of the compute units (CUs)
load
performs the complementary operations to the store variations above, reading the sequence from memory
memcpy
copies bytes between device sources and destinations
Other operations
run_length_encode
generates a compact representation of a sequence
binary_search
finds, for each element, the index of an element with the same value in another sequence (which has to be sorted)
config
selects a kernel’s grid/block dimensions to tune the operation to a GPU
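The first two operations in this group can be sketched on the host as follows. This uses the C++ standard library, not this library's device API; the names `run_length_encode` and `find_indices` are reused or invented for illustration only, the (value, count) pair layout is an assumption, and returning the haystack size for a missing value is an assumed convention.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// run_length_encode: replace each run of equal values by a (value, length) pair.
std::vector<std::pair<int, std::size_t>> run_length_encode(
        const std::vector<int>& in) {
    std::vector<std::pair<int, std::size_t>> runs;
    for (int x : in) {
        if (!runs.empty() && runs.back().first == x) {
            ++runs.back().second;                   // extend the current run
        } else {
            runs.emplace_back(x, std::size_t{1});   // start a new run
        }
    }
    return runs;
}

// binary_search: for each needle, the index of an equal element in the sorted
// haystack, or haystack.size() if there is none (assumed convention).
std::vector<std::size_t> find_indices(const std::vector<int>& haystack,
                                      const std::vector<int>& needles) {
    assert(std::is_sorted(haystack.begin(), haystack.end()));
    std::vector<std::size_t> out;
    out.reserve(needles.size());
    for (int n : needles) {
        auto it = std::lower_bound(haystack.begin(), haystack.end(), n);
        bool found = (it != haystack.end() && *it == n);
        out.push_back(found ? static_cast<std::size_t>(it - haystack.begin())
                            : haystack.size());
    }
    return out;
}
```

For example, `{7, 7, 7, 2, 2, 9}` encodes to the pairs `(7, 3), (2, 2), (9, 1)`, and searching `{5, 4, 1}` in the sorted sequence `{1, 3, 5, 7}` yields the indices `{2, 4, 0}`, where `4` signals that the value `4` was not found.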