hipCUB: Class Hierarchy

This inheritance list is sorted roughly, but not completely, alphabetically:
hipcub::BaseDigitExtractor< KeyT >: Base struct for digit extractors. Contains common code that provides special handling for floating-point -0.0
hipcub::BinaryFlip< BinaryOpT >
hipcub::BlockMergeSortStrategy< KeyT, ValueT, NUM_THREADS, ITEMS_PER_THREAD, SynchronizationPolicy >: Generalized merge sort algorithm
hipcub::BlockMergeSortStrategy< KeyT, NullType, ::rocprim::device_warp_size(), ITEMS_PER_THREAD, WarpMergeSort< KeyT, ITEMS_PER_THREAD, ::rocprim::device_warp_size(), NullType, 1 > >
hipcub::BlockMergeSortStrategy< KeyT, NullType, BLOCK_DIM_X *1 *1, ITEMS_PER_THREAD, BlockMergeSort< KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, NullType, 1, 1 > >
hipcub::BlockRakingLayout< T, BLOCK_THREADS, ARCH >: BlockRakingLayout provides a conflict-free shared memory layout abstraction for 1D raking across thread block data
hipcub::BlockRunLengthDecode< ItemT, BLOCK_DIM_X, RUNS_PER_THREAD, DECODED_ITEMS_PER_THREAD, DecodedOffsetT, BLOCK_DIM_Y, BLOCK_DIM_Z >: The BlockRunLengthDecode class decodes a run-length encoded array of items. Given the two arrays run_value[N] and run_lengths[N], run_value[i] is repeated run_lengths[i] times in the output array. Because the size of the decoded ("decompressed") array is runtime-dependent and potentially unbounded, BlockRunLengthDecode retrieves a "window" of the decoded array instead of the whole array: the caller specifies the window's offset, and BLOCK_THREADS * DECODED_ITEMS_PER_THREAD decoded items (the window_size) are returned from that window
hipcub::CacheModifiedInputIterator< MODIFIER, ValueType, OffsetT >
hipcub::CacheModifiedOutputIterator< MODIFIER, ValueType, OffsetT >
hipcub::CastOp< B >
hipcub::DiscardOutputIterator< OffsetT >: A discard iterator
hipcub::DoubleBuffer< T >
hipcub::GridBarrier: GridBarrier implements a software global barrier among thread blocks within a HIP grid
hipcub::GridEvenShare< OffsetT >: GridEvenShare is a descriptor utility for distributing input among thread blocks in an "even-share" fashion. Each thread block gets roughly the same number of input tiles
hipcub::GridQueue< OffsetT >: GridQueue is a descriptor utility for dynamic queue management
hipcub::If< B, T, F >
hipcub::InequalityWrapper< EqualityOp >
hipcub::Int2Type< A >
hipcub::IsPointer< T >
hipcub::IsVolatile< T >
hipcub::Log2< N >
hipcub::PowerOfTwo< N >
hipcub::RadixSortTwiddle< IS_DESCENDING, KeyT >: Twiddling keys for radix sort
hipcub::ReduceByKeyOp< ReductionOpT >
hipcub::ReduceBySegmentOp< ReductionOpT >
hipcub::RemoveQualifiers< T >
hipcub::DeviceSpmv::SpmvParams< ValueT, OffsetT >: OffsetT is a signed integer type for sequence offsets
hipcub::SwizzleScanOp< ScanOp >
hipcub::Uninitialized< T >: A storage-backing wrapper that allows types with non-trivial constructors to be aliased in unions
hipcub::Uninitialized< _TempStorage >
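
The "window" idea in the BlockRunLengthDecode entry above can be illustrated with a small host-side sketch. This is plain sequential C++, not the hipCUB API: the function name `decode_window` and its parameters are hypothetical and chosen only for illustration, and the real class performs the decode cooperatively across the threads of a block. The sketch only shows what "return window_size decoded items starting at a given offset" means for the two input arrays.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side helper (NOT part of hipCUB): returns at most
// window_size decoded items, starting at index window_offset of the
// conceptual run-length decoded sequence.
template <typename ItemT, typename LengthT>
std::vector<ItemT> decode_window(const std::vector<ItemT>&   run_values,
                                 const std::vector<LengthT>& run_lengths,
                                 std::size_t                 window_offset,
                                 std::size_t                 window_size)
{
    std::vector<ItemT> window;
    window.reserve(window_size);

    std::size_t run_begin = 0; // decoded index at which the current run starts
    for (std::size_t i = 0; i < run_values.size() && window.size() < window_size; ++i)
    {
        const std::size_t run_end = run_begin + static_cast<std::size_t>(run_lengths[i]);

        // First decoded index of this run that lies inside the window.
        std::size_t pos = run_begin > window_offset ? run_begin : window_offset;

        // Emit run_values[i] once per decoded position covered by the window.
        for (; pos < run_end && window.size() < window_size; ++pos)
        {
            window.push_back(run_values[i]);
        }
        run_begin = run_end;
    }
    return window; // shorter than window_size if the decoded sequence ends early
}
```

For example, with run values {10, 20, 30} and run lengths {3, 1, 4}, the full decoded sequence is 10 10 10 20 30 30 30 30. A window at offset 2 with window_size 4 yields 10 20 30 30, and a window at offset 6 yields only the two remaining items. Because the caller fixes the window size, the output buffer has a known bound even though the total decoded length does not.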