thread_block_tile< size, ParentCGTy > Class Template Reference

HIP Runtime API Reference: cooperative_groups::thread_block_tile< size, ParentCGTy > Class Template Reference

Group type - thread_block_tile.

#include <amd_hip_cooperative_groups.h>


Public Member Functions

 operator thread_block_tile< size, void > () const
 
unsigned int thread_rank () const
 Rank of the calling thread within [0, size() ).
 
void sync ()
 Synchronizes the threads in the group.
 
unsigned int meta_group_rank () const
 
unsigned int meta_group_size () const
 Returns the number of groups created when the parent group was partitioned.
 
template<class T >
T shfl (T var, int srcRank) const
 Shuffle operation on group level.
 
template<class T >
T shfl_down (T var, unsigned int lane_delta) const
 Shuffle down operation on group level.
 
template<class T >
T shfl_up (T var, unsigned int lane_delta) const
 Shuffle up operation on group level.
 
template<class T >
T shfl_xor (T var, unsigned int laneMask) const
 Shuffle xor operation on group level.
 
unsigned long long ballot (int pred) const
 Ballot function on group level.
 
int any (int pred) const
 Any function on group level.
 
int all (int pred) const
 All function on group level.
 
template<typename T >
unsigned long long match_any (T value) const
 Match any function on group level.
 
template<typename T >
unsigned long long match_all (T value, int &pred) const
 Match all function on group level.
 
- Public Member Functions inherited from cooperative_groups::thread_block_tile_base< tileSize >
T shfl (T var, int srcRank) const
 
T shfl_down (T var, unsigned int lane_delta) const
 
T shfl_up (T var, unsigned int lane_delta) const
 
T shfl_xor (T var, unsigned int laneMask) const
 
unsigned long long ballot (int pred) const
 
int any (int pred) const
 
int all (int pred) const
 
unsigned long long match_any (T value) const
 
unsigned long long match_all (T value, int &pred) const
 
- Public Member Functions inherited from cooperative_groups::tiled_group
unsigned int size () const
 
unsigned int thread_rank () const
 Rank of the calling thread within [0, size() ).
 
void sync () const
 Synchronizes the threads in the group.
 
- Public Member Functions inherited from cooperative_groups::thread_group
uint32_t size () const
 
unsigned int cg_type () const
 Returns the type of the group.
 
uint32_t thread_rank () const
 Rank of the calling thread within [0, size() ).
 
bool is_valid () const
 Returns true if the group has not violated any API constraints.
 
void sync () const
 Synchronizes the threads in the group.
 

Protected Member Functions

 thread_block_tile (const ParentCGTy &g)
 
- Protected Member Functions inherited from cooperative_groups::impl::thread_block_tile_internal< size, ParentCGTy >
template<unsigned int tbtSize, class tbtParentT >
 thread_block_tile_internal (const thread_block_tile_internal< tbtSize, tbtParentT > &g)
 
 thread_block_tile_internal (const thread_block &g)
 
- Protected Member Functions inherited from cooperative_groups::thread_block_tile_type< size, ParentCGTy >
 thread_block_tile_type ()
 
- Protected Member Functions inherited from cooperative_groups::tiled_group
 tiled_group (unsigned int tileSize)
 
- Protected Member Functions inherited from cooperative_groups::thread_group
 thread_group (internal::group_type type, uint32_t size=static_cast< uint64_t >(0), uint64_t mask=static_cast< uint64_t >(0))
 

Additional Inherited Members

- Static Public Member Functions inherited from cooperative_groups::thread_block_tile_type< size, ParentCGTy >
static void sync ()
 
- Static Public Member Functions inherited from cooperative_groups::thread_block_tile_base< tileSize >
static void sync ()
 
- Static Public Member Functions inherited from cooperative_groups::tile_base< size >
static constexpr unsigned int thread_rank ()
 Rank of the thread within this tile.
 
static unsigned int size ()
 Number of threads within this tile.
 
- Static Public Member Functions inherited from cooperative_groups::parent_group_info< tileSize, ParentCGTy >
static unsigned int meta_group_rank ()
 
static unsigned int meta_group_size ()
 Returns the number of groups created when the parent group was partitioned.
 
- Protected Attributes inherited from cooperative_groups::thread_group
uint32_t _type
 Type of the thread_group.
 
uint32_t _size
 Total number of threads in the thread_group.
 
uint64_t _mask
 
struct cooperative_groups::thread_group::_coalesced_info coalesced_info
 
- Static Protected Attributes inherited from cooperative_groups::tile_base< size >
static constexpr unsigned int numThreads
 

Detailed Description

template<unsigned int size, class ParentCGTy>
class cooperative_groups::thread_block_tile< size, ParentCGTy >

Group type - thread_block_tile.

Represents one tiled thread group in a wavefront. This group type also supports sub-wave level intrinsics.

Note
This type is implemented on Linux and is under development on Microsoft Windows.

Constructor & Destructor Documentation

◆ thread_block_tile()

template<unsigned int size, class ParentCGTy >
cooperative_groups::thread_block_tile< size, ParentCGTy >::thread_block_tile ( const ParentCGTy &  g)
inline protected

Member Function Documentation

◆ all()

template<unsigned int size, class ParentCGTy >
int cooperative_groups::thread_block_tile< size, ParentCGTy >::all ( int  pred) const

All function on group level.

Returns a non-zero value if the predicate evaluates to true for all threads in the group.

Parameters
pred [in] The predicate to evaluate on group threads.

◆ any()

template<unsigned int size, class ParentCGTy >
int cooperative_groups::thread_block_tile< size, ParentCGTy >::any ( int  pred) const

Any function on group level.

Returns a non-zero value if the predicate evaluates to true for any thread in the group.

Parameters
pred [in] The predicate to evaluate on group threads.

◆ ballot()

template<unsigned int size, class ParentCGTy >
unsigned long long cooperative_groups::thread_block_tile< size, ParentCGTy >::ballot ( int  pred) const

Ballot function on group level.

Returns a bit mask in which the Nth bit is set to one if the predicate of the Nth thread evaluates to true.

Parameters
pred [in] The predicate to evaluate on group threads.
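
The relationship between ballot(), any(), and all() described above can be illustrated with a small host-side model. This is a sketch in plain C++, not HIP device code: the names model_ballot, model_any, and model_all are hypothetical, and a std::vector stands in for the predicate values that the threads of one tile would pass.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model (hypothetical helper, not part of HIP): preds[i] is the
// predicate value that the thread with rank i would pass to ballot().
unsigned long long model_ballot(const std::vector<int>& preds) {
    assert(preds.size() <= 64);  // the returned mask has 64 bits
    unsigned long long mask = 0;
    for (std::size_t rank = 0; rank < preds.size(); ++rank) {
        if (preds[rank] != 0) {
            mask |= 1ULL << rank;  // set bit N for thread N
        }
    }
    return mask;
}

// any(): true if at least one bit of the ballot mask is set.
bool model_any(const std::vector<int>& preds) {
    return model_ballot(preds) != 0;
}

// all(): true if the ballot mask has one bit set per modelled thread.
bool model_all(const std::vector<int>& preds) {
    const unsigned long long full =
        preds.size() == 64 ? ~0ULL : (1ULL << preds.size()) - 1;
    return model_ballot(preds) == full;
}
```

For the predicates {1, 0, 1, 1} the model returns 0xD (bits 0, 2, and 3 set), model_any is true, and model_all is false.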

◆ match_all()

template<unsigned int size, class ParentCGTy >
template<typename T >
unsigned long long cooperative_groups::thread_block_tile< size, ParentCGTy >::match_all ( T  value,
int &  pred 
) const

Match all function on group level.

Returns a bit mask containing a 1-bit for every participating thread if all of them hold the same value in value as the calling thread. The predicate pred is set to true if all participating threads hold the same value in value.

Parameters
value [in] The value to examine on the current thread in group.
pred [out] The predicate is set to true if all participating threads in the thread group have the same value.
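
The two outputs of match_all() (the returned mask and pred) can be modelled on the host as follows. This is a sketch in plain C++, not HIP device code; model_match_all is a hypothetical name. The text above does not state what the mask is when the values differ, so the model assumes a zero mask in that case.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model (hypothetical helper, not part of HIP): values[i] is the
// value held by the thread with rank i.
template <typename T>
unsigned long long model_match_all(const std::vector<T>& values, int& pred) {
    assert(!values.empty() && values.size() <= 64);
    bool all_equal = true;
    for (std::size_t rank = 1; rank < values.size(); ++rank) {
        if (!(values[rank] == values[0])) {
            all_equal = false;
            break;
        }
    }
    pred = all_equal ? 1 : 0;
    if (!all_equal) {
        return 0;  // assumption: the mask is zero when the values differ
    }
    // One bit per participating thread.
    return values.size() == 64 ? ~0ULL : (1ULL << values.size()) - 1;
}
```

With four threads all holding 7, the model returns 0xF and sets pred to 1; if one thread holds 8 instead, it returns 0 and sets pred to 0.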

◆ match_any()

template<unsigned int size, class ParentCGTy >
template<typename T >
unsigned long long cooperative_groups::thread_block_tile< size, ParentCGTy >::match_any ( T  value) const

Match any function on group level.

Returns a bit mask containing a 1-bit for every participating thread that holds the same value in value as the calling thread.

Parameters
value [in] The value to examine on the current thread in group.
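
Because the mask returned by match_any() depends on which thread is calling, a host-side model needs the caller's rank as an extra input. The following is a sketch in plain C++, not HIP device code; model_match_any and its caller parameter are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model (hypothetical helper, not part of HIP): values[i] is the
// value held by the thread with rank i, and caller is the rank of the thread
// whose result is being computed.
template <typename T>
unsigned long long model_match_any(const std::vector<T>& values,
                                   std::size_t caller) {
    assert(values.size() <= 64 && caller < values.size());
    unsigned long long mask = 0;
    for (std::size_t rank = 0; rank < values.size(); ++rank) {
        if (values[rank] == values[caller]) {
            mask |= 1ULL << rank;  // this thread matches the caller's value
        }
    }
    return mask;
}
```

For the values {5, 9, 5, 9}, the thread with rank 0 gets 0x5 (ranks 0 and 2) and the thread with rank 1 gets 0xA (ranks 1 and 3).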

◆ meta_group_rank()

template<unsigned int size, class ParentCGTy >
unsigned int cooperative_groups::thread_block_tile< size, ParentCGTy >::meta_group_rank ( ) const

Returns the linear rank of the group within the set of tiles partitioned from a parent group (bounded by meta_group_size).

◆ meta_group_size()

template<unsigned int size, class ParentCGTy >
unsigned int cooperative_groups::thread_block_tile< size, ParentCGTy >::meta_group_size ( ) const

Returns the number of groups created when the parent group was partitioned.
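
How meta_group_rank(), meta_group_size(), and thread_rank() relate to a thread's rank in the parent group can be sketched on the host. This is plain C++, not HIP device code; TileInfo and model_tile_info are hypothetical names. The sketch assumes that tiles are formed from consecutive parent ranks and that a partially filled last tile still counts as a group; neither detail is stated in this reference.

```cpp
#include <cassert>

// Hypothetical container for the three per-thread values.
struct TileInfo {
    unsigned int meta_group_rank;  // which tile the thread belongs to
    unsigned int meta_group_size;  // how many tiles the parent was split into
    unsigned int thread_rank;      // rank of the thread inside its tile
};

// Host-side model (not part of HIP). Assumes consecutive parent ranks are
// grouped into tiles and the tile count is rounded up.
TileInfo model_tile_info(unsigned int parent_rank, unsigned int parent_size,
                         unsigned int tile_size) {
    assert(tile_size > 0 && parent_rank < parent_size);
    TileInfo info;
    info.meta_group_rank = parent_rank / tile_size;
    info.meta_group_size = (parent_size + tile_size - 1) / tile_size;
    info.thread_rank = parent_rank % tile_size;
    return info;
}
```

Under these assumptions, the thread with rank 37 in a 256-thread parent group split into tiles of 16 has meta_group_rank 2, meta_group_size 16, and thread_rank 5.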

◆ operator thread_block_tile< size, void >()

template<unsigned int size, class ParentCGTy >
cooperative_groups::thread_block_tile< size, ParentCGTy >::operator thread_block_tile< size, void > ( ) const
inline

◆ shfl()

template<unsigned int size, class ParentCGTy >
template<class T >
T cooperative_groups::thread_block_tile< size, ParentCGTy >::shfl ( T  var,
int  srcRank 
) const

Shuffle operation on group level.

Exchanges variables between threads without using shared memory. The shuffle operation copies var directly from the thread whose thread ID in the group is srcRank.

Template Parameters
T  The type can be a 32-bit integer or single-precision floating point.
Parameters
var [in] The source variable to copy. Only the value held by the thread with thread ID srcRank is copied to the other threads.
srcRank [in] The thread ID, within the group, of the source thread for the copy.
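
The broadcast behaviour described above can be modelled on the host. This is a sketch in plain C++, not HIP device code; model_shfl is a hypothetical name, and the returned vector holds the value that each thread of the tile would receive.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model (hypothetical helper, not part of HIP): vars[i] is the
// value of var held by the thread with thread ID i. Element i of the result
// is what that thread would receive from shfl(var, srcRank).
template <typename T>
std::vector<T> model_shfl(const std::vector<T>& vars, int srcRank) {
    assert(srcRank >= 0);
    const std::size_t src = static_cast<std::size_t>(srcRank);
    assert(src < vars.size());
    // Every thread receives the value held by the source thread.
    return std::vector<T>(vars.size(), vars[src]);
}
```

For the values {10, 20, 30, 40} and srcRank 2, every thread receives 30.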

◆ shfl_down()

template<unsigned int size, class ParentCGTy >
template<class T >
T cooperative_groups::thread_block_tile< size, ParentCGTy >::shfl_down ( T  var,
unsigned int  lane_delta 
) const

Shuffle down operation on group level.

Exchanges variables between threads without using shared memory. The shuffle down operation copies var from the thread whose thread ID in the group is higher than the caller's thread ID by lane_delta.

Template Parameters
T  The type can be a 32-bit integer or single-precision floating point.
Parameters
var [in] The source variable to copy.
lane_delta [in] The relative difference between the caller's thread ID and the thread ID of the source of the copy. sourceID = (threadID + lane_delta) % size()
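
The source-index formula for shfl_down() can be checked with a small host-side model. This is a sketch in plain C++, not HIP device code; model_shfl_down is a hypothetical name. It applies the formula exactly as written in the parameter description; how the device implementation treats threads near the end of the tile is not verified here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model (hypothetical helper, not part of HIP): vars[i] is the
// value of var held by the thread with thread ID i. Element i of the result
// is what that thread would receive under the documented formula.
template <typename T>
std::vector<T> model_shfl_down(const std::vector<T>& vars,
                               unsigned int lane_delta) {
    assert(!vars.empty());
    const std::size_t n = vars.size();
    std::vector<T> out(n);
    for (std::size_t id = 0; id < n; ++id) {
        // sourceID = (threadID + lane_delta) % size(), as documented above.
        out[id] = vars[(id + lane_delta) % n];
    }
    return out;
}
```

For the values {10, 20, 30, 40} and lane_delta 1, the model yields {20, 30, 40, 10}.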

◆ shfl_up()

template<unsigned int size, class ParentCGTy >
template<class T >
T cooperative_groups::thread_block_tile< size, ParentCGTy >::shfl_up ( T  var,
unsigned int  lane_delta 
) const

Shuffle up operation on group level.

Exchanges variables between threads without using shared memory. The shuffle up operation copies var from the thread whose thread ID in the group is lower than the caller's thread ID by lane_delta.

Template Parameters
T  The type can be a 32-bit integer or single-precision floating point.
Parameters
var [in] The source variable to copy.
lane_delta [in] The relative difference between the caller's thread ID and the thread ID of the source of the copy. sourceID = (threadID - lane_delta) % size()
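
The source-index formula for shfl_up() can be modelled on the host in the same way. This is a sketch in plain C++, not HIP device code; model_shfl_up is a hypothetical name. It assumes that % in the formula means a mathematical modulo, so the subtraction is rewritten to avoid unsigned underflow. How the device implementation treats threads near the start of the tile is not verified here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model (hypothetical helper, not part of HIP): vars[i] is the
// value of var held by the thread with thread ID i. Element i of the result
// is what that thread would receive under the documented formula.
template <typename T>
std::vector<T> model_shfl_up(const std::vector<T>& vars,
                             unsigned int lane_delta) {
    assert(!vars.empty());
    const std::size_t n = vars.size();
    std::vector<T> out(n);
    for (std::size_t id = 0; id < n; ++id) {
        // sourceID = (threadID - lane_delta) mod size(); adding n before
        // subtracting keeps the unsigned arithmetic from wrapping.
        out[id] = vars[(id + n - lane_delta % n) % n];
    }
    return out;
}
```

For the values {10, 20, 30, 40} and lane_delta 1, the model yields {40, 10, 20, 30}.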

◆ shfl_xor()

template<unsigned int size, class ParentCGTy >
template<class T >
T cooperative_groups::thread_block_tile< size, ParentCGTy >::shfl_xor ( T  var,
unsigned int  laneMask 
) const

Shuffle xor operation on group level.

Exchanges variables between threads without using shared memory. The shuffle xor operation copies var from the thread whose thread ID in the group is the XOR of the caller's thread ID and laneMask.

Template Parameters
T  The type can be a 32-bit integer or single-precision floating point.
Parameters
var [in] The source variable to copy.
laneMask [in] The mask for the XOR operation. sourceID = threadID ^ laneMask
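
A common use of an XOR shuffle is a butterfly reduction, in which every thread ends up with the sum over the tile. The following host-side model illustrates the idea. It is a sketch in plain C++, not HIP device code; model_shfl_xor and model_butterfly_sum are hypothetical names, and the sketch assumes that the tile size is a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model (hypothetical helper, not part of HIP): element i of the
// result is the value that the thread with thread ID i would receive.
template <typename T>
std::vector<T> model_shfl_xor(const std::vector<T>& vars,
                              unsigned int laneMask) {
    const std::size_t n = vars.size();
    std::vector<T> out(n);
    for (std::size_t id = 0; id < n; ++id) {
        const std::size_t src = id ^ laneMask;  // sourceID = threadID ^ laneMask
        assert(src < n);  // the mask must keep the source inside the tile
        out[id] = vars[src];
    }
    return out;
}

// Butterfly sum: halve the XOR mask each step until it reaches zero.
// Assumes vals.size() is a power of two.
template <typename T>
std::vector<T> model_butterfly_sum(std::vector<T> vals) {
    for (unsigned int mask = static_cast<unsigned int>(vals.size() / 2);
         mask > 0; mask >>= 1) {
        const std::vector<T> other = model_shfl_xor(vals, mask);
        for (std::size_t id = 0; id < vals.size(); ++id) {
            vals[id] += other[id];  // combine with the partner's partial sum
        }
    }
    return vals;
}
```

For the values {1, 2, 3, 4}, the first step (mask 2) gives {4, 6, 4, 6} and the second step (mask 1) gives {10, 10, 10, 10}.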

◆ sync()

template<unsigned int size, class ParentCGTy >
void cooperative_groups::thread_block_tile< size, ParentCGTy >::sync ( )

Synchronizes the threads in the group.

Causes all threads in the group to wait at this synchronization point until every thread has reached it and all of their shared and global memory accesses have completed. This guarantees the visibility of accessed data for all threads in the group.

Note
Read-after-write (RAW), write-after-read (WAR), and write-after-write (WAW) hazards can occur when threads in the group access the same addresses in shared or global memory. Synchronizing the group avoids these data hazards.

◆ thread_rank()

template<unsigned int size, class ParentCGTy >
unsigned int cooperative_groups::thread_block_tile< size, ParentCGTy >::thread_rank ( ) const

Rank of the calling thread within [0, size() ).


The documentation for this class was generated from the following file:
  • /home/docs/checkouts/readthedocs.org/user_builds/advanced-micro-devices-hip/checkouts/clr/hipamd/include/hip/amd_detail/amd_hip_cooperative_groups.h