Library setup, exit, and query routines

ROCSHMEM_INIT

__host__ void rocshmem_init(void)
Parameters:

None.

Returns:

None.

Description: This routine initializes the rocSHMEM library and the underlying transport layer. Before calling rocshmem_init, you must select the device that this PE is associated with by calling hipSetDevice.

__device__ void rocshmem_wg_init(void)
Parameters:

None.

Returns:

None.

Description: This routine initializes device-side rocSHMEM resources. It must be called before any threads in this work-group invoke other rocSHMEM functions. It must be called collectively by all threads in the work-group.

ROCSHMEM_FINALIZE

__host__ void rocshmem_finalize(void)
Parameters:

None.

Returns:

None.

Description: This routine finalizes the rocSHMEM library.
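The host-side lifecycle described above (select a device, initialize, finalize) can be sketched as follows. This is a minimal sketch under stated assumptions, not a prescribed pattern: the header path rocshmem/rocshmem.hpp and the rocshmem namespace are assumed, device_for_local_rank and run_on_pe are hypothetical helpers, and the node-local rank is expected to come from your launcher. The rocSHMEM-dependent part is compiled only when building with hipcc and the header is found.

```cpp
#include <cstdio>

#if defined(__HIPCC__) && __has_include(<rocshmem/rocshmem.hpp>)
#include <hip/hip_runtime.h>
#include <rocshmem/rocshmem.hpp>  // assumed header path
#define SKETCH_HAVE_ROCSHMEM 1
#endif

// Hypothetical helper (not part of rocSHMEM): map a node-local rank onto
// one of the visible GPUs in round-robin order. Returns -1 on bad input.
int device_for_local_rank(int local_rank, int device_count) {
  if (local_rank < 0 || device_count <= 0) return -1;
  return local_rank % device_count;
}

#ifdef SKETCH_HAVE_ROCSHMEM
using namespace rocshmem;  // assumption: the API lives in this namespace

// Hypothetical driver: returns 0 on success, 1 if no device could be selected.
int run_on_pe(int local_rank) {
  int device_count = 0;
  if (hipGetDeviceCount(&device_count) != hipSuccess) return 1;

  const int device = device_for_local_rank(local_rank, device_count);
  if (device < 0) return 1;

  // The device must be selected before rocshmem_init is called.
  if (hipSetDevice(device) != hipSuccess) return 1;

  rocshmem_init();
  std::printf("PE %d of %d uses device %d\n",
              rocshmem_my_pe(), rocshmem_n_pes(), device);
  // ... allocate symmetric memory and launch kernels here ...
  rocshmem_finalize();
  return 0;
}
#endif
```

The round-robin mapping is only one possible choice; any scheme works as long as hipSetDevice runs before rocshmem_init.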

__device__ void rocshmem_wg_finalize(void)
Parameters:

None.

Returns:

None.

Description: This routine finalizes device-side rocSHMEM resources. It must be called before work-group completion if the work-group also called rocshmem_wg_init. It must be called collectively by all threads in the work-group.

ROCSHMEM_INIT_ATTR

__host__ int rocshmem_init_attr(unsigned int flags, rocshmem_init_attr_t *attr)
Parameters:
  • flags – The initialization method to be used.

  • attr – Attribute structure specifying input characteristics.

Returns:

Returns 0 on success; otherwise, returns a nonzero value.

Description: This routine initializes the rocSHMEM runtime and underlying transport layer using the provided mode and attributes. The parameter flags can be either ROCSHMEM_INIT_WITH_UNIQUEID or ROCSHMEM_INIT_WITH_MPI_COMM.

ROCSHMEM_GET_UNIQUEID

__host__ int rocshmem_get_uniqueid(rocshmem_uniqueid_t *uid)
Parameters:

uid – Pointer to a unique ID handle.

Returns:

Returns 0 on success; otherwise, returns a nonzero value.

Description: This routine generates a unique ID and stores it in the handle that uid points to.

ROCSHMEM_SET_ATTR_UNIQUEID_ARGS

__host__ int rocshmem_set_attr_uniqueid_args(int rank, int nranks, rocshmem_uniqueid_t *uid, rocshmem_init_attr_t *attr)
Parameters:
  • rank – Rank of the calling process.

  • nranks – Number of PEs.

  • uid – Unique ID used to identify the group of processes.

  • attr – Attribute structure to be passed to rocshmem_init_attr.

Returns:

Returns 0 on success; otherwise, returns a nonzero value.

Description: This routine fills in the rocshmem_init_attr_t struct that attr points to with the given rank, number of PEs, and unique ID.
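The three unique-ID routines above are meant to be combined. The following is a minimal sketch of one plausible sequence, not a definitive recipe. It assumes that one process creates the ID and shares it with the others out of band, that rocshmem_uniqueid_t can be copied as raw bytes, and that the header path and rocshmem namespace are as shown. The names succeeded, BroadcastFn, and init_with_uniqueid are hypothetical, and the broadcast callback must be supplied by the application (for example, through MPI or sockets). The rocSHMEM-dependent part is compiled only when building with hipcc and the header is found.

```cpp
#include <cstddef>
#include <cstdio>

#if defined(__HIPCC__) && __has_include(<rocshmem/rocshmem.hpp>)
#include <hip/hip_runtime.h>
#include <rocshmem/rocshmem.hpp>  // assumed header path
#define SKETCH_HAVE_ROCSHMEM 1
#endif

// Hypothetical helper: the int-returning setup routines report 0 on success,
// so log anything else and tell the caller whether to continue.
bool succeeded(int rc, const char* what) {
  if (rc != 0) std::fprintf(stderr, "%s failed with code %d\n", what, rc);
  return rc == 0;
}

#ifdef SKETCH_HAVE_ROCSHMEM
using namespace rocshmem;  // assumption: the API lives in this namespace

// Hypothetical callback: copies `bytes` bytes from rank 0's buffer into the
// same buffer on every other rank. Returns 0 on success.
using BroadcastFn = int (*)(void* buffer, std::size_t bytes);

// Assumes hipSetDevice has already been called for this PE.
int init_with_uniqueid(int rank, int nranks, BroadcastFn broadcast_from_rank0) {
  rocshmem_uniqueid_t uid{};
  rocshmem_init_attr_t attr{};

  // Assumption: a single process creates the ID. A production version would
  // also tell the other ranks if this step fails, so they do not wait forever.
  if (rank == 0 &&
      !succeeded(rocshmem_get_uniqueid(&uid), "rocshmem_get_uniqueid")) {
    return 1;
  }
  if (!succeeded(broadcast_from_rank0(&uid, sizeof(uid)), "broadcast")) {
    return 1;
  }
  if (!succeeded(rocshmem_set_attr_uniqueid_args(rank, nranks, &uid, &attr),
                 "rocshmem_set_attr_uniqueid_args")) {
    return 1;
  }
  if (!succeeded(rocshmem_init_attr(ROCSHMEM_INIT_WITH_UNIQUEID, &attr),
                 "rocshmem_init_attr")) {
    return 1;
  }
  return 0;
}
#endif
```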

ROCSHMEM_N_PES

__host__ int rocshmem_n_pes(void)
Parameters:

None.

Returns:

Total number of PEs.

Description: This routine queries the total number of PEs. It can be called before rocshmem_init.

__device__ int rocshmem_n_pes(void)
__device__ int rocshmem_ctx_n_pes(rocshmem_ctx_t ctx)
Parameters:

ctx – GPU-side context handle (rocshmem_ctx_n_pes only).

Returns:

Total number of PEs.

Description: This routine queries the total number of PEs for a given context. It can be called per thread with no performance penalty.

ROCSHMEM_MY_PE

__host__ int rocshmem_my_pe(void)
Parameters:

None.

Returns:

PE ID of the caller.

Description: This routine queries the PE ID of the caller. It can be called before rocshmem_init.

__device__ int rocshmem_my_pe(void)
__device__ int rocshmem_ctx_my_pe(rocshmem_ctx_t ctx)
Parameters:

ctx – GPU-side context handle (rocshmem_ctx_my_pe only).

Returns:

PE ID of the caller.

Description: This routine queries the PE ID of the caller. It can be called per thread with no performance penalty.
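The device-side routines in this section can be combined in a kernel along the following lines. This is a hedged sketch rather than a reference implementation: ring_neighbor and record_ring_neighbor are hypothetical names, the header path and rocshmem namespace are assumptions, and out is assumed to point to device memory allocated by the caller. Kernel launch is left out because launch requirements are not covered here. The kernel is compiled only when building with hipcc and the header is found.

```cpp
#if defined(__HIPCC__) && __has_include(<rocshmem/rocshmem.hpp>)
#include <hip/hip_runtime.h>
#include <rocshmem/rocshmem.hpp>  // assumed header path
#define SKETCH_HAVE_ROCSHMEM 1
#define SKETCH_HOST_DEVICE __host__ __device__
#else
#define SKETCH_HOST_DEVICE
#endif

// Hypothetical helper: right-hand neighbor in a ring of n_pes PEs.
// Requires n_pes > 0.
SKETCH_HOST_DEVICE inline int ring_neighbor(int my_pe, int n_pes) {
  return (my_pe + 1) % n_pes;
}

#ifdef SKETCH_HAVE_ROCSHMEM
using namespace rocshmem;  // assumption: the API lives in this namespace

// Hypothetical kernel: stores this PE's ring neighbor in out[0].
__global__ void record_ring_neighbor(int* out) {
  // Collective over the work-group: every thread calls it, and it comes
  // before any other rocSHMEM call made by this work-group.
  rocshmem_wg_init();

  // Both queries are documented as cheap enough to call from every thread.
  const int me = rocshmem_my_pe();
  const int npes = rocshmem_n_pes();

  // Only one thread writes, but no thread returns early, so all threads
  // still reach the collective finalize call below.
  if (blockIdx.x == 0 && threadIdx.x == 0) {
    out[0] = ring_neighbor(me, npes);
  }

  rocshmem_wg_finalize();
}
#endif
```

Guarding the write with a condition, instead of returning early from the other threads, keeps rocshmem_wg_finalize collective across the whole work-group.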