Table Comparing Syntax for Different Compute APIs


| Term | CUDA | HIP | OpenCL |
|---|---|---|---|
| Device | `int deviceId` | `int deviceId` | `cl_device_id` |
| Queue | `cudaStream_t` | `hipStream_t` | `cl_command_queue` |
| Event | `cudaEvent_t` | `hipEvent_t` | `cl_event` |
| Memory | `void *` | `void *` | `cl_mem` |
| | grid | grid | NDRange |
| | block | block | work-group |
| | thread | thread | work-item |
| | warp | warp | sub-group |
| Thread-index | `threadIdx.x` | `threadIdx.x` | `get_local_id(0)` |
| Block-index | `blockIdx.x` | `blockIdx.x` | `get_group_id(0)` |
| Block-dim | `blockDim.x` | `blockDim.x` | `get_local_size(0)` |
| Grid-dim | `gridDim.x` | `gridDim.x` | `get_num_groups(0)` |
| Device Kernel | `__global__` | `__global__` | `__kernel` |
| Device Function | `__device__` | `__device__` | Implied in device compilation |
| Host Function | `__host__` (default) | `__host__` (default) | Implied in host compilation |
| Host + Device Function | `__host__ __device__` | `__host__ __device__` | No equivalent |
| Kernel Launch | `<<< >>>` | `hipLaunchKernel` / `hipLaunchKernelGGL` / `<<< >>>` | `clEnqueueNDRangeKernel` |
| Global Memory | `__device__` | `__device__` | `__global` |
| Group Memory | `__shared__` | `__shared__` | `__local` |
| Constant | `__constant__` | `__constant__` | `__constant` |
| | `__syncthreads` | `__syncthreads` | `barrier(CLK_LOCAL_MEM_FENCE)` |
| Atomic Builtins | `atomicAdd` | `atomicAdd` | `atomic_add` |
| Precise Math | `cos(f)` | `cos(f)` | `cos(f)` |
| Fast Math | `__cosf(f)` | `__cosf(f)` | `native_cos(f)` |
| Vector | `float4` | `float4` | `float4` |
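To see several of these terms working together, the following is a minimal CUDA sketch, not a reference implementation. It sums the squares of an integer array using a device function, a kernel, group (shared) memory, `__syncthreads`, `atomicAdd`, a stream, and an event. The names `sum_of_squares`, `sum_of_squares_kernel`, `square`, `check`, and `kBlockSize` are hypothetical and chosen only for illustration. Comments give the OpenCL counterpart of each term.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <vector>

constexpr int kBlockSize = 256;  // threads per block (assumed; must be a power of two)

// Minimal error check for this sketch; the assert is compiled out under NDEBUG.
inline void check(cudaError_t err) { assert(err == cudaSuccess); (void)err; }

// Device function: callable from device code only.
__device__ int square(int x) { return x * x; }

// Device kernel (OpenCL: __kernel). Each block reduces its slice in group
// memory, then one thread per block adds the partial sum to the result.
__global__ void sum_of_squares_kernel(const int* in, int* out, int n) {
    __shared__ int partial[kBlockSize];          // OpenCL: __local
    const int tid = threadIdx.x;                 // OpenCL: get_local_id(0)
    const int gid = blockIdx.x * blockDim.x + tid;
    partial[tid] = (gid < n) ? square(in[gid]) : 0;
    __syncthreads();                             // OpenCL: barrier(CLK_LOCAL_MEM_FENCE)
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) partial[tid] += partial[tid + stride];
        __syncthreads();
    }
    if (tid == 0) atomicAdd(out, partial[0]);    // OpenCL: atomic_add
}

// Host function (hypothetical helper): copies data in, launches, copies the sum out.
int sum_of_squares(const std::vector<int>& data) {
    const int n = static_cast<int>(data.size());
    if (n == 0) return 0;

    cudaStream_t stream;                         // Queue (OpenCL: cl_command_queue)
    cudaEvent_t done;                            // Event (OpenCL: cl_event)
    check(cudaStreamCreate(&stream));
    check(cudaEventCreate(&done));

    int* d_in = nullptr;                         // Memory: plain device pointers
    int* d_out = nullptr;
    check(cudaMalloc(&d_in, n * sizeof(int)));
    check(cudaMalloc(&d_out, sizeof(int)));
    check(cudaMemcpyAsync(d_in, data.data(), n * sizeof(int),
                          cudaMemcpyHostToDevice, stream));
    check(cudaMemsetAsync(d_out, 0, sizeof(int), stream));

    // Kernel launch (OpenCL: clEnqueueNDRangeKernel).
    const int blocks = (n + kBlockSize - 1) / kBlockSize;
    sum_of_squares_kernel<<<blocks, kBlockSize, 0, stream>>>(d_in, d_out, n);
    check(cudaGetLastError());

    int result = 0;
    check(cudaMemcpyAsync(&result, d_out, sizeof(int),
                          cudaMemcpyDeviceToHost, stream));
    check(cudaEventRecord(done, stream));
    check(cudaEventSynchronize(done));           // wait for all work queued above

    check(cudaFree(d_in));
    check(cudaFree(d_out));
    check(cudaEventDestroy(done));
    check(cudaStreamDestroy(stream));
    return result;
}
```

Calling `sum_of_squares({1, 2, 3, 4})` from a host `main` should return 30, assuming a CUDA-capable device is present. Per the table, a HIP version would keep the kernel unchanged and swap the `cuda` prefix on the host-side types for `hip` (for example `hipStream_t` and `hipEvent_t`); the corresponding HIP runtime function names are an assumption to check against the HIP documentation.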

Notes

The indexing entries (Thread-index through Grid-dim) show the syntax for a 1D grid. Some APIs use the reverse order of xyz / 012 indexing for 3D grids.
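As an illustration of how the 1D indexing terms combine, here is a short CUDA sketch in which each thread computes its global index and walks the array in steps of the total thread count. The names `write_global_index`, `global_indices`, and `check` are hypothetical. The comments relate each expression to the OpenCL functions in the table, assuming a zero global work offset on the OpenCL side.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <vector>

// Minimal error check for this sketch; the assert is compiled out under NDEBUG.
inline void check(cudaError_t err) { assert(err == cudaSuccess); (void)err; }

// Each thread writes its own index into every slot it is responsible for.
__global__ void write_global_index(int* out, int n) {
    // OpenCL: get_group_id(0) * get_local_size(0) + get_local_id(0),
    // which equals get_global_id(0) when the global offset is zero.
    const int global_id = blockIdx.x * blockDim.x + threadIdx.x;
    // OpenCL: get_num_groups(0) * get_local_size(0), i.e. get_global_size(0).
    const int global_size = gridDim.x * blockDim.x;
    // Grid-stride loop: covers all n slots even when n > global_size.
    for (int i = global_id; i < n; i += global_size) out[i] = i;
}

// Hypothetical host helper; assumes n >= 0, blocks > 0, threads_per_block > 0.
std::vector<int> global_indices(int n, int blocks, int threads_per_block) {
    std::vector<int> host(n, -1);
    if (n == 0) return host;

    int* d_out = nullptr;
    check(cudaMalloc(&d_out, n * sizeof(int)));
    write_global_index<<<blocks, threads_per_block>>>(d_out, n);
    check(cudaGetLastError());
    // A blocking copy on the default stream also waits for the kernel to finish.
    check(cudaMemcpy(host.data(), d_out, n * sizeof(int), cudaMemcpyDeviceToHost));
    check(cudaFree(d_out));
    return host;
}
```

With `global_indices(1000, 2, 128)`, only 256 threads run, so each thread handles several slots; every element `i` of the returned vector should still equal `i`.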