Table Comparing Syntax for Different Compute APIs

| Term | CUDA | HIP | OpenCL |
|---|---|---|---|
| Device | int deviceId | int deviceId | cl_device_id |
| Queue | cudaStream_t | hipStream_t | cl_command_queue |
| Event | cudaEvent_t | hipEvent_t | cl_event |
| Memory | void * | void * | cl_mem |
| Grid | grid | grid | NDRange |
| Block | block | block | work-group |
| Thread | thread | thread | work-item |
| Warp | warp | warp | sub-group |
| Thread index | threadIdx.x | threadIdx.x | get_local_id(0) |
| Block index | blockIdx.x | blockIdx.x | get_group_id(0) |
| Block dim | blockDim.x | blockDim.x | get_local_size(0) |
| Grid dim | gridDim.x | gridDim.x | get_num_groups(0) |
| Device Kernel | __global__ | __global__ | __kernel |
| Device Function | __device__ | __device__ | Implied in device compilation |
| Host Function | __host__ (default) | __host__ (default) | Implied in host compilation |
| Host + Device Function | __host__ __device__ | __host__ __device__ | No equivalent |
| Kernel Launch | <<< >>> | hipLaunchKernel/hipLaunchKernelGGL/<<< >>> | clEnqueueNDRangeKernel |
| Global Memory | __global__ | __global__ | __global |
| Group Memory | __shared__ | __shared__ | __local |
| Constant | __constant__ | __constant__ | __constant |
| Barrier | __syncthreads | __syncthreads | barrier(CLK_LOCAL_MEM_FENCE) |
| Atomic Builtins | atomicAdd | atomicAdd | atomic_add |
| Precise Math | cos(f) | cos(f) | cos(f) |
| Fast Math | __cos(f) | __cos(f) | native_cos(f) |
| Vector | float4 | float4 | float4 |

Notes

The indexing entries (from the thread index onward) show the terminology for a 1D grid only. For 3D grids, some APIs reverse the order of the x/y/z (0/1/2) dimensions.
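As a rough illustration of how the 1D index terms combine, the following host-only C++ sketch emulates the arithmetic a kernel would typically perform to find its global position. The names `Launch1D`, `global_index_1d`, and `enumerate_global_indices` are hypothetical helpers, not part of CUDA, HIP, or OpenCL, and the stated OpenCL equivalence assumes a zero global work offset.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 1D launch configuration (x / dimension 0 only).
struct Launch1D {
    std::size_t grid_dim;   // gridDim.x  in CUDA/HIP, get_num_groups(0) in OpenCL
    std::size_t block_dim;  // blockDim.x in CUDA/HIP, get_local_size(0) in OpenCL
};

// Global index of one thread (work-item) in a 1D grid.
// CUDA/HIP: blockIdx.x * blockDim.x + threadIdx.x
// OpenCL:   get_group_id(0) * get_local_size(0) + get_local_id(0),
//           which matches get_global_id(0) when the global work offset is zero.
inline std::size_t global_index_1d(std::size_t block_idx,
                                   std::size_t block_dim,
                                   std::size_t thread_idx) {
    assert(thread_idx < block_dim);  // a thread index is always below the block size
    return block_idx * block_dim + thread_idx;
}

// Host-side emulation: visit every (block, thread) pair sequentially and
// collect the global index each one would compute.
inline std::vector<std::size_t> enumerate_global_indices(const Launch1D& launch) {
    std::vector<std::size_t> ids;
    ids.reserve(launch.grid_dim * launch.block_dim);
    for (std::size_t block = 0; block < launch.grid_dim; ++block) {
        for (std::size_t thread = 0; thread < launch.block_dim; ++thread) {
            ids.push_back(global_index_1d(block, launch.block_dim, thread));
        }
    }
    return ids;
}
```

With 3 blocks of 4 threads, `enumerate_global_indices({3, 4})` yields the indices 0 through 11 in order, so each thread covers exactly one element of a 12-element array.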