Guidelines#

Naming conventions#

hipBLAS uses the following naming conventions:

  • Upper case for matrices, e.g. matrices A, B, C in GEMM (C = A*B)

  • Lower case for vectors, e.g. vectors x, y in GEMV (y = A*x)

Notations#

hipBLAS functions use the following notation to denote precisions:

  • h = half

  • bf = 16 bit brain floating point

  • s = single

  • d = double

  • c = single complex

  • z = double complex

hipBLAS Types#

Definitions#

hipblasHandle_t#

typedef void *hipblasHandle_t#

hipblasHandle_t is a void pointer that stores the library context (either rocBLAS or cuBLAS).

hipblasHalf#

typedef uint16_t hipblasHalf#

Specifies the datatype as unsigned short (16 bits), used to store half-precision values.

hipblasInt8#

typedef int8_t hipblasInt8#

Specifies the datatype as signed char (8 bits).

hipblasStride#

typedef int64_t hipblasStride#

Stride between matrices or vectors in strided_batched functions.

hipblasBfloat16#

struct hipblasBfloat16#

Struct to represent a 16 bit Brain floating-point number.

hipblasComplex#

struct hipblasComplex#

Struct to represent a complex number with single precision real and imaginary parts.

hipblasDoubleComplex#

struct hipblasDoubleComplex#

Struct to represent a complex number with double precision real and imaginary parts.

Enums#

Enumeration constants have numbering that is consistent with CBLAS, ACML and most standard C BLAS libraries.

hipblasStatus_t#

enum hipblasStatus_t#

hipBLAS status code definitions.

Values:

enumerator HIPBLAS_STATUS_SUCCESS#

Function succeeds

enumerator HIPBLAS_STATUS_NOT_INITIALIZED#

HIPBLAS library not initialized

enumerator HIPBLAS_STATUS_ALLOC_FAILED#

resource allocation failed

enumerator HIPBLAS_STATUS_INVALID_VALUE#

unsupported numerical value was passed to function

enumerator HIPBLAS_STATUS_MAPPING_ERROR#

access to GPU memory space failed

enumerator HIPBLAS_STATUS_EXECUTION_FAILED#

GPU program failed to execute

enumerator HIPBLAS_STATUS_INTERNAL_ERROR#

an internal HIPBLAS operation failed

enumerator HIPBLAS_STATUS_NOT_SUPPORTED#

function not implemented

enumerator HIPBLAS_STATUS_ARCH_MISMATCH#

architecture mismatch

enumerator HIPBLAS_STATUS_HANDLE_IS_NULLPTR#

hipBLAS handle is null pointer

enumerator HIPBLAS_STATUS_INVALID_ENUM#

unsupported enum value was passed to function

enumerator HIPBLAS_STATUS_UNKNOWN#

back-end returned an unsupported status code

hipblasOperation_t#

enum hipblasOperation_t#

Used to specify whether the matrix is to be transposed or not.

Values:

enumerator HIPBLAS_OP_N#

Operate with the matrix.

enumerator HIPBLAS_OP_T#

Operate with the transpose of the matrix.

enumerator HIPBLAS_OP_C#

Operate with the conjugate transpose of the matrix.

hipblasPointerMode_t#

enum hipblasPointerMode_t#

Indicates if scalar pointers are on host or device. This is used for scalars alpha and beta and for scalar function return values.

Values:

enumerator HIPBLAS_POINTER_MODE_HOST#

Scalar values affected by this variable will be located on the host.

enumerator HIPBLAS_POINTER_MODE_DEVICE#

Scalar values affected by this variable will be located on the device.

hipblasFillMode_t#

enum hipblasFillMode_t#

Used by the Hermitian, symmetric and triangular matrix routines to specify whether the upper or lower triangle is being referenced.

Values:

enumerator HIPBLAS_FILL_MODE_UPPER#

Upper triangle

enumerator HIPBLAS_FILL_MODE_LOWER#

Lower triangle

enumerator HIPBLAS_FILL_MODE_FULL#

hipblasDiagType_t#

enum hipblasDiagType_t#

It is used by the triangular matrix routines to specify whether the matrix is unit triangular.

Values:

enumerator HIPBLAS_DIAG_NON_UNIT#

Non-unit triangular.

enumerator HIPBLAS_DIAG_UNIT#

Unit triangular.

hipblasSideMode_t#

enum hipblasSideMode_t#

Indicates on which side matrix A is located relative to matrix B during multiplication.

Values:

enumerator HIPBLAS_SIDE_LEFT#

Multiply general matrix by symmetric, Hermitian or triangular matrix on the left.

enumerator HIPBLAS_SIDE_RIGHT#

Multiply general matrix by symmetric, Hermitian or triangular matrix on the right.

enumerator HIPBLAS_SIDE_BOTH#

hipblasDatatype_t#

enum hipblasDatatype_t#

Indicates the precision width of data stored in a blas type.

Values:

enumerator HIPBLAS_R_16F#

16 bit floating point, real

enumerator HIPBLAS_R_32F#

32 bit floating point, real

enumerator HIPBLAS_R_64F#

64 bit floating point, real

enumerator HIPBLAS_C_16F#

16 bit floating point, complex

enumerator HIPBLAS_C_32F#

32 bit floating point, complex

enumerator HIPBLAS_C_64F#

64 bit floating point, complex

enumerator HIPBLAS_R_8I#

8 bit signed integer, real

enumerator HIPBLAS_R_8U#

8 bit unsigned integer, real

enumerator HIPBLAS_R_32I#

32 bit signed integer, real

enumerator HIPBLAS_R_32U#

32 bit unsigned integer, real

enumerator HIPBLAS_C_8I#

8 bit signed integer, complex

enumerator HIPBLAS_C_8U#

8 bit unsigned integer, complex

enumerator HIPBLAS_C_32I#

32 bit signed integer, complex

enumerator HIPBLAS_C_32U#

32 bit unsigned integer, complex

enumerator HIPBLAS_R_16B#

16 bit bfloat, real

enumerator HIPBLAS_C_16B#

16 bit bfloat, complex

enumerator HIPBLAS_DATATYPE_INVALID#

Invalid datatype value, do not use

hipblasGemmAlgo_t#

enum hipblasGemmAlgo_t#

Indicates the algorithm used by GEMM routines.

Values:

enumerator HIPBLAS_GEMM_DEFAULT#

Corresponds to rocblas_gemm_algo_standard.

hipblasAtomicsMode_t#

enum hipblasAtomicsMode_t#

Indicates whether atomic operations are allowed. Disallowing atomic operations may improve determinism and repeatability of results at a cost of performance.

Values:

enumerator HIPBLAS_ATOMICS_NOT_ALLOWED#

Algorithms will refrain from atomics where applicable.

enumerator HIPBLAS_ATOMICS_ALLOWED#

Algorithms will take advantage of atomics where applicable.

hipBLAS Functions#

Level 1 BLAS#

hipblasIXamax + Batched, StridedBatched#

hipblasStatus_t hipblasIsamax(hipblasHandle_t handle, int n, const float *x, int incx, int *result)#
hipblasStatus_t hipblasIdamax(hipblasHandle_t handle, int n, const double *x, int incx, int *result)#
hipblasStatus_t hipblasIcamax(hipblasHandle_t handle, int n, const hipblasComplex *x, int incx, int *result)#
hipblasStatus_t hipblasIzamax(hipblasHandle_t handle, int n, const hipblasDoubleComplex *x, int incx, int *result)#

BLAS Level 1 API.

amax finds the first index of the element of maximum magnitude of a vector x.

  • Supported precisions in rocBLAS : s,d,c,z.

  • Supported precisions in cuBLAS : s,d,c,z.

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in x.

  • x[in] device pointer storing vector x.

  • incx[in] [int] specifies the increment for the elements of x.

  • result[inout] device pointer or host pointer to store the amax index. return is 0 if n <= 0 or incx <= 0.
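
The index semantics described above can be illustrated with a small CPU reference. This is a minimal sketch, not the hipBLAS implementation: `ref_isamax` and `abs_f` are hypothetical helpers, and the sketch assumes the 1-based index convention of the reference BLAS, with 0 reserved for the invalid-argument case.

```c
#include <assert.h>
#include <stddef.h>

/* Absolute value without depending on libm. */
static float abs_f(float v) { return v < 0.0f ? -v : v; }

/* Hypothetical CPU reference for the semantics of hipblasIsamax.
 * Returns the 1-based index (assumed BLAS convention) of the first
 * element with the largest magnitude, or 0 if n <= 0 or incx <= 0. */
static int ref_isamax(int n, const float *x, int incx)
{
    if (n <= 0 || incx <= 0)
        return 0;
    assert(x != NULL);
    int best = 0; /* 0-based position of the current maximum */
    float best_mag = abs_f(x[0]);
    for (int i = 1; i < n; ++i) {
        float mag = abs_f(x[(size_t)i * (size_t)incx]);
        if (mag > best_mag) { /* strict '>' keeps the first maximum */
            best_mag = mag;
            best = i;
        }
    }
    return best + 1;
}
```

With x = {1, -4, 3, 4} and incx = 1, the sketch reports index 2, because -4 is the first element of largest magnitude.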

hipblasStatus_t hipblasIsamaxBatched(hipblasHandle_t handle, int n, const float *const x[], int incx, int batchCount, int *result)#
hipblasStatus_t hipblasIdamaxBatched(hipblasHandle_t handle, int n, const double *const x[], int incx, int batchCount, int *result)#
hipblasStatus_t hipblasIcamaxBatched(hipblasHandle_t handle, int n, const hipblasComplex *const x[], int incx, int batchCount, int *result)#
hipblasStatus_t hipblasIzamaxBatched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *const x[], int incx, int batchCount, int *result)#

BLAS Level 1 API.

amaxBatched finds the first index of the element of maximum magnitude of each vector x_i in a batch, for i = 1, …, batchCount.

  • Supported precisions in rocBLAS : s,d,c,z.

  • Supported precisions in cuBLAS : No support.

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] number of elements in each vector x_i

  • x[in] device array of device pointers storing each vector x_i.

  • incx[in] [int] specifies the increment for the elements of each x_i. incx must be > 0.

  • batchCount[in] [int] number of instances in the batch, must be > 0.

  • result[out] device or host array of batchCount size for results. return is 0 if n <= 0 or incx <= 0.

hipblasStatus_t hipblasIsamaxStridedBatched(hipblasHandle_t handle, int n, const float *x, int incx, hipblasStride stridex, int batchCount, int *result)#
hipblasStatus_t hipblasIdamaxStridedBatched(hipblasHandle_t handle, int n, const double *x, int incx, hipblasStride stridex, int batchCount, int *result)#
hipblasStatus_t hipblasIcamaxStridedBatched(hipblasHandle_t handle, int n, const hipblasComplex *x, int incx, hipblasStride stridex, int batchCount, int *result)#
hipblasStatus_t hipblasIzamaxStridedBatched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *x, int incx, hipblasStride stridex, int batchCount, int *result)#

BLAS Level 1 API.

amaxStridedBatched finds the first index of the element of maximum magnitude of each vector x_i in a batch, for i = 1, …, batchCount.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] number of elements in each vector x_i

  • x[in] device pointer to the first vector x_1.

  • incx[in] [int] specifies the increment for the elements of each x_i. incx must be > 0.

  • stridex[in] [hipblasStride] specifies the pointer increment between one x_i and the next x_(i + 1).

  • batchCount[in] [int] number of instances in the batch

  • result[out] device or host pointer for storing contiguous batchCount results. return is 0 if n <= 0 or incx <= 0.
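
The strided layout can be sketched on the CPU as follows. This is an illustrative reference under stated assumptions, not the library code: `ref_isamax_strided_batched` is a hypothetical name, `int64_t` stands in for hipblasStride, and indices are assumed to be 1-based as in the reference BLAS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static float abs_f(float v) { return v < 0.0f ? -v : v; }

/* Hypothetical CPU reference for hipblasIsamaxStridedBatched semantics.
 * Vector x_i starts at x + i * stridex; result[i] receives its index. */
static void ref_isamax_strided_batched(int n, const float *x, int incx,
                                       int64_t stridex, int batchCount,
                                       int *result)
{
    assert(result != NULL || batchCount <= 0);
    for (int b = 0; b < batchCount; ++b) {
        result[b] = 0; /* value reported when n <= 0 or incx <= 0 */
        if (n <= 0 || incx <= 0)
            continue;
        const float *xi = x + (int64_t)b * stridex; /* start of x_i */
        int best = 0;
        for (int i = 1; i < n; ++i) {
            if (abs_f(xi[(int64_t)i * incx]) > abs_f(xi[(int64_t)best * incx]))
                best = i;
        }
        result[b] = best + 1;
    }
}
```

Here a stride larger than n * incx simply leaves unused padding between consecutive vectors.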

hipblasIXamin + Batched, StridedBatched#

hipblasStatus_t hipblasIsamin(hipblasHandle_t handle, int n, const float *x, int incx, int *result)#
hipblasStatus_t hipblasIdamin(hipblasHandle_t handle, int n, const double *x, int incx, int *result)#
hipblasStatus_t hipblasIcamin(hipblasHandle_t handle, int n, const hipblasComplex *x, int incx, int *result)#
hipblasStatus_t hipblasIzamin(hipblasHandle_t handle, int n, const hipblasDoubleComplex *x, int incx, int *result)#

BLAS Level 1 API.

amin finds the first index of the element of minimum magnitude of a vector x.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : s,d,c,z

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in x.

  • x[in] device pointer storing vector x.

  • incx[in] [int] specifies the increment for the elements of x.

  • result[inout] device pointer or host pointer to store the amin index. return is 0 if n <= 0 or incx <= 0.
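
For the complex variants, "magnitude" needs a definition. The sketch below assumes the reference-BLAS convention |re| + |im| rather than the Euclidean modulus, and a 1-based result. `ref_complex` is a hypothetical stand-in for hipblasComplex and `ref_icamin` is not a library function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for hipblasComplex (single-precision parts). */
typedef struct { float re, im; } ref_complex;

static float abs_f(float v) { return v < 0.0f ? -v : v; }

/* Hypothetical CPU reference for hipblasIcamin semantics. Magnitude is
 * taken as |re| + |im| (assumed convention); the returned index is
 * 1-based, and 0 signals n <= 0 or incx <= 0. */
static int ref_icamin(int n, const ref_complex *x, int incx)
{
    if (n <= 0 || incx <= 0)
        return 0;
    assert(x != NULL);
    int best = 0;
    float best_mag = abs_f(x[0].re) + abs_f(x[0].im);
    for (int i = 1; i < n; ++i) {
        const ref_complex *e = &x[(size_t)i * (size_t)incx];
        float mag = abs_f(e->re) + abs_f(e->im);
        if (mag < best_mag) { /* strict '<' keeps the first minimum */
            best_mag = mag;
            best = i;
        }
    }
    return best + 1;
}
```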

hipblasStatus_t hipblasIsaminBatched(hipblasHandle_t handle, int n, const float *const x[], int incx, int batchCount, int *result)#
hipblasStatus_t hipblasIdaminBatched(hipblasHandle_t handle, int n, const double *const x[], int incx, int batchCount, int *result)#
hipblasStatus_t hipblasIcaminBatched(hipblasHandle_t handle, int n, const hipblasComplex *const x[], int incx, int batchCount, int *result)#
hipblasStatus_t hipblasIzaminBatched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *const x[], int incx, int batchCount, int *result)#

BLAS Level 1 API.

aminBatched finds the first index of the element of minimum magnitude of each vector x_i in a batch, for i = 1, …, batchCount.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] number of elements in each vector x_i

  • x[in] device array of device pointers storing each vector x_i.

  • incx[in] [int] specifies the increment for the elements of each x_i. incx must be > 0.

  • batchCount[in] [int] number of instances in the batch, must be > 0.

  • result[out] device or host pointer to an array of batchCount size for results. return is 0 if n <= 0 or incx <= 0.

hipblasStatus_t hipblasIsaminStridedBatched(hipblasHandle_t handle, int n, const float *x, int incx, hipblasStride stridex, int batchCount, int *result)#
hipblasStatus_t hipblasIdaminStridedBatched(hipblasHandle_t handle, int n, const double *x, int incx, hipblasStride stridex, int batchCount, int *result)#
hipblasStatus_t hipblasIcaminStridedBatched(hipblasHandle_t handle, int n, const hipblasComplex *x, int incx, hipblasStride stridex, int batchCount, int *result)#
hipblasStatus_t hipblasIzaminStridedBatched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *x, int incx, hipblasStride stridex, int batchCount, int *result)#

BLAS Level 1 API.

aminStridedBatched finds the first index of the element of minimum magnitude of each vector x_i in a batch, for i = 1, …, batchCount.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] number of elements in each vector x_i

  • x[in] device pointer to the first vector x_1.

  • incx[in] [int] specifies the increment for the elements of each x_i. incx must be > 0.

  • stridex[in] [hipblasStride] specifies the pointer increment between one x_i and the next x_(i + 1)

  • batchCount[in] [int] number of instances in the batch

  • result[out] device or host pointer to an array for storing contiguous batchCount results. return is 0 if n <= 0 or incx <= 0.

hipblasXasum + Batched, StridedBatched#

hipblasStatus_t hipblasSasum(hipblasHandle_t handle, int n, const float *x, int incx, float *result)#
hipblasStatus_t hipblasDasum(hipblasHandle_t handle, int n, const double *x, int incx, double *result)#
hipblasStatus_t hipblasScasum(hipblasHandle_t handle, int n, const hipblasComplex *x, int incx, float *result)#
hipblasStatus_t hipblasDzasum(hipblasHandle_t handle, int n, const hipblasDoubleComplex *x, int incx, double *result)#

BLAS Level 1 API.

asum computes the sum of the magnitudes of elements of a real vector x, or the sum of magnitudes of the real and imaginary parts of elements if x is a complex vector.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : s,d,c,z

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in x.

  • x[in] device pointer storing vector x.

  • incx[in] [int] specifies the increment for the elements of x. incx must be > 0.

  • result[inout] device pointer or host pointer to store the asum result. return is 0.0 if n <= 0.
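
The difference between the real and complex forms of asum can be shown with a CPU reference. This is a sketch only: `ref_sasum`, `ref_scasum`, and the `ref_complex` struct are hypothetical names introduced for illustration, and the functions return the sum by value instead of writing through a result pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for hipblasComplex. */
typedef struct { float re, im; } ref_complex;

static float abs_f(float v) { return v < 0.0f ? -v : v; }

/* Real case: sum of |x[i]| over n elements spaced incx apart. */
static float ref_sasum(int n, const float *x, int incx)
{
    float sum = 0.0f;
    if (n <= 0 || incx <= 0)
        return sum;
    assert(x != NULL);
    for (int i = 0; i < n; ++i)
        sum += abs_f(x[(size_t)i * (size_t)incx]);
    return sum;
}

/* Complex case: sum of |re| + |im|, giving a real-valued result. */
static float ref_scasum(int n, const ref_complex *x, int incx)
{
    float sum = 0.0f;
    if (n <= 0 || incx <= 0)
        return sum;
    assert(x != NULL);
    for (int i = 0; i < n; ++i) {
        const ref_complex *e = &x[(size_t)i * (size_t)incx];
        sum += abs_f(e->re) + abs_f(e->im);
    }
    return sum;
}
```

This also explains why hipblasScasum and hipblasDzasum take a complex input but write a float or double result.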

hipblasStatus_t hipblasSasumBatched(hipblasHandle_t handle, int n, const float *const x[], int incx, int batchCount, float *result)#
hipblasStatus_t hipblasDasumBatched(hipblasHandle_t handle, int n, const double *const x[], int incx, int batchCount, double *result)#
hipblasStatus_t hipblasScasumBatched(hipblasHandle_t handle, int n, const hipblasComplex *const x[], int incx, int batchCount, float *result)#
hipblasStatus_t hipblasDzasumBatched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *const x[], int incx, int batchCount, double *result)#

BLAS Level 1 API.

asumBatched computes the sum of the magnitudes of the elements in a batch of real vectors x_i, or the sum of magnitudes of the real and imaginary parts of elements if x_i is a complex vector, for i = 1, …, batchCount.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] number of elements in each vector x_i

  • x[in] device array of device pointers storing each vector x_i.

  • incx[in] [int] specifies the increment for the elements of each x_i. incx must be > 0.

  • batchCount[in] [int] number of instances in the batch.

  • result[out] device array or host array of batchCount size for results. return is 0.0 if n <= 0 or incx <= 0.

hipblasStatus_t hipblasSasumStridedBatched(hipblasHandle_t handle, int n, const float *x, int incx, hipblasStride stridex, int batchCount, float *result)#
hipblasStatus_t hipblasDasumStridedBatched(hipblasHandle_t handle, int n, const double *x, int incx, hipblasStride stridex, int batchCount, double *result)#
hipblasStatus_t hipblasScasumStridedBatched(hipblasHandle_t handle, int n, const hipblasComplex *x, int incx, hipblasStride stridex, int batchCount, float *result)#
hipblasStatus_t hipblasDzasumStridedBatched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *x, int incx, hipblasStride stridex, int batchCount, double *result)#

BLAS Level 1 API.

asumStridedBatched computes the sum of the magnitudes of the elements of each real vector x_i, or the sum of magnitudes of the real and imaginary parts of elements if x_i is a complex vector, for i = 1, …, batchCount.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] number of elements in each vector x_i

  • x[in] device pointer to the first vector x_1.

  • incx[in] [int] specifies the increment for the elements of each x_i. incx must be > 0.

  • stridex[in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1). There are no restrictions placed on stridex; however, the user should ensure that stridex is of appropriate size. For a typical case this means stridex >= n * incx.

  • batchCount[in] [int] number of instances in the batch

  • result[out] device pointer or host pointer to an array for storing contiguous batchCount results. return is 0.0 if n <= 0 or incx <= 0.
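
The role of stridex relative to n * incx is easiest to see in code. The following CPU sketch is illustrative only: `ref_sasum_strided_batched` is a hypothetical helper and `int64_t` stands in for hipblasStride.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static float abs_f(float v) { return v < 0.0f ? -v : v; }

/* Hypothetical CPU reference for hipblasSasumStridedBatched semantics.
 * Vector x_i occupies x[i*stridex], x[i*stridex + incx], ... so choosing
 * stridex >= n * incx keeps consecutive vectors from overlapping. */
static void ref_sasum_strided_batched(int n, const float *x, int incx,
                                      int64_t stridex, int batchCount,
                                      float *result)
{
    assert(result != NULL || batchCount <= 0);
    for (int b = 0; b < batchCount; ++b) {
        result[b] = 0.0f;
        if (n <= 0 || incx <= 0)
            continue;
        const float *xi = x + (int64_t)b * stridex;
        for (int i = 0; i < n; ++i)
            result[b] += abs_f(xi[(int64_t)i * incx]);
    }
}
```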

hipblasXaxpy + Batched, StridedBatched#

hipblasStatus_t hipblasHaxpy(hipblasHandle_t handle, int n, const hipblasHalf *alpha, const hipblasHalf *x, int incx, hipblasHalf *y, int incy)#
hipblasStatus_t hipblasSaxpy(hipblasHandle_t handle, int n, const float *alpha, const float *x, int incx, float *y, int incy)#
hipblasStatus_t hipblasDaxpy(hipblasHandle_t handle, int n, const double *alpha, const double *x, int incx, double *y, int incy)#
hipblasStatus_t hipblasCaxpy(hipblasHandle_t handle, int n, const hipblasComplex *alpha, const hipblasComplex *x, int incx, hipblasComplex *y, int incy)#
hipblasStatus_t hipblasZaxpy(hipblasHandle_t handle, int n, const hipblasDoubleComplex *alpha, const hipblasDoubleComplex *x, int incx, hipblasDoubleComplex *y, int incy)#

BLAS Level 1 API.

axpy computes constant alpha multiplied by vector x, plus vector y:

y := alpha * x + y

  • Supported precisions in rocBLAS : h,s,d,c,z

  • Supported precisions in cuBLAS : s,d,c,z

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in x and y.

  • alpha[in] device pointer or host pointer to specify the scalar alpha.

  • x[in] device pointer storing vector x.

  • incx[in] [int] specifies the increment for the elements of x.

  • y[inout] device pointer storing vector y.

  • incy[in] [int] specifies the increment for the elements of y.
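
A CPU reference makes the update y := alpha * x + y and the effect of the increments concrete. This is a sketch under stated assumptions: `ref_saxpy` is a hypothetical helper, alpha is passed by value rather than through a host or device pointer, and negative increments follow the reference-BLAS convention of traversing the vector backwards, which this document does not confirm for hipBLAS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference for hipblasSaxpy semantics.
 * For a negative increment, the first logical element is assumed to be
 * stored at offset (1 - n) * inc (reference-BLAS convention). */
static void ref_saxpy(int n, float alpha, const float *x, int incx,
                      float *y, int incy)
{
    if (n <= 0)
        return;
    assert(x != NULL && y != NULL);
    ptrdiff_t ix = incx >= 0 ? 0 : (ptrdiff_t)(1 - n) * incx;
    ptrdiff_t iy = incy >= 0 ? 0 : (ptrdiff_t)(1 - n) * incy;
    for (int i = 0; i < n; ++i, ix += incx, iy += incy)
        y[iy] = alpha * x[ix] + y[iy]; /* y is both read and written */
}
```

Because y is read before it is overwritten, it must hold valid data on entry.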

hipblasStatus_t hipblasHaxpyBatched(hipblasHandle_t handle, int n, const hipblasHalf *alpha, const hipblasHalf *const x[], int incx, hipblasHalf *const y[], int incy, int batchCount)#
hipblasStatus_t hipblasSaxpyBatched(hipblasHandle_t handle, int n, const float *alpha, const float *const x[], int incx, float *const y[], int incy, int batchCount)#
hipblasStatus_t hipblasDaxpyBatched(hipblasHandle_t handle, int n, const double *alpha, const double *const x[], int incx, double *const y[], int incy, int batchCount)#
hipblasStatus_t hipblasCaxpyBatched(hipblasHandle_t handle, int n, const hipblasComplex *alpha, const hipblasComplex *const x[], int incx, hipblasComplex *const y[], int incy, int batchCount)#
hipblasStatus_t hipblasZaxpyBatched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *alpha, const hipblasDoubleComplex *const x[], int incx, hipblasDoubleComplex *const y[], int incy, int batchCount)#

BLAS Level 1 API.

axpyBatched computes y := alpha * x + y over a set of batched vectors.

  • Supported precisions in rocBLAS : h,s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in x and y.

  • alpha[in] specifies the scalar alpha.

  • x[in] device array of device pointers storing each vector x_i.

  • incx[in] [int] specifies the increment for the elements of each x_i.

  • y[inout] device array of device pointers storing each vector y_i.

  • incy[in] [int] specifies the increment for the elements of each y_i.

  • batchCount[in] [int] number of instances in the batch
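
In the batched form, x and y are arrays of pointers, one entry per vector, and a single alpha is applied to every instance. A minimal CPU sketch of that layout follows. `ref_saxpy_batched` is a hypothetical helper, alpha is passed by value, and the sketch is restricted to positive increments for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference for hipblasSaxpyBatched semantics:
 * y_i := alpha * x_i + y_i for each instance i of the batch. */
static void ref_saxpy_batched(int n, float alpha,
                              const float *const x[], int incx,
                              float *const y[], int incy, int batchCount)
{
    if (n <= 0 || batchCount <= 0)
        return;
    assert(incx > 0 && incy > 0); /* sketch restriction, not a library rule */
    for (int b = 0; b < batchCount; ++b) {
        for (int i = 0; i < n; ++i) {
            size_t ix = (size_t)i * (size_t)incx;
            size_t iy = (size_t)i * (size_t)incy;
            y[b][iy] = alpha * x[b][ix] + y[b][iy];
        }
    }
}
```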

hipblasStatus_t hipblasHaxpyStridedBatched(hipblasHandle_t handle, int n, const hipblasHalf *alpha, const hipblasHalf *x, int incx, hipblasStride stridex, hipblasHalf *y, int incy, hipblasStride stridey, int batchCount)#
hipblasStatus_t hipblasSaxpyStridedBatched(hipblasHandle_t handle, int n, const float *alpha, const float *x, int incx, hipblasStride stridex, float *y, int incy, hipblasStride stridey, int batchCount)#
hipblasStatus_t hipblasDaxpyStridedBatched(hipblasHandle_t handle, int n, const double *alpha, const double *x, int incx, hipblasStride stridex, double *y, int incy, hipblasStride stridey, int batchCount)#
hipblasStatus_t hipblasCaxpyStridedBatched(hipblasHandle_t handle, int n, const hipblasComplex *alpha, const hipblasComplex *x, int incx, hipblasStride stridex, hipblasComplex *y, int incy, hipblasStride stridey, int batchCount)#
hipblasStatus_t hipblasZaxpyStridedBatched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *alpha, const hipblasDoubleComplex *x, int incx, hipblasStride stridex, hipblasDoubleComplex *y, int incy, hipblasStride stridey, int batchCount)#

BLAS Level 1 API.

axpyStridedBatched computes y := alpha * x + y over a set of strided batched vectors.

  • Supported precisions in rocBLAS : h,s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in each x_i and y_i.

  • alpha[in] specifies the scalar alpha.

  • x[in] device pointer to the first vector x_1.

  • incx[in] [int] specifies the increment for the elements of each x_i.

  • stridex[in] [hipblasStride] specifies the increment between vectors of x.

  • y[inout] device pointer to the first vector y_1.

  • incy[in] [int] specifies the increment for the elements of each y_i.

  • stridey[in] [hipblasStride] specifies the increment between vectors of y.

  • batchCount[in] [int] number of instances in the batch

hipblasXcopy + Batched, StridedBatched#

hipblasStatus_t hipblasScopy(hipblasHandle_t handle, int n, const float *x, int incx, float *y, int incy)#
hipblasStatus_t hipblasDcopy(hipblasHandle_t handle, int n, const double *x, int incx, double *y, int incy)#
hipblasStatus_t hipblasCcopy(hipblasHandle_t handle, int n, const hipblasComplex *x, int incx, hipblasComplex *y, int incy)#
hipblasStatus_t hipblasZcopy(hipblasHandle_t handle, int n, const hipblasDoubleComplex *x, int incx, hipblasDoubleComplex *y, int incy)#

BLAS Level 1 API.

copy copies each element x[i] into y[i], for i = 1, …, n:

y := x

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : s,d,c,z

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in x to be copied to y.

  • x[in] device pointer storing vector x.

  • incx[in] [int] specifies the increment for the elements of x.

  • y[out] device pointer storing vector y.

  • incy[in] [int] specifies the increment for the elements of y.
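
Because incx and incy are independent, copy can gather strided data into a dense vector or scatter a dense vector into strided storage. The CPU sketch below illustrates both. `ref_scopy` is a hypothetical helper and is restricted to positive increments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference for hipblasScopy semantics: the i-th
 * logical element of x is copied to the i-th logical element of y. */
static void ref_scopy(int n, const float *x, int incx, float *y, int incy)
{
    if (n <= 0)
        return;
    assert(x != NULL && y != NULL);
    assert(incx > 0 && incy > 0); /* sketch restriction, not a library rule */
    for (int i = 0; i < n; ++i)
        y[(size_t)i * (size_t)incy] = x[(size_t)i * (size_t)incx];
}
```

With incx = 2 and incy = 1 every second element of x is packed into y; swapping the increments spreads x across every second slot of y and leaves the slots in between untouched.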

hipblasStatus_t hipblasScopyBatched(hipblasHandle_t handle, int n, const float *const x[], int incx, float *const y[], int incy, int batchCount)#
hipblasStatus_t hipblasDcopyBatched(hipblasHandle_t handle, int n, const double *const x[], int incx, double *const y[], int incy, int batchCount)#
hipblasStatus_t hipblasCcopyBatched(hipblasHandle_t handle, int n, const hipblasComplex *const x[], int incx, hipblasComplex *const y[], int incy, int batchCount)#
hipblasStatus_t hipblasZcopyBatched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *const x[], int incx, hipblasDoubleComplex *const y[], int incy, int batchCount)#

BLAS Level 1 API.

copyBatched copies each element x_i[j] into y_i[j], for j = 1, …, n; i = 1, …, batchCount:

y_i := x_i,

where (x_i, y_i) is the i-th instance of the batch. x_i and y_i are vectors.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in each x_i to be copied to y_i.

  • x[in] device array of device pointers storing each vector x_i.

  • incx[in] [int] specifies the increment for the elements of each vector x_i.

  • y[out] device array of device pointers storing each vector y_i.

  • incy[in] [int] specifies the increment for the elements of each vector y_i.

  • batchCount[in] [int] number of instances in the batch

hipblasStatus_t hipblasScopyStridedBatched(hipblasHandle_t handle, int n, const float *x, int incx, hipblasStride stridex, float *y, int incy, hipblasStride stridey, int batchCount)#
hipblasStatus_t hipblasDcopyStridedBatched(hipblasHandle_t handle, int n, const double *x, int incx, hipblasStride stridex, double *y, int incy, hipblasStride stridey, int batchCount)#
hipblasStatus_t hipblasCcopyStridedBatched(hipblasHandle_t handle, int n, const hipblasComplex *x, int incx, hipblasStride stridex, hipblasComplex *y, int incy, hipblasStride stridey, int batchCount)#
hipblasStatus_t hipblasZcopyStridedBatched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *x, int incx, hipblasStride stridex, hipblasDoubleComplex *y, int incy, hipblasStride stridey, int batchCount)#

BLAS Level 1 API.

copyStridedBatched copies each element x_i[j] into y_i[j], for j = 1, …, n; i = 1, …, batchCount:

y_i := x_i,

where (x_i, y_i) is the i-th instance of the batch. x_i and y_i are vectors.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in each x_i to be copied to y_i.

  • x[in] device pointer to the first vector (x_1) in the batch.

  • incx[in] [int] specifies the increments for the elements of vectors x_i.

  • stridex[in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1). There are no restrictions placed on stridex; however, the user should ensure that stridex is of appropriate size. For a typical case this means stridex >= n * incx.

  • y[out] device pointer to the first vector (y_1) in the batch.

  • incy[in] [int] specifies the increment for the elements of vectors y_i.

  • stridey[in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_i+1). There are no restrictions placed on stridey; however, the user should ensure that stridey is of appropriate size. For a typical case this means stridey >= n * incy. stridey should be non-zero.

  • batchCount[in] [int] number of instances in the batch
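
Source and destination batches may use different strides, which allows repacking a batch while copying. The following CPU sketch shows the addressing. `ref_scopy_strided_batched` is a hypothetical helper, `int64_t` stands in for hipblasStride, and only positive increments are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU reference for hipblasScopyStridedBatched semantics.
 * x_i starts at x + i * stridex and y_i starts at y + i * stridey. */
static void ref_scopy_strided_batched(int n, const float *x, int incx,
                                      int64_t stridex, float *y, int incy,
                                      int64_t stridey, int batchCount)
{
    if (n <= 0 || batchCount <= 0)
        return;
    assert(x != NULL && y != NULL);
    assert(incx > 0 && incy > 0); /* sketch restriction, not a library rule */
    for (int b = 0; b < batchCount; ++b) {
        const float *xi = x + (int64_t)b * stridex;
        float *yi = y + (int64_t)b * stridey;
        for (int i = 0; i < n; ++i)
            yi[(int64_t)i * incy] = xi[(int64_t)i * incx];
    }
}
```

If stridey were zero, every y_i would alias the same memory and later instances would overwrite earlier ones, which is consistent with the non-zero requirement stated for stridey.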

hipblasXdot + Batched, StridedBatched#

hipblasStatus_t hipblasHdot(hipblasHandle_t handle, int n, const hipblasHalf *x, int incx, const hipblasHalf *y, int incy, hipblasHalf *result)#
hipblasStatus_t hipblasBfdot(hipblasHandle_t handle, int n, const hipblasBfloat16 *x, int incx, const hipblasBfloat16 *y, int incy, hipblasBfloat16 *result)#
hipblasStatus_t hipblasSdot(hipblasHandle_t handle, int n, const float *x, int incx, const float *y, int incy, float *result)#
hipblasStatus_t hipblasDdot(hipblasHandle_t handle, int n, const double *x, int incx, const double *y, int incy, double *result)#
hipblasStatus_t hipblasCdotc(hipblasHandle_t handle, int n, const hipblasComplex *x, int incx, const hipblasComplex *y, int incy, hipblasComplex *result)#
hipblasStatus_t hipblasCdotu(hipblasHandle_t handle, int n, const hipblasComplex *x, int incx, const hipblasComplex *y, int incy, hipblasComplex *result)#
hipblasStatus_t hipblasZdotc(hipblasHandle_t handle, int n, const hipblasDoubleComplex *x, int incx, const hipblasDoubleComplex *y, int incy, hipblasDoubleComplex *result)#
hipblasStatus_t hipblasZdotu(hipblasHandle_t handle, int n, const hipblasDoubleComplex *x, int incx, const hipblasDoubleComplex *y, int incy, hipblasDoubleComplex *result)#

BLAS Level 1 API.

dot(u) performs the dot product of vectors x and y:

result = x * y;

dotc performs the dot product of the conjugate of complex vector x and complex vector y:

result = conjugate (x) * y;

  • Supported precisions in rocBLAS : h,bf,s,d,c,z

  • Supported precisions in cuBLAS : s,d,c,z

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in x and y.

  • x[in] device pointer storing vector x.

  • incx[in] [int] specifies the increment for the elements of x.

  • y[in] device pointer storing vector y.

  • incy[in] [int] specifies the increment for the elements of y.

  • result[inout] device pointer or host pointer to store the dot product. return is 0.0 if n <= 0.
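
The difference between dotu and dotc only appears for complex data. The CPU sketch below contrasts the two. It is illustrative only: `ref_complex` is a hypothetical stand-in for hipblasComplex, `ref_sdot` and `ref_cdot` are not library functions, results are returned by value, and only positive increments are handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for hipblasComplex. */
typedef struct { float re, im; } ref_complex;

/* Real dot product: sum of x[i] * y[i]; 0.0 if n <= 0. */
static float ref_sdot(int n, const float *x, int incx,
                      const float *y, int incy)
{
    float acc = 0.0f;
    if (n <= 0)
        return acc;
    assert(incx > 0 && incy > 0); /* sketch restriction */
    for (int i = 0; i < n; ++i)
        acc += x[(size_t)i * (size_t)incx] * y[(size_t)i * (size_t)incy];
    return acc;
}

/* Complex dot product. conjugate_x != 0 gives dotc (conj(x) * y);
 * conjugate_x == 0 gives dotu (x * y). */
static ref_complex ref_cdot(int n, const ref_complex *x, int incx,
                            const ref_complex *y, int incy, int conjugate_x)
{
    ref_complex acc = {0.0f, 0.0f};
    if (n <= 0)
        return acc;
    assert(incx > 0 && incy > 0); /* sketch restriction */
    for (int i = 0; i < n; ++i) {
        ref_complex a = x[(size_t)i * (size_t)incx];
        ref_complex b = y[(size_t)i * (size_t)incy];
        if (conjugate_x)
            a.im = -a.im;
        acc.re += a.re * b.re - a.im * b.im;
        acc.im += a.re * b.im + a.im * b.re;
    }
    return acc;
}
```

For x = {1+2i, 3-i} and y = {2+i, 4i}, the unconjugated product is 4+17i while the conjugated product is 9i.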

hipblasStatus_t hipblasHdotBatched(hipblasHandle_t handle, int n, const hipblasHalf *const x[], int incx, const hipblasHalf *const y[], int incy, int batchCount, hipblasHalf *result)#
hipblasStatus_t hipblasBfdotBatched(hipblasHandle_t handle, int n, const hipblasBfloat16 *const x[], int incx, const hipblasBfloat16 *const y[], int incy, int batchCount, hipblasBfloat16 *result)#
hipblasStatus_t hipblasSdotBatched(hipblasHandle_t handle, int n, const float *const x[], int incx, const float *const y[], int incy, int batchCount, float *result)#
hipblasStatus_t hipblasDdotBatched(hipblasHandle_t handle, int n, const double *const x[], int incx, const double *const y[], int incy, int batchCount, double *result)#
hipblasStatus_t hipblasCdotcBatched(hipblasHandle_t handle, int n, const hipblasComplex *const x[], int incx, const hipblasComplex *const y[], int incy, int batchCount, hipblasComplex *result)#
hipblasStatus_t hipblasCdotuBatched(hipblasHandle_t handle, int n, const hipblasComplex *const x[], int incx, const hipblasComplex *const y[], int incy, int batchCount, hipblasComplex *result)#
hipblasStatus_t hipblasZdotcBatched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *const x[], int incx, const hipblasDoubleComplex *const y[], int incy, int batchCount, hipblasDoubleComplex *result)#
hipblasStatus_t hipblasZdotuBatched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *const x[], int incx, const hipblasDoubleComplex *const y[], int incy, int batchCount, hipblasDoubleComplex *result)#

BLAS Level 1 API.

dotBatched(u) performs a batch of dot products of vectors x and y:

result_i = x_i * y_i;

dotcBatched performs a batch of dot products of the conjugate of complex vector x and complex vector y:

result_i = conjugate (x_i) * y_i;

where (x_i, y_i) is the i-th instance of the batch. x_i and y_i are vectors, for i = 1, …, batchCount

  • Supported precisions in rocBLAS : h,bf,s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in each x_i and y_i.

  • x[in] device array of device pointers storing each vector x_i.

  • incx[in] [int] specifies the increment for the elements of each x_i.

  • y[in] device array of device pointers storing each vector y_i.

  • incy[in] [int] specifies the increment for the elements of each y_i.

  • batchCount[in] [int] number of instances in the batch

  • result[inout] device array or host array of batchCount size to store the dot products of each batch. return 0.0 for each element if n <= 0.
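
In the batched form each pair (x_i, y_i) produces one entry of the result array. A minimal CPU sketch for the real single-precision case follows. `ref_sdot_batched` is a hypothetical helper and is restricted to positive increments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference for hipblasSdotBatched semantics:
 * result[i] = sum over j of x_i[j] * y_i[j]; 0.0 for each i if n <= 0. */
static void ref_sdot_batched(int n, const float *const x[], int incx,
                             const float *const y[], int incy,
                             int batchCount, float *result)
{
    assert(result != NULL || batchCount <= 0);
    for (int b = 0; b < batchCount; ++b) {
        result[b] = 0.0f;
        if (n <= 0)
            continue;
        assert(incx > 0 && incy > 0); /* sketch restriction */
        for (int i = 0; i < n; ++i)
            result[b] += x[b][(size_t)i * (size_t)incx]
                       * y[b][(size_t)i * (size_t)incy];
    }
}
```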

hipblasStatus_t hipblasHdotStridedBatched(hipblasHandle_t handle, int n, const hipblasHalf *x, int incx, hipblasStride stridex, const hipblasHalf *y, int incy, hipblasStride stridey, int batchCount, hipblasHalf *result)#
hipblasStatus_t hipblasBfdotStridedBatched(hipblasHandle_t handle, int n, const hipblasBfloat16 *x, int incx, hipblasStride stridex, const hipblasBfloat16 *y, int incy, hipblasStride stridey, int batchCount, hipblasBfloat16 *result)#
hipblasStatus_t hipblasSdotStridedBatched(hipblasHandle_t handle, int n, const float *x, int incx, hipblasStride stridex, const float *y, int incy, hipblasStride stridey, int batchCount, float *result)#
hipblasStatus_t hipblasDdotStridedBatched(hipblasHandle_t handle, int n, const double *x, int incx, hipblasStride stridex, const double *y, int incy, hipblasStride stridey, int batchCount, double *result)#
hipblasStatus_t hipblasCdotcStridedBatched(hipblasHandle_t handle, int n, const hipblasComplex *x, int incx, hipblasStride stridex, const hipblasComplex *y, int incy, hipblasStride stridey, int batchCount, hipblasComplex *result)#
hipblasStatus_t hipblasCdotuStridedBatched(hipblasHandle_t handle, int n, const hipblasComplex *x, int incx, hipblasStride stridex, const hipblasComplex *y, int incy, hipblasStride stridey, int batchCount, hipblasComplex *result)#
hipblasStatus_t hipblasZdotcStridedBatched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *x, int incx, hipblasStride stridex, const hipblasDoubleComplex *y, int incy, hipblasStride stridey, int batchCount, hipblasDoubleComplex *result)#
hipblasStatus_t hipblasZdotuStridedBatched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *x, int incx, hipblasStride stridex, const hipblasDoubleComplex *y, int incy, hipblasStride stridey, int batchCount, hipblasDoubleComplex *result)#

BLAS Level 1 API.

dotStridedBatched(u) performs a batch of dot products of vectors x and y

result_i = x_i * y_i;
dotcStridedBatched performs a batch of dot products of the conjugate of complex vector x and complex vector y
result_i = conjugate (x_i) * y_i;
where (x_i, y_i) is the i-th instance of the batch. x_i and y_i are vectors, for i = 1, …, batchCount

  • Supported precisions in rocBLAS : h,bf,s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in each x_i and y_i.

  • x[in] device pointer to the first vector (x_1) in the batch.

  • incx[in] [int] specifies the increment for the elements of each x_i.

  • stridex[in] [hipblasStride] stride from the start of one vector (x_i) to the start of the next one (x_(i+1)).

  • y[in] device pointer to the first vector (y_1) in the batch.

  • incy[in] [int] specifies the increment for the elements of each y_i.

  • stridey[in] [hipblasStride] stride from the start of one vector (y_i) to the start of the next one (y_(i+1)).

  • batchCount[in] [int] number of instances in the batch

  • result[inout] device array or host array of batchCount size to store the dot products of each batch. return 0.0 for each element if n <= 0.
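The strided-batched addressing can be illustrated with a short host-side sketch: vector i of the batch starts at an offset of i times the stride from the base pointer. This is only a reference for the semantics, not the library's implementation; the helper name ref_sdot_strided_batched is hypothetical, and positive increments and host memory are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side reference sketch (hypothetical helper, not part of hipBLAS).
// Assumption: incx > 0 and incy > 0. The stride type mirrors hipblasStride (int64_t).
std::vector<float> ref_sdot_strided_batched(int n,
                                            const float* x, int incx, std::int64_t stridex,
                                            const float* y, int incy, std::int64_t stridey,
                                            int batchCount)
{
    assert(incx > 0 && incy > 0 && batchCount >= 0);
    std::vector<float> result(static_cast<std::size_t>(batchCount), 0.0f);
    if (n <= 0)
        return result; // documented behaviour: 0.0 for each element
    for (int b = 0; b < batchCount; ++b) {
        const float* xb = x + b * stridex; // start of x_(b+1) in 1-based notation
        const float* yb = y + b * stridey; // start of y_(b+1) in 1-based notation
        float acc = 0.0f;
        for (int k = 0; k < n; ++k)
            acc += xb[k * incx] * yb[k * incy];
        result[b] = acc;
    }
    return result;
}
```

For two contiguous vectors of length 3 stored back to back, stridex and stridey would both be 3.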

hipblasXnrm2 + Batched, StridedBatched#

hipblasStatus_t hipblasSnrm2(hipblasHandle_t handle, int n, const float *x, int incx, float *result)#
hipblasStatus_t hipblasDnrm2(hipblasHandle_t handle, int n, const double *x, int incx, double *result)#
hipblasStatus_t hipblasScnrm2(hipblasHandle_t handle, int n, const hipblasComplex *x, int incx, float *result)#
hipblasStatus_t hipblasDznrm2(hipblasHandle_t handle, int n, const hipblasDoubleComplex *x, int incx, double *result)#

BLAS Level 1 API.

nrm2 computes the Euclidean norm of a real or complex vector

      result := sqrt( x'*x ) for real vectors
      result := sqrt( x**H*x ) for complex vectors
  • Supported precisions in rocBLAS : s,d,c,z,sc,dz

  • Supported precisions in cuBLAS : s,d,sc,dz

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in x.

  • x[in] device pointer storing vector x.

  • incx[in] [int] specifies the increment for the elements of x.

  • result[inout] device pointer or host pointer to store the nrm2 result. The result is 0.0 if n <= 0 or incx <= 0.
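The two formulas above can be sketched on the host as follows. This is a plain reference for the semantics only, not the hipBLAS implementation; the helper names ref_dnrm2 and ref_dznrm2 are hypothetical. The sketch sums squares directly, whereas a production implementation would normally rescale to avoid overflow and underflow.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Host-side reference sketches (hypothetical helpers, not part of hipBLAS).

// Real case: sqrt( x' * x )
double ref_dnrm2(int n, const double* x, int incx)
{
    if (n <= 0 || incx <= 0)
        return 0.0; // documented behaviour for empty or invalid input
    assert(x != nullptr);
    double sum = 0.0;
    for (int k = 0; k < n; ++k)
        sum += x[k * incx] * x[k * incx];
    return std::sqrt(sum);
}

// Complex case: sqrt( x**H * x ); note the result is real
double ref_dznrm2(int n, const std::complex<double>* x, int incx)
{
    if (n <= 0 || incx <= 0)
        return 0.0;
    assert(x != nullptr);
    double sum = 0.0;
    for (int k = 0; k < n; ++k)
        sum += std::norm(x[k * incx]); // |x_k|^2 = conj(x_k) * x_k
    return std::sqrt(sum);
}
```

The complex variant returns a real value, which matches the float and double result types of the Scnrm2 and Dznrm2 signatures.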

hipblasStatus_t hipblasSnrm2Batched(hipblasHandle_t handle, int n, const float *const x[], int incx, int batchCount, float *result)#
hipblasStatus_t hipblasDnrm2Batched(hipblasHandle_t handle, int n, const double *const x[], int incx, int batchCount, double *result)#
hipblasStatus_t hipblasScnrm2Batched(hipblasHandle_t handle, int n, const hipblasComplex *const x[], int incx, int batchCount, float *result)#
hipblasStatus_t hipblasDznrm2Batched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *const x[], int incx, int batchCount, double *result)#

BLAS Level 1 API.

nrm2Batched computes the Euclidean norm over a batch of real or complex vectors

      result_i := sqrt( x_i'*x_i ) for real vectors x, for i = 1, ..., batchCount
      result_i := sqrt( x_i**H*x_i ) for complex vectors x, for i = 1, ..., batchCount
  • Supported precisions in rocBLAS : s,d,c,z,sc,dz

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] number of elements in each x_i.

  • x[in] device array of device pointers storing each vector x_i.

  • incx[in] [int] specifies the increment for the elements of each x_i. incx must be > 0.

  • batchCount[in] [int] number of instances in the batch

  • result[out] device pointer or host pointer to array of batchCount size for nrm2 results. Each element is 0.0 if n <= 0 or incx <= 0.

hipblasStatus_t hipblasSnrm2StridedBatched(hipblasHandle_t handle, int n, const float *x, int incx, hipblasStride stridex, int batchCount, float *result)#
hipblasStatus_t hipblasDnrm2StridedBatched(hipblasHandle_t handle, int n, const double *x, int incx, hipblasStride stridex, int batchCount, double *result)#
hipblasStatus_t hipblasScnrm2StridedBatched(hipblasHandle_t handle, int n, const hipblasComplex *x, int incx, hipblasStride stridex, int batchCount, float *result)#
hipblasStatus_t hipblasDznrm2StridedBatched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *x, int incx, hipblasStride stridex, int batchCount, double *result)#

BLAS Level 1 API.

nrm2StridedBatched computes the Euclidean norm over a batch of real or complex vectors

      result_i := sqrt( x_i'*x_i ) for real vectors x, for i = 1, ..., batchCount
      result_i := sqrt( x_i**H*x_i ) for complex vectors x, for i = 1, ..., batchCount
  • Supported precisions in rocBLAS : s,d,c,z,sc,dz

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] number of elements in each x_i.

  • x[in] device pointer to the first vector x_1.

  • incx[in] [int] specifies the increment for the elements of each x_i. incx must be > 0.

  • stridex[in] [hipblasStride] stride from the start of one vector (x_i) to the start of the next one (x_(i+1)). There are no restrictions placed on stridex; however, the user should take care to ensure that stridex is of appropriate size. For a typical case this means stridex >= n * incx.

  • batchCount[in] [int] number of instances in the batch

  • result[out] device pointer or host pointer to array for storing contiguous batchCount results. Each element is 0.0 if n <= 0 or incx <= 0.
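Since the library places no restriction on the stride, a caller may want to check the layout before allocating. The sketch below shows one way to test the typical-case condition from the stridex description and to compute how many elements the x buffer must hold. Both helper names are hypothetical and the code is independent of hipBLAS; it assumes a positive increment and a non-negative stride.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for reasoning about a strided-batched layout.

// True when the stride satisfies the typical-case condition stridex >= n * incx.
bool meets_typical_stride(int n, int incx, std::int64_t stridex)
{
    assert(incx > 0);
    return stridex >= static_cast<std::int64_t>(n) * incx;
}

// Number of elements spanned from the first element of the first vector to the
// last element of the last vector (inclusive); 0 for an empty batch.
std::int64_t strided_batch_extent(int n, int incx, std::int64_t stridex, int batchCount)
{
    assert(incx > 0 && stridex >= 0);
    if (n <= 0 || batchCount <= 0)
        return 0;
    return static_cast<std::int64_t>(batchCount - 1) * stridex
         + static_cast<std::int64_t>(n - 1) * incx + 1;
}
```

For example, with n = 3, incx = 2, stridex = 6 and batchCount = 4, the last element touched sits at index 22, so the buffer needs at least 23 elements.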

hipblasXrot + Batched, StridedBatched#

hipblasStatus_t hipblasSrot(hipblasHandle_t handle, int n, float *x, int incx, float *y, int incy, const float *c, const float *s)#
hipblasStatus_t hipblasDrot(hipblasHandle_t handle, int n, double *x, int incx, double *y, int incy, const double *c, const double *s)#
hipblasStatus_t hipblasCrot(hipblasHandle_t handle, int n, hipblasComplex *x, int incx, hipblasComplex *y, int incy, const float *c, const hipblasComplex *s)#
hipblasStatus_t hipblasCsrot(hipblasHandle_t handle, int n, hipblasComplex *x, int incx, hipblasComplex *y, int incy, const float *c, const float *s)#
hipblasStatus_t hipblasZrot(hipblasHandle_t handle, int n, hipblasDoubleComplex *x, int incx, hipblasDoubleComplex *y, int incy, const double *c, const hipblasDoubleComplex *s)#
hipblasStatus_t hipblasZdrot(hipblasHandle_t handle, int n, hipblasDoubleComplex *x, int incx, hipblasDoubleComplex *y, int incy, const double *c, const double *s)#

BLAS Level 1 API.

rot applies the Givens rotation matrix defined by c=cos(alpha) and s=sin(alpha) to vectors x and y. Scalars c and s may be stored in either host or device memory, location is specified by calling hipblasSetPointerMode.

  • Supported precisions in rocBLAS : s,d,c,z,cs,zd

  • Supported precisions in cuBLAS : s,d,c,z,cs,zd

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] number of elements in the x and y vectors.

  • x[inout] device pointer storing vector x.

  • incx[in] [int] specifies the increment between elements of x.

  • y[inout] device pointer storing vector y.

  • incy[in] [int] specifies the increment between elements of y.

  • c[in] device pointer or host pointer storing scalar cosine component of the rotation matrix.

  • s[in] device pointer or host pointer storing scalar sine component of the rotation matrix.
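As an illustration of what applying the rotation means element by element, here is a minimal host-side sketch for the real double-precision case. It assumes the sign convention of the reference BLAS (x := c*x + s*y, y := c*y - s*x), which this page does not spell out, and positive increments. The helper name ref_drot is hypothetical and the code does not call hipBLAS.

```cpp
#include <cassert>

// Host-side reference sketch (hypothetical helper, not part of hipBLAS).
// Assumption: reference-BLAS convention
//   x_k := c * x_k + s * y_k
//   y_k := c * y_k - s * x_k
// with incx > 0 and incy > 0.
void ref_drot(int n, double* x, int incx, double* y, int incy, double c, double s)
{
    assert(incx > 0 && incy > 0);
    for (int k = 0; k < n; ++k) {
        const double xk = x[k * incx]; // keep the old values: both updates use them
        const double yk = y[k * incy];
        x[k * incx] = c * xk + s * yk;
        y[k * incy] = c * yk - s * xk;
    }
}
```

Both vectors are updated in place, which is why x and y are marked [inout] in the parameter list.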

hipblasStatus_t hipblasSrotBatched(hipblasHandle_t handle, int n, float *const x[], int incx, float *const y[], int incy, const float *c, const float *s, int batchCount)#
hipblasStatus_t hipblasDrotBatched(hipblasHandle_t handle, int n, double *const x[], int incx, double *const y[], int incy, const double *c, const double *s, int batchCount)#
hipblasStatus_t hipblasCrotBatched(hipblasHandle_t handle, int n, hipblasComplex *const x[], int incx, hipblasComplex *const y[], int incy, const float *c, const hipblasComplex *s, int batchCount)#
hipblasStatus_t hipblasCsrotBatched(hipblasHandle_t handle, int n, hipblasComplex *const x[], int incx, hipblasComplex *const y[], int incy, const float *c, const float *s, int batchCount)#
hipblasStatus_t hipblasZrotBatched(hipblasHandle_t handle, int n, hipblasDoubleComplex *const x[], int incx, hipblasDoubleComplex *const y[], int incy, const double *c, const hipblasDoubleComplex *s, int batchCount)#
hipblasStatus_t hipblasZdrotBatched(hipblasHandle_t handle, int n, hipblasDoubleComplex *const x[], int incx, hipblasDoubleComplex *const y[], int incy, const double *c, const double *s, int batchCount)#

BLAS Level 1 API.

rotBatched applies the Givens rotation matrix defined by c=cos(alpha) and s=sin(alpha) to batched vectors x_i and y_i, for i = 1, …, batchCount. Scalars c and s may be stored in either host or device memory, location is specified by calling hipblasSetPointerMode.

  • Supported precisions in rocBLAS : s,d,sc,dz

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] number of elements in each x_i and y_i vectors.

  • x[inout] device array of device pointers storing each vector x_i.

  • incx[in] [int] specifies the increment between elements of each x_i.

  • y[inout] device array of device pointers storing each vector y_i.

  • incy[in] [int] specifies the increment between elements of each y_i.

  • c[in] device pointer or host pointer to scalar cosine component of the rotation matrix.

  • s[in] device pointer or host pointer to scalar sine component of the rotation matrix.

  • batchCount[in] [int] the number of x and y arrays, i.e. the number of batches.

hipblasStatus_t hipblasSrotStridedBatched(hipblasHandle_t handle, int n, float *x, int incx, hipblasStride stridex, float *y, int incy, hipblasStride stridey, const float *c, const float *s, int batchCount)#
hipblasStatus_t hipblasDrotStridedBatched(hipblasHandle_t handle, int n, double *x, int incx, hipblasStride stridex, double *y, int incy, hipblasStride stridey, const double *c, const double *s, int batchCount)#
hipblasStatus_t hipblasCrotStridedBatched(hipblasHandle_t handle, int n, hipblasComplex *x, int incx, hipblasStride stridex, hipblasComplex *y, int incy, hipblasStride stridey, const float *c, const hipblasComplex *s, int batchCount)#
hipblasStatus_t hipblasCsrotStridedBatched(hipblasHandle_t handle, int n, hipblasComplex *x, int incx, hipblasStride stridex, hipblasComplex *y, int incy, hipblasStride stridey, const float *c, const float *s, int batchCount)#
hipblasStatus_t hipblasZrotStridedBatched(hipblasHandle_t handle, int n, hipblasDoubleComplex *x, int incx, hipblasStride stridex, hipblasDoubleComplex *y, int incy, hipblasStride stridey, const double *c, const hipblasDoubleComplex *s, int batchCount)#
hipblasStatus_t hipblasZdrotStridedBatched(hipblasHandle_t handle, int n, hipblasDoubleComplex *x, int incx, hipblasStride stridex, hipblasDoubleComplex *y, int incy, hipblasStride stridey, const double *c, const double *s, int batchCount)#

BLAS Level 1 API.

rotStridedBatched applies the Givens rotation matrix defined by c=cos(alpha) and s=sin(alpha) to strided batched vectors x_i and y_i, for i = 1, …, batchCount. Scalars c and s may be stored in either host or device memory, location is specified by calling hipblasSetPointerMode.

  • Supported precisions in rocBLAS : s,d,sc,dz

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] number of elements in each x_i and y_i vectors.

  • x[inout] device pointer to the first vector x_1.

  • incx[in] [int] specifies the increment between elements of each x_i.

  • stridex[in] [hipblasStride] specifies the increment from the beginning of x_i to the beginning of x_(i+1)

  • y[inout] device pointer to the first vector y_1.

  • incy[in] [int] specifies the increment between elements of each y_i.

  • stridey[in] [hipblasStride] specifies the increment from the beginning of y_i to the beginning of y_(i+1)

  • c[in] device pointer or host pointer to scalar cosine component of the rotation matrix.

  • s[in] device pointer or host pointer to scalar sine component of the rotation matrix.

  • batchCount[in] [int] the number of x and y arrays, i.e. the number of batches.

hipblasXrotg + Batched, StridedBatched#

hipblasStatus_t hipblasSrotg(hipblasHandle_t handle, float *a, float *b, float *c, float *s)#
hipblasStatus_t hipblasDrotg(hipblasHandle_t handle, double *a, double *b, double *c, double *s)#
hipblasStatus_t hipblasCrotg(hipblasHandle_t handle, hipblasComplex *a, hipblasComplex *b, float *c, hipblasComplex *s)#
hipblasStatus_t hipblasZrotg(hipblasHandle_t handle, hipblasDoubleComplex *a, hipblasDoubleComplex *b, double *c, hipblasDoubleComplex *s)#

BLAS Level 1 API.

rotg creates the Givens rotation matrix for the vector (a b). Scalars c and s and arrays a and b may be stored in either host or device memory, location is specified by calling hipblasSetPointerMode. If the pointer mode is set to HIPBLAS_POINTER_MODE_HOST, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to HIPBLAS_POINTER_MODE_DEVICE, this function returns immediately and synchronization is required to read the results.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : s,d,c,z

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • a[inout] device pointer or host pointer to input vector element, overwritten with r.

  • b[inout] device pointer or host pointer to input vector element, overwritten with z.

  • c[inout] device pointer or host pointer to cosine element of Givens rotation.

  • s[inout] device pointer or host pointer to sine element of Givens rotation.
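The idea behind rotg can be sketched on the host for the real case: find c and s such that rotating (a b) leaves r in the first component and zero in the second. The version below is deliberately simplified and is not the library's algorithm: it always picks r >= 0 and does not compute the value z that the real routine writes to b. The struct and helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type for the sketch below.
struct GivensRotation {
    double r; // length of (a b)
    double c; // cosine component
    double s; // sine component
};

// Simplified host-side sketch (not part of hipBLAS): chooses r >= 0 and
// omits the reconstruction value z produced by the real rotg routines.
GivensRotation ref_make_givens(double a, double b)
{
    assert(std::isfinite(a) && std::isfinite(b));
    const double r = std::hypot(a, b);
    if (r == 0.0)
        return {0.0, 1.0, 0.0}; // nothing to rotate: use the identity
    return {r, a / r, b / r};
}
```

With these values, c*a + s*b equals r and -s*a + c*b equals zero up to rounding, which is the property a Givens rotation is built to provide.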

hipblasStatus_t hipblasSrotgBatched(hipblasHandle_t handle, float *const a[], float *const b[], float *const c[], float *const s[], int batchCount)#
hipblasStatus_t hipblasDrotgBatched(hipblasHandle_t handle, double *const a[], double *const b[], double *const c[], double *const s[], int batchCount)#
hipblasStatus_t hipblasCrotgBatched(hipblasHandle_t handle, hipblasComplex *const a[], hipblasComplex *const b[], float *const c[], hipblasComplex *const s[], int batchCount)#
hipblasStatus_t hipblasZrotgBatched(hipblasHandle_t handle, hipblasDoubleComplex *const a[], hipblasDoubleComplex *const b[], double *const c[], hipblasDoubleComplex *const s[], int batchCount)#

BLAS Level 1 API.

rotgBatched creates the Givens rotation matrix for the batched vectors (a_i b_i), for i = 1, …, batchCount. a, b, c, and s may be stored in either host or device memory, location is specified by calling hipblasSetPointerMode. If the pointer mode is set to HIPBLAS_POINTER_MODE_HOST, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to HIPBLAS_POINTER_MODE_DEVICE, this function returns immediately and synchronization is required to read the results.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • a[inout] device array of device pointers storing each single input vector element a_i, overwritten with r_i.

  • b[inout] device array of device pointers storing each single input vector element b_i, overwritten with z_i.

  • c[inout] device array of device pointers storing each cosine element of Givens rotation for the batch.

  • s[inout] device array of device pointers storing each sine element of Givens rotation for the batch.

  • batchCount[in] [int] number of batches (length of arrays a, b, c, and s).

hipblasStatus_t hipblasSrotgStridedBatched(hipblasHandle_t handle, float *a, hipblasStride stridea, float *b, hipblasStride strideb, float *c, hipblasStride stridec, float *s, hipblasStride strides, int batchCount)#
hipblasStatus_t hipblasDrotgStridedBatched(hipblasHandle_t handle, double *a, hipblasStride stridea, double *b, hipblasStride strideb, double *c, hipblasStride stridec, double *s, hipblasStride strides, int batchCount)#
hipblasStatus_t hipblasCrotgStridedBatched(hipblasHandle_t handle, hipblasComplex *a, hipblasStride stridea, hipblasComplex *b, hipblasStride strideb, float *c, hipblasStride stridec, hipblasComplex *s, hipblasStride strides, int batchCount)#
hipblasStatus_t hipblasZrotgStridedBatched(hipblasHandle_t handle, hipblasDoubleComplex *a, hipblasStride stridea, hipblasDoubleComplex *b, hipblasStride strideb, double *c, hipblasStride stridec, hipblasDoubleComplex *s, hipblasStride strides, int batchCount)#

BLAS Level 1 API.

rotgStridedBatched creates the Givens rotation matrix for the strided batched vectors (a_i b_i), for i = 1, …, batchCount. a, b, c, and s may be stored in either host or device memory, location is specified by calling hipblasSetPointerMode. If the pointer mode is set to HIPBLAS_POINTER_MODE_HOST, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to HIPBLAS_POINTER_MODE_DEVICE, this function returns immediately and synchronization is required to read the results.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • a[inout] device strided_batched pointer or host strided_batched pointer to first single input vector element a_1, overwritten with r.

  • stridea[in] [hipblasStride] distance between elements of a in batch (distance between a_i and a_(i + 1))

  • b[inout] device strided_batched pointer or host strided_batched pointer to first single input vector element b_1, overwritten with z.

  • strideb[in] [hipblasStride] distance between elements of b in batch (distance between b_i and b_(i + 1))

  • c[inout] device strided_batched pointer or host strided_batched pointer to first cosine element of Givens rotations c_1.

  • stridec[in] [hipblasStride] distance between elements of c in batch (distance between c_i and c_(i + 1))

  • s[inout] device strided_batched pointer or host strided_batched pointer to first sine element of Givens rotations s_1.

  • strides[in] [hipblasStride] distance between elements of s in batch (distance between s_i and s_(i + 1))

  • batchCount[in] [int] number of batches (length of arrays a, b, c, and s).

hipblasXrotm + Batched, StridedBatched#

hipblasStatus_t hipblasSrotm(hipblasHandle_t handle, int n, float *x, int incx, float *y, int incy, const float *param)#
hipblasStatus_t hipblasDrotm(hipblasHandle_t handle, int n, double *x, int incx, double *y, int incy, const double *param)#

BLAS Level 1 API.

rotm applies the modified Givens rotation matrix defined by param to vectors x and y.

  • Supported precisions in rocBLAS : s,d

  • Supported precisions in cuBLAS : s,d

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] number of elements in the x and y vectors.

  • x[inout] device pointer storing vector x.

  • incx[in] [int] specifies the increment between elements of x.

  • y[inout] device pointer storing vector y.

  • incy[in] [int] specifies the increment between elements of y.

  • param[in] device vector or host vector of 5 elements defining the rotation.
      param[0] = flag
      param[1] = H11
      param[2] = H21
      param[3] = H12
      param[4] = H22
    The flag parameter defines the form of H (rows of the 2x2 matrix are separated by ";"):
      flag = -1 => H = ( H11 H12 ; H21 H22 )
      flag =  0 => H = ( 1.0 H12 ; H21 1.0 )
      flag =  1 => H = ( H11 1.0 ; -1.0 H22 )
      flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
    param may be stored in either host or device memory; the location is specified by calling hipblasSetPointerMode.
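The way the flag selects the form of H, and the order in which the entries of H are stored in param, is easy to get wrong, so a small host-side sketch may help. It decodes param as listed above and applies H to each pair (x_k, y_k) as x_k := H11*x_k + H12*y_k and y_k := H21*x_k + H22*y_k. The helper name ref_drotm is hypothetical, positive increments are assumed, and the code does not call hipBLAS.

```cpp
#include <cassert>

// Host-side reference sketch (hypothetical helper, not part of hipBLAS).
// param layout: [flag, H11, H21, H12, H22]. Assumption: incx > 0 and incy > 0.
void ref_drotm(int n, double* x, int incx, double* y, int incy, const double param[5])
{
    const double flag = param[0];
    assert(flag == -2.0 || flag == -1.0 || flag == 0.0 || flag == 1.0);
    assert(incx > 0 && incy > 0);
    if (flag == -2.0)
        return; // H is the identity: nothing to do

    double h11, h12, h21, h22;
    if (flag == -1.0) {        // full matrix taken from param
        h11 = param[1]; h21 = param[2]; h12 = param[3]; h22 = param[4];
    } else if (flag == 0.0) {  // unit diagonal, off-diagonal from param
        h11 = 1.0; h21 = param[2]; h12 = param[3]; h22 = 1.0;
    } else {                   // flag == 1: fixed off-diagonal, diagonal from param
        h11 = param[1]; h21 = -1.0; h12 = 1.0; h22 = param[4];
    }

    for (int k = 0; k < n; ++k) {
        const double xk = x[k * incx];
        const double yk = y[k * incy];
        x[k * incx] = h11 * xk + h12 * yk;
        y[k * incy] = h21 * xk + h22 * yk;
    }
}
```

Entries of param that a given flag does not use are ignored by the sketch, which is why they may hold arbitrary values in that case.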

hipblasStatus_t hipblasSrotmBatched(hipblasHandle_t handle, int n, float *const x[], int incx, float *const y[], int incy, const float *const param[], int batchCount)#
hipblasStatus_t hipblasDrotmBatched(hipblasHandle_t handle, int n, double *const x[], int incx, double *const y[], int incy, const double *const param[], int batchCount)#

BLAS Level 1 API.

rotmBatched applies the modified Givens rotation matrix defined by param_i to batched vectors x_i and y_i, for i = 1, …, batchCount.

  • Supported precisions in rocBLAS : s,d

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] number of elements in the x and y vectors.

  • x[inout] device array of device pointers storing each vector x_i.

  • incx[in] [int] specifies the increment between elements of each x_i.

  • y[inout] device array of device pointers storing each vector y_i.

  • incy[in] [int] specifies the increment between elements of each y_i.

  • param[in] device array of device vectors of 5 elements defining the rotation.
      param[0] = flag
      param[1] = H11
      param[2] = H21
      param[3] = H12
      param[4] = H22
    The flag parameter defines the form of H (rows of the 2x2 matrix are separated by ";"):
      flag = -1 => H = ( H11 H12 ; H21 H22 )
      flag =  0 => H = ( 1.0 H12 ; H21 1.0 )
      flag =  1 => H = ( H11 1.0 ; -1.0 H22 )
      flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
    param may ONLY be stored on the device for the batched version of this function.

  • batchCount[in] [int] the number of x and y arrays, i.e. the number of batches.

hipblasStatus_t hipblasSrotmStridedBatched(hipblasHandle_t handle, int n, float *x, int incx, hipblasStride stridex, float *y, int incy, hipblasStride stridey, const float *param, hipblasStride strideParam, int batchCount)#
hipblasStatus_t hipblasDrotmStridedBatched(hipblasHandle_t handle, int n, double *x, int incx, hipblasStride stridex, double *y, int incy, hipblasStride stridey, const double *param, hipblasStride strideParam, int batchCount)#

BLAS Level 1 API.

rotmStridedBatched applies the modified Givens rotation matrix defined by param_i to strided batched vectors x_i and y_i, for i = 1, …, batchCount

  • Supported precisions in rocBLAS : s,d

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] number of elements in the x and y vectors.

  • x[inout] device pointer pointing to first strided batched vector x_1.

  • incx[in] [int] specifies the increment between elements of each x_i.

  • stridex[in] [hipblasStride] specifies the increment between the beginning of x_i and x_(i + 1)

  • y[inout] device pointer pointing to first strided batched vector y_1.

  • incy[in] [int] specifies the increment between elements of each y_i.

  • stridey[in] [hipblasStride] specifies the increment between the beginning of y_i and y_(i + 1)

  • param[in] device pointer pointing to first array of 5 elements defining the rotation (param_1).
      param[0] = flag
      param[1] = H11
      param[2] = H21
      param[3] = H12
      param[4] = H22
    The flag parameter defines the form of H (rows of the 2x2 matrix are separated by ";"):
      flag = -1 => H = ( H11 H12 ; H21 H22 )
      flag =  0 => H = ( 1.0 H12 ; H21 1.0 )
      flag =  1 => H = ( H11 1.0 ; -1.0 H22 )
      flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
    param may ONLY be stored on the device for the strided_batched version of this function.

  • strideParam[in] [hipblasStride] specifies the increment between the beginning of param_i and param_(i + 1)

  • batchCount[in] [int] the number of x and y arrays, i.e. the number of batches.

hipblasXrotmg + Batched, StridedBatched#

hipblasStatus_t hipblasSrotmg(hipblasHandle_t handle, float *d1, float *d2, float *x1, const float *y1, float *param)#
hipblasStatus_t hipblasDrotmg(hipblasHandle_t handle, double *d1, double *d2, double *x1, const double *y1, double *param)#

BLAS Level 1 API.

rotmg creates the modified Givens rotation matrix for the vector (d1 * x1, d2 * y1). Parameters may be stored in either host or device memory, location is specified by calling hipblasSetPointerMode. If the pointer mode is set to HIPBLAS_POINTER_MODE_HOST, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to HIPBLAS_POINTER_MODE_DEVICE, this function returns immediately and synchronization is required to read the results.

  • Supported precisions in rocBLAS : s,d

  • Supported precisions in cuBLAS : s,d

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • d1[inout] device pointer or host pointer to input scalar that is overwritten.

  • d2[inout] device pointer or host pointer to input scalar that is overwritten.

  • x1[inout] device pointer or host pointer to input scalar that is overwritten.

  • y1[in] device pointer or host pointer to input scalar.

  • param[out] device vector or host vector of 5 elements defining the rotation.
      param[0] = flag
      param[1] = H11
      param[2] = H21
      param[3] = H12
      param[4] = H22
    The flag parameter defines the form of H (rows of the 2x2 matrix are separated by ";"):
      flag = -1 => H = ( H11 H12 ; H21 H22 )
      flag =  0 => H = ( 1.0 H12 ; H21 1.0 )
      flag =  1 => H = ( H11 1.0 ; -1.0 H22 )
      flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
    param may be stored in either host or device memory; the location is specified by calling hipblasSetPointerMode.

hipblasStatus_t hipblasSrotmgBatched(hipblasHandle_t handle, float *const d1[], float *const d2[], float *const x1[], const float *const y1[], float *const param[], int batchCount)#
hipblasStatus_t hipblasDrotmgBatched(hipblasHandle_t handle, double *const d1[], double *const d2[], double *const x1[], const double *const y1[], double *const param[], int batchCount)#

BLAS Level 1 API.

rotmgBatched creates the modified Givens rotation matrix for the batched vectors (d1_i * x1_i, d2_i * y1_i), for i = 1, …, batchCount. Parameters may be stored in either host or device memory, location is specified by calling hipblasSetPointerMode. If the pointer mode is set to HIPBLAS_POINTER_MODE_HOST, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to HIPBLAS_POINTER_MODE_DEVICE, this function returns immediately and synchronization is required to read the results.

  • Supported precisions in rocBLAS : s,d

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • d1[inout] device batched array or host batched array of input scalars that is overwritten.

  • d2[inout] device batched array or host batched array of input scalars that is overwritten.

  • x1[inout] device batched array or host batched array of input scalars that is overwritten.

  • y1[in] device batched array or host batched array of input scalars.

  • param[out] device batched array or host batched array of vectors of 5 elements defining the rotation.
      param[0] = flag
      param[1] = H11
      param[2] = H21
      param[3] = H12
      param[4] = H22
    The flag parameter defines the form of H (rows of the 2x2 matrix are separated by ";"):
      flag = -1 => H = ( H11 H12 ; H21 H22 )
      flag =  0 => H = ( 1.0 H12 ; H21 1.0 )
      flag =  1 => H = ( H11 1.0 ; -1.0 H22 )
      flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
    param may be stored in either host or device memory; the location is specified by calling hipblasSetPointerMode.

  • batchCount[in] [int] the number of instances in the batch.

hipblasStatus_t hipblasSrotmgStridedBatched(hipblasHandle_t handle, float *d1, hipblasStride strided1, float *d2, hipblasStride strided2, float *x1, hipblasStride stridex1, const float *y1, hipblasStride stridey1, float *param, hipblasStride strideParam, int batchCount)#
hipblasStatus_t hipblasDrotmgStridedBatched(hipblasHandle_t handle, double *d1, hipblasStride strided1, double *d2, hipblasStride strided2, double *x1, hipblasStride stridex1, const double *y1, hipblasStride stridey1, double *param, hipblasStride strideParam, int batchCount)#

BLAS Level 1 API.

rotmgStridedBatched creates the modified Givens rotation matrix for the strided batched vectors (d1_i * x1_i, d2_i * y1_i), for i = 1, …, batchCount. Parameters may be stored in either host or device memory, location is specified by calling hipblasSetPointerMode. If the pointer mode is set to HIPBLAS_POINTER_MODE_HOST, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to HIPBLAS_POINTER_MODE_DEVICE, this function returns immediately and synchronization is required to read the results.

  • Supported precisions in rocBLAS : s,d

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • d1[inout] device strided_batched array or host strided_batched array of input scalars that is overwritten.

  • strided1[in] [hipblasStride] specifies the increment between the beginning of d1_i and d1_(i+1)

  • d2[inout] device strided_batched array or host strided_batched array of input scalars that is overwritten.

  • strided2[in] [hipblasStride] specifies the increment between the beginning of d2_i and d2_(i+1)

  • x1[inout] device strided_batched array or host strided_batched array of input scalars that is overwritten.

  • stridex1[in] [hipblasStride] specifies the increment between the beginning of x1_i and x1_(i+1)

  • y1[in] device strided_batched array or host strided_batched array of input scalars.

  • stridey1[in] [hipblasStride] specifies the increment between the beginning of y1_i and y1_(i+1)

  • param[out] device stridedBatched array or host stridedBatched array of vectors of 5 elements defining the rotation.
      param[0] = flag
      param[1] = H11
      param[2] = H21
      param[3] = H12
      param[4] = H22
    The flag parameter defines the form of H (rows of the 2x2 matrix are separated by ";"):
      flag = -1 => H = ( H11 H12 ; H21 H22 )
      flag =  0 => H = ( 1.0 H12 ; H21 1.0 )
      flag =  1 => H = ( H11 1.0 ; -1.0 H22 )
      flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
    param may be stored in either host or device memory; the location is specified by calling hipblasSetPointerMode.

  • strideParam[in] [hipblasStride] specifies the increment between the beginning of param_i and param_(i + 1)

  • batchCount[in] [int] the number of instances in the batch.

hipblasXscal + Batched, StridedBatched#

hipblasStatus_t hipblasSscal(hipblasHandle_t handle, int n, const float *alpha, float *x, int incx)#
hipblasStatus_t hipblasDscal(hipblasHandle_t handle, int n, const double *alpha, double *x, int incx)#
hipblasStatus_t hipblasCscal(hipblasHandle_t handle, int n, const hipblasComplex *alpha, hipblasComplex *x, int incx)#
hipblasStatus_t hipblasCsscal(hipblasHandle_t handle, int n, const float *alpha, hipblasComplex *x, int incx)#
hipblasStatus_t hipblasZscal(hipblasHandle_t handle, int n, const hipblasDoubleComplex *alpha, hipblasDoubleComplex *x, int incx)#
hipblasStatus_t hipblasZdscal(hipblasHandle_t handle, int n, const double *alpha, hipblasDoubleComplex *x, int incx)#

BLAS Level 1 API.

scal scales each element of vector x with scalar alpha.

x := alpha * x
  • Supported precisions in rocBLAS : s,d,c,z,cs,zd

  • Supported precisions in cuBLAS : s,d,c,z,cs,zd

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in x.

  • alpha[in] device pointer or host pointer for the scalar alpha.

  • x[inout] device pointer storing vector x.

  • incx[in] [int] specifies the increment for the elements of x.
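
As a rough illustration of how n and incx interact, the following host-side sketch applies x := alpha * x to every incx-th element. The name ref_scal is hypothetical, this is not the library implementation, and the early return for non-positive n or incx is an assumption of the sketch.

```cpp
#include <cassert>

// CPU reference sketch of scal (hypothetical helper, not part of hipBLAS).
// T is the element type of x and S the type of alpha, so a real alpha can
// scale a complex vector, as in the cs and zd variants.
template <typename T, typename S>
void ref_scal(int n, S alpha, T* x, int incx)
{
    if(n <= 0 || incx <= 0) return; // assumption: nothing to do
    assert(x != nullptr);
    for(int i = 0; i < n; ++i)
        x[static_cast<long long>(i) * incx] *= alpha; // touches x[0], x[incx], ...
}
```

With n = 3 and incx = 2, only the elements at indices 0, 2, and 4 are scaled; the elements in between are left unchanged.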

hipblasStatus_t hipblasSscalBatched(hipblasHandle_t handle, int n, const float *alpha, float *const x[], int incx, int batchCount)#
hipblasStatus_t hipblasDscalBatched(hipblasHandle_t handle, int n, const double *alpha, double *const x[], int incx, int batchCount)#
hipblasStatus_t hipblasCscalBatched(hipblasHandle_t handle, int n, const hipblasComplex *alpha, hipblasComplex *const x[], int incx, int batchCount)#
hipblasStatus_t hipblasZscalBatched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *alpha, hipblasDoubleComplex *const x[], int incx, int batchCount)#
hipblasStatus_t hipblasCsscalBatched(hipblasHandle_t handle, int n, const float *alpha, hipblasComplex *const x[], int incx, int batchCount)#
hipblasStatus_t hipblasZdscalBatched(hipblasHandle_t handle, int n, const double *alpha, hipblasDoubleComplex *const x[], int incx, int batchCount)#

BLAS Level 1 API.

scalBatched scales each element of vector x_i with scalar alpha, for i = 1, … , batchCount.

 x_i := alpha * x_i
where (x_i) is the i-th instance of the batch.

  • Supported precisions in rocBLAS : s,d,c,z,cs,zd

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in each x_i.

  • alpha[in] host pointer or device pointer for the scalar alpha.

  • x[inout] device array of device pointers storing each vector x_i.

  • incx[in] [int] specifies the increment for the elements of each x_i.

  • batchCount[in] [int] specifies the number of instances in the batch.
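
The batched layout passes x as an array of pointers, one pointer per vector x_i. A minimal host-side sketch of that layout, assuming all pointers are host pointers and using the hypothetical name ref_scal_batched, could look like this:

```cpp
#include <cassert>

// CPU reference sketch of scalBatched (hypothetical helper, not part of
// hipBLAS). x is an array of batchCount pointers; x[b] points to vector x_b.
template <typename T>
void ref_scal_batched(int n, T alpha, T* const x[], int incx, int batchCount)
{
    if(n <= 0 || incx <= 0 || batchCount <= 0) return; // assumption: nothing to do
    assert(x != nullptr);
    for(int b = 0; b < batchCount; ++b)
    {
        T* xb = x[b]; // the vectors may live in unrelated allocations
        for(int i = 0; i < n; ++i)
            xb[static_cast<long long>(i) * incx] *= alpha;
    }
}
```

In the real API both the pointer array and the vectors it points to reside in device memory, as stated in the parameter list above.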

hipblasStatus_t hipblasSscalStridedBatched(hipblasHandle_t handle, int n, const float *alpha, float *x, int incx, hipblasStride stridex, int batchCount)#
hipblasStatus_t hipblasDscalStridedBatched(hipblasHandle_t handle, int n, const double *alpha, double *x, int incx, hipblasStride stridex, int batchCount)#
hipblasStatus_t hipblasCscalStridedBatched(hipblasHandle_t handle, int n, const hipblasComplex *alpha, hipblasComplex *x, int incx, hipblasStride stridex, int batchCount)#
hipblasStatus_t hipblasZscalStridedBatched(hipblasHandle_t handle, int n, const hipblasDoubleComplex *alpha, hipblasDoubleComplex *x, int incx, hipblasStride stridex, int batchCount)#
hipblasStatus_t hipblasCsscalStridedBatched(hipblasHandle_t handle, int n, const float *alpha, hipblasComplex *x, int incx, hipblasStride stridex, int batchCount)#
hipblasStatus_t hipblasZdscalStridedBatched(hipblasHandle_t handle, int n, const double *alpha, hipblasDoubleComplex *x, int incx, hipblasStride stridex, int batchCount)#

BLAS Level 1 API.

scalStridedBatched scales each element of vector x_i with scalar alpha, for i = 1, … , batchCount.

 x_i := alpha * x_i ,
where (x_i) is the i-th instance of the batch.

  • Supported precisions in rocBLAS : s,d,c,z,cs,zd

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in each x_i.

  • alpha[in] host pointer or device pointer for the scalar alpha.

  • x[inout] device pointer to the first vector (x_1) in the batch.

  • incx[in] [int] specifies the increment for the elements of x.

  • stridex[in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_(i+1)). There are no restrictions placed on stridex; however, the user should ensure that stridex is of appropriate size. For a typical case this means stridex >= n * incx.

  • batchCount[in] [int] specifies the number of instances in the batch.
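
In the strided layout all vectors live in one allocation, and vector x_i starts stridex elements after x_(i-1). The sketch below shows that addressing on the host; ref_scal_strided_batched is a hypothetical name, and std::int64_t stands in for hipblasStride, which is defined as int64_t.

```cpp
#include <cassert>
#include <cstdint>

// CPU reference sketch of scalStridedBatched (hypothetical helper, not part
// of hipBLAS). Vector b of the batch starts at x + b * stridex.
template <typename T>
void ref_scal_strided_batched(int n, T alpha, T* x, int incx,
                              std::int64_t stridex, int batchCount)
{
    if(n <= 0 || incx <= 0 || batchCount <= 0) return; // assumption: nothing to do
    assert(x != nullptr);
    for(int b = 0; b < batchCount; ++b)
    {
        T* xb = x + b * stridex;
        for(int i = 0; i < n; ++i)
            xb[static_cast<std::int64_t>(i) * incx] *= alpha;
    }
}
```

With n = 2, incx = 1, and stridex = 3, each vector is followed by one padding element that is never touched.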

hipblasXswap + Batched, StridedBatched#

hipblasStatus_t hipblasSswap(hipblasHandle_t handle, int n, float *x, int incx, float *y, int incy)#
hipblasStatus_t hipblasDswap(hipblasHandle_t handle, int n, double *x, int incx, double *y, int incy)#
hipblasStatus_t hipblasCswap(hipblasHandle_t handle, int n, hipblasComplex *x, int incx, hipblasComplex *y, int incy)#
hipblasStatus_t hipblasZswap(hipblasHandle_t handle, int n, hipblasDoubleComplex *x, int incx, hipblasDoubleComplex *y, int incy)#

BLAS Level 1 API.

swap interchanges vectors x and y.

y := x; x := y
  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : s,d,c,z

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in x and y.

  • x[inout] device pointer storing vector x.

  • incx[in] [int] specifies the increment for the elements of x.

  • y[inout] device pointer storing vector y.

  • incy[in] [int] specifies the increment for the elements of y.
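
The two assignments above are meant as a simultaneous exchange, not as sequential statements. A host-side sketch with independent increments for x and y follows; ref_swap is a hypothetical name, and the restriction to positive increments is a simplification of this sketch.

```cpp
#include <cassert>
#include <utility>

// CPU reference sketch of swap (hypothetical helper, not part of hipBLAS).
// Assumption: positive increments only, to keep the indexing simple.
template <typename T>
void ref_swap(int n, T* x, int incx, T* y, int incy)
{
    if(n <= 0) return;
    assert(x != nullptr && y != nullptr && incx > 0 && incy > 0);
    for(int i = 0; i < n; ++i)
        std::swap(x[static_cast<long long>(i) * incx],
                  y[static_cast<long long>(i) * incy]); // exchange one pair at a time
}
```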

hipblasStatus_t hipblasSswapBatched(hipblasHandle_t handle, int n, float *const x[], int incx, float *const y[], int incy, int batchCount)#
hipblasStatus_t hipblasDswapBatched(hipblasHandle_t handle, int n, double *const x[], int incx, double *const y[], int incy, int batchCount)#
hipblasStatus_t hipblasCswapBatched(hipblasHandle_t handle, int n, hipblasComplex *const x[], int incx, hipblasComplex *const y[], int incy, int batchCount)#
hipblasStatus_t hipblasZswapBatched(hipblasHandle_t handle, int n, hipblasDoubleComplex *const x[], int incx, hipblasDoubleComplex *const y[], int incy, int batchCount)#

BLAS Level 1 API.

swapBatched interchanges vectors x_i and y_i, for i = 1 , … , batchCount

y_i := x_i; x_i := y_i
  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in each x_i and y_i.

  • x[inout] device array of device pointers storing each vector x_i.

  • incx[in] [int] specifies the increment for the elements of each x_i.

  • y[inout] device array of device pointers storing each vector y_i.

  • incy[in] [int] specifies the increment for the elements of each y_i.

  • batchCount[in] [int] number of instances in the batch.

hipblasStatus_t hipblasSswapStridedBatched(hipblasHandle_t handle, int n, float *x, int incx, hipblasStride stridex, float *y, int incy, hipblasStride stridey, int batchCount)#
hipblasStatus_t hipblasDswapStridedBatched(hipblasHandle_t handle, int n, double *x, int incx, hipblasStride stridex, double *y, int incy, hipblasStride stridey, int batchCount)#
hipblasStatus_t hipblasCswapStridedBatched(hipblasHandle_t handle, int n, hipblasComplex *x, int incx, hipblasStride stridex, hipblasComplex *y, int incy, hipblasStride stridey, int batchCount)#
hipblasStatus_t hipblasZswapStridedBatched(hipblasHandle_t handle, int n, hipblasDoubleComplex *x, int incx, hipblasStride stridex, hipblasDoubleComplex *y, int incy, hipblasStride stridey, int batchCount)#

BLAS Level 1 API.

swapStridedBatched interchanges vectors x_i and y_i, for i = 1 , … , batchCount

y_i := x_i; x_i := y_i
  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • n[in] [int] the number of elements in each x_i and y_i.

  • x[inout] device pointer to the first vector x_1.

  • incx[in] [int] specifies the increment for the elements of x.

  • stridex[in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_(i+1)). There are no restrictions placed on stridex; however, the user should ensure that stridex is of appropriate size. For a typical case this means stridex >= n * incx.

  • y[inout] device pointer to the first vector y_1.

  • incy[in] [int] specifies the increment for the elements of y.

  • stridey[in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_(i+1)). There are no restrictions placed on stridey; however, the user should ensure that stridey is of appropriate size. For a typical case this means stridey >= n * incy. stridey should be non-zero.

  • batchCount[in] [int] number of instances in the batch.
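
When allocating buffers for a strided batch it can help to compute how many elements the batch spans. The two helpers below are hypothetical and only encode arithmetic: the first returns the index of the last touched element plus one, the second checks the typical-case rule stride >= n * inc quoted above. Positive increments and non-negative strides are assumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of elements a buffer must hold so that every
// element of every vector in the strided batch is in bounds.
inline std::int64_t strided_batch_extent(int n, int inc, std::int64_t stride, int batchCount)
{
    assert(inc > 0 && stride >= 0);
    if(n <= 0 || batchCount <= 0) return 0;
    // One vector spans indices 0 .. (n - 1) * inc.
    const std::int64_t vec_span = 1 + static_cast<std::int64_t>(n - 1) * inc;
    // The last vector starts at (batchCount - 1) * stride.
    return static_cast<std::int64_t>(batchCount - 1) * stride + vec_span;
}

// Hypothetical helper: the typical-case sizing rule from the parameter text.
inline bool meets_typical_stride_rule(int n, int inc, std::int64_t stride)
{
    return stride >= static_cast<std::int64_t>(n) * inc;
}
```

For example, n = 3, inc = 2, stride = 6, and batchCount = 4 give an extent of 3 * 6 + 1 + 2 * 2 = 23 elements.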

Level 2 BLAS#

hipblasXgbmv + Batched, StridedBatched#

hipblasStatus_t hipblasSgbmv(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const float *alpha, const float *AP, int lda, const float *x, int incx, const float *beta, float *y, int incy)#
hipblasStatus_t hipblasDgbmv(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const double *alpha, const double *AP, int lda, const double *x, int incx, const double *beta, double *y, int incy)#
hipblasStatus_t hipblasCgbmv(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const hipblasComplex *alpha, const hipblasComplex *AP, int lda, const hipblasComplex *x, int incx, const hipblasComplex *beta, hipblasComplex *y, int incy)#
hipblasStatus_t hipblasZgbmv(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const hipblasDoubleComplex *alpha, const hipblasDoubleComplex *AP, int lda, const hipblasDoubleComplex *x, int incx, const hipblasDoubleComplex *beta, hipblasDoubleComplex *y, int incy)#

BLAS Level 2 API.

gbmv performs one of the matrix-vector operations

y := alpha*A*x    + beta*y,   or
y := alpha*A**T*x + beta*y,   or
y := alpha*A**H*x + beta*y,
where alpha and beta are scalars, x and y are vectors and A is an m by n banded matrix with kl sub-diagonals and ku super-diagonals.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : s,d,c,z

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • trans[in] [hipblasOperation_t] indicates whether matrix A is transposed (conjugated) or not

  • m[in] [int] number of rows of matrix A

  • n[in] [int] number of columns of matrix A

  • kl[in] [int] number of sub-diagonals of A

  • ku[in] [int] number of super-diagonals of A

  • alpha[in] device pointer or host pointer to scalar alpha.

  • AP[in] device pointer storing banded matrix A. The leading (kl + ku + 1) by n part of the matrix contains the coefficients of the banded matrix. The leading diagonal resides in row (ku + 1), with the first super-diagonal above it on the RHS of row ku. The first sub-diagonal resides below it on the LHS of row (ku + 2). This propagates up and down across the sub/super-diagonals.
    Ex: (m = n = 7; ku = 2, kl = 2)
        1 2 3 0 0 0 0             0 0 3 3 3 3 3
        4 1 2 3 0 0 0             0 2 2 2 2 2 2
        5 4 1 2 3 0 0    ---->    1 1 1 1 1 1 1
        0 5 4 1 2 3 0             4 4 4 4 4 4 0
        0 0 5 4 1 2 0             5 5 5 5 5 0 0
        0 0 0 5 4 1 2             0 0 0 0 0 0 0
        0 0 0 0 5 4 1             0 0 0 0 0 0 0
    Note that the empty elements which don’t correspond to data will not be referenced.

  • lda[in] [int] specifies the leading dimension of A. Must be >= (kl + ku + 1)

  • x[in] device pointer storing vector x.

  • incx[in] [int] specifies the increment for the elements of x.

  • beta[in] device pointer or host pointer to scalar beta.

  • y[inout] device pointer storing vector y.

  • incy[in] [int] specifies the increment for the elements of y.
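
The band layout described for AP can be checked with a host-side sketch. Assuming column-major storage and 0-based indices i and j, element A(i, j) inside the band is stored at AP[(ku + i - j) + j * lda]. The helpers pack_band and ref_gbmv_n are hypothetical names; the second one covers only the non-transposed case with unit increments and is not the library implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: packs a column-major dense m x n matrix (leading
// dimension m) into band storage with leading dimension lda.
std::vector<float> pack_band(const std::vector<float>& dense, int m, int n,
                             int kl, int ku, int lda)
{
    assert(lda >= kl + ku + 1);
    std::vector<float> ap(static_cast<std::size_t>(lda) * n, 0.0f);
    for(int j = 0; j < n; ++j)
    {
        const int i_lo = std::max(0, j - ku);     // topmost row inside the band
        const int i_hi = std::min(m - 1, j + kl); // bottommost row inside the band
        for(int i = i_lo; i <= i_hi; ++i)
            ap[(ku + i - j) + static_cast<std::size_t>(j) * lda]
                = dense[i + static_cast<std::size_t>(j) * m];
    }
    return ap;
}

// Hypothetical helper: y := alpha*A*x + beta*y using only the band entries.
void ref_gbmv_n(int m, int n, int kl, int ku, float alpha,
                const std::vector<float>& ap, int lda,
                const float* x, float beta, float* y)
{
    for(int i = 0; i < m; ++i)
    {
        float acc = 0.0f;
        const int j_lo = std::max(0, i - kl);
        const int j_hi = std::min(n - 1, i + ku);
        for(int j = j_lo; j <= j_hi; ++j)
            acc += ap[(ku + i - j) + static_cast<std::size_t>(j) * lda] * x[j];
        y[i] = alpha * acc + beta * y[i];
    }
}
```

Packing the 7 by 7 example with lda = 5 places the main diagonal in the third row of AP, and multiplying by a vector of ones returns the row sums of A.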

hipblasStatus_t hipblasSgbmvBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const float *alpha, const float *const AP[], int lda, const float *const x[], int incx, const float *beta, float *const y[], int incy, int batchCount)#
hipblasStatus_t hipblasDgbmvBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const double *alpha, const double *const AP[], int lda, const double *const x[], int incx, const double *beta, double *const y[], int incy, int batchCount)#
hipblasStatus_t hipblasCgbmvBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const hipblasComplex *alpha, const hipblasComplex *const AP[], int lda, const hipblasComplex *const x[], int incx, const hipblasComplex *beta, hipblasComplex *const y[], int incy, int batchCount)#
hipblasStatus_t hipblasZgbmvBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const hipblasDoubleComplex *alpha, const hipblasDoubleComplex *const AP[], int lda, const hipblasDoubleComplex *const x[], int incx, const hipblasDoubleComplex *beta, hipblasDoubleComplex *const y[], int incy, int batchCount)#

BLAS Level 2 API.

gbmvBatched performs one of the matrix-vector operations

y_i := alpha*A_i*x_i    + beta*y_i,   or
y_i := alpha*A_i**T*x_i + beta*y_i,   or
y_i := alpha*A_i**H*x_i + beta*y_i,
where (A_i, x_i, y_i) is the i-th instance of the batch. alpha and beta are scalars, x_i and y_i are vectors and A_i is an m by n banded matrix with kl sub-diagonals and ku super-diagonals, for i = 1, …, batchCount.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • trans[in] [hipblasOperation_t] indicates whether each matrix A_i is transposed (conjugated) or not

  • m[in] [int] number of rows of each matrix A_i

  • n[in] [int] number of columns of each matrix A_i

  • kl[in] [int] number of sub-diagonals of each A_i

  • ku[in] [int] number of super-diagonals of each A_i

  • alpha[in] device pointer or host pointer to scalar alpha.

  • AP[in] device array of device pointers storing each banded matrix A_i. The leading (kl + ku + 1) by n part of the matrix contains the coefficients of the banded matrix. The leading diagonal resides in row (ku + 1), with the first super-diagonal above it on the RHS of row ku. The first sub-diagonal resides below it on the LHS of row (ku + 2). This propagates up and down across the sub/super-diagonals.
    Ex: (m = n = 7; ku = 2, kl = 2)
        1 2 3 0 0 0 0             0 0 3 3 3 3 3
        4 1 2 3 0 0 0             0 2 2 2 2 2 2
        5 4 1 2 3 0 0    ---->    1 1 1 1 1 1 1
        0 5 4 1 2 3 0             4 4 4 4 4 4 0
        0 0 5 4 1 2 0             5 5 5 5 5 0 0
        0 0 0 5 4 1 2             0 0 0 0 0 0 0
        0 0 0 0 5 4 1             0 0 0 0 0 0 0
    Note that the empty elements which don’t correspond to data will not be referenced.

  • lda[in] [int] specifies the leading dimension of each A_i. Must be >= (kl + ku + 1)

  • x[in] device array of device pointers storing each vector x_i.

  • incx[in] [int] specifies the increment for the elements of each x_i.

  • beta[in] device pointer or host pointer to scalar beta.

  • y[inout] device array of device pointers storing each vector y_i.

  • incy[in] [int] specifies the increment for the elements of each y_i.

  • batchCount[in] [int] specifies the number of instances in the batch.

hipblasStatus_t hipblasSgbmvStridedBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const float *alpha, const float *AP, int lda, hipblasStride strideA, const float *x, int incx, hipblasStride stridex, const float *beta, float *y, int incy, hipblasStride stridey, int batchCount)#
hipblasStatus_t hipblasDgbmvStridedBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const double *alpha, const double *AP, int lda, hipblasStride strideA, const double *x, int incx, hipblasStride stridex, const double *beta, double *y, int incy, hipblasStride stridey, int batchCount)#
hipblasStatus_t hipblasCgbmvStridedBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const hipblasComplex *alpha, const hipblasComplex *AP, int lda, hipblasStride strideA, const hipblasComplex *x, int incx, hipblasStride stridex, const hipblasComplex *beta, hipblasComplex *y, int incy, hipblasStride stridey, int batchCount)#
hipblasStatus_t hipblasZgbmvStridedBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const hipblasDoubleComplex *alpha, const hipblasDoubleComplex *AP, int lda, hipblasStride strideA, const hipblasDoubleComplex *x, int incx, hipblasStride stridex, const hipblasDoubleComplex *beta, hipblasDoubleComplex *y, int incy, hipblasStride stridey, int batchCount)#

BLAS Level 2 API.

gbmvStridedBatched performs one of the matrix-vector operations

y_i := alpha*A_i*x_i    + beta*y_i,   or
y_i := alpha*A_i**T*x_i + beta*y_i,   or
y_i := alpha*A_i**H*x_i + beta*y_i,
where (A_i, x_i, y_i) is the i-th instance of the batch. alpha and beta are scalars, x_i and y_i are vectors and A_i is an m by n banded matrix with kl sub-diagonals and ku super-diagonals, for i = 1, …, batchCount.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • trans[in] [hipblasOperation_t] indicates whether matrix A is transposed (conjugated) or not

  • m[in] [int] number of rows of matrix A

  • n[in] [int] number of columns of matrix A

  • kl[in] [int] number of sub-diagonals of A

  • ku[in] [int] number of super-diagonals of A

  • alpha[in] device pointer or host pointer to scalar alpha.

  • AP[in] device pointer to first banded matrix (A_1). The leading (kl + ku + 1) by n part of the matrix contains the coefficients of the banded matrix. The leading diagonal resides in row (ku + 1), with the first super-diagonal above it on the RHS of row ku. The first sub-diagonal resides below it on the LHS of row (ku + 2). This propagates up and down across the sub/super-diagonals.
    Ex: (m = n = 7; ku = 2, kl = 2)
        1 2 3 0 0 0 0             0 0 3 3 3 3 3
        4 1 2 3 0 0 0             0 2 2 2 2 2 2
        5 4 1 2 3 0 0    ---->    1 1 1 1 1 1 1
        0 5 4 1 2 3 0             4 4 4 4 4 4 0
        0 0 5 4 1 2 0             5 5 5 5 5 0 0
        0 0 0 5 4 1 2             0 0 0 0 0 0 0
        0 0 0 0 5 4 1             0 0 0 0 0 0 0
    Note that the empty elements which don’t correspond to data will not be referenced.

  • lda[in] [int] specifies the leading dimension of A. Must be >= (kl + ku + 1)

  • strideA[in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_(i+1))

  • x[in] device pointer to first vector (x_1).

  • incx[in] [int] specifies the increment for the elements of x.

  • stridex[in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_(i+1))

  • beta[in] device pointer or host pointer to scalar beta.

  • y[inout] device pointer to first vector (y_1).

  • incy[in] [int] specifies the increment for the elements of y.

  • stridey[in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_(i+1))

  • batchCount[in] [int] specifies the number of instances in the batch.

hipblasXgemv + Batched, StridedBatched#

hipblasStatus_t hipblasSgemv(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, const float *alpha, const float *AP, int lda, const float *x, int incx, const float *beta, float *y, int incy)#
hipblasStatus_t hipblasDgemv(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, const double *alpha, const double *AP, int lda, const double *x, int incx, const double *beta, double *y, int incy)#
hipblasStatus_t hipblasCgemv(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, const hipblasComplex *alpha, const hipblasComplex *AP, int lda, const hipblasComplex *x, int incx, const hipblasComplex *beta, hipblasComplex *y, int incy)#
hipblasStatus_t hipblasZgemv(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, const hipblasDoubleComplex *alpha, const hipblasDoubleComplex *AP, int lda, const hipblasDoubleComplex *x, int incx, const hipblasDoubleComplex *beta, hipblasDoubleComplex *y, int incy)#

BLAS Level 2 API.

gemv performs one of the matrix-vector operations

y := alpha*A*x    + beta*y,   or
y := alpha*A**T*x + beta*y,   or
y := alpha*A**H*x + beta*y,
where alpha and beta are scalars, x and y are vectors and A is an m by n matrix.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : s,d,c,z

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • trans[in] [hipblasOperation_t] indicates whether matrix A is transposed (conjugated) or not

  • m[in] [int] number of rows of matrix A

  • n[in] [int] number of columns of matrix A

  • alpha[in] device pointer or host pointer to scalar alpha.

  • AP[in] device pointer storing matrix A.

  • lda[in] [int] specifies the leading dimension of A.

  • x[in] device pointer storing vector x.

  • incx[in] [int] specifies the increment for the elements of x.

  • beta[in] device pointer or host pointer to scalar beta.

  • y[inout] device pointer storing vector y.

  • incy[in] [int] specifies the increment for the elements of y.
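
The roles of m, n, lda, and the transpose option can be seen in a short host-side sketch. It assumes column-major storage, so that A(i, j) is AP[i + j * lda], and handles only real data, where the transposed and conjugate-transposed forms coincide. The name ref_gemv and the boolean transpose flag are hypothetical stand-ins, not the hipBLAS interface.

```cpp
#include <cassert>

// CPU reference sketch of gemv for float data (hypothetical helper, not part
// of hipBLAS). transpose == false: y := alpha*A*x + beta*y, x has n elements
// and y has m. transpose == true: y := alpha*A**T*x + beta*y, x has m
// elements and y has n.
void ref_gemv(bool transpose, int m, int n, float alpha, const float* A, int lda,
              const float* x, int incx, float beta, float* y, int incy)
{
    assert(lda >= m && incx > 0 && incy > 0); // simplifying assumptions
    const int len_x = transpose ? m : n;
    const int len_y = transpose ? n : m;
    for(int r = 0; r < len_y; ++r)
    {
        float acc = 0.0f;
        for(int c = 0; c < len_x; ++c)
        {
            // Column-major lookup of A(r, c), or of A(c, r) when transposed.
            const float a = transpose ? A[c + r * lda] : A[r + c * lda];
            acc += a * x[c * incx];
        }
        y[r * incy] = alpha * acc + beta * y[r * incy];
    }
}
```

Note how the lengths of x and y swap between the two cases; the same swap drives the stride recommendations of the strided-batched variant.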

hipblasStatus_t hipblasSgemvBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, const float *alpha, const float *const AP[], int lda, const float *const x[], int incx, const float *beta, float *const y[], int incy, int batchCount)#
hipblasStatus_t hipblasDgemvBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, const double *alpha, const double *const AP[], int lda, const double *const x[], int incx, const double *beta, double *const y[], int incy, int batchCount)#
hipblasStatus_t hipblasCgemvBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, const hipblasComplex *alpha, const hipblasComplex *const AP[], int lda, const hipblasComplex *const x[], int incx, const hipblasComplex *beta, hipblasComplex *const y[], int incy, int batchCount)#
hipblasStatus_t hipblasZgemvBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, const hipblasDoubleComplex *alpha, const hipblasDoubleComplex *const AP[], int lda, const hipblasDoubleComplex *const x[], int incx, const hipblasDoubleComplex *beta, hipblasDoubleComplex *const y[], int incy, int batchCount)#

BLAS Level 2 API.

gemvBatched performs a batch of matrix-vector operations

y_i := alpha*A_i*x_i    + beta*y_i,   or
y_i := alpha*A_i**T*x_i + beta*y_i,   or
y_i := alpha*A_i**H*x_i + beta*y_i,
where (A_i, x_i, y_i) is the i-th instance of the batch. alpha and beta are scalars, x_i and y_i are vectors and A_i is an m by n matrix, for i = 1, …, batchCount.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • trans[in] [hipblasOperation_t] indicates whether matrices A_i are transposed (conjugated) or not

  • m[in] [int] number of rows of each matrix A_i

  • n[in] [int] number of columns of each matrix A_i

  • alpha[in] device pointer or host pointer to scalar alpha.

  • AP[in] device array of device pointers storing each matrix A_i.

  • lda[in] [int] specifies the leading dimension of each matrix A_i.

  • x[in] device array of device pointers storing each vector x_i.

  • incx[in] [int] specifies the increment for the elements of each vector x_i.

  • beta[in] device pointer or host pointer to scalar beta.

  • y[inout] device array of device pointers storing each vector y_i.

  • incy[in] [int] specifies the increment for the elements of each vector y_i.

  • batchCount[in] [int] number of instances in the batch

hipblasStatus_t hipblasSgemvStridedBatched(hipblasHandle_t handle, hipblasOperation_t transA, int m, int n, const float *alpha, const float *AP, int lda, hipblasStride strideA, const float *x, int incx, hipblasStride stridex, const float *beta, float *y, int incy, hipblasStride stridey, int batchCount)#
hipblasStatus_t hipblasDgemvStridedBatched(hipblasHandle_t handle, hipblasOperation_t transA, int m, int n, const double *alpha, const double *AP, int lda, hipblasStride strideA, const double *x, int incx, hipblasStride stridex, const double *beta, double *y, int incy, hipblasStride stridey, int batchCount)#
hipblasStatus_t hipblasCgemvStridedBatched(hipblasHandle_t handle, hipblasOperation_t transA, int m, int n, const hipblasComplex *alpha, const hipblasComplex *AP, int lda, hipblasStride strideA, const hipblasComplex *x, int incx, hipblasStride stridex, const hipblasComplex *beta, hipblasComplex *y, int incy, hipblasStride stridey, int batchCount)#
hipblasStatus_t hipblasZgemvStridedBatched(hipblasHandle_t handle, hipblasOperation_t transA, int m, int n, const hipblasDoubleComplex *alpha, const hipblasDoubleComplex *AP, int lda, hipblasStride strideA, const hipblasDoubleComplex *x, int incx, hipblasStride stridex, const hipblasDoubleComplex *beta, hipblasDoubleComplex *y, int incy, hipblasStride stridey, int batchCount)#

BLAS Level 2 API.

gemvStridedBatched performs a batch of matrix-vector operations

y_i := alpha*A_i*x_i    + beta*y_i,   or
y_i := alpha*A_i**T*x_i + beta*y_i,   or
y_i := alpha*A_i**H*x_i + beta*y_i,
where (A_i, x_i, y_i) is the i-th instance of the batch. alpha and beta are scalars, x_i and y_i are vectors and A_i is an m by n matrix, for i = 1, …, batchCount.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • transA[in] [hipblasOperation_t] indicates whether matrices A_i are transposed (conjugated) or not

  • m[in] [int] number of rows of matrices A_i

  • n[in] [int] number of columns of matrices A_i

  • alpha[in] device pointer or host pointer to scalar alpha.

  • AP[in] device pointer to the first matrix (A_1) in the batch.

  • lda[in] [int] specifies the leading dimension of matrices A_i.

  • strideA[in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_(i+1))

  • x[in] device pointer to the first vector (x_1) in the batch.

  • incx[in] [int] specifies the increment for the elements of vectors x_i.

  • stridex[in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_(i+1)). There are no restrictions placed on stridex; however, the user should ensure that stridex is of appropriate size. When transA equals HIPBLAS_OP_N this typically means stridex >= n * incx, otherwise stridex >= m * incx.

  • beta[in] device pointer or host pointer to scalar beta.

  • y[inout] device pointer to the first vector (y_1) in the batch.

  • incy[in] [int] specifies the increment for the elements of vectors y_i.

  • stridey[in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_(i+1)). There are no restrictions placed on stridey; however, the user should ensure that stridey is of appropriate size. When transA equals HIPBLAS_OP_N this typically means stridey >= m * incy, otherwise stridey >= n * incy. stridey should be non-zero.

  • batchCount[in] [int] number of instances in the batch
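
The stride recommendations above depend on whether the matrices are transposed. The hypothetical helper below collects them in one place. The stridex and stridey values follow the parameter descriptions; the value lda * n for strideA is an assumption based on column-major storage and is not stated in the text. Positive increments are assumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the three strides of gemvStridedBatched.
struct GemvStrides
{
    std::int64_t strideA;
    std::int64_t stridex;
    std::int64_t stridey;
};

// Hypothetical helper: smallest strides satisfying the typical-case rules.
// is_op_n is true when transA equals HIPBLAS_OP_N.
inline GemvStrides typical_gemv_strides(bool is_op_n, int m, int n, int lda,
                                        int incx, int incy)
{
    assert(m >= 0 && n >= 0 && lda >= m && incx > 0 && incy > 0);
    GemvStrides s;
    s.strideA = static_cast<std::int64_t>(lda) * n;                    // assumption: column-major A_i
    s.stridex = static_cast<std::int64_t>(is_op_n ? n : m) * incx;     // x_i has n or m elements
    s.stridey = static_cast<std::int64_t>(is_op_n ? m : n) * incy;     // y_i has m or n elements
    return s;
}
```

For m = 4, n = 3, lda = 4, incx = 1, and incy = 2, the non-transposed case gives strides 12, 3, and 8, while the transposed case gives 12, 4, and 6.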

hipblasXger + Batched, StridedBatched#

hipblasStatus_t hipblasSger(hipblasHandle_t handle, int m, int n, const float *alpha, const float *x, int incx, const float *y, int incy, float *AP, int lda)#
hipblasStatus_t hipblasDger(hipblasHandle_t handle, int m, int n, const double *alpha, const double *x, int incx, const double *y, int incy, double *AP, int lda)#
hipblasStatus_t hipblasCgeru(hipblasHandle_t handle, int m, int n, const hipblasComplex *alpha, const hipblasComplex *x, int incx, const hipblasComplex *y, int incy, hipblasComplex *AP, int lda)#
hipblasStatus_t hipblasCgerc(hipblasHandle_t handle, int m, int n, const hipblasComplex *alpha, const hipblasComplex *x, int incx, const hipblasComplex *y, int incy, hipblasComplex *AP, int lda)#
hipblasStatus_t hipblasZgeru(hipblasHandle_t handle, int m, int n, const hipblasDoubleComplex *alpha, const hipblasDoubleComplex *x, int incx, const hipblasDoubleComplex *y, int incy, hipblasDoubleComplex *AP, int lda)#
hipblasStatus_t hipblasZgerc(hipblasHandle_t handle, int m, int n, const hipblasDoubleComplex *alpha, const hipblasDoubleComplex *x, int incx, const hipblasDoubleComplex *y, int incy, hipblasDoubleComplex *AP, int lda)#

BLAS Level 2 API.

ger, geru, and gerc perform the matrix-vector operations

A := A + alpha*x*y**T , OR
A := A + alpha*x*y**H for gerc
where alpha is a scalar, x and y are vectors, and A is an m by n matrix.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : s,d,c,z

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • m[in] [int] the number of rows of the matrix A.

  • n[in] [int] the number of columns of the matrix A.

  • alpha[in] device pointer or host pointer to scalar alpha.

  • x[in] device pointer storing vector x.

  • incx[in] [int] specifies the increment for the elements of x.

  • y[in] device pointer storing vector y.

  • incy[in] [int] specifies the increment for the elements of y.

  • AP[inout] device pointer storing matrix A.

  • lda[in] [int] specifies the leading dimension of A.
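
The difference between the unconjugated (ger, geru) and conjugated (gerc) updates only shows up for complex data. The host-side sketch below makes it explicit with a flag. The name ref_ger, the cfloat alias, and the conj_y flag are hypothetical; column-major storage and positive increments are assumed.

```cpp
#include <cassert>
#include <complex>

using cfloat = std::complex<float>; // hypothetical stand-in for hipblasComplex

// CPU reference sketch of the rank-1 update (hypothetical helper, not part of
// hipBLAS). conj_y == false: A := A + alpha*x*y**T (geru).
// conj_y == true:  A := A + alpha*x*y**H (gerc).
void ref_ger(bool conj_y, int m, int n, cfloat alpha,
             const cfloat* x, int incx, const cfloat* y, int incy,
             cfloat* A, int lda)
{
    assert(lda >= m && incx > 0 && incy > 0); // simplifying assumptions
    for(int j = 0; j < n; ++j)
    {
        const cfloat yj = conj_y ? std::conj(y[j * incy]) : y[j * incy];
        for(int i = 0; i < m; ++i)
            A[i + j * lda] += alpha * x[i * incx] * yj; // column-major A(i, j)
    }
}
```

With x = y = i (the imaginary unit), alpha = 1, and A = 0, the unconjugated update yields -1 and the conjugated update yields +1.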

hipblasStatus_t hipblasSgerBatched(hipblasHandle_t handle, int m, int n, const float *alpha, const float *const x[], int incx, const float *const y[], int incy, float *const AP[], int lda, int batchCount)#
hipblasStatus_t hipblasDgerBatched(hipblasHandle_t handle, int m, int n, const double *alpha, const double *const x[], int incx, const double *const y[], int incy, double *const AP[], int lda, int batchCount)#
hipblasStatus_t hipblasCgeruBatched(hipblasHandle_t handle, int m, int n, const hipblasComplex *alpha, const hipblasComplex *const x[], int incx, const hipblasComplex *const y[], int incy, hipblasComplex *const AP[], int lda, int batchCount)#
hipblasStatus_t hipblasCgercBatched(hipblasHandle_t handle, int m, int n, const hipblasComplex *alpha, const hipblasComplex *const x[], int incx, const hipblasComplex *const y[], int incy, hipblasComplex *const AP[], int lda, int batchCount)#
hipblasStatus_t hipblasZgeruBatched(hipblasHandle_t handle, int m, int n, const hipblasDoubleComplex *alpha, const hipblasDoubleComplex *const x[], int incx, const hipblasDoubleComplex *const y[], int incy, hipblasDoubleComplex *const AP[], int lda, int batchCount)#
hipblasStatus_t hipblasZgercBatched(hipblasHandle_t handle, int m, int n, const hipblasDoubleComplex *alpha, const hipblasDoubleComplex *const x[], int incx, const hipblasDoubleComplex *const y[], int incy, hipblasDoubleComplex *const AP[], int lda, int batchCount)#

BLAS Level 2 API.

gerBatched, geruBatched, and gercBatched perform a batch of the matrix-vector operations

A_i := A_i + alpha*x_i*y_i**T , OR
A_i := A_i + alpha*x_i*y_i**H for gerc
where (A_i, x_i, y_i) is the i-th instance of the batch. alpha is a scalar, x_i and y_i are vectors and A_i is an m by n matrix, for i = 1, …, batchCount.

  • Supported precisions in rocBLAS : s,d,c,z

  • Supported precisions in cuBLAS : No support

Parameters:
  • handle[in] [hipblasHandle_t] handle to the hipblas library context queue.

  • m[in] [int] the number of rows of each matrix A_i.

  • n[in] [int] the number of columns of each matrix A_i.

  • alpha[in] device pointer or host pointer to scalar alpha.

  • x[in] device array of device pointers storing each vector x_i.

  • incx[in] [int] specifies the increment for the elements of each vector x_i.

  • y[in] device array of device pointers storing each vector y_i.

  • incy[in] [int] specifies the increment for the elements of each vector y_i.

  • AP[inout] device array of device pointers storing each matrix A_i.

  • lda[in] [int] specifies the leading dimension of each A_i.

  • batchCount[in] [int] number of instances in the batch

hipblasStatus_t hipblasSgerStridedBatched(hipblasHandle_t handle, int m, int n, const float *alpha, const float *x, int incx, hipblasStride stridex, const float *y, int incy, hipblasStride stridey, float *AP, int lda, hipblasStride strideA, int batchCount)#
hipblasStatus_t hipblasDgerStridedBatched(hipblasHandle_t handle, int m, int n, const double *alpha, const double *x, int incx, hipblasStride stridex, const double *y, int incy, hipblasStride stridey, double *AP, int lda, hipblasStride strideA, int batchCount)#
hipblasStatus_t hipblasCgeruStridedBatched(hipblasHandle_t handle, int m, int n, const hipblasComplex *alpha, const hipblasComplex *x, int incx, hipblasStride stridex, const hipblasComplex *y, int incy, hipblasStride stridey, hipblasComplex *AP, int lda, hipblasStride strideA, int batchCount)#
hipblasStatus_t hipblasCgercStridedBatched(hipblasHandle_t handle, int m, int n, const hipblasComplex *alpha, const hipblasComplex *x, int incx, hipblasStride stridex, const hipblasComplex *y, int incy, hipblasStride stridey, hipblasComplex *AP, int lda, hipblasStride strideA, int batchCount)#
hipblasSt