hipBLAS API#
This topic discusses technical aspects of the hipBLAS API and provides reference information about the API functions.
The hipBLAS interface#
The hipBLAS interface is compatible with the rocBLAS and cuBLAS-v2 APIs. Porting a CUDA application which originally called the cuBLAS API to an application calling the hipBLAS API should be relatively straightforward.
GEMV API#
For example, the hipBLAS SGEMV interface is:
hipblasStatus_t
hipblasSgemv(hipblasHandle_t handle,
hipblasOperation_t trans,
int m, int n, const float *alpha,
const float *A, int lda,
const float *x, int incx, const float *beta,
float *y, int incy );
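To make the argument list concrete, the following is a minimal CPU reference sketch of what the non-transposed SGEMV call computes. It is not the hipBLAS implementation: `ref_sgemv_n` is a hypothetical helper, it assumes column-major storage (the usual BLAS convention), positive increments, and scalars passed by value rather than by pointer.

```cpp
#include <cassert>
#include <vector>

// Hypothetical CPU reference for y := alpha * A * x + beta * y (no transpose).
// Assumption: A is m x n, column-major, with leading dimension lda >= m,
// so element (i, j) is stored at A[i + j * lda].
void ref_sgemv_n(int m, int n, float alpha, const float* A, int lda,
                 const float* x, int incx, float beta, float* y, int incy)
{
    for (int i = 0; i < m; ++i) {
        float acc = 0.0f;
        for (int j = 0; j < n; ++j)
            acc += A[i + j * lda] * x[j * incx];   // row i of A times x
        y[i * incy] = alpha * acc + beta * y[i * incy];
    }
}
```

With hipBLAS itself, the same operands would first have to be copied to device memory, and `alpha` and `beta` would be passed as pointers.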
Batched and strided GEMM API#
hipBLAS GEMM can process matrices in batches with regular strides by using the strided-batched version of the API:
hipblasStatus_t
hipblasSgemmStridedBatched(hipblasHandle_t handle,
hipblasOperation_t transa, hipblasOperation_t transb,
int m, int n, int k, const float *alpha,
const float *A, int lda, long long bsa,
const float *B, int ldb, long long bsb, const float *beta,
float *C, int ldc, long long bsc,
int batchCount);
hipBLAS assumes that matrices (such as A) and vectors (such as x and y) are allocated in GPU memory.
You are responsible for copying data to and from the host and device memory.
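The role of the stride arguments `bsa`, `bsb`, and `bsc` can be illustrated with a CPU reference sketch. This is only an illustration under stated assumptions: `ref_sgemm_strided_batched_nn` is a hypothetical helper, it handles the non-transposed case only, assumes column-major storage, and takes the scalars by value.

```cpp
#include <cassert>
#include <vector>

// Hypothetical CPU reference: C_b := alpha * A_b * B_b + beta * C_b
// for every batch index b. The b-th matrices start at fixed offsets:
//   A_b = A + b * bsa, B_b = B + b * bsb, C_b = C + b * bsc.
void ref_sgemm_strided_batched_nn(int m, int n, int k, float alpha,
                                  const float* A, int lda, long long bsa,
                                  const float* B, int ldb, long long bsb,
                                  float beta,
                                  float* C, int ldc, long long bsc,
                                  int batchCount)
{
    for (int b = 0; b < batchCount; ++b) {
        const float* Ab = A + b * bsa;
        const float* Bb = B + b * bsb;
        float*       Cb = C + b * bsc;
        for (int j = 0; j < n; ++j) {
            for (int i = 0; i < m; ++i) {
                float acc = 0.0f;
                for (int p = 0; p < k; ++p)
                    acc += Ab[i + p * lda] * Bb[p + j * ldb];
                Cb[i + j * ldc] = alpha * acc + beta * Cb[i + j * ldc];
            }
        }
    }
}
```

Because every matrix in the batch is reached by pointer arithmetic from a single base pointer, one allocation per operand is enough for the whole batch.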
Naming conventions#
hipBLAS uses the following naming conventions:
Upper case for a matrix, for example, matrices A, B, and C in GEMM (C = A*B)
Lower case for a vector, for example, vectors x and y in GEMV (y = A*x)
Notations#
hipBLAS functions use the following notations to denote precisions:
h = half
bf = 16-bit brain floating point
s = single
d = double
c = single complex
z = double complex
hipBLAS backends#
hipBLAS has multiple backends for different platforms: cuBLAS for the NVIDIA
platform and rocBLAS for the AMD platform. The cuBLAS backend does not support
all functions. Unsupported functions return the HIPBLAS_STATUS_NOT_SUPPORTED
status code.
The following level 1-3 and solver functions are not supported with the cuBLAS backend:
AXPY functions with half.
DOT functions with half and bfloat16.
SPR functions with std::complex<float> and std::complex<double>.
All the batched functions except for TRSM, GEMV, and GEMM and solver functions (GETRF, GETRS, GEQRF, GELS).
GETRF, GETRS, GEQRF, and GELS non-batched and strided_batched functions.
ILP64 interfaces#
The hipBLAS library Level-1 functions are also provided with ILP64 interfaces.
With these interfaces, all int arguments are replaced with the typename
int64_t. These ILP64 function names all end with the _64 suffix.
The only output arguments that change are for
xMAX and xMIN, for which the index is now int64_t. Function level documentation is not
repeated for these APIs because they are identical in behavior to the LP64 versions.
However, functions that support this alternate API include the line:
This function supports the 64-bit integer interface.
The functionality of the ILP64 interfaces depends on the backend being used. See the rocBLAS or NVIDIA CUDA cuBLAS documentation for more information about support for ILP64 interfaces.
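One common reason to reach for a 64-bit interface is that the largest element offset, (n - 1) * |incx|, no longer fits in a 32-bit int. The sketch below shows an overflow-safe way to check this on the host. `offset_exceeds_int32` is a hypothetical helper written for illustration and is not part of hipBLAS.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: returns true when the largest element offset,
// (n - 1) * |incx|, cannot be represented in a 32-bit signed int.
bool offset_exceeds_int32(std::int64_t n, std::int64_t incx)
{
    if (n <= 0 || incx == 0) return false;
    const std::int64_t step = incx < 0 ? -incx : incx;
    // Compare by division so the check itself cannot overflow.
    return (n - 1) > std::numeric_limits<std::int32_t>::max() / step;
}
```

When the check returns true, the `_64` variant of a function is the candidate to use, subject to the backend support noted above.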
Atomic operations#
Some hipBLAS functions might use atomic operations to increase performance.
This can cause these functions to give results that are not bit-wise reproducible.
By default, the rocBLAS backend allows the use of atomics while the CUDA cuBLAS backend disallows their use.
To set the desired behavior, users can call
hipblasSetAtomicsMode(). See the rocBLAS or CUDA
cuBLAS documentation for more specific information about atomic operations in the backend library.
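The loss of bit-wise reproducibility comes from floating-point addition not being associative: atomics let partial results be combined in an order that can differ from run to run. The following host-only sketch illustrates the effect with a plain sequential sum. `sum_in_order` is a hypothetical helper, and the example assumes IEEE 754 single precision without fast-math reordering.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: accumulates the values strictly left to right.
float sum_in_order(const std::vector<float>& v)
{
    float acc = 0.0f;
    for (float e : v)
        acc += e;
    return acc;
}
```

Summing the same three values, 1.0e8f, 1.0f, and -1.0e8f, in two different orders gives two different results, because 1.0f is absorbed when it is added to 1.0e8f first.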
Graph support for hipBLAS#
Graph support (also referred to as stream capture support) for hipBLAS depends on the backend being used. If rocBLAS is the backend, see the rocBLAS documentation. Similarly, if CUDA cuBLAS is the backend, see the cuBLAS documentation.
Custom data types#
hipBLAS defines the hipblasBfloat16 type. For more details, see
Custom types.
hipBLAS types#
For information about the hipblasStatus_t, hipblasComputeType_t, and hipblasOperation_t enumerations,
see hipblas-common.h in the hipBLAS-common GitHub repository.
Definitions#
hipblasHandle_t#
-
typedef void *hipblasHandle_t#
hipblasHandle_t is a void pointer that stores the library context (either rocBLAS or cuBLAS).
hipblasHalf#
-
typedef uint16_t hipblasHalf#
To specify the datatype as an unsigned short.
hipblasInt8#
-
typedef int8_t hipblasInt8#
To specify the datatype as a signed char.
hipblasStride#
-
typedef int64_t hipblasStride#
Stride between matrices or vectors in strided_batched functions.
hipblasBfloat16#
-
struct hipblasBfloat16#
Struct to represent a 16-bit Brain floating-point number.
Enums#
Enumeration constants have numbering that is consistent with CBLAS, ACML, and most standard C BLAS libraries.
hipblasStatus_t#
For information about hipblasStatus_t,
see hipblas-common.h in the hipBLAS-common GitHub repository.
hipblasOperation_t#
For information about hipblasOperation_t,
see hipblas-common.h in the hipBLAS-common GitHub repository.
hipblasPointerMode_t#
-
enum hipblasPointerMode_t#
Indicates whether scalar pointers are on the host or device. This is used for scalars alpha and beta and for scalar function return values.
Values:
-
enumerator HIPBLAS_POINTER_MODE_HOST#
Scalar values affected by this variable will be located on the host.
-
enumerator HIPBLAS_POINTER_MODE_DEVICE#
Scalar values affected by this variable will be located on the device.
hipblasFillMode_t#
-
enum hipblasFillMode_t#
Used by the Hermitian, symmetric, and triangular matrix routines to specify whether the upper or lower triangle is being referenced.
Values:
-
enumerator HIPBLAS_FILL_MODE_UPPER#
Upper triangle.
-
enumerator HIPBLAS_FILL_MODE_LOWER#
Lower triangle.
-
enumerator HIPBLAS_FILL_MODE_FULL#
hipblasDiagType_t#
hipblasSideMode_t#
-
enum hipblasSideMode_t#
Indicates the side matrix A is located on, relative to matrix B, during multiplication.
Values:
-
enumerator HIPBLAS_SIDE_LEFT#
Multiply general matrix by symmetric, Hermitian, or triangular matrix on the left.
-
enumerator HIPBLAS_SIDE_RIGHT#
Multiply general matrix by symmetric, Hermitian, or triangular matrix on the right.
-
enumerator HIPBLAS_SIDE_BOTH#
hipblasComputeType_t#
For information about hipblasComputeType_t,
see hipblas-common.h in the hipBLAS-common GitHub repository.
hipblasGemmAlgo_t#
hipblasAtomicsMode_t#
-
enum hipblasAtomicsMode_t#
Indicates whether atomic operations are allowed. Disallowing atomic operations can generally improve determinism and repeatability of results at the cost of performance. By default, the rocBLAS backend allows atomic operations, while the cuBLAS backend disallows atomic operations. See the backend documentation for more details.
Values:
-
enumerator HIPBLAS_ATOMICS_NOT_ALLOWED#
Algorithms will refrain from atomics where applicable.
-
enumerator HIPBLAS_ATOMICS_ALLOWED#
Algorithms will take advantage of atomics where applicable.
hipBLAS functions#
Level 1 BLAS#
hipblasIXamax + Batched, StridedBatched#
-
hipblasStatus_t hipblasIsamax(hipblasHandle_t handle, int n, const float *x, int incx, int *result)#
-
hipblasStatus_t hipblasIdamax(hipblasHandle_t handle, int n, const double *x, int incx, int *result)#
-
hipblasStatus_t hipblasIcamax(hipblasHandle_t handle, int n, const hipComplex *x, int incx, int *result)#
-
hipblasStatus_t hipblasIzamax(hipblasHandle_t handle, int n, const hipDoubleComplex *x, int incx, int *result)#
BLAS Level 1 API
The amax functions find the first index of the element of maximum magnitude of a vector x.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in x.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
result – [inout] device pointer or host pointer to store the amax index. Return value is 0 if n <= 0 or incx <= 0.
The amax function supports the 64-bit integer interface. See the ILP64 interfaces section.
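The "first index of the element of maximum magnitude" rule can be sketched on the CPU as follows. `ref_isamax` is a hypothetical reference helper, not the library code. It assumes the standard BLAS convention of a 1-based result index, which this page does not state explicitly, and it covers the real single-precision case only.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical CPU reference for the real amax case.
// Assumption: the returned index is 1-based (classic BLAS convention).
// Returns 0 if n <= 0 or incx <= 0, matching the parameter description.
int ref_isamax(int n, const float* x, int incx)
{
    if (n <= 0 || incx <= 0) return 0;
    int   best     = 0;
    float best_mag = std::fabs(x[0]);
    for (int i = 1; i < n; ++i) {
        const float mag = std::fabs(x[i * incx]);
        if (mag > best_mag) {   // strict '>' keeps the first maximum on ties
            best_mag = mag;
            best     = i;
        }
    }
    return best + 1;
}
```

The strict comparison is what makes the function report the first of several elements with equal magnitude.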
-
hipblasStatus_t hipblasIsamaxBatched(hipblasHandle_t handle, int n, const float *const x[], int incx, int batchCount, int *result)#
-
hipblasStatus_t hipblasIdamaxBatched(hipblasHandle_t handle, int n, const double *const x[], int incx, int batchCount, int *result)#
-
hipblasStatus_t hipblasIcamaxBatched(hipblasHandle_t handle, int n, const hipComplex *const x[], int incx, int batchCount, int *result)#
-
hipblasStatus_t hipblasIzamaxBatched(hipblasHandle_t handle, int n, const hipDoubleComplex *const x[], int incx, int batchCount, int *result)#
BLAS Level 1 API
The amaxBatched functions find the first index of the element of maximum magnitude of each vector x_i in a batch, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in each vector x_i.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i. incx must be > 0.
batchCount – [in] [int] number of instances in the batch. Must be > 0.
result – [out] device or host array of batchCount size for results. Return value is 0 if n <= 0 or incx <= 0.
The amaxBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasIsamaxStridedBatched(hipblasHandle_t handle, int n, const float *x, int incx, hipblasStride stridex, int batchCount, int *result)#
-
hipblasStatus_t hipblasIdamaxStridedBatched(hipblasHandle_t handle, int n, const double *x, int incx, hipblasStride stridex, int batchCount, int *result)#
-
hipblasStatus_t hipblasIcamaxStridedBatched(hipblasHandle_t handle, int n, const hipComplex *x, int incx, hipblasStride stridex, int batchCount, int *result)#
-
hipblasStatus_t hipblasIzamaxStridedBatched(hipblasHandle_t handle, int n, const hipDoubleComplex *x, int incx, hipblasStride stridex, int batchCount, int *result)#
BLAS Level 1 API
The amaxStridedBatched functions find the first index of the element of maximum magnitude of each vector x_i in a batch, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in each vector x_i.
x – [in] device pointer to the first vector x_1.
incx – [in] [int] specifies the increment for the elements of each x_i. incx must be > 0.
stridex – [in] [hipblasStride] specifies the pointer increment between one x_i and the next x_(i + 1).
batchCount – [in] [int] number of instances in the batch.
result – [out] device or host pointer for storing contiguous batchCount results. Return value is 0 if n <= 0 or incx <= 0.
The amaxStridedBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
hipblasIXamin + Batched, StridedBatched#
-
hipblasStatus_t hipblasIsamin(hipblasHandle_t handle, int n, const float *x, int incx, int *result)#
-
hipblasStatus_t hipblasIdamin(hipblasHandle_t handle, int n, const double *x, int incx, int *result)#
-
hipblasStatus_t hipblasIcamin(hipblasHandle_t handle, int n, const hipComplex *x, int incx, int *result)#
-
hipblasStatus_t hipblasIzamin(hipblasHandle_t handle, int n, const hipDoubleComplex *x, int incx, int *result)#
BLAS Level 1 API
The amin functions find the first index of the element of minimum magnitude of a vector x.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in x.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
result – [inout] device pointer or host pointer to store the amin index. Return value is 0 if n <= 0 or incx <= 0.
The amin function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasIsaminBatched(hipblasHandle_t handle, int n, const float *const x[], int incx, int batchCount, int *result)#
-
hipblasStatus_t hipblasIdaminBatched(hipblasHandle_t handle, int n, const double *const x[], int incx, int batchCount, int *result)#
-
hipblasStatus_t hipblasIcaminBatched(hipblasHandle_t handle, int n, const hipComplex *const x[], int incx, int batchCount, int *result)#
-
hipblasStatus_t hipblasIzaminBatched(hipblasHandle_t handle, int n, const hipDoubleComplex *const x[], int incx, int batchCount, int *result)#
BLAS Level 1 API
The aminBatched functions find the first index of the element of minimum magnitude of each vector x_i in a batch, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in each vector x_i.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i. incx must be > 0.
batchCount – [in] [int] number of instances in the batch. Must be > 0.
result – [out] device or host pointer to an array of batchCount size for results. Return value is 0 if n <= 0 or incx <= 0.
The aminBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasIsaminStridedBatched(hipblasHandle_t handle, int n, const float *x, int incx, hipblasStride stridex, int batchCount, int *result)#
-
hipblasStatus_t hipblasIdaminStridedBatched(hipblasHandle_t handle, int n, const double *x, int incx, hipblasStride stridex, int batchCount, int *result)#
-
hipblasStatus_t hipblasIcaminStridedBatched(hipblasHandle_t handle, int n, const hipComplex *x, int incx, hipblasStride stridex, int batchCount, int *result)#
-
hipblasStatus_t hipblasIzaminStridedBatched(hipblasHandle_t handle, int n, const hipDoubleComplex *x, int incx, hipblasStride stridex, int batchCount, int *result)#
BLAS Level 1 API
The aminStridedBatched functions find the first index of the element of minimum magnitude of each vector x_i in a batch, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in each vector x_i.
x – [in] device pointer to the first vector x_1.
incx – [in] [int] specifies the increment for the elements of each x_i. incx must be > 0.
stridex – [in] [hipblasStride] specifies the pointer increment between one x_i and the next x_(i + 1).
batchCount – [in] [int] number of instances in the batch.
result – [out] device or host pointer to an array for storing contiguous batchCount results. Return value is 0 if n <= 0 or incx <= 0.
The aminStridedBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXasum + Batched, StridedBatched#
-
hipblasStatus_t hipblasSasum(hipblasHandle_t handle, int n, const float *x, int incx, float *result)#
-
hipblasStatus_t hipblasDasum(hipblasHandle_t handle, int n, const double *x, int incx, double *result)#
-
hipblasStatus_t hipblasScasum(hipblasHandle_t handle, int n, const hipComplex *x, int incx, float *result)#
-
hipblasStatus_t hipblasDzasum(hipblasHandle_t handle, int n, const hipDoubleComplex *x, int incx, double *result)#
BLAS Level 1 API
The asum functions compute the sum of the magnitudes of elements of a real vector x, or the sum of the magnitudes of the real and imaginary parts of elements if x is a complex vector.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in x.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x. incx must be > 0.
result – [inout] device pointer or host pointer to store the asum result. Return value is 0.0 if n <= 0.
The asum function supports the 64-bit integer interface. See the ILP64 interfaces section.
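For complex input, the description above means that asum adds |Re| + |Im| for each element, not the Euclidean modulus. A minimal CPU sketch of that rule follows. `ref_scasum` is a hypothetical helper that uses `std::complex<float>` as a stand-in for `hipComplex`.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

// Hypothetical CPU reference for the complex asum case:
// sums |real| + |imag| of each element, not the modulus |z|.
float ref_scasum(int n, const std::complex<float>* x, int incx)
{
    if (n <= 0 || incx <= 0) return 0.0f;
    float acc = 0.0f;
    for (int i = 0; i < n; ++i) {
        const std::complex<float>& e = x[i * incx];
        acc += std::fabs(e.real()) + std::fabs(e.imag());
    }
    return acc;
}
```

For the single element 3 - 4i this yields 7, whereas the modulus would be 5.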
-
hipblasStatus_t hipblasSasumBatched(hipblasHandle_t handle, int n, const float *const x[], int incx, int batchCount, float *result)#
-
hipblasStatus_t hipblasDasumBatched(hipblasHandle_t handle, int n, const double *const x[], int incx, int batchCount, double *result)#
-
hipblasStatus_t hipblasScasumBatched(hipblasHandle_t handle, int n, const hipComplex *const x[], int incx, int batchCount, float *result)#
-
hipblasStatus_t hipblasDzasumBatched(hipblasHandle_t handle, int n, const hipDoubleComplex *const x[], int incx, int batchCount, double *result)#
BLAS Level 1 API
The asumBatched functions compute the sum of the magnitudes of the elements in a batch of real vectors x_i, or the sum of the magnitudes of the real and imaginary parts of elements if x_i is a complex vector, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in each vector x_i.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i. incx must be > 0.
batchCount – [in] [int] number of instances in the batch.
result – [out] device array or host array of batchCount size for results. Return value is 0.0 if n <= 0 or incx <= 0.
The asumBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSasumStridedBatched(hipblasHandle_t handle, int n, const float *x, int incx, hipblasStride stridex, int batchCount, float *result)#
-
hipblasStatus_t hipblasDasumStridedBatched(hipblasHandle_t handle, int n, const double *x, int incx, hipblasStride stridex, int batchCount, double *result)#
-
hipblasStatus_t hipblasScasumStridedBatched(hipblasHandle_t handle, int n, const hipComplex *x, int incx, hipblasStride stridex, int batchCount, float *result)#
-
hipblasStatus_t hipblasDzasumStridedBatched(hipblasHandle_t handle, int n, const hipDoubleComplex *x, int incx, hipblasStride stridex, int batchCount, double *result)#
BLAS Level 1 API
The asumStridedBatched functions compute the sum of the magnitudes of elements of real vectors x_i, or the sum of the magnitudes of the real and imaginary parts of elements if x_i is a complex vector, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in each vector x_i.
x – [in] device pointer to the first vector x_1.
incx – [in] [int] specifies the increment for the elements of each x_i. incx must be > 0.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_(i+1)). There are no restrictions placed on stridex. However, the user should take care to ensure that stridex is of an appropriate size. For a typical case, this means stridex >= n * incx.
batchCount – [in] [int] number of instances in the batch.
result – [out] device pointer or host pointer to an array for storing contiguous batchCount results. Return value is 0.0 if n <= 0 or incx <= 0.
The asumStridedBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
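The interplay of `incx`, `stridex`, and `batchCount` can be sketched on the CPU as shown below. `ref_sasum_strided_batched` is a hypothetical reference helper for the real single-precision case. It uses `std::int64_t` for the stride, mirroring the `hipblasStride` typedef listed under Definitions.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference: result[b] = sum_i |x_b[i * incx]|,
// where vector b starts at x + b * stridex.
void ref_sasum_strided_batched(int n, const float* x, int incx,
                               std::int64_t stridex, int batchCount,
                               float* result)
{
    for (int b = 0; b < batchCount; ++b) {
        const float* xb  = x + b * stridex;
        float        acc = 0.0f;
        if (incx > 0) {
            for (int i = 0; i < n; ++i)
                acc += std::fabs(xb[i * incx]);
        }
        result[b] = acc;   // stays 0.0 when n <= 0 or incx <= 0
    }
}
```

Choosing stridex larger than n * incx simply leaves unused padding between consecutive vectors.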
hipblasXaxpy + Batched, StridedBatched#
-
hipblasStatus_t hipblasHaxpy(hipblasHandle_t handle, int n, const hipblasHalf *alpha, const hipblasHalf *x, int incx, hipblasHalf *y, int incy)#
-
hipblasStatus_t hipblasSaxpy(hipblasHandle_t handle, int n, const float *alpha, const float *x, int incx, float *y, int incy)#
-
hipblasStatus_t hipblasDaxpy(hipblasHandle_t handle, int n, const double *alpha, const double *x, int incx, double *y, int incy)#
-
hipblasStatus_t hipblasCaxpy(hipblasHandle_t handle, int n, const hipComplex *alpha, const hipComplex *x, int incx, hipComplex *y, int incy)#
-
hipblasStatus_t hipblasZaxpy(hipblasHandle_t handle, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *x, int incx, hipDoubleComplex *y, int incy)#
BLAS Level 1 API
The axpy functions compute a constant alpha multiplied by vector x, plus vector y.
y := alpha * x + y
Supported precisions in rocBLAS: h, s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in x and y.
alpha – [in] device pointer or host pointer to specify the scalar alpha.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
y – [inout] device pointer storing vector y.
incy – [in] [int] specifies the increment for the elements of y.
The axpy function supports the 64-bit integer interface. See the ILP64 interfaces section.
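As an illustration of how the increments are applied, here is a CPU reference sketch of the update. `ref_saxpy` is a hypothetical helper. It assumes positive increments and takes `alpha` by value, whereas the hipBLAS functions take a pointer.

```cpp
#include <cassert>
#include <vector>

// Hypothetical CPU reference for y := alpha * x + y.
// Assumption: incx > 0 and incy > 0.
void ref_saxpy(int n, float alpha, const float* x, int incx,
               float* y, int incy)
{
    for (int i = 0; i < n; ++i)
        y[i * incy] += alpha * x[i * incx];   // y is read and then overwritten
}
```

With incy = 2, only every second element of y is updated and the elements in between are left untouched.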
-
hipblasStatus_t hipblasHaxpyBatched(hipblasHandle_t handle, int n, const hipblasHalf *alpha, const hipblasHalf *const x[], int incx, hipblasHalf *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasSaxpyBatched(hipblasHandle_t handle, int n, const float *alpha, const float *const x[], int incx, float *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasDaxpyBatched(hipblasHandle_t handle, int n, const double *alpha, const double *const x[], int incx, double *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasCaxpyBatched(hipblasHandle_t handle, int n, const hipComplex *alpha, const hipComplex *const x[], int incx, hipComplex *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasZaxpyBatched(hipblasHandle_t handle, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *const x[], int incx, hipDoubleComplex *const y[], int incy, int batchCount)#
BLAS Level 1 API
The axpyBatched functions compute y := alpha * x + y over a set of batched vectors.
Supported precisions in rocBLAS: h, s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in x and y.
alpha – [in] specifies the scalar alpha.
x – [in] pointer storing vector x on the GPU.
incx – [in] [int] specifies the increment for the elements of x.
y – [inout] pointer storing vector y on the GPU.
incy – [in] [int] specifies the increment for the elements of y.
batchCount – [in] [int] number of instances in the batch.
The axpyBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasHaxpyStridedBatched(hipblasHandle_t handle, int n, const hipblasHalf *alpha, const hipblasHalf *x, int incx, hipblasStride stridex, hipblasHalf *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasSaxpyStridedBatched(hipblasHandle_t handle, int n, const float *alpha, const float *x, int incx, hipblasStride stridex, float *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasDaxpyStridedBatched(hipblasHandle_t handle, int n, const double *alpha, const double *x, int incx, hipblasStride stridex, double *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasCaxpyStridedBatched(hipblasHandle_t handle, int n, const hipComplex *alpha, const hipComplex *x, int incx, hipblasStride stridex, hipComplex *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasZaxpyStridedBatched(hipblasHandle_t handle, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *x, int incx, hipblasStride stridex, hipDoubleComplex *y, int incy, hipblasStride stridey, int batchCount)#
BLAS Level 1 API
The axpyStridedBatched functions compute y := alpha * x + y over a set of strided batched vectors.
Supported precisions in rocBLAS: h, s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in each vector x and y.
alpha – [in] specifies the scalar alpha.
x – [in] pointer storing vector x on the GPU.
incx – [in] [int] specifies the increment for the elements of x.
stridex – [in] [hipblasStride] specifies the increment between vectors of x.
y – [inout] pointer storing vector y on the GPU.
incy – [in] [int] specifies the increment for the elements of y.
stridey – [in] [hipblasStride] specifies the increment between vectors of y.
batchCount – [in] [int] number of instances in the batch.
The axpyStridedBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXcopy + Batched, StridedBatched#
-
hipblasStatus_t hipblasScopy(hipblasHandle_t handle, int n, const float *x, int incx, float *y, int incy)#
-
hipblasStatus_t hipblasDcopy(hipblasHandle_t handle, int n, const double *x, int incx, double *y, int incy)#
-
hipblasStatus_t hipblasCcopy(hipblasHandle_t handle, int n, const hipComplex *x, int incx, hipComplex *y, int incy)#
-
hipblasStatus_t hipblasZcopy(hipblasHandle_t handle, int n, const hipDoubleComplex *x, int incx, hipDoubleComplex *y, int incy)#
BLAS Level 1 API
The copy functions copy each element x[i] into y[i], for i = 1, …, n.
y := x
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in x to be copied to y.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
y – [out] device pointer storing vector y.
incy – [in] [int] specifies the increment for the elements of y.
The copy function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasScopyBatched(hipblasHandle_t handle, int n, const float *const x[], int incx, float *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasDcopyBatched(hipblasHandle_t handle, int n, const double *const x[], int incx, double *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasCcopyBatched(hipblasHandle_t handle, int n, const hipComplex *const x[], int incx, hipComplex *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasZcopyBatched(hipblasHandle_t handle, int n, const hipDoubleComplex *const x[], int incx, hipDoubleComplex *const y[], int incy, int batchCount)#
BLAS Level 1 API
The copyBatched functions copy each element x_i[j] into y_i[j], for j = 1, …, n; i = 1, …, batchCount.
y_i := x_i,
where (x_i, y_i) is the i-th instance of the batch. x_i and y_i are vectors.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in each x_i to be copied to y_i.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each vector x_i.
y – [out] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment for the elements of each vector y_i.
batchCount – [in] [int] number of instances in the batch.
The copyBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasScopyStridedBatched(hipblasHandle_t handle, int n, const float *x, int incx, hipblasStride stridex, float *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasDcopyStridedBatched(hipblasHandle_t handle, int n, const double *x, int incx, hipblasStride stridex, double *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasCcopyStridedBatched(hipblasHandle_t handle, int n, const hipComplex *x, int incx, hipblasStride stridex, hipComplex *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasZcopyStridedBatched(hipblasHandle_t handle, int n, const hipDoubleComplex *x, int incx, hipblasStride stridex, hipDoubleComplex *y, int incy, hipblasStride stridey, int batchCount)#
BLAS Level 1 API
The copyStridedBatched functions copy each element x_i[j] into y_i[j], for j = 1, …, n; i = 1, …, batchCount.
y_i := x_i,
where (x_i, y_i) is the i-th instance of the batch. x_i and y_i are vectors.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in each x_i to be copied to y_i.
x – [in] device pointer to the first vector (x_1) in the batch.
incx – [in] [int] specifies the increments for the elements of vectors x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1). There are no restrictions placed on stridex. However, the user should ensure that stridex is of an appropriate size. For a typical case, this means stridex >= n * incx.
y – [out] device pointer to the first vector (y_1) in the batch.
incy – [in] [int] specifies the increment for the elements of vectors y_i.
stridey – [in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_i+1). There are no restrictions placed on stridey. However, the user should ensure that stridey is of an appropriate size. For a typical case this means stridey >= n * incy. stridey should be non zero.
batchCount – [in] [int] number of instances in the batch.
The copyStridedBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXdot + Batched, StridedBatched#
-
hipblasStatus_t hipblasHdot(hipblasHandle_t handle, int n, const hipblasHalf *x, int incx, const hipblasHalf *y, int incy, hipblasHalf *result)#
-
hipblasStatus_t hipblasBfdot(hipblasHandle_t handle, int n, const hipblasBfloat16 *x, int incx, const hipblasBfloat16 *y, int incy, hipblasBfloat16 *result)#
-
hipblasStatus_t hipblasSdot(hipblasHandle_t handle, int n, const float *x, int incx, const float *y, int incy, float *result)#
-
hipblasStatus_t hipblasDdot(hipblasHandle_t handle, int n, const double *x, int incx, const double *y, int incy, double *result)#
-
hipblasStatus_t hipblasCdotc(hipblasHandle_t handle, int n, const hipComplex *x, int incx, const hipComplex *y, int incy, hipComplex *result)#
-
hipblasStatus_t hipblasCdotu(hipblasHandle_t handle, int n, const hipComplex *x, int incx, const hipComplex *y, int incy, hipComplex *result)#
-
hipblasStatus_t hipblasZdotc(hipblasHandle_t handle, int n, const hipDoubleComplex *x, int incx, const hipDoubleComplex *y, int incy, hipDoubleComplex *result)#
-
hipblasStatus_t hipblasZdotu(hipblasHandle_t handle, int n, const hipDoubleComplex *x, int incx, const hipDoubleComplex *y, int incy, hipDoubleComplex *result)#
BLAS Level 1 API
The dot(u) functions perform the dot product of vectors x and y.
result = x * y;
The dotc functions perform the dot product of the conjugate of complex vector x and complex vector y.
result = conjugate (x) * y;
Supported precisions in rocBLAS: h, bf, s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in x and y.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
y – [in] device pointer storing vector y.
incy – [in] [int] specifies the increment for the elements of y.
result – [inout] device pointer or host pointer to store the dot product. Return value is 0.0 if n <= 0.
The dot function supports the 64-bit integer interface. See the ILP64 interfaces section.
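The difference between the dotu and dotc variants is easiest to see side by side. The sketch below is a CPU illustration only: `ref_cdotu` and `ref_cdotc` are hypothetical helpers, `std::complex<float>` stands in for `hipComplex`, and positive increments are assumed.

```cpp
#include <cassert>
#include <complex>
#include <vector>

using cfloat = std::complex<float>;

// Hypothetical CPU reference: unconjugated dot product, sum of x[i] * y[i].
cfloat ref_cdotu(int n, const cfloat* x, int incx, const cfloat* y, int incy)
{
    cfloat acc(0.0f, 0.0f);
    for (int i = 0; i < n; ++i)
        acc += x[i * incx] * y[i * incy];
    return acc;
}

// Hypothetical CPU reference: conjugated dot product, sum of conj(x[i]) * y[i].
cfloat ref_cdotc(int n, const cfloat* x, int incx, const cfloat* y, int incy)
{
    cfloat acc(0.0f, 0.0f);
    for (int i = 0; i < n; ++i)
        acc += std::conj(x[i * incx]) * y[i * incy];
    return acc;
}
```

For x = y = i, the unconjugated product is i * i = -1, while the conjugated product is conj(i) * i = 1.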
-
hipblasStatus_t hipblasHdotBatched(hipblasHandle_t handle, int n, const hipblasHalf *const x[], int incx, const hipblasHalf *const y[], int incy, int batchCount, hipblasHalf *result)#
-
hipblasStatus_t hipblasBfdotBatched(hipblasHandle_t handle, int n, const hipblasBfloat16 *const x[], int incx, const hipblasBfloat16 *const y[], int incy, int batchCount, hipblasBfloat16 *result)#
-
hipblasStatus_t hipblasSdotBatched(hipblasHandle_t handle, int n, const float *const x[], int incx, const float *const y[], int incy, int batchCount, float *result)#
-
hipblasStatus_t hipblasDdotBatched(hipblasHandle_t handle, int n, const double *const x[], int incx, const double *const y[], int incy, int batchCount, double *result)#
-
hipblasStatus_t hipblasCdotcBatched(hipblasHandle_t handle, int n, const hipComplex *const x[], int incx, const hipComplex *const y[], int incy, int batchCount, hipComplex *result)#
-
hipblasStatus_t hipblasCdotuBatched(hipblasHandle_t handle, int n, const hipComplex *const x[], int incx, const hipComplex *const y[], int incy, int batchCount, hipComplex *result)#
-
hipblasStatus_t hipblasZdotcBatched(hipblasHandle_t handle, int n, const hipDoubleComplex *const x[], int incx, const hipDoubleComplex *const y[], int incy, int batchCount, hipDoubleComplex *result)#
-
hipblasStatus_t hipblasZdotuBatched(hipblasHandle_t handle, int n, const hipDoubleComplex *const x[], int incx, const hipDoubleComplex *const y[], int incy, int batchCount, hipDoubleComplex *result)#
BLAS Level 1 API
The dot(u)Batched functions perform a batch of dot products of vectors x and y.
result_i = x_i * y_i;
The dotcBatched functions perform a batch of dot products of the conjugate of complex vector x and complex vector y.
result_i = conjugate (x_i) * y_i;
where (x_i, y_i) is the i-th instance of the batch and x_i and y_i are vectors, for i = 1, …, batchCount.
Supported precisions in rocBLAS: h, bf, s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in each x_i and y_i.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
y – [in] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment for the elements of each y_i.
batchCount – [in] [int] number of instances in the batch.
result – [inout] device array or host array of batchCount size to store the dot products of each batch. Returns 0.0 for each element if n <= 0.
The dotBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
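In the batched form, x and y are arrays of pointers, one per batch instance, and result holds batchCount values. The following host-side sketch shows that layout. The name ref_dot_batched is hypothetical, the pointers here are ordinary host pointers rather than device pointers, and positive increments are assumed.

```cpp
#include <cstddef>

// Host-side reference for dot(u)Batched: result[i] = x[i] . y[i].
// x and y are arrays of batchCount pointers, one pointer per vector.
// Assumption of this sketch: incx > 0 and incy > 0.
template <typename T>
void ref_dot_batched(int n, const T* const x[], int incx,
                     const T* const y[], int incy,
                     int batchCount, T* result)
{
    for (int i = 0; i < batchCount; ++i) {
        T acc{}; // stays zero when n <= 0
        for (int k = 0; k < n; ++k) {
            const std::ptrdiff_t ix = static_cast<std::ptrdiff_t>(k) * incx;
            const std::ptrdiff_t iy = static_cast<std::ptrdiff_t>(k) * incy;
            acc += x[i][ix] * y[i][iy];
        }
        result[i] = acc;
    }
}
```

Because each instance is reached through its own pointer, the vectors of a batch do not need to be contiguous or evenly spaced in memory.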
-
hipblasStatus_t hipblasHdotStridedBatched(hipblasHandle_t handle, int n, const hipblasHalf *x, int incx, hipblasStride stridex, const hipblasHalf *y, int incy, hipblasStride stridey, int batchCount, hipblasHalf *result)#
-
hipblasStatus_t hipblasBfdotStridedBatched(hipblasHandle_t handle, int n, const hipblasBfloat16 *x, int incx, hipblasStride stridex, const hipblasBfloat16 *y, int incy, hipblasStride stridey, int batchCount, hipblasBfloat16 *result)#
-
hipblasStatus_t hipblasSdotStridedBatched(hipblasHandle_t handle, int n, const float *x, int incx, hipblasStride stridex, const float *y, int incy, hipblasStride stridey, int batchCount, float *result)#
-
hipblasStatus_t hipblasDdotStridedBatched(hipblasHandle_t handle, int n, const double *x, int incx, hipblasStride stridex, const double *y, int incy, hipblasStride stridey, int batchCount, double *result)#
-
hipblasStatus_t hipblasCdotcStridedBatched(hipblasHandle_t handle, int n, const hipComplex *x, int incx, hipblasStride stridex, const hipComplex *y, int incy, hipblasStride stridey, int batchCount, hipComplex *result)#
-
hipblasStatus_t hipblasCdotuStridedBatched(hipblasHandle_t handle, int n, const hipComplex *x, int incx, hipblasStride stridex, const hipComplex *y, int incy, hipblasStride stridey, int batchCount, hipComplex *result)#
-
hipblasStatus_t hipblasZdotcStridedBatched(hipblasHandle_t handle, int n, const hipDoubleComplex *x, int incx, hipblasStride stridex, const hipDoubleComplex *y, int incy, hipblasStride stridey, int batchCount, hipDoubleComplex *result)#
-
hipblasStatus_t hipblasZdotuStridedBatched(hipblasHandle_t handle, int n, const hipDoubleComplex *x, int incx, hipblasStride stridex, const hipDoubleComplex *y, int incy, hipblasStride stridey, int batchCount, hipDoubleComplex *result)#
BLAS Level 1 API
The dot(u)StridedBatched functions perform a batch of dot products of vectors x and y.
result_i = x_i * y_i;
The dotcStridedBatched functions perform a batch of dot products of the conjugate of complex vector x and complex vector y.
result_i = conjugate (x_i) * y_i;
where (x_i, y_i) is the i-th instance of the batch and x_i and y_i are vectors, for i = 1, …, batchCount.
Supported precisions in rocBLAS: h, bf, s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in each x_i and y_i.
x – [in] device pointer to the first vector (x_1) in the batch.
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1).
y – [in] device pointer to the first vector (y_1) in the batch.
incy – [in] [int] specifies the increment for the elements of each y_i.
stridey – [in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_i+1).
batchCount – [in] [int] number of instances in the batch.
result – [inout] device array or host array of batchCount size to store the dot products of each batch. Returns 0.0 for each element if n <= 0.
The dotStridedBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
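In the strided-batched form, a single base pointer is used and instance i starts stridex (or stridey) elements after instance i-1. The host-side sketch below illustrates that addressing. The name ref_dot_strided_batched is hypothetical, std::int64_t is used as a stand-in for hipblasStride, and positive increments are assumed.

```cpp
#include <cstddef>
#include <cstdint>

// Host-side reference for dot(u)StridedBatched.
// Instance i of x starts at x + i * stridex; likewise for y.
// Assumption of this sketch: incx > 0 and incy > 0.
template <typename T>
void ref_dot_strided_batched(int n, const T* x, int incx, std::int64_t stridex,
                             const T* y, int incy, std::int64_t stridey,
                             int batchCount, T* result)
{
    for (int i = 0; i < batchCount; ++i) {
        const T* xi = x + static_cast<std::ptrdiff_t>(i * stridex);
        const T* yi = y + static_cast<std::ptrdiff_t>(i * stridey);
        T acc{}; // stays zero when n <= 0
        for (int k = 0; k < n; ++k) {
            acc += xi[static_cast<std::ptrdiff_t>(k) * incx] *
                   yi[static_cast<std::ptrdiff_t>(k) * incy];
        }
        result[i] = acc;
    }
}
```

A stride larger than n * incx simply leaves unused elements between consecutive vectors, as in a padded layout.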
hipblasXnrm2 + Batched, StridedBatched#
-
hipblasStatus_t hipblasSnrm2(hipblasHandle_t handle, int n, const float *x, int incx, float *result)#
-
hipblasStatus_t hipblasDnrm2(hipblasHandle_t handle, int n, const double *x, int incx, double *result)#
-
hipblasStatus_t hipblasScnrm2(hipblasHandle_t handle, int n, const hipComplex *x, int incx, float *result)#
-
hipblasStatus_t hipblasDznrm2(hipblasHandle_t handle, int n, const hipDoubleComplex *x, int incx, double *result)#
BLAS Level 1 API
The nrm2 functions compute the Euclidean norm of a real or complex vector.
result := sqrt( x'*x ) for real vectors
result := sqrt( x**H*x ) for complex vectors
Supported precisions in rocBLAS: s, d, c, z, sc, and dz.
Supported precisions in cuBLAS: s, d, sc, and dz.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in x.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
result – [inout] device pointer or host pointer to store the nrm2 result. Return value is 0.0 if n <= 0 or incx <= 0.
The nrm2 function supports the 64-bit integer interface. See the ILP64 interfaces section.
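The two formulas above can be sketched on the host as follows. The name ref_nrm2 is hypothetical and std::complex stands in for hipDoubleComplex. This sketch sums squares directly, which can overflow or underflow for extreme inputs; library implementations may scale intermediate values, which is not shown here.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>

// Host-side reference for the real case: sqrt(x' * x).
double ref_nrm2(int n, const double* x, int incx)
{
    if (n <= 0 || incx <= 0) return 0.0; // documented early-out
    double sum = 0.0;
    for (int k = 0; k < n; ++k) {
        const double v = x[static_cast<std::ptrdiff_t>(k) * incx];
        sum += v * v;
    }
    return std::sqrt(sum);
}

// Host-side reference for the complex case: sqrt(x**H * x).
// The result is real even though the input is complex.
double ref_nrm2(int n, const std::complex<double>* x, int incx)
{
    if (n <= 0 || incx <= 0) return 0.0;
    double sum = 0.0;
    for (int k = 0; k < n; ++k) {
        // std::norm returns the squared magnitude |x_k|^2.
        sum += std::norm(x[static_cast<std::ptrdiff_t>(k) * incx]);
    }
    return std::sqrt(sum);
}
```

The real-valued return type in the complex overload mirrors the float and double result pointers in the hipblasScnrm2 and hipblasDznrm2 signatures.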
-
hipblasStatus_t hipblasSnrm2Batched(hipblasHandle_t handle, int n, const float *const x[], int incx, int batchCount, float *result)#
-
hipblasStatus_t hipblasDnrm2Batched(hipblasHandle_t handle, int n, const double *const x[], int incx, int batchCount, double *result)#
-
hipblasStatus_t hipblasScnrm2Batched(hipblasHandle_t handle, int n, const hipComplex *const x[], int incx, int batchCount, float *result)#
-
hipblasStatus_t hipblasDznrm2Batched(hipblasHandle_t handle, int n, const hipDoubleComplex *const x[], int incx, int batchCount, double *result)#
BLAS Level 1 API
The nrm2Batched functions compute the Euclidean norm over a batch of real or complex vectors.
result_i := sqrt( x_i'*x_i ) for real vectors x, for i = 1, ..., batchCount
result_i := sqrt( x_i**H*x_i ) for complex vectors x, for i = 1, ..., batchCount
Supported precisions in rocBLAS: s, d, c, z, sc, and dz.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in each x_i.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i. incx must be > 0.
batchCount – [in] [int] number of instances in the batch.
result – [out] device pointer or host pointer to array of batchCount size for nrm2 results. Return value is 0.0 for each element if n <= 0 or incx <= 0.
The nrm2Batched function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSnrm2StridedBatched(hipblasHandle_t handle, int n, const float *x, int incx, hipblasStride stridex, int batchCount, float *result)#
-
hipblasStatus_t hipblasDnrm2StridedBatched(hipblasHandle_t handle, int n, const double *x, int incx, hipblasStride stridex, int batchCount, double *result)#
-
hipblasStatus_t hipblasScnrm2StridedBatched(hipblasHandle_t handle, int n, const hipComplex *x, int incx, hipblasStride stridex, int batchCount, float *result)#
-
hipblasStatus_t hipblasDznrm2StridedBatched(hipblasHandle_t handle, int n, const hipDoubleComplex *x, int incx, hipblasStride stridex, int batchCount, double *result)#
BLAS Level 1 API
The nrm2StridedBatched functions compute the Euclidean norm over a batch of real or complex vectors.
result_i := sqrt( x_i'*x_i ) for real vectors x, for i = 1, ..., batchCount
result_i := sqrt( x_i**H*x_i ) for complex vectors x, for i = 1, ..., batchCount
Supported precisions in rocBLAS: s, d, c, z, sc, and dz.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in each x_i.
x – [in] device pointer to the first vector x_1.
incx – [in] [int] specifies the increment for the elements of each x_i. incx must be > 0.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1). There are no restrictions placed on stridex. However, the user should ensure that stridex is of an appropriate size. For a typical case, this means stridex >= n * incx.
batchCount – [in] [int] number of instances in the batch.
result – [out] device pointer or host pointer to array for storing contiguous batchCount results. Return value is 0.0 for each element if n <= 0 or incx <= 0.
The nrm2StridedBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXrot + Batched, StridedBatched#
-
hipblasStatus_t hipblasSrot(hipblasHandle_t handle, int n, float *x, int incx, float *y, int incy, const float *c, const float *s)#
-
hipblasStatus_t hipblasDrot(hipblasHandle_t handle, int n, double *x, int incx, double *y, int incy, const double *c, const double *s)#
-
hipblasStatus_t hipblasCrot(hipblasHandle_t handle, int n, hipComplex *x, int incx, hipComplex *y, int incy, const float *c, const hipComplex *s)#
-
hipblasStatus_t hipblasCsrot(hipblasHandle_t handle, int n, hipComplex *x, int incx, hipComplex *y, int incy, const float *c, const float *s)#
-
hipblasStatus_t hipblasZrot(hipblasHandle_t handle, int n, hipDoubleComplex *x, int incx, hipDoubleComplex *y, int incy, const double *c, const hipDoubleComplex *s)#
-
hipblasStatus_t hipblasZdrot(hipblasHandle_t handle, int n, hipDoubleComplex *x, int incx, hipDoubleComplex *y, int incy, const double *c, const double *s)#
BLAS Level 1 API
The rot functions apply the Givens rotation matrix defined by c = cos(alpha) and s = sin(alpha) to vectors x and y. Scalars c and s can be stored in either host or device memory. The location is specified by calling hipblasSetPointerMode.
Supported precisions in rocBLAS: s, d, c, z, sc, and dz.
Supported precisions in cuBLAS: s, d, c, z, cs, and zd.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in the x and y vectors.
x – [inout] device pointer storing vector x.
incx – [in] [int] specifies the increment between elements of x.
y – [inout] device pointer storing vector y.
incy – [in] [int] specifies the increment between elements of y.
c – [in] device pointer or host pointer storing the scalar cosine component of the rotation matrix.
s – [in] device pointer or host pointer storing the scalar sine component of the rotation matrix.
The rot function supports the 64-bit integer interface. See the ILP64 interfaces section.
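For real vectors, applying the rotation means replacing each pair (x_k, y_k) with (c*x_k + s*y_k, c*y_k - s*x_k), which is the standard BLAS rot convention. The host-side sketch below shows this for doubles. The name ref_rot is hypothetical, positive increments are assumed, and the complex variants are not covered.

```cpp
#include <cstddef>

// Host-side reference for the real rot operation:
//   x_k <- c * x_k + s * y_k
//   y_k <- c * y_k - s * x_k
// Assumption of this sketch: incx > 0 and incy > 0.
void ref_rot(int n, double* x, int incx, double* y, int incy,
             double c, double s)
{
    for (int k = 0; k < n; ++k) {
        const std::ptrdiff_t ix = static_cast<std::ptrdiff_t>(k) * incx;
        const std::ptrdiff_t iy = static_cast<std::ptrdiff_t>(k) * incy;
        // Read both inputs before writing, since each output needs both.
        const double xk = x[ix];
        const double yk = y[iy];
        x[ix] = c * xk + s * yk;
        y[iy] = c * yk - s * xk;
    }
}
```

With c = 0 and s = 1 (a quarter turn), x takes the old values of y, and y takes the negated old values of x.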
-
hipblasStatus_t hipblasSrotBatched(hipblasHandle_t handle, int n, float *const x[], int incx, float *const y[], int incy, const float *c, const float *s, int batchCount)#
-
hipblasStatus_t hipblasDrotBatched(hipblasHandle_t handle, int n, double *const x[], int incx, double *const y[], int incy, const double *c, const double *s, int batchCount)#
-
hipblasStatus_t hipblasCrotBatched(hipblasHandle_t handle, int n, hipComplex *const x[], int incx, hipComplex *const y[], int incy, const float *c, const hipComplex *s, int batchCount)#
-
hipblasStatus_t hipblasCsrotBatched(hipblasHandle_t handle, int n, hipComplex *const x[], int incx, hipComplex *const y[], int incy, const float *c, const float *s, int batchCount)#
-
hipblasStatus_t hipblasZrotBatched(hipblasHandle_t handle, int n, hipDoubleComplex *const x[], int incx, hipDoubleComplex *const y[], int incy, const double *c, const hipDoubleComplex *s, int batchCount)#
-
hipblasStatus_t hipblasZdrotBatched(hipblasHandle_t handle, int n, hipDoubleComplex *const x[], int incx, hipDoubleComplex *const y[], int incy, const double *c, const double *s, int batchCount)#
BLAS Level 1 API
The rotBatched functions apply the Givens rotation matrix defined by c = cos(alpha) and s = sin(alpha) to batched vectors x_i and y_i, for i = 1, …, batchCount. Scalars c and s can be stored in either host or device memory. The location is specified by calling hipblasSetPointerMode.
Supported precisions in rocBLAS: s, d, sc, and dz.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in each x_i and y_i vector.
x – [inout] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment between elements of each x_i.
y – [inout] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment between elements of each y_i.
c – [in] device pointer or host pointer to the scalar cosine component of the rotation matrix.
s – [in] device pointer or host pointer to the scalar sine component of the rotation matrix.
batchCount – [in] [int] the number of x and y arrays, that is, the number of batches.
The rotBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSrotStridedBatched(hipblasHandle_t handle, int n, float *x, int incx, hipblasStride stridex, float *y, int incy, hipblasStride stridey, const float *c, const float *s, int batchCount)#
-
hipblasStatus_t hipblasDrotStridedBatched(hipblasHandle_t handle, int n, double *x, int incx, hipblasStride stridex, double *y, int incy, hipblasStride stridey, const double *c, const double *s, int batchCount)#
-
hipblasStatus_t hipblasCrotStridedBatched(hipblasHandle_t handle, int n, hipComplex *x, int incx, hipblasStride stridex, hipComplex *y, int incy, hipblasStride stridey, const float *c, const hipComplex *s, int batchCount)#
-
hipblasStatus_t hipblasCsrotStridedBatched(hipblasHandle_t handle, int n, hipComplex *x, int incx, hipblasStride stridex, hipComplex *y, int incy, hipblasStride stridey, const float *c, const float *s, int batchCount)#
-
hipblasStatus_t hipblasZrotStridedBatched(hipblasHandle_t handle, int n, hipDoubleComplex *x, int incx, hipblasStride stridex, hipDoubleComplex *y, int incy, hipblasStride stridey, const double *c, const hipDoubleComplex *s, int batchCount)#
-
hipblasStatus_t hipblasZdrotStridedBatched(hipblasHandle_t handle, int n, hipDoubleComplex *x, int incx, hipblasStride stridex, hipDoubleComplex *y, int incy, hipblasStride stridey, const double *c, const double *s, int batchCount)#
BLAS Level 1 API
The rotStridedBatched functions apply the Givens rotation matrix defined by c = cos(alpha) and s = sin(alpha) to strided batched vectors x_i and y_i, for i = 1, …, batchCount. Scalars c and s can be stored in either host or device memory. The location is specified by calling hipblasSetPointerMode.
Supported precisions in rocBLAS: s, d, sc, and dz.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in each x_i and y_i vector.
x – [inout] device pointer to the first vector x_1.
incx – [in] [int] specifies the increment between elements of each x_i.
stridex – [in] [hipblasStride] specifies the increment from the beginning of x_i to the beginning of x_(i+1).
y – [inout] device pointer to the first vector y_1.
incy – [in] [int] specifies the increment between elements of each y_i.
stridey – [in] [hipblasStride] specifies the increment from the beginning of y_i to the beginning of y_(i+1).
c – [in] device pointer or host pointer to the scalar cosine component of the rotation matrix.
s – [in] device pointer or host pointer to the scalar sine component of the rotation matrix.
batchCount – [in] [int] the number of x and y arrays, that is, the number of batches.
The rotStridedBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXrotg + Batched, StridedBatched#
-
hipblasStatus_t hipblasSrotg(hipblasHandle_t handle, float *a, float *b, float *c, float *s)#
-
hipblasStatus_t hipblasDrotg(hipblasHandle_t handle, double *a, double *b, double *c, double *s)#
-
hipblasStatus_t hipblasCrotg(hipblasHandle_t handle, hipComplex *a, hipComplex *b, float *c, hipComplex *s)#
-
hipblasStatus_t hipblasZrotg(hipblasHandle_t handle, hipDoubleComplex *a, hipDoubleComplex *b, double *c, hipDoubleComplex *s)#
BLAS Level 1 API
The rotg functions create the Givens rotation matrix for the vector (a b). Scalars c and s and arrays a and b can be stored in either host or device memory. The location is specified by calling hipblasSetPointerMode. If the pointer mode is set to HIPBLAS_POINTER_MODE_HOST, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to HIPBLAS_POINTER_MODE_DEVICE, this function returns immediately and synchronization is required to read the results.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
a – [inout] device pointer or host pointer to the input vector element, overwritten with r.
b – [inout] device pointer or host pointer to the input vector element, overwritten with z.
c – [inout] device pointer or host pointer to the cosine element of the Givens rotation.
s – [inout] device pointer or host pointer to the sine element of the Givens rotation.
The rotg function supports the 64-bit integer interface. See the ILP64 interfaces section.
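To make the meaning of r, z, c, and s concrete, the following host-side sketch follows the classic reference BLAS rotg algorithm for real inputs. The name ref_rotg is hypothetical, and it is an assumption that the hipBLAS backends agree with this reference in every edge case (for example, sign conventions when a or b is zero).

```cpp
#include <cmath>

// Host-side sketch of the classic real rotg algorithm.
// On return: *a holds r, *b holds z, and (c, s) satisfy
//   c * a_in + s * b_in = r   and   -s * a_in + c * b_in = 0.
void ref_rotg(double* a, double* b, double* c, double* s)
{
    const double abs_a = std::fabs(*a);
    const double abs_b = std::fabs(*b);
    const double roe   = (abs_a > abs_b) ? *a : *b; // r takes the sign of the larger input
    const double scale = abs_a + abs_b;
    double r = 0.0;
    double z = 0.0;
    if (scale == 0.0) {
        // Both inputs are zero: the rotation is the identity.
        *c = 1.0;
        *s = 0.0;
    } else {
        // Scale before squaring to reduce the risk of overflow.
        const double sa = *a / scale;
        const double sb = *b / scale;
        r  = std::copysign(scale * std::sqrt(sa * sa + sb * sb), roe);
        *c = *a / r;
        *s = *b / r;
        // z is a compact encoding from which c and s can be recovered.
        z = 1.0;
        if (abs_a > abs_b)
            z = *s;
        else if (*c != 0.0)
            z = 1.0 / *c;
    }
    *a = r;
    *b = z;
}
```

For the input (a, b) = (3, 4), this yields r = 5, c = 0.6, and s = 0.8, up to rounding.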
-
hipblasStatus_t hipblasSrotgBatched(hipblasHandle_t handle, float *const a[], float *const b[], float *const c[], float *const s[], int batchCount)#
-
hipblasStatus_t hipblasDrotgBatched(hipblasHandle_t handle, double *const a[], double *const b[], double *const c[], double *const s[], int batchCount)#
-
hipblasStatus_t hipblasCrotgBatched(hipblasHandle_t handle, hipComplex *const a[], hipComplex *const b[], float *const c[], hipComplex *const s[], int batchCount)#
-
hipblasStatus_t hipblasZrotgBatched(hipblasHandle_t handle, hipDoubleComplex *const a[], hipDoubleComplex *const b[], double *const c[], hipDoubleComplex *const s[], int batchCount)#
BLAS Level 1 API
The rotgBatched functions create the Givens rotation matrix for the batched vectors (a_i b_i), for i = 1, …, batchCount. a, b, c, and s can be stored in either host or device memory. The location is specified by calling hipblasSetPointerMode. If the pointer mode is set to HIPBLAS_POINTER_MODE_HOST, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to HIPBLAS_POINTER_MODE_DEVICE, this function returns immediately and synchronization is required to read the results.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
a – [inout] device array of device pointers storing each single input vector element a_i, overwritten with r_i.
b – [inout] device array of device pointers storing each single input vector element b_i, overwritten with z_i.
c – [inout] device array of device pointers storing each cosine element of the Givens rotation for the batch.
s – [inout] device array of device pointers storing each sine element of the Givens rotation for the batch.
batchCount – [in] [int] number of batches (length of arrays a, b, c, and s).
The rotgBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSrotgStridedBatched(hipblasHandle_t handle, float *a, hipblasStride stridea, float *b, hipblasStride strideb, float *c, hipblasStride stridec, float *s, hipblasStride strides, int batchCount)#
-
hipblasStatus_t hipblasDrotgStridedBatched(hipblasHandle_t handle, double *a, hipblasStride stridea, double *b, hipblasStride strideb, double *c, hipblasStride stridec, double *s, hipblasStride strides, int batchCount)#
-
hipblasStatus_t hipblasCrotgStridedBatched(hipblasHandle_t handle, hipComplex *a, hipblasStride stridea, hipComplex *b, hipblasStride strideb, float *c, hipblasStride stridec, hipComplex *s, hipblasStride strides, int batchCount)#
-
hipblasStatus_t hipblasZrotgStridedBatched(hipblasHandle_t handle, hipDoubleComplex *a, hipblasStride stridea, hipDoubleComplex *b, hipblasStride strideb, double *c, hipblasStride stridec, hipDoubleComplex *s, hipblasStride strides, int batchCount)#
BLAS Level 1 API
The rotgStridedBatched functions create the Givens rotation matrix for the strided batched vectors (a_i b_i), for i = 1, …, batchCount. a, b, c, and s can be stored in either host or device memory. The location is specified by calling hipblasSetPointerMode. If the pointer mode is set to HIPBLAS_POINTER_MODE_HOST, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to HIPBLAS_POINTER_MODE_DEVICE, this function returns immediately and synchronization is required to read the results.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
a – [inout] device strided_batched pointer or host strided_batched pointer to the first single input vector element a_1, overwritten with r.
stridea – [in] [hipblasStride] distance between elements of a in batch (distance between a_i and a_(i + 1)).
b – [inout] device strided_batched pointer or host strided_batched pointer to the first single input vector element b_1, overwritten with z.
strideb – [in] [hipblasStride] distance between elements of b in batch (distance between b_i and b_(i + 1)).
c – [inout] device strided_batched pointer or host strided_batched pointer to the first cosine element of the Givens rotations c_1.
stridec – [in] [hipblasStride] distance between elements of c in batch (distance between c_i and c_(i + 1)).
s – [inout] device strided_batched pointer or host strided_batched pointer to the first sine element of the Givens rotations s_1.
strides – [in] [hipblasStride] distance between elements of s in batch (distance between s_i and s_(i + 1)).
batchCount – [in] [int] number of batches (length of arrays a, b, c, and s).
The rotgStridedBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXrotm + Batched, StridedBatched#
-
hipblasStatus_t hipblasSrotm(hipblasHandle_t handle, int n, float *x, int incx, float *y, int incy, const float *param)#
-
hipblasStatus_t hipblasDrotm(hipblasHandle_t handle, int n, double *x, int incx, double *y, int incy, const double *param)#
BLAS Level 1 API
The rotm functions apply the modified Givens rotation matrix defined by param to vectors x and y.
Supported precisions in rocBLAS: s and d.
Supported precisions in cuBLAS: s and d.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in the x and y vectors.
x – [inout] device pointer storing vector x.
incx – [in] [int] specifies the increment between elements of x.
y – [inout] device pointer storing vector y.
incy – [in] [int] specifies the increment between elements of y.
param – [in] device vector or host vector of five elements defining the rotation. param can be stored in either the host or device memory. The location is specified by calling hipblasSetPointerMode.
param[0] = flag
param[1] = H11
param[2] = H21
param[3] = H12
param[4] = H22
The flag parameter defines the form of the 2x2 matrix H, written here row by row as ( row 1 ; row 2 ):
flag = -1 => H = ( H11 H12 ; H21 H22 )
flag = 0 => H = ( 1.0 H12 ; H21 1.0 )
flag = 1 => H = ( H11 1.0 ; -1.0 H22 )
flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
The rotm function supports the 64-bit integer interface. See the ILP64 interfaces section.
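The param layout and the four flag cases can be sketched on the host as follows. The name ref_rotm is hypothetical and positive increments are assumed. Each pair (x_k, y_k) is replaced with H applied to it, that is, (H11*x_k + H12*y_k, H21*x_k + H22*y_k).

```cpp
#include <cstddef>

// Host-side reference for rotm with param = { flag, H11, H21, H12, H22 }.
// Assumption of this sketch: incx > 0 and incy > 0.
void ref_rotm(int n, double* x, int incx, double* y, int incy,
              const double* param)
{
    const double flag = param[0];
    if (flag == -2.0) return; // H is the identity: nothing to do

    double h11, h12, h21, h22;
    if (flag == -1.0) {        // full matrix taken from param
        h11 = param[1]; h21 = param[2]; h12 = param[3]; h22 = param[4];
    } else if (flag == 0.0) {  // unit diagonal, off-diagonal from param
        h11 = 1.0;      h21 = param[2]; h12 = param[3]; h22 = 1.0;
    } else {                   // flag == 1: diagonal from param, fixed off-diagonal
        h11 = param[1]; h21 = -1.0;     h12 = 1.0;      h22 = param[4];
    }

    for (int k = 0; k < n; ++k) {
        const std::ptrdiff_t ix = static_cast<std::ptrdiff_t>(k) * incx;
        const std::ptrdiff_t iy = static_cast<std::ptrdiff_t>(k) * incy;
        const double xk = x[ix];
        const double yk = y[iy];
        x[ix] = h11 * xk + h12 * yk;
        y[iy] = h21 * xk + h22 * yk;
    }
}
```

Note that param stores H21 before H12, so the off-diagonal entries are read from param[2] and param[3] in that order.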
-
hipblasStatus_t hipblasSrotmBatched(hipblasHandle_t handle, int n, float *const x[], int incx, float *const y[], int incy, const float *const param[], int batchCount)#
-
hipblasStatus_t hipblasDrotmBatched(hipblasHandle_t handle, int n, double *const x[], int incx, double *const y[], int incy, const double *const param[], int batchCount)#
BLAS Level 1 API
The rotmBatched functions apply the modified Givens rotation matrix defined by param_i to batched vectors x_i and y_i, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s and d.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in the x and y vectors.
x – [inout] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment between elements of each x_i.
y – [inout] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment between elements of each y_i.
param – [in] device array of device vectors of five elements defining the rotation. param can ONLY be stored on the device for the batched version of this function.
param[0] = flag
param[1] = H11
param[2] = H21
param[3] = H12
param[4] = H22
The flag parameter defines the form of the 2x2 matrix H, written here row by row as ( row 1 ; row 2 ):
flag = -1 => H = ( H11 H12 ; H21 H22 )
flag = 0 => H = ( 1.0 H12 ; H21 1.0 )
flag = 1 => H = ( H11 1.0 ; -1.0 H22 )
flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
batchCount – [in] [int] the number of x and y arrays, that is, the number of batches.
The rotmBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSrotmStridedBatched(hipblasHandle_t handle, int n, float *x, int incx, hipblasStride stridex, float *y, int incy, hipblasStride stridey, const float *param, hipblasStride strideParam, int batchCount)#
-
hipblasStatus_t hipblasDrotmStridedBatched(hipblasHandle_t handle, int n, double *x, int incx, hipblasStride stridex, double *y, int incy, hipblasStride stridey, const double *param, hipblasStride strideParam, int batchCount)#
BLAS Level 1 API
The rotmStridedBatched functions apply the modified Givens rotation matrix defined by
param_i to strided batched vectors x_i and y_i, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s and d.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in the x and y vectors.
x – [inout] device pointer pointing to the first strided batched vector x_1.
incx – [in] [int] specifies the increment between elements of each x_i.
stridex – [in] [hipblasStride] specifies the increment between the beginning of x_i and x_(i + 1).
y – [inout] device pointer pointing to the first strided batched vector y_1.
incy – [in] [int] specifies the increment between elements of each y_i.
stridey – [in] [hipblasStride] specifies the increment between the beginning of y_i and y_(i + 1).
param – [in] device pointer pointing to the first array of five elements defining the rotation (param_1). param can ONLY be stored on the device for the strided_batched version of this function.
param[0] = flag
param[1] = H11
param[2] = H21
param[3] = H12
param[4] = H22
The flag parameter defines the form of the 2x2 matrix H, written here row by row as ( row 1 ; row 2 ):
flag = -1 => H = ( H11 H12 ; H21 H22 )
flag = 0 => H = ( 1.0 H12 ; H21 1.0 )
flag = 1 => H = ( H11 1.0 ; -1.0 H22 )
flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
strideParam – [in] [hipblasStride] specifies the increment between the beginning of param_i and param_(i + 1).
batchCount – [in] [int] the number of x and y arrays, that is, the number of batches.
The rotmStridedBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXrotmg + Batched, StridedBatched#
-
hipblasStatus_t hipblasSrotmg(hipblasHandle_t handle, float *d1, float *d2, float *x1, const float *y1, float *param)#
-
hipblasStatus_t hipblasDrotmg(hipblasHandle_t handle, double *d1, double *d2, double *x1, const double *y1, double *param)#
BLAS Level 1 API
The rotmg functions create the modified Givens rotation matrix for the vector (d1 * x1, d2 * y1). Parameters can be stored in either host or device memory. The location is specified by calling hipblasSetPointerMode. If the pointer mode is set to HIPBLAS_POINTER_MODE_HOST, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to HIPBLAS_POINTER_MODE_DEVICE, this function returns immediately and synchronization is required to read the results.
Supported precisions in rocBLAS: s and d.
Supported precisions in cuBLAS: s and d.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
d1 – [inout] device pointer or host pointer to input scalar that is overwritten.
d2 – [inout] device pointer or host pointer to input scalar that is overwritten.
x1 – [inout] device pointer or host pointer to input scalar that is overwritten.
y1 – [in] device pointer or host pointer to input scalar.
param – [out] device vector or host vector of five elements defining the rotation. param can be stored in either host or device memory. The location is specified by calling hipblasSetPointerMode.
param[0] = flag
param[1] = H11
param[2] = H21
param[3] = H12
param[4] = H22
The flag parameter defines the form of the 2x2 matrix H, written here row by row as ( row 1 ; row 2 ):
flag = -1 => H = ( H11 H12 ; H21 H22 )
flag = 0 => H = ( 1.0 H12 ; H21 1.0 )
flag = 1 => H = ( H11 1.0 ; -1.0 H22 )
flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
The rotmg function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSrotmgBatched(hipblasHandle_t handle, float *const d1[], float *const d2[], float *const x1[], const float *const y1[], float *const param[], int batchCount)#
-
hipblasStatus_t hipblasDrotmgBatched(hipblasHandle_t handle, double *const d1[], double *const d2[], double *const x1[], const double *const y1[], double *const param[], int batchCount)#
BLAS Level 1 API
The rotmgBatched functions create the modified Givens rotation matrix for the batched vectors (d1_i * x1_i, d2_i * y1_i), for i = 1, …, batchCount. Parameters can be stored in either host or device memory. The location is specified by calling hipblasSetPointerMode. If the pointer mode is set to HIPBLAS_POINTER_MODE_HOST, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to HIPBLAS_POINTER_MODE_DEVICE, this function returns immediately and synchronization is required to read the results.
Supported precisions in rocBLAS: s and d.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
d1 – [inout] device batched array or host batched array of input scalars that are overwritten.
d2 – [inout] device batched array or host batched array of input scalars that are overwritten.
x1 – [inout] device batched array or host batched array of input scalars that are overwritten.
y1 – [in] device batched array or host batched array of input scalars.
param – [out] device batched array or host batched array of vectors of five elements defining the rotation. param can be stored in either host or device memory. The location is specified by calling hipblasSetPointerMode.
param[0] = flag
param[1] = H11
param[2] = H21
param[3] = H12
param[4] = H22
The flag parameter defines the form of H (matrix rows are separated by semicolons):
flag = -1 => H = ( H11 H12 ; H21 H22 )
flag = 0 => H = ( 1.0 H12 ; H21 1.0 )
flag = 1 => H = ( H11 1.0 ; -1.0 H22 )
flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
batchCount – [in] [int] the number of instances in the batch.
The rotmgBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSrotmgStridedBatched(hipblasHandle_t handle, float *d1, hipblasStride strided1, float *d2, hipblasStride strided2, float *x1, hipblasStride stridex1, const float *y1, hipblasStride stridey1, float *param, hipblasStride strideParam, int batchCount)#
-
hipblasStatus_t hipblasDrotmgStridedBatched(hipblasHandle_t handle, double *d1, hipblasStride strided1, double *d2, hipblasStride strided2, double *x1, hipblasStride stridex1, const double *y1, hipblasStride stridey1, double *param, hipblasStride strideParam, int batchCount)#
BLAS Level 1 API
The rotmgStridedBatched functions create the modified Givens rotation matrix for the strided batched vectors (d1_i * x1_i, d2_i * y1_i), for i = 1, …, batchCount. Parameters can be stored in either host or device memory. The location is specified by calling hipblasSetPointerMode. If the pointer mode is set to HIPBLAS_POINTER_MODE_HOST, this function blocks the CPU until the GPU has finished and the results are available in host memory. If the pointer mode is set to HIPBLAS_POINTER_MODE_DEVICE, this function returns immediately and synchronization is required to read the results.
Supported precisions in rocBLAS: s and d.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
d1 – [inout] device strided_batched array or host strided_batched array of input scalars that is overwritten.
strided1 – [in] [hipblasStride] specifies the increment between the beginning of d1_i and d1_(i+1).
d2 – [inout] device strided_batched array or host strided_batched array of input scalars that is overwritten.
strided2 – [in] [hipblasStride] specifies the increment between the beginning of d2_i and d2_(i+1).
x1 – [inout] device strided_batched array or host strided_batched array of input scalars that is overwritten.
stridex1 – [in] [hipblasStride] specifies the increment between the beginning of x1_i and x1_(i+1).
y1 – [in] device strided_batched array or host strided_batched array of input scalars.
stridey1 – [in] [hipblasStride] specifies the increment between the beginning of y1_i and y1_(i+1).
param – [out] device strided_batched array or host strided_batched array of vectors of five elements defining the rotation. param can be stored in either host or device memory. The location is specified by calling hipblasSetPointerMode.
param[0] = flag
param[1] = H11
param[2] = H21
param[3] = H12
param[4] = H22
The flag parameter defines the form of H (matrix rows are separated by semicolons):
flag = -1 => H = ( H11 H12 ; H21 H22 )
flag = 0 => H = ( 1.0 H12 ; H21 1.0 )
flag = 1 => H = ( H11 1.0 ; -1.0 H22 )
flag = -2 => H = ( 1.0 0.0 ; 0.0 1.0 )
strideParam – [in] [hipblasStride] specifies the increment between the beginning of param_i and param_(i + 1).
batchCount – [in] [int] the number of instances in the batch.
The rotmgStridedBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXscal + Batched, StridedBatched#
-
hipblasStatus_t hipblasSscal(hipblasHandle_t handle, int n, const float *alpha, float *x, int incx)#
-
hipblasStatus_t hipblasDscal(hipblasHandle_t handle, int n, const double *alpha, double *x, int incx)#
-
hipblasStatus_t hipblasCscal(hipblasHandle_t handle, int n, const hipComplex *alpha, hipComplex *x, int incx)#
-
hipblasStatus_t hipblasCsscal(hipblasHandle_t handle, int n, const float *alpha, hipComplex *x, int incx)#
-
hipblasStatus_t hipblasZscal(hipblasHandle_t handle, int n, const hipDoubleComplex *alpha, hipDoubleComplex *x, int incx)#
-
hipblasStatus_t hipblasZdscal(hipblasHandle_t handle, int n, const double *alpha, hipDoubleComplex *x, int incx)#
BLAS Level 1 API
The scal functions scale each element of vector x with scalar alpha.
x := alpha * x
Supported precisions in rocBLAS: s, d, c, z, cs, and zd.
Supported precisions in cuBLAS: s, d, c, z, cs, and zd.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in x.
alpha – [in] device pointer or host pointer for the scalar alpha.
x – [inout] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
The scal function supports the 64-bit integer interface. See the ILP64 interfaces section.
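The role of n and incx in x := alpha * x can be illustrated with a host-side reference loop. This is only a sketch for checking results on the CPU. The name ref_sscal is hypothetical, the loop is not how hipBLAS computes the result on the GPU, and treating n <= 0 or incx <= 0 as a no-op is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side reference for single-precision scal.
 * Scales the n elements x[0], x[incx], x[2*incx], ... by alpha. */
static void ref_sscal(int n, float alpha, float *x, int incx)
{
    assert(x != NULL);
    if (n <= 0 || incx <= 0) {
        return; /* assumption of this sketch: nothing to do */
    }
    for (int i = 0; i < n; ++i) {
        x[(size_t)i * (size_t)incx] *= alpha;
    }
}
```

With incx = 2, only every second stored value is scaled. The values in between are left untouched.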
-
hipblasStatus_t hipblasSscalBatched(hipblasHandle_t handle, int n, const float *alpha, float *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasDscalBatched(hipblasHandle_t handle, int n, const double *alpha, double *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasCscalBatched(hipblasHandle_t handle, int n, const hipComplex *alpha, hipComplex *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasZscalBatched(hipblasHandle_t handle, int n, const hipDoubleComplex *alpha, hipDoubleComplex *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasCsscalBatched(hipblasHandle_t handle, int n, const float *alpha, hipComplex *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasZdscalBatched(hipblasHandle_t handle, int n, const double *alpha, hipDoubleComplex *const x[], int incx, int batchCount)#
BLAS Level 1 API
The scalBatched functions scale each element of vector x_i with scalar alpha, for i = 1, …, batchCount.
x_i := alpha * x_i
where (x_i) is the i-th instance of the batch.
Supported precisions in rocBLAS: s, d, c, z, cs, and zd.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in each x_i.
alpha – [in] host pointer or device pointer for the scalar alpha.
x – [inout] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
batchCount – [in] [int] specifies the number of batches in x.
The scalBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSscalStridedBatched(hipblasHandle_t handle, int n, const float *alpha, float *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasDscalStridedBatched(hipblasHandle_t handle, int n, const double *alpha, double *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasCscalStridedBatched(hipblasHandle_t handle, int n, const hipComplex *alpha, hipComplex *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasZscalStridedBatched(hipblasHandle_t handle, int n, const hipDoubleComplex *alpha, hipDoubleComplex *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasCsscalStridedBatched(hipblasHandle_t handle, int n, const float *alpha, hipComplex *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasZdscalStridedBatched(hipblasHandle_t handle, int n, const double *alpha, hipDoubleComplex *x, int incx, hipblasStride stridex, int batchCount)#
BLAS Level 1 API
The scalStridedBatched functions scale each element of vector x_i with scalar alpha, for i = 1, …, batchCount.
x_i := alpha * x_i
where (x_i) is the i-th instance of the batch.
Supported precisions in rocBLAS: s, d, c, z, cs, and zd.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in each x_i.
alpha – [in] host pointer or device pointer for the scalar alpha.
x – [inout] device pointer to the first vector (x_1) in the batch.
incx – [in] [int] specifies the increment for the elements of x.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_(i+1)). There are no restrictions placed on stridex. However, the user should ensure that stridex is of an appropriate size. For a typical case, this means stridex >= n * incx.
batchCount – [in] [int] specifies the number of batches in x.
The scalStridedBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
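The strided-batched layout places element j of vector x_i at offset i * stridex + j * incx from the base pointer (counting i and j from zero). The following host-side sketch shows that addressing. The name ref_sscal_strided_batched is hypothetical, int64_t is used here as a stand-in for hipblasStride, and positive incx and non-negative stridex are assumptions of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side reference for single-precision scalStridedBatched.
 * Vector b of the batch starts at x + b * stridex; within a vector,
 * consecutive elements are incx apart. */
static void ref_sscal_strided_batched(int n, float alpha, float *x, int incx,
                                      int64_t stridex, int batch_count)
{
    assert(x != NULL && incx > 0 && stridex >= 0);
    for (int b = 0; b < batch_count; ++b) {
        float *xb = x + (ptrdiff_t)b * (ptrdiff_t)stridex;
        for (int j = 0; j < n; ++j) {
            xb[(size_t)j * (size_t)incx] *= alpha;
        }
    }
}
```

When stridex is larger than n * incx, the padding between consecutive vectors is not touched.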
hipblasXswap + Batched, StridedBatched#
-
hipblasStatus_t hipblasSswap(hipblasHandle_t handle, int n, float *x, int incx, float *y, int incy)#
-
hipblasStatus_t hipblasDswap(hipblasHandle_t handle, int n, double *x, int incx, double *y, int incy)#
-
hipblasStatus_t hipblasCswap(hipblasHandle_t handle, int n, hipComplex *x, int incx, hipComplex *y, int incy)#
-
hipblasStatus_t hipblasZswap(hipblasHandle_t handle, int n, hipDoubleComplex *x, int incx, hipDoubleComplex *y, int incy)#
BLAS Level 1 API
The swap functions interchange vectors x and y.
y := x; x := y
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in x and y.
x – [inout] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
y – [inout] device pointer storing vector y.
incy – [in] [int] specifies the increment for the elements of y.
The swap function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSswapBatched(hipblasHandle_t handle, int n, float *const x[], int incx, float *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasDswapBatched(hipblasHandle_t handle, int n, double *const x[], int incx, double *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasCswapBatched(hipblasHandle_t handle, int n, hipComplex *const x[], int incx, hipComplex *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasZswapBatched(hipblasHandle_t handle, int n, hipDoubleComplex *const x[], int incx, hipDoubleComplex *const y[], int incy, int batchCount)#
BLAS Level 1 API
The swapBatched functions interchange vectors x_i and y_i, for i = 1, …, batchCount.
y_i := x_i; x_i := y_i
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in each x_i and y_i.
x – [inout] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
y – [inout] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment for the elements of each y_i.
batchCount – [in] [int] number of instances in the batch.
The swapBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
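In the batched variants, x and y are arrays of pointers, one pointer per vector in the batch. The host-side sketch below mirrors that layout to show what the interchange does to each pair (x_i, y_i). The name ref_sswap_batched is hypothetical, all memory here is host memory for illustration only, and positive increments are an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side reference for single-precision swapBatched.
 * x[b] and y[b] point to the b-th vectors of the batch; each pair is
 * interchanged element by element through a temporary. */
static void ref_sswap_batched(int n, float *const x[], int incx,
                              float *const y[], int incy, int batch_count)
{
    assert(x != NULL && y != NULL && incx > 0 && incy > 0);
    for (int b = 0; b < batch_count; ++b) {
        for (int j = 0; j < n; ++j) {
            float *px = &x[b][(size_t)j * (size_t)incx];
            float *py = &y[b][(size_t)j * (size_t)incy];
            const float tmp = *px;
            *px = *py;
            *py = tmp;
        }
    }
}
```

With hipBLAS itself, both the pointer arrays and the vectors they point to are expected to reside in device memory, as stated in the parameter list above.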
-
hipblasStatus_t hipblasSswapStridedBatched(hipblasHandle_t handle, int n, float *x, int incx, hipblasStride stridex, float *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasDswapStridedBatched(hipblasHandle_t handle, int n, double *x, int incx, hipblasStride stridex, double *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasCswapStridedBatched(hipblasHandle_t handle, int n, hipComplex *x, int incx, hipblasStride stridex, hipComplex *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasZswapStridedBatched(hipblasHandle_t handle, int n, hipDoubleComplex *x, int incx, hipblasStride stridex, hipDoubleComplex *y, int incy, hipblasStride stridey, int batchCount)#
BLAS Level 1 API
The swapStridedBatched functions interchange vectors x_i and y_i, for i = 1, …, batchCount.
y_i := x_i; x_i := y_i
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in each x_i and y_i.
x – [inout] device pointer to the first vector x_1.
incx – [in] [int] specifies the increment for the elements of x.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1). There are no restrictions placed on stridex. However, the user should ensure that stridex is of an appropriate size. For a typical case, this means stridex >= n * incx.
y – [inout] device pointer to the first vector y_1.
incy – [in] [int] specifies the increment for the elements of y.
stridey – [in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_i+1). There are no restrictions placed on stridey. However, the user should ensure that stridey is of an appropriate size. For a typical case, this means stridey >= n * incy. stridey should be non zero.
batchCount – [in] [int] number of instances in the batch.
The swapStridedBatched function supports the 64-bit integer interface. See the ILP64 interfaces section.
Level 2 BLAS#
hipblasXgbmv + Batched, StridedBatched#
-
hipblasStatus_t hipblasSgbmv(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const float *alpha, const float *AP, int lda, const float *x, int incx, const float *beta, float *y, int incy)#
-
hipblasStatus_t hipblasDgbmv(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const double *alpha, const double *AP, int lda, const double *x, int incx, const double *beta, double *y, int incy)#
-
hipblasStatus_t hipblasCgbmv(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const hipComplex *alpha, const hipComplex *AP, int lda, const hipComplex *x, int incx, const hipComplex *beta, hipComplex *y, int incy)#
-
hipblasStatus_t hipblasZgbmv(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, const hipDoubleComplex *x, int incx, const hipDoubleComplex *beta, hipDoubleComplex *y, int incy)#
BLAS Level 2 API
The gbmv functions perform one of the matrix-vector operations:
y := alpha*A*x + beta*y, or
y := alpha*A**T*x + beta*y, or
y := alpha*A**H*x + beta*y,
where alpha and beta are scalars, x and y are vectors, and A is an m by n banded matrix with kl sub-diagonals and ku super-diagonals.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
trans – [in] [hipblasOperation_t] indicates whether matrix A is transposed (conjugated) or not.
m – [in] [int] number of rows of matrix A.
n – [in] [int] number of columns of matrix A.
kl – [in] [int] number of sub-diagonals of A.
ku – [in] [int] number of super-diagonals of A.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in]
device pointer storing banded matrix A. The leading (kl + ku + 1) by n part of the matrix contains the coefficients of the banded matrix. The leading diagonal resides in row (ku + 1) with the first super-diagonal above on the RHS of row ku. The first sub-diagonal resides below on the LHS of row ku + 2. This propagates up and down across sub/super-diagonals.
Ex: (m = n = 7; ku = 2, kl = 2)
1 2 3 0 0 0 0 -> 0 0 3 3 3 3 3
4 1 2 3 0 0 0 -> 0 2 2 2 2 2 2
5 4 1 2 3 0 0 -> 1 1 1 1 1 1 1
0 5 4 1 2 3 0 -> 4 4 4 4 4 4 0
0 0 5 4 1 2 3 -> 5 5 5 5 5 0 0
0 0 0 5 4 1 2 -> 0 0 0 0 0 0 0
0 0 0 0 5 4 1 -> 0 0 0 0 0 0 0
Note that empty elements that don’t correspond to data will not be referenced.
lda – [in] [int] specifies the leading dimension of A. Must be >= (kl + ku + 1).
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device pointer storing vector y.
incy – [in] [int] specifies the increment for the elements of y.
The gbmv functions support the 64-bit integer interface. See the ILP64 interfaces section.
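The banded layout described for AP follows the usual BLAS band-storage rule: with zero-based indices and column-major storage, dense element A(i, j) is stored at row ku + i - j of column j. The sketch below packs a dense column-major matrix into that layout on the host. The function name pack_band and the dense leading dimension ldd are hypothetical, and the packed buffer would still need to be copied to device memory before calling gbmv.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side helper: packs the band of a dense column-major
 * m x n matrix (leading dimension ldd) into band storage with leading
 * dimension lda >= kl + ku + 1. Entries of band that do not correspond
 * to matrix data are left untouched. */
static void pack_band(int m, int n, int kl, int ku,
                      const float *dense, int ldd, float *band, int lda)
{
    assert(dense != NULL && band != NULL);
    assert(lda >= kl + ku + 1 && ldd >= m);
    for (int j = 0; j < n; ++j) {
        /* rows of column j that lie inside the band */
        const int i_lo = (j - ku > 0) ? j - ku : 0;
        const int i_hi = (j + kl < m - 1) ? j + kl : m - 1;
        for (int i = i_lo; i <= i_hi; ++i) {
            band[(size_t)(ku + i - j) + (size_t)j * (size_t)lda] =
                dense[(size_t)i + (size_t)j * (size_t)ldd];
        }
    }
}
```

For the m = n = 7, kl = ku = 2 case, the main diagonal lands in band row 2 (row ku + 1 when counting from one), the super-diagonals in the rows above it, and the sub-diagonals in the rows below it.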
-
hipblasStatus_t hipblasSgbmvBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const float *alpha, const float *const AP[], int lda, const float *const x[], int incx, const float *beta, float *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasDgbmvBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const double *alpha, const double *const AP[], int lda, const double *const x[], int incx, const double *beta, double *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasCgbmvBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const hipComplex *alpha, const hipComplex *const AP[], int lda, const hipComplex *const x[], int incx, const hipComplex *beta, hipComplex *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasZgbmvBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const hipDoubleComplex *alpha, const hipDoubleComplex *const AP[], int lda, const hipDoubleComplex *const x[], int incx, const hipDoubleComplex *beta, hipDoubleComplex *const y[], int incy, int batchCount)#
BLAS Level 2 API
The gbmvBatched functions perform one of the matrix-vector operations:
y_i := alpha*A_i*x_i + beta*y_i, or
y_i := alpha*A_i**T*x_i + beta*y_i, or
y_i := alpha*A_i**H*x_i + beta*y_i,
where (A_i, x_i, y_i) is the i-th instance of the batch, alpha and beta are scalars, x_i and y_i are vectors, and A_i is an m by n banded matrix with kl sub-diagonals and ku super-diagonals, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
trans – [in] [hipblasOperation_t] indicates whether each matrix A_i is transposed (conjugated) or not.
m – [in] [int] number of rows of each matrix A_i.
n – [in] [int] number of columns of each matrix A_i.
kl – [in] [int] number of sub-diagonals of each A_i.
ku – [in] [int] number of super-diagonals of each A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in]
device array of device pointers storing each banded matrix A_i. The leading (kl + ku + 1) by n part of the matrix contains the coefficients of the banded matrix. The leading diagonal resides in row (ku + 1) with the first super-diagonal above on the RHS of row ku. The first sub-diagonal resides below on the LHS of row ku + 2. This propagates up and down across sub/super-diagonals.
Ex: (m = n = 7; ku = 2, kl = 2)
1 2 3 0 0 0 0 -> 0 0 3 3 3 3 3
4 1 2 3 0 0 0 -> 0 2 2 2 2 2 2
5 4 1 2 3 0 0 -> 1 1 1 1 1 1 1
0 5 4 1 2 3 0 -> 4 4 4 4 4 4 0
0 0 5 4 1 2 3 -> 5 5 5 5 5 0 0
0 0 0 5 4 1 2 -> 0 0 0 0 0 0 0
0 0 0 0 5 4 1 -> 0 0 0 0 0 0 0
Note that empty elements that don’t correspond to data will not be referenced.
lda – [in] [int] specifies the leading dimension of each A_i. Must be >= (kl + ku + 1).
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment for the elements of each y_i.
batchCount – [in] [int] specifies the number of instances in the batch.
The gbmvBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSgbmvStridedBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const float *alpha, const float *AP, int lda, hipblasStride strideA, const float *x, int incx, hipblasStride stridex, const float *beta, float *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasDgbmvStridedBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const double *alpha, const double *AP, int lda, hipblasStride strideA, const double *x, int incx, hipblasStride stridex, const double *beta, double *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasCgbmvStridedBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const hipComplex *alpha, const hipComplex *AP, int lda, hipblasStride strideA, const hipComplex *x, int incx, hipblasStride stridex, const hipComplex *beta, hipComplex *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasZgbmvStridedBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, int kl, int ku, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, hipblasStride strideA, const hipDoubleComplex *x, int incx, hipblasStride stridex, const hipDoubleComplex *beta, hipDoubleComplex *y, int incy, hipblasStride stridey, int batchCount)#
BLAS Level 2 API
The gbmvStridedBatched functions perform one of the matrix-vector operations:
y_i := alpha*A_i*x_i + beta*y_i, or
y_i := alpha*A_i**T*x_i + beta*y_i, or
y_i := alpha*A_i**H*x_i + beta*y_i,
where (A_i, x_i, y_i) is the i-th instance of the batch, alpha and beta are scalars, x_i and y_i are vectors, and A_i is an m by n banded matrix with kl sub-diagonals and ku super-diagonals, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
trans – [in] [hipblasOperation_t] indicates whether each matrix A_i is transposed (conjugated) or not.
m – [in] [int] number of rows of matrix A.
n – [in] [int] number of columns of matrix A.
kl – [in] [int] number of sub-diagonals of A.
ku – [in] [int] number of super-diagonals of A.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in]
device pointer to first banded matrix (A_1). The leading (kl + ku + 1) by n part of the matrix contains the coefficients of the banded matrix. The leading diagonal resides in row (ku + 1) with the first super-diagonal above on the RHS of row ku. The first sub-diagonal resides below on the LHS of row ku + 2. This propagates up and down across sub/super-diagonals.
Ex: (m = n = 7; ku = 2, kl = 2)
1 2 3 0 0 0 0 -> 0 0 3 3 3 3 3
4 1 2 3 0 0 0 -> 0 2 2 2 2 2 2
5 4 1 2 3 0 0 -> 1 1 1 1 1 1 1
0 5 4 1 2 3 0 -> 4 4 4 4 4 4 0
0 0 5 4 1 2 3 -> 5 5 5 5 5 0 0
0 0 0 5 4 1 2 -> 0 0 0 0 0 0 0
0 0 0 0 5 4 1 -> 0 0 0 0 0 0 0
Note that empty elements that don’t correspond to data will not be referenced.
lda – [in] [int] specifies the leading dimension of A. Must be >= (kl + ku + 1).
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
x – [in] device pointer to first vector (x_1).
incx – [in] [int] specifies the increment for the elements of x.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1).
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device pointer to first vector (y_1).
incy – [in] [int] specifies the increment for the elements of y.
stridey – [in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_i+1).
batchCount – [in] [int] specifies the number of instances in the batch.
The gbmvStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXgemv + Batched, StridedBatched#
-
hipblasStatus_t hipblasSgemv(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, const float *alpha, const float *AP, int lda, const float *x, int incx, const float *beta, float *y, int incy)#
-
hipblasStatus_t hipblasDgemv(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, const double *alpha, const double *AP, int lda, const double *x, int incx, const double *beta, double *y, int incy)#
-
hipblasStatus_t hipblasCgemv(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, const hipComplex *alpha, const hipComplex *AP, int lda, const hipComplex *x, int incx, const hipComplex *beta, hipComplex *y, int incy)#
-
hipblasStatus_t hipblasZgemv(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, const hipDoubleComplex *x, int incx, const hipDoubleComplex *beta, hipDoubleComplex *y, int incy)#
BLAS Level 2 API
The gemv functions perform one of the matrix-vector operations:
y := alpha*A*x + beta*y, or
y := alpha*A**T*x + beta*y, or
y := alpha*A**H*x + beta*y,
where alpha and beta are scalars, x and y are vectors, and A is an m by n matrix.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
trans – [in] [hipblasOperation_t] indicates whether matrix A is transposed (conjugated) or not.
m – [in] [int] number of rows of matrix A.
n – [in] [int] number of columns of matrix A.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] device pointer storing matrix A.
lda – [in] [int] specifies the leading dimension of A.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device pointer storing vector y.
incy – [in] [int] specifies the increment for the elements of y.
The gemv functions support the 64-bit integer interface. See the ILP64 interfaces section.
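How trans, lda, and the increments interact can be seen in a host-side reference for the real single-precision case. This is a sketch for validating results, not the library's implementation. The names ref_op and ref_sgemv are hypothetical, ref_op stands in for hipblasOperation_t, only the non-transposed and transposed cases are covered, and positive increments are an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for hipblasOperation_t (real case only). */
typedef enum { REF_OP_N, REF_OP_T } ref_op;

/* Hypothetical host-side reference: y := alpha*op(A)*x + beta*y, where A is
 * an m x n column-major matrix with leading dimension lda, so A(i, j) is
 * stored at A[i + j*lda]. */
static void ref_sgemv(ref_op trans, int m, int n, float alpha,
                      const float *A, int lda, const float *x, int incx,
                      float beta, float *y, int incy)
{
    assert(A != NULL && x != NULL && y != NULL);
    assert(lda >= m && incx > 0 && incy > 0);
    /* x has n elements and y has m elements for REF_OP_N; swapped for REF_OP_T */
    const int leny = (trans == REF_OP_N) ? m : n;
    const int lenx = (trans == REF_OP_N) ? n : m;
    for (int r = 0; r < leny; ++r) {
        float acc = 0.0f;
        for (int c = 0; c < lenx; ++c) {
            const float a = (trans == REF_OP_N)
                                ? A[(size_t)r + (size_t)c * (size_t)lda]
                                : A[(size_t)c + (size_t)r * (size_t)lda];
            acc += a * x[(size_t)c * (size_t)incx];
        }
        float *yr = &y[(size_t)r * (size_t)incy];
        *yr = alpha * acc + beta * (*yr);
    }
}
```

Note that the lengths of x and y swap between the two cases, which is why the stride recommendations for the strided-batched variant depend on the operation.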
-
hipblasStatus_t hipblasSgemvBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, const float *alpha, const float *const AP[], int lda, const float *const x[], int incx, const float *beta, float *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasDgemvBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, const double *alpha, const double *const AP[], int lda, const double *const x[], int incx, const double *beta, double *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasCgemvBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, const hipComplex *alpha, const hipComplex *const AP[], int lda, const hipComplex *const x[], int incx, const hipComplex *beta, hipComplex *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasZgemvBatched(hipblasHandle_t handle, hipblasOperation_t trans, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *const AP[], int lda, const hipDoubleComplex *const x[], int incx, const hipDoubleComplex *beta, hipDoubleComplex *const y[], int incy, int batchCount)#
BLAS Level 2 API
The gemvBatched functions perform a batch of matrix-vector operations:
y_i := alpha*A_i*x_i + beta*y_i, or
y_i := alpha*A_i**T*x_i + beta*y_i, or
y_i := alpha*A_i**H*x_i + beta*y_i,
where (A_i, x_i, y_i) is the i-th instance of the batch, alpha and beta are scalars, x_i and y_i are vectors, and A_i is an m by n matrix, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
trans – [in] [hipblasOperation_t] indicates whether matrices A_i are transposed (conjugated) or not.
m – [in] [int] number of rows of each matrix A_i.
n – [in] [int] number of columns of each matrix A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] device array of device pointers storing each matrix A_i.
lda – [in] [int] specifies the leading dimension of each matrix A_i.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each vector x_i.
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment for the elements of each vector y_i.
batchCount – [in] [int] number of instances in the batch.
The gemvBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSgemvStridedBatched(hipblasHandle_t handle, hipblasOperation_t transA, int m, int n, const float *alpha, const float *AP, int lda, hipblasStride strideA, const float *x, int incx, hipblasStride stridex, const float *beta, float *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasDgemvStridedBatched(hipblasHandle_t handle, hipblasOperation_t transA, int m, int n, const double *alpha, const double *AP, int lda, hipblasStride strideA, const double *x, int incx, hipblasStride stridex, const double *beta, double *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasCgemvStridedBatched(hipblasHandle_t handle, hipblasOperation_t transA, int m, int n, const hipComplex *alpha, const hipComplex *AP, int lda, hipblasStride strideA, const hipComplex *x, int incx, hipblasStride stridex, const hipComplex *beta, hipComplex *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasZgemvStridedBatched(hipblasHandle_t handle, hipblasOperation_t transA, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, hipblasStride strideA, const hipDoubleComplex *x, int incx, hipblasStride stridex, const hipDoubleComplex *beta, hipDoubleComplex *y, int incy, hipblasStride stridey, int batchCount)#
BLAS Level 2 API
The gemvStridedBatched functions perform a batch of matrix-vector operations:
y_i := alpha*A_i*x_i + beta*y_i, or
y_i := alpha*A_i**T*x_i + beta*y_i, or
y_i := alpha*A_i**H*x_i + beta*y_i,
where (A_i, x_i, y_i) is the i-th instance of the batch, alpha and beta are scalars, x_i and y_i are vectors, and A_i is an m by n matrix, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
transA – [in] [hipblasOperation_t] indicates whether matrices A_i are transposed (conjugated) or not.
m – [in] [int] number of rows of matrices A_i.
n – [in] [int] number of columns of matrices A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] device pointer to the first matrix (A_1) in the batch.
lda – [in] [int] specifies the leading dimension of matrices A_i.
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
x – [in] device pointer to the first vector (x_1) in the batch.
incx – [in] [int] specifies the increment for the elements of vectors x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1). There are no restrictions placed on stridex. However, the user should ensure that stridex is of an appropriate size. When transA equals HIPBLAS_OP_N, this typically means stridex >= n * incx. Otherwise, stridex >= m * incx.
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device pointer to the first vector (y_1) in the batch.
incy – [in] [int] specifies the increment for the elements of vectors y_i.
stridey – [in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_i+1). There are no restrictions placed on stridey. However, the user should ensure that stridey is of an appropriate size. When transA equals HIPBLAS_OP_N, this typically means stridey >= m * incy. Otherwise, stridey >= n * incy. stridey should be nonzero.
batchCount – [in] [int] number of instances in the batch.
The gemvStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
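The typical stride bounds given for stridex and stridey can be written as two small helpers. This is a sketch only: the function names are hypothetical, int64_t stands in for hipblasStride, the transposed argument is 0 for HIPBLAS_OP_N and nonzero for the other operations, and positive increments are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: smallest typical stridex. x_i has n elements when the
 * matrix is not transposed and m elements otherwise. */
static int64_t gemv_min_stridex(int transposed, int m, int n, int incx)
{
    assert(m >= 0 && n >= 0 && incx > 0);
    return (int64_t)(transposed ? m : n) * (int64_t)incx;
}

/* Hypothetical helper: smallest typical stridey. y_i has m elements when the
 * matrix is not transposed and n elements otherwise. */
static int64_t gemv_min_stridey(int transposed, int m, int n, int incy)
{
    assert(m >= 0 && n >= 0 && incy > 0);
    return (int64_t)(transposed ? n : m) * (int64_t)incy;
}
```

Choosing strides at least this large keeps consecutive vectors in the batch from overlapping.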
hipblasXger + Batched, StridedBatched#
-
hipblasStatus_t hipblasSger(hipblasHandle_t handle, int m, int n, const float *alpha, const float *x, int incx, const float *y, int incy, float *AP, int lda)#
-
hipblasStatus_t hipblasDger(hipblasHandle_t handle, int m, int n, const double *alpha, const double *x, int incx, const double *y, int incy, double *AP, int lda)#
-
hipblasStatus_t hipblasCgeru(hipblasHandle_t handle, int m, int n, const hipComplex *alpha, const hipComplex *x, int incx, const hipComplex *y, int incy, hipComplex *AP, int lda)#
-
hipblasStatus_t hipblasCgerc(hipblasHandle_t handle, int m, int n, const hipComplex *alpha, const hipComplex *x, int incx, const hipComplex *y, int incy, hipComplex *AP, int lda)#
-
hipblasStatus_t hipblasZgeru(hipblasHandle_t handle, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *x, int incx, const hipDoubleComplex *y, int incy, hipDoubleComplex *AP, int lda)#
-
hipblasStatus_t hipblasZgerc(hipblasHandle_t handle, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *x, int incx, const hipDoubleComplex *y, int incy, hipDoubleComplex *AP, int lda)#
BLAS Level 2 API
The ger, geru, and gerc functions perform the matrix-vector operations:
A := A + alpha*x*y**T, OR
A := A + alpha*x*y**H for gerc
where alpha is a scalar, x and y are vectors, and A is an m by n matrix.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
m – [in] [int] the number of rows of the matrix A.
n – [in] [int] the number of columns of the matrix A.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
y – [in] device pointer storing vector y.
incy – [in] [int] specifies the increment for the elements of y.
AP – [inout] device pointer storing matrix A.
lda – [in] [int] specifies the leading dimension of A.
The ger functions support the 64-bit integer interface. See the ILP64 interfaces section.
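As a rough illustration of the update described above, the following CPU reference computes the same rank-1 update on host memory. It is a sketch rather than the library implementation: ger_ref and conj_if_complex are hypothetical names, the matrix is assumed to be stored column-major with leading dimension lda, and the increments are assumed to be positive.

```cpp
#include <cassert>
#include <complex>

// Conjugation helper: identity for real types, std::conj for complex types.
inline float  conj_if_complex(float v)  { return v; }
inline double conj_if_complex(double v) { return v; }
template <typename T>
std::complex<T> conj_if_complex(const std::complex<T>& v) { return std::conj(v); }

// CPU reference for A := A + alpha * x * y^T (conjugate == false, ger/geru)
// or A := A + alpha * x * y^H (conjugate == true, gerc).
// A is m-by-n, column-major, leading dimension lda.
// Assumption: incx > 0 and incy > 0.
template <typename T>
void ger_ref(bool conjugate, int m, int n, T alpha,
             const T* x, int incx, const T* y, int incy, T* A, int lda)
{
    assert(m >= 0 && n >= 0 && incx > 0 && incy > 0 && lda >= (m > 1 ? m : 1));
    for (int j = 0; j < n; ++j) {
        const T yj = conjugate ? conj_if_complex(y[j * incy]) : y[j * incy];
        for (int i = 0; i < m; ++i)
            A[i + static_cast<long long>(j) * lda] += alpha * x[i * incx] * yj;
    }
}
```

A host reference of this kind can be used to check the result copied back from the device after calling hipblasSger or one of its variants.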
-
hipblasStatus_t hipblasSgerBatched(hipblasHandle_t handle, int m, int n, const float *alpha, const float *const x[], int incx, const float *const y[], int incy, float *const AP[], int lda, int batchCount)#
-
hipblasStatus_t hipblasDgerBatched(hipblasHandle_t handle, int m, int n, const double *alpha, const double *const x[], int incx, const double *const y[], int incy, double *const AP[], int lda, int batchCount)#
-
hipblasStatus_t hipblasCgeruBatched(hipblasHandle_t handle, int m, int n, const hipComplex *alpha, const hipComplex *const x[], int incx, const hipComplex *const y[], int incy, hipComplex *const AP[], int lda, int batchCount)#
-
hipblasStatus_t hipblasCgercBatched(hipblasHandle_t handle, int m, int n, const hipComplex *alpha, const hipComplex *const x[], int incx, const hipComplex *const y[], int incy, hipComplex *const AP[], int lda, int batchCount)#
-
hipblasStatus_t hipblasZgeruBatched(hipblasHandle_t handle, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *const x[], int incx, const hipDoubleComplex *const y[], int incy, hipDoubleComplex *const AP[], int lda, int batchCount)#
-
hipblasStatus_t hipblasZgercBatched(hipblasHandle_t handle, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *const x[], int incx, const hipDoubleComplex *const y[], int incy, hipDoubleComplex *const AP[], int lda, int batchCount)#
BLAS Level 2 API
The gerBatched, geruBatched, and gercBatched functions perform a batch of the matrix-vector operations:
A_i := A_i + alpha*x_i*y_i**T, OR
A_i := A_i + alpha*x_i*y_i**H for gerc
where (A_i, x_i, y_i) is the i-th instance of the batch, alpha is a scalar, x_i and y_i are vectors, and A_i is an m by n matrix, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
m – [in] [int] the number of rows of each matrix A_i.
n – [in] [int] the number of columns of each matrix A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each vector x_i.
y – [in] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment for the elements of each vector y_i.
AP – [inout] device array of device pointers storing each matrix A_i.
lda – [in] [int] specifies the leading dimension of each A_i.
batchCount – [in] [int] number of instances in the batch.
The gerBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
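The batched variants take arrays of per-instance pointers. When all instances live in one contiguous allocation, such an array can be built as in the sketch below. make_batch_pointers is a hypothetical helper, not a hipBLAS function; in a real program base would be a device pointer, and the resulting host vector would still have to be copied to device memory before the call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds the array of per-instance pointers expected by a *Batched call from a
// single contiguous allocation in which instance i starts at base + i * stride.
// Only pointer arithmetic is performed; no memory is read or written.
template <typename T>
std::vector<T*> make_batch_pointers(T* base, std::int64_t stride, int batchCount)
{
    assert(batchCount >= 0);
    std::vector<T*> ptrs(static_cast<std::size_t>(batchCount));
    for (int i = 0; i < batchCount; ++i)
        ptrs[static_cast<std::size_t>(i)] = base + i * stride;
    return ptrs;
}
```

With pointers laid out this way, a batched call addresses the same memory as the corresponding strided-batched call with the same stride.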
-
hipblasStatus_t hipblasSgerStridedBatched(hipblasHandle_t handle, int m, int n, const float *alpha, const float *x, int incx, hipblasStride stridex, const float *y, int incy, hipblasStride stridey, float *AP, int lda, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasDgerStridedBatched(hipblasHandle_t handle, int m, int n, const double *alpha, const double *x, int incx, hipblasStride stridex, const double *y, int incy, hipblasStride stridey, double *AP, int lda, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasCgeruStridedBatched(hipblasHandle_t handle, int m, int n, const hipComplex *alpha, const hipComplex *x, int incx, hipblasStride stridex, const hipComplex *y, int incy, hipblasStride stridey, hipComplex *AP, int lda, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasCgercStridedBatched(hipblasHandle_t handle, int m, int n, const hipComplex *alpha, const hipComplex *x, int incx, hipblasStride stridex, const hipComplex *y, int incy, hipblasStride stridey, hipComplex *AP, int lda, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasZgeruStridedBatched(hipblasHandle_t handle, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *x, int incx, hipblasStride stridex, const hipDoubleComplex *y, int incy, hipblasStride stridey, hipDoubleComplex *AP, int lda, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasZgercStridedBatched(hipblasHandle_t handle, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *x, int incx, hipblasStride stridex, const hipDoubleComplex *y, int incy, hipblasStride stridey, hipDoubleComplex *AP, int lda, hipblasStride strideA, int batchCount)#
BLAS Level 2 API
The gerStridedBatched, geruStridedBatched, and gercStridedBatched functions perform the matrix-vector operations:
A_i := A_i + alpha*x_i*y_i**T, OR
A_i := A_i + alpha*x_i*y_i**H for gerc
where (A_i, x_i, y_i) is the i-th instance of the batch, alpha is a scalar, x_i and y_i are vectors, and A_i is an m by n matrix, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
m – [in] [int] the number of rows of each matrix A_i.
n – [in] [int] the number of columns of each matrix A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer to the first vector (x_1) in the batch.
incx – [in] [int] specifies the increments for the elements of each vector x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1). There are no restrictions placed on stridex. However, the user should ensure that stridex is of an appropriate size. For a typical case, this means stridex >= m * incx.
y – [in] device pointer to the first vector (y_1) in the batch.
incy – [in] [int] specifies the increment for the elements of each vector y_i.
stridey – [in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_i+1). There are no restrictions placed on stridey. However, the user should ensure that stridey is of an appropriate size. For a typical case, this means stridey >= n * incy.
AP – [inout] device pointer to the first matrix (A_1) in the batch.
lda – [in] [int] specifies the leading dimension of each A_i.
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
batchCount – [in] [int] number of instances in the batch.
The gerStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
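The way the stride parameters select each batch instance can be sketched with a CPU reference for the unconjugated update. ger_strided_batched_ref is a hypothetical name, the batch index is zero-based here, and column-major storage with positive increments is assumed.

```cpp
#include <cassert>
#include <cstdint>

// CPU reference for the unconjugated strided-batched update
// A_b := A_b + alpha * x_b * y_b^T for b = 0 .. batchCount - 1.
// Instance b starts at x + b*stridex, y + b*stridey, and A + b*strideA.
// Assumption: incx > 0, incy > 0, column-major A_b with leading dimension lda.
template <typename T>
void ger_strided_batched_ref(int m, int n, T alpha,
                             const T* x, int incx, std::int64_t stridex,
                             const T* y, int incy, std::int64_t stridey,
                             T* A, int lda, std::int64_t strideA, int batchCount)
{
    assert(m >= 0 && n >= 0 && incx > 0 && incy > 0 && lda >= (m > 1 ? m : 1));
    for (int b = 0; b < batchCount; ++b) {
        const T* xb = x + b * stridex;
        const T* yb = y + b * stridey;
        T*       Ab = A + b * strideA;
        for (int j = 0; j < n; ++j)
            for (int i = 0; i < m; ++i)
                Ab[i + static_cast<std::int64_t>(j) * lda] += alpha * xb[i * incx] * yb[j * incy];
    }
}
```

The typical bounds stridex >= m * incx and stridey >= n * incy keep the vectors of different instances from overlapping; for the matrices, a stride of at least lda * n would do the same.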
hipblasXhbmv + Batched, StridedBatched#
-
hipblasStatus_t hipblasChbmv(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, int k, const hipComplex *alpha, const hipComplex *AP, int lda, const hipComplex *x, int incx, const hipComplex *beta, hipComplex *y, int incy)#
-
hipblasStatus_t hipblasZhbmv(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, const hipDoubleComplex *x, int incx, const hipDoubleComplex *beta, hipDoubleComplex *y, int incy)#
BLAS Level 2 API
The hbmv functions perform the matrix-vector operations:
y := alpha*A*x + beta*y
where alpha and beta are scalars, x and y are n-element vectors, and A is an n by n Hermitian band matrix with k super-diagonals.
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: c and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of A is being supplied.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of A is being supplied.
n – [in] [int] the order of the matrix A.
k – [in] [int] the number of super-diagonals of the matrix A. Must be >= 0.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] device pointer storing matrix A. Of dimension (lda, n).
if uplo == HIPBLAS_FILL_MODE_UPPER: The leading (k + 1) by n part of A must contain the upper triangular band part of the Hermitian matrix, with the leading diagonal in row (k + 1), the first super-diagonal on the RHS of row k, and so forth. The top left k by k triangle of A will not be referenced.
Ex (upper, lda = n = 4, k = 1):
A -> Represented matrix
(0,0) (5,9) (6,8) (7,7) -> (1, 0) (5, 9) (0, 0) (0, 0)
(1,0) (2,0) (3,0) (4,0) -> (5,-9) (2, 0) (6, 8) (0, 0)
(0,0) (0,0) (0,0) (0,0) -> (0, 0) (6,-8) (3, 0) (7, 7)
(0,0) (0,0) (0,0) (0,0) -> (0, 0) (0, 0) (7,-7) (4, 0)
if uplo == HIPBLAS_FILL_MODE_LOWER: The leading (k + 1) by n part of A must contain the lower triangular band part of the Hermitian matrix, with the leading diagonal in row (1), the first sub-diagonal on the LHS of row 2, and so forth. The bottom right k by k triangle of A will not be referenced.
Ex (lower, lda = 2, n = 4, k = 1):
A -> Represented matrix
(1,0) (2,0) (3,0) (4,0) -> (1, 0) (5,-9) (0, 0) (0, 0)
(5,9) (6,8) (7,7) (0,0) -> (5, 9) (2, 0) (6,-8) (0, 0)
                        -> (0, 0) (6, 8) (3, 0) (7,-7)
                        -> (0, 0) (0, 0) (7, 7) (4, 0)
As a Hermitian matrix, the imaginary part of the main diagonal of A will not be referenced and is assumed to be == 0.
lda – [in] [int] specifies the leading dimension of A. Must be >= k + 1.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device pointer storing vector y.
incy – [in] [int] specifies the increment for the elements of y.
The hbmv functions support the 64-bit integer interface. See the ILP64 interfaces section.
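The band layout described for AP can be illustrated with a host-side packing routine. The sketch below is not part of hipBLAS: pack_hermitian_band is a hypothetical helper that copies the relevant band of a full column-major n by n matrix into band storage, using zero-based indices. Element (i, j) of the upper band goes to row k + i - j of column j, and element (i, j) of the lower band goes to row i - j of column j.

```cpp
#include <algorithm>
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cf = std::complex<float>;

// Packs the upper (upper == true) or lower band of a full column-major
// n-by-n Hermitian matrix `full` (leading dimension n) into band storage
// with leading dimension lda >= k + 1. Unused entries are left as zero.
std::vector<cf> pack_hermitian_band(const std::vector<cf>& full,
                                    int n, int k, int lda, bool upper)
{
    assert(n >= 0 && k >= 0 && lda >= k + 1);
    assert(full.size() >= static_cast<std::size_t>(n) * n);
    std::vector<cf> band(static_cast<std::size_t>(lda) * n, cf(0.f, 0.f));
    for (int j = 0; j < n; ++j) {
        if (upper) {
            // Diagonal lands in band row k, super-diagonal d in band row k - d.
            for (int i = std::max(0, j - k); i <= j; ++i)
                band[(k + i - j) + j * lda] = full[i + j * n];
        } else {
            // Diagonal lands in band row 0, sub-diagonal d in band row d.
            for (int i = j; i <= std::min(n - 1, j + k); ++i)
                band[(i - j) + j * lda] = full[i + j * n];
        }
    }
    return band;
}
```

Applied to the represented matrix of the upper example with n = 4, k = 1, and lda = 4, this yields the two leading rows shown for A there, with the unused top-left entry set to zero.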
-
hipblasStatus_t hipblasChbmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, int k, const hipComplex *alpha, const hipComplex *const AP[], int lda, const hipComplex *const x[], int incx, const hipComplex *beta, hipComplex *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasZhbmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *const AP[], int lda, const hipDoubleComplex *const x[], int incx, const hipDoubleComplex *beta, hipDoubleComplex *const y[], int incy, int batchCount)#
BLAS Level 2 API
The hbmvBatched functions perform the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i
where alpha and beta are scalars, x_i and y_i are n-element vectors, and A_i is an n by n Hermitian band matrix with k super-diagonals, for each batch in i = [1, batchCount].
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of each A_i is being supplied.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of each A_i is being supplied.
n – [in] [int] the order of each matrix A_i.
k – [in] [int] the number of super-diagonals of each matrix A_i. Must be >= 0.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] device array of device pointers storing each matrix A_i of dimension (lda, n).
if uplo == HIPBLAS_FILL_MODE_UPPER: The leading (k + 1) by n part of each A_i must contain the upper triangular band part of the Hermitian matrix, with the leading diagonal in row (k + 1), the first super-diagonal on the RHS of row k, and so forth. The top left k by k triangle of each A_i will not be referenced.
Ex (upper, lda = n = 4, k = 1):
A -> Represented matrix
(0,0) (5,9) (6,8) (7,7) -> (1, 0) (5, 9) (0, 0) (0, 0)
(1,0) (2,0) (3,0) (4,0) -> (5,-9) (2, 0) (6, 8) (0, 0)
(0,0) (0,0) (0,0) (0,0) -> (0, 0) (6,-8) (3, 0) (7, 7)
(0,0) (0,0) (0,0) (0,0) -> (0, 0) (0, 0) (7,-7) (4, 0)
if uplo == HIPBLAS_FILL_MODE_LOWER: The leading (k + 1) by n part of each A_i must contain the lower triangular band part of the Hermitian matrix, with the leading diagonal in row (1), the first sub-diagonal on the LHS of row 2, and so forth. The bottom right k by k triangle of each A_i will not be referenced.
Ex (lower, lda = 2, n = 4, k = 1):
A -> Represented matrix
(1,0) (2,0) (3,0) (4,0) -> (1, 0) (5,-9) (0, 0) (0, 0)
(5,9) (6,8) (7,7) (0,0) -> (5, 9) (2, 0) (6,-8) (0, 0)
                        -> (0, 0) (6, 8) (3, 0) (7,-7)
                        -> (0, 0) (0, 0) (7, 7) (4, 0)
As a Hermitian matrix, the imaginary part of the main diagonal of each A_i will not be referenced and is assumed to be == 0.
lda – [in] [int] specifies the leading dimension of each A_i. Must be >= k + 1.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment for the elements of each y_i.
batchCount – [in] [int] number of instances in the batch.
The hbmvBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasChbmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, int k, const hipComplex *alpha, const hipComplex *AP, int lda, hipblasStride strideA, const hipComplex *x, int incx, hipblasStride stridex, const hipComplex *beta, hipComplex *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasZhbmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, hipblasStride strideA, const hipDoubleComplex *x, int incx, hipblasStride stridex, const hipDoubleComplex *beta, hipDoubleComplex *y, int incy, hipblasStride stridey, int batchCount)#
BLAS Level 2 API
The hbmvStridedBatched functions perform the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i
where alpha and beta are scalars, x_i and y_i are n-element vectors, and A_i is an n by n Hermitian band matrix with k super-diagonals, for each batch in i = [1, batchCount].
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of each A_i is being supplied.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of each A_i is being supplied.
n – [in] [int] the order of each matrix A_i.
k – [in] [int] the number of super-diagonals of each matrix A_i. Must be >= 0.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] device pointer to the first matrix (A_1) in the batch. Each A_i is of dimension (lda, n).
if uplo == HIPBLAS_FILL_MODE_UPPER: The leading (k + 1) by n part of each A_i must contain the upper triangular band part of the Hermitian matrix, with the leading diagonal in row (k + 1), the first super-diagonal on the RHS of row k, and so forth. The top left k by k triangle of each A_i will not be referenced.
Ex (upper, lda = n = 4, k = 1):
A -> Represented matrix
(0,0) (5,9) (6,8) (7,7) -> (1, 0) (5, 9) (0, 0) (0, 0)
(1,0) (2,0) (3,0) (4,0) -> (5,-9) (2, 0) (6, 8) (0, 0)
(0,0) (0,0) (0,0) (0,0) -> (0, 0) (6,-8) (3, 0) (7, 7)
(0,0) (0,0) (0,0) (0,0) -> (0, 0) (0, 0) (7,-7) (4, 0)
if uplo == HIPBLAS_FILL_MODE_LOWER: The leading (k + 1) by n part of each A_i must contain the lower triangular band part of the Hermitian matrix, with the leading diagonal in row (1), the first sub-diagonal on the LHS of row 2, and so forth. The bottom right k by k triangle of each A_i will not be referenced.
Ex (lower, lda = 2, n = 4, k = 1):
A -> Represented matrix
(1,0) (2,0) (3,0) (4,0) -> (1, 0) (5,-9) (0, 0) (0, 0)
(5,9) (6,8) (7,7) (0,0) -> (5, 9) (2, 0) (6,-8) (0, 0)
                        -> (0, 0) (6, 8) (3, 0) (7,-7)
                        -> (0, 0) (0, 0) (7, 7) (4, 0)
As a Hermitian matrix, the imaginary part of the main diagonal of each A_i will not be referenced and is assumed to be == 0.
lda – [in] [int] specifies the leading dimension of each A_i. Must be >= k + 1.
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
x – [in] device pointer to the first vector (x_1) in the batch.
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1).
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device pointer to the first vector (y_1) in the batch.
incy – [in] [int] specifies the increment for the elements of each y_i.
stridey – [in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_i+1).
batchCount – [in] [int] number of instances in the batch.
The hbmvStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXhemv + Batched, StridedBatched#
-
hipblasStatus_t hipblasChemv(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *AP, int lda, const hipComplex *x, int incx, const hipComplex *beta, hipComplex *y, int incy)#
-
hipblasStatus_t hipblasZhemv(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, const hipDoubleComplex *x, int incx, const hipDoubleComplex *beta, hipDoubleComplex *y, int incy)#
BLAS Level 2 API
The hemv functions perform the matrix-vector operation:
y := alpha*A*x + beta*y
where alpha and beta are scalars, x and y are n-element vectors, and A is an n by n Hermitian matrix.
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: c and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: the upper triangular part of the Hermitian matrix A is supplied.
HIPBLAS_FILL_MODE_LOWER: the lower triangular part of the Hermitian matrix A is supplied.
n – [in] [int] the order of the matrix A.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] device pointer storing matrix A. Of dimension (lda, n).
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular part of A must contain the upper triangular part of a Hermitian matrix. The lower triangular part of A will not be referenced.
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular part of A must contain the lower triangular part of a Hermitian matrix. The upper triangular part of A will not be referenced.
As a Hermitian matrix, the imaginary part of the main diagonal of A will not be referenced and is assumed to be == 0.
lda – [in] [int] specifies the leading dimension of A. Must be >= max(1, n).
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device pointer storing vector y.
incy – [in] [int] specifies the increment for the elements of y.
The hemv functions support the 64-bit integer interface. See the ILP64 interfaces section.
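The statement that only one triangle of A is referenced can be made concrete with a CPU reference. The following is a sketch under stated assumptions, not the library implementation: hemv_ref is a hypothetical name, A is assumed to be column-major with leading dimension lda, and the increments are assumed to be positive.

```cpp
#include <cassert>
#include <complex>

// CPU reference for y := alpha*A*x + beta*y with A Hermitian, column-major.
// Only the triangle selected by `upper` is read; elements of the other
// triangle are reconstructed by conjugation, and the imaginary part of the
// diagonal is ignored. Assumption: incx > 0 and incy > 0.
template <typename T>
void hemv_ref(bool upper, int n, std::complex<T> alpha,
              const std::complex<T>* A, int lda,
              const std::complex<T>* x, int incx,
              std::complex<T> beta, std::complex<T>* y, int incy)
{
    assert(n >= 0 && incx > 0 && incy > 0 && lda >= (n > 1 ? n : 1));
    for (int i = 0; i < n; ++i) {
        std::complex<T> sum(0, 0);
        for (int j = 0; j < n; ++j) {
            std::complex<T> aij;
            if (i == j)
                aij = std::complex<T>(A[i + i * lda].real(), 0);
            else if ((i < j) == upper)
                aij = A[i + j * lda];            // element of the stored triangle
            else
                aij = std::conj(A[j + i * lda]); // mirrored from the stored triangle
            sum += aij * x[j * incx];
        }
        y[i * incy] = alpha * sum + beta * y[i * incy];
    }
}
```

Because the unreferenced triangle is never read, it may hold arbitrary values, which is what the usage in the test relies on.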
-
hipblasStatus_t hipblasChemvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *const AP[], int lda, const hipComplex *const x[], int incx, const hipComplex *beta, hipComplex *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasZhemvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *const AP[], int lda, const hipDoubleComplex *const x[], int incx, const hipDoubleComplex *beta, hipDoubleComplex *const y[], int incy, int batchCount)#
BLAS Level 2 API
The hemvBatched functions perform the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i
where alpha and beta are scalars, x_i and y_i are n-element vectors, and A_i is an n by n Hermitian matrix, for each batch in i = [1, batchCount].
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: the upper triangular part of each Hermitian matrix A_i is supplied.
HIPBLAS_FILL_MODE_LOWER: the lower triangular part of each Hermitian matrix A_i is supplied.
n – [in] [int] the order of each matrix A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] device array of device pointers storing each matrix A_i of dimension (lda, n).
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular part of each A_i must contain the upper triangular part of a Hermitian matrix. The lower triangular part of each A_i will not be referenced.
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular part of each A_i must contain the lower triangular part of a Hermitian matrix. The upper triangular part of each A_i will not be referenced.
As a Hermitian matrix, the imaginary part of the main diagonal of each A_i will not be referenced and is assumed to be == 0.
lda – [in] [int] specifies the leading dimension of each A_i. Must be >= max(1, n).
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment for the elements of each y_i.
batchCount – [in] [int] number of instances in the batch.
The hemvBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasChemvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *AP, int lda, hipblasStride strideA, const hipComplex *x, int incx, hipblasStride stridex, const hipComplex *beta, hipComplex *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasZhemvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, hipblasStride strideA, const hipDoubleComplex *x, int incx, hipblasStride stridex, const hipDoubleComplex *beta, hipDoubleComplex *y, int incy, hipblasStride stridey, int batchCount)#
BLAS Level 2 API
The hemvStridedBatched functions perform the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i
where alpha and beta are scalars, x_i and y_i are n-element vectors, and A_i is an n by n Hermitian matrix, for each batch in i = [1, batchCount].
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: the upper triangular part of each Hermitian matrix A_i is supplied.
HIPBLAS_FILL_MODE_LOWER: the lower triangular part of each Hermitian matrix A_i is supplied.
n – [in] [int] the order of each matrix A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] device pointer to the first matrix (A_1) in the batch. Each A_i is of dimension (lda, n).
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular part of each A_i must contain the upper triangular part of a Hermitian matrix. The lower triangular part of each A_i will not be referenced.
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular part of each A_i must contain the lower triangular part of a Hermitian matrix. The upper triangular part of each A_i will not be referenced.
As a Hermitian matrix, the imaginary part of the main diagonal of each A_i will not be referenced and is assumed to be == 0.
lda – [in] [int] specifies the leading dimension of each A_i. Must be >= max(1, n).
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
x – [in] device pointer to the first vector (x_1) in the batch.
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1).
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device pointer to the first vector (y_1) in the batch.
incy – [in] [int] specifies the increment for the elements of each y_i.
stridey – [in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_i+1).
batchCount – [in] [int] number of instances in the batch.
The hemvStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXher + Batched, StridedBatched#
-
hipblasStatus_t hipblasCher(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const hipComplex *x, int incx, hipComplex *AP, int lda)#
-
hipblasStatus_t hipblasZher(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const hipDoubleComplex *x, int incx, hipDoubleComplex *AP, int lda)#
BLAS Level 2 API
The her functions perform the matrix-vector operations:
A := A + alpha*x*x**H
where alpha is a real scalar, x is a vector, and A is an n by n Hermitian matrix.
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: c and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of A is supplied in A.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of A is supplied in A.
n – [in] [int] the number of rows and columns of matrix A. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
AP – [inout] device pointer storing the specified triangular portion of the Hermitian matrix A. Of size (lda * n).
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of the Hermitian matrix A is supplied. The lower triangular portion will not be touched.
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of the Hermitian matrix A is supplied. The upper triangular portion will not be touched.
Note that the imaginary parts of the diagonal elements are not accessed and are assumed to be 0.
lda – [in] [int] specifies the leading dimension of A. Must be at least max(1, n).
The her functions support the 64-bit integer interface. See the ILP64 interfaces section.
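A CPU reference for the Hermitian rank-1 update shows which entries of A are written. This is a sketch, not the library implementation: her_ref is a hypothetical name, column-major storage and a positive incx are assumed, and writing the diagonal back with a zero imaginary part follows the reference BLAS convention, which is an assumption as far as the GPU backends are concerned.

```cpp
#include <cassert>
#include <complex>

// CPU reference for A := A + alpha * x * x^H with real alpha.
// Only the triangle selected by `upper` is updated; the opposite triangle is
// left untouched. Diagonal entries are written back with a zero imaginary
// part (reference BLAS convention, assumed here). Assumption: incx > 0.
template <typename T>
void her_ref(bool upper, int n, T alpha,
             const std::complex<T>* x, int incx, std::complex<T>* A, int lda)
{
    assert(n >= 0 && incx > 0 && lda >= (n > 1 ? n : 1));
    for (int j = 0; j < n; ++j) {
        const std::complex<T> t = alpha * std::conj(x[j * incx]);
        const int first = upper ? 0 : j;     // first row updated in column j
        const int last  = upper ? j : n - 1; // last row updated in column j
        for (int i = first; i <= last; ++i) {
            std::complex<T>& a = A[i + j * lda];
            if (i == j)
                a = std::complex<T>(a.real() + (x[i * incx] * t).real(), 0);
            else
                a += x[i * incx] * t;
        }
    }
}
```

Since alpha is real and x*x**H is Hermitian, updating a single triangle is enough to keep the represented matrix Hermitian.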
-
hipblasStatus_t hipblasCherBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const hipComplex *const x[], int incx, hipComplex *const AP[], int lda, int batchCount)#
-
hipblasStatus_t hipblasZherBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const hipDoubleComplex *const x[], int incx, hipDoubleComplex *const AP[], int lda, int batchCount)#
BLAS Level 2 API
The herBatched functions perform the matrix-vector operations:
A_i := A_i + alpha*x_i*x_i**H
where alpha is a real scalar, x_i is a vector, and A_i is an n by n Hermitian matrix, for i = 1, …, batchCount.
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of each A_i is supplied in A.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of each A_i is supplied in A.
n – [in] [int] the number of rows and columns of each matrix A_i. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
AP – [inout] device array of device pointers storing the specified triangular portion of each Hermitian matrix A_i. Each A_i is of size (lda * n). Array is of at least size batchCount.
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of each Hermitian matrix A_i is supplied. The lower triangular portion of each A_i will not be touched.
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of each Hermitian matrix A_i is supplied. The upper triangular portion of each A_i will not be touched.
Note that the imaginary parts of the diagonal elements are not accessed and are assumed to be 0.
lda – [in] [int] specifies the leading dimension of each A_i. Must be at least max(1, n).
batchCount – [in] [int] number of instances in the batch.
The herBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasCherStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const hipComplex *x, int incx, hipblasStride stridex, hipComplex *AP, int lda, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasZherStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const hipDoubleComplex *x, int incx, hipblasStride stridex, hipDoubleComplex *AP, int lda, hipblasStride strideA, int batchCount)#
BLAS Level 2 API
The herStridedBatched functions perform the matrix-vector operations:
A_i := A_i + alpha*x_i*x_i**H
where alpha is a real scalar, x_i is a vector, and A_i is an n by n Hermitian matrix, for i = 1, …, batchCount.
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of each A_i is supplied in A.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of each A_i is supplied in A.
n – [in] [int] the number of rows and columns of each matrix A_i. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer pointing to the first vector (x_1).
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1).
AP – [inout] device pointer to the first matrix (A_1) in the batch, storing the specified triangular portion of each Hermitian matrix A_i.
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of each Hermitian matrix A_i is supplied. The lower triangular portion of each A_i will not be touched.
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of each Hermitian matrix A_i is supplied. The upper triangular portion of each A_i will not be touched.
Note that the imaginary parts of the diagonal elements are not accessed and are assumed to be 0.
lda – [in] [int] specifies the leading dimension of each A_i.
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
batchCount – [in] [int] number of instances in the batch.
The herStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
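The indexing implied by lda, strideA, incx, and stridex can be sketched on the host. The following is a plain C++ reference, not a hipBLAS call; the name her_strided_batched_ref is hypothetical, column-major storage is assumed, and increments are restricted to positive values for brevity.

```cpp
#include <cassert>
#include <complex>
#include <cstdint>
#include <vector>

// Hypothetical host-side reference for A_i := A_i + alpha * x_i * x_i^H.
// Only the triangle selected by `upper` is read or written.
void her_strided_batched_ref(bool upper, int n, double alpha,
                             const std::complex<double>* x, int incx, std::int64_t stridex,
                             std::complex<double>* A, int lda, std::int64_t strideA,
                             int batchCount)
{
    // Assumption of this sketch: positive increments only.
    assert(n >= 0 && incx > 0 && lda >= (n > 1 ? n : 1) && batchCount >= 0);
    for (int b = 0; b < batchCount; ++b) {
        const std::complex<double>* xb = x + b * stridex; // start of x_b
        std::complex<double>* Ab = A + b * strideA;       // start of A_b
        for (int col = 0; col < n; ++col) {
            int rowBegin = upper ? 0 : col;
            int rowEnd = upper ? col : n - 1;
            for (int row = rowBegin; row <= rowEnd; ++row) {
                std::complex<double> update
                    = alpha * xb[row * incx] * std::conj(xb[col * incx]);
                std::complex<double>& a = Ab[row + static_cast<std::int64_t>(col) * lda];
                if (row == col)
                    a = {a.real() + update.real(), 0.0}; // diagonal stays real
                else
                    a += update;
            }
        }
    }
}
```

A reference like this can be useful for checking device results on small inputs after copying them back to the host.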
hipblasXher2 + Batched, StridedBatched#
-
hipblasStatus_t hipblasCher2(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *x, int incx, const hipComplex *y, int incy, hipComplex *AP, int lda)#
-
hipblasStatus_t hipblasZher2(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *x, int incx, const hipDoubleComplex *y, int incy, hipDoubleComplex *AP, int lda)#
BLAS Level 2 API
The her2 functions perform the matrix-vector operations:
A := A + alpha*x*y**H + conj(alpha)*y*x**H
where alpha is a complex scalar, x and y are vectors, and A is an n by n Hermitian matrix.
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: c and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of A is supplied.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of A is supplied.
n – [in] [int] the number of rows and columns of matrix A. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
y – [in] device pointer storing vector y.
incy – [in] [int] specifies the increment for the elements of y.
AP – [inout] device pointer storing the specified triangular portion of the Hermitian matrix A. Of size (lda, n).
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of the Hermitian matrix A is supplied. The lower triangular portion of A will not be touched.
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of the Hermitian matrix A is supplied. The upper triangular portion of A will not be touched.
Note that the imaginary parts of the diagonal elements are not accessed and are assumed to be 0.
lda – [in] [int] specifies the leading dimension of A. Must be at least max(1, n).
The her2 functions support the 64-bit integer interface. See the ILP64 interfaces section.
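The rank-2 update can be written out element by element: entry (row, col) of x*y**H is x[row]*conj(y[col]), and entry (row, col) of y*x**H is y[row]*conj(x[col]). The sketch below is a host-side C++ reference under the assumptions of column-major storage and positive increments; her2_ref is a hypothetical name and is not part of hipBLAS.

```cpp
#include <cassert>
#include <complex>
#include <vector>

using cd = std::complex<double>;

// Hypothetical host-side reference for
// A := A + alpha*x*y^H + conj(alpha)*y*x^H on one triangle of A.
void her2_ref(bool upper, int n, cd alpha,
              const cd* x, int incx, const cd* y, int incy,
              cd* A, int lda)
{
    // Assumption of this sketch: positive increments only.
    assert(n >= 0 && incx > 0 && incy > 0 && lda >= (n > 1 ? n : 1));
    for (int col = 0; col < n; ++col) {
        int rowBegin = upper ? 0 : col;
        int rowEnd = upper ? col : n - 1;
        for (int row = rowBegin; row <= rowEnd; ++row) {
            cd update = alpha * x[row * incx] * std::conj(y[col * incy])
                      + std::conj(alpha) * y[row * incy] * std::conj(x[col * incx]);
            cd& a = A[row + col * lda];
            // The two terms are complex conjugates of each other on the
            // diagonal, so the diagonal update is real.
            a = (row == col) ? cd(a.real() + update.real(), 0.0) : a + update;
        }
    }
}
```

The other triangle is never touched, which matches the description of the AP parameter above.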
-
hipblasStatus_t hipblasCher2Batched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *const x[], int incx, const hipComplex *const y[], int incy, hipComplex *const AP[], int lda, int batchCount)#
-
hipblasStatus_t hipblasZher2Batched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *const x[], int incx, const hipDoubleComplex *const y[], int incy, hipDoubleComplex *const AP[], int lda, int batchCount)#
BLAS Level 2 API
The her2Batched functions perform the matrix-vector operations:
A_i := A_i + alpha*x_i*y_i**H + conj(alpha)*y_i*x_i**H
where alpha is a complex scalar, x_i and y_i are vectors, and A_i is an n by n Hermitian matrix for each batch in i = [1, batchCount].
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of each A_i is supplied.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of each A_i is supplied.
n – [in] [int] the number of rows and columns of each matrix A_i. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
y – [in] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment for the elements of each y_i.
AP – [inout] device array of device pointers storing the specified triangular portion of each Hermitian matrix A_i of size (lda, n).
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of each Hermitian matrix A_i is supplied. The lower triangular portion of each A_i will not be touched.
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of each Hermitian matrix A_i is supplied. The upper triangular portion of each A_i will not be touched.
Note that the imaginary parts of the diagonal elements are not accessed and are assumed to be 0.
lda – [in] [int] specifies the leading dimension of each A_i. Must be at least max(1, n).
batchCount – [in] [int] number of instances in the batch.
The her2Batched functions support the 64-bit integer interface. See the ILP64 interfaces section.
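The batched variants take arrays of pointers, while the strided-batched variants take one base pointer plus a stride. When the instances already sit in one contiguous buffer, the pointer array can be derived from the base pointer. The helper below is a hypothetical host-side sketch of that arithmetic only; in a real program base would be a device pointer, and the resulting array would still have to be copied into device memory, since the parameters are described as device arrays of device pointers.

```cpp
#include <cassert>
#include <complex>
#include <cstdint>
#include <vector>

// Hypothetical helper: builds {base, base + stride, base + 2*stride, ...}.
// Only pointer arithmetic is performed; nothing is dereferenced.
template <typename T>
std::vector<T*> make_batch_pointers(T* base, std::int64_t stride, int batchCount)
{
    assert(batchCount >= 0);
    std::vector<T*> ptrs(batchCount);
    for (int i = 0; i < batchCount; ++i)
        ptrs[i] = base + i * stride; // start of instance i
    return ptrs;
}
```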
-
hipblasStatus_t hipblasCher2StridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *x, int incx, hipblasStride stridex, const hipComplex *y, int incy, hipblasStride stridey, hipComplex *AP, int lda, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasZher2StridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *x, int incx, hipblasStride stridex, const hipDoubleComplex *y, int incy, hipblasStride stridey, hipDoubleComplex *AP, int lda, hipblasStride strideA, int batchCount)#
BLAS Level 2 API
The her2StridedBatched functions perform the matrix-vector operations:
A_i := A_i + alpha*x_i*y_i**H + conj(alpha)*y_i*x_i**H
where alpha is a complex scalar, x_i and y_i are vectors, and A_i is an n by n Hermitian matrix for each batch in i = [1, batchCount].
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of each A_i is supplied.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of each A_i is supplied.
n – [in] [int] the number of rows and columns of each matrix A_i. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer pointing to the first vector x_1.
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] specifies the stride between the beginning of one vector (x_i) and the next (x_i+1).
y – [in] device pointer pointing to the first vector y_1.
incy – [in] [int] specifies the increment for the elements of each y_i.
stridey – [in] [hipblasStride] specifies the stride between the beginning of one vector (y_i) and the next (y_i+1).
AP – [inout] device pointer pointing to the first matrix (A_1). Stores the specified triangular portion of each Hermitian matrix A_i.
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of each Hermitian matrix A_i is supplied. The lower triangular portion of each A_i will not be touched.
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of each Hermitian matrix A_i is supplied. The upper triangular portion of each A_i will not be touched.
Note that the imaginary parts of the diagonal elements are not accessed and are assumed to be 0.
lda – [in] [int] specifies the leading dimension of each A_i. Must be at least max(1, n).
strideA – [in] [hipblasStride] specifies the stride between the beginning of one matrix (A_i) and the next (A_i+1).
batchCount – [in] [int] number of instances in the batch.
The her2StridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXhpmv + Batched, StridedBatched#
-
hipblasStatus_t hipblasChpmv(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *AP, const hipComplex *x, int incx, const hipComplex *beta, hipComplex *y, int incy)#
-
hipblasStatus_t hipblasZhpmv(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, const hipDoubleComplex *x, int incx, const hipDoubleComplex *beta, hipDoubleComplex *y, int incy)#
BLAS Level 2 API
The hpmv functions perform the matrix-vector operation:
y := alpha*A*x + beta*y
where alpha and beta are scalars, x and y are n-element vectors, and A is an n by n Hermitian matrix, supplied in packed form (see description below).
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: c and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: the upper triangular part of the Hermitian matrix A is supplied in AP.
HIPBLAS_FILL_MODE_LOWER: the lower triangular part of the Hermitian matrix A is supplied in AP.
n – [in] [int] the order of the matrix A. Must be >= 0.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] device pointer storing the packed version of the specified triangular portion of the Hermitian matrix A. Of at least size ((n * (n + 1)) / 2).
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of the Hermitian matrix A is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(0,1)
AP(2) = A(1,1), and so forth.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 3)
(1, 0) (2, 1) (3, 2)
(2,-1) (4, 0) (5,-1) -> [(1,0), (2,1), (4,0), (3,2), (5,-1), (6,0)]
(3,-2) (5, 1) (6, 0)
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of the Hermitian matrix A is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(1,0)
AP(2) = A(2,0), and so forth.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 3)
(1, 0) (2, 1) (3, 2)
(2,-1) (4, 0) (5,-1) -> [(1,0), (2,-1), (3,-2), (4,0), (5,1), (6,0)]
(3,-2) (5, 1) (6, 0)
Note that the imaginary parts of the diagonal elements are not accessed and are assumed to be 0.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device pointer storing vector y.
incy – [in] [int] specifies the increment for the elements of y.
The hpmv functions support the 64-bit integer interface. See the ILP64 interfaces section.
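The packed layout described for AP can be reproduced on the host before the data is copied to the device. The sketch below is plain C++ and does not call hipBLAS; packed_index and pack_hermitian are hypothetical helper names, and the full matrix is assumed to be stored column-major with leading dimension lda.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cf = std::complex<float>;

// 0-based position of A(row, col) inside the packed array.
// The element must lie in the stored triangle.
std::size_t packed_index(bool upper, int n, int row, int col)
{
    assert(0 <= row && row < n && 0 <= col && col < n);
    assert(upper ? row <= col : row >= col);
    if (upper) // columns 0..col-1 hold 1, 2, ..., col entries
        return static_cast<std::size_t>(col) * (col + 1) / 2 + row;
    // lower: columns 0..col-1 hold n, n-1, ..., n-col+1 entries
    return static_cast<std::size_t>(col) * (2 * n - col + 1) / 2 + (row - col);
}

// Copies the selected triangle of a column-major matrix into packed form.
std::vector<cf> pack_hermitian(bool upper, int n, const std::vector<cf>& A, int lda)
{
    std::vector<cf> AP(static_cast<std::size_t>(n) * (n + 1) / 2);
    for (int col = 0; col < n; ++col) {
        int rowBegin = upper ? 0 : col;
        int rowEnd = upper ? col : n - 1;
        for (int row = rowBegin; row <= rowEnd; ++row)
            AP[packed_index(upper, n, row, col)]
                = A[row + static_cast<std::size_t>(col) * lda];
    }
    return AP;
}
```

Applied to the n = 3 example matrix above, pack_hermitian yields the same six-element arrays listed for the upper and lower cases.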
-
hipblasStatus_t hipblasChpmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *const AP[], const hipComplex *const x[], int incx, const hipComplex *beta, hipComplex *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasZhpmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *const AP[], const hipDoubleComplex *const x[], int incx, const hipDoubleComplex *beta, hipDoubleComplex *const y[], int incy, int batchCount)#
BLAS Level 2 API
The hpmvBatched functions perform the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i
where alpha and beta are scalars, x_i and y_i are n-element vectors, and A_i is an n by n Hermitian matrix, supplied in packed form (see description below), for each batch in i = [1, batchCount].
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: the upper triangular part of each Hermitian matrix A_i is supplied in AP.
HIPBLAS_FILL_MODE_LOWER: the lower triangular part of each Hermitian matrix A_i is supplied in AP.
n – [in] [int] the order of each matrix A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] device array of device pointers storing the packed version of the specified triangular portion of each Hermitian matrix A_i. Each AP_i is of at least size ((n * (n + 1)) / 2).
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of each Hermitian matrix A_i is supplied. The matrix is compacted so that each AP_i contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(0,1)
AP(2) = A(1,1), and so forth.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 3)
(1, 0) (2, 1) (3, 2)
(2,-1) (4, 0) (5,-1) -> [(1,0), (2,1), (4,0), (3,2), (5,-1), (6,0)]
(3,-2) (5, 1) (6, 0)
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of each Hermitian matrix A_i is supplied. The matrix is compacted so that each AP_i contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(1,0)
AP(2) = A(2,0), and so forth.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 3)
(1, 0) (2, 1) (3, 2)
(2,-1) (4, 0) (5,-1) -> [(1,0), (2,-1), (3,-2), (4,0), (5,1), (6,0)]
(3,-2) (5, 1) (6, 0)
Note that the imaginary parts of the diagonal elements are not accessed and are assumed to be 0.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment for the elements of each y_i.
batchCount – [in] [int] number of instances in the batch.
The hpmvBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasChpmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *AP, hipblasStride strideA, const hipComplex *x, int incx, hipblasStride stridex, const hipComplex *beta, hipComplex *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasZhpmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, hipblasStride strideA, const hipDoubleComplex *x, int incx, hipblasStride stridex, const hipDoubleComplex *beta, hipDoubleComplex *y, int incy, hipblasStride stridey, int batchCount)#
BLAS Level 2 API
The hpmvStridedBatched functions perform the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i
where alpha and beta are scalars, x_i and y_i are n-element vectors, and A_i is an n by n Hermitian matrix, supplied in packed form (see description below), for each batch in i = [1, batchCount].
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: the upper triangular part of each Hermitian matrix A_i is supplied in AP.
HIPBLAS_FILL_MODE_LOWER: the lower triangular part of each Hermitian matrix A_i is supplied in AP.
n – [in] [int] the order of each matrix A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] device pointer pointing to the beginning of the first matrix (AP_1). Stores the packed version of the specified triangular portion of each Hermitian matrix AP_i of size ((n * (n + 1)) / 2).
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of each Hermitian matrix A_i is supplied. The matrix is compacted so that each AP_i contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(0,1)
AP(2) = A(1,1), and so forth.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 3)
(1, 0) (2, 1) (3, 2)
(2,-1) (4, 0) (5,-1) -> [(1,0), (2,1), (4,0), (3,2), (5,-1), (6,0)]
(3,-2) (5, 1) (6, 0)
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of each Hermitian matrix A_i is supplied. The matrix is compacted so that each AP_i contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(1,0)
AP(2) = A(2,0), and so forth.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 3)
(1, 0) (2, 1) (3, 2)
(2,-1) (4, 0) (5,-1) -> [(1,0), (2,-1), (3,-2), (4,0), (5,1), (6,0)]
(3,-2) (5, 1) (6, 0)
Note that the imaginary parts of the diagonal elements are not accessed and are assumed to be 0.
strideA – [in] [hipblasStride] stride from the start of one matrix (AP_i) to the next one (AP_i+1).
x – [in] device pointer pointing to the beginning of the first vector (x_1).
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1).
beta – [in] device pointer or host pointer to scalar beta.
y – [inout] device pointer pointing to the beginning of the first vector (y_1).
incy – [in] [int] specifies the increment for the elements of each y_i.
stridey – [in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_i+1).
batchCount – [in] [int] number of instances in the batch.
The hpmvStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
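Because only one triangle is stored, the product needs the other triangle to be reconstructed as the complex conjugate of the stored entry. The following host-side C++ sketch shows that lookup together with the stride arithmetic for each batch instance. It is not a hipBLAS call; packed_get and hpmv_strided_batched_ref are hypothetical names, increments are assumed positive, and y is always read (the sketch does not special-case beta equal to zero).

```cpp
#include <cassert>
#include <complex>
#include <cstdint>
#include <vector>

using cd = std::complex<double>;

// Reads A(row, col) of a Hermitian matrix from its packed triangle.
cd packed_get(bool upper, int n, const cd* AP, int row, int col)
{
    bool stored = upper ? row <= col : row >= col;
    int r = stored ? row : col; // mirror the index if the entry is not stored
    int c = stored ? col : row;
    std::int64_t idx = upper ? std::int64_t(c) * (c + 1) / 2 + r
                             : std::int64_t(c) * (2 * n - c + 1) / 2 + (r - c);
    cd v = AP[idx];
    if (row == col)
        return cd(v.real(), 0.0); // imaginary part of the diagonal is ignored
    return stored ? v : std::conj(v);
}

// Hypothetical reference for y_i := alpha*A_i*x_i + beta*y_i.
void hpmv_strided_batched_ref(bool upper, int n, cd alpha,
                              const cd* AP, std::int64_t strideA,
                              const cd* x, int incx, std::int64_t stridex,
                              cd beta, cd* y, int incy, std::int64_t stridey,
                              int batchCount)
{
    assert(n >= 0 && incx > 0 && incy > 0 && batchCount >= 0);
    for (int b = 0; b < batchCount; ++b) {
        const cd* Ab = AP + b * strideA;
        const cd* xb = x + b * stridex;
        cd* yb = y + b * stridey;
        std::vector<cd> acc(n); // A_b * x_b, zero-initialized
        for (int row = 0; row < n; ++row)
            for (int col = 0; col < n; ++col)
                acc[row] += packed_get(upper, n, Ab, row, col) * xb[col * incx];
        for (int row = 0; row < n; ++row)
            yb[row * incy] = alpha * acc[row] + beta * yb[row * incy];
    }
}
```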
hipblasXhpr + Batched, StridedBatched#
-
hipblasStatus_t hipblasChpr(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const hipComplex *x, int incx, hipComplex *AP)#
-
hipblasStatus_t hipblasZhpr(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const hipDoubleComplex *x, int incx, hipDoubleComplex *AP)#
BLAS Level 2 API
The hpr functions perform the matrix-vector operations:
A := A + alpha*x*x**H
where alpha is a real scalar, x is a vector, and A is an n by n Hermitian matrix, supplied in packed form.
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: c and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of A is supplied in AP.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of A is supplied in AP.
n – [in] [int] the number of rows and columns of matrix A. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
AP – [inout] device pointer storing the packed version of the specified triangular portion of the Hermitian matrix A. Of at least size ((n * (n + 1)) / 2).
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of the Hermitian matrix A is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(0,1)
AP(2) = A(1,1), and so forth.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 3)
(1, 0) (2, 1) (4,9)
(2,-1) (3, 0) (5,3) -> [(1,0), (2,1), (3,0), (4,9), (5,3), (6,0)]
(4,-9) (5,-3) (6,0)
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of the Hermitian matrix A is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(1,0)
AP(2) = A(2,0), and so forth.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 3)
(1, 0) (2, 1) (4,9)
(2,-1) (3, 0) (5,3) -> [(1,0), (2,-1), (4,-9), (3,0), (5,-3), (6,0)]
(4,-9) (5,-3) (6,0)
Note that the imaginary parts of the diagonal elements are not accessed and are assumed to be 0.
The hpr functions support the 64-bit integer interface. See the ILP64 interfaces section.
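Since the packed array lists the stored triangle column by column, a rank-1 update can walk AP with a single running index instead of computing a position for every element. The sketch below illustrates this on the host; hpr_ref is a hypothetical name, increments are assumed positive, and no hipBLAS function is called.

```cpp
#include <cassert>
#include <complex>
#include <vector>

using cf = std::complex<float>;

// Hypothetical host-side reference for A := A + alpha*x*x^H with A packed.
void hpr_ref(bool upper, int n, float alpha, const cf* x, int incx, cf* AP)
{
    assert(n >= 0 && incx > 0);
    int k = 0; // running position in AP; advances in storage order
    for (int col = 0; col < n; ++col) {
        int rowBegin = upper ? 0 : col;
        int rowEnd = upper ? col : n - 1;
        for (int row = rowBegin; row <= rowEnd; ++row, ++k) {
            cf update = alpha * x[row * incx] * std::conj(x[col * incx]);
            // Keep the diagonal real, as a Hermitian matrix requires.
            AP[k] = (row == col) ? cf(AP[k].real() + update.real(), 0.0f)
                                 : AP[k] + update;
        }
    }
}
```

Starting from the upper packed array of the example above with alpha = 1 and x = [(1,0), (0,1), (1,1)], this sketch produces [(2,0), (2,0), (4,0), (5,8), (6,4), (8,0)].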
-
hipblasStatus_t hipblasChprBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const hipComplex *const x[], int incx, hipComplex *const AP[], int batchCount)#
-
hipblasStatus_t hipblasZhprBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const hipDoubleComplex *const x[], int incx, hipDoubleComplex *const AP[], int batchCount)#
BLAS Level 2 API
The hprBatched functions perform the matrix-vector operations:
A_i := A_i + alpha*x_i*x_i**H
where alpha is a real scalar, x_i is a vector, and A_i is an n by n Hermitian matrix, supplied in packed form, for i = 1, …, batchCount.
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of each A_i is supplied in AP.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of each A_i is supplied in AP.
n – [in] [int] the number of rows and columns of each matrix A_i. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
AP – [inout] device array of device pointers storing the packed version of the specified triangular portion of each Hermitian matrix A_i of at least size ((n * (n + 1)) / 2). Array is of at least size batchCount.
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of each Hermitian matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(0,1)
AP(2) = A(1,1), and so forth.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 3)
(1, 0) (2, 1) (4,9)
(2,-1) (3, 0) (5,3) -> [(1,0), (2,1), (3,0), (4,9), (5,3), (6,0)]
(4,-9) (5,-3) (6,0)
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of each Hermitian matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(1,0)
AP(2) = A(2,0), and so forth.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 3)
(1, 0) (2, 1) (4,9)
(2,-1) (3, 0) (5,3) -> [(1,0), (2,-1), (4,-9), (3,0), (5,-3), (6,0)]
(4,-9) (5,-3) (6,0)
Note that the imaginary parts of the diagonal elements are not accessed and are assumed to be 0.
batchCount – [in] [int] number of instances in the batch.
The hprBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasChprStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const hipComplex *x, int incx, hipblasStride stridex, hipComplex *AP, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasZhprStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const hipDoubleComplex *x, int incx, hipblasStride stridex, hipDoubleComplex *AP, hipblasStride strideA, int batchCount)#
BLAS Level 2 API
The hprStridedBatched functions perform the matrix-vector operations:
A_i := A_i + alpha*x_i*x_i**H
where alpha is a real scalar, x_i is a vector, and A_i is an n by n Hermitian matrix, supplied in packed form, for i = 1, …, batchCount.
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of each A_i is supplied in AP.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of each A_i is supplied in AP.
n – [in] [int] the number of rows and columns of each matrix A_i. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer pointing to the first vector (x_1).
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1).
AP – [inout] device pointer pointing to the first matrix (A_1). Stores the packed version of the specified triangular portion of each Hermitian matrix A_i.
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of each Hermitian matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(0,1)
AP(2) = A(1,1), and so forth.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 3)
(1, 0) (2, 1) (4,9)
(2,-1) (3, 0) (5,3) -> [(1,0), (2,1), (3,0), (4,9), (5,3), (6,0)]
(4,-9) (5,-3) (6,0)
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of each Hermitian matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(1,0)
AP(2) = A(2,0), and so forth.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 3)
(1, 0) (2, 1) (4,9)
(2,-1) (3, 0) (5,3) -> [(1,0), (2,-1), (4,-9), (3,0), (5,-3), (6,0)]
(4,-9) (5,-3) (6,0)
Note that the imaginary parts of the diagonal elements are not accessed and are assumed to be 0.
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next (A_i+1).
batchCount – [in] [int] number of instances in the batch.
The hprStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXhpr2 + Batched, StridedBatched#
-
hipblasStatus_t hipblasChpr2(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *x, int incx, const hipComplex *y, int incy, hipComplex *AP)#
-
hipblasStatus_t hipblasZhpr2(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *x, int incx, const hipDoubleComplex *y, int incy, hipDoubleComplex *AP)#
BLAS Level 2 API
The hpr2 functions perform the matrix-vector operations:
A := A + alpha*x*y**H + conj(alpha)*y*x**H
where alpha is a complex scalar, x and y are vectors, and A is an n by n Hermitian matrix, supplied in packed form.
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: c and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of A is supplied in AP.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of A is supplied in AP.
n – [in] [int] the number of rows and columns of matrix A. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
y – [in] device pointer storing vector y.
incy – [in] [int] specifies the increment for the elements of y.
AP – [inout] device pointer storing the packed version of the specified triangular portion of the Hermitian matrix A. Of at least size ((n * (n + 1)) / 2).
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of the Hermitian matrix A is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(0,1)
AP(2) = A(1,1), and so forth.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 3)
(1, 0) (2, 1) (4,9)
(2,-1) (3, 0) (5,3) -> [(1,0), (2,1), (3,0), (4,9), (5,3), (6,0)]
(4,-9) (5,-3) (6,0)
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of the Hermitian matrix A is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(1,0)
AP(2) = A(2,0), and so forth.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 3)
(1, 0) (2, 1) (4,9)
(2,-1) (3, 0) (5,3) -> [(1,0), (2,-1), (4,-9), (3,0), (5,-3), (6,0)]
(4,-9) (5,-3) (6,0)
Note that the imaginary parts of the diagonal elements are not accessed and are assumed to be 0.
The hpr2 functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasChpr2Batched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *const x[], int incx, const hipComplex *const y[], int incy, hipComplex *const AP[], int batchCount)#
-
hipblasStatus_t hipblasZhpr2Batched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *const x[], int incx, const hipDoubleComplex *const y[], int incy, hipDoubleComplex *const AP[], int batchCount)#
BLAS Level 2 API
The hpr2Batched functions perform the matrix-vector operations:
A_i := A_i + alpha*x_i*y_i**H + conj(alpha)*y_i*x_i**H
where alpha is a complex scalar, x_i and y_i are vectors, and A_i is an n by n Hermitian matrix, supplied in packed form, for i = 1, …, batchCount.
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of each A_i is supplied in AP.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of each A_i is supplied in AP.
n – [in] [int] the number of rows and columns of each matrix A_i. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
y – [in] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment for the elements of each y_i.
AP – [inout] device array of device pointers storing the packed version of the specified triangular portion of each Hermitian matrix A_i of at least size ((n * (n + 1)) / 2). Array is of at least size batchCount.
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of each Hermitian matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(0,1)
AP(2) = A(1,1), and so forth.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 3)
(1, 0) (2, 1) (4,9)
(2,-1) (3, 0) (5,3) -> [(1,0), (2,1), (3,0), (4,9), (5,3), (6,0)]
(4,-9) (5,-3) (6,0)
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of each Hermitian matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(1,0)
AP(2) = A(2,0), and so forth.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 3)
(1, 0) (2, 1) (4,9)
(2,-1) (3, 0) (5,3) -> [(1,0), (2,-1), (4,-9), (3,0), (5,-3), (6,0)]
(4,-9) (5,-3) (6,0)
Note that the imaginary parts of the diagonal elements are not accessed and are assumed to be 0.
batchCount – [in] [int] number of instances in the batch.
The hpr2Batched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasChpr2StridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *x, int incx, hipblasStride stridex, const hipComplex *y, int incy, hipblasStride stridey, hipComplex *AP, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasZhpr2StridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *x, int incx, hipblasStride stridex, const hipDoubleComplex *y, int incy, hipblasStride stridey, hipDoubleComplex *AP, hipblasStride strideA, int batchCount)#
BLAS Level 2 API
The hpr2StridedBatched functions perform the matrix-vector operations:
A_i := A_i + alpha*x_i*y_i**H + conj(alpha)*y_i*x_i**H
where alpha is a complex scalar, x_i and y_i are vectors, and A_i is an n by n Hermitian matrix, supplied in packed form, for i = 1, …, batchCount.
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of each A_i is supplied in AP.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of each A_i is supplied in AP.
n – [in] [int] the number of rows and columns of each matrix A_i. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer pointing to the first vector (x_1).
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1).
y – [in] device pointer pointing to the first vector (y_1).
incy – [in] [int] specifies the increment for the elements of each y_i.
stridey – [in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_i+1).
AP – [inout] device pointer storing the packed version of the specified triangular portion of each Hermitian matrix A_i. Points to the first matrix (A_1).
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of each Hermitian matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(0,1)
AP(2) = A(1,1), and so forth.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 3)
(1, 0) (2, 1) (4,9)
(2,-1) (3, 0) (5,3) -> [(1,0), (2,1), (3,0), (4,9), (5,3), (6,0)]
(4,-9) (5,-3) (6,0)
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of each Hermitian matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(1,0)
AP(2) = A(2,0), and so forth.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 3)
(1, 0) (2, 1) (4,9)
(2,-1) (3, 0) (5,3) -> [(1,0), (2,-1), (4,-9), (3,0), (5,-3), (6,0)]
(4,-9) (5,-3) (6,0)
Note that the imaginary parts of the diagonal elements are not accessed and are assumed to be 0.
strideA – [in] [hipblasStride] stride from the start of one (A_i) to the next (A_i+1).
batchCount – [in] [int] number of instances in the batch.
The hpr2StridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
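The packed layout described for AP can be checked on the host. The following is a minimal C++ sketch, not part of hipBLAS: the helper names pack_triangle and example_hermitian are hypothetical, and the input matrix is assumed to be stored column-major with leading dimension n. It reproduces both packed arrays from the n = 3 examples above.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

using cf = std::complex<float>;

// Pack one triangle of a column-major n-by-n matrix (leading dimension n)
// into column-by-column packed order.
// upper == true  copies rows 0..j   of each column j.
// upper == false copies rows j..n-1 of each column j.
std::vector<cf> pack_triangle(const std::vector<cf>& a, int n, bool upper)
{
    std::vector<cf> ap;
    ap.reserve(static_cast<std::size_t>(n) * (n + 1) / 2);
    for (int j = 0; j < n; ++j) {
        int first = upper ? 0 : j;
        int last  = upper ? j : n - 1;
        for (int i = first; i <= last; ++i)
            ap.push_back(a[i + static_cast<std::size_t>(j) * n]);
    }
    return ap;
}

// Column-major copy of the 3-by-3 Hermitian example matrix from the text.
std::vector<cf> example_hermitian()
{
    return { cf(1, 0), cf(2, -1), cf(4, -9),   // column 0
             cf(2, 1), cf(3, 0),  cf(5, -3),   // column 1
             cf(4, 9), cf(5, 3),  cf(6, 0) };  // column 2
}
```

Each packed A_i holds n*(n+1)/2 elements, so for non-overlapping instances strideA would normally be at least that large.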
hipblasXsbmv + Batched, StridedBatched#
-
hipblasStatus_t hipblasSsbmv(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, int k, const float *alpha, const float *AP, int lda, const float *x, int incx, const float *beta, float *y, int incy)#
-
hipblasStatus_t hipblasDsbmv(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, int k, const double *alpha, const double *AP, int lda, const double *x, int incx, const double *beta, double *y, int incy)#
BLAS Level 2 API
The sbmv functions perform the matrix-vector operation:
y := alpha*A*x + beta*y,
where alpha and beta are scalars, x and y are n-element vectors, and A should contain an upper or lower triangular n by n symmetric banded matrix.
Supported precisions in rocBLAS: s and d.
Supported precisions in cuBLAS: s and d.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
If HIPBLAS_FILL_MODE_UPPER, the lower part of A is not referenced.
If HIPBLAS_FILL_MODE_LOWER, the upper part of A is not referenced.
n – [in] [int] the number of rows and columns of matrix A.
k – [in] [int] specifies the number of sub- and super-diagonals.
alpha – [in] specifies the scalar alpha.
AP – [in] pointer storing matrix A on the GPU.
lda – [in] [int] specifies the leading dimension of matrix A.
x – [in] pointer storing vector x on the GPU.
incx – [in] [int] specifies the increment for the elements of x.
beta – [in] specifies the scalar beta.
y – [out] pointer storing vector y on the GPU.
incy – [in] [int] specifies the increment for the elements of y.
The sbmv functions support the 64-bit integer interface. See the ILP64 interfaces section.
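As a rough host-side illustration of the sbmv operation, the sketch below computes y := alpha*A*x + beta*y from band storage. It is not the hipBLAS implementation. The name sbmv_ref is hypothetical, unit increments are used, and the band layout is an assumption taken from the reference BLAS convention, which this section does not spell out: with lda >= k + 1, upper storage keeps A(i,j) at row k + i - j of column j, and lower storage keeps it at row i - j of column j.

```cpp
#include <algorithm>
#include <vector>

// CPU reference for y := alpha*A*x + beta*y, A symmetric banded with k
// sub-/super-diagonals, unit increments. ab is column-major, lda >= k + 1.
// Assumed band layout (reference BLAS convention):
//   upper: A(i,j) == ab[(k + i - j) + j*lda]  for max(0, j-k) <= i <= j
//   lower: A(i,j) == ab[(i - j) + j*lda]      for j <= i <= min(n-1, j+k)
void sbmv_ref(bool upper, int n, int k, float alpha,
              const std::vector<float>& ab, int lda,
              const std::vector<float>& x, float beta, std::vector<float>& y)
{
    std::vector<float> acc(n, 0.0f);  // accumulates A*x
    for (int j = 0; j < n; ++j) {
        int i0 = upper ? std::max(0, j - k) : j;
        int i1 = upper ? j : std::min(n - 1, j + k);
        for (int i = i0; i <= i1; ++i) {
            float aij = upper ? ab[(k + i - j) + j * lda]
                              : ab[(i - j) + j * lda];
            acc[i] += aij * x[j];
            if (i != j) acc[j] += aij * x[i];  // mirrored entry A(j,i)
        }
    }
    for (int i = 0; i < n; ++i) y[i] = alpha * acc[i] + beta * y[i];
}
```

For a 3 by 3 tridiagonal matrix with 2 on the diagonal and 1 next to it (k = 1, lda = 2), the lower band array would be {2, 1, 2, 1, 2, 0}, where the last entry is padding that is never read.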
-
hipblasStatus_t hipblasSsbmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, int k, const float *alpha, const float *const AP[], int lda, const float *const x[], int incx, const float *beta, float *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasDsbmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, int k, const double *alpha, const double *const AP[], int lda, const double *const x[], int incx, const double *beta, double *const y[], int incy, int batchCount)#
BLAS Level 2 API
The sbmvBatched functions perform the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i,
where (A_i, x_i, y_i) is the i-th instance of the batch, alpha and beta are scalars, x_i and y_i are vectors, and A_i is an n by n symmetric banded matrix, for i = 1, …, batchCount. A should contain an upper or lower triangular n by n symmetric banded matrix.
Supported precisions in rocBLAS: s and d.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
If HIPBLAS_FILL_MODE_UPPER, the lower part of A is not referenced.
If HIPBLAS_FILL_MODE_LOWER, the upper part of A is not referenced.
n – [in] [int] number of rows and columns of each matrix A_i.
k – [in] [int] specifies the number of sub- and super-diagonals.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] device array of device pointers storing each matrix A_i.
lda – [in] [int] specifies the leading dimension of each matrix A_i.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each vector x_i.
beta – [in] device pointer or host pointer to scalar beta.
y – [out] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment for the elements of each vector y_i.
batchCount – [in] [int] number of instances in the batch.
The sbmvBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSsbmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, int k, const float *alpha, const float *AP, int lda, hipblasStride strideA, const float *x, int incx, hipblasStride stridex, const float *beta, float *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasDsbmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, int k, const double *alpha, const double *AP, int lda, hipblasStride strideA, const double *x, int incx, hipblasStride stridex, const double *beta, double *y, int incy, hipblasStride stridey, int batchCount)#
BLAS Level 2 API
The sbmvStridedBatched functions perform the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i,
where (A_i, x_i, y_i) is the i-th instance of the batch, alpha and beta are scalars, x_i and y_i are vectors, and A_i is an n by n symmetric banded matrix, for i = 1, …, batchCount. A should contain an upper or lower triangular n by n symmetric banded matrix.
Supported precisions in rocBLAS: s and d.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
If HIPBLAS_FILL_MODE_UPPER, the lower part of A is not referenced.
If HIPBLAS_FILL_MODE_LOWER, the upper part of A is not referenced.
n – [in] [int] number of rows and columns of each matrix A_i.
k – [in] [int] specifies the number of sub- and super-diagonals.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] device pointer to the first matrix A_1 on the GPU.
lda – [in] [int] specifies the leading dimension of each matrix A_i.
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
x – [in] device pointer to the first vector x_1 on the GPU.
incx – [in] [int] specifies the increment for the elements of each vector x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1). There are no restrictions placed on stridex. However, the user should ensure that stridex is of an appropriate size. This typically means stridex >= n * incx. stridex should be non zero.
beta – [in] device pointer or host pointer to scalar beta.
y – [out] device pointer to the first vector y_1 on the GPU.
incy – [in] [int] specifies the increment for the elements of each vector y_i.
stridey – [in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_i+1). There are no restrictions placed on stridey. However, the user should ensure that stridey is of an appropriate size. This typically means stridey >= n * incy. stridey should be non zero.
batchCount – [in] [int] number of instances in the batch.
The sbmvStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
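The stride guidance above (stridex >= n * incx, and likewise for stridey) can be turned into simple host-side arithmetic when sizing buffers. This is a sketch under stated assumptions: the helper names are hypothetical, increments are assumed positive, and strides are modelled as 64-bit signed integers counted in elements.

```cpp
#include <cstdint>

// Hypothetical helpers; positive increments are assumed throughout.

// Smallest stride (in elements) that keeps consecutive n-element vectors
// with increment inc from overlapping: the "stride >= n * inc" guideline.
std::int64_t min_vector_stride(int n, int inc)
{
    return static_cast<std::int64_t>(n) * inc;
}

// Element offset of instance i (0-based) from the pointer to the first
// instance.
std::int64_t batch_offset(int i, std::int64_t stride)
{
    return static_cast<std::int64_t>(i) * stride;
}

// Number of elements a strided vector buffer must hold so that the last
// element of the last instance is still in bounds.
std::int64_t strided_buffer_elems(int n, int inc, std::int64_t stride,
                                  int batchCount)
{
    if (n <= 0 || batchCount <= 0) return 0;
    return batch_offset(batchCount - 1, stride)
         + static_cast<std::int64_t>(n - 1) * inc + 1;
}
```

For example, with n = 4, incx = 2, and batchCount = 3, the minimum non-overlapping stride is 8 elements and the buffer needs at least 23 elements.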
hipblasXspmv + Batched, StridedBatched#
-
hipblasStatus_t hipblasSspmv(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *AP, const float *x, int incx, const float *beta, float *y, int incy)#
-
hipblasStatus_t hipblasDspmv(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *AP, const double *x, int incx, const double *beta, double *y, int incy)#
BLAS Level 2 API
The spmv functions perform the matrix-vector operation:
y := alpha*A*x + beta*y,
where alpha and beta are scalars, x and y are n-element vectors, and A should contain an upper or lower triangular n by n packed symmetric matrix.
Supported precisions in rocBLAS: s and d.
Supported precisions in cuBLAS: s and d.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
If HIPBLAS_FILL_MODE_UPPER, the lower part of A is not referenced.
If HIPBLAS_FILL_MODE_LOWER, the upper part of A is not referenced.
n – [in] [int] the number of rows and columns of matrix A.
alpha – [in] specifies the scalar alpha.
AP – [in] pointer storing matrix A on the GPU.
x – [in] pointer storing vector x on the GPU.
incx – [in] [int] specifies the increment for the elements of x.
beta – [in] specifies the scalar beta.
y – [out] pointer storing vector y on the GPU.
incy – [in] [int] specifies the increment for the elements of y.
The spmv functions support the 64-bit integer interface. See the ILP64 interfaces section.
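To make the packed symmetric product concrete, here is a small host-side reference, offered as a sketch rather than as the library's method. The names packed_index and spmv_ref are hypothetical, unit increments are assumed, and the packed layout is the column-by-column one documented for the packed routines in this reference (AP(0) = A(0,0), AP(1) = A(0,1) for upper storage).

```cpp
#include <cstddef>
#include <vector>

// 0-based position of A(i,j) inside the packed array.
// Requires i <= j for upper storage and i >= j for lower storage.
std::size_t packed_index(bool upper, int n, int i, int j)
{
    if (upper)
        return static_cast<std::size_t>(j) * (j + 1) / 2 + i;
    return static_cast<std::size_t>(j) * (2 * n - j - 1) / 2 + i;
}

// CPU reference for y := alpha*A*x + beta*y with A given in packed form.
void spmv_ref(bool upper, int n, float alpha, const std::vector<float>& ap,
              const std::vector<float>& x, float beta, std::vector<float>& y)
{
    for (int i = 0; i < n; ++i) {
        float sum = 0.0f;
        for (int j = 0; j < n; ++j) {
            // Entries outside the stored triangle come from A(i,j) == A(j,i).
            bool stored = upper ? (i <= j) : (i >= j);
            std::size_t idx = stored ? packed_index(upper, n, i, j)
                                     : packed_index(upper, n, j, i);
            sum += ap[idx] * x[j];
        }
        y[i] = alpha * sum + beta * y[i];
    }
}
```

Only n*(n+1)/2 values are ever read from ap, which is the point of the packed format: the opposite triangle is never stored.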
-
hipblasStatus_t hipblasSspmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *const AP[], const float *const x[], int incx, const float *beta, float *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasDspmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *const AP[], const double *const x[], int incx, const double *beta, double *const y[], int incy, int batchCount)#
BLAS Level 2 API
The spmvBatched functions perform the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i,
where (A_i, x_i, y_i) is the i-th instance of the batch, alpha and beta are scalars, x_i and y_i are vectors, and A_i is an n by n symmetric matrix, for i = 1, …, batchCount. A should contain an upper or lower triangular n by n packed symmetric matrix.
Supported precisions in rocBLAS: s and d.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
If HIPBLAS_FILL_MODE_UPPER, the lower part of A is not referenced.
If HIPBLAS_FILL_MODE_LOWER, the upper part of A is not referenced.
n – [in] [int] number of rows and columns of each matrix A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] device array of device pointers storing each matrix A_i.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each vector x_i.
beta – [in] device pointer or host pointer to scalar beta.
y – [out] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment for the elements of each vector y_i.
batchCount – [in] [int] number of instances in the batch.
The spmvBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSspmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *AP, hipblasStride strideA, const float *x, int incx, hipblasStride stridex, const float *beta, float *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasDspmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *AP, hipblasStride strideA, const double *x, int incx, hipblasStride stridex, const double *beta, double *y, int incy, hipblasStride stridey, int batchCount)#
BLAS Level 2 API
The spmvStridedBatched functions perform the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i,
where (A_i, x_i, y_i) is the i-th instance of the batch, alpha and beta are scalars, x_i and y_i are vectors, and A_i is an n by n symmetric matrix, for i = 1, …, batchCount. A should contain an upper or lower triangular n by n packed symmetric matrix.
Supported precisions in rocBLAS: s and d.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
If HIPBLAS_FILL_MODE_UPPER, the lower part of A is not referenced.
If HIPBLAS_FILL_MODE_LOWER, the upper part of A is not referenced.
n – [in] [int] number of rows and columns of each matrix A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] device pointer to the first matrix A_1 on the GPU.
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
x – [in] device pointer to the first vector x_1 on the GPU.
incx – [in] [int] specifies the increment for the elements of each vector x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1). There are no restrictions placed on stridex. However, the user should take care to ensure that stridex is of an appropriate size. This typically means stridex >= n * incx. stridex should be non zero.
beta – [in] device pointer or host pointer to scalar beta.
y – [out] device pointer to the first vector y_1 on the GPU.
incy – [in] [int] specifies the increment for the elements of each vector y_i.
stridey – [in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_i+1). There are no restrictions placed on stridey. However, the user should take care to ensure that stridey is of an appropriate size. This typically means stridey >= n * incy. stridey should be non zero.
batchCount – [in] [int] number of instances in the batch.
The spmvStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXspr + Batched, StridedBatched#
-
hipblasStatus_t hipblasSspr(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *x, int incx, float *AP)#
-
hipblasStatus_t hipblasDspr(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *x, int incx, double *AP)#
-
hipblasStatus_t hipblasCspr(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *x, int incx, hipComplex *AP)#
-
hipblasStatus_t hipblasZspr(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *x, int incx, hipDoubleComplex *AP)#
BLAS Level 2 API
The spr functions perform the matrix-vector operations:
A := A + alpha*x*x**T
where alpha is a scalar, x is a vector, and A is an n by n symmetric matrix, supplied in packed form.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of A is supplied in AP.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of A is supplied in AP.
n – [in] [int] the number of rows and columns of matrix A. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
AP – [inout] device pointer storing the packed version of the specified triangular portion of the symmetric matrix A, of at least size ((n * (n + 1)) / 2).
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of the symmetric matrix A is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(0,1)
AP(2) = A(1,1), and so forth.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 4)
1 2 4 7
2 3 5 8 -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
4 5 6 9
7 8 9 0
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of the symmetric matrix A is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(1,0)
AP(2) = A(2,0), and so forth.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 4)
1 2 3 4
2 5 6 7 -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
3 6 8 9
4 7 9 0
The spr functions support the 64-bit integer interface. See the ILP64 interfaces section.
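The rank-1 update can be sketched on the host by walking the packed array in storage order. This is an illustrative reference only: spr_ref is a hypothetical name, the real (float) case is shown, and incx is assumed positive.

```cpp
#include <cstddef>
#include <vector>

// CPU reference for A := A + alpha*x*x^T on the stored triangle of a
// packed symmetric matrix. p follows the column-by-column packed order,
// so no explicit index formula is needed.
void spr_ref(bool upper, int n, float alpha,
             const std::vector<float>& x, int incx, std::vector<float>& ap)
{
    std::size_t p = 0;
    for (int j = 0; j < n; ++j) {
        int first = upper ? 0 : j;      // first stored row of column j
        int last  = upper ? j : n - 1;  // last stored row of column j
        for (int i = first; i <= last; ++i)
            ap[p++] += alpha * x[i * incx] * x[j * incx];
    }
}
```

Starting from a zero matrix with n = 3, x = (1, 2, 3), and alpha = 2, the upper packed result is [2, 4, 8, 6, 12, 18] and the lower packed result is [2, 4, 6, 8, 12, 18], which shows how the same matrix is ordered differently in the two fill modes.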
-
hipblasStatus_t hipblasSsprBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *const x[], int incx, float *const AP[], int batchCount)#
-
hipblasStatus_t hipblasDsprBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *const x[], int incx, double *const AP[], int batchCount)#
-
hipblasStatus_t hipblasCsprBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *const x[], int incx, hipComplex *const AP[], int batchCount)#
-
hipblasStatus_t hipblasZsprBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *const x[], int incx, hipDoubleComplex *const AP[], int batchCount)#
BLAS Level 2 API
The sprBatched functions perform the matrix-vector operations:
A_i := A_i + alpha*x_i*x_i**T
where alpha is a scalar, x_i is a vector, and A_i is an n by n symmetric matrix, supplied in packed form, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of each A_i is supplied in AP.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of each A_i is supplied in AP.
n – [in] [int] the number of rows and columns of each matrix A_i. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
AP – [inout] device array of device pointers storing the packed version of the specified triangular portion of each symmetric matrix A_i of at least size ((n * (n + 1)) / 2). Array is of at least size batchCount.
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of each symmetric matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(0,1)
AP(2) = A(1,1), and so forth.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 4)
1 2 4 7
2 3 5 8 -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
4 5 6 9
7 8 9 0
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of each symmetric matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(1,0)
AP(2) = A(2,0), and so forth.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 4)
1 2 3 4
2 5 6 7 -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
3 6 8 9
4 7 9 0
batchCount – [in] [int] number of instances in the batch.
The sprBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSsprStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *x, int incx, hipblasStride stridex, float *AP, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasDsprStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *x, int incx, hipblasStride stridex, double *AP, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasCsprStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *x, int incx, hipblasStride stridex, hipComplex *AP, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasZsprStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *x, int incx, hipblasStride stridex, hipDoubleComplex *AP, hipblasStride strideA, int batchCount)#
BLAS Level 2 API
The sprStridedBatched functions perform the matrix-vector operations:
A_i := A_i + alpha*x_i*x_i**T
where alpha is a scalar, x_i is a vector, and A_i is an n by n symmetric matrix, supplied in packed form, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of each A_i is supplied in AP.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of each A_i is supplied in AP.
n – [in] [int] the number of rows and columns of each matrix A_i. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer pointing to the first vector (x_1).
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1).
AP – [inout] device pointer storing the packed version of the specified triangular portion of each symmetric matrix A_i. Points to the first A_1.
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of each symmetric matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(0,1)
AP(2) = A(1,1), and so forth.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 4)
1 2 4 7
2 3 5 8 -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
4 5 6 9
7 8 9 0
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of each symmetric matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(1,0)
AP(2) = A(2,0), and so forth.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 4)
1 2 3 4
2 5 6 7 -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
3 6 8 9
4 7 9 0
strideA – [in] [hipblasStride] stride from the start of one (A_i) to the next (A_i+1).
batchCount – [in] [int] number of instances in the batch.
The sprStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXspr2 + Batched, StridedBatched#
-
hipblasStatus_t hipblasSspr2(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *x, int incx, const float *y, int incy, float *AP)#
-
hipblasStatus_t hipblasDspr2(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *x, int incx, const double *y, int incy, double *AP)#
BLAS Level 2 API
The spr2 functions perform the matrix-vector operation:
A := A + alpha*x*y**T + alpha*y*x**T
where alpha is a scalar, x and y are vectors, and A is an n by n symmetric matrix, supplied in packed form.
Supported precisions in rocBLAS: s and d.
Supported precisions in cuBLAS: s and d.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of A is supplied in AP.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of A is supplied in AP.
n – [in] [int] the number of rows and columns of matrix A. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
y – [in] device pointer storing vector y.
incy – [in] [int] specifies the increment for the elements of y.
AP – [inout] device pointer storing the packed version of the specified triangular portion of the symmetric matrix A, of at least size ((n * (n + 1)) / 2).
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of the symmetric matrix A is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(0,1)
AP(2) = A(1,1), and so forth.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 4)
1 2 4 7
2 3 5 8 -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
4 5 6 9
7 8 9 0
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of the symmetric matrix A is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(1,0)
AP(2) = A(2,0), and so forth.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 4)
1 2 3 4
2 5 6 7 -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
3 6 8 9
4 7 9 0
The spr2 functions support the 64-bit integer interface. See the ILP64 interfaces section.
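The symmetric rank-2 update adds two outer products whose sum is symmetric, so only the stored triangle needs to be touched. A minimal host-side sketch follows. The name spr2_ref is hypothetical, and positive increments are assumed.

```cpp
#include <cstddef>
#include <vector>

// CPU reference for A := A + alpha*x*y^T + alpha*y*x^T on the stored
// triangle of a packed symmetric matrix, visited in packed order.
void spr2_ref(bool upper, int n, float alpha,
              const std::vector<float>& x, int incx,
              const std::vector<float>& y, int incy,
              std::vector<float>& ap)
{
    std::size_t p = 0;
    for (int j = 0; j < n; ++j) {
        int first = upper ? 0 : j;
        int last  = upper ? j : n - 1;
        for (int i = first; i <= last; ++i) {
            float xi = x[i * incx], xj = x[j * incx];
            float yi = y[i * incy], yj = y[j * incy];
            ap[p++] += alpha * (xi * yj + yi * xj);
        }
    }
}
```

Element (i,j) of the update is alpha*(x_i*y_j + y_i*x_j), which is unchanged when i and j are swapped. That is why the result stays symmetric even though x*y**T alone is not.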
-
hipblasStatus_t hipblasSspr2Batched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *const x[], int incx, const float *const y[], int incy, float *const AP[], int batchCount)#
-
hipblasStatus_t hipblasDspr2Batched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *const x[], int incx, const double *const y[], int incy, double *const AP[], int batchCount)#
BLAS Level 2 API
The spr2Batched functions perform the matrix-vector operation:
A_i := A_i + alpha*x_i*y_i**T + alpha*y_i*x_i**T
where alpha is a scalar, x_i and y_i are vectors, and A_i is an n by n symmetric matrix, supplied in packed form, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s and d.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of each A_i is supplied in AP.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of each A_i is supplied in AP.
n – [in] [int] the number of rows and columns of each matrix A_i. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
y – [in] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment for the elements of each y_i.
AP – [inout] device array of device pointers storing the packed version of the specified triangular portion of each symmetric matrix A_i of at least size ((n * (n + 1)) / 2). Array is of at least size batchCount.
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of each symmetric matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(0,1)
AP(2) = A(1,1), and so forth.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 4)
1 2 4 7
2 3 5 8 -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
4 5 6 9
7 8 9 0
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of each symmetric matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(1,0)
AP(2) = A(2,0), and so forth.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 4)
1 2 3 4
2 5 6 7 -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
3 6 8 9
4 7 9 0
batchCount – [in] [int] number of instances in the batch.
The spr2Batched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSspr2StridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *x, int incx, hipblasStride stridex, const float *y, int incy, hipblasStride stridey, float *AP, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasDspr2StridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *x, int incx, hipblasStride stridex, const double *y, int incy, hipblasStride stridey, double *AP, hipblasStride strideA, int batchCount)#
BLAS Level 2 API
The spr2StridedBatched functions perform the matrix-vector operation:
A_i := A_i + alpha*x_i*y_i**T + alpha*y_i*x_i**T
where alpha is a scalar, x_i and y_i are vectors, and A_i is an n by n symmetric matrix, supplied in packed form, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s and d.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
HIPBLAS_FILL_MODE_UPPER: The upper triangular part of each A_i is supplied in AP.
HIPBLAS_FILL_MODE_LOWER: The lower triangular part of each A_i is supplied in AP.
n – [in] [int] the number of rows and columns of each matrix A_i. Must be at least 0.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer pointing to the first vector (x_1).
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1).
y – [in] device pointer pointing to the first vector (y_1).
incy – [in] [int] specifies the increment for the elements of each y_i.
stridey – [in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_i+1).
AP – [inout] device pointer storing the packed version of the specified triangular portion of each symmetric matrix A_i. Points to the first A_1.
if uplo == HIPBLAS_FILL_MODE_UPPER: The upper triangular portion of each symmetric matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(0,1)
AP(2) = A(1,1), and so forth.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 4)
1 2 4 7
2 3 5 8 -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
4 5 6 9
7 8 9 0
if uplo == HIPBLAS_FILL_MODE_LOWER: The lower triangular portion of each symmetric matrix A_i is supplied. The matrix is compacted so that AP contains the triangular portion column-by-column so that:
AP(0) = A(0,0)
AP(1) = A(1,0)
AP(2) = A(2,0), and so forth.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 4)
1 2 3 4
2 5 6 7 -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
3 6 8 9
4 7 9 0
strideA – [in] [hipblasStride] stride from the start of one (A_i) to the next (A_i+1).
batchCount – [in] [int] number of instances in the batch.
The spr2StridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXsymv + Batched, StridedBatched#
-
hipblasStatus_t hipblasSsymv(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *AP, int lda, const float *x, int incx, const float *beta, float *y, int incy)#
-
hipblasStatus_t hipblasDsymv(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *AP, int lda, const double *x, int incx, const double *beta, double *y, int incy)#
-
hipblasStatus_t hipblasCsymv(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *AP, int lda, const hipComplex *x, int incx, const hipComplex *beta, hipComplex *y, int incy)#
-
hipblasStatus_t hipblasZsymv(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, const hipDoubleComplex *x, int incx, const hipDoubleComplex *beta, hipDoubleComplex *y, int incy)#
BLAS Level 2 API
The symv functions perform the matrix-vector operation:
y := alpha*A*x + beta*y,
where alpha and beta are scalars, x and y are n-element vectors, and A should contain an upper or lower triangular n by n symmetric matrix.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
If HIPBLAS_FILL_MODE_UPPER, the lower part of A is not referenced.
If HIPBLAS_FILL_MODE_LOWER, the upper part of A is not referenced.
n – [in] [int] number of rows and columns of matrix A.
alpha – [in] specifies the scalar alpha.
AP – [in] pointer storing matrix A on the GPU.
lda – [in] [int] specifies the leading dimension of A.
x – [in] pointer storing vector x on the GPU.
incx – [in] [int] specifies the increment for the elements of x.
beta – [in] specifies the scalar beta.
y – [out] pointer storing vector y on the GPU.
incy – [in] [int] specifies the increment for the elements of y.
The symv functions support the 64-bit integer interface. See the ILP64 interfaces section.
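To make the role of uplo concrete, here is a hedged host-side reference for the real-valued symv operation. It is a sketch under stated assumptions, not the hipBLAS implementation: `Fill` and `symv_ref` are hypothetical names, storage is assumed column-major, and incx = incy = 1.

```cpp
#include <vector>

// Hypothetical stand-in for hipblasFillMode_t.
enum class Fill { Upper, Lower };

// Host reference for y := alpha*A*x + beta*y (real case, unit increments).
// Only the triangle selected by uplo is read; the other triangle of the
// column-major array a (leading dimension lda) may hold anything.
std::vector<double> symv_ref(Fill uplo, int n, double alpha,
                             const std::vector<double>& a, int lda,
                             const std::vector<double>& x, double beta,
                             std::vector<double> y)
{
    for (int i = 0; i < n; ++i) {
        double sum = 0.0;
        for (int j = 0; j < n; ++j) {
            // If (i,j) is not in the stored triangle, read (j,i) instead.
            const bool stored = (uplo == Fill::Upper) ? (i <= j) : (i >= j);
            const int r = stored ? i : j;
            const int c = stored ? j : i;
            sum += a[r + c * lda] * x[j];
        }
        y[i] = alpha * sum + beta * y[i];
    }
    return y;
}
```

A quick way to check a caller's setup is to fill the unreferenced triangle with a sentinel value and confirm that the result does not change.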
-
hipblasStatus_t hipblasSsymvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *const AP[], int lda, const float *const x[], int incx, const float *beta, float *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasDsymvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *const AP[], int lda, const double *const x[], int incx, const double *beta, double *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasCsymvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *const AP[], int lda, const hipComplex *const x[], int incx, const hipComplex *beta, hipComplex *const y[], int incy, int batchCount)#
-
hipblasStatus_t hipblasZsymvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *const AP[], int lda, const hipDoubleComplex *const x[], int incx, const hipDoubleComplex *beta, hipDoubleComplex *const y[], int incy, int batchCount)#
BLAS Level 2 API
symvBatched performs the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i,
where (A_i, x_i, y_i) is the i-th instance of the batch, alpha and beta are scalars, x_i and y_i are vectors, and A_i is an n by n symmetric matrix, for i = 1, …, batchCount. A should contain an upper or lower triangular symmetric matrix. The opposing triangular part of A is not referenced.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
If HIPBLAS_FILL_MODE_UPPER, the lower part of A is not referenced.
If HIPBLAS_FILL_MODE_LOWER, the upper part of A is not referenced.
n – [in] [int] number of rows and columns of each matrix A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] device array of device pointers storing each matrix A_i.
lda – [in] [int] specifies the leading dimension of each matrix A_i.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each vector x_i.
beta – [in] device pointer or host pointer to scalar beta.
y – [out] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment for the elements of each vector y_i.
batchCount – [in] [int] number of instances in the batch.
The symvBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSsymvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *AP, int lda, hipblasStride strideA, const float *x, int incx, hipblasStride stridex, const float *beta, float *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasDsymvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *AP, int lda, hipblasStride strideA, const double *x, int incx, hipblasStride stridex, const double *beta, double *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasCsymvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *AP, int lda, hipblasStride strideA, const hipComplex *x, int incx, hipblasStride stridex, const hipComplex *beta, hipComplex *y, int incy, hipblasStride stridey, int batchCount)#
-
hipblasStatus_t hipblasZsymvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, hipblasStride strideA, const hipDoubleComplex *x, int incx, hipblasStride stridex, const hipDoubleComplex *beta, hipDoubleComplex *y, int incy, hipblasStride stridey, int batchCount)#
BLAS Level 2 API
The symvStridedBatched functions perform the matrix-vector operation:
y_i := alpha*A_i*x_i + beta*y_i,
where (A_i, x_i, y_i) is the i-th instance of the batch, alpha and beta are scalars, x_i and y_i are vectors, and A_i is an n by n symmetric matrix, for i = 1, …, batchCount. A should contain an upper or lower triangular symmetric matrix. The opposing triangular part of A is not referenced.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
If HIPBLAS_FILL_MODE_UPPER, the lower part of A is not referenced.
If HIPBLAS_FILL_MODE_LOWER, the upper part of A is not referenced.
n – [in] [int] number of rows and columns of each matrix A_i.
alpha – [in] device pointer or host pointer to scalar alpha.
AP – [in] Device pointer to the first matrix A_1 on the GPU.
lda – [in] [int] specifies the leading dimension of each matrix A_i.
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
x – [in] Device pointer to the first vector x_1 on the GPU.
incx – [in] [int] specifies the increment for the elements of each vector x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1). There are no restrictions placed on stridex. However, the user should take care to ensure that stridex is of an appropriate size. This typically means stridex >= n * incx. stridex should be non zero.
beta – [in] device pointer or host pointer to scalar beta.
y – [out] Device pointer to the first vector y_1 on the GPU.
incy – [in] [int] specifies the increment for the elements of each vector y_i.
stridey – [in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_i+1). There are no restrictions placed on stridey. However, the user should take care to ensure that stridey is of an appropriate size. This typically means stridey >= n * incy. stridey should be non zero.
batchCount – [in] [int] number of instances in the batch.
The symvStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
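The stride guidance above (stridex >= n * incx) can be checked with a little arithmetic before allocating buffers. The helpers below are a hypothetical sketch, not part of hipBLAS. All quantities are in elements, and strides are assumed positive. The span computed here, 1 + (n - 1) * |inc|, is the tightest bound, so the n * inc rule of thumb from the parameter description is slightly more conservative.

```cpp
#include <cstdlib>

// Hypothetical helpers (not part of hipBLAS). All sizes are in elements.

// Elements spanned by one n-element vector with increment inc.
long long vector_span(long long n, long long inc)
{
    return n > 0 ? 1 + (n - 1) * std::llabs(inc) : 0;
}

// Offset of instance i (0-based) from the start of the buffer.
long long batch_offset(long long i, long long stride)
{
    return i * stride;
}

// Smallest buffer that holds batchCount vectors, assuming stride > 0.
long long strided_buffer_elems(long long n, long long inc,
                               long long stride, long long batchCount)
{
    if (n <= 0 || batchCount <= 0) return 0;
    return batch_offset(batchCount - 1, stride) + vector_span(n, inc);
}

// Conservative check: true when consecutive instances could share memory.
bool span_exceeds_stride(long long n, long long inc, long long stride)
{
    return std::llabs(stride) < vector_span(n, inc);
}
```

For example, with n = 4 and incx = 2 each vector spans 7 elements, so a stride of 8 keeps instances apart while a stride of 6 does not.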
hipblasXsyr + Batched, StridedBatched#
-
hipblasStatus_t hipblasSsyr(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *x, int incx, float *AP, int lda)#
-
hipblasStatus_t hipblasDsyr(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *x, int incx, double *AP, int lda)#
-
hipblasStatus_t hipblasCsyr(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *x, int incx, hipComplex *AP, int lda)#
-
hipblasStatus_t hipblasZsyr(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *x, int incx, hipDoubleComplex *AP, int lda)#
BLAS Level 2 API
The syr functions perform the matrix-vector operations:
A := A + alpha*x*x**T
where alpha is a scalar, x is a vector, and A is an n by n symmetric matrix.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
If HIPBLAS_FILL_MODE_UPPER, the lower part of A is not referenced.
If HIPBLAS_FILL_MODE_LOWER, the upper part of A is not referenced.
n – [in] [int] the number of rows and columns of matrix A.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
AP – [inout] device pointer storing matrix A.
lda – [in] [int] specifies the leading dimension of A.
The syr functions support the 64-bit integer interface. See the ILP64 interfaces section.
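The rank-1 update can be illustrated with a small host-side reference. This is a sketch under assumptions, not the library's implementation: `syr_ref` is a hypothetical name, data is real-valued and column-major, and incx = 1. Only the triangle selected by the `upper` flag is written, which mirrors the uplo description above.

```cpp
#include <vector>

// Host reference for A := A + alpha*x*x**T (real case, incx = 1).
// Only the selected triangle of the column-major array a is updated.
std::vector<double> syr_ref(bool upper, int n, double alpha,
                            const std::vector<double>& x,
                            std::vector<double> a, int lda)
{
    for (int j = 0; j < n; ++j) {
        const int first = upper ? 0 : j;      // first row touched in column j
        const int last  = upper ? j : n - 1;  // last row touched in column j
        for (int i = first; i <= last; ++i)
            a[i + j * lda] += alpha * x[i] * x[j];
    }
    return a;
}
```

With x = (1, 2), alpha = 1, and a zero matrix, the upper variant leaves the element below the diagonal untouched, and the lower variant leaves the element above it untouched.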
-
hipblasStatus_t hipblasSsyrBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *const x[], int incx, float *const AP[], int lda, int batchCount)#
-
hipblasStatus_t hipblasDsyrBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *const x[], int incx, double *const AP[], int lda, int batchCount)#
-
hipblasStatus_t hipblasCsyrBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *const x[], int incx, hipComplex *const AP[], int lda, int batchCount)#
-
hipblasStatus_t hipblasZsyrBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *const x[], int incx, hipDoubleComplex *const AP[], int lda, int batchCount)#
BLAS Level 2 API
The syrBatched functions perform a batch of matrix-vector operations:
A[i] := A[i] + alpha*x[i]*x[i]**T
where alpha is a scalar, x is an array of vectors, and A is an array of n by n symmetric matrices, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
If HIPBLAS_FILL_MODE_UPPER, the lower part of A is not referenced.
If HIPBLAS_FILL_MODE_LOWER, the upper part of A is not referenced.
n – [in] [int] the number of rows and columns of matrix A.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
AP – [inout] device array of device pointers storing each matrix A_i.
lda – [in] [int] specifies the leading dimension of each A_i.
batchCount – [in] [int] number of instances in the batch.
The syrBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
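The batched variants take one pointer per instance. When all instances live in one contiguous allocation, the pointer table can be derived from a base pointer and a stride. The sketch below only shows the pointer arithmetic on the host: `make_batch_pointers` is a hypothetical helper, and in a real program both the data and the pointer table would have to reside in GPU memory, as the parameter descriptions above require.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of hipBLAS): build a table with one pointer
// per batch instance, where instance i starts at base + i * stride elements.
template <typename T>
std::vector<T*> make_batch_pointers(T* base, std::ptrdiff_t stride, int batchCount)
{
    std::vector<T*> ptrs;
    if (batchCount > 0)
        ptrs.reserve(static_cast<std::size_t>(batchCount));
    for (int i = 0; i < batchCount; ++i)
        ptrs.push_back(base + i * stride);
    return ptrs;
}
```

The same table layout applies to x and AP; only the element type and the stride differ.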
-
hipblasStatus_t hipblasSsyrStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *x, int incx, hipblasStride stridex, float *AP, int lda, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasDsyrStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *x, int incx, hipblasStride stridex, double *AP, int lda, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasCsyrStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *x, int incx, hipblasStride stridex, hipComplex *AP, int lda, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasZsyrStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *x, int incx, hipblasStride stridex, hipDoubleComplex *AP, int lda, hipblasStride strideA, int batchCount)#
BLAS Level 2 API
The syrStridedBatched functions perform the matrix-vector operations:
A[i] := A[i] + alpha*x[i]*x[i]**T
where alpha is a scalar, x is a pointer to an array of vectors, and A is an array of n by n symmetric matrices, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
If HIPBLAS_FILL_MODE_UPPER, the lower part of A is not referenced.
If HIPBLAS_FILL_MODE_LOWER, the upper part of A is not referenced.
n – [in] [int] the number of rows and columns of each matrix A.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer to the first vector x_1.
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] specifies the pointer increment between vectors (x_i) and (x_i+1).
AP – [inout] device pointer to the first matrix A_1.
lda – [in] [int] specifies the leading dimension of each A_i.
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
batchCount – [in] [int] number of instances in the batch.
The syrStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
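The way stridex and strideA select each instance can be sketched with a host-side loop. This is an illustrative reference only, not the hipBLAS implementation: `syr_strided_batched_ref` is a hypothetical name, data is assumed real-valued and column-major, and incx = 1.

```cpp
#include <vector>

// Host reference for A[b] := A[b] + alpha*x[b]*x[b]**T over a strided batch.
// Instance b of x starts at x + b*stridex; instance b of A at a + b*strideA.
std::vector<double> syr_strided_batched_ref(bool upper, int n, double alpha,
                                            const std::vector<double>& x,
                                            long long stridex,
                                            std::vector<double> a, int lda,
                                            long long strideA, int batchCount)
{
    for (int b = 0; b < batchCount; ++b) {
        const double* xb = x.data() + b * stridex;
        double* ab = a.data() + b * strideA;
        for (int j = 0; j < n; ++j) {
            const int first = upper ? 0 : j;
            const int last  = upper ? j : n - 1;
            for (int i = first; i <= last; ++i)
                ab[i + j * lda] += alpha * xb[i] * xb[j];
        }
    }
    return a;
}
```

For tightly packed data, stridex = n and strideA = lda * n are the natural choices; larger strides leave unused gaps between instances.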
hipblasXsyr2 + Batched, StridedBatched#
-
hipblasStatus_t hipblasSsyr2(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *x, int incx, const float *y, int incy, float *AP, int lda)#
-
hipblasStatus_t hipblasDsyr2(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *x, int incx, const double *y, int incy, double *AP, int lda)#
-
hipblasStatus_t hipblasCsyr2(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *x, int incx, const hipComplex *y, int incy, hipComplex *AP, int lda)#
-
hipblasStatus_t hipblasZsyr2(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *x, int incx, const hipDoubleComplex *y, int incy, hipDoubleComplex *AP, int lda)#
BLAS Level 2 API
The syr2 functions perform the matrix-vector operations:
A := A + alpha*x*y**T + alpha*y*x**T
where alpha is a scalar, x and y are vectors, and A is an n by n symmetric matrix.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
If HIPBLAS_FILL_MODE_UPPER, the lower part of A is not referenced.
If HIPBLAS_FILL_MODE_LOWER, the upper part of A is not referenced.
n – [in] [int] the number of rows and columns of matrix A.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
y – [in] device pointer storing vector y.
incy – [in] [int] specifies the increment for the elements of y.
AP – [inout] device pointer storing matrix A.
lda – [in] [int] specifies the leading dimension of A.
The syr2 functions support the 64-bit integer interface. See the ILP64 interfaces section.
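A host-side reference for the rank-2 update follows. As with the other sketches, this is an assumption-laden illustration rather than library code: `syr2_ref` is a hypothetical name, data is real-valued and column-major, and incx = incy = 1.

```cpp
#include <vector>

// Host reference for A := A + alpha*x*y**T + alpha*y*x**T (real case).
// Only the selected triangle of the column-major array a is updated.
std::vector<double> syr2_ref(bool upper, int n, double alpha,
                             const std::vector<double>& x,
                             const std::vector<double>& y,
                             std::vector<double> a, int lda)
{
    for (int j = 0; j < n; ++j) {
        const int first = upper ? 0 : j;
        const int last  = upper ? j : n - 1;
        for (int i = first; i <= last; ++i)
            a[i + j * lda] += alpha * (x[i] * y[j] + y[i] * x[j]);
    }
    return a;
}
```

Because the update term is symmetric in i and j, writing one triangle is enough to represent the full result.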
-
hipblasStatus_t hipblasSsyr2Batched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *const x[], int incx, const float *const y[], int incy, float *const AP[], int lda, int batchCount)#
-
hipblasStatus_t hipblasDsyr2Batched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *const x[], int incx, const double *const y[], int incy, double *const AP[], int lda, int batchCount)#
-
hipblasStatus_t hipblasCsyr2Batched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *const x[], int incx, const hipComplex *const y[], int incy, hipComplex *const AP[], int lda, int batchCount)#
-
hipblasStatus_t hipblasZsyr2Batched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *const x[], int incx, const hipDoubleComplex *const y[], int incy, hipDoubleComplex *const AP[], int lda, int batchCount)#
BLAS Level 2 API
The syr2Batched functions perform a batch of matrix-vector operations:
A[i] := A[i] + alpha*x[i]*y[i]**T + alpha*y[i]*x[i]**T
where alpha is a scalar, x[i] and y[i] are vectors, and A[i] is an n by n symmetric matrix, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
If HIPBLAS_FILL_MODE_UPPER, the lower part of A is not referenced.
If HIPBLAS_FILL_MODE_LOWER, the upper part of A is not referenced.
n – [in] [int] the number of rows and columns of matrix A.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
y – [in] device array of device pointers storing each vector y_i.
incy – [in] [int] specifies the increment for the elements of each y_i.
AP – [inout] device array of device pointers storing each matrix A_i.
lda – [in] [int] specifies the leading dimension of each A_i.
batchCount – [in] [int] number of instances in the batch.
The syr2Batched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSsyr2StridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const float *alpha, const float *x, int incx, hipblasStride stridex, const float *y, int incy, hipblasStride stridey, float *AP, int lda, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasDsyr2StridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const double *alpha, const double *x, int incx, hipblasStride stridex, const double *y, int incy, hipblasStride stridey, double *AP, int lda, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasCsyr2StridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipComplex *alpha, const hipComplex *x, int incx, hipblasStride stridex, const hipComplex *y, int incy, hipblasStride stridey, hipComplex *AP, int lda, hipblasStride strideA, int batchCount)#
-
hipblasStatus_t hipblasZsyr2StridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *x, int incx, hipblasStride stridex, const hipDoubleComplex *y, int incy, hipblasStride stridey, hipDoubleComplex *AP, int lda, hipblasStride strideA, int batchCount)#
BLAS Level 2 API
The syr2StridedBatched functions perform the matrix-vector operations:
A[i] := A[i] + alpha*x[i]*y[i]**T + alpha*y[i]*x[i]**T
where alpha is a scalar, x[i] and y[i] are vectors, and A[i] is an n by n symmetric matrix, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
If HIPBLAS_FILL_MODE_UPPER, the lower part of A is not referenced.
If HIPBLAS_FILL_MODE_LOWER, the upper part of A is not referenced.
n – [in] [int] the number of rows and columns of each matrix A.
alpha – [in] device pointer or host pointer to scalar alpha.
x – [in] device pointer to the first vector x_1.
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] specifies the pointer increment between vectors (x_i) and (x_i+1).
y – [in] device pointer to the first vector y_1.
incy – [in] [int] specifies the increment for the elements of each y_i.
stridey – [in] [hipblasStride] specifies the pointer increment between vectors (y_i) and (y_i+1).
AP – [inout] device pointer to the first matrix A_1.
lda – [in] [int] specifies the leading dimension of each A_i.
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
batchCount – [in] [int] number of instances in the batch.
The syr2StridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXtbmv + Batched, StridedBatched#
-
hipblasStatus_t hipblasStbmv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const float *AP, int lda, float *x, int incx)#
-
hipblasStatus_t hipblasDtbmv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const double *AP, int lda, double *x, int incx)#
-
hipblasStatus_t hipblasCtbmv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const hipComplex *AP, int lda, hipComplex *x, int incx)#
-
hipblasStatus_t hipblasZtbmv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const hipDoubleComplex *AP, int lda, hipDoubleComplex *x, int incx)#
BLAS Level 2 API.
The tbmv functions perform one of the matrix-vector operations:
x := A*x or x := A**T*x or x := A**H*x,
where x is a vector and A is a banded n by n matrix (see description below).
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A is an upper banded triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A is a lower banded triangular matrix.
transA – [in] [hipblasOperation_t] indicates whether matrix A is transposed (conjugated) or not.
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: The main diagonal of A is assumed to consist of only 1’s and is not referenced.
HIPBLAS_DIAG_NON_UNIT: No assumptions are made of A’s main diagonal.
n – [in] [int] the number of rows and columns of the matrix represented by A.
k – [in] [int]
if uplo == HIPBLAS_FILL_MODE_UPPER, k specifies the number of super-diagonals of the matrix A.
if uplo == HIPBLAS_FILL_MODE_LOWER, k specifies the number of sub-diagonals of the matrix A.
k must satisfy k > 0 && k < lda.
AP – [in] device pointer storing banded triangular matrix A.
if uplo == HIPBLAS_FILL_MODE_UPPER: The matrix represented is an upper banded triangular matrix with the main diagonal and k super-diagonals. Everything else can be assumed to be 0.
The matrix is compacted so that the main diagonal resides on the k’th row, the first super diagonal resides on the RHS of the k-1’th row, and so forth, with the k’th diagonal on the RHS of the 0’th row.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 5; k = 2)
1 6 9 0 0 -> 0 0 9 8 7
0 2 7 8 0 -> 0 6 7 8 9
0 0 3 8 7 -> 1 2 3 4 5
0 0 0 4 9 -> 0 0 0 0 0
0 0 0 0 5 -> 0 0 0 0 0
if uplo == HIPBLAS_FILL_MODE_LOWER: The matrix represented is a lower banded triangular matrix with the main diagonal and k sub-diagonals. Everything else can be assumed to be 0.
The matrix is compacted so that the main diagonal resides on the 0’th row, working up to the k’th diagonal residing on the LHS of the k’th row.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 5; k = 2)
1 0 0 0 0 -> 1 2 3 4 5
6 2 0 0 0 -> 6 7 8 9 0
9 7 3 0 0 -> 9 8 7 0 0
0 8 8 4 0 -> 0 0 0 0 0
0 0 7 9 5 -> 0 0 0 0 0
lda – [in] [int] specifies the leading dimension of A. lda must satisfy lda > k.
x – [inout] device pointer storing vector x.
incx – [in] [int] specifies the increment for the elements of x.
The tbmv functions support the 64-bit integer interface. See the ILP64 interfaces section.
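The banded layouts shown in the two examples above can be reproduced with a short host-side packing routine. This is a minimal sketch, not hipBLAS code: `band_pack_upper` and `band_pack_lower` are hypothetical names, the dense input is assumed to be n by n in column-major order, and the output has lda rows (lda > k) and n columns, also column-major.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helpers (not part of hipBLAS).
// Dense input a: n x n, column-major. Output: lda x n banded array.

// Upper band: A(i,j) goes to row k + i - j of column j, so the main
// diagonal lands on row k and the k-th super-diagonal on row 0.
std::vector<float> band_pack_upper(const std::vector<float>& a, int n, int k, int lda)
{
    std::vector<float> ab(static_cast<std::size_t>(lda) * n, 0.0f);
    for (int j = 0; j < n; ++j)
        for (int i = std::max(0, j - k); i <= j; ++i)
            ab[(k + i - j) + j * lda] = a[i + j * n];
    return ab;
}

// Lower band: A(i,j) goes to row i - j of column j, so the main
// diagonal lands on row 0 and the k-th sub-diagonal on row k.
std::vector<float> band_pack_lower(const std::vector<float>& a, int n, int k, int lda)
{
    std::vector<float> ab(static_cast<std::size_t>(lda) * n, 0.0f);
    for (int j = 0; j < n; ++j)
        for (int i = j; i <= std::min(n - 1, j + k); ++i)
            ab[(i - j) + j * lda] = a[i + j * n];
    return ab;
}
```

With n = 5, k = 2, and lda = 3, the rows of the returned arrays match the first three rows of the banded matrices in the examples; the examples themselves use lda = 5, which adds two zero rows.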
-
hipblasStatus_t hipblasStbmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const float *const AP[], int lda, float *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasDtbmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const double *const AP[], int lda, double *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasCtbmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const hipComplex *const AP[], int lda, hipComplex *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasZtbmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const hipDoubleComplex *const AP[], int lda, hipDoubleComplex *const x[], int incx, int batchCount)#
BLAS Level 2 API
The tbmvBatched functions perform one of the matrix-vector operations:
x_i := A_i*x_i or x_i := A_i**T*x_i or x_i := A_i**H*x_i,
where (A_i, x_i) is the i-th instance of the batch, x_i is a vector, and A_i is an n by n matrix, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: each A_i is an upper banded triangular matrix.
HIPBLAS_FILL_MODE_LOWER: each A_i is a lower banded triangular matrix.
transA – [in] [hipblasOperation_t] indicates whether each matrix A_i is transposed (conjugated) or not.
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: The main diagonal of each A_i is assumed to consist of only 1’s and is not referenced.
HIPBLAS_DIAG_NON_UNIT: No assumptions are made of each A_i’s main diagonal.
n – [in] [int] the number of rows and columns of the matrix represented by each A_i.
k – [in] [int]
if uplo == HIPBLAS_FILL_MODE_UPPER, k specifies the number of super-diagonals of each matrix A_i.
if uplo == HIPBLAS_FILL_MODE_LOWER, k specifies the number of sub-diagonals of each matrix A_i.
k must satisfy k > 0 && k < lda.
AP – [in] device array of device pointers storing each banded triangular matrix A_i.
if uplo == HIPBLAS_FILL_MODE_UPPER: The matrix represented is an upper banded triangular matrix with the main diagonal and k super-diagonals. Everything else can be assumed to be 0.
The matrix is compacted so that the main diagonal resides on the k’th row, the first super diagonal resides on the RHS of the k-1’th row, and so forth, with the k’th diagonal on the RHS of the 0’th row.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 5; k = 2)
1 6 9 0 0 -> 0 0 9 8 7
0 2 7 8 0 -> 0 6 7 8 9
0 0 3 8 7 -> 1 2 3 4 5
0 0 0 4 9 -> 0 0 0 0 0
0 0 0 0 5 -> 0 0 0 0 0
if uplo == HIPBLAS_FILL_MODE_LOWER: The matrix represented is a lower banded triangular matrix with the main diagonal and k sub-diagonals. Everything else can be assumed to be 0.
The matrix is compacted so that the main diagonal resides on the 0’th row, working up to the k’th diagonal residing on the LHS of the k’th row.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 5; k = 2)
1 0 0 0 0 -> 1 2 3 4 5
6 2 0 0 0 -> 6 7 8 9 0
9 7 3 0 0 -> 9 8 7 0 0
0 8 8 4 0 -> 0 0 0 0 0
0 0 7 9 5 -> 0 0 0 0 0
lda – [in] [int] specifies the leading dimension of each A_i. lda must satisfy lda > k.
x – [inout] device array of device pointers storing each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
batchCount – [in] [int] number of instances in the batch.
The tbmvBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasStbmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const float *AP, int lda, hipblasStride strideA, float *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasDtbmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const double *AP, int lda, hipblasStride strideA, double *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasCtbmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const hipComplex *AP, int lda, hipblasStride strideA, hipComplex *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasZtbmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const hipDoubleComplex *AP, int lda, hipblasStride strideA, hipDoubleComplex *x, int incx, hipblasStride stridex, int batchCount)#
BLAS Level 2 API
The tbmvStridedBatched functions perform one of the matrix-vector operations:
x_i := A_i*x_i or x_i := A_i**T*x_i or x_i := A_i**H*x_i,
where (A_i, x_i) is the i-th instance of the batch, x_i is a vector, and A_i is an n by n matrix, for i = 1, …, batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: each A_i is an upper banded triangular matrix.
HIPBLAS_FILL_MODE_LOWER: each A_i is a lower banded triangular matrix.
transA – [in] [hipblasOperation_t] indicates whether each matrix A_i is transposed (conjugated) or not.
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: The main diagonal of each A_i is assumed to consist of only 1’s and is not referenced.
HIPBLAS_DIAG_NON_UNIT: No assumptions are made of each A_i’s main diagonal.
n – [in] [int] the number of rows and columns of the matrix represented by each A_i.
k – [in] [int]
if uplo == HIPBLAS_FILL_MODE_UPPER, k specifies the number of super-diagonals of each matrix A_i.
if uplo == HIPBLAS_FILL_MODE_LOWER, k specifies the number of sub-diagonals of each matrix A_i.
k must satisfy k > 0 && k < lda.
AP – [in] device pointer to the first matrix A_1 of the batch. Stores each banded triangular matrix A_i.
if uplo == HIPBLAS_FILL_MODE_UPPER: The matrix represented is an upper banded triangular matrix with the main diagonal and k super-diagonals. Everything else can be assumed to be 0.
The matrix is compacted so that the main diagonal resides on the k’th row, the first super diagonal resides on the RHS of the k-1’th row, and so forth, with the k’th diagonal on the RHS of the 0’th row.
Ex: (HIPBLAS_FILL_MODE_UPPER; n = 5; k = 2)
1 6 9 0 0 -> 0 0 9 8 7
0 2 7 8 0 -> 0 6 7 8 9
0 0 3 8 7 -> 1 2 3 4 5
0 0 0 4 9 -> 0 0 0 0 0
0 0 0 0 5 -> 0 0 0 0 0
if uplo == HIPBLAS_FILL_MODE_LOWER: The matrix represented is a lower banded triangular matrix with the main diagonal and k sub-diagonals. Everything else can be assumed to be 0.
The matrix is compacted so that the main diagonal resides on the 0’th row, working up to the k’th diagonal residing on the LHS of the k’th row.
Ex: (HIPBLAS_FILL_MODE_LOWER; n = 5; k = 2)
1 0 0 0 0 -> 1 2 3 4 5
6 2 0 0 0 -> 6 7 8 9 0
9 7 3 0 0 -> 9 8 7 0 0
0 8 8 4 0 -> 0 0 0 0 0
0 0 7 9 5 -> 0 0 0 0 0
lda – [in] [int] specifies the leading dimension of each A_i. lda must satisfy lda > k.
strideA – [in] [hipblasStride] stride from the start of one A_i matrix to the next A_(i + 1).
x – [inout] device pointer to the first vector x_1 of the batch.
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] stride from the start of one x_i vector to the next x_(i + 1).
batchCount – [in] [int] number of instances in the batch.
The tbmvStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
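To show how the banded storage and the two strides combine, here is a hedged host-side reference for one case of the operation: upper fill mode, no transpose, real data. It is a sketch, not the hipBLAS implementation. `tbmv_upper_strided_batched_ref` is a hypothetical name, incx = 1 is assumed, and the banded arrays are column-major with lda rows.

```cpp
#include <algorithm>
#include <vector>

// Host reference for x_b := A_b * x_b, where each A_b is upper banded with
// k super-diagonals, stored in banded form starting at ab + b*strideA.
std::vector<float> tbmv_upper_strided_batched_ref(
    bool unit_diag, int n, int k,
    const std::vector<float>& ab, int lda, long long strideA,
    const std::vector<float>& x, long long stridex, int batchCount)
{
    std::vector<float> out = x;   // elements outside each vector are kept
    std::vector<float> y(n);
    for (int b = 0; b < batchCount; ++b) {
        const float* A = ab.data() + b * strideA;
        const float* xin = x.data() + b * stridex;
        std::fill(y.begin(), y.end(), 0.0f);
        for (int j = 0; j < n; ++j) {
            for (int i = std::max(0, j - k); i <= j; ++i) {
                // With a unit diagonal the stored diagonal is not referenced.
                const float aij = (i == j && unit_diag)
                                      ? 1.0f
                                      : A[(k + i - j) + j * lda];
                y[i] += aij * xin[j];
            }
        }
        std::copy(y.begin(), y.end(), out.begin() + b * stridex);
    }
    return out;
}
```

Using the upper example matrix above with lda = 3 and a vector of ones, each output element is the corresponding row sum of the dense matrix.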
hipblasXtbsv + Batched, StridedBatched#
-
hipblasStatus_t hipblasStbsv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const float *AP, int lda, float *x, int incx)#
-
hipblasStatus_t hipblasDtbsv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const double *AP, int lda, double *x, int incx)#
-
hipblasStatus_t hipblasCtbsv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const hipComplex *AP, int lda, hipComplex *x, int incx)#
-
hipblasStatus_t hipblasZtbsv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const hipDoubleComplex *AP, int lda, hipDoubleComplex *x, int incx)#
BLAS Level 2 API
The tbsv functions solve:
A*x = b or A**T*x = b or A**H*x = b,
where x and b are vectors and A is a banded triangular matrix.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_N: Solves A*x = b
HIPBLAS_OP_T: Solves A**T*x = b
HIPBLAS_OP_C: Solves A**H*x = b
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: A is assumed to be unit triangular (that is, the diagonal elements of A are not used in computations).
HIPBLAS_DIAG_NON_UNIT: A is not assumed to be unit triangular.
n – [in] [int] n specifies the number of rows of b. n >= 0.
k – [in] [int]
if(uplo == HIPBLAS_FILL_MODE_UPPER), k specifies the number of super-diagonals of A.
if(uplo == HIPBLAS_FILL_MODE_LOWER), k specifies the number of sub-diagonals of A.
k >= 0.
AP – [in] device pointer storing the matrix A in banded format.
lda – [in] [int] specifies the leading dimension of A. lda >= (k + 1).
x – [inout] device pointer storing input vector b. Overwritten by the output vector x.
incx – [in] [int] specifies the increment for the elements of x.
The tbsv functions support the 64-bit integer interface. See the ILP64 interfaces section.
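To make the semantics of the banded solve concrete, here is a host-side reference sketch for one case only: uplo = HIPBLAS_FILL_MODE_LOWER, transA = HIPBLAS_OP_N, and incx = 1. The function name `referenceTbsvLower` is hypothetical and the code does not call hipBLAS. It is meant for checking small results, under the assumption that element (i, j) of A is stored at `band[(i - j) + j * lda]`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference (not a hipBLAS function).
// Solves A*x = b in place for a lower banded triangular A with k sub-diagonals.
// On entry x holds b. On exit x holds the solution.
void referenceTbsvLower(int n, int k, bool unitDiag,
                        const std::vector<double>& band, int lda,
                        std::vector<double>& x)
{
    assert(n >= 0 && k >= 0 && lda >= k + 1);
    assert(band.size() >= static_cast<std::size_t>(lda) * n);
    assert(x.size() >= static_cast<std::size_t>(n));

    // Column-oriented forward substitution.
    for (int j = 0; j < n; ++j) {
        const std::size_t col = static_cast<std::size_t>(j) * lda;
        if (!unitDiag) {
            x[j] /= band[col]; // the main diagonal sits on row 0 of the band
        }
        const int iEnd = std::min(n - 1, j + k);
        for (int i = j + 1; i <= iEnd; ++i) {
            x[i] -= x[j] * band[col + (i - j)];
        }
    }
}
```

When unitDiag is true the diagonal entries of the band are never read, which matches the HIPBLAS_DIAG_UNIT description above.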
-
hipblasStatus_t hipblasStbsvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const float *const AP[], int lda, float *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasDtbsvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const double *const AP[], int lda, double *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasCtbsvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const hipComplex *const AP[], int lda, hipComplex *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasZtbsvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const hipDoubleComplex *const AP[], int lda, hipDoubleComplex *const x[], int incx, int batchCount)#
BLAS Level 2 API
The tbsvBatched functions solve:
A_i*x_i = b_i or A_i**T*x_i = b_i or A_i**H*x_i = b_i,
where x_i and b_i are vectors and A_i is a banded triangular matrix, for i = [1, batchCount].
The input vectors b_i are overwritten by the output vectors x_i.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_N: Solves A_i*x_i = b_i
HIPBLAS_OP_T: Solves A_i**T*x_i = b_i
HIPBLAS_OP_C: Solves A_i**H*x_i = b_i
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: each A_i is assumed to be unit triangular (that is, the diagonal elements of each A_i are not used in computations).
HIPBLAS_DIAG_NON_UNIT: each A_i is not assumed to be unit triangular.
n – [in] [int] n specifies the number of rows of each b_i. n >= 0.
k – [in] [int]
if(uplo == HIPBLAS_FILL_MODE_UPPER), k specifies the number of super-diagonals of each A_i.
if(uplo == HIPBLAS_FILL_MODE_LOWER), k specifies the number of sub-diagonals of each A_i.
k >= 0.
AP – [in] device vector of device pointers storing each matrix A_i in banded format.
lda – [in] [int] specifies the leading dimension of each A_i. lda >= (k + 1).
x – [inout] device vector of device pointers storing each input vector b_i. Overwritten by each output vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
batchCount – [in] [int] number of instances in the batch.
The tbsvBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasStbsvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const float *AP, int lda, hipblasStride strideA, float *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasDtbsvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const double *AP, int lda, hipblasStride strideA, double *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasCtbsvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const hipComplex *AP, int lda, hipblasStride strideA, hipComplex *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasZtbsvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, int k, const hipDoubleComplex *AP, int lda, hipblasStride strideA, hipDoubleComplex *x, int incx, hipblasStride stridex, int batchCount)#
BLAS Level 2 API
The tbsvStridedBatched functions solve:
A_i*x_i = b_i or A_i**T*x_i = b_i or A_i**H*x_i = b_i,
where x_i and b_i are vectors and A_i is a banded triangular matrix, for i = [1, batchCount].
The input vectors b_i are overwritten by the output vectors x_i.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_N: Solves A_i*x_i = b_i
HIPBLAS_OP_T: Solves A_i**T*x_i = b_i
HIPBLAS_OP_C: Solves A_i**H*x_i = b_i
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: each A_i is assumed to be unit triangular (that is, the diagonal elements of each A_i are not used in computations).
HIPBLAS_DIAG_NON_UNIT: each A_i is not assumed to be unit triangular.
n – [in] [int] n specifies the number of rows of each b_i. n >= 0.
k – [in] [int]
if(uplo == HIPBLAS_FILL_MODE_UPPER), k specifies the number of super-diagonals of each A_i.
if(uplo == HIPBLAS_FILL_MODE_LOWER), k specifies the number of sub-diagonals of each A_i.
k >= 0.
AP – [in] device pointer pointing to the first banded matrix A_1.
lda – [in] [int] specifies the leading dimension of each A_i. lda >= (k + 1).
strideA – [in] [hipblasStride] specifies the distance between the start of one matrix (A_i) and the next (A_i+1).
x – [inout] device pointer pointing to the first input vector b_1. Overwritten by output vectors x.
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] specifies the distance between the start of one vector (x_i) and the next (x_i+1).
batchCount – [in] [int] number of instances in the batch.
The tbsvStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
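The strideA and stridex parameters are element counts between consecutive instances. The sketch below shows one way to pick contiguous, non-overlapping strides for banded matrices and to compute the offset of instance i. The names `StridedBatchLayout`, `contiguousBandedLayout`, and `instanceOffset` are hypothetical, and `std::int64_t` stands in for hipblasStride so that the example needs no hipBLAS headers. Larger strides are equally valid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical host-side helpers (not hipBLAS functions).
struct StridedBatchLayout {
    std::int64_t strideA;   // elements between A_i and A_(i+1)
    std::int64_t stridex;   // elements between x_i and x_(i+1)
    std::size_t  elementsA; // total elements to allocate for all A_i
    std::size_t  elementsX; // total elements to allocate for all x_i
};

// Smallest strides that keep the instances from overlapping, assuming each
// banded matrix occupies lda * n elements and each vector spans
// 1 + (n - 1) * |incx| elements.
StridedBatchLayout contiguousBandedLayout(int n, int lda, int incx, int batchCount)
{
    assert(n > 0 && lda > 0 && incx != 0 && batchCount > 0);
    StridedBatchLayout layout;
    layout.strideA   = static_cast<std::int64_t>(lda) * n;
    layout.stridex   = 1 + static_cast<std::int64_t>(n - 1) * std::abs(incx);
    layout.elementsA = static_cast<std::size_t>(layout.strideA) * batchCount;
    layout.elementsX = static_cast<std::size_t>(layout.stridex) * batchCount;
    return layout;
}

// Offset, in elements, of instance i (0-based) from the base pointer.
std::int64_t instanceOffset(int i, std::int64_t stride)
{
    return static_cast<std::int64_t>(i) * stride;
}
```

For example, with n = 5, lda = 3, and incx = 2, the fourth matrix (i = 3) starts 45 elements after AP.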
hipblasXtpmv + Batched, StridedBatched#
-
hipblasStatus_t hipblasStpmv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const float *AP, float *x, int incx)#
-
hipblasStatus_t hipblasDtpmv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const double *AP, double *x, int incx)#
-
hipblasStatus_t hipblasCtpmv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipComplex *AP, hipComplex *x, int incx)#
-
hipblasStatus_t hipblasZtpmv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipDoubleComplex *AP, hipDoubleComplex *x, int incx)#
BLAS Level 2 API
The tpmv functions perform one of the matrix-vector operations:
x = A*x or x = A**T*x,
where x is an n-element vector and A is an n by n unit, or non-unit, upper or lower triangular matrix, supplied in packed form.
The vector x is overwritten.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: A is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: A is not assumed to be unit triangular.
n – [in] [int] n specifies the number of rows of A. n >= 0.
AP – [in] device pointer storing matrix A, of dimension at least ( n * ( n + 1 ) / 2 ).
Before entry with uplo = HIPBLAS_FILL_MODE_UPPER, the array AP must contain the upper triangular matrix packed sequentially, column by column, so that AP[0] contains a_{0,0}, AP[1] and AP[2] contain a_{0,1} and a_{1,1} respectively, and so on.
Before entry with uplo = HIPBLAS_FILL_MODE_LOWER, the array AP must contain the lower triangular matrix packed sequentially, column by column, so that AP[0] contains a_{0,0}, AP[1] and AP[2] contain a_{1,0} and a_{2,0} respectively, and so on.
Note that when diag = HIPBLAS_DIAG_UNIT, the diagonal elements of A are not referenced, but are assumed to be unity.
x – [inout] device pointer storing vector x. Overwritten by the result.
incx – [in] [int] specifies the increment for the elements of x. incx must not be zero.
The tpmv functions support the 64-bit integer interface. See the ILP64 interfaces section.
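The packed, column-by-column layout described for AP maps a matrix position (i, j) to a single array index. The following host-side sketch spells out that mapping with 0-based indices. The helper names `packedIndexUpper`, `packedIndexLower`, and `packTriangle` are hypothetical and are not part of hipBLAS.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side helpers (not hipBLAS functions).

// Index of a_{i,j} (i <= j) in an upper triangle packed column by column.
std::size_t packedIndexUpper(int i, int j)
{
    assert(0 <= i && i <= j);
    return static_cast<std::size_t>(i) + static_cast<std::size_t>(j) * (j + 1) / 2;
}

// Index of a_{i,j} (i >= j) in a lower triangle packed column by column.
std::size_t packedIndexLower(int i, int j, int n)
{
    assert(0 <= j && j <= i && i < n);
    return static_cast<std::size_t>(i) + static_cast<std::size_t>(j) * (2 * n - j - 1) / 2;
}

// Packs one triangle of a dense, column-major n x n matrix into an array of
// n * (n + 1) / 2 elements.
std::vector<float> packTriangle(const std::vector<float>& dense, int n, bool upper)
{
    assert(n >= 0 && dense.size() == static_cast<std::size_t>(n) * n);
    std::vector<float> ap(static_cast<std::size_t>(n) * (n + 1) / 2);
    for (int j = 0; j < n; ++j) {
        const int iBegin = upper ? 0 : j;
        const int iEnd   = upper ? j : n - 1;
        for (int i = iBegin; i <= iEnd; ++i) {
            const std::size_t idx = upper ? packedIndexUpper(i, j)
                                          : packedIndexLower(i, j, n);
            ap[idx] = dense[i + static_cast<std::size_t>(j) * n];
        }
    }
    return ap;
}
```

The resulting array is what would be copied to the device and passed as AP.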
-
hipblasStatus_t hipblasStpmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const float *const AP[], float *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasDtpmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const double *const AP[], double *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasCtpmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipComplex *const AP[], hipComplex *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasZtpmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipDoubleComplex *const AP[], hipDoubleComplex *const x[], int incx, int batchCount)#
BLAS Level 2 API
The tpmvBatched functions perform one of the matrix-vector operations:
x_i = A_i*x_i or x_i = A_i**T*x_i, 0 <= i < batchCount,
where x_i is an n-element vector and A_i is an n by n (unit, or non-unit, upper or lower triangular matrix).
The vectors x_i are overwritten.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: A_i is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: A_i is not assumed to be unit triangular.
n – [in] [int] n specifies the number of rows of matrices A_i. n >= 0.
AP – [in] device array of device pointers storing each packed matrix A_i, of dimension at least ( n * ( n + 1 ) / 2 ).
x – [inout] device array of device pointers storing each vector x_i. Overwritten by the results.
incx – [in] [int] specifies the increment for the elements of vectors x_i.
batchCount – [in] [int] The number of batched matrices/vectors.
The tpmvBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasStpmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const float *AP, hipblasStride strideA, float *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasDtpmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const double *AP, hipblasStride strideA, double *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasCtpmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipComplex *AP, hipblasStride strideA, hipComplex *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasZtpmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipDoubleComplex *AP, hipblasStride strideA, hipDoubleComplex *x, int incx, hipblasStride stridex, int batchCount)#
BLAS Level 2 API
The tpmvStridedBatched functions perform one of the matrix-vector operations:
x_i = A_i*x_i or x_i = A_i**T*x_i, 0 <= i < batchCount,
where x_i is an n-element vector and A_i is an n by n (unit, or non-unit, upper or lower triangular matrix), with strides specifying how to retrieve x_i (resp. A_i) from x_{i-1} (resp. A_{i-1}).
The vectors x_i are overwritten.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: A_i is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: A_i is not assumed to be unit triangular.
n – [in] [int] n specifies the number of rows of matrices A_i. n >= 0.
AP – [in] device pointer to the first packed matrix A_0, of dimension at least ( n * ( n + 1 ) / 2 ).
strideA – [in] [hipblasStride] stride from the start of one A_i matrix to the next A_{i + 1}.
x – [inout] device pointer storing the first vector x_0. Overwritten by the results.
incx – [in] [int] specifies the increment for the elements of one vector x.
stridex – [in] [hipblasStride] stride from the start of one x_i vector to the next x_{i + 1}.
batchCount – [in] [int] The number of batched matrices/vectors.
The tpmvStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXtpsv + Batched, StridedBatched#
-
hipblasStatus_t hipblasStpsv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const float *AP, float *x, int incx)#
-
hipblasStatus_t hipblasDtpsv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const double *AP, double *x, int incx)#
-
hipblasStatus_t hipblasCtpsv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipComplex *AP, hipComplex *x, int incx)#
-
hipblasStatus_t hipblasZtpsv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipDoubleComplex *AP, hipDoubleComplex *x, int incx)#
BLAS Level 2 API
The tpsv functions solve:
A*x = b or A**T*x = b or A**H*x = b,
where x and b are vectors and A is a triangular matrix stored in the packed format.
The input vector b is overwritten by the output vector x.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_N: Solves A*x = b
HIPBLAS_OP_T: Solves A**T*x = b
HIPBLAS_OP_C: Solves A**H*x = b
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: A is assumed to be unit triangular (that is, the diagonal elements of A are not used in computations).
HIPBLAS_DIAG_NON_UNIT: A is not assumed to be unit triangular.
n – [in] [int] n specifies the number of rows of b. n >= 0.
AP – [in] device pointer storing the packed version of matrix A of dimension >= (n * (n + 1) / 2).
x – [inout] device pointer storing vector b on input, overwritten by x on output.
incx – [in] [int] specifies the increment for the elements of x.
The tpsv functions support the 64-bit integer interface. See the ILP64 interfaces section.
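As a cross-check for small inputs, the packed solve can be reproduced on the host. The sketch below covers one case only: uplo = HIPBLAS_FILL_MODE_UPPER, transA = HIPBLAS_OP_N, and incx = 1. The name `referenceTpsvUpper` is hypothetical and the code does not call hipBLAS. It assumes the upper triangle is packed column by column, so column j starts at index j * (j + 1) / 2.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference (not a hipBLAS function).
// Solves A*x = b in place for an upper triangular A in packed format.
// On entry x holds b. On exit x holds the solution.
void referenceTpsvUpper(int n, bool unitDiag,
                        const std::vector<double>& ap,
                        std::vector<double>& x)
{
    assert(n >= 0);
    assert(ap.size() >= static_cast<std::size_t>(n) * (n + 1) / 2);
    assert(x.size() >= static_cast<std::size_t>(n));

    // Column-oriented back substitution, last column first.
    for (int j = n - 1; j >= 0; --j) {
        const std::size_t col = static_cast<std::size_t>(j) * (j + 1) / 2;
        if (!unitDiag) {
            x[j] /= ap[col + j]; // diagonal element a_{j,j}
        }
        for (int i = 0; i < j; ++i) {
            x[i] -= x[j] * ap[col + i]; // eliminate x_j from row i
        }
    }
}
```

Comparing this result against the device output, within a tolerance suited to the precision, is a simple way to validate a packed upload.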
-
hipblasStatus_t hipblasStpsvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const float *const AP[], float *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasDtpsvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const double *const AP[], double *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasCtpsvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipComplex *const AP[], hipComplex *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasZtpsvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipDoubleComplex *const AP[], hipDoubleComplex *const x[], int incx, int batchCount)#
BLAS Level 2 API
The tpsvBatched functions solve:
A_i*x_i = b_i or A_i**T*x_i = b_i or A_i**H*x_i = b_i,
where x_i and b_i are vectors and A_i is a triangular matrix stored in the packed format, for i in [1, batchCount].
The input vectors b_i are overwritten by the output vectors x_i.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: each A_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: each A_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_N: Solves A_i*x_i = b_i
HIPBLAS_OP_T: Solves A_i**T*x_i = b_i
HIPBLAS_OP_C: Solves A_i**H*x_i = b_i
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: each A_i is assumed to be unit triangular (that is, the diagonal elements of each A_i are not used in computations).
HIPBLAS_DIAG_NON_UNIT: each A_i is not assumed to be unit triangular.
n – [in] [int] n specifies the number of rows of each b_i. n >= 0.
AP – [in] device array of device pointers storing the packed versions of each matrix A_i of dimension >= (n * (n + 1) / 2).
x – [inout] device array of device pointers storing each input vector b_i, overwritten by x_i on output.
incx – [in] [int] specifies the increment for the elements of each x_i.
batchCount – [in] [int] specifies the number of instances in the batch.
The tpsvBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasStpsvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const float *AP, hipblasStride strideA, float *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasDtpsvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const double *AP, hipblasStride strideA, double *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasCtpsvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipComplex *AP, hipblasStride strideA, hipComplex *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasZtpsvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipDoubleComplex *AP, hipblasStride strideA, hipDoubleComplex *x, int incx, hipblasStride stridex, int batchCount)#
BLAS Level 2 API
The tpsvStridedBatched functions solve:
A_i*x_i = b_i or A_i**T*x_i = b_i or A_i**H*x_i = b_i,
where x_i and b_i are vectors and A_i is a triangular matrix stored in the packed format, for i in [1, batchCount].
The input vectors b_i are overwritten by the output vectors x_i.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: each A_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: each A_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_N: Solves A_i*x_i = b_i
HIPBLAS_OP_T: Solves A_i**T*x_i = b_i
HIPBLAS_OP_C: Solves A_i**H*x_i = b_i
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: each A_i is assumed to be unit triangular (that is, the diagonal elements of each A_i are not used in computations).
HIPBLAS_DIAG_NON_UNIT: each A_i is not assumed to be unit triangular.
n – [in] [int] n specifies the number of rows of each b_i. n >= 0.
AP – [in] device pointer pointing to the first packed matrix A_1 of dimension >= (n * (n + 1) / 2).
strideA – [in] [hipblasStride] stride from the beginning of one packed matrix (AP_i) to the next (AP_i+1).
x – [inout] device pointer pointing to the first input vector b_1. Overwritten by each x_i on output.
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] stride from the beginning of one vector (x_i) to the next (x_i+1).
batchCount – [in] [int] specifies the number of instances in the batch.
The tpsvStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXtrmv + Batched, StridedBatched#
-
hipblasStatus_t hipblasStrmv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const float *AP, int lda, float *x, int incx)#
-
hipblasStatus_t hipblasDtrmv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const double *AP, int lda, double *x, int incx)#
-
hipblasStatus_t hipblasCtrmv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipComplex *AP, int lda, hipComplex *x, int incx)#
-
hipblasStatus_t hipblasZtrmv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipDoubleComplex *AP, int lda, hipDoubleComplex *x, int incx)#
BLAS Level 2 API
The trmv functions perform one of the matrix-vector operations:
x = A*x or x = A**T*x,
where x is an n-element vector and A is an n by n unit, or non-unit, upper or lower triangular matrix.
The vector x is overwritten.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: A is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: A is not assumed to be unit triangular.
n – [in] [int] n specifies the number of rows of A. n >= 0.
AP – [in] device pointer storing matrix A, of dimension ( lda, n ).
lda – [in] [int] specifies the leading dimension of A. lda >= max( 1, n ).
x – [inout] device pointer storing vector x. Overwritten by the result.
incx – [in] [int] specifies the increment for the elements of x.
The trmv functions support the 64-bit integer interface. See the ILP64 interfaces section.
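The in-place product x = A*x can be illustrated on the host for one case: uplo = HIPBLAS_FILL_MODE_UPPER, no transpose, and incx = 1. The name `referenceTrmvUpper` is hypothetical and nothing here calls hipBLAS. The sketch assumes column-major storage, so element (i, j) is read from `a[i + j * lda]`, which also shows why lda may exceed n.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference (not a hipBLAS function).
// Computes x = A*x in place for an upper triangular, column-major A.
void referenceTrmvUpper(int n, bool unitDiag,
                        const std::vector<double>& a, int lda,
                        std::vector<double>& x)
{
    assert(n >= 0 && lda >= std::max(1, n));
    assert(a.size() >= static_cast<std::size_t>(lda) * n);
    assert(x.size() >= static_cast<std::size_t>(n));

    for (int j = 0; j < n; ++j) {
        const double xj = x[j]; // still the original x_j at this point
        const std::size_t col = static_cast<std::size_t>(j) * lda;
        for (int i = 0; i < j; ++i) {
            x[i] += xj * a[col + i]; // contribution of column j to row i
        }
        if (!unitDiag) {
            x[j] = xj * a[col + j];  // diagonal term; skipped for unit diagonal
        }
    }
}
```

Processing columns in increasing order is what makes the update safe in place: x[j] is not modified before column j is reached.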
-
hipblasStatus_t hipblasStrmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const float *const AP[], int lda, float *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasDtrmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const double *const AP[], int lda, double *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasCtrmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipComplex *const AP[], int lda, hipComplex *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasZtrmvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipDoubleComplex *const AP[], int lda, hipDoubleComplex *const x[], int incx, int batchCount)#
BLAS Level 2 API
The trmvBatched functions perform one of the matrix-vector operations:
x_i = A_i*x_i or x_i = A_i**T*x_i, 0 <= i < batchCount,
where x_i is an n-element vector and A_i is an n by n (unit, or non-unit, upper or lower triangular) matrix.
The vectors x_i are overwritten.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: A_i is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: A_i is not assumed to be unit triangular.
n – [in] [int] n specifies the number of rows of matrices A_i. n >= 0.
AP – [in] device array of device pointers storing each matrix A_i, of dimension ( lda, n ).
lda – [in] [int] specifies the leading dimension of A_i. lda >= max( 1, n ).
x – [inout] device array of device pointers storing each vector x_i. Overwritten by the results.
incx – [in] [int] specifies the increment for the elements of vectors x_i.
batchCount – [in] [int] The number of batched matrices/vectors.
The trmvBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasStrmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const float *AP, int lda, hipblasStride strideA, float *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasDtrmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const double *AP, int lda, hipblasStride strideA, double *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasCtrmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipComplex *AP, int lda, hipblasStride strideA, hipComplex *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasZtrmvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipDoubleComplex *AP, int lda, hipblasStride strideA, hipDoubleComplex *x, int incx, hipblasStride stridex, int batchCount)#
BLAS Level 2 API
The trmvStridedBatched functions perform one of the matrix-vector operations:
x_i = A_i*x_i or x_i = A_i**T*x_i, 0 <= i < batchCount,
where x_i is an n-element vector and A_i is an n by n (unit, or non-unit, upper or lower triangular) matrix, with strides specifying how to retrieve x_i (resp. A_i) from x_{i-1} (resp. A_{i-1}).
The vectors x_i are overwritten.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: A_i is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: A_i is not assumed to be unit triangular.
n – [in] [int] n specifies the number of rows of matrices A_i. n >= 0.
AP – [in] device pointer of the matrix A_0, of dimension ( lda, n ).
lda – [in] [int] specifies the leading dimension of A_i. lda >= max( 1, n ).
strideA – [in] [hipblasStride] stride from the start of one A_i matrix to the next A_{i + 1}.
x – [inout] device pointer storing the first vector x_0. Overwritten by the results.
incx – [in] [int] specifies the increment for the elements of one vector x.
stridex – [in] [hipblasStride] stride from the start of one x_i vector to the next x_{i + 1}.
batchCount – [in] [int] The number of batched matrices/vectors.
The trmvStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXtrsv + Batched, StridedBatched#
-
hipblasStatus_t hipblasStrsv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const float *AP, int lda, float *x, int incx)#
-
hipblasStatus_t hipblasDtrsv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const double *AP, int lda, double *x, int incx)#
-
hipblasStatus_t hipblasCtrsv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipComplex *AP, int lda, hipComplex *x, int incx)#
-
hipblasStatus_t hipblasZtrsv(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipDoubleComplex *AP, int lda, hipDoubleComplex *x, int incx)#
BLAS Level 2 API
The trsv functions solve:
A*x = b or A**T*x = b,
where x and b are vectors and A is a triangular matrix.
The input vector b is overwritten by the output vector x.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: A is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: A is not assumed to be unit triangular.
n – [in] [int] n specifies the number of rows of b. n >= 0.
AP – [in] device pointer storing matrix A, of dimension ( lda, n ).
lda – [in] [int] specifies the leading dimension of A. lda >= max( 1, n ).
x – [inout] device pointer storing vector b on input, overwritten by x on output.
incx – [in] [int] specifies the increment for the elements of x.
The trsv functions support the 64-bit integer interface. See the ILP64 interfaces section.
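The role of incx in the triangular solve can be seen in a small host-side reference. The sketch below handles uplo = HIPBLAS_FILL_MODE_LOWER with no transpose and assumes incx > 0, so that element i of the vector is stored at `x[i * incx]`. Negative increments are not covered. The name `referenceTrsvLower` is hypothetical and the code does not call hipBLAS.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference (not a hipBLAS function).
// Solves A*x = b in place for a lower triangular, column-major A.
// On entry x holds b with stride incx. On exit it holds the solution.
void referenceTrsvLower(int n, bool unitDiag,
                        const std::vector<double>& a, int lda,
                        std::vector<double>& x, int incx)
{
    assert(n >= 0 && lda >= std::max(1, n) && incx > 0);
    assert(a.size() >= static_cast<std::size_t>(lda) * n);
    assert(n == 0 || x.size() >= 1 + static_cast<std::size_t>(n - 1) * incx);

    // Row-oriented forward substitution.
    for (int i = 0; i < n; ++i) {
        const std::size_t xi = static_cast<std::size_t>(i) * incx;
        double sum = x[xi];
        for (int j = 0; j < i; ++j) {
            sum -= a[i + static_cast<std::size_t>(j) * lda]
                 * x[static_cast<std::size_t>(j) * incx];
        }
        x[xi] = unitDiag ? sum : sum / a[i + static_cast<std::size_t>(i) * lda];
    }
}
```

Elements of the array that fall between the strided positions are never read or written.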
-
hipblasStatus_t hipblasStrsvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const float *const AP[], int lda, float *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasDtrsvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const double *const AP[], int lda, double *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasCtrsvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipComplex *const AP[], int lda, hipComplex *const x[], int incx, int batchCount)#
-
hipblasStatus_t hipblasZtrsvBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipDoubleComplex *const AP[], int lda, hipDoubleComplex *const x[], int incx, int batchCount)#
BLAS Level 2 API
The trsvBatched functions solve:
A_i*x_i = b_i or A_i**T*x_i = b_i,
where (A_i, x_i, b_i) is the i-th instance of the batch, x_i and b_i are vectors, and A_i is an n by n triangular matrix.
The input vectors b_i are overwritten by the output vectors x_i.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: A is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: A is not assumed to be unit triangular.
n – [in] [int] n specifies the number of rows of each b_i. n >= 0.
AP – [in] device array of device pointers storing each matrix A_i.
lda – [in] [int] specifies the leading dimension of each A_i. lda >= max( 1, n ).
x – [inout] device array of device pointers storing each vector x_i. On entry, x_i holds b_i; on exit, it holds the solution.
incx – [in] [int] specifies the increment for the elements of each x_i.
batchCount – [in] [int] number of instances in the batch.
The trsvBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
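As a host-side illustration of what a batched triangular solve computes, the following C sketch performs forward or back substitution on column-major data and loops over an array of pointers. It is a CPU reference under stated assumptions, not the hipBLAS or rocBLAS implementation: the names `ref_trsv` and `ref_trsv_batched` are hypothetical, only the non-transposed case with `incx > 0` is handled, and every pointer is a host pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference: solve A*x = b in place for one column-major
   triangular matrix. upper != 0 selects the upper triangle; unit_diag != 0
   treats the diagonal as ones without reading it. On entry x holds b. */
static void ref_trsv(int upper, int unit_diag, int n,
                     const float *A, int lda, float *x, int incx)
{
    assert(n >= 0 && lda >= (n > 1 ? n : 1) && incx > 0);
    if (upper) {
        /* Back substitution: last row first. */
        for (int i = n - 1; i >= 0; --i) {
            float s = x[(size_t)i * incx];
            for (int j = i + 1; j < n; ++j)
                s -= A[i + (size_t)j * lda] * x[(size_t)j * incx];
            x[(size_t)i * incx] = unit_diag ? s : s / A[i + (size_t)i * lda];
        }
    } else {
        /* Forward substitution: first row first. */
        for (int i = 0; i < n; ++i) {
            float s = x[(size_t)i * incx];
            for (int j = 0; j < i; ++j)
                s -= A[i + (size_t)j * lda] * x[(size_t)j * incx];
            x[(size_t)i * incx] = unit_diag ? s : s / A[i + (size_t)i * lda];
        }
    }
}

/* Batched form: AP[i] and x[i] point to the i-th matrix and vector. */
static void ref_trsv_batched(int upper, int unit_diag, int n,
                             const float *const AP[], int lda,
                             float *const x[], int incx, int batchCount)
{
    for (int i = 0; i < batchCount; ++i)
        ref_trsv(upper, unit_diag, n, AP[i], lda, x[i], incx);
}
```

For example, with the lower triangular matrix A = [2 0; 1 4] and b = (2, 9), the sketch leaves x = (1, 2) in the storage that held b.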
-
hipblasStatus_t hipblasStrsvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const float *AP, int lda, hipblasStride strideA, float *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasDtrsvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const double *AP, int lda, hipblasStride strideA, double *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasCtrsvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipComplex *AP, int lda, hipblasStride strideA, hipComplex *x, int incx, hipblasStride stridex, int batchCount)#
-
hipblasStatus_t hipblasZtrsvStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int n, const hipDoubleComplex *AP, int lda, hipblasStride strideA, hipDoubleComplex *x, int incx, hipblasStride stridex, int batchCount)#
BLAS Level 2 API
The trsvStridedBatched functions solve:
A_i*x_i = b_i or A_i**T*x_i = b_i,
where (A_i, x_i, b_i) is the i-th instance of the batch, x_i and b_i are vectors, and A_i is an n by n triangular matrix, for i = 1, ..., batchCount.
Each solution vector x_i overwrites the corresponding b_i.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: A is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: A is not assumed to be unit triangular.
n – [in] [int] n specifies the number of rows of each b_i. n >= 0.
AP – [in] device pointer to the first matrix (A_1) in the batch, of dimension ( lda, n ).
strideA – [in] [hipblasStride] stride from the start of one A_i matrix to the next A_(i + 1).
lda – [in] [int] specifies the leading dimension of each A_i. lda >= max( 1, n ).
x – [inout] device pointer to the first vector (x_1) in the batch.
stridex – [in] [hipblasStride] stride from the start of one x_i vector to the next x_(i + 1).
incx – [in] [int] specifies the increment for the elements of each x_i.
batchCount – [in] [int] number of instances in the batch.
The trsvStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
Level 3 BLAS#
hipblasXgemm + Batched, StridedBatched#
-
hipblasStatus_t hipblasHgemm(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const hipblasHalf *alpha, const hipblasHalf *AP, int lda, const hipblasHalf *BP, int ldb, const hipblasHalf *beta, hipblasHalf *CP, int ldc)#
-
hipblasStatus_t hipblasSgemm(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const float *alpha, const float *AP, int lda, const float *BP, int ldb, const float *beta, float *CP, int ldc)#
-
hipblasStatus_t hipblasDgemm(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const double *alpha, const double *AP, int lda, const double *BP, int ldb, const double *beta, double *CP, int ldc)#
-
hipblasStatus_t hipblasCgemm(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const hipComplex *alpha, const hipComplex *AP, int lda, const hipComplex *BP, int ldb, const hipComplex *beta, hipComplex *CP, int ldc)#
-
hipblasStatus_t hipblasZgemm(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, const hipDoubleComplex *BP, int ldb, const hipDoubleComplex *beta, hipDoubleComplex *CP, int ldc)#
BLAS Level 3 API
The gemm functions perform one of the matrix-matrix operations:
C = alpha*op( A )*op( B ) + beta*C,
where op( X ) is one of:
op( X ) = X or op( X ) = X**T or op( X ) = X**H,
alpha and beta are scalars, and A, B, and C are matrices, with op( A ) an m by k matrix, op( B ) a k by n matrix, and C an m by n matrix.
Supported precisions in rocBLAS: h, s, d, c, and z.
Supported precisions in cuBLAS: h, s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
transA – [in] [hipblasOperation_t] specifies the form of op( A ).
transB – [in] [hipblasOperation_t] specifies the form of op( B ).
m – [in] [int] number of rows of matrices op( A ) and C.
n – [in] [int] number of columns of matrices op( B ) and C.
k – [in] [int] number of columns of matrix op( A ) and number of rows of matrix op( B ).
alpha – [in] device pointer or host pointer specifying the scalar alpha.
AP – [in] device pointer storing matrix A.
lda – [in] [int] specifies the leading dimension of A.
BP – [in] device pointer storing matrix B.
ldb – [in] [int] specifies the leading dimension of B.
beta – [in] device pointer or host pointer specifying the scalar beta.
CP – [inout] device pointer storing matrix C on the GPU.
ldc – [in] [int] specifies the leading dimension of C.
The gemm functions support the 64-bit integer interface. See the ILP64 interfaces section.
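The roles of the leading dimensions and of op( X ) can be easier to see in a plain host-side loop. The following C sketch is a CPU reference for the real single-precision case, assuming column-major storage; it is not how hipBLAS computes the product. The names `op_elem` and `ref_sgemm` are hypothetical, and the transpose choice is reduced to an integer flag instead of `hipblasOperation_t`.

```c
#include <assert.h>
#include <stddef.h>

/* Element (row, col) of op(X) for a column-major X with leading dimension ld.
   trans != 0 reads X transposed. */
static float op_elem(const float *X, int ld, int trans, int row, int col)
{
    return trans ? X[col + (size_t)row * ld] : X[row + (size_t)col * ld];
}

/* Hypothetical CPU reference: C = alpha*op(A)*op(B) + beta*C, with
   op(A) m by k, op(B) k by n, and C m by n, all column-major. */
static void ref_sgemm(int transA, int transB, int m, int n, int k,
                      float alpha, const float *A, int lda,
                      const float *B, int ldb,
                      float beta, float *C, int ldc)
{
    assert(m >= 0 && n >= 0 && k >= 0 && ldc >= (m > 1 ? m : 1));
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += op_elem(A, lda, transA, i, p) * op_elem(B, ldb, transB, p, j);
            float *c = &C[i + (size_t)j * ldc];
            /* When beta is zero the old value of C is not read. */
            *c = alpha * acc + (beta == 0.0f ? 0.0f : beta * *c);
        }
    }
}
```

Element (i, j) of a column-major matrix lives at offset `i + j*ld`, so a leading dimension larger than the row count simply skips padding between columns.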
-
hipblasStatus_t hipblasHgemmBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const hipblasHalf *alpha, const hipblasHalf *const AP[], int lda, const hipblasHalf *const BP[], int ldb, const hipblasHalf *beta, hipblasHalf *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasSgemmBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const float *alpha, const float *const AP[], int lda, const float *const BP[], int ldb, const float *beta, float *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasDgemmBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const double *alpha, const double *const AP[], int lda, const double *const BP[], int ldb, const double *beta, double *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasCgemmBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const hipComplex *alpha, const hipComplex *const AP[], int lda, const hipComplex *const BP[], int ldb, const hipComplex *beta, hipComplex *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasZgemmBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *const AP[], int lda, const hipDoubleComplex *const BP[], int ldb, const hipDoubleComplex *beta, hipDoubleComplex *const CP[], int ldc, int batchCount)#
BLAS Level 3 API
The gemmBatched functions perform one of the batched matrix-matrix operations:
C_i = alpha*op( A_i )*op( B_i ) + beta*C_i, for i = 1, ..., batchCount,
where op( X ) is one of:
op( X ) = X or op( X ) = X**T or op( X ) = X**H,
alpha and beta are scalars, and A, B, and C are batched matrices, with each op( A_i ) an m by k matrix, each op( B_i ) a k by n matrix, and each C_i an m by n matrix.
Supported precisions in rocBLAS: h, s, d, c, and z.
Supported precisions in cuBLAS: h, s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
transA – [in] [hipblasOperation_t] specifies the form of op( A ).
transB – [in] [hipblasOperation_t] specifies the form of op( B ).
m – [in] [int] matrix dimension m.
n – [in] [int] matrix dimension n.
k – [in] [int] matrix dimension k.
alpha – [in] device pointer or host pointer specifying the scalar alpha.
AP – [in] device array of device pointers storing each matrix A_i.
lda – [in] [int] specifies the leading dimension of each A_i.
BP – [in] device array of device pointers storing each matrix B_i.
ldb – [in] [int] specifies the leading dimension of each B_i.
beta – [in] device pointer or host pointer specifying the scalar beta.
CP – [inout] device array of device pointers storing each matrix C_i.
ldc – [in] [int] specifies the leading dimension of each C_i.
batchCount – [in] [int] number of gemm operations in the batch.
The gemmBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasHgemmStridedBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const hipblasHalf *alpha, const hipblasHalf *AP, int lda, long long strideA, const hipblasHalf *BP, int ldb, long long strideB, const hipblasHalf *beta, hipblasHalf *CP, int ldc, long long strideC, int batchCount)#
-
hipblasStatus_t hipblasSgemmStridedBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const float *alpha, const float *AP, int lda, long long strideA, const float *BP, int ldb, long long strideB, const float *beta, float *CP, int ldc, long long strideC, int batchCount)#
-
hipblasStatus_t hipblasDgemmStridedBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const double *alpha, const double *AP, int lda, long long strideA, const double *BP, int ldb, long long strideB, const double *beta, double *CP, int ldc, long long strideC, int batchCount)#
-
hipblasStatus_t hipblasCgemmStridedBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const hipComplex *alpha, const hipComplex *AP, int lda, long long strideA, const hipComplex *BP, int ldb, long long strideB, const hipComplex *beta, hipComplex *CP, int ldc, long long strideC, int batchCount)#
-
hipblasStatus_t hipblasZgemmStridedBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, long long strideA, const hipDoubleComplex *BP, int ldb, long long strideB, const hipDoubleComplex *beta, hipDoubleComplex *CP, int ldc, long long strideC, int batchCount)#
BLAS Level 3 API
The gemmStridedBatched functions perform one of the strided batched matrix-matrix operations:
C_i = alpha*op( A_i )*op( B_i ) + beta*C_i, for i = 1, ..., batchCount,
where op( X ) is one of:
op( X ) = X or op( X ) = X**T or op( X ) = X**H,
alpha and beta are scalars, and A, B, and C are strided batched matrices, with op( A ) an m by k by batchCount strided_batched matrix, op( B ) a k by n by batchCount strided_batched matrix, and C an m by n by batchCount strided_batched matrix.
Supported precisions in rocBLAS: h, s, d, c, and z.
Supported precisions in cuBLAS: h, s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
transA – [in] [hipblasOperation_t] specifies the form of op( A ).
transB – [in] [hipblasOperation_t] specifies the form of op( B ).
m – [in] [int] matrix dimension m.
n – [in] [int] matrix dimension n.
k – [in] [int] matrix dimension k.
alpha – [in] device pointer or host pointer specifying the scalar alpha.
AP – [in] device pointer pointing to the first matrix A_1.
lda – [in] [int] specifies the leading dimension of each A_i.
strideA – [in] [hipblasStride] stride from the start of one A_i matrix to the next A_(i + 1).
BP – [in] device pointer pointing to the first matrix B_1.
ldb – [in] [int] specifies the leading dimension of each B_i.
strideB – [in] [hipblasStride] stride from the start of one B_i matrix to the next B_(i + 1).
beta – [in] device pointer or host pointer specifying the scalar beta.
CP – [inout] device pointer pointing to the first matrix C_1.
ldc – [in] [int] specifies the leading dimension of each C_i.
strideC – [in] [hipblasStride] stride from the start of one C_i matrix to the next C_(i + 1).
batchCount – [in] [int] number of gemm operations in the batch.
The gemmStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
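The stride parameters describe how far apart consecutive matrices sit in one contiguous allocation. The following C sketch shows that addressing on the host for the non-transposed, real single-precision case. It is an illustration under assumptions, not the library's implementation: `min_stride` and `ref_sgemm_strided_batched` are hypothetical names, the batch index runs from 0 rather than 1, and the pointers are host pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest stride (in elements) that keeps consecutive column-major
   matrices with `cols` columns and leading dimension `ld` from overlapping. */
static long long min_stride(int ld, int cols)
{
    return (long long)ld * cols;
}

/* Hypothetical CPU reference: C_b = alpha*A_b*B_b + beta*C_b for
   b = 0..batchCount-1, no transposes. Matrix b of each operand starts
   b*stride elements after its base pointer. */
static void ref_sgemm_strided_batched(int m, int n, int k, float alpha,
                                      const float *A, int lda, long long strideA,
                                      const float *B, int ldb, long long strideB,
                                      float beta,
                                      float *C, int ldc, long long strideC,
                                      int batchCount)
{
    assert(lda >= m && ldb >= k && ldc >= m);
    for (int b = 0; b < batchCount; ++b) {
        const float *Ab = A + b * strideA;
        const float *Bb = B + b * strideB;
        float *Cb = C + b * strideC;
        for (int j = 0; j < n; ++j) {
            for (int i = 0; i < m; ++i) {
                float acc = 0.0f;
                for (int p = 0; p < k; ++p)
                    acc += Ab[i + (size_t)p * lda] * Bb[p + (size_t)j * ldb];
                Cb[i + (size_t)j * ldc] = alpha * acc + beta * Cb[i + (size_t)j * ldc];
            }
        }
    }
}
```

For densely packed 2 by 2 matrices with a leading dimension of 2, `min_stride(2, 2)` gives 4 elements between the starts of consecutive matrices.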
hipblasXherk + Batched, StridedBatched#
-
hipblasStatus_t hipblasCherk(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const float *alpha, const hipComplex *AP, int lda, const float *beta, hipComplex *CP, int ldc)#
-
hipblasStatus_t hipblasZherk(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const double *alpha, const hipDoubleComplex *AP, int lda, const double *beta, hipDoubleComplex *CP, int ldc)#
BLAS Level 3 API
The herk functions perform one of the matrix-matrix operations for a Hermitian rank-k update:
C := alpha*op( A )*op( A )^H + beta*C,
where alpha and beta are scalars, op( A ) is an n by k matrix, and C is an n by n Hermitian matrix stored as either upper or lower.
op( A ) = A, and A is n by k if transA == HIPBLAS_OP_N
op( A ) = A^H, and A is k by n if transA == HIPBLAS_OP_C
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: c and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_C: op(A) = A^H
HIPBLAS_OP_N: op(A) = A
n – [in] [int] n specifies the number of rows and columns of C. n >= 0.
k – [in] [int] k specifies the number of columns of op(A). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and A does not need to be set before entry.
AP – [in] pointer storing matrix A on the GPU. Matrix dimension is ( lda, k ) when transA = HIPBLAS_OP_N. Otherwise, ( lda, n ).
lda – [in] [int] lda specifies the first dimension of A. If transA = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
beta – [in] beta specifies the scalar beta. When beta is zero, then C does not need to be set before entry.
CP – [inout] pointer storing matrix C on the GPU. Only the upper/lower triangular part selected by uplo is accessed. The imaginary components of the diagonal elements are not used but are set to zero, except for quick return.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, n ).
The herk functions support the 64-bit integer interface. See the ILP64 interfaces section.
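To make the roles of uplo and the diagonal handling concrete, the following C sketch computes the HIPBLAS_OP_N case of the rank-k update on the host. It is a CPU reference under assumptions, not the hipBLAS implementation: `ref_cherk_n` is a hypothetical name, C99 `float complex` stands in for `hipComplex`, storage is column-major, and the fill mode is reduced to an integer flag.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical CPU reference: C := alpha*A*A^H + beta*C for a column-major
   n by k matrix A. Only the triangle selected by `upper` is written. */
static void ref_cherk_n(int upper, int n, int k, float alpha,
                        const float complex *A, int lda,
                        float beta, float complex *C, int ldc)
{
    assert(lda >= (n > 1 ? n : 1) && ldc >= (n > 1 ? n : 1));
    for (int j = 0; j < n; ++j) {
        int first = upper ? 0 : j;
        int last = upper ? j : n - 1;
        for (int i = first; i <= last; ++i) {
            float complex acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += A[i + (size_t)p * lda] * conjf(A[j + (size_t)p * lda]);
            float complex c = alpha * acc + beta * C[i + (size_t)j * ldc];
            /* A Hermitian matrix has a real diagonal: drop the imaginary part. */
            C[i + (size_t)j * ldc] = (i == j) ? crealf(c) : c;
        }
    }
}
```

Because alpha and beta are real, the result stays Hermitian, which is why only one triangle needs to be stored and updated.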
-
hipblasStatus_t hipblasCherkBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const float *alpha, const hipComplex *const AP[], int lda, const float *beta, hipComplex *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasZherkBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const double *alpha, const hipDoubleComplex *const AP[], int lda, const double *beta, hipDoubleComplex *const CP[], int ldc, int batchCount)#
BLAS Level 3 API
The herkBatched functions perform a batch of the matrix-matrix operations for a Hermitian rank-k update:
C_i := alpha*op( A_i )*op( A_i )^H + beta*C_i,
where alpha and beta are scalars, op( A_i ) is an n by k matrix, and C_i is an n by n Hermitian matrix stored as either upper or lower.
op( A_i ) = A_i, and A_i is n by k if transA == HIPBLAS_OP_N
op( A_i ) = A_i^H, and A_i is k by n if transA == HIPBLAS_OP_C
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_C: op(A) = A^H
HIPBLAS_OP_N: op(A) = A
n – [in] [int] n specifies the number of rows and columns of C_i. n >= 0.
k – [in] [int] k specifies the number of columns of op(A). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and A does not need to be set before entry.
AP – [in] device array of device pointers storing each matrix A_i of dimension (lda, k) when transA is HIPBLAS_OP_N. Otherwise, of dimension (lda, n).
lda – [in] [int] lda specifies the first dimension of A_i. If transA = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
beta – [in] beta specifies the scalar beta. When beta is zero, then C does not need to be set before entry.
CP – [inout] device array of device pointers storing each matrix C_i on the GPU. The imaginary components of the diagonal elements are not used but are set to zero, except for quick return.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, n ).
batchCount – [in] [int] number of instances in the batch.
The herkBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasCherkStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const float *alpha, const hipComplex *AP, int lda, hipblasStride strideA, const float *beta, hipComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasZherkStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const double *alpha, const hipDoubleComplex *AP, int lda, hipblasStride strideA, const double *beta, hipDoubleComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
BLAS Level 3 API
The herkStridedBatched functions perform a batch of the matrix-matrix operations for a Hermitian rank-k update:
C_i := alpha*op( A_i )*op( A_i )^H + beta*C_i,
where alpha and beta are scalars, op( A_i ) is an n by k matrix, and C_i is an n by n Hermitian matrix stored as either upper or lower.
op( A_i ) = A_i, and A_i is n by k if transA == HIPBLAS_OP_N
op( A_i ) = A_i^H, and A_i is k by n if transA == HIPBLAS_OP_C
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_C: op(A) = A^H
HIPBLAS_OP_N: op(A) = A
n – [in] [int] n specifies the number of rows and columns of C_i. n >= 0.
k – [in] [int] k specifies the number of columns of op(A). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and A does not need to be set before entry.
AP – [in] Device pointer to the first matrix A_1 on the GPU of dimension (lda, k) when transA is HIPBLAS_OP_N. Otherwise, of dimension (lda, n).
lda – [in] [int] lda specifies the first dimension of A_i. If transA = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
beta – [in] beta specifies the scalar beta. When beta is zero, then C does not need to be set before entry.
CP – [inout] Device pointer to the first matrix C_1 on the GPU. The imaginary components of the diagonal elements are not used but are set to zero, except for quick return.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, n ).
strideC – [in] [hipblasStride] stride from the start of one matrix (C_i) to the next one (C_i+1).
batchCount – [in] [int] number of instances in the batch.
The herkStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXherkx + Batched, StridedBatched#
-
hipblasStatus_t hipblasCherkx(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipComplex *alpha, const hipComplex *AP, int lda, const hipComplex *BP, int ldb, const float *beta, hipComplex *CP, int ldc)#
-
hipblasStatus_t hipblasZherkx(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, const hipDoubleComplex *BP, int ldb, const double *beta, hipDoubleComplex *CP, int ldc)#
BLAS Level 3 API
The herkx functions perform one of the matrix-matrix operations for a Hermitian rank-k update:
C := alpha*op( A )*op( B )^H + beta*C,
where alpha and beta are scalars, op( A ) and op( B ) are n by k matrices, and C is an n by n Hermitian matrix stored as either upper or lower. This routine should only be used when the caller can guarantee that the result of op( A )*op( B )^H will be Hermitian.
op( A ) = A, op( B ) = B, and A and B are n by k if transA == HIPBLAS_OP_N
op( A ) = A^H, op( B ) = B^H, and A and B are k by n if transA == HIPBLAS_OP_C
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: c and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_C: op( A ) = A^H, op( B ) = B^H
HIPBLAS_OP_N: op( A ) = A, op( B ) = B
n – [in] [int] n specifies the number of rows and columns of C. n >= 0.
k – [in] [int] k specifies the number of columns of op(A). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and does not need to be set before entry.
AP – [in] pointer storing matrix A on the GPU. Matrix dimension is ( lda, k ) when transA = HIPBLAS_OP_N. Otherwise, ( lda, n ).
lda – [in] [int] lda specifies the first dimension of A. If transA = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
BP – [in] pointer storing matrix B on the GPU. Matrix dimension is ( ldb, k ) when transA = HIPBLAS_OP_N. Otherwise, ( ldb, n ).
ldb – [in] [int] ldb specifies the first dimension of B. If transA = HIPBLAS_OP_N, ldb >= max( 1, n ). Otherwise, ldb >= max( 1, k ).
beta – [in] beta specifies the scalar beta. When beta is zero, then C does not need to be set before entry.
CP – [inout] pointer storing matrix C on the GPU. Only the upper/lower triangular part selected by uplo is accessed. The imaginary components of the diagonal elements are not used but are set to zero, except for quick return.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, n ).
The herkx functions support the 64-bit integer interface. See the ILP64 interfaces section.
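Since herkx relies on the caller to guarantee that op( A )*op( B )^H is Hermitian, it can help to verify that property on small host-side inputs while developing. The following C sketch does so for the HIPBLAS_OP_N case. It is an assumption-laden illustration, not part of hipBLAS: `abh` and `abh_is_hermitian` are hypothetical names, C99 `float complex` stands in for `hipComplex`, storage is column-major, and `tol` is an absolute tolerance chosen by the caller.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Entry (i, j) of A*B^H for column-major n by k matrices A and B. */
static float complex abh(const float complex *A, int lda,
                         const float complex *B, int ldb, int k, int i, int j)
{
    float complex acc = 0.0f;
    for (int p = 0; p < k; ++p)
        acc += A[i + (size_t)p * lda] * conjf(B[j + (size_t)p * ldb]);
    return acc;
}

/* Hypothetical helper: returns 1 when every entry (i, j) of A*B^H matches
   the conjugate of entry (j, i) to within tol, and 0 otherwise. */
static int abh_is_hermitian(int n, int k, const float complex *A, int lda,
                            const float complex *B, int ldb, float tol)
{
    assert(n >= 0 && k >= 0 && tol >= 0.0f);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i <= j; ++i) {
            float complex d = abh(A, lda, B, ldb, k, i, j)
                              - conjf(abh(A, lda, B, ldb, k, j, i));
            float re = crealf(d), im = cimagf(d);
            if (re * re + im * im > tol * tol)
                return 0;
        }
    }
    return 1;
}
```

One case that satisfies the condition is B equal to A times a real scalar, because A*A^H is always Hermitian. Multiplying A by the imaginary unit instead produces a product with an imaginary diagonal, which the check rejects.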
-
hipblasStatus_t hipblasCherkxBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipComplex *alpha, const hipComplex *const AP[], int lda, const hipComplex *const BP[], int ldb, const float *beta, hipComplex *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasZherkxBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *const AP[], int lda, const hipDoubleComplex *const BP[], int ldb, const double *beta, hipDoubleComplex *const CP[], int ldc, int batchCount)#
BLAS Level 3 API
The herkxBatched functions perform a batch of the matrix-matrix operations for a Hermitian rank-k update:
C_i := alpha*op( A_i )*op( B_i )^H + beta*C_i,
where alpha and beta are scalars, op( A_i ) and op( B_i ) are n by k matrices, and C_i is an n by n Hermitian matrix stored as either upper or lower. This routine should only be used when the caller can guarantee that the result of op( A_i )*op( B_i )^H will be Hermitian.
op( A_i ) = A_i, op( B_i ) = B_i, and A_i and B_i are n by k if transA == HIPBLAS_OP_N
op( A_i ) = A_i^H, op( B_i ) = B_i^H, and A_i and B_i are k by n if transA == HIPBLAS_OP_C
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_C: op( A_i ) = A_i^H, op( B_i ) = B_i^H
HIPBLAS_OP_N: op( A_i ) = A_i, op( B_i ) = B_i
n – [in] [int] n specifies the number of rows and columns of C_i. n >= 0.
k – [in] [int] k specifies the number of columns of op(A). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and does not need to be set before entry.
AP – [in] device array of device pointers storing each matrix A_i of dimension (lda, k) when transA is HIPBLAS_OP_N. Otherwise, of dimension (lda, n).
lda – [in] [int] lda specifies the first dimension of A_i. If transA = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
BP – [in] device array of device pointers storing each matrix B_i of dimension (ldb, k) when transA is HIPBLAS_OP_N. Otherwise, of dimension (ldb, n).
ldb – [in] [int] ldb specifies the first dimension of B_i. If transA = HIPBLAS_OP_N, ldb >= max( 1, n ). Otherwise, ldb >= max( 1, k ).
beta – [in] beta specifies the scalar beta. When beta is zero, then C does not need to be set before entry.
CP – [inout] device array of device pointers storing each matrix C_i on the GPU. The imaginary components of the diagonal elements are not used but are set to zero, except for quick return.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, n ).
batchCount – [in] [int] number of instances in the batch.
The herkxBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasCherkxStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipComplex *alpha, const hipComplex *AP, int lda, hipblasStride strideA, const hipComplex *BP, int ldb, hipblasStride strideB, const float *beta, hipComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasZherkxStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, hipblasStride strideA, const hipDoubleComplex *BP, int ldb, hipblasStride strideB, const double *beta, hipDoubleComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
BLAS Level 3 API
The herkxStridedBatched functions perform a batch of the matrix-matrix operations for a Hermitian rank-k update:
C_i := alpha*op( A_i )*op( B_i )^H + beta*C_i,
where alpha and beta are scalars, op( A_i ) and op( B_i ) are n by k matrices, and C_i is an n by n Hermitian matrix stored as either upper or lower. This routine should only be used when the caller can guarantee that the result of op( A_i )*op( B_i )^H will be Hermitian.
op( A_i ) = A_i, op( B_i ) = B_i, and A_i and B_i are n by k if transA == HIPBLAS_OP_N
op( A_i ) = A_i^H, op( B_i ) = B_i^H, and A_i and B_i are k by n if transA == HIPBLAS_OP_C
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_C: op( A_i ) = A_i^H, op( B_i ) = B_i^H
HIPBLAS_OP_N: op( A_i ) = A_i, op( B_i ) = B_i
n – [in] [int] n specifies the number of rows and columns of C_i. n >= 0.
k – [in] [int] k specifies the number of columns of op(A). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and does not need to be set before entry.
AP – [in] Device pointer to the first matrix A_1 on the GPU of dimension (lda, k) when transA is HIPBLAS_OP_N. Otherwise, of dimension (lda, n).
lda – [in] [int] lda specifies the first dimension of A_i. If transA = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
BP – [in] Device pointer to the first matrix B_1 on the GPU of dimension (ldb, k) when transA is HIPBLAS_OP_N. Otherwise, of dimension (ldb, n).
ldb – [in] [int] ldb specifies the first dimension of B_i. If transA = HIPBLAS_OP_N, ldb >= max( 1, n ). Otherwise, ldb >= max( 1, k ).
strideB – [in] [hipblasStride] stride from the start of one matrix (B_i) to the next one (B_i+1).
beta – [in] beta specifies the scalar beta. When beta is zero, then C does not need to be set before entry.
CP – [inout] Device pointer to the first matrix C_1 on the GPU. The imaginary components of the diagonal elements are not used but are set to zero, except for quick return.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, n ).
strideC – [in] [hipblasStride] stride from the start of one matrix (C_i) to the next one (C_i+1).
batchCount – [in] [int] number of instances in the batch.
The herkxStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXher2k + Batched, StridedBatched#
-
hipblasStatus_t hipblasCher2k(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipComplex *alpha, const hipComplex *AP, int lda, const hipComplex *BP, int ldb, const float *beta, hipComplex *CP, int ldc)#
-
hipblasStatus_t hipblasZher2k(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, const hipDoubleComplex *BP, int ldb, const double *beta, hipDoubleComplex *CP, int ldc)#
BLAS Level 3 API
The her2k functions perform one of the matrix-matrix operations for a Hermitian rank-2k update:
C := alpha*op( A )*op( B )^H + conj(alpha)*op( B )*op( A )^H + beta*C,
where alpha and beta are scalars, op( A ) and op( B ) are n by k matrices, and C is an n by n Hermitian matrix stored as either upper or lower.
op( A ) = A, op( B ) = B, and A and B are n by k if transA == HIPBLAS_OP_N
op( A ) = A^H, op( B ) = B^H, and A and B are k by n if transA == HIPBLAS_OP_C
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: c and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_C: op( A ) = A^H, op( B ) = B^H
HIPBLAS_OP_N: op( A ) = A, op( B ) = B
n – [in] [int] n specifies the number of rows and columns of C. n >= 0.
k – [in] [int] k specifies the number of columns of op(A). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and does not need to be set before entry.
AP – [in] pointer storing matrix A on the GPU. Matrix dimension is ( lda, k ) when transA = HIPBLAS_OP_N. Otherwise, ( lda, n ).
lda – [in] [int] lda specifies the first dimension of A. If transA = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
BP – [in] pointer storing matrix B on the GPU. Matrix dimension is ( ldb, k ) when transA = HIPBLAS_OP_N. Otherwise, ( ldb, n ).
ldb – [in] [int] ldb specifies the first dimension of B. If transA = HIPBLAS_OP_N, ldb >= max( 1, n ). Otherwise, ldb >= max( 1, k ).
beta – [in] beta specifies the scalar beta. When beta is zero, then C does not need to be set before entry.
CP – [inout] pointer storing matrix C on the GPU. Only the upper/lower triangular part selected by uplo is accessed. The imaginary components of the diagonal elements are not used but are set to zero, except for quick return.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, n ).
The her2k functions support the 64-bit integer interface. See the ILP64 interfaces section.
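The rank-2k update adds two products whose sum is Hermitian even when alpha is complex. The following C sketch computes the HIPBLAS_OP_N case on the host and writes only the upper triangle. It is a CPU reference under assumptions, not the hipBLAS implementation: `ref_cher2k_upper_n` is a hypothetical name, C99 `float complex` stands in for `hipComplex`, and storage is column-major.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical CPU reference:
   C := alpha*A*B^H + conj(alpha)*B*A^H + beta*C for column-major n by k
   matrices A and B. Only the upper triangle of C is written. */
static void ref_cher2k_upper_n(int n, int k, float complex alpha,
                               const float complex *A, int lda,
                               const float complex *B, int ldb,
                               float beta, float complex *C, int ldc)
{
    assert(lda >= n && ldb >= n && ldc >= n);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i <= j; ++i) {
            float complex ab = 0.0f, ba = 0.0f;
            for (int p = 0; p < k; ++p) {
                ab += A[i + (size_t)p * lda] * conjf(B[j + (size_t)p * ldb]);
                ba += B[i + (size_t)p * ldb] * conjf(A[j + (size_t)p * lda]);
            }
            float complex c = alpha * ab + conjf(alpha) * ba
                              + beta * C[i + (size_t)j * ldc];
            /* The diagonal of a Hermitian matrix is real. */
            C[i + (size_t)j * ldc] = (i == j) ? crealf(c) : c;
        }
    }
}
```

The second term is the conjugate transpose of the first, which is what keeps the diagonal real and lets a single triangle represent the whole result.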
-
hipblasStatus_t hipblasCher2kBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipComplex *alpha, const hipComplex *const AP[], int lda, const hipComplex *const BP[], int ldb, const float *beta, hipComplex *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasZher2kBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *const AP[], int lda, const hipDoubleComplex *const BP[], int ldb, const double *beta, hipDoubleComplex *const CP[], int ldc, int batchCount)#
BLAS Level 3 API
The her2kBatched functions perform a batch of the matrix-matrix operations for a Hermitian rank-2k update:
C_i := alpha*op( A_i )*op( B_i )^H + conj(alpha)*op( B_i )*op( A_i )^H + beta*C_i
where alpha and beta are scalars, op(A_i) and op(B_i) are n by k matrices, and C_i is an n by n Hermitian matrix stored as either upper or lower.
op( A_i ) = A_i, op( B_i ) = B_i, and A_i and B_i are n by k if transA == HIPBLAS_OP_N
op( A_i ) = A_i^H, op( B_i ) = B_i^H, and A_i and B_i are k by n if transA == HIPBLAS_OP_C
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_C: op(A) = A^H
HIPBLAS_OP_N: op(A) = A
n – [in] [int] n specifies the number of rows and columns of C_i. n >= 0.
k – [in] [int] k specifies the number of columns of op(A). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and does not need to be set before entry.
AP – [in] device array of device pointers storing each matrix A_i of dimension (lda, k) when transA is HIPBLAS_OP_N. Otherwise, of dimension (lda, n).
lda – [in] [int] lda specifies the first dimension of A_i. If transA = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
BP – [in] device array of device pointers storing each matrix B_i of dimension (ldb, k) when transA is HIPBLAS_OP_N. Otherwise, of dimension (ldb, n).
ldb – [in] [int] ldb specifies the first dimension of B_i. If transA = HIPBLAS_OP_N, ldb >= max( 1, n ). Otherwise, ldb >= max( 1, k ).
beta – [in] beta specifies the scalar beta. When beta is zero, then C_i does not need to be set before entry.
CP – [inout] device array of device pointers storing each matrix C_i on the GPU. The imaginary components of the diagonal elements are not used but are set to zero, except for quick return.
ldc – [in] [int] ldc specifies the first dimension of C_i. ldc >= max( 1, n ).
batchCount – [in] [int] number of instances in the batch.
The her2kBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
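The size constraints in the parameter list above can be checked on the host before a call is issued. The sketch below is a hypothetical helper: `her2k_batched_args_ok` is not part of hipBLAS, and the `Op` enum stands in for hipblasOperation_t so that the example compiles on its own. It only restates the documented inequalities.

```cpp
#include <algorithm>
#include <cassert>

// Stand-in for hipblasOperation_t so the sketch compiles without hipBLAS.
enum class Op { N, C };

// Returns true when the sizes satisfy the documented constraints:
//   n >= 0, k >= 0,
//   lda, ldb >= max(1, n) for Op::N, otherwise >= max(1, k),
//   ldc >= max(1, n).
bool her2k_batched_args_ok(Op transA, int n, int k,
                           int lda, int ldb, int ldc)
{
    if (n < 0 || k < 0) return false;
    const int min_ab = std::max(1, transA == Op::N ? n : k);
    return lda >= min_ab && ldb >= min_ab && ldc >= std::max(1, n);
}
```

Because of the max( 1, ... ) terms, a leading dimension of zero is rejected even when n and k are both zero.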
-
hipblasStatus_t hipblasCher2kStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipComplex *alpha, const hipComplex *AP, int lda, hipblasStride strideA, const hipComplex *BP, int ldb, hipblasStride strideB, const float *beta, hipComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasZher2kStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, hipblasStride strideA, const hipDoubleComplex *BP, int ldb, hipblasStride strideB, const double *beta, hipDoubleComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
BLAS Level 3 API
The her2kStridedBatched functions perform a batch of the matrix-matrix operations for a Hermitian rank-2k update:
C_i := alpha*op( A_i )*op( B_i )^H + conj(alpha)*op( B_i )*op( A_i )^H + beta*C_i
where alpha and beta are scalars, op(A_i) and op(B_i) are n by k matrices, and C_i is an n by n Hermitian matrix stored as either upper or lower.
op( A_i ) = A_i, op( B_i ) = B_i, and A_i and B_i are n by k if transA == HIPBLAS_OP_N
op( A_i ) = A_i^H, op( B_i ) = B_i^H, and A_i and B_i are k by n if transA == HIPBLAS_OP_C
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_C: op( A_i ) = A_i^H, op( B_i ) = B_i^H
HIPBLAS_OP_N: op( A_i ) = A_i, op( B_i ) = B_i
n – [in] [int] n specifies the number of rows and columns of C_i. n >= 0.
k – [in] [int] k specifies the number of columns of op(A). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and does not need to be set before entry.
AP – [in] Device pointer to the first matrix A_1 on the GPU of dimension (lda, k) when transA is HIPBLAS_OP_N. Otherwise, of dimension (lda, n).
lda – [in] [int] lda specifies the first dimension of A_i. If transA = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
BP – [in] Device pointer to the first matrix B_1 on the GPU of dimension (ldb, k) when transA is HIPBLAS_OP_N. Otherwise, of dimension (ldb, n).
ldb – [in] [int] ldb specifies the first dimension of B_i. If transA = HIPBLAS_OP_N, ldb >= max( 1, n ). Otherwise, ldb >= max( 1, k ).
strideB – [in] [hipblasStride] stride from the start of one matrix (B_i) to the next one (B_i+1).
beta – [in] beta specifies the scalar beta. When beta is zero, then C_i does not need to be set before entry.
CP – [inout] Device pointer to the first matrix C_1 on the GPU. The imaginary components of the diagonal elements are not used but are set to zero, except for quick return.
ldc – [in] [int] ldc specifies the first dimension of C_i. ldc >= max( 1, n ).
strideC – [in] [hipblasStride] stride from the start of one matrix (C_i) to the next one (C_i+1).
batchCount – [in] [int] number of instances in the batch.
The her2kStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
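The stride parameters describe where each matrix of the batch starts relative to the first one. The sketch below illustrates that addressing under two assumptions that the parameter list does not spell out: strides are counted in elements rather than bytes, and storage is column-major. `Stride`, `strided_offset`, and `min_stride` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for hipblasStride; assumed here to be a 64-bit signed integer.
using Stride = std::int64_t;

// Linear offset (in elements) of entry (row, col) of the i-th matrix in a
// strided batch, assuming column-major storage with leading dimension ld.
Stride strided_offset(Stride stride, int ld, int i, int row, int col)
{
    return static_cast<Stride>(i) * stride
         + static_cast<Stride>(col) * ld + row;
}

// Smallest stride for which consecutive ld-by-cols matrices do not overlap.
Stride min_stride(int ld, int cols)
{
    return static_cast<Stride>(ld) * cols;
}
```

For example, with transA == HIPBLAS_OP_N the matrices A_i occupy lda by k elements each, so `min_stride(lda, k)` would be a natural lower bound for strideA when the matrices are packed back to back.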
hipblasXsymm + Batched, StridedBatched#
-
hipblasStatus_t hipblasSsymm(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int m, int n, const float *alpha, const float *AP, int lda, const float *BP, int ldb, const float *beta, float *CP, int ldc)#
-
hipblasStatus_t hipblasDsymm(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int m, int n, const double *alpha, const double *AP, int lda, const double *BP, int ldb, const double *beta, double *CP, int ldc)#
-
hipblasStatus_t hipblasCsymm(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int m, int n, const hipComplex *alpha, const hipComplex *AP, int lda, const hipComplex *BP, int ldb, const hipComplex *beta, hipComplex *CP, int ldc)#
-
hipblasStatus_t hipblasZsymm(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, const hipDoubleComplex *BP, int ldb, const hipDoubleComplex *beta, hipDoubleComplex *CP, int ldc)#
BLAS Level 3 API
The symm functions perform one of the matrix-matrix operations:
C := alpha*A*B + beta*C if side == HIPBLAS_SIDE_LEFT,
C := alpha*B*A + beta*C if side == HIPBLAS_SIDE_RIGHT,
where alpha and beta are scalars, B and C are m by n matrices, and A is a symmetric matrix stored as either upper or lower.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t]
HIPBLAS_SIDE_LEFT: C := alpha*A*B + beta*C
HIPBLAS_SIDE_RIGHT: C := alpha*B*A + beta*C
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A is a lower triangular matrix.
m – [in] [int] m specifies the number of rows of B and C. m >= 0.
n – [in] [int] n specifies the number of columns of B and C. n >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A and B are not referenced.
AP – [in] pointer storing matrix A on the GPU. A is m by m if side == HIPBLAS_SIDE_LEFT. A is n by n if side == HIPBLAS_SIDE_RIGHT. Only the upper/lower triangular part is accessed.
lda – [in] [int] lda specifies the first dimension of A. If side = HIPBLAS_SIDE_LEFT, lda >= max( 1, m ). Otherwise, lda >= max( 1, n ).
BP – [in] pointer storing matrix B on the GPU. Matrix dimension is m by n.
ldb – [in] [int] ldb specifies the first dimension of B. ldb >= max( 1, m ).
beta – [in] beta specifies the scalar beta. When beta is zero, then C does not need to be set before entry.
CP – [inout] pointer storing matrix C on the GPU. Matrix dimension is m by n.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, m ).
The symm functions support the 64-bit integer interface. See the ILP64 interfaces section.
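The following host-side sketch shows what the symm operation computes for one combination of side and uplo, and how only the stored triangle of A is read. It is a reference illustration under stated assumptions, not hipBLAS code: `symm_left_upper_ref` is a hypothetical helper that handles only side == HIPBLAS_SIDE_LEFT with uplo == HIPBLAS_FILL_MODE_UPPER on column-major float data.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side reference (not a hipBLAS function) for
// side == LEFT, uplo == UPPER:
//   C := alpha*A*B + beta*C
// A is m by m symmetric (only its upper triangle is read),
// B and C are m by n, all stored column-major.
void symm_left_upper_ref(int m, int n, float alpha,
                         const std::vector<float>& A, int lda,
                         const std::vector<float>& B, int ldb,
                         float beta, std::vector<float>& C, int ldc)
{
    assert(lda >= m && ldb >= m && ldc >= m);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float sum = 0.0f;
            for (int l = 0; l < m; ++l) {
                // Read A(i,l) from the upper triangle, using A(i,l) == A(l,i).
                const float a = (i <= l) ? A[i + l * lda] : A[l + i * lda];
                sum += a * B[l + j * ldb];
            }
            C[i + j * ldc] = alpha * sum + beta * C[i + j * ldc];
        }
    }
}
```

Entries of A below the diagonal are never read in this sketch, so they can hold arbitrary values.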
-
hipblasStatus_t hipblasSsymmBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int m, int n, const float *alpha, const float *const AP[], int lda, const float *const BP[], int ldb, const float *beta, float *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasDsymmBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int m, int n, const double *alpha, const double *const AP[], int lda, const double *const BP[], int ldb, const double *beta, double *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasCsymmBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int m, int n, const hipComplex *alpha, const hipComplex *const AP[], int lda, const hipComplex *const BP[], int ldb, const hipComplex *beta, hipComplex *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasZsymmBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *const AP[], int lda, const hipDoubleComplex *const BP[], int ldb, const hipDoubleComplex *beta, hipDoubleComplex *const CP[], int ldc, int batchCount)#
BLAS Level 3 API
symmBatched performs a batch of the matrix-matrix operations:
C_i := alpha*A_i*B_i + beta*C_i if side == HIPBLAS_SIDE_LEFT,
C_i := alpha*B_i*A_i + beta*C_i if side == HIPBLAS_SIDE_RIGHT,
where alpha and beta are scalars, B_i and C_i are m by n matrices, and A_i is a symmetric matrix stored as either upper or lower.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t]
HIPBLAS_SIDE_LEFT: C_i := alpha*A_i*B_i + beta*C_i
HIPBLAS_SIDE_RIGHT: C_i := alpha*B_i*A_i + beta*C_i
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A_i is a lower triangular matrix.
m – [in] [int] m specifies the number of rows of B_i and C_i. m >= 0.
n – [in] [int] n specifies the number of columns of B_i and C_i. n >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A_i and B_i are not referenced.
AP – [in] device array of device pointers storing each matrix A_i on the GPU. A_i is m by m if side == HIPBLAS_SIDE_LEFT. A_i is n by n if side == HIPBLAS_SIDE_RIGHT. Only the upper/lower triangular part is accessed.
lda – [in] [int] lda specifies the first dimension of A_i. If side = HIPBLAS_SIDE_LEFT, lda >= max( 1, m ). Otherwise, lda >= max( 1, n ).
BP – [in] device array of device pointers storing each matrix B_i on the GPU. Matrix dimension is m by n.
ldb – [in] [int] ldb specifies the first dimension of B_i. ldb >= max( 1, m ).
beta – [in] beta specifies the scalar beta. When beta is zero, then C_i does not need to be set before entry.
CP – [inout] device array of device pointers storing each matrix C_i on the GPU. Matrix dimension is m by n.
ldc – [in] [int] ldc specifies the first dimension of C_i. ldc >= max( 1, m ).
batchCount – [in] [int] number of instances in the batch.
The symmBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
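The batched variants take an array of pointers, one per matrix, rather than a single base pointer. The sketch below shows one way such a pointer table could be built when the matrices happen to sit back to back in a single buffer. `make_batch_pointers` is a hypothetical helper, and the example runs entirely on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds the per-matrix pointer table for a batch whose matrices are stored
// back to back in one buffer, matrix_elems elements apart.
// Hypothetical helper, not a hipBLAS function.
template <typename T>
std::vector<T*> make_batch_pointers(T* base, std::size_t matrix_elems, int batchCount)
{
    assert(batchCount >= 0);
    std::vector<T*> ptrs(static_cast<std::size_t>(batchCount));
    for (int i = 0; i < batchCount; ++i)
        ptrs[i] = base + static_cast<std::size_t>(i) * matrix_elems;
    return ptrs;
}
```

In an actual symmBatched call, base would be a device allocation and the table itself would also have to be copied to device memory, because AP, BP, and CP are documented as device arrays of device pointers.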
-
hipblasStatus_t hipblasSsymmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int m, int n, const float *alpha, const float *AP, int lda, hipblasStride strideA, const float *BP, int ldb, hipblasStride strideB, const float *beta, float *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasDsymmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int m, int n, const double *alpha, const double *AP, int lda, hipblasStride strideA, const double *BP, int ldb, hipblasStride strideB, const double *beta, double *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasCsymmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int m, int n, const hipComplex *alpha, const hipComplex *AP, int lda, hipblasStride strideA, const hipComplex *BP, int ldb, hipblasStride strideB, const hipComplex *beta, hipComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasZsymmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, hipblasStride strideA, const hipDoubleComplex *BP, int ldb, hipblasStride strideB, const hipDoubleComplex *beta, hipDoubleComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
BLAS Level 3 API
The symmStridedBatched functions perform a batch of the matrix-matrix operations:
C_i := alpha*A_i*B_i + beta*C_i if side == HIPBLAS_SIDE_LEFT,
C_i := alpha*B_i*A_i + beta*C_i if side == HIPBLAS_SIDE_RIGHT,
where alpha and beta are scalars, B_i and C_i are m by n matrices, and A_i is a symmetric matrix stored as either upper or lower.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t]
HIPBLAS_SIDE_LEFT: C_i := alpha*A_i*B_i + beta*C_i
HIPBLAS_SIDE_RIGHT: C_i := alpha*B_i*A_i + beta*C_i
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A_i is a lower triangular matrix.
m – [in] [int] m specifies the number of rows of B_i and C_i. m >= 0.
n – [in] [int] n specifies the number of columns of B_i and C_i. n >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A_i and B_i are not referenced.
AP – [in] device pointer to first matrix A_1. A_i is m by m if side == HIPBLAS_SIDE_LEFT. A_i is n by n if side == HIPBLAS_SIDE_RIGHT. Only the upper/lower triangular part is accessed.
lda – [in] [int] lda specifies the first dimension of A_i. If side = HIPBLAS_SIDE_LEFT, lda >= max( 1, m ). Otherwise, lda >= max( 1, n ).
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
BP – [in] device pointer to first matrix B_1 of dimension (ldb, n) on the GPU.
ldb – [in] [int] ldb specifies the first dimension of B_i. ldb >= max( 1, m ).
strideB – [in] [hipblasStride] stride from the start of one matrix (B_i) to the next one (B_i+1).
beta – [in] beta specifies the scalar beta. When beta is zero, then C_i does not need to be set before entry.
CP – [inout] device pointer to first matrix C_1 of dimension (ldc, n) on the GPU.
ldc – [in] [int] ldc specifies the first dimension of C_i. ldc >= max( 1, m ).
strideC – [in] [hipblasStride] stride from the start of one matrix (C_i) to the next one (C_i+1).
batchCount – [in] [int] number of instances in the batch.
The symmStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXsyrk + Batched, StridedBatched#
-
hipblasStatus_t hipblasSsyrk(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const float *alpha, const float *AP, int lda, const float *beta, float *CP, int ldc)#
-
hipblasStatus_t hipblasDsyrk(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const double *alpha, const double *AP, int lda, const double *beta, double *CP, int ldc)#
-
hipblasStatus_t hipblasCsyrk(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipComplex *alpha, const hipComplex *AP, int lda, const hipComplex *beta, hipComplex *CP, int ldc)#
-
hipblasStatus_t hipblasZsyrk(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, const hipDoubleComplex *beta, hipDoubleComplex *CP, int ldc)#
BLAS Level 3 API
The syrk functions perform one of the matrix-matrix operations for a symmetric rank-k update:
C := alpha*op( A )*op( A )^T + beta*C
where alpha and beta are scalars, op(A) is an n by k matrix, and C is a symmetric n by n matrix stored as either upper or lower.
op( A ) = A, and A is n by k if transA == HIPBLAS_OP_N
op( A ) = A^T and A is k by n if transA == HIPBLAS_OP_T
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_T: op(A) = A^T
HIPBLAS_OP_N: op(A) = A
HIPBLAS_OP_C: op(A) = A^T
HIPBLAS_OP_C is not supported for complex types. See cherk and zherk.
n – [in] [int] n specifies the number of rows and columns of C. n >= 0.
k – [in] [int] k specifies the number of columns of op(A). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and does not need to be set before entry.
AP – [in] pointer storing matrix A on the GPU. Matrix dimension is ( lda, k ) when transA = HIPBLAS_OP_N. Otherwise, ( lda, n ).
lda – [in] [int] lda specifies the first dimension of A. If transA = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
beta – [in] beta specifies the scalar beta. When beta is zero, then C does not need to be set before entry.
CP – [inout] pointer storing matrix C on the GPU.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, n ).
The syrk functions support the 64-bit integer interface. See the ILP64 interfaces section.
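The effect of transA on how A is read can be seen in a small host-side reference. The sketch below is an illustration under stated assumptions, not hipBLAS code: `syrk_lower_ref` is a hypothetical helper that handles only uplo == HIPBLAS_FILL_MODE_LOWER on column-major double data, with a bool standing in for the choice between HIPBLAS_OP_N and HIPBLAS_OP_T.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side reference (not a hipBLAS function) for uplo == LOWER:
//   C := alpha*op(A)*op(A)^T + beta*C
// trans == false: op(A) = A,   A is n by k.
// trans == true : op(A) = A^T, A is k by n.
// Column-major storage; only the lower triangle of C is written.
void syrk_lower_ref(bool trans, int n, int k, double alpha,
                    const std::vector<double>& A, int lda,
                    double beta, std::vector<double>& C, int ldc)
{
    assert(lda >= (trans ? k : n) && ldc >= n);
    // Element (i, l) of op(A), read from the stored matrix A.
    auto opA = [&](int i, int l) {
        return trans ? A[l + i * lda] : A[i + l * lda];
    };
    for (int j = 0; j < n; ++j) {
        for (int i = j; i < n; ++i) {           // lower triangle: i >= j
            double sum = 0.0;
            for (int l = 0; l < k; ++l)
                sum += opA(i, l) * opA(j, l);
            C[i + j * ldc] = alpha * sum + beta * C[i + j * ldc];
        }
    }
}
```

The same stored buffer produces different results for the two settings, because the transposed case treats the stored matrix as k by n.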
-
hipblasStatus_t hipblasSsyrkBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const float *alpha, const float *const AP[], int lda, const float *beta, float *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasDsyrkBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const double *alpha, const double *const AP[], int lda, const double *beta, double *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasCsyrkBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipComplex *alpha, const hipComplex *const AP[], int lda, const hipComplex *beta, hipComplex *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasZsyrkBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *const AP[], int lda, const hipDoubleComplex *beta, hipDoubleComplex *const CP[], int ldc, int batchCount)#
BLAS Level 3 API
The syrkBatched functions perform a batch of the matrix-matrix operations for a symmetric rank-k update:
C_i := alpha*op( A_i )*op( A_i )^T + beta*C_i
where alpha and beta are scalars, op(A_i) is an n by k matrix, and C_i is a symmetric n by n matrix stored as either upper or lower.
op( A_i ) = A_i, and A_i is n by k if transA == HIPBLAS_OP_N
op( A_i ) = A_i^T and A_i is k by n if transA == HIPBLAS_OP_T
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_T: op(A) = A^T
HIPBLAS_OP_N: op(A) = A
HIPBLAS_OP_C: op(A) = A^T
HIPBLAS_OP_C is not supported for complex types. See cherk and zherk.
n – [in] [int] n specifies the number of rows and columns of C_i. n >= 0.
k – [in] [int] k specifies the number of columns of op(A). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A_i is not referenced and does not need to be set before entry.
AP – [in] device array of device pointers storing each matrix A_i of dimension (lda, k) when transA is HIPBLAS_OP_N. Otherwise, of dimension (lda, n).
lda – [in] [int] lda specifies the first dimension of A_i. If transA = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
beta – [in] beta specifies the scalar beta. When beta is zero, then C_i does not need to be set before entry.
CP – [inout] device array of device pointers storing each matrix C_i on the GPU.
ldc – [in] [int] ldc specifies the first dimension of C_i. ldc >= max( 1, n ).
batchCount – [in] [int] number of instances in the batch.
The syrkBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSsyrkStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const float *alpha, const float *AP, int lda, hipblasStride strideA, const float *beta, float *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasDsyrkStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const double *alpha, const double *AP, int lda, hipblasStride strideA, const double *beta, double *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasCsyrkStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipComplex *alpha, const hipComplex *AP, int lda, hipblasStride strideA, const hipComplex *beta, hipComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasZsyrkStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, hipblasStride strideA, const hipDoubleComplex *beta, hipDoubleComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
BLAS Level 3 API
The syrkStridedBatched functions perform a batch of the matrix-matrix operations for a symmetric rank-k update:
C_i := alpha*op( A_i )*op( A_i )^T + beta*C_i
where alpha and beta are scalars, op(A_i) is an n by k matrix, and C_i is a symmetric n by n matrix stored as either upper or lower.
op( A_i ) = A_i, and A_i is n by k if transA == HIPBLAS_OP_N
op( A_i ) = A_i^T and A_i is k by n if transA == HIPBLAS_OP_T
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_T: op(A) = A^T
HIPBLAS_OP_N: op(A) = A
HIPBLAS_OP_C: op(A) = A^T
HIPBLAS_OP_C is not supported for complex types. See cherk and zherk.
n – [in] [int] n specifies the number of rows and columns of C_i. n >= 0.
k – [in] [int] k specifies the number of columns of op(A). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and does not need to be set before entry.
AP – [in] Device pointer to the first matrix A_1 on the GPU of dimension (lda, k) when transA is HIPBLAS_OP_N. Otherwise, of dimension (lda, n).
lda – [in] [int] lda specifies the first dimension of A_i. If transA = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
beta – [in] beta specifies the scalar beta. When beta is zero, then C_i does not need to be set before entry.
CP – [inout] Device pointer to the first matrix C_1 on the GPU.
ldc – [in] [int] ldc specifies the first dimension of C_i. ldc >= max( 1, n ).
strideC – [in] [hipblasStride] stride from the start of one matrix (C_i) to the next one (C_i+1).
batchCount – [in] [int] number of instances in the batch.
The syrkStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
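Because each C_i is stored as either upper or lower, code that wants the full symmetric matrix has to fill in the other triangle itself. The following host-side sketch does this for a strided batch after the results have been copied back. `mirror_upper_to_lower` is a hypothetical helper, and it assumes column-major storage with strideC counted in elements.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Copies the upper triangle of every n by n matrix in a strided batch into
// its lower triangle, so each C_i can be read as a full symmetric matrix.
// Hypothetical helper; column-major storage, strideC in elements (assumption).
void mirror_upper_to_lower(int n, std::vector<float>& C, int ldc,
                           std::int64_t strideC, int batchCount)
{
    assert(ldc >= n);
    for (int b = 0; b < batchCount; ++b) {
        float* Ci = C.data() + b * strideC;      // start of matrix C_b
        for (int j = 0; j < n; ++j)
            for (int i = j + 1; i < n; ++i)
                Ci[i + j * ldc] = Ci[j + i * ldc];   // C(i,j) = C(j,i)
    }
}
```

This applies to the case where uplo was HIPBLAS_FILL_MODE_UPPER. For the lower-fill case the copy direction would be reversed.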
hipblasXsyr2k + Batched, StridedBatched#
-
hipblasStatus_t hipblasSsyr2k(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const float *alpha, const float *AP, int lda, const float *BP, int ldb, const float *beta, float *CP, int ldc)#
-
hipblasStatus_t hipblasDsyr2k(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const double *alpha, const double *AP, int lda, const double *BP, int ldb, const double *beta, double *CP, int ldc)#
-
hipblasStatus_t hipblasCsyr2k(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipComplex *alpha, const hipComplex *AP, int lda, const hipComplex *BP, int ldb, const hipComplex *beta, hipComplex *CP, int ldc)#
-
hipblasStatus_t hipblasZsyr2k(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, const hipDoubleComplex *BP, int ldb, const hipDoubleComplex *beta, hipDoubleComplex *CP, int ldc)#
BLAS Level 3 API
The syr2k functions perform one of the matrix-matrix operations for a symmetric rank-2k update:
C := alpha*(op( A )*op( B )^T + op( B )*op( A )^T) + beta*C
where alpha and beta are scalars, op(A) and op(B) are n by k matrices, and C is a symmetric n by n matrix stored as either upper or lower.
op( A ) = A, op( B ) = B, and A and B are n by k if transA == HIPBLAS_OP_N
op( A ) = A^T, op( B ) = B^T, and A and B are k by n if transA == HIPBLAS_OP_T
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_T: op( A ) = A^T, op( B ) = B^T
HIPBLAS_OP_N: op( A ) = A, op( B ) = B
n – [in] [int] n specifies the number of rows and columns of C. n >= 0.
k – [in] [int] k specifies the number of columns of op(A) and op(B). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and does not need to be set before entry.
AP – [in] pointer storing matrix A on the GPU. Matrix dimension is ( lda, k ) when transA = HIPBLAS_OP_N. Otherwise, ( lda, n ).
lda – [in] [int] lda specifies the first dimension of A. If transA = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
BP – [in] pointer storing matrix B on the GPU. Matrix dimension is ( ldb, k ) when transA = HIPBLAS_OP_N. Otherwise, ( ldb, n ).
ldb – [in] [int] ldb specifies the first dimension of B. If transA = HIPBLAS_OP_N, ldb >= max( 1, n ). Otherwise, ldb >= max( 1, k ).
beta – [in] beta specifies the scalar beta. When beta is zero, then C does not need to be set before entry.
CP – [inout] pointer storing matrix C on the GPU.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, n ).
The syr2k functions support the 64-bit integer interface. See the ILP64 interfaces section.
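As a cross-check of the rank-2k formula, here is a minimal host-side reference sketch. It is not hipBLAS code: `syr2k_upper_ref` is a hypothetical helper that covers only transA == HIPBLAS_OP_N with uplo == HIPBLAS_FILL_MODE_UPPER on column-major float data.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side reference (not a hipBLAS function) for
// transA == N, uplo == UPPER:
//   C := alpha*(A*B^T + B*A^T) + beta*C
// A and B are n by k, C is n by n, all stored column-major.
void syr2k_upper_ref(int n, int k, float alpha,
                     const std::vector<float>& A, int lda,
                     const std::vector<float>& B, int ldb,
                     float beta, std::vector<float>& C, int ldc)
{
    assert(lda >= n && ldb >= n && ldc >= n);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i <= j; ++i) {          // upper triangle only
            float sum = 0.0f;
            for (int l = 0; l < k; ++l)
                sum += A[i + l * lda] * B[j + l * ldb]
                     + B[i + l * ldb] * A[j + l * lda];
            C[i + j * ldc] = alpha * sum + beta * C[i + j * ldc];
        }
    }
}
```

Unlike the Hermitian her2k case, a single alpha scales both products and no conjugation is involved, so the update stays symmetric for complex types as well.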
-
hipblasStatus_t hipblasSsyr2kBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const float *alpha, const float *const AP[], int lda, const float *const BP[], int ldb, const float *beta, float *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasDsyr2kBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const double *alpha, const double *const AP[], int lda, const double *const BP[], int ldb, const double *beta, double *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasCsyr2kBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipComplex *alpha, const hipComplex *const AP[], int lda, const hipComplex *const BP[], int ldb, const hipComplex *beta, hipComplex *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasZsyr2kBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *const AP[], int lda, const hipDoubleComplex *const BP[], int ldb, const hipDoubleComplex *beta, hipDoubleComplex *const CP[], int ldc, int batchCount)#
BLAS Level 3 API
The syr2kBatched functions perform a batch of the matrix-matrix operations for a symmetric rank-2k update:
C_i := alpha*(op( A_i )*op( B_i )^T + op( B_i )*op( A_i )^T) + beta*C_i
where alpha and beta are scalars, op(A_i) and op(B_i) are n by k matrices, and C_i is a symmetric n by n matrix stored as either upper or lower.
op( A_i ) = A_i, op( B_i ) = B_i, and A_i and B_i are n by k if transA == HIPBLAS_OP_N
op( A_i ) = A_i^T, op( B_i ) = B_i^T, and A_i and B_i are k by n if transA == HIPBLAS_OP_T
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_T: op( A_i ) = A_i^T, op( B_i ) = B_i^T
HIPBLAS_OP_N: op( A_i ) = A_i, op( B_i ) = B_i
n – [in] [int] n specifies the number of rows and columns of C_i. n >= 0.
k – [in] [int] k specifies the number of columns of op(A). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and does not need to be set before entry.
AP – [in] device array of device pointers storing each matrix A_i of dimension (lda, k) when transA is HIPBLAS_OP_N. Otherwise, of dimension (lda, n).
lda – [in] [int] lda specifies the first dimension of A_i. If transA = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
BP – [in] device array of device pointers storing each matrix B_i of dimension (ldb, k) when transA is HIPBLAS_OP_N. Otherwise, of dimension (ldb, n).
ldb – [in] [int] ldb specifies the first dimension of B_i. If transA = HIPBLAS_OP_N, ldb >= max( 1, n ). Otherwise, ldb >= max( 1, k ).
beta – [in] beta specifies the scalar beta. When beta is zero, then C_i does not need to be set before entry.
CP – [inout] device array of device pointers storing each matrix C_i on the GPU.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, n ).
batchCount – [in] [int] number of instances in the batch.
The syr2kBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSsyr2kStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const float *alpha, const float *AP, int lda, hipblasStride strideA, const float *BP, int ldb, hipblasStride strideB, const float *beta, float *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasDsyr2kStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const double *alpha, const double *AP, int lda, hipblasStride strideA, const double *BP, int ldb, hipblasStride strideB, const double *beta, double *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasCsyr2kStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipComplex *alpha, const hipComplex *AP, int lda, hipblasStride strideA, const hipComplex *BP, int ldb, hipblasStride strideB, const hipComplex *beta, hipComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasZsyr2kStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, hipblasStride strideA, const hipDoubleComplex *BP, int ldb, hipblasStride strideB, const hipDoubleComplex *beta, hipDoubleComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
BLAS Level 3 API
The syr2kStridedBatched functions perform a batch of the matrix-matrix operations for a symmetric rank-2k update:
C_i := alpha*(op( A_i )*op( B_i )^T + op( B_i )*op( A_i )^T) + beta*C_i
where alpha and beta are scalars, op(A_i) and op(B_i) are n by k matrices, and C_i is a symmetric n by n matrix stored as either upper or lower.
op( A_i ) = A_i, op( B_i ) = B_i, and A_i and B_i are n by k if trans == HIPBLAS_OP_N
op( A_i ) = A_i^T, op( B_i ) = B_i^T, and A_i and B_i are k by n if trans == HIPBLAS_OP_T
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_T: op( A_i ) = A_i^T, op( B_i ) = B_i^T
HIPBLAS_OP_N: op( A_i ) = A_i, op( B_i ) = B_i
n – [in] [int] n specifies the number of rows and columns of C_i. n >= 0.
k – [in] [int] k specifies the number of columns of op(A). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and does not need to be set before entry.
AP – [in] Device pointer to the first matrix A_1 on the GPU of dimension (lda, k) when trans is HIPBLAS_OP_N. Otherwise, of dimension (lda, n).
lda – [in] [int] lda specifies the first dimension of A_i. If trans = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
BP – [in] Device pointer to the first matrix B_1 on the GPU of dimension (ldb, k) when trans is HIPBLAS_OP_N. Otherwise, of dimension (ldb, n).
ldb – [in] [int] ldb specifies the first dimension of B_i. If trans = HIPBLAS_OP_N, ldb >= max( 1, n ). Otherwise, ldb >= max( 1, k ).
strideB – [in] [hipblasStride] stride from the start of one matrix (B_i) to the next one (B_i+1).
beta – [in] beta specifies the scalar beta. When beta is zero, then C does not need to be set before entry.
CP – [inout] Device pointer to the first matrix C_1 on the GPU.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, n ).
strideC – [in] [hipblasStride] stride from the start of one matrix (C_i) to the next one (C_i+1).
batchCount – [in] [int] number of instances in the batch.
The syr2kStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
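The following is a minimal CPU sketch of the update described above, intended only to make the stride addressing and the one-triangle update concrete. It is not the hipBLAS or rocBLAS implementation. The function name ref_ssyr2k_strided_batched is hypothetical, std::int64_t stands in for hipblasStride (assumed to be a 64-bit integer), a bool replaces hipblasFillMode_t, and only the HIPBLAS_OP_N case with float data in column-major storage is covered.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical CPU reference, not part of hipBLAS. Covers float data with
// transA == HIPBLAS_OP_N only; all matrices are column-major.
// C_i := alpha*(A_i*B_i^T + B_i*A_i^T) + beta*C_i, one triangle of C_i only.
void ref_ssyr2k_strided_batched(bool upper, int n, int k, float alpha,
                                const float* A, int lda, std::int64_t strideA,
                                const float* B, int ldb, std::int64_t strideB,
                                float beta,
                                float* C, int ldc, std::int64_t strideC,
                                int batchCount)
{
    const int minLd = n > 1 ? n : 1;
    assert(lda >= minLd && ldb >= minLd && ldc >= minLd);
    for (int b = 0; b < batchCount; ++b) {
        // Matrix number b starts b*stride elements after the first matrix.
        const float* Ab = A + b * strideA;
        const float* Bb = B + b * strideB;
        float*       Cb = C + b * strideC;
        for (int j = 0; j < n; ++j) {
            const int first = upper ? 0 : j;      // rows touched in column j
            const int last  = upper ? j : n - 1;
            for (int i = first; i <= last; ++i) {
                float sum = 0.0f;
                for (int p = 0; p < k; ++p)
                    sum += Ab[i + p * lda] * Bb[j + p * ldb]
                         + Bb[i + p * ldb] * Ab[j + p * lda];
                float& c = Cb[i + j * ldc];
                // With beta == 0 the old value of C is never read.
                c = alpha * sum + (beta == 0.0f ? 0.0f : beta * c);
            }
        }
    }
}
```

Elements outside the selected triangle of each C_i are left untouched in this sketch, which is why a caller cannot rely on them after the call.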
hipblasXsyrkx + Batched, StridedBatched#
-
hipblasStatus_t hipblasSsyrkx(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const float *alpha, const float *AP, int lda, const float *BP, int ldb, const float *beta, float *CP, int ldc)#
-
hipblasStatus_t hipblasDsyrkx(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const double *alpha, const double *AP, int lda, const double *BP, int ldb, const double *beta, double *CP, int ldc)#
-
hipblasStatus_t hipblasCsyrkx(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipComplex *alpha, const hipComplex *AP, int lda, const hipComplex *BP, int ldb, const hipComplex *beta, hipComplex *CP, int ldc)#
-
hipblasStatus_t hipblasZsyrkx(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, const hipDoubleComplex *BP, int ldb, const hipDoubleComplex *beta, hipDoubleComplex *CP, int ldc)#
BLAS Level 3 API
The syrkx functions perform one of the matrix-matrix operations for a symmetric rank-k update:
C := alpha*op( A )*op( B )^T + beta*C
where alpha and beta are scalars, op(A) and op(B) are n by k matrices, and C is a symmetric n by n matrix stored as either upper or lower. This routine should only be used when the caller can guarantee that the result of op( A )*op( B )^T will be symmetric.
op( A ) = A, op( B ) = B, and A and B are n by k if trans == HIPBLAS_OP_N
op( A ) = A^T, op( B ) = B^T, and A and B are k by n if trans == HIPBLAS_OP_T
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_T: op( A ) = A^T, op( B ) = B^T
HIPBLAS_OP_N: op( A ) = A, op( B ) = B
n – [in] [int] n specifies the number of rows and columns of C. n >= 0.
k – [in] [int] k specifies the number of columns of op(A) and op(B). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and does not need to be set before entry.
AP – [in] pointer storing matrix A on the GPU. Matrix dimension is ( lda, k ) when trans = HIPBLAS_OP_N. Otherwise, (lda, n). Only the upper/lower triangular part is accessed.
lda – [in] [int] lda specifies the first dimension of A. If trans = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
BP – [in] pointer storing matrix B on the GPU. Matrix dimension is ( ldb, k ) when trans = HIPBLAS_OP_N. Otherwise, (ldb, n). Only the upper/lower triangular part is accessed.
ldb – [in] [int] ldb specifies the first dimension of B. If trans = HIPBLAS_OP_N, ldb >= max( 1, n ). Otherwise, ldb >= max( 1, k ).
beta – [in] beta specifies the scalar beta. When beta is zero, then C does not need to be set before entry.
CP – [inout] pointer storing matrix C on the GPU.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, n ).
The syrkx functions support the 64-bit integer interface. See the ILP64 interfaces section.
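As an illustration of the operation and of how the trans setting changes the way A and B are indexed, here is a small CPU sketch. It is an assumption-laden reference, not the library's code: the names op_at and ref_ssyrkx are hypothetical, bools replace the hipBLAS enums, and only float data in column-major storage is handled.

```cpp
#include <cassert>

// Element (i, p) of op(X) for a column-major X with leading dimension ld.
// With trans == true, op(X) = X^T, so the stored element is X(p, i).
static float op_at(const float* X, int ld, bool trans, int i, int p)
{
    return trans ? X[p + i * ld] : X[i + p * ld];
}

// Hypothetical CPU reference for C := alpha*op(A)*op(B)^T + beta*C.
// Only the selected triangle of C is written.
void ref_ssyrkx(bool upper, bool trans, int n, int k, float alpha,
                const float* A, int lda, const float* B, int ldb,
                float beta, float* C, int ldc)
{
    assert(ldc >= (n > 1 ? n : 1));
    for (int j = 0; j < n; ++j) {
        const int first = upper ? 0 : j;
        const int last  = upper ? j : n - 1;
        for (int i = first; i <= last; ++i) {
            float sum = 0.0f;
            for (int p = 0; p < k; ++p)
                sum += op_at(A, lda, trans, i, p) * op_at(B, ldb, trans, j, p);
            float& c = C[i + j * ldc];
            c = alpha * sum + (beta == 0.0f ? 0.0f : beta * c);
        }
    }
}
```

Because only one triangle is computed, the result represents the full product only when op( A )*op( B )^T is symmetric, which is the guarantee the caller is asked to provide. Passing the same matrix for A and B is the simplest case that satisfies it.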
-
hipblasStatus_t hipblasSsyrkxBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const float *alpha, const float *const AP[], int lda, const float *const BP[], int ldb, const float *beta, float *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasDsyrkxBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const double *alpha, const double *const AP[], int lda, const double *const BP[], int ldb, const double *beta, double *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasCsyrkxBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipComplex *alpha, const hipComplex *const AP[], int lda, const hipComplex *const BP[], int ldb, const hipComplex *beta, hipComplex *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasZsyrkxBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *const AP[], int lda, const hipDoubleComplex *const BP[], int ldb, const hipDoubleComplex *beta, hipDoubleComplex *const CP[], int ldc, int batchCount)#
BLAS Level 3 API
The syrkxBatched functions perform a batch of the matrix-matrix operations for a symmetric rank-k update:
C_i := alpha*op( A_i )*op( B_i )^T + beta*C_i
where alpha and beta are scalars, op(A_i) and op(B_i) are n by k matrices, and C_i is a symmetric n by n matrix stored as either upper or lower. This routine should only be used when the caller can guarantee that the result of op( A_i )*op( B_i )^T will be symmetric.
op( A_i ) = A_i, op( B_i ) = B_i, and A_i and B_i are n by k if trans == HIPBLAS_OP_N
op( A_i ) = A_i^T, op( B_i ) = B_i^T, and A_i and B_i are k by n if trans == HIPBLAS_OP_T
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_T: op( A_i ) = A_i^T, op( B_i ) = B_i^T
HIPBLAS_OP_N: op( A_i ) = A_i, op( B_i ) = B_i
n – [in] [int] n specifies the number of rows and columns of C_i. n >= 0.
k – [in] [int] k specifies the number of columns of op(A). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and A does not need to be set before entry.
AP – [in] device array of device pointers storing each matrix_i A of dimension (lda, k) when trans is HIPBLAS_OP_N. Otherwise, of dimension (lda, n).
lda – [in] [int] lda specifies the first dimension of A_i. If trans = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
BP – [in] device array of device pointers storing each matrix_i B of dimension (ldb, k) when trans is HIPBLAS_OP_N. Otherwise, of dimension (ldb, n).
ldb – [in] [int] ldb specifies the first dimension of B. If trans = HIPBLAS_OP_N, ldb >= max( 1, n ). Otherwise, ldb >= max( 1, k ).
beta – [in] beta specifies the scalar beta. When beta is zero, then C does not need to be set before entry.
CP – [inout] device array of device pointers storing each matrix C_i on the GPU.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, n ).
batchCount – [in] [int] number of instances in the batch.
The syrkxBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSsyrkxStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const float *alpha, const float *AP, int lda, hipblasStride strideA, const float *BP, int ldb, hipblasStride strideB, const float *beta, float *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasDsyrkxStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const double *alpha, const double *AP, int lda, hipblasStride strideA, const double *BP, int ldb, hipblasStride strideB, const double *beta, double *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasCsyrkxStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipComplex *alpha, const hipComplex *AP, int lda, hipblasStride strideA, const hipComplex *BP, int ldb, hipblasStride strideB, const hipComplex *beta, hipComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasZsyrkxStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, hipblasStride strideA, const hipDoubleComplex *BP, int ldb, hipblasStride strideB, const hipDoubleComplex *beta, hipDoubleComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
BLAS Level 3 API
The syrkxStridedBatched functions perform a batch of the matrix-matrix operations for a symmetric rank-k update:
C_i := alpha*op( A_i )*op( B_i )^T + beta*C_i
where alpha and beta are scalars, op(A_i) and op(B_i) are n by k matrices, and C_i is a symmetric n by n matrix stored as either upper or lower. This routine should only be used when the caller can guarantee that the result of op( A_i )*op( B_i )^T will be symmetric.
op( A_i ) = A_i, op( B_i ) = B_i, and A_i and B_i are n by k if trans == HIPBLAS_OP_N
op( A_i ) = A_i^T, op( B_i ) = B_i^T, and A_i and B_i are k by n if trans == HIPBLAS_OP_T
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: C_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: C_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_T: op( A_i ) = A_i^T, op( B_i ) = B_i^T
HIPBLAS_OP_N: op( A_i ) = A_i, op( B_i ) = B_i
n – [in] [int] n specifies the number of rows and columns of C_i. n >= 0.
k – [in] [int] k specifies the number of columns of op(A). k >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and does not need to be set before entry.
AP – [in] Device pointer to the first matrix A_1 on the GPU of dimension (lda, k) when trans is HIPBLAS_OP_N. Otherwise, of dimension (lda, n).
lda – [in] [int] lda specifies the first dimension of A_i. If trans = HIPBLAS_OP_N, lda >= max( 1, n ). Otherwise, lda >= max( 1, k ).
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
BP – [in] Device pointer to the first matrix B_1 on the GPU of dimension (ldb, k) when trans is HIPBLAS_OP_N. Otherwise, of dimension (ldb, n).
ldb – [in] [int] ldb specifies the first dimension of B_i. If trans = HIPBLAS_OP_N, ldb >= max( 1, n ). Otherwise, ldb >= max( 1, k ).
strideB – [in] [hipblasStride] stride from the start of one matrix (B_i) to the next one (B_i+1).
beta – [in] beta specifies the scalar beta. When beta is zero, then C does not need to be set before entry.
CP – [inout] Device pointer to the first matrix C_1 on the GPU.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, n ).
strideC – [in] [hipblasStride] stride from the start of one matrix (C_i) to the next one (C_i+1).
batchCount – [in] [int] number of instances in the batch.
The syrkxStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXgeam + Batched, StridedBatched#
-
hipblasStatus_t hipblasSgeam(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, const float *alpha, const float *AP, int lda, const float *beta, const float *BP, int ldb, float *CP, int ldc)#
-
hipblasStatus_t hipblasDgeam(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, const double *alpha, const double *AP, int lda, const double *beta, const double *BP, int ldb, double *CP, int ldc)#
-
hipblasStatus_t hipblasCgeam(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, const hipComplex *alpha, const hipComplex *AP, int lda, const hipComplex *beta, const hipComplex *BP, int ldb, hipComplex *CP, int ldc)#
-
hipblasStatus_t hipblasZgeam(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, const hipDoubleComplex *beta, const hipDoubleComplex *BP, int ldb, hipDoubleComplex *CP, int ldc)#
BLAS Level 3 API
The geam functions perform one of the matrix-matrix operations:
C = alpha*op( A ) + beta*op( B ),
where op( X ) is one of
op( X ) = X or op( X ) = X**T or op( X ) = X**H,
alpha and beta are scalars, and A, B and C are matrices, with op( A ) an m by n matrix, op( B ) an m by n matrix, and C an m by n matrix.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
transA – [in] [hipblasOperation_t] specifies the form of op( A ).
transB – [in] [hipblasOperation_t] specifies the form of op( B ).
m – [in] [int] matrix dimension m.
n – [in] [int] matrix dimension n.
alpha – [in] device pointer or host pointer specifying the scalar alpha.
AP – [in] device pointer storing matrix A.
lda – [in] [int] specifies the leading dimension of A.
beta – [in] device pointer or host pointer specifying the scalar beta.
BP – [in] device pointer storing matrix B.
ldb – [in] [int] specifies the leading dimension of B.
CP – [inout] device pointer storing matrix C.
ldc – [in] [int] specifies the leading dimension of C.
The geam functions support the 64-bit integer interface. See the ILP64 interfaces section.
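One common use of this operation is an out-of-place transpose, obtained with alpha = 1, beta = 0, and transA set to transpose. The CPU sketch below shows the arithmetic for float data in column-major storage. It is only an illustration: op_elem and ref_sgeam are hypothetical names, bools stand in for hipblasOperation_t, and the conjugate-transpose case is omitted because it coincides with the plain transpose for real data.

```cpp
#include <cassert>

// Element (i, j) of op(X) for a column-major X with leading dimension ld.
static float op_elem(const float* X, int ld, bool trans, int i, int j)
{
    return trans ? X[j + i * ld] : X[i + j * ld];
}

// Hypothetical CPU reference for C = alpha*op(A) + beta*op(B),
// where op(A), op(B), and C are all m by n.
void ref_sgeam(bool transA, bool transB, int m, int n,
               float alpha, const float* A, int lda,
               float beta, const float* B, int ldb,
               float* C, int ldc)
{
    assert(ldc >= (m > 1 ? m : 1));
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i)
            C[i + j * ldc] = alpha * op_elem(A, lda, transA, i, j)
                           + beta  * op_elem(B, ldb, transB, i, j);
}
```

When transA is set, A is stored as an n by m array, so its leading dimension must be at least n rather than m. The same holds for B with transB.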
-
hipblasStatus_t hipblasSgeamBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, const float *alpha, const float *const AP[], int lda, const float *beta, const float *const BP[], int ldb, float *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasDgeamBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, const double *alpha, const double *const AP[], int lda, const double *beta, const double *const BP[], int ldb, double *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasCgeamBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, const hipComplex *alpha, const hipComplex *const AP[], int lda, const hipComplex *beta, const hipComplex *const BP[], int ldb, hipComplex *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasZgeamBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *const AP[], int lda, const hipDoubleComplex *beta, const hipDoubleComplex *const BP[], int ldb, hipDoubleComplex *const CP[], int ldc, int batchCount)#
BLAS Level 3 API
The geamBatched functions perform one of the batched matrix-matrix operations:
C_i = alpha*op( A_i ) + beta*op( B_i ) for i = 0, 1, ... batchCount - 1
where alpha and beta are scalars, op(A_i), op(B_i), and C_i are m by n matrices, and op( X ) is one of:
op( X ) = X or op( X ) = X**T
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
transA – [in] [hipblasOperation_t] specifies the form of op( A ).
transB – [in] [hipblasOperation_t] specifies the form of op( B ).
m – [in] [int] matrix dimension m.
n – [in] [int] matrix dimension n.
alpha – [in] device pointer or host pointer specifying the scalar alpha.
AP – [in] device array of device pointers storing each matrix A_i on the GPU. Each A_i is of dimension ( lda, k ), where k is n when transA == HIPBLAS_OP_N and is m when transA == HIPBLAS_OP_T.
lda – [in] [int] specifies the leading dimension of A.
beta – [in] device pointer or host pointer specifying the scalar beta.
BP – [in] device array of device pointers storing each matrix B_i on the GPU. Each B_i is of dimension ( ldb, k ), where k is n when transB == HIPBLAS_OP_N and is m when transB == HIPBLAS_OP_T.
ldb – [in] [int] specifies the leading dimension of B.
CP – [inout] device array of device pointers storing each matrix C_i on the GPU. Each C_i is of dimension ( ldc, n ).
ldc – [in] [int] specifies the leading dimension of C.
batchCount – [in] [int] number of instances i in the batch.
The geamBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSgeamStridedBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, const float *alpha, const float *AP, int lda, hipblasStride strideA, const float *beta, const float *BP, int ldb, hipblasStride strideB, float *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasDgeamStridedBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, const double *alpha, const double *AP, int lda, hipblasStride strideA, const double *beta, const double *BP, int ldb, hipblasStride strideB, double *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasCgeamStridedBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, const hipComplex *alpha, const hipComplex *AP, int lda, hipblasStride strideA, const hipComplex *beta, const hipComplex *BP, int ldb, hipblasStride strideB, hipComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasZgeamStridedBatched(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, hipblasStride strideA, const hipDoubleComplex *beta, const hipDoubleComplex *BP, int ldb, hipblasStride strideB, hipDoubleComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
BLAS Level 3 API
The geamStridedBatched functions perform one of the batched matrix-matrix operations:
C_i = alpha*op( A_i ) + beta*op( B_i ) for i = 0, 1, ... batchCount - 1
where alpha and beta are scalars, op(A_i), op(B_i), and C_i are m by n matrices, and op( X ) is one of:
op( X ) = X or op( X ) = X**T
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
transA – [in] [hipblasOperation_t] specifies the form of op( A ).
transB – [in] [hipblasOperation_t] specifies the form of op( B ).
m – [in] [int] matrix dimension m.
n – [in] [int] matrix dimension n.
alpha – [in] device pointer or host pointer specifying the scalar alpha.
AP – [in] device pointer to the first matrix A_0 on the GPU. Each A_i is of dimension ( lda, k ), where k is n when transA == HIPBLAS_OP_N and is m when transA == HIPBLAS_OP_T.
lda – [in] [int] specifies the leading dimension of A.
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
beta – [in] device pointer or host pointer specifying the scalar beta.
BP – [in] device pointer to the first matrix B_0 on the GPU. Each B_i is of dimension ( ldb, k ), where k is n when transB == HIPBLAS_OP_N and is m when transB == HIPBLAS_OP_T.
ldb – [in] [int] specifies the leading dimension of B.
strideB – [in] [hipblasStride] stride from the start of one matrix (B_i) to the next one (B_i+1).
CP – [inout] pointer to the first matrix C_0 on the GPU. Each C_i is of dimension ( ldc, n ).
ldc – [in] [int] specifies the leading dimension of C.
strideC – [in] [hipblasStride] stride from the start of one matrix (C_i) to the next one (C_i+1).
batchCount – [in] [int] number of instances i in the batch.
The geamStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
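The stride parameters can be easier to reason about with the addressing written out. The sketch below is a CPU illustration under stated assumptions: stride_t is a stand-in for hipblasStride (assumed to be a 64-bit integer), and packed_stride, strided_index, and ref_sgeam_strided_batched_nn are hypothetical helpers that handle only the non-transposed float case in column-major storage.

```cpp
#include <cassert>
#include <cstdint>

using stride_t = std::int64_t; // stand-in for hipblasStride (assumed 64-bit)

// Smallest stride that keeps consecutive column-major matrices of
// `cols` columns and leading dimension `ld` from overlapping.
stride_t packed_stride(int ld, int cols)
{
    return static_cast<stride_t>(ld) * cols;
}

// Linear offset of element (i, j) of matrix number `batch`,
// measured in elements from the pointer to the first matrix.
stride_t strided_index(int batch, int i, int j, int ld, stride_t stride)
{
    return batch * stride + i + static_cast<stride_t>(j) * ld;
}

// Hypothetical CPU reference: C_i = alpha*A_i + beta*B_i for every batch
// instance, with transA == transB == HIPBLAS_OP_N.
void ref_sgeam_strided_batched_nn(int m, int n, float alpha,
                                  const float* A, int lda, stride_t strideA,
                                  float beta,
                                  const float* B, int ldb, stride_t strideB,
                                  float* C, int ldc, stride_t strideC,
                                  int batchCount)
{
    for (int b = 0; b < batchCount; ++b)
        for (int j = 0; j < n; ++j)
            for (int i = 0; i < m; ++i)
                C[strided_index(b, i, j, ldc, strideC)] =
                    alpha * A[strided_index(b, i, j, lda, strideA)]
                  + beta  * B[strided_index(b, i, j, ldb, strideB)];
}
```

A stride larger than packed_stride simply leaves unused padding between consecutive matrices. A smaller stride for C would make output matrices overlap, so the sketch assumes the caller avoids that.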
hipblasXhemm + Batched, StridedBatched#
-
hipblasStatus_t hipblasChemm(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int n, int k, const hipComplex *alpha, const hipComplex *AP, int lda, const hipComplex *BP, int ldb, const hipComplex *beta, hipComplex *CP, int ldc)#
-
hipblasStatus_t hipblasZhemm(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, const hipDoubleComplex *BP, int ldb, const hipDoubleComplex *beta, hipDoubleComplex *CP, int ldc)#
BLAS Level 3 API
The hemm functions perform one of the matrix-matrix operations:
C := alpha*A*B + beta*C if side == HIPBLAS_SIDE_LEFT,
C := alpha*B*A + beta*C if side == HIPBLAS_SIDE_RIGHT,
where alpha and beta are scalars, B and C are m by n matrices, and A is a Hermitian matrix stored as either upper or lower.
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: c and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t]
HIPBLAS_SIDE_LEFT: C := alpha*A*B + beta*C
HIPBLAS_SIDE_RIGHT: C := alpha*B*A + beta*C
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A is a lower triangular matrix.
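n – [in] [int] n specifies the number of rows of B and C (written as m in the operation description and in the dimension bounds below). n >= 0.
k – [in] [int] k specifies the number of columns of B and C (written as n in the operation description and in the dimension bounds below). k >= 0.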
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A and B are not referenced.
AP – [in] pointer storing matrix A on the GPU.
A is m by m if side == HIPBLAS_SIDE_LEFT.
A is n by n if side == HIPBLAS_SIDE_RIGHT.
Only the upper/lower triangular part is accessed.
The imaginary component of the diagonal elements is not used.
lda – [in] [int] lda specifies the first dimension of A. If side = HIPBLAS_SIDE_LEFT, lda >= max( 1, m ). Otherwise, lda >= max( 1, n ).
BP – [in] pointer storing matrix B on the GPU. Matrix dimension is m by n.
ldb – [in] [int] ldb specifies the first dimension of B. ldb >= max( 1, m ).
beta – [in] beta specifies the scalar beta. When beta is zero, then C does not need to be set before entry.
CP – [inout] pointer storing matrix C on the GPU. Matrix dimension is m by n.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, m ).
The hemm functions support the 64-bit integer interface. See the ILP64 interfaces section.
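The notes on AP (only one triangle is accessed, and the imaginary part of the diagonal is not used) can be made concrete with a CPU sketch. This is an illustrative reference written for this explanation, not library code: herm_at and ref_chemm are hypothetical names, std::complex<float> stands in for hipComplex, bools replace the side and uplo enums, and m and n denote the rows and columns of B and C as in the operation description.

```cpp
#include <cassert>
#include <complex>

using cf = std::complex<float>; // stand-in for hipComplex

// Element (i, j) of the Hermitian matrix A, rebuilt from the stored triangle.
// The other triangle is never read; the diagonal is treated as real.
static cf herm_at(const cf* A, int lda, bool upper, int i, int j)
{
    if (i == j)
        return cf(A[i + i * lda].real(), 0.0f);
    const bool stored = upper ? (i < j) : (i > j);
    return stored ? A[i + j * lda] : std::conj(A[j + i * lda]);
}

// Hypothetical CPU reference for
//   C := alpha*A*B + beta*C (left == true) or C := alpha*B*A + beta*C.
void ref_chemm(bool left, bool upper, int m, int n, cf alpha,
               const cf* A, int lda, const cf* B, int ldb,
               cf beta, cf* C, int ldc)
{
    const int ka = left ? m : n; // A is ka by ka
    assert(lda >= (ka > 1 ? ka : 1));
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            cf sum(0.0f, 0.0f);
            if (left)
                for (int p = 0; p < m; ++p)
                    sum += herm_at(A, lda, upper, i, p) * B[p + j * ldb];
            else
                for (int p = 0; p < n; ++p)
                    sum += B[i + p * ldb] * herm_at(A, lda, upper, p, j);
            cf& c = C[i + j * ldc];
            c = alpha * sum + (beta == cf(0.0f, 0.0f) ? cf(0.0f, 0.0f) : beta * c);
        }
    }
}
```

In this sketch the unused triangle of A and the imaginary parts on its diagonal may hold arbitrary values without affecting the result.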
-
hipblasStatus_t hipblasChemmBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int n, int k, const hipComplex *alpha, const hipComplex *const AP[], int lda, const hipComplex *const BP[], int ldb, const hipComplex *beta, hipComplex *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasZhemmBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *const AP[], int lda, const hipDoubleComplex *const BP[], int ldb, const hipDoubleComplex *beta, hipDoubleComplex *const CP[], int ldc, int batchCount)#
BLAS Level 3 API
The hemmBatched functions perform a batch of the matrix-matrix operations:
C_i := alpha*A_i*B_i + beta*C_i if side == HIPBLAS_SIDE_LEFT,
C_i := alpha*B_i*A_i + beta*C_i if side == HIPBLAS_SIDE_RIGHT,
where alpha and beta are scalars, B_i and C_i are m by n matrices, and A_i is a Hermitian matrix stored as either upper or lower.
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t]
HIPBLAS_SIDE_LEFT: C_i := alpha*A_i*B_i + beta*C_i
HIPBLAS_SIDE_RIGHT: C_i := alpha*B_i*A_i + beta*C_i
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A_i is a lower triangular matrix.
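n – [in] [int] n specifies the number of rows of B_i and C_i (written as m in the operation description and in the dimension bounds below). n >= 0.
k – [in] [int] k specifies the number of columns of B_i and C_i (written as n in the operation description and in the dimension bounds below). k >= 0.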
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A_i and B_i are not referenced.
AP – [in] device array of device pointers storing each matrix A_i on the GPU.
A_i is m by m if side == HIPBLAS_SIDE_LEFT.
A_i is n by n if side == HIPBLAS_SIDE_RIGHT.
Only the upper/lower triangular part is accessed.
The imaginary component of the diagonal elements is not used.
lda – [in] [int] lda specifies the first dimension of A_i. If side = HIPBLAS_SIDE_LEFT, lda >= max( 1, m ). Otherwise, lda >= max( 1, n ).
BP – [in] device array of device pointers storing each matrix B_i on the GPU. Matrix dimension is m by n.
ldb – [in] [int] ldb specifies the first dimension of B_i. ldb >= max( 1, m ).
beta – [in] beta specifies the scalar beta. When beta is zero, then C_i need not be set before entry.
CP – [inout] device array of device pointers storing each matrix C_i on the GPU. Matrix dimension is m by n.
ldc – [in] [int] ldc specifies the first dimension of C_i. ldc >= max( 1, m ).
batchCount – [in] [int] number of instances in the batch.
The hemmBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasChemmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int n, int k, const hipComplex *alpha, const hipComplex *AP, int lda, hipblasStride strideA, const hipComplex *BP, int ldb, hipblasStride strideB, const hipComplex *beta, hipComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasZhemmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, int n, int k, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, hipblasStride strideA, const hipDoubleComplex *BP, int ldb, hipblasStride strideB, const hipDoubleComplex *beta, hipDoubleComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
BLAS Level 3 API
The hemmStridedBatched functions perform a batch of the matrix-matrix operations:
C_i := alpha*A_i*B_i + beta*C_i if side == HIPBLAS_SIDE_LEFT,
C_i := alpha*B_i*A_i + beta*C_i if side == HIPBLAS_SIDE_RIGHT,
where alpha and beta are scalars, B_i and C_i are m by n matrices, and A_i is a Hermitian matrix stored as either upper or lower.
Supported precisions in rocBLAS: c and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t]
HIPBLAS_SIDE_LEFT: C_i := alpha*A_i*B_i + beta*C_i
HIPBLAS_SIDE_RIGHT: C_i := alpha*B_i*A_i + beta*C_i
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A_i is a lower triangular matrix.
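n – [in] [int] n specifies the number of rows of B_i and C_i (written as m in the operation description and in the dimension bounds below). n >= 0.
k – [in] [int] k specifies the number of columns of B_i and C_i (written as n in the operation description and in the dimension bounds below). k >= 0.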
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A_i and B_i are not referenced.
AP – [in] device pointer to the first matrix A_1 on the GPU.
A_i is m by m if side == HIPBLAS_SIDE_LEFT.
A_i is n by n if side == HIPBLAS_SIDE_RIGHT.
Only the upper/lower triangular part is accessed.
The imaginary component of the diagonal elements is not used.
lda – [in] [int] lda specifies the first dimension of A_i. If side = HIPBLAS_SIDE_LEFT, lda >= max( 1, m ). Otherwise, lda >= max( 1, n ).
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
BP – [in] device pointer to first matrix B_1 of dimension (ldb, n) on the GPU.
ldb – [in] [int] ldb specifies the first dimension of B_i. ldb >= max( 1, m ).
strideB – [in] [hipblasStride] stride from the start of one matrix (B_i) to the next one (B_i+1).
beta – [in] beta specifies the scalar beta. When beta is zero, then C does not need to be set before entry.
CP – [inout] device pointer to the first matrix C_1 of dimension (ldc, n) on the GPU.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, m ).
strideC – [in] [hipblasStride] stride from the start of one matrix (C_i) to the next one (C_i+1).
batchCount – [in] [int] number of instances in the batch.
The hemmStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXtrmm + Batched, StridedBatched#
-
hipblasStatus_t hipblasStrmm(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const float *alpha, const float *A, int lda, const float *B, int ldb, float *C, int ldc)#
-
hipblasStatus_t hipblasDtrmm(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const double *alpha, const double *A, int lda, const double *B, int ldb, double *C, int ldc)#
-
hipblasStatus_t hipblasCtrmm(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const hipComplex *alpha, const hipComplex *A, int lda, const hipComplex *B, int ldb, hipComplex *C, int ldc)#
-
hipblasStatus_t hipblasZtrmm(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *A, int lda, const hipDoubleComplex *B, int ldb, hipDoubleComplex *C, int ldc)#
BLAS Level 3 API
The trmm functions perform one of the matrix-matrix operations:
C := alpha*op( A )*B, or C := alpha*B*op( A )
where alpha is a scalar, B and C are m by n matrices, A is a unit, non-unit, upper, or lower triangular matrix, and op( A ) is one of:
op( A ) = A or op( A ) = A^T or op( A ) = A^H.
Note that trmm can provide in-place functionality by passing in the same address for both matrices B and C and setting ldb equal to ldc.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t] Specifies whether op(A) multiplies B from the left or right as follows:
HIPBLAS_SIDE_LEFT: C := alpha*op( A )*B.
HIPBLAS_SIDE_RIGHT: C := alpha*B*op( A ).
uplo – [in] [hipblasFillMode_t] Specifies whether the matrix A is an upper or lower triangular matrix as follows:
HIPBLAS_FILL_MODE_UPPER: A is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A is a lower triangular matrix.
transA – [in] [hipblasOperation_t] Specifies the form of op(A) to be used in the matrix multiplication as follows:
HIPBLAS_OP_N: op(A) = A.
HIPBLAS_OP_T: op(A) = A^T.
HIPBLAS_OP_C: op(A) = A^H.
diag – [in] [hipblasDiagType_t] Specifies whether or not A is unit triangular as follows:
HIPBLAS_DIAG_UNIT: A is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: A is not assumed to be unit triangular.
m – [in] [int] m specifies the number of rows of B and C. m >= 0.
n – [in] [int] n specifies the number of columns of B and C. n >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A is not referenced and B does not need to be set before entry.
A – [in] Device pointer to matrix A on the GPU. A has dimension ( lda, k ), where k is m when side == HIPBLAS_SIDE_LEFT and is n when side == HIPBLAS_SIDE_RIGHT.
When uplo == HIPBLAS_FILL_MODE_UPPER, the leading k by k upper triangular part of the array A must contain the upper triangular matrix and the strictly lower triangular part of A is not referenced.
When uplo == HIPBLAS_FILL_MODE_LOWER, the leading k by k lower triangular part of the array A must contain the lower triangular matrix and the strictly upper triangular part of A is not referenced. Note that when diag == HIPBLAS_DIAG_UNIT, the diagonal elements of A are not referenced either, but are assumed to be unity.
lda – [in] [int] lda specifies the first dimension of A.
If side == HIPBLAS_SIDE_LEFT, lda >= max( 1, m ).
If side == HIPBLAS_SIDE_RIGHT, lda >= max( 1, n ).
B – [in] Device pointer to the matrix B of dimension (ldb, n) on the GPU.
ldb – [in] [int] ldb specifies the first dimension of B. ldb >= max( 1, m ).
C – [out] Device pointer to the matrix C of dimension (ldc, n) on the GPU. Users can pass in the same matrix B to parameter C to achieve in-place functionality for trmm.
ldc – [in] [int] ldc specifies the first dimension of C. ldc >= max( 1, m ).
The trmm functions support the 64-bit integer interface. See the ILP64 interfaces section.
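The trmm semantics can be illustrated with a small CPU reference. The following is only a sketch for one configuration (side == HIPBLAS_SIDE_LEFT, uplo == HIPBLAS_FILL_MODE_UPPER, transA == HIPBLAS_OP_N) on column-major host data. The function name ref_trmm_left_upper and the unit_diag flag are hypothetical names chosen for illustration; this is not how hipBLAS implements trmm.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// CPU reference sketch (hypothetical helper, not part of hipBLAS):
// C := alpha * A * B, where A is an m-by-m upper triangular matrix.
// All matrices are column-major: element (r, c) of A is A[r + c * lda].
void ref_trmm_left_upper(bool unit_diag, int m, int n, float alpha,
                         const std::vector<float>& A, int lda,
                         const std::vector<float>& B, int ldb,
                         std::vector<float>& C, int ldc)
{
    assert(lda >= std::max(1, m) && ldb >= std::max(1, m) && ldc >= std::max(1, m));
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            // With a unit diagonal, A(i, i) is treated as 1 and never read.
            float sum = unit_diag ? B[i + j * ldb]
                                  : A[i + i * lda] * B[i + j * ldb];
            // Row i of an upper triangular A is non-zero only for columns k >= i,
            // so the strictly lower part of A is never referenced.
            for (int k = i + 1; k < m; ++k)
                sum += A[i + k * lda] * B[k + j * ldb];
            C[i + j * ldc] = alpha * sum;
        }
    }
}
```

Because row i of the result only reads rows k >= i of B, this particular loop order also tolerates C aliasing B, which mirrors the in-place usage described above.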
-
hipblasStatus_t hipblasStrmmBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const float *alpha, const float *const A[], int lda, const float *const B[], int ldb, float *const C[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasDtrmmBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const double *alpha, const double *const A[], int lda, const double *const B[], int ldb, double *const C[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasCtrmmBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const hipComplex *alpha, const hipComplex *const A[], int lda, const hipComplex *const B[], int ldb, hipComplex *const C[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasZtrmmBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *const A[], int lda, const hipDoubleComplex *const B[], int ldb, hipDoubleComplex *const C[], int ldc, int batchCount)#
BLAS Level 3 API
The trmmBatched functions perform one of the batched matrix-matrix operations:
C_i := alpha*op( A_i )*B_i, or C_i := alpha*B_i*op( A_i ) for i = 0, 1, ... batchCount -1
where alpha is a scalar, B_i and C_i are m by n matrices, A_i is a unit or non-unit, upper or lower triangular matrix, and op( A_i ) is one of:
op( A_i ) = A_i or op( A_i ) = A_i^T or op( A_i ) = A_i^H.
Note that trmmBatched can provide in-place functionality by passing in the same address for both matrices B and C and by setting ldb equal to ldc.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t] Specifies whether op(A_i) multiplies B_i from the left or right as follows:
HIPBLAS_SIDE_LEFT: C_i := alpha*op( A_i )*B_i.
HIPBLAS_SIDE_RIGHT: C_i := alpha*B_i*op( A_i ).
uplo – [in] [hipblasFillMode_t] Specifies whether the matrix A is an upper or lower triangular matrix as follows:
HIPBLAS_FILL_MODE_UPPER: A is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A is a lower triangular matrix.
transA – [in] [hipblasOperation_t] Specifies the form of op(A_i) to be used in the matrix multiplication as follows:
HIPBLAS_OP_N: op(A_i) = A_i.
HIPBLAS_OP_T: op(A_i) = A_i^T.
HIPBLAS_OP_C: op(A_i) = A_i^H.
diag – [in] [hipblasDiagType_t] Specifies whether or not A_i is unit triangular as follows:
HIPBLAS_DIAG_UNIT: A_i is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: A_i is not assumed to be unit triangular.
m – [in] [int] m specifies the number of rows of B_i and C_i. m >= 0.
n – [in] [int] n specifies the number of columns of B_i and C_i. n >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A_i is not referenced and B_i does not need to be set before entry.
A – [in] Device array of device pointers storing each matrix A_i on the GPU. Each A_i is of dimension ( lda, k ), where k is m when side == HIPBLAS_SIDE_LEFT and is n when side == HIPBLAS_SIDE_RIGHT.
When uplo == HIPBLAS_FILL_MODE_UPPER, the leading k by k upper triangular part of the array A must contain the upper triangular matrix and the strictly lower triangular part of A is not referenced.
When uplo == HIPBLAS_FILL_MODE_LOWER, the leading k by k lower triangular part of the array A must contain the lower triangular matrix and the strictly upper triangular part of A is not referenced.
Note that when diag == HIPBLAS_DIAG_UNIT, the diagonal elements of A_i are not referenced either, but are assumed to be unity.
lda – [in] [int] lda specifies the first dimension of A.
If side == HIPBLAS_SIDE_LEFT, lda >= max( 1, m ).
If side == HIPBLAS_SIDE_RIGHT, lda >= max( 1, n ).
B – [in] device array of device pointers storing each matrix B_i of dimension (ldb, n) on the GPU.
ldb – [in] [int] ldb specifies the first dimension of B_i. ldb >= max( 1, m ).
C – [out] device array of device pointers storing each matrix C_i of dimension (ldc, n) on the GPU. Users can pass in the same matrices B to parameter C to achieve in-place functionality of trmmBatched.
ldc – [in] [int] ldc specifies the first dimension of C_i. ldc >= max( 1, m ).
batchCount – [in] [int] number of instances i in the batch.
The trmmBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
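The Batched variants take arrays of per-matrix pointers rather than a single base pointer. As a rough host-side sketch, such an array can be built from one contiguous allocation as shown below. The helper name make_batch_pointers and its elems_per_matrix parameter are hypothetical and not part of hipBLAS.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not hipBLAS API): builds the per-matrix pointer array
// used by *Batched functions from one contiguous allocation in which
// matrix i starts at base + i * elems_per_matrix.
template <typename T>
std::vector<T*> make_batch_pointers(T* base, std::size_t elems_per_matrix,
                                    int batchCount)
{
    assert(batchCount >= 0);
    std::vector<T*> ptrs(static_cast<std::size_t>(batchCount));
    for (std::size_t i = 0; i < ptrs.size(); ++i)
        ptrs[i] = base + i * elems_per_matrix;
    return ptrs;
}
```

In a real call, base would be a device pointer, and the pointer array itself would also have to be copied to device memory, because A, B, and C are described above as device arrays of device pointers.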
-
hipblasStatus_t hipblasStrmmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const float *alpha, const float *A, int lda, hipblasStride strideA, const float *B, int ldb, hipblasStride strideB, float *C, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasDtrmmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const double *alpha, const double *A, int lda, hipblasStride strideA, const double *B, int ldb, hipblasStride strideB, double *C, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasCtrmmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const hipComplex *alpha, const hipComplex *A, int lda, hipblasStride strideA, const hipComplex *B, int ldb, hipblasStride strideB, hipComplex *C, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasZtrmmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *A, int lda, hipblasStride strideA, const hipDoubleComplex *B, int ldb, hipblasStride strideB, hipDoubleComplex *C, int ldc, hipblasStride strideC, int batchCount)#
BLAS Level 3 API
The trmmStridedBatched functions perform one of the strided-batched matrix-matrix operations:
C_i := alpha*op( A_i )*B_i, or C_i := alpha*B_i*op( A_i ) for i = 0, 1, ... batchCount -1
where alpha is a scalar, B_i and C_i are m by n matrices, A_i is a unit or non-unit, upper or lower triangular matrix, and op( A_i ) is one of:
op( A_i ) = A_i or op( A_i ) = A_i^T or op( A_i ) = A_i^H.
Note that trmmStridedBatched can provide in-place functionality by passing in the same address for both matrices B and C and by setting ldb equal to ldc.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t] Specifies whether op(A_i) multiplies B_i from the left or right as follows:
HIPBLAS_SIDE_LEFT: C_i := alpha*op( A_i )*B_i.
HIPBLAS_SIDE_RIGHT: C_i := alpha*B_i*op( A_i ).
uplo – [in] [hipblasFillMode_t] Specifies whether the matrix A is an upper or lower triangular matrix as follows:
HIPBLAS_FILL_MODE_UPPER: A is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A is a lower triangular matrix.
transA – [in] [hipblasOperation_t] Specifies the form of op(A_i) to be used in the matrix multiplication as follows:
HIPBLAS_OP_N: op(A_i) = A_i.
HIPBLAS_OP_T: op(A_i) = A_i^T.
HIPBLAS_OP_C: op(A_i) = A_i^H.
diag – [in] [hipblasDiagType_t] Specifies whether or not A_i is unit triangular as follows:
HIPBLAS_DIAG_UNIT: A_i is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: A_i is not assumed to be unit triangular.
m – [in] [int] m specifies the number of rows of B_i and C_i. m >= 0.
n – [in] [int] n specifies the number of columns of B_i and C_i. n >= 0.
alpha – [in] alpha specifies the scalar alpha. When alpha is zero, then A_i is not referenced and B_i does not need to be set before entry.
A – [in] Device pointer to the first matrix A_0 on the GPU. Each A_i is of dimension ( lda, k ), where k is m when side == HIPBLAS_SIDE_LEFT and is n when side == HIPBLAS_SIDE_RIGHT.
When uplo == HIPBLAS_FILL_MODE_UPPER, the leading k by k upper triangular part of the array A must contain the upper triangular matrix and the strictly lower triangular part of A is not referenced.
When uplo == HIPBLAS_FILL_MODE_LOWER, the leading k by k lower triangular part of the array A must contain the lower triangular matrix and the strictly upper triangular part of A is not referenced.
Note that when diag == HIPBLAS_DIAG_UNIT, the diagonal elements of A_i are not referenced either, but are assumed to be unity.
lda – [in] [int] lda specifies the first dimension of A.
If side == HIPBLAS_SIDE_LEFT, lda >= max( 1, m ).
If side == HIPBLAS_SIDE_RIGHT, lda >= max( 1, n ).
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
B – [in] Device pointer to the first matrix B_0 on the GPU. Each B_i is of dimension ( ldb, n ).
ldb – [in] [int] ldb specifies the first dimension of B_i. ldb >= max( 1, m ).
strideB – [in] [hipblasStride] stride from the start of one matrix (B_i) to the next one (B_i+1).
C – [out] Device pointer to the first matrix C_0 on the GPU. Each C_i is of dimension ( ldc, n ).
ldc – [in] [int] ldc specifies the first dimension of C_i. ldc >= max( 1, m ).
strideC – [in] [hipblasStride] stride from the start of one matrix (C_i) to the next one (C_i+1).
batchCount – [in] [int] number of instances i in the batch.
The trmmStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
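The stride parameters locate matrix i relative to the first matrix of the batch. The sketch below shows that addressing on the host. The helper names min_batch_stride and batch_matrix are hypothetical, and std::int64_t stands in for hipblasStride, which is an assumption of this sketch rather than something stated in this section.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helpers illustrating strided-batched addressing; not hipBLAS API.

// Smallest stride (in elements) that keeps consecutive column-major matrices
// with leading dimension ld and `cols` columns from overlapping.
std::int64_t min_batch_stride(int ld, int cols)
{
    return static_cast<std::int64_t>(ld) * cols;
}

// Pointer to the i-th matrix of a batch whose first matrix starts at `base`.
template <typename T>
T* batch_matrix(T* base, std::int64_t stride, int i)
{
    assert(i >= 0);
    return base + stride * i;
}
```

For example, with ldb = 4 and n = 3, a stride of 12 elements packs the B_i back to back; a larger stride leaves unused padding between consecutive matrices.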
hipblasXtrsm + Batched, StridedBatched#
-
hipblasStatus_t hipblasStrsm(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const float *alpha, const float *AP, int lda, float *BP, int ldb)#
-
hipblasStatus_t hipblasDtrsm(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const double *alpha, const double *AP, int lda, double *BP, int ldb)#
-
hipblasStatus_t hipblasCtrsm(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const hipComplex *alpha, const hipComplex *AP, int lda, hipComplex *BP, int ldb)#
-
hipblasStatus_t hipblasZtrsm(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, hipDoubleComplex *BP, int ldb)#
BLAS Level 3 API
The trsm functions solve:
op(A)*X = alpha*B or X*op(A) = alpha*B,
where alpha is a scalar, X and B are m by n matrices, A is a triangular matrix, and op(A) is one of:
op( A ) = A or op( A ) = A^T or op( A ) = A^H.
The matrix X is overwritten on B.
Note about memory allocation: When trsm is launched with a k that is evenly divisible by the internal block size of 128 and no larger than 10 of these blocks, the API uses preallocated memory found in the handle to increase overall performance (where k is m when HIPBLAS_SIDE_LEFT and is n when HIPBLAS_SIDE_RIGHT). For more information on preallocated memory in the handle, see the Device Memory Allocation in rocBLAS section of the rocBLAS API Reference.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t]
HIPBLAS_SIDE_LEFT: op(A)*X = alpha*B.
HIPBLAS_SIDE_RIGHT: X*op(A) = alpha*B.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_N: op(A) = A.
HIPBLAS_OP_T: op(A) = A^T.
HIPBLAS_OP_C: op(A) = A^H.
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: A is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: A is not assumed to be unit triangular.
m – [in] [int] m specifies the number of rows of B. m >= 0.
n – [in] [int] n specifies the number of columns of B. n >= 0.
alpha – [in] device pointer or host pointer specifying the scalar alpha. When the value pointed to by alpha is zero, A is not referenced and B does not need to be set before entry.
AP – [in] device pointer storing matrix A, of dimension ( lda, k ), where k is m when side == HIPBLAS_SIDE_LEFT and is n when side == HIPBLAS_SIDE_RIGHT. Only the upper/lower triangular part is accessed.
lda – [in] [int] lda specifies the first dimension of A.
If side = HIPBLAS_SIDE_LEFT, lda >= max( 1, m ).
If side = HIPBLAS_SIDE_RIGHT, lda >= max( 1, n ).
BP – [inout] device pointer storing matrix B.
ldb – [in] [int] ldb specifies the first dimension of B. ldb >= max( 1, m ).
The trsm functions support the 64-bit integer interface. See the ILP64 interfaces section.
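As a rough illustration of what trsm computes, the following CPU reference sketch solves A*X = alpha*B by forward substitution for one configuration only (side == HIPBLAS_SIDE_LEFT, uplo == HIPBLAS_FILL_MODE_LOWER, transA == HIPBLAS_OP_N) and overwrites B with X. The name ref_trsm_left_lower and the unit_diag flag are hypothetical, and the sketch says nothing about the blocked algorithm or preallocated memory used by the library.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// CPU reference sketch (hypothetical helper, not part of hipBLAS):
// solves A * X = alpha * B for X, where A is m-by-m lower triangular,
// and overwrites B with X. Column-major storage.
void ref_trsm_left_lower(bool unit_diag, int m, int n, double alpha,
                         const std::vector<double>& A, int lda,
                         std::vector<double>& B, int ldb)
{
    assert(lda >= std::max(1, m) && ldb >= std::max(1, m));
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double x = alpha * B[i + j * ldb];
            // Subtract the contributions of the already solved rows 0 .. i-1.
            for (int k = 0; k < i; ++k)
                x -= A[i + k * lda] * B[k + j * ldb];
            // With a unit diagonal, A(i, i) is treated as 1 and never read.
            if (!unit_diag)
                x /= A[i + i * lda];
            B[i + j * ldb] = x;
        }
    }
}
```

Only the lower triangle of A is read, which matches the statement above that only the upper/lower triangular part is accessed.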
-
hipblasStatus_t hipblasStrsmBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const float *alpha, const float *const AP[], int lda, float *const BP[], int ldb, int batchCount)#
-
hipblasStatus_t hipblasDtrsmBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const double *alpha, const double *const AP[], int lda, double *const BP[], int ldb, int batchCount)#
-
hipblasStatus_t hipblasCtrsmBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const hipComplex *alpha, const hipComplex *const AP[], int lda, hipComplex *const BP[], int ldb, int batchCount)#
-
hipblasStatus_t hipblasZtrsmBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *const AP[], int lda, hipDoubleComplex *const BP[], int ldb, int batchCount)#
BLAS Level 3 API
The trsmBatched functions perform the following batched operation:
op(A_i)*X_i = alpha*B_i or X_i*op(A_i) = alpha*B_i, for i = 1, ..., batchCount,
where alpha is a scalar, X and B are batched m by n matrices, A is a triangular batched matrix, and op(A) is one of:
op( A ) = A or op( A ) = A^T or op( A ) = A^H.
Each matrix X_i is overwritten on B_i for i = 1, ..., batchCount.
Note about memory allocation: When trsm is launched with a k that is evenly divisible by the internal block size of 128 and no larger than 10 of these blocks, the API uses preallocated memory found in the handle to increase overall performance (where k is m when HIPBLAS_SIDE_LEFT and is n when HIPBLAS_SIDE_RIGHT). For more information on preallocated memory in the handle, see the Device Memory Allocation in rocBLAS section of the rocBLAS API Reference.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t]
HIPBLAS_SIDE_LEFT: op(A)*X = alpha*B.
HIPBLAS_SIDE_RIGHT: X*op(A) = alpha*B.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: each A_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: each A_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_N: op(A) = A.
HIPBLAS_OP_T: op(A) = A^T.
HIPBLAS_OP_C: op(A) = A^H.
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: each A_i is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: each A_i is not assumed to be unit triangular.
m – [in] [int] m specifies the number of rows of each B_i. m >= 0.
n – [in] [int] n specifies the number of columns of each B_i. n >= 0.
alpha – [in] device pointer or host pointer specifying the scalar alpha. When the value pointed to by alpha is zero, A is not referenced and B does not need to be set before entry.
AP – [in] device array of device pointers storing each matrix A_i on the GPU. Matrices are of dimension ( lda, k ), where k is m when side == HIPBLAS_SIDE_LEFT and is n when side == HIPBLAS_SIDE_RIGHT. Only the upper/lower triangular part is accessed.
lda – [in] [int] lda specifies the first dimension of each A_i.
If side = HIPBLAS_SIDE_LEFT, lda >= max( 1, m ).
If side = HIPBLAS_SIDE_RIGHT, lda >= max( 1, n ).
BP – [inout] device array of device pointers storing each matrix B_i on the GPU.
ldb – [in] [int] ldb specifies the first dimension of each B_i. ldb >= max( 1, m ).
batchCount – [in] [int] number of trsm operations in the batch.
The trsmBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasStrsmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const float *alpha, const float *AP, int lda, hipblasStride strideA, float *BP, int ldb, hipblasStride strideB, int batchCount)#
-
hipblasStatus_t hipblasDtrsmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const double *alpha, const double *AP, int lda, hipblasStride strideA, double *BP, int ldb, hipblasStride strideB, int batchCount)#
-
hipblasStatus_t hipblasCtrsmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const hipComplex *alpha, const hipComplex *AP, int lda, hipblasStride strideA, hipComplex *BP, int ldb, hipblasStride strideB, int batchCount)#
-
hipblasStatus_t hipblasZtrsmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const hipDoubleComplex *alpha, const hipDoubleComplex *AP, int lda, hipblasStride strideA, hipDoubleComplex *BP, int ldb, hipblasStride strideB, int batchCount)#
BLAS Level 3 API
The trsmStridedBatched functions perform the following strided batched operation:
op(A_i)*X_i = alpha*B_i or X_i*op(A_i) = alpha*B_i, for i = 1, ..., batchCount,
where alpha is a scalar, X and B are strided batched m by n matrices, A is a triangular strided batched matrix, and op(A) is one of:
op( A ) = A or op( A ) = A^T or op( A ) = A^H.
Each matrix X_i is overwritten on B_i for i = 1, ..., batchCount.
Note about memory allocation: When trsm is launched with a k that is evenly divisible by the internal block size of 128 and no larger than 10 of these blocks, the API uses preallocated memory found in the handle to increase overall performance (where k is m when HIPBLAS_SIDE_LEFT and is n when HIPBLAS_SIDE_RIGHT). For more information on preallocated memory in the handle, see the Device Memory Allocation in rocBLAS section of the rocBLAS API Reference.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t]
HIPBLAS_SIDE_LEFT: op(A)*X = alpha*B.
HIPBLAS_SIDE_RIGHT: X*op(A) = alpha*B.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: each A_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: each A_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_N: op(A) = A.
HIPBLAS_OP_T: op(A) = A^T.
HIPBLAS_OP_C: op(A) = A^H.
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: each A_i is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: each A_i is not assumed to be unit triangular.
m – [in] [int] m specifies the number of rows of each B_i. m >= 0.
n – [in] [int] n specifies the number of columns of each B_i. n >= 0.
alpha – [in] device pointer or host pointer specifying the scalar alpha. When the value pointed to by alpha is zero, A is not referenced and B does not need to be set before entry.
AP – [in] device pointer pointing to the first matrix A_1, of dimension ( lda, k ), where k is m when side == HIPBLAS_SIDE_LEFT and is n when side == HIPBLAS_SIDE_RIGHT. Only the upper/lower triangular part is accessed.
lda – [in] [int] lda specifies the first dimension of each A_i.
If side = HIPBLAS_SIDE_LEFT, lda >= max( 1, m ).
If side = HIPBLAS_SIDE_RIGHT, lda >= max( 1, n ).
strideA – [in] [hipblasStride] stride from the start of one A_i matrix to the next A_(i + 1).
BP – [inout] device pointer pointing to the first matrix B_1.
ldb – [in] [int] ldb specifies the first dimension of each B_i. ldb >= max( 1, m ).
strideB – [in] [hipblasStride] stride from the start of one B_i matrix to the next B_(i + 1).
batchCount – [in] [int] number of trsm operations in the batch.
The trsmStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasXtrtri + Batched, StridedBatched#
-
hipblasStatus_t hipblasStrtri(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasDiagType_t diag, int n, const float *AP, int lda, float *invA, int ldinvA)#
-
hipblasStatus_t hipblasDtrtri(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasDiagType_t diag, int n, const double *AP, int lda, double *invA, int ldinvA)#
-
hipblasStatus_t hipblasCtrtri(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasDiagType_t diag, int n, const hipComplex *AP, int lda, hipComplex *invA, int ldinvA)#
-
hipblasStatus_t hipblasZtrtri(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasDiagType_t diag, int n, const hipDoubleComplex *AP, int lda, hipDoubleComplex *invA, int ldinvA)#
BLAS Level 3 API
The trtri functions compute the inverse of a triangular matrix A, namely invA, and write the result into invA.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER):
If HIPBLAS_FILL_MODE_UPPER, the lower part of A is not referenced.
If HIPBLAS_FILL_MODE_LOWER, the upper part of A is not referenced.
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_NON_UNIT: A is non-unit triangular.
HIPBLAS_DIAG_UNIT: A is unit triangular.
n – [in] [int] size of matrix A and invA.
AP – [in] device pointer storing matrix A.
lda – [in] [int] specifies the leading dimension of A.
invA – [out] device pointer storing matrix invA.
ldinvA – [in] [int] specifies the leading dimension of invA.
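The trtri result can be checked against a small CPU reference. The sketch below handles only uplo == HIPBLAS_FILL_MODE_LOWER on column-major host data and computes each column of the inverse by forward substitution. The name ref_trtri_lower and the unit_diag flag are hypothetical, and zeroing the strictly upper part of invA is a choice made in this sketch to give a well-defined result.

```cpp
#include <cassert>
#include <vector>

// CPU reference sketch (hypothetical helper, not part of hipBLAS):
// writes the inverse of the n-by-n lower triangular matrix A into invA.
// Column-major storage; the strictly upper part of A is never read.
void ref_trtri_lower(bool unit_diag, int n,
                     const std::vector<double>& A, int lda,
                     std::vector<double>& invA, int ldinvA)
{
    assert(lda >= n && ldinvA >= n);
    for (int j = 0; j < n; ++j) {
        // The inverse of a lower triangular matrix is lower triangular.
        for (int i = 0; i < j; ++i)
            invA[i + j * ldinvA] = 0.0;
        // Column j of the inverse solves A * x = e_j by forward substitution.
        for (int i = j; i < n; ++i) {
            double x = (i == j) ? 1.0 : 0.0;
            for (int k = j; k < i; ++k)
                x -= A[i + k * lda] * invA[k + j * ldinvA];
            invA[i + j * ldinvA] = unit_diag ? x : x / A[i + i * lda];
        }
    }
}
```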
-
hipblasStatus_t hipblasStrtriBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasDiagType_t diag, int n, const float *const AP[], int lda, float *invA[], int ldinvA, int batchCount)#
-
hipblasStatus_t hipblasDtrtriBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasDiagType_t diag, int n, const double *const AP[], int lda, double *invA[], int ldinvA, int batchCount)#
-
hipblasStatus_t hipblasCtrtriBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasDiagType_t diag, int n, const hipComplex *const AP[], int lda, hipComplex *invA[], int ldinvA, int batchCount)#
-
hipblasStatus_t hipblasZtrtriBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasDiagType_t diag, int n, const hipDoubleComplex *const AP[], int lda, hipDoubleComplex *invA[], int ldinvA, int batchCount)#
BLAS Level 3 API
The trtriBatched functions compute the inverse of A_i and write into invA_i, where A_i and invA_i are the i-th matrices in the batch, for i = 1, ..., batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER).
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_NON_UNIT: A is non-unit triangular.
HIPBLAS_DIAG_UNIT: A is unit triangular.
n – [in] [int] size of each matrix A_i and invA_i.
AP – [in] device array of device pointers storing each matrix A_i.
lda – [in] [int] specifies the leading dimension of each A_i.
invA – [out] device array of device pointers storing the inverse of each matrix A_i. Partial in-place operation is supported; see below.
If uplo == HIPBLAS_FILL_MODE_UPPER, the leading n-by-n upper triangular part of invA will store the inverse of the upper triangular matrix, and the strictly lower triangular part of invA is cleared.
If uplo == HIPBLAS_FILL_MODE_LOWER, the leading n-by-n lower triangular part of invA will store the inverse of the lower triangular matrix, and the strictly upper triangular part of invA is cleared.
ldinvA – [in] [int] specifies the leading dimension of each invA_i.
batchCount – [in] [int] number of matrices in the batch.
-
hipblasStatus_t hipblasStrtriStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasDiagType_t diag, int n, const float *AP, int lda, hipblasStride strideA, float *invA, int ldinvA, hipblasStride stride_invA, int batchCount)#
-
hipblasStatus_t hipblasDtrtriStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasDiagType_t diag, int n, const double *AP, int lda, hipblasStride strideA, double *invA, int ldinvA, hipblasStride stride_invA, int batchCount)#
-
hipblasStatus_t hipblasCtrtriStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasDiagType_t diag, int n, const hipComplex *AP, int lda, hipblasStride strideA, hipComplex *invA, int ldinvA, hipblasStride stride_invA, int batchCount)#
-
hipblasStatus_t hipblasZtrtriStridedBatched(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasDiagType_t diag, int n, const hipDoubleComplex *AP, int lda, hipblasStride strideA, hipDoubleComplex *invA, int ldinvA, hipblasStride stride_invA, int batchCount)#
BLAS Level 3 API
The trtriStridedBatched functions compute the inverse of A_i and write into invA_i, where A_i and invA_i are the i-th matrices in the batch, for i = 1, ..., batchCount.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] specifies either upper (HIPBLAS_FILL_MODE_UPPER) or lower (HIPBLAS_FILL_MODE_LOWER).
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_NON_UNIT: A is non-unit triangular.
HIPBLAS_DIAG_UNIT: A is unit triangular.
n – [in] [int] size of each matrix A_i and invA_i.
AP – [in] device pointer to the first matrix A_1.
lda – [in] [int] specifies the leading dimension of each A_i.
strideA – [in] [hipblasStride] “batch stride a”: stride from the start of one A_i matrix to the next A_(i + 1).
invA – [out] device pointer storing the inverse of each matrix A_i. Partial in-place operation is supported; see below.
If uplo == HIPBLAS_FILL_MODE_UPPER, the leading n-by-n upper triangular part of invA will store the inverse of the upper triangular matrix, and the strictly lower triangular part of invA is cleared.
If uplo == HIPBLAS_FILL_MODE_LOWER, the leading n-by-n lower triangular part of invA will store the inverse of the lower triangular matrix, and the strictly upper triangular part of invA is cleared.
ldinvA – [in] [int] specifies the leading dimension of each invA_i.
stride_invA – [in] [hipblasStride] “batch stride invA”: stride from the start of one invA_i matrix to the next invA_(i + 1).
batchCount – [in] [int] number of matrices in the batch.
hipblasXdgmm + Batched, StridedBatched#
-
hipblasStatus_t hipblasSdgmm(hipblasHandle_t handle, hipblasSideMode_t side, int m, int n, const float *AP, int lda, const float *x, int incx, float *CP, int ldc)#
-
hipblasStatus_t hipblasDdgmm(hipblasHandle_t handle, hipblasSideMode_t side, int m, int n, const double *AP, int lda, const double *x, int incx, double *CP, int ldc)#
-
hipblasStatus_t hipblasCdgmm(hipblasHandle_t handle, hipblasSideMode_t side, int m, int n, const hipComplex *AP, int lda, const hipComplex *x, int incx, hipComplex *CP, int ldc)#
-
hipblasStatus_t hipblasZdgmm(hipblasHandle_t handle, hipblasSideMode_t side, int m, int n, const hipDoubleComplex *AP, int lda, const hipDoubleComplex *x, int incx, hipDoubleComplex *CP, int ldc)#
BLAS Level 3 API
The dgmm functions perform one of the matrix-matrix operations:
C = A * diag(x) if side == HIPBLAS_SIDE_RIGHT
C = diag(x) * A if side == HIPBLAS_SIDE_LEFT
where C and A are m by n dimensional matrices, diag( x ) is a diagonal matrix, and x is a vector of dimension n if side == HIPBLAS_SIDE_RIGHT and dimension m if side == HIPBLAS_SIDE_LEFT.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t] specifies the side of diag(x).
m – [in] [int] matrix dimension m.
n – [in] [int] matrix dimension n.
AP – [in] device pointer storing matrix A.
lda – [in] [int] specifies the leading dimension of A.
x – [in] device pointer storing vector x.
incx – [in] [int] specifies the increment between values of x.
CP – [inout] device pointer storing matrix C.
ldc – [in] [int] specifies the leading dimension of C.
The dgmm functions support the 64-bit integer interface. See the ILP64 interfaces section.
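The effect of the side parameter in dgmm can be seen in a short CPU reference: multiplying by diag(x) on the right scales the columns of A, and multiplying on the left scales its rows. The sketch below assumes column-major host data and a positive incx. The name ref_dgmm and the right_side flag are hypothetical and not part of hipBLAS.

```cpp
#include <cassert>
#include <vector>

// CPU reference sketch (hypothetical helper, not part of hipBLAS).
// right_side == true:  C = A * diag(x), x has n entries (scales columns).
// right_side == false: C = diag(x) * A, x has m entries (scales rows).
void ref_dgmm(bool right_side, int m, int n,
              const std::vector<float>& A, int lda,
              const std::vector<float>& x, int incx,
              std::vector<float>& C, int ldc)
{
    // Only positive increments are handled here, for brevity.
    assert(lda >= m && ldc >= m && incx > 0);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float scale = right_side ? x[j * incx] : x[i * incx];
            C[i + j * ldc] = A[i + j * lda] * scale;
        }
    }
}
```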
-
hipblasStatus_t hipblasSdgmmBatched(hipblasHandle_t handle, hipblasSideMode_t side, int m, int n, const float *const AP[], int lda, const float *const x[], int incx, float *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasDdgmmBatched(hipblasHandle_t handle, hipblasSideMode_t side, int m, int n, const double *const AP[], int lda, const double *const x[], int incx, double *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasCdgmmBatched(hipblasHandle_t handle, hipblasSideMode_t side, int m, int n, const hipComplex *const AP[], int lda, const hipComplex *const x[], int incx, hipComplex *const CP[], int ldc, int batchCount)#
-
hipblasStatus_t hipblasZdgmmBatched(hipblasHandle_t handle, hipblasSideMode_t side, int m, int n, const hipDoubleComplex *const AP[], int lda, const hipDoubleComplex *const x[], int incx, hipDoubleComplex *const CP[], int ldc, int batchCount)#
BLAS Level 3 API
The dgmmBatched functions perform one of the batched matrix-matrix operations:
C_i = A_i * diag(x_i) for i = 0, 1, ... batchCount-1 if side == HIPBLAS_SIDE_RIGHT
C_i = diag(x_i) * A_i for i = 0, 1, ... batchCount-1 if side == HIPBLAS_SIDE_LEFT
where C_i and A_i are m by n dimensional matrices, diag(x_i) is a diagonal matrix, and x_i is a vector of dimension n if side == HIPBLAS_SIDE_RIGHT and dimension m if side == HIPBLAS_SIDE_LEFT.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t] specifies the side of diag(x).
m – [in] [int] matrix dimension m.
n – [in] [int] matrix dimension n.
AP – [in] device array of device pointers storing each matrix A_i on the GPU. Each A_i is of dimension ( lda, n ).
lda – [in] [int] specifies the leading dimension of A_i.
x – [in] device array of device pointers storing each vector x_i on the GPU. Each x_i is of dimension n if side == HIPBLAS_SIDE_RIGHT and dimension m if side == HIPBLAS_SIDE_LEFT.
incx – [in] [int] specifies the increment between values of x_i.
CP – [inout] device array of device pointers storing each matrix C_i on the GPU. Each C_i is of dimension ( ldc, n ).
ldc – [in] [int] specifies the leading dimension of C_i.
batchCount – [in] [int] number of instances in the batch.
The dgmmBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasSdgmmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, int m, int n, const float *AP, int lda, hipblasStride strideA, const float *x, int incx, hipblasStride stridex, float *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasDdgmmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, int m, int n, const double *AP, int lda, hipblasStride strideA, const double *x, int incx, hipblasStride stridex, double *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasCdgmmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, int m, int n, const hipComplex *AP, int lda, hipblasStride strideA, const hipComplex *x, int incx, hipblasStride stridex, hipComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
-
hipblasStatus_t hipblasZdgmmStridedBatched(hipblasHandle_t handle, hipblasSideMode_t side, int m, int n, const hipDoubleComplex *AP, int lda, hipblasStride strideA, const hipDoubleComplex *x, int incx, hipblasStride stridex, hipDoubleComplex *CP, int ldc, hipblasStride strideC, int batchCount)#
BLAS Level 3 API
The dgmmStridedBatched functions perform one of the batched matrix-matrix operations:
C_i = A_i * diag(x_i) if side == HIPBLAS_SIDE_RIGHT for i = 0, 1, ... batchCount-1
C_i = diag(x_i) * A_i if side == HIPBLAS_SIDE_LEFT for i = 0, 1, ... batchCount-1
where C_i and A_i are m by n dimensional matrices, diag(x_i) is a diagonal matrix, and x_i is a vector of dimension n if side == HIPBLAS_SIDE_RIGHT and dimension m if side == HIPBLAS_SIDE_LEFT.
Supported precisions in rocBLAS: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t] specifies the side of diag(x).
m – [in] [int] matrix dimension m.
n – [in] [int] matrix dimension n.
AP – [in] device pointer to the first matrix A_0 on the GPU. Each A_i is of dimension ( lda, n ).
lda – [in] [int] specifies the leading dimension of A.
strideA – [in] [hipblasStride] stride from the start of one matrix (A_i) to the next one (A_i+1).
x – [in] pointer to the first vector x_0 on the GPU. Each x_i is of dimension n if side == HIPBLAS_SIDE_RIGHT and dimension m if side == HIPBLAS_SIDE_LEFT.
incx – [in] [int] specifies the increment between values of x.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1).
CP – [inout] device pointer to the first matrix C_0 on the GPU. Each C_i is of dimension ( ldc, n ).
ldc – [in] [int] specifies the leading dimension of C.
strideC – [in] [hipblasStride] stride from the start of one matrix (C_i) to the next one (C_i+1).
batchCount – [in] [int] number of instances i in the batch.
The dgmmStridedBatched functions support the 64-bit integer interface. See the ILP64 interfaces section.
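The stride parameters are element counts from the start of one instance to the start of the next. The sketch below shows that arithmetic on host pointers. The names `stride_t`, `batch_instance`, and `min_matrix_stride` are hypothetical, and `stride_t` is only assumed to match hipblasStride as a 64-bit signed integer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for hipblasStride, assumed to be a 64-bit signed integer.
using stride_t = std::int64_t;

// Pointer to instance i of a strided batch that starts at base.
template <typename T>
const T* batch_instance(const T* base, stride_t stride, int i)
{
    assert(i >= 0);
    return base + stride * i;
}

// Smallest stride for which consecutive column-major ( ld, n ) matrices do not overlap.
stride_t min_matrix_stride(int ld, int n)
{
    return static_cast<stride_t>(ld) * n;
}
```

For example, with lda = 3 and n = 2, a strideA of at least 6 keeps A_0 and A_1 from sharing storage.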
BLAS extensions#
hipblasGemmEx + Batched, StridedBatched#
-
hipblasStatus_t hipblasGemmEx(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const void *alpha, const void *A, hipDataType aType, int lda, const void *B, hipDataType bType, int ldb, const void *beta, void *C, hipDataType cType, int ldc, hipblasComputeType_t computeType, hipblasGemmAlgo_t algo)#
BLAS EX API
The gemmEx functions perform one of the matrix-matrix operations:
C = alpha*op( A )*op( B ) + beta*C,
where op( X ) is one of:
op( X ) = X or op( X ) = X**T or op( X ) = X**H,
alpha and beta are scalars, and A, B, and C are matrices, with op( A ) an m by k matrix, op( B ) a k by n matrix, and C an m by n matrix.
Supported types are determined by the backend. See the cuBLAS documentation for cuBLAS backend information. For the rocBLAS backend, conversion from hipblasComputeType_t to rocblas_datatype_t happens within hipBLAS. Supported types are as follows:
aType      | bType      | cType      | computeType
HIP_R_16F  | HIP_R_16F  | HIP_R_16F  | HIPBLAS_COMPUTE_16F
HIP_R_16F  | HIP_R_16F  | HIP_R_16F  | HIPBLAS_COMPUTE_32F
HIP_R_16F  | HIP_R_16F  | HIP_R_32F  | HIPBLAS_COMPUTE_32F
HIP_R_16BF | HIP_R_16BF | HIP_R_16BF | HIPBLAS_COMPUTE_32F
HIP_R_16BF | HIP_R_16BF | HIP_R_32F  | HIPBLAS_COMPUTE_32F
HIP_R_32F  | HIP_R_32F  | HIP_R_32F  | HIPBLAS_COMPUTE_32F
HIP_R_64F  | HIP_R_64F  | HIP_R_64F  | HIPBLAS_COMPUTE_64F
HIP_R_8I   | HIP_R_8I   | HIP_R_32I  | HIPBLAS_COMPUTE_32I
HIP_C_32F  | HIP_C_32F  | HIP_C_32F  | HIPBLAS_COMPUTE_32F
HIP_C_64F  | HIP_C_64F  | HIP_C_64F  | HIPBLAS_COMPUTE_64F
hipblasGemmExWithFlags is also available. This is identical to hipblasGemmEx with the addition of a flags parameter which controls the flags used in Tensile to control gemm algorithms with the rocBLAS backend. When using a cuBLAS backend, this parameter is ignored.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
transA – [in] [hipblasOperation_t] specifies the form of op( A ).
transB – [in] [hipblasOperation_t] specifies the form of op( B ).
m – [in] [int] matrix dimension m.
n – [in] [int] matrix dimension n.
k – [in] [int] matrix dimension k.
alpha – [in] [const void *] device pointer or host pointer specifying the scalar alpha. Same datatype as computeType.
A – [in] [void *] device pointer storing matrix A.
aType – [in] [hipDataType] specifies the datatype of matrix A.
lda – [in] [int] specifies the leading dimension of A.
B – [in] [void *] device pointer storing matrix B.
bType – [in] [hipDataType] specifies the datatype of matrix B.
ldb – [in] [int] specifies the leading dimension of B.
beta – [in] [const void *] device pointer or host pointer specifying the scalar beta. Same datatype as computeType.
C – [inout] [void *] device pointer storing matrix C.
cType – [in] [hipDataType] specifies the datatype of matrix C.
ldc – [in] [int] specifies the leading dimension of C.
computeType – [in] [hipblasComputeType_t] specifies the datatype of computation.
algo – [in] [hipblasGemmAlgo_t] enumerant specifying the algorithm type.
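The reason the input, output, and compute types are specified separately is easiest to see with the HIP_R_8I / HIP_R_32I / HIPBLAS_COMPUTE_32I combination: products of 8-bit values overflow an 8-bit accumulator but fit in a 32-bit one. The following CPU sketch models that idea only. `gemm_ex_reference` is a hypothetical name, transposes are omitted, and the code makes no claim about how either backend implements the operation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU model of a mixed-precision GEMM without transposes.
// Inputs are TIn, the output is TOut, and products are accumulated in TCompute.
// alpha and beta use TCompute, mirroring "same datatype as computeType".
template <typename TIn, typename TOut, typename TCompute>
void gemm_ex_reference(int m, int n, int k, TCompute alpha,
                       const std::vector<TIn>& A, int lda,
                       const std::vector<TIn>& B, int ldb, TCompute beta,
                       std::vector<TOut>& C, int ldc)
{
    assert(lda >= m && ldb >= k && ldc >= m);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            TCompute acc = 0;
            for (int p = 0; p < k; ++p) {
                // Widen each operand before multiplying so the product cannot overflow TIn.
                acc += static_cast<TCompute>(A[i + static_cast<std::size_t>(p) * lda]) *
                       static_cast<TCompute>(B[p + static_cast<std::size_t>(j) * ldb]);
            }
            const std::size_t idx = i + static_cast<std::size_t>(j) * ldc;
            C[idx] = static_cast<TOut>(alpha * acc + beta * static_cast<TCompute>(C[idx]));
        }
    }
}
```

Instantiating it as `gemm_ex_reference<std::int8_t, std::int32_t, std::int32_t>` corresponds to the 8-bit integer row of the table above.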
-
hipblasStatus_t hipblasGemmBatchedEx(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const void *alpha, const void *A[], hipDataType aType, int lda, const void *B[], hipDataType bType, int ldb, const void *beta, void *C[], hipDataType cType, int ldc, int batchCount, hipblasComputeType_t computeType, hipblasGemmAlgo_t algo)#
BLAS EX API
The gemmBatchedEx functions perform one of the batched matrix-matrix operations:
C_i = alpha*op(A_i)*op(B_i) + beta*C_i, for i = 1, ..., batchCount,
where op( X ) is one of:
op( X ) = X or op( X ) = X**T or op( X ) = X**H,
alpha and beta are scalars, and A, B, and C are batched pointers to matrices, with op( A ) an m by k by batchCount batched matrix, op( B ) a k by n by batchCount batched matrix, and C an m by n by batchCount batched matrix. The batched matrices are an array of pointers to matrices. The number of pointers to matrices is batchCount.
Supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
hipblasGemmBatchedExWithFlags is also available. This is identical to hipblasGemmBatchedEx with the addition of a flags parameter which controls the flags used in Tensile to control gemm algorithms with the rocBLAS backend. When using a cuBLAS backend, this parameter is ignored.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
transA – [in] [hipblasOperation_t] specifies the form of op( A ).
transB – [in] [hipblasOperation_t] specifies the form of op( B ).
m – [in] [int] matrix dimension m.
n – [in] [int] matrix dimension n.
k – [in] [int] matrix dimension k.
alpha – [in] [const void *] device pointer or host pointer specifying the scalar alpha. Same datatype as computeType.
A – [in] [void *] device pointer storing array of pointers to each matrix A_i.
aType – [in] [hipDataType] specifies the datatype of each matrix A_i.
lda – [in] [int] specifies the leading dimension of each A_i.
B – [in] [void *] device pointer storing array of pointers to each matrix B_i.
bType – [in] [hipDataType] specifies the datatype of each matrix B_i.
ldb – [in] [int] specifies the leading dimension of each B_i.
beta – [in] [const void *] device pointer or host pointer specifying the scalar beta. Same datatype as computeType.
C – [inout] [void *] device array of device pointers to each matrix C_i.
cType – [in] [hipDataType] specifies the datatype of each matrix C_i.
ldc – [in] [int] specifies the leading dimension of each C_i.
batchCount – [in] [int] number of gemm operations in the batch.
computeType – [in] [hipblasComputeType_t] specifies the datatype of computation.
algo – [in] [hipblasGemmAlgo_t] enumerant specifying the algorithm type.
-
hipblasStatus_t hipblasGemmStridedBatchedEx(hipblasHandle_t handle, hipblasOperation_t transA, hipblasOperation_t transB, int m, int n, int k, const void *alpha, const void *A, hipDataType aType, int lda, hipblasStride strideA, const void *B, hipDataType bType, int ldb, hipblasStride strideB, const void *beta, void *C, hipDataType cType, int ldc, hipblasStride strideC, int batchCount, hipblasComputeType_t computeType, hipblasGemmAlgo_t algo)#
BLAS EX API
The gemmStridedBatchedEx functions perform one of the strided_batched matrix-matrix operations:
C_i = alpha*op(A_i)*op(B_i) + beta*C_i, for i = 1, ..., batchCount,
where op( X ) is one of:
op( X ) = X or op( X ) = X**T or op( X ) = X**H,
alpha and beta are scalars, and A, B, and C are strided_batched matrices, with op( A ) an m by k by batchCount strided_batched matrix, op( B ) a k by n by batchCount strided_batched matrix, and C an m by n by batchCount strided_batched matrix.
The strided_batched matrices are multiple matrices separated by a constant stride. The number of matrices is batchCount.
Supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
hipblasGemmStridedBatchedExWithFlags is also available. This is identical to hipblasGemmStridedBatchedEx with the addition of a flags parameter which controls the flags used in Tensile to control gemm algorithms with the rocBLAS backend. When using a cuBLAS backend, this parameter is ignored.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
transA – [in] [hipblasOperation_t] specifies the form of op( A ).
transB – [in] [hipblasOperation_t] specifies the form of op( B ).
m – [in] [int] matrix dimension m.
n – [in] [int] matrix dimension n.
k – [in] [int] matrix dimension k.
alpha – [in] [const void *] device pointer or host pointer specifying the scalar alpha. Same datatype as computeType.
A – [in] [void *] device pointer pointing to first matrix A_1.
aType – [in] [hipDataType] specifies the datatype of each matrix A_i.
lda – [in] [int] specifies the leading dimension of each A_i.
strideA – [in] [hipblasStride] specifies stride from start of one A_i matrix to the next A_(i + 1).
B – [in] [void *] device pointer pointing to first matrix B_1.
bType – [in] [hipDataType] specifies the datatype of each matrix B_i.
ldb – [in] [int] specifies the leading dimension of each B_i.
strideB – [in] [hipblasStride] specifies stride from start of one B_i matrix to the next B_(i + 1).
beta – [in] [const void *] device pointer or host pointer specifying the scalar beta. Same datatype as computeType.
C – [inout] [void *] device pointer pointing to first matrix C_1.
cType – [in] [hipDataType] specifies the datatype of each matrix C_i.
ldc – [in] [int] specifies the leading dimension of each C_i.
strideC – [in] [hipblasStride] specifies stride from start of one C_i matrix to the next C_(i + 1).
batchCount – [in] [int] number of gemm operations in the batch.
computeType – [in] [hipblasComputeType_t] specifies the datatype of computation.
algo – [in] [hipblasGemmAlgo_t] enumerant specifying the algorithm type.
The gemmEx, gemmBatchedEx, and gemmStridedBatchedEx functions support the 64-bit integer interface. See the ILP64 interfaces section.
hipblasSyrkEx#
-
hipblasStatus_t hipblasSyrkEx(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const void *alpha, const void *A, hipDataType aType, int lda, const void *beta, void *C, hipDataType cType, int ldc, hipDataType computeType)#
BLAS EX API
The syrkEx function performs one of the matrix-matrix operations for a symmetric rank-k update:
C := alpha*op( A )*op( A )^T + beta*C
where alpha and beta are scalars, op(A) is an n by k matrix, and C is a symmetric n x n matrix stored as either upper or lower.
op( A ) = A, and A is n by k if transA == HIPBLAS_OP_N
op( A ) = A^T and A is k by n if transA == HIPBLAS_OP_T
Supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] Specifies which triangular part of the symmetric matrix C is stored:
HIPBLAS_FILL_MODE_UPPER: the upper triangular part of C is stored.
HIPBLAS_FILL_MODE_LOWER: the lower triangular part of C is stored.
transA – [in] [hipblasOperation_t] specifies the form of op( A ).
n – [in] [int] matrix dimension n.
k – [in] [int] matrix dimension k.
alpha – [in] [const void *] device pointer or host pointer specifying the scalar alpha. Same datatype as computeType.
A – [in] [void *] device pointer storing matrix A.
aType – [in] [hipDataType] specifies the datatype of matrix A.
lda – [in] [int] specifies the leading dimension of A.
beta – [in] [const void *] device pointer or host pointer specifying the scalar beta. Same datatype as computeType.
C – [inout] [void *] device pointer storing matrix C.
cType – [in] [hipDataType] specifies the datatype of matrix C.
ldc – [in] [int] specifies the leading dimension of C.
computeType – [in] [hipDataType] specifies the datatype of the computation.
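A CPU sketch of the rank-k update can make the role of uplo concrete. It assumes the usual BLAS convention that only the triangle selected by uplo is read and written, covers only the transA == HIPBLAS_OP_N case in single precision, and uses the hypothetical names `syrk_reference` and `Fill`. It is not taken from hipBLAS.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for hipblasFillMode_t (illustration only).
enum class Fill { Upper, Lower };

// CPU reference for C := alpha*A*A^T + beta*C with A n by k (column-major).
// Assumption: only the triangle of C selected by uplo is read and written.
void syrk_reference(Fill uplo, int n, int k, float alpha,
                    const std::vector<float>& A, int lda, float beta,
                    std::vector<float>& C, int ldc)
{
    assert(lda >= n && ldc >= n);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) {
            const bool inTriangle = (uplo == Fill::Upper) ? (i <= j) : (i >= j);
            if (!inTriangle) {
                continue; // the other triangle is left untouched
            }
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                acc += A[i + static_cast<std::size_t>(p) * lda] *
                       A[j + static_cast<std::size_t>(p) * lda];
            }
            const std::size_t idx = i + static_cast<std::size_t>(j) * ldc;
            C[idx] = alpha * acc + beta * C[idx];
        }
    }
}
```

Because the result is symmetric, the untouched triangle carries no extra information, which is why storing one half is enough.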
hipblasHerkEx#
-
hipblasStatus_t hipblasHerkEx(hipblasHandle_t handle, hipblasFillMode_t uplo, hipblasOperation_t transA, int n, int k, const void *alpha, const void *A, hipDataType aType, int lda, const void *beta, void *C, hipDataType cType, int ldc, hipDataType computeType)#
BLAS EX API
The herkEx function performs one of the matrix-matrix operations for a Hermitian rank-k update:
C := alpha*op( A )*op( A )^H + beta*C
where alpha and beta are scalars, op(A) is an n by k matrix, and C is a Hermitian n x n matrix stored as either upper or lower.
op( A ) = A, and A is n by k if transA == HIPBLAS_OP_N
op( A ) = A^H and A is k by n if transA == HIPBLAS_OP_C
Supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
uplo – [in] [hipblasFillMode_t] Specifies which triangular part of the Hermitian matrix C is stored:
HIPBLAS_FILL_MODE_UPPER: the upper triangular part of C is stored.
HIPBLAS_FILL_MODE_LOWER: the lower triangular part of C is stored.
transA – [in] [hipblasOperation_t] specifies the form of op( A ).
n – [in] [int] matrix dimension n.
k – [in] [int] matrix dimension k.
alpha – [in] [const void *] device pointer or host pointer specifying the scalar alpha. Same datatype as computeType.
A – [in] [void *] device pointer storing matrix A.
aType – [in] [hipDataType] specifies the datatype of matrix A.
lda – [in] [int] specifies the leading dimension of A.
beta – [in] [const void *] device pointer or host pointer specifying the scalar beta. Same datatype as computeType.
C – [inout] [void *] device pointer storing matrix C.
cType – [in] [hipDataType] specifies the datatype of matrix C.
ldc – [in] [int] specifies the leading dimension of C.
computeType – [in] [hipDataType] specifies the datatype of the computation.
hipblasTrsmEx + Batched, StridedBatched#
-
hipblasStatus_t hipblasTrsmEx(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const void *alpha, void *A, int lda, void *B, int ldb, const void *invA, int invAsize, hipDataType computeType)#
BLAS EX API
The trsmEx functions solve:
op(A)*X = alpha*B or X*op(A) = alpha*B,
where alpha is a scalar, X and B are m by n matrices, A is a triangular matrix, and op(A) is one of
op( A ) = A or op( A ) = A^T or op( A ) = A^H.
The matrix X is overwritten on B.
This function gives the user the ability to reuse the invA matrix between runs. If invA == NULL, hipblasTrsmEx will automatically calculate invA on every run.
Setting up invA: The accepted invA matrix consists of the packed 128x128 inverses of the diagonal blocks of matrix A, followed by any smaller diagonal block that remains. To set up invA, it is recommended that hipblasTrtriBatched be used with matrix A as the input.
Device memory of size 128 x k should be allocated for invA ahead of time, where k is m when HIPBLAS_SIDE_LEFT and is n when HIPBLAS_SIDE_RIGHT. The actual number of elements in invA should be passed as invAsize.
To begin, hipblasTrtriBatched must be called on the full 128x128 sized diagonal blocks of matrix A. Here are the restricted parameters:
n = 128
ldinvA = 128
stride_invA = 128x128
batchCount = k / 128
Then any remaining block can be added:
n = k % 128
invA = invA + stride_invA * previousBatchCount
ldinvA = 128
batchCount = 1
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t]
HIPBLAS_SIDE_LEFT: op(A)*X = alpha*B.
HIPBLAS_SIDE_RIGHT: X*op(A) = alpha*B.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: A is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: A is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_N: op(A) = A.
HIPBLAS_OP_T: op(A) = A^T.
HIPBLAS_OP_C: op(A) = A^H.
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: A is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: A is not assumed to be unit triangular.
m – [in] [int] m specifies the number of rows of B. m >= 0.
n – [in] [int] n specifies the number of columns of B. n >= 0.
alpha – [in] [void *] device pointer or host pointer specifying the scalar alpha. When alpha is &zero, then A is not referenced, and B does not need to be set before entry.
A – [in] [void *] device pointer storing matrix A. Of dimension ( lda, k ), where k is m when HIPBLAS_SIDE_LEFT and is n when HIPBLAS_SIDE_RIGHT. Only the upper/lower triangular part is accessed.
lda – [in] [int] lda specifies the first dimension of A.
If side = HIPBLAS_SIDE_LEFT, lda >= max( 1, m ).
If side = HIPBLAS_SIDE_RIGHT, lda >= max( 1, n ).
B – [inout] [void *] device pointer storing matrix B. B is of dimension ( ldb, n ). Before entry, the leading m by n part of the array B must contain the right-hand side matrix B, and on exit is overwritten by the solution matrix X.
ldb – [in] [int] ldb specifies the first dimension of B. ldb >= max( 1, m ).
invA – [in] [void *] device pointer storing the inverse diagonal blocks of A. invA is of dimension ( ld_invA, k ), where k is m when HIPBLAS_SIDE_LEFT and is n when HIPBLAS_SIDE_RIGHT. ld_invA must be equal to 128.
invAsize – [in] [int] invAsize specifies the number of elements of device memory in invA.
computeType – [in] [hipDataType] specifies the datatype of computation.
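The invA setup rules above reduce to a few integer computations. The sketch below collects them in one place. `InvALayout` and `plan_invA` are hypothetical names, and the code only derives sizes and offsets. It does not call hipblasTrtriBatched or allocate device memory.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of the invA layout described above (illustration only).
struct InvALayout {
    int fullBlocks;               // batchCount for the first trtri call: k / 128
    int remainder;                // n for the second trtri call: k % 128 (0 means no second call)
    std::int64_t remainderOffset; // element offset of the remaining block: stride_invA * fullBlocks
    std::int64_t invAsize;        // total number of elements in invA: 128 * k
};

// sideLeft == true models HIPBLAS_SIDE_LEFT (k = m), otherwise HIPBLAS_SIDE_RIGHT (k = n).
InvALayout plan_invA(bool sideLeft, int m, int n)
{
    assert(m >= 0 && n >= 0);
    constexpr int kBlock = 128;
    const int k = sideLeft ? m : n;
    InvALayout out;
    out.fullBlocks = k / kBlock;
    out.remainder = k % kBlock;
    out.remainderOffset = static_cast<std::int64_t>(kBlock) * kBlock * out.fullBlocks;
    out.invAsize = static_cast<std::int64_t>(kBlock) * k;
    return out;
}
```

For m = 300 with HIPBLAS_SIDE_LEFT this gives two full 128x128 blocks, a remaining block of order 44 that starts at element offset 32768, and an invAsize of 38400.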
-
hipblasStatus_t hipblasTrsmBatchedEx(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const void *alpha, void *A, int lda, void *B, int ldb, int batchCount, const void *invA, int invAsize, hipDataType computeType)#
BLAS EX API
The trsmBatchedEx functions solve:
op(A_i)*X_i = alpha*B_i or X_i*op(A_i) = alpha*B_i,
for i = 1, …, batchCount, where alpha is a scalar, X and B are arrays of m by n matrices, A is an array of triangular matrices, and each op(A_i) is one of:
op( A_i ) = A_i or op( A_i ) = A_i^T or op( A_i ) = A_i^H.
Each matrix X_i is overwritten on B_i.
This function gives the user the ability to reuse the invA matrix between runs. If invA == NULL, hipblasTrsmBatchedEx will automatically calculate each invA_i on every run.
Setting up invA: Each accepted invA_i matrix consists of the packed 128x128 inverses of the diagonal blocks of matrix A_i, followed by any smaller diagonal block that remains. To set up each invA_i, it is recommended that hipblasTrtriBatched be used with matrix A_i as the input. invA is an array of pointers of batchCount length holding each invA_i.
Device memory of size 128 x k should be allocated for each invA_i ahead of time, where k is m when HIPBLAS_SIDE_LEFT and is n when HIPBLAS_SIDE_RIGHT. The actual number of elements in each invA_i should be passed as invAsize.
To begin, hipblasTrtriBatched must be called on the full 128x128 sized diagonal blocks of each matrix A_i. Below are the restricted parameters:
n = 128
ldinvA = 128
stride_invA = 128x128
batchCount = k / 128
Then any remaining block can be added:
n = k % 128
invA = invA + stride_invA * previousBatchCount
ldinvA = 128
batchCount = 1
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t]
HIPBLAS_SIDE_LEFT: op(A)*X = alpha*B.
HIPBLAS_SIDE_RIGHT: X*op(A) = alpha*B.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: each A_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: each A_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_N: op(A) = A.
HIPBLAS_OP_T: op(A) = A^T.
HIPBLAS_OP_C: op(A) = A^H.
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: each A_i is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: each A_i is not assumed to be unit triangular.
m – [in] [int] m specifies the number of rows of each B_i. m >= 0.
n – [in] [int] n specifies the number of columns of each B_i. n >= 0.
alpha – [in] [void *] device pointer or host pointer specifying the scalar alpha. When alpha is &zero, then A is not referenced, and B does not need to be set before entry.
A – [in] [void *] device array of device pointers storing each matrix A_i. Each A_i is of dimension ( lda, k ), where k is m when HIPBLAS_SIDE_LEFT and is n when HIPBLAS_SIDE_RIGHT. Only the upper/lower triangular part is accessed.
lda – [in] [int] lda specifies the first dimension of each A_i.
If side = HIPBLAS_SIDE_LEFT, lda >= max( 1, m ).
If side = HIPBLAS_SIDE_RIGHT, lda >= max( 1, n ).
B – [inout] [void *] device array of device pointers storing each matrix B_i. Each B_i is of dimension ( ldb, n ). Before entry, the leading m by n part of the array B_i must contain the right-hand side matrix B_i, and on exit is overwritten by the solution matrix X_i.
ldb – [in] [int] ldb specifies the first dimension of each B_i. ldb >= max( 1, m ).
batchCount – [in] [int] specifies how many batches.
invA – [in] [void *] device array of device pointers storing the inverse diagonal blocks of each A_i. Each invA_i is of dimension ( ld_invA, k ), where k is m when HIPBLAS_SIDE_LEFT and is n when HIPBLAS_SIDE_RIGHT. ld_invA must be equal to 128.
invAsize – [in] [int] invAsize specifies the number of elements of device memory in each invA_i.
computeType – [in] [hipDataType] specifies the datatype of computation.
-
hipblasStatus_t hipblasTrsmStridedBatchedEx(hipblasHandle_t handle, hipblasSideMode_t side, hipblasFillMode_t uplo, hipblasOperation_t transA, hipblasDiagType_t diag, int m, int n, const void *alpha, void *A, int lda, hipblasStride strideA, void *B, int ldb, hipblasStride strideB, int batchCount, const void *invA, int invAsize, hipblasStride strideInvA, hipDataType computeType)#
BLAS EX API
The trsmStridedBatchedEx functions solve:
op(A_i)*X_i = alpha*B_i or X_i*op(A_i) = alpha*B_i,
for i = 1, …, batchCount, where alpha is a scalar, X and B are strided batched m by n matrices, A is a strided batched triangular matrix, and op(A_i) is one of:
op( A_i ) = A_i or op( A_i ) = A_i^T or op( A_i ) = A_i^H.
Each matrix X_i is overwritten on B_i.
This function gives the user the ability to reuse each invA_i matrix between runs. If invA == NULL, hipblasTrsmStridedBatchedEx will automatically calculate each invA_i on every run.
Setting up invA: Each accepted invA_i matrix consists of the packed 128x128 inverses of the diagonal blocks of matrix A_i, followed by any smaller diagonal block that remains. To set up invA_i, it is recommended that hipblasTrtriBatched be used with matrix A_i as the input. invA is a contiguous piece of memory holding each invA_i.
Device memory of size 128 x k should be allocated for each invA_i ahead of time, where k is m when HIPBLAS_SIDE_LEFT and is n when HIPBLAS_SIDE_RIGHT. The actual number of elements in each invA_i should be passed as invAsize.
To begin, hipblasTrtriBatched must be called on the full 128x128 sized diagonal blocks of each matrix A_i. Below are the restricted parameters:
n = 128
ldinvA = 128
stride_invA = 128x128
batchCount = k / 128
Then any remaining block can be added:
n = k % 128
invA = invA + stride_invA * previousBatchCount
ldinvA = 128
batchCount = 1
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
side – [in] [hipblasSideMode_t]
HIPBLAS_SIDE_LEFT: op(A)*X = alpha*B.
HIPBLAS_SIDE_RIGHT: X*op(A) = alpha*B.
uplo – [in] [hipblasFillMode_t]
HIPBLAS_FILL_MODE_UPPER: each A_i is an upper triangular matrix.
HIPBLAS_FILL_MODE_LOWER: each A_i is a lower triangular matrix.
transA – [in] [hipblasOperation_t]
HIPBLAS_OP_N: op(A) = A.
HIPBLAS_OP_T: op(A) = A^T.
HIPBLAS_OP_C: op(A) = A^H.
diag – [in] [hipblasDiagType_t]
HIPBLAS_DIAG_UNIT: each A_i is assumed to be unit triangular.
HIPBLAS_DIAG_NON_UNIT: each A_i is not assumed to be unit triangular.
m – [in] [int] m specifies the number of rows of each B_i. m >= 0.
n – [in] [int] n specifies the number of columns of each B_i. n >= 0.
alpha – [in] [void *] device pointer or host pointer specifying the scalar alpha. When alpha is &zero, then A is not referenced, and B does not need to be set before entry.
A – [in] [void *] device pointer storing matrix A. Of dimension ( lda, k ), where k is m when HIPBLAS_SIDE_LEFT and is n when HIPBLAS_SIDE_RIGHT. Only the upper/lower triangular part is accessed.
lda – [in] [int] lda specifies the first dimension of A.
If side = HIPBLAS_SIDE_LEFT, lda >= max( 1, m ).
If side = HIPBLAS_SIDE_RIGHT, lda >= max( 1, n ).
strideA – [in] [hipblasStride] The stride between each A matrix.
B – [inout] [void *] device pointer pointing to first matrix B_1. Each B_i is of dimension ( ldb, n ). Before entry, the leading m by n part of each array B_i must contain the right-hand side matrix B_i, and on exit is overwritten by the solution matrix X_i.
ldb – [in] [int] ldb specifies the first dimension of each B_i. ldb >= max( 1, m ).
strideB – [in] [hipblasStride] The stride between each B_i matrix.
batchCount – [in] [int] specifies how many batches.
invA – [in] [void *] device pointer storing the inverse diagonal blocks of each A_i. invA points to the first invA_1. Each invA_i is of dimension ( ld_invA, k ), where k is m when HIPBLAS_SIDE_LEFT and is n when HIPBLAS_SIDE_RIGHT. ld_invA must be equal to 128.
invAsize – [in] [int] invAsize specifies the number of elements of device memory in each invA_i.
strideInvA – [in] [hipblasStride] The stride between each invA matrix.
computeType – [in] [hipDataType] specifies the datatype of computation.
hipblasAxpyEx + Batched, StridedBatched#
-
hipblasStatus_t hipblasAxpyEx(hipblasHandle_t handle, int n, const void *alpha, hipDataType alphaType, const void *x, hipDataType xType, int incx, void *y, hipDataType yType, int incy, hipDataType executionType)#
BLAS EX API
The axpyEx functions compute a constant alpha multiplied by vector x, plus vector y:
y := alpha * x + y
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in x and y.
alpha – [in] device pointer or host pointer to specify the scalar alpha.
alphaType – [in] [hipDataType] specifies the datatype of alpha.
x – [in] device pointer storing vector x.
xType – [in] [hipDataType] specifies the datatype of vector x.
incx – [in] [int] specifies the increment for the elements of x.
y – [inout] device pointer storing vector y.
yType – [in] [hipDataType] specifies the datatype of vector y.
incy – [in] [int] specifies the increment for the elements of y.
executionType – [in] [hipDataType] specifies the datatype of computation.
The axpyEx function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasAxpyBatchedEx(hipblasHandle_t handle, int n, const void *alpha, hipDataType alphaType, const void *x, hipDataType xType, int incx, void *y, hipDataType yType, int incy, int batchCount, hipDataType executionType)#
BLAS EX API
The axpyBatchedEx functions compute a constant alpha multiplied by vector x, plus vector y, over a set of batched vectors.
y := alpha * x + y
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in each x_i and y_i.
alpha – [in] device pointer or host pointer to specify the scalar alpha.
alphaType – [in] [hipDataType] specifies the datatype of alpha.
x – [in] device array of device pointers storing each vector x_i.
xType – [in] [hipDataType] specifies the datatype of each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
y – [inout] device array of device pointers storing each vector y_i.
yType – [in] [hipDataType] specifies the datatype of each vector y_i.
incy – [in] [int] specifies the increment for the elements of each y_i.
batchCount – [in] [int] number of instances in the batch.
executionType – [in] [hipDataType] specifies the datatype of computation.
The axpyBatchedEx function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasAxpyStridedBatchedEx(hipblasHandle_t handle, int n, const void *alpha, hipDataType alphaType, const void *x, hipDataType xType, int incx, hipblasStride stridex, void *y, hipDataType yType, int incy, hipblasStride stridey, int batchCount, hipDataType executionType)#
BLAS EX API
The axpyStridedBatchedEx functions compute a constant alpha multiplied by vector x, plus vector y, over a set of strided batched vectors.
y := alpha * x + y
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in each x_i and y_i.
alpha – [in] device pointer or host pointer to specify the scalar alpha.
alphaType – [in] [hipDataType] specifies the datatype of alpha.
x – [in] device pointer to the first vector x_1.
xType – [in] [hipDataType] specifies the datatype of each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_(i+1)). There are no restrictions placed on stridex. However, the user should ensure that stridex is of an appropriate size. For a typical case, this means stridex >= n * incx.
y – [inout] device pointer to the first vector y_1.
yType – [in] [hipDataType] specifies the datatype of each vector y_i.
incy – [in] [int] specifies the increment for the elements of each y_i.
stridey – [in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_(i+1)). There are no restrictions placed on stridey. However, the user should ensure that stridey is of an appropriate size. For a typical case, this means stridey >= n * incy.
batchCount – [in] [int] number of instances in the batch.
executionType – [in] [hipDataType] specifies the datatype of computation.
The axpyStridedBatchedEx function supports the 64-bit integer interface. See the ILP64 interfaces section.
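The strided batched layout described above can be illustrated with a host-side reference loop. This is a minimal sketch for single-precision data, not the hipBLAS implementation: the function name `axpy_strided_batched_ref` is hypothetical, `std::int64_t` stands in for `hipblasStride` (assumed to be a 64-bit integer), and only positive increments are handled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical CPU reference for y_i := alpha * x_i + y_i over a strided batch.
// Vector i of the batch starts at x + i * stridex (and y + i * stridey);
// consecutive elements inside a vector are incx (incy) apart.
void axpy_strided_batched_ref(int n, float alpha,
                              const float* x, int incx, std::int64_t stridex,
                              float* y, int incy, std::int64_t stridey,
                              int batchCount)
{
    assert(incx > 0 && incy > 0); // negative increments are not handled in this sketch
    for (int b = 0; b < batchCount; ++b) {
        const float* xb = x + b * stridex; // start of x_b
        float*       yb = y + b * stridey; // start of y_b
        for (int i = 0; i < n; ++i)
            yb[std::int64_t(i) * incy] += alpha * xb[std::int64_t(i) * incx];
    }
}
```

With n = 2 and incx = 2, each x_i occupies n * incx = 4 elements, so stridex = 4 is the smallest stride that keeps the vectors from overlapping.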
hipblasDotEx + Batched, StridedBatched#
-
hipblasStatus_t hipblasDotEx(hipblasHandle_t handle, int n, const void *x, hipDataType xType, int incx, const void *y, hipDataType yType, int incy, void *result, hipDataType resultType, hipDataType executionType)#
BLAS EX API
The dotEx functions perform the dot product of vectors x and y:
result = x * y;
The dotcEx functions perform the dot product of the conjugate of complex vector x and complex vector y:
result = conjugate (x) * y;
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in x and y.
x – [in] device pointer storing vector x.
xType – [in] [hipDataType] specifies the datatype of vector x.
incx – [in] [int] specifies the increment for the elements of x.
y – [in] device pointer storing vector y.
yType – [in] [hipDataType] specifies the datatype of vector y.
incy – [in] [int] specifies the increment for the elements of y.
result – [inout] device pointer or host pointer to store the dot product. Return value is 0.0 if n <= 0.
resultType – [in] [hipDataType] specifies the datatype of the result.
executionType – [in] [hipDataType] specifies the datatype of computation.
The dotEx function supports the 64-bit integer interface. See the ILP64 interfaces section.
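The roles of the separate input, result, and execution types can be sketched on the host. The example below is an illustration only: `dot_ref` is a hypothetical name, and treating executionType as the accumulation type (`ExecT`) is an assumption made for this sketch rather than a statement about either backend.

```cpp
#include <cassert>

// Hypothetical CPU reference for result = x * y with element increments.
// T     : storage type of x and y (xType / yType)
// ExecT : type used for the multiply-accumulate (assumed role of executionType)
// ResT  : type of the returned value (resultType)
template <typename T, typename ExecT, typename ResT>
ResT dot_ref(int n, const T* x, int incx, const T* y, int incy)
{
    assert(incx > 0 && incy > 0); // negative increments are not handled in this sketch
    if (n <= 0)
        return ResT(0); // documented behavior: the result is 0.0 if n <= 0
    ExecT acc = ExecT(0);
    for (int i = 0; i < n; ++i)
        acc += ExecT(x[static_cast<long long>(i) * incx]) *
               ExecT(y[static_cast<long long>(i) * incy]);
    return ResT(acc);
}
```

For example, `dot_ref<float, double, float>(n, x, 1, y, 1)` reads float inputs, accumulates in double, and rounds once when returning a float.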
-
hipblasStatus_t hipblasDotBatchedEx(hipblasHandle_t handle, int n, const void *x, hipDataType xType, int incx, const void *y, hipDataType yType, int incy, int batchCount, void *result, hipDataType resultType, hipDataType executionType)#
BLAS EX API
The dotBatchedEx functions perform a batch of dot products of vectors x and y:
result_i = x_i * y_i;
The dotcBatchedEx functions perform a batch of dot products of the conjugate of complex vector x and complex vector y:
result_i = conjugate (x_i) * y_i;
where (x_i, y_i) is the i-th instance of the batch and x_i and y_i are vectors, for i = 1, …, batchCount.
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in each x_i and y_i.
x – [in] device array of device pointers storing each vector x_i.
xType – [in] [hipDataType] specifies the datatype of each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
y – [in] device array of device pointers storing each vector y_i.
yType – [in] [hipDataType] specifies the datatype of each vector y_i.
incy – [in] [int] specifies the increment for the elements of each y_i.
batchCount – [in] [int] number of instances in the batch.
result – [inout] device array or host array of batchCount size to store the dot products of each batch. Returns 0.0 for each element if n <= 0.
resultType – [in] [hipDataType] specifies the datatype of the result.
executionType – [in] [hipDataType] specifies the datatype of computation.
The dotBatchedEx function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasDotStridedBatchedEx(hipblasHandle_t handle, int n, const void *x, hipDataType xType, int incx, hipblasStride stridex, const void *y, hipDataType yType, int incy, hipblasStride stridey, int batchCount, void *result, hipDataType resultType, hipDataType executionType)#
BLAS EX API
The dotStridedBatchedEx functions perform a batch of dot products of vectors x and y:
result_i = x_i * y_i;
The dotcStridedBatchedEx functions perform a batch of dot products of the conjugate of complex vector x and complex vector y:
result_i = conjugate (x_i) * y_i;
where (x_i, y_i) is the i-th instance of the batch and x_i and y_i are vectors, for i = 1, …, batchCount.
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in each x_i and y_i.
x – [in] device pointer to the first vector (x_1) in the batch.
xType – [in] [hipDataType] specifies the datatype of each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1).
y – [in] device pointer to the first vector (y_1) in the batch.
yType – [in] [hipDataType] specifies the datatype of each vector y_i.
incy – [in] [int] specifies the increment for the elements of each y_i.
stridey – [in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_i+1).
batchCount – [in] [int] number of instances in the batch.
result – [inout] device array or host array of batchCount size to store the dot products of each batch. Returns 0.0 for each element if n <= 0.
resultType – [in] [hipDataType] specifies the datatype of the result.
executionType – [in] [hipDataType] specifies the datatype of computation.
The dotStridedBatchedEx function supports the 64-bit integer interface. See the ILP64 interfaces section.
hipblasDotcEx + Batched, StridedBatched#
-
hipblasStatus_t hipblasDotcEx(hipblasHandle_t handle, int n, const void *x, hipDataType xType, int incx, const void *y, hipDataType yType, int incy, void *result, hipDataType resultType, hipDataType executionType)#
BLAS EX API
The dotEx functions perform the dot product of vectors x and y:
result = x * y;
The dotcEx functions perform the dot product of the conjugate of complex vector x and complex vector y:
result = conjugate (x) * y;
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in x and y.
x – [in] device pointer storing vector x.
xType – [in] [hipDataType] specifies the datatype of vector x.
incx – [in] [int] specifies the increment for the elements of x.
y – [in] device pointer storing vector y.
yType – [in] [hipDataType] specifies the datatype of vector y.
incy – [in] [int] specifies the increment for the elements of y.
result – [inout] device pointer or host pointer to store the dot product. Return value is 0.0 if n <= 0.
resultType – [in] [hipDataType] specifies the datatype of the result.
executionType – [in] [hipDataType] specifies the datatype of computation.
The dotcEx function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasDotcBatchedEx(hipblasHandle_t handle, int n, const void *x, hipDataType xType, int incx, const void *y, hipDataType yType, int incy, int batchCount, void *result, hipDataType resultType, hipDataType executionType)#
BLAS EX API
The dotBatchedEx functions perform a batch of dot products of vectors x and y:
result_i = x_i * y_i;
The dotcBatchedEx functions perform a batch of dot products of the conjugate of complex vector x and complex vector y:
result_i = conjugate (x_i) * y_i;
where (x_i, y_i) is the i-th instance of the batch and x_i and y_i are vectors, for i = 1, …, batchCount.
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in each x_i and y_i.
x – [in] device array of device pointers storing each vector x_i.
xType – [in] [hipDataType] specifies the datatype of each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
y – [in] device array of device pointers storing each vector y_i.
yType – [in] [hipDataType] specifies the datatype of each vector y_i.
incy – [in] [int] specifies the increment for the elements of each y_i.
batchCount – [in] [int] number of instances in the batch.
result – [inout] device array or host array of batchCount size to store the dot products of each batch. Returns 0.0 for each element if n <= 0.
resultType – [in] [hipDataType] specifies the datatype of the result.
executionType – [in] [hipDataType] specifies the datatype of computation.
The dotcBatchedEx function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasDotcStridedBatchedEx(hipblasHandle_t handle, int n, const void *x, hipDataType xType, int incx, hipblasStride stridex, const void *y, hipDataType yType, int incy, hipblasStride stridey, int batchCount, void *result, hipDataType resultType, hipDataType executionType)#
BLAS EX API
The dotStridedBatchedEx functions perform a batch of dot products of vectors x and y:
result_i = x_i * y_i;
The dotcStridedBatchedEx functions perform a batch of dot products of the conjugate of complex vector x and complex vector y:
result_i = conjugate (x_i) * y_i;
where (x_i, y_i) is the i-th instance of the batch and x_i and y_i are vectors, for i = 1, …, batchCount.
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in each x_i and y_i.
x – [in] device pointer to the first vector (x_1) in the batch.
xType – [in] [hipDataType] specifies the datatype of each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_i+1).
y – [in] device pointer to the first vector (y_1) in the batch.
yType – [in] [hipDataType] specifies the datatype of each vector y_i.
incy – [in] [int] specifies the increment for the elements of each y_i.
stridey – [in] [hipblasStride] stride from the start of one vector (y_i) to the next one (y_i+1).
batchCount – [in] [int] number of instances in the batch.
result – [inout] device array or host array of batchCount size to store the dot products of each batch. Returns 0.0 for each element if n <= 0.
resultType – [in] [hipDataType] specifies the datatype of the result.
executionType – [in] [hipDataType] specifies the datatype of computation.
The dotcStridedBatchedEx function supports the 64-bit integer interface. See the ILP64 interfaces section.
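The difference between the dot and dotc variants only shows up for complex data. The following host-side sketch compares the two on `std::complex<float>` vectors; the names `dotu_ref` and `dotc_ref` are hypothetical and the code is a reference illustration, not the library's implementation.

```cpp
#include <cassert>
#include <complex>

using cf = std::complex<float>;

// Hypothetical CPU reference: result = sum_i x_i * y_i (no conjugation).
cf dotu_ref(int n, const cf* x, int incx, const cf* y, int incy)
{
    assert(incx > 0 && incy > 0); // negative increments are not handled in this sketch
    cf acc(0.0f, 0.0f);
    for (int i = 0; i < n; ++i)
        acc += x[i * incx] * y[i * incy];
    return acc;
}

// Hypothetical CPU reference: result = sum_i conjugate(x_i) * y_i.
cf dotc_ref(int n, const cf* x, int incx, const cf* y, int incy)
{
    assert(incx > 0 && incy > 0);
    cf acc(0.0f, 0.0f);
    for (int i = 0; i < n; ++i)
        acc += std::conj(x[i * incx]) * y[i * incy];
    return acc;
}
```

For x = (1+2i, i) and y = (3+4i, i), the unconjugated product is -6+10i while the conjugated product is 12-2i.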
hipblasNrm2Ex + Batched, StridedBatched#
-
hipblasStatus_t hipblasNrm2Ex(hipblasHandle_t handle, int n, const void *x, hipDataType xType, int incx, void *result, hipDataType resultType, hipDataType executionType)#
BLAS EX API
The nrm2Ex functions compute the Euclidean norm of a real or complex vector:
result := sqrt( x'*x ) for real vectors
result := sqrt( x**H*x ) for complex vectors
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in x.
x – [in] device pointer storing vector x.
xType – [in] [hipDataType] specifies the datatype of the vector x.
incx – [in] [int] specifies the increment for the elements of x.
result – [inout] device pointer or host pointer to store the nrm2 result. The return value is 0.0 if n <= 0 or incx <= 0.
resultType – [in] [hipDataType] specifies the datatype of the result.
executionType – [in] [hipDataType] specifies the datatype of computation.
The nrm2Ex function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasNrm2BatchedEx(hipblasHandle_t handle, int n, const void *x, hipDataType xType, int incx, int batchCount, void *result, hipDataType resultType, hipDataType executionType)#
BLAS EX API
The nrm2BatchedEx functions compute the Euclidean norm over a batch of real or complex vectors:
result_i := sqrt( x_i'*x_i ) for real vectors x, for i = 1, ..., batchCount
result_i := sqrt( x_i**H*x_i ) for complex vectors x, for i = 1, ..., batchCount
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in each x_i.
x – [in] device array of device pointers storing each vector x_i.
xType – [in] [hipDataType] specifies the datatype of each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i. incx must be > 0.
batchCount – [in] [int] number of instances in the batch.
result – [out] device pointer or host pointer to array of batchCount size for nrm2 results. Returns 0.0 for each element if n <= 0 or incx <= 0.
resultType – [in] [hipDataType] specifies the datatype of the result.
executionType – [in] [hipDataType] specifies the datatype of computation.
The nrm2BatchedEx function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasNrm2StridedBatchedEx(hipblasHandle_t handle, int n, const void *x, hipDataType xType, int incx, hipblasStride stridex, int batchCount, void *result, hipDataType resultType, hipDataType executionType)#
BLAS EX API
The nrm2StridedBatchedEx functions compute the Euclidean norm over a batch of real or complex vectors:
result_i := sqrt( x_i'*x_i ) for real vectors x, for i = 1, ..., batchCount
result_i := sqrt( x_i**H*x_i ) for complex vectors x, for i = 1, ..., batchCount
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in each x_i.
x – [in] device pointer to the first vector x_1.
xType – [in] [hipDataType] specifies the datatype of each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i. incx must be > 0.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_(i+1)). There are no restrictions placed on stridex. However, the user should ensure that stridex is of an appropriate size. For a typical case, this means stridex >= n * incx.
batchCount – [in] [int] number of instances in the batch.
result – [out] device pointer or host pointer to array for storing contiguous batchCount results. Returns 0.0 for each element if n <= 0 or incx <= 0.
resultType – [in] [hipDataType] specifies the datatype of the result.
executionType – [in] [hipDataType] specifies the datatype of computation.
The nrm2StridedBatchedEx function supports the 64-bit integer interface. See the ILP64 interfaces section.
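The per-instance results and the zero result for n <= 0 or incx <= 0 can be sketched with a host-side reference for real single-precision data. The function name `nrm2_strided_batched_ref` is hypothetical, `std::int64_t` stands in for `hipblasStride`, and the plain sum of squares used here is a simplification: production BLAS implementations typically guard against overflow and underflow.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical CPU reference: result[b] = sqrt(x_b' * x_b) for each instance b.
// The results are stored contiguously, one per batch instance.
void nrm2_strided_batched_ref(int n, const float* x, int incx,
                              std::int64_t stridex, int batchCount,
                              float* result)
{
    assert(batchCount >= 0);
    for (int b = 0; b < batchCount; ++b) {
        if (n <= 0 || incx <= 0) { // documented behavior: 0.0 for each element
            result[b] = 0.0f;
            continue;
        }
        const float* xb = x + b * stridex; // start of x_b
        double sum = 0.0;                  // accumulate in double (an assumption)
        for (int i = 0; i < n; ++i) {
            const double v = xb[std::int64_t(i) * incx];
            sum += v * v;
        }
        result[b] = static_cast<float>(std::sqrt(sum));
    }
}
```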
hipblasRotEx + Batched, StridedBatched#
-
hipblasStatus_t hipblasRotEx(hipblasHandle_t handle, int n, void *x, hipDataType xType, int incx, void *y, hipDataType yType, int incy, const void *c, const void *s, hipDataType csType, hipDataType executionType)#
BLAS EX API
The rotEx functions apply the Givens rotation matrix defined by c = cos(alpha) and s = sin(alpha) to vectors x and y. Scalars c and s can be stored in either host or device memory. The location is specified by calling hipblasSetPointerMode.
If csType is real:
x := c * x + s * y
y := c * y - s * x
If csType is complex, the imaginary part of c is ignored:
x := real(c) * x + s * y
y := real(c) * y - conj(s) * x
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in the x and y vectors.
x – [inout] device pointer storing vector x.
xType – [in] [hipDataType] specifies the datatype of vector x.
incx – [in] [int] specifies the increment between elements of x.
y – [inout] device pointer storing vector y.
yType – [in] [hipDataType] specifies the datatype of vector y.
incy – [in] [int] specifies the increment between elements of y.
c – [in] device pointer or host pointer storing scalar cosine component of the rotation matrix.
s – [in] device pointer or host pointer storing scalar sine component of the rotation matrix.
csType – [in] [hipDataType] specifies the datatype of c and s.
executionType – [in] [hipDataType] specifies the datatype of computation.
The rotEx function supports the 64-bit integer interface. See the ILP64 interfaces section.
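The complex case of the rotation can be sketched on the host as follows. This is an illustration under stated assumptions, not the library's code: `rot_ref` is a hypothetical name, all operands are `std::complex<float>`, and only positive increments are handled. Note that both updates read the values of x and y from before the rotation.

```cpp
#include <cassert>
#include <complex>

using cf = std::complex<float>;

// Hypothetical CPU reference for a complex csType:
//   x := real(c) * x + s * y
//   y := real(c) * y - conj(s) * x
void rot_ref(int n, cf* x, int incx, cf* y, int incy, cf c, cf s)
{
    assert(incx > 0 && incy > 0); // negative increments are not handled in this sketch
    const float cr = c.real();    // the imaginary part of c is ignored
    for (int i = 0; i < n; ++i) {
        cf& xi = x[i * incx];
        cf& yi = y[i * incy];
        const cf xo = xi; // keep the old values so that the second
        const cf yo = yi; // update does not see the rotated x
        xi = cr * xo + s * yo;
        yi = cr * yo - std::conj(s) * xo;
    }
}
```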
-
hipblasStatus_t hipblasRotBatchedEx(hipblasHandle_t handle, int n, void *x, hipDataType xType, int incx, void *y, hipDataType yType, int incy, const void *c, const void *s, hipDataType csType, int batchCount, hipDataType executionType)#
BLAS EX API
The rotBatchedEx functions apply the Givens rotation matrix defined by c = cos(alpha) and s = sin(alpha) to batched vectors x_i and y_i, for i = 1, …, batchCount. Scalars c and s can be stored in either host or device memory. The location is specified by calling hipblasSetPointerMode.
If csType is real:
x := c * x + s * y
y := c * y - s * x
If csType is complex, the imaginary part of c is ignored:
x := real(c) * x + s * y
y := real(c) * y - conj(s) * x
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in each x_i and y_i vectors.
x – [inout] device array of device pointers storing each vector x_i.
xType – [in] [hipDataType] specifies the datatype of each vector x_i.
incx – [in] [int] specifies the increment between elements of each x_i.
y – [inout] device array of device pointers storing each vector y_i.
yType – [in] [hipDataType] specifies the datatype of each vector y_i.
incy – [in] [int] specifies the increment between elements of each y_i.
c – [in] device pointer or host pointer to scalar cosine component of the rotation matrix.
s – [in] device pointer or host pointer to scalar sine component of the rotation matrix.
csType – [in] [hipDataType] specifies the datatype of c and s.
batchCount – [in] [int] the number of x and y arrays, that is, the number of batches.
executionType – [in] [hipDataType] specifies the datatype of computation.
The rotBatchedEx function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasRotStridedBatchedEx(hipblasHandle_t handle, int n, void *x, hipDataType xType, int incx, hipblasStride stridex, void *y, hipDataType yType, int incy, hipblasStride stridey, const void *c, const void *s, hipDataType csType, int batchCount, hipDataType executionType)#
BLAS EX API
The rotStridedBatchedEx functions apply the Givens rotation matrix defined by c = cos(alpha) and s = sin(alpha) to strided batched vectors x_i and y_i, for i = 1, …, batchCount. Scalars c and s can be stored in either host or device memory. The location is specified by calling hipblasSetPointerMode.
If csType is real:
x := c * x + s * y
y := c * y - s * x
If csType is complex, the imaginary part of c is ignored:
x := real(c) * x + s * y
y := real(c) * y - conj(s) * x
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] number of elements in each x_i and y_i vectors.
x – [inout] device pointer to the first vector x_1.
xType – [in] [hipDataType] specifies the datatype of each vector x_i.
incx – [in] [int] specifies the increment between elements of each x_i.
stridex – [in] [hipblasStride] specifies the increment from the beginning of x_i to the beginning of x_(i+1).
y – [inout] device pointer to the first vector y_1.
yType – [in] [hipDataType] specifies the datatype of each vector y_i.
incy – [in] [int] specifies the increment between elements of each y_i.
stridey – [in] [hipblasStride] specifies the increment from the beginning of y_i to the beginning of y_(i+1).
c – [in] device pointer or host pointer to scalar cosine component of the rotation matrix.
s – [in] device pointer or host pointer to scalar sine component of the rotation matrix.
csType – [in] [hipDataType] specifies the datatype of c and s.
batchCount – [in] [int] the number of x and y arrays, that is, the number of batches.
executionType – [in] [hipDataType] specifies the datatype of computation.
The rotStridedBatchedEx function supports the 64-bit integer interface. See the ILP64 interfaces section.
hipblasScalEx + Batched, StridedBatched#
-
hipblasStatus_t hipblasScalEx(hipblasHandle_t handle, int n, const void *alpha, hipDataType alphaType, void *x, hipDataType xType, int incx, hipDataType executionType)#
BLAS EX API
The scalEx functions scale each element of vector x with scalar alpha:
x := alpha * x
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in x.
alpha – [in] device pointer or host pointer for the scalar alpha.
alphaType – [in] [hipDataType] specifies the datatype of alpha.
x – [inout] device pointer storing vector x.
xType – [in] [hipDataType] specifies the datatype of vector x.
incx – [in] [int] specifies the increment for the elements of x.
executionType – [in] [hipDataType] specifies the datatype of computation.
The scalEx function supports the 64-bit integer interface. See the ILP64 interfaces section.
-
hipblasStatus_t hipblasScalBatchedEx(hipblasHandle_t handle, int n, const void *alpha, hipDataType alphaType, void *x, hipDataType xType, int incx, int batchCount, hipDataType executionType)#
BLAS EX API
The scalBatchedEx functions scale each element of each vector x_i with scalar alpha:
x_i := alpha * x_i
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in each x_i.
alpha – [in] device pointer or host pointer for the scalar alpha.
alphaType – [in] [hipDataType] specifies the datatype of alpha.
x – [inout] device array of device pointers storing each vector x_i.
xType – [in] [hipDataType] specifies the datatype of each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
batchCount – [in] [int] number of instances in the batch.
executionType – [in] [hipDataType] specifies the datatype of computation.
The scalBatchedEx function supports the 64-bit integer interface. See the ILP64 interfaces section.
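In the batched (non-strided) variants, x is an array of pointers with one entry per batch instance, so the vectors do not need to be evenly spaced in memory. A minimal host-side sketch of that layout, using the hypothetical name `scal_batched_ref` and single-precision data:

```cpp
#include <cassert>

// Hypothetical CPU reference for x_i := alpha * x_i, where x[b] points to
// the first element of the b-th vector in the batch.
void scal_batched_ref(int n, float alpha, float* const x[], int incx,
                      int batchCount)
{
    assert(incx > 0); // non-positive increments are not handled in this sketch
    for (int b = 0; b < batchCount; ++b)
        for (int i = 0; i < n; ++i)
            x[b][static_cast<long long>(i) * incx] *= alpha;
}
```

With incx = 2, only every second element of each vector is scaled; the elements in between are left untouched.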
-
hipblasStatus_t hipblasScalStridedBatchedEx(hipblasHandle_t handle, int n, const void *alpha, hipDataType alphaType, void *x, hipDataType xType, int incx, hipblasStride stridex, int batchCount, hipDataType executionType)#
BLAS EX API
The scalStridedBatchedEx functions scale each element of each vector x_i with scalar alpha over a set of strided batched vectors:
x_i := alpha * x_i
The supported types are determined by the backend. See the rocBLAS or cuBLAS documentation.
- Parameters:
handle – [in] [hipblasHandle_t] handle to the hipBLAS library context queue.
n – [in] [int] the number of elements in each x_i.
alpha – [in] device pointer or host pointer for the scalar alpha.
alphaType – [in] [hipDataType] specifies the datatype of alpha.
x – [inout] device pointer to the first vector x_1.
xType – [in] [hipDataType] specifies the datatype of each vector x_i.
incx – [in] [int] specifies the increment for the elements of each x_i.
stridex – [in] [hipblasStride] stride from the start of one vector (x_i) to the next one (x_(i+1)). There are no restrictions placed on stridex. However, the user should ensure that stridex is of an appropriate size. For a typical case, this means stridex >= n * incx.
batchCount – [in] [int] number of instances in the batch.
executionType – [in] [hipDataType] specifies the datatype of computation.
The scalStridedBatchedEx function supports the 64-bit integer interface. See the ILP64 interfaces section.
SOLVER API#
hipblasXgetrf + Batched, stridedBatched#
-
hipblasStatus_t hipblasSgetrf(hipblasHandle_t handle, const int n, float *A, const int lda, int *ipiv, int *info)#
-
hipblasStatus_t hipblasDgetrf(hipblasHandle_t handle, const int n, double *A, const int lda, int *ipiv, int *info)#
-
hipblasStatus_t hipblasCgetrf(hipblasHandle_t handle, const int n, hipComplex *A, const int lda, int *ipiv, int *info)#
-
hipblasStatus_t hipblasZgetrf(hipblasHandle_t handle, const int n, hipDoubleComplex *A, const int lda, int *ipiv, int *info)#
SOLVER API
The getrf functions compute the LU factorization of a general n-by-n matrix A using partial pivoting with row interchanges. The LU factorization can be done without pivoting if ipiv is passed as a nullptr.
When ipiv is not null, the factorization has the form:
\[ A = PLU \]
where P is a permutation matrix, L is lower triangular with unit diagonal elements, and U is upper triangular.
When ipiv is null, the factorization is done without pivoting:
\[ A = LU \]
Supported precisions in rocSOLVER: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] hipblasHandle_t.
n – [in] int. n >= 0. The number of columns and rows of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n.
On entry, the n-by-n matrix A to be factored.
On exit, the factors L and U from the factorization.
The unit diagonal elements of L are not stored.
lda – [in] int. lda >= n. Specifies the leading dimension of A.
ipiv – [out] pointer to int. Array on the GPU of dimension n. The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= i <= n, row i of the matrix was interchanged with row ipiv[i]. Matrix P of the factorization can be derived from ipiv. This factorization can be done without pivoting if ipiv is passed in as a nullptr.
info – [out] pointer to an int on the GPU.
If info = 0, successful exit.
If info = j > 0, U is singular. U[j,j] is the first zero pivot.
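The storage conventions above (column-major A with leading dimension lda, L and U overwriting A, 1-based pivot indices, and info reporting the first zero pivot) can be illustrated with a small host-side reference. This is a textbook right-looking LU sketch, not the algorithm used by rocSOLVER or cuBLAS; the name `getrf_ref` is hypothetical, and info is returned by value here instead of being written through a device pointer.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Hypothetical CPU reference. A is n-by-n, column-major: A(i, j) = A[i + j * lda].
// On exit A holds U on and above the diagonal and the multipliers of L below it
// (the unit diagonal of L is not stored). If ipiv is null, no pivoting is done.
// Returns 0 on success, or j > 0 if U[j,j] is the first zero pivot.
int getrf_ref(int n, double* A, int lda, int* ipiv)
{
    assert(n >= 0 && lda >= n);
    int info = 0;
    for (int k = 0; k < n; ++k) {
        if (ipiv) {
            int p = k; // row holding the largest |entry| in column k
            for (int i = k + 1; i < n; ++i)
                if (std::fabs(A[i + k * lda]) > std::fabs(A[p + k * lda]))
                    p = i;
            ipiv[k] = p + 1; // pivot indices are 1-based
            if (p != k)
                for (int j = 0; j < n; ++j)
                    std::swap(A[k + j * lda], A[p + j * lda]);
        }
        const double pivot = A[k + k * lda];
        if (pivot == 0.0) {
            if (info == 0)
                info = k + 1; // record the first zero pivot, 1-based
            continue;
        }
        for (int i = k + 1; i < n; ++i) {
            A[i + k * lda] /= pivot; // multiplier l_ik
            for (int j = k + 1; j < n; ++j)
                A[i + j * lda] -= A[i + k * lda] * A[k + j * lda];
        }
    }
    return info;
}
```

For the 2-by-2 matrix with rows (2, 1) and (4, 3), the rows are swapped once, so ipiv = {2, 2}, and A is overwritten with U = rows (4, 3), (0, -0.5) and the single multiplier 0.5 below the diagonal.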
-
hipblasStatus_t hipblasSgetrfBatched(hipblasHandle_t handle, const int n, float *const A[], const int lda, int *ipiv, int *info, const int batchCount)#
-
hipblasStatus_t hipblasDgetrfBatched(hipblasHandle_t handle, const int n, double *const A[], const int lda, int *ipiv, int *info, const int batchCount)#
-
hipblasStatus_t hipblasCgetrfBatched(hipblasHandle_t handle, const int n, hipComplex *const A[], const int lda, int *ipiv, int *info, const int batchCount)#
-
hipblasStatus_t hipblasZgetrfBatched(hipblasHandle_t handle, const int n, hipDoubleComplex *const A[], const int lda, int *ipiv, int *info, const int batchCount)#
SOLVER API
The getrfBatched functions compute the LU factorization of a batch of general n-by-n matrices using partial pivoting with row interchanges. The LU factorization can be done without pivoting if ipiv is passed as a nullptr.
When ipiv is not null, the factorization of matrix \(A_i\) in the batch has the form:
\[ A_i = P_iL_iU_i \]
where \(P_i\) is a permutation matrix, \(L_i\) is lower triangular with unit diagonal elements, and \(U_i\) is upper triangular.
When ipiv is null, the factorization is done without pivoting:
\[ A_i = L_iU_i \]
Supported precisions in rocSOLVER: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] hipblasHandle_t.
n – [in] int. n >= 0. The number of columns and rows of all matrices A_i in the batch.
A – [inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the n-by-n matrices A_i to be factored.
On exit, the factors L_i and U_i from the factorizations.
The unit diagonal elements of L_i are not stored.
lda – [in] int. lda >= n. Specifies the leading dimension of matrices A_i.
ipiv – [out] pointer to int. Array on the GPU. Contains the vectors of pivot indices ipiv_i (corresponding to A_i). Dimension of ipiv_i is n. Elements of ipiv_i are 1-based indices. For each instance A_i in the batch and for 1 <= j <= n, row j of the matrix A_i was interchanged with row ipiv_i[j]. Matrix P_i of the factorization can be derived from ipiv_i. This factorization can be done without pivoting if ipiv is passed in as a nullptr.
info – [out] pointer to int. Array of batchCount integers on the GPU.
If info[i] = 0, successful exit for factorization of A_i.
If info[i] = j > 0, U_i is singular. U_i[j,j] is the first zero pivot.
batchCount – [in] int. batchCount >= 0. Number of matrices in the batch.
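The statement that P_i can be derived from ipiv_i can be made concrete for a single instance. The sketch below assumes that ipiv_i has already been copied to the host and that the recorded interchanges were applied in order j = 1, ..., n, as in the LAPACK convention; the helper name `perm_from_ipiv` is hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: replays the row interchanges stored in ipiv (1-based)
// on the identity ordering. In the returned vector, perm[i] is the 0-based
// row of the original matrix that ends up as row i of L * U.
std::vector<int> perm_from_ipiv(int n, const int* ipiv)
{
    std::vector<int> perm(n);
    for (int i = 0; i < n; ++i)
        perm[i] = i;
    for (int i = 0; i < n; ++i) {
        assert(ipiv[i] >= 1 && ipiv[i] <= n);
        std::swap(perm[i], perm[ipiv[i] - 1]); // row i <-> row ipiv[i]
    }
    return perm;
}
```

Under these assumptions, the permutation matrix in A = P L U has a 1 in row perm[i], column i, and zeros elsewhere.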
-
hipblasStatus_t hipblasSgetrfStridedBatched(hipblasHandle_t handle, const int n, float *A, const int lda, const hipblasStride strideA, int *ipiv, const hipblasStride strideP, int *info, const int batchCount)#
-
hipblasStatus_t hipblasDgetrfStridedBatched(hipblasHandle_t handle, const int n, double *A, const int lda, const hipblasStride strideA, int *ipiv, const hipblasStride strideP, int *info, const int batchCount)#
-
hipblasStatus_t hipblasCgetrfStridedBatched(hipblasHandle_t handle, const int n, hipComplex *A, const int lda, const hipblasStride strideA, int *ipiv, const hipblasStride strideP, int *info, const int batchCount)#
-
hipblasStatus_t hipblasZgetrfStridedBatched(hipblasHandle_t handle, const int n, hipDoubleComplex *A, const int lda, const hipblasStride strideA, int *ipiv, const hipblasStride strideP, int *info, const int batchCount)#
SOLVER API
The getrfStridedBatched functions compute the LU factorization of a batch of general n-by-n matrices using partial pivoting with row interchanges. The LU factorization can be done without pivoting if ipiv is passed as a nullptr.
When ipiv is not null, the factorization of matrix \(A_i\) in the batch has the form:
\[ A_i = P_iL_iU_i \]
where \(P_i\) is a permutation matrix, \(L_i\) is lower triangular with unit diagonal elements, and \(U_i\) is upper triangular.
When ipiv is null, the factorization is done without pivoting:
\[ A_i = L_iU_i \]
Supported precisions in rocSOLVER: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] hipblasHandle_t.
n – [in] int. n >= 0. The number of columns and rows of all matrices A_i in the batch.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the n-by-n matrices A_i to be factored.
On exit, the factors L_i and U_i from the factorization.
The unit diagonal elements of L_i are not stored.
lda – [in] int. lda >= n. Specifies the leading dimension of matrices A_i.
strideA – [in] hipblasStride. Stride from the start of one matrix A_i to the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out] pointer to int. Array on the GPU (the size depends on the value of strideP). Contains the vectors of pivot indices ipiv_i (corresponding to A_i). Dimension of ipiv_i is n. Elements of ipiv_i are 1-based indices. For each instance A_i in the batch and for 1 <= j <= n, row j of the matrix A_i was interchanged with row ipiv_i[j]. Matrix P_i of the factorization can be derived from ipiv_i. The factorization can be done without pivoting if ipiv is passed in as a nullptr.
strideP – [in] hipblasStride. Stride from the start of one vector ipiv_i to the next one ipiv_(i+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.
info – [out] pointer to int. Array of batchCount integers on the GPU.
If info[i] = 0, successful exit for factorization of A_i.
If info[i] = j > 0, U_i is singular. U_i[j,j] is the first zero pivot.
batchCount – [in] int. batchCount >= 0. Number of matrices in the batch.
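Because the sizes of A and ipiv depend on strideA and strideP, it can help to write the index arithmetic down explicitly. The helpers below are hypothetical and use 0-based indices; `std::int64_t` stands in for `hipblasStride`, and the length formula assumes non-negative strides with the last block fully contained in the buffer.

```cpp
#include <cassert>
#include <cstdint>

// Index of element (r, c) of matrix A_i in a strided batch of
// column-major matrices with leading dimension lda.
std::int64_t a_index(int i, int r, int c, int lda, std::int64_t strideA)
{
    assert(i >= 0 && r >= 0 && r < lda && c >= 0);
    return i * strideA + r + std::int64_t(c) * lda;
}

// Index of pivot entry j of ipiv_i.
std::int64_t ipiv_index(int i, int j, std::int64_t strideP)
{
    assert(i >= 0 && j >= 0);
    return i * strideP + j;
}

// Smallest buffer length (in elements) that holds batchCount blocks of
// blockSize elements placed stride elements apart.
std::int64_t strided_length(int batchCount, std::int64_t stride,
                            std::int64_t blockSize)
{
    assert(stride >= 0 && blockSize >= 0);
    if (batchCount <= 0)
        return 0;
    return (batchCount - 1) * stride + blockSize;
}
```

For n = 3, lda = 4, and batchCount = 5 with the tightly packed strides strideA = lda * n = 12 and strideP = n = 3, A needs 60 elements and ipiv needs 15.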
hipblasXgetrs + Batched, stridedBatched#
-
hipblasStatus_t hipblasSgetrs(hipblasHandle_t handle, const hipblasOperation_t trans, const int n, const int nrhs, float *A, const int lda, const int *ipiv, float *B, const int ldb, int *info)#
-
hipblasStatus_t hipblasDgetrs(hipblasHandle_t handle, const hipblasOperation_t trans, const int n, const int nrhs, double *A, const int lda, const int *ipiv, double *B, const int ldb, int *info)#
-
hipblasStatus_t hipblasCgetrs(hipblasHandle_t handle, const hipblasOperation_t trans, const int n, const int nrhs, hipComplex *A, const int lda, const int *ipiv, hipComplex *B, const int ldb, int *info)#
-
hipblasStatus_t hipblasZgetrs(hipblasHandle_t handle, const hipblasOperation_t trans, const int n, const int nrhs, hipDoubleComplex *A, const int lda, const int *ipiv, hipDoubleComplex *B, const int ldb, int *info)#
SOLVER API
The getrs functions solve a system of n linear equations in n variables in its factorized form.
They solve one of the following systems, depending on the value of trans:
\[\begin{split} \begin{array}{cl} A X = B & \: \text{not transposed,}\\ A^T X = B & \: \text{transposed, or}\\ A^H X = B & \: \text{conjugate transposed.} \end{array} \end{split}\]
Matrix A is defined by its triangular factors as returned by getrf.
Supported precisions in rocSOLVER: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] hipblasHandle_t.
trans – [in] hipblasOperation_t. Specifies the form of the system of equations.
n – [in] int. n >= 0. The order of the system, that is, the number of columns and rows of A.
nrhs – [in] int. nrhs >= 0. The number of right hand sides, that is, the number of columns of the matrix B.
A – [in] pointer to type. Array on the GPU of dimension lda*n. The factors L and U of the factorization A = P*L*U returned by getrf.
lda – [in] int. lda >= n. The leading dimension of A.
ipiv – [in] pointer to int. Array on the GPU of dimension n. The pivot indices returned by getrf.
B – [inout] pointer to type. Array on the GPU of dimension ldb*nrhs.
On entry, the right hand side matrix B.
On exit, the solution matrix X.
ldb – [in] int. ldb >= n. The leading dimension of B.
info – [out] pointer to an int on the host.
If info = 0, successful exit.
If info = j < 0, the argument at position -j is invalid.
-
hipblasStatus_t hipblasSgetrsBatched(hipblasHandle_t handle, const hipblasOperation_t trans, const int n, const int nrhs, float *const A[], const int lda, const int *ipiv, float *const B[], const int ldb, int *info, const int batchCount)#
-
hipblasStatus_t hipblasDgetrsBatched(hipblasHandle_t handle, const hipblasOperation_t trans, const int n, const int nrhs, double *const A[], const int lda, const int *ipiv, double *const B[], const int ldb, int *info, const int batchCount)#
-
hipblasStatus_t hipblasCgetrsBatched(hipblasHandle_t handle, const hipblasOperation_t trans, const int n, const int nrhs, hipComplex *const A[], const int lda, const int *ipiv, hipComplex *const B[], const int ldb, int *info, const int batchCount)#
-
hipblasStatus_t hipblasZgetrsBatched(hipblasHandle_t handle, const hipblasOperation_t trans, const int n, const int nrhs, hipDoubleComplex *const A[], const int lda, const int *ipiv, hipDoubleComplex *const B[], const int ldb, int *info, const int batchCount)#
SOLVER API
The getrsBatched functions solve a batch of systems of n linear equations in n variables in their factorized forms.
For each instance i in the batch, they solve one of the following systems, depending on the value of trans:
\[\begin{split} \begin{array}{cl} A_i X_i = B_i & \: \text{not transposed,}\\ A_i^T X_i = B_i & \: \text{transposed, or}\\ A_i^H X_i = B_i & \: \text{conjugate transposed.} \end{array} \end{split}\]
Matrix \(A_i\) is defined by its triangular factors as returned by getrfBatched.
Supported precisions in rocSOLVER: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] hipblasHandle_t.
trans – [in] hipblasOperation_t. Specifies the form of the system of equations of each instance in the batch.
n – [in] int. n >= 0. The order of the system, that is, the number of columns and rows of all A_i matrices.
nrhs – [in] int. nrhs >= 0. The number of right hand sides, that is, the number of columns of all the matrices B_i.
A – [in] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. The factors L_i and U_i of the factorization A_i = P_i*L_i*U_i returned by getrfBatched.
lda – [in] int. lda >= n. The leading dimension of matrices A_i.
ipiv – [in] pointer to int. Array on the GPU. Contains the vectors ipiv_i of pivot indices returned by getrfBatched.
B – [inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*nrhs.
On entry, the right hand side matrices B_i.
On exit, the solution matrix X_i of each system in the batch.
ldb – [in] int. ldb >= n. The leading dimension of matrices B_i.
info – [out] pointer to an int on the host.
If info = 0, successful exit.
If info = j < 0, the argument at position -j is invalid.
batchCount – [in] int. batchCount >= 0. Number of instances (systems) in the batch.
-
hipblasStatus_t hipblasSgetrsStridedBatched(hipblasHandle_t handle, const hipblasOperation_t trans, const int n, const int nrhs, float *A, const int lda, const hipblasStride strideA, const int *ipiv, const hipblasStride strideP, float *B, const int ldb, const hipblasStride strideB, int *info, const int batchCount)#
-
hipblasStatus_t hipblasDgetrsStridedBatched(hipblasHandle_t handle, const hipblasOperation_t trans, const int n, const int nrhs, double *A, const int lda, const hipblasStride strideA, const int *ipiv, const hipblasStride strideP, double *B, const int ldb, const hipblasStride strideB, int *info, const int batchCount)#
-
hipblasStatus_t hipblasCgetrsStridedBatched(hipblasHandle_t handle, const hipblasOperation_t trans, const int n, const int nrhs, hipComplex *A, const int lda, const hipblasStride strideA, const int *ipiv, const hipblasStride strideP, hipComplex *B, const int ldb, const hipblasStride strideB, int *info, const int batchCount)#
-
hipblasStatus_t hipblasZgetrsStridedBatched(hipblasHandle_t handle, const hipblasOperation_t trans, const int n, const int nrhs, hipDoubleComplex *A, const int lda, const hipblasStride strideA, const int *ipiv, const hipblasStride strideP, hipDoubleComplex *B, const int ldb, const hipblasStride strideB, int *info, const int batchCount)#
SOLVER API
The getrsStridedBatched functions solve a batch of systems of n linear equations in n variables in their factorized forms.
For each instance i in the batch, they solve one of the following systems, depending on the value of trans:
\[\begin{split} \begin{array}{cl} A_i X_i = B_i & \: \text{not transposed,}\\ A_i^T X_i = B_i & \: \text{transposed, or}\\ A_i^H X_i = B_i & \: \text{conjugate transposed.} \end{array} \end{split}\]
Matrix \(A_i\) is defined by its triangular factors as returned by getrfStridedBatched.
Supported precisions in rocSOLVER: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] hipblasHandle_t.
trans – [in] hipblasOperation_t. Specifies the form of the system of equations of each instance in the batch.
n – [in] int. n >= 0. The order of the system, that is, the number of columns and rows of all A_i matrices.
nrhs – [in] int. nrhs >= 0. The number of right hand sides, that is, the number of columns of all the matrices B_i.
A – [in] pointer to type. Array on the GPU (the size depends on the value of strideA). The factors L_i and U_i of the factorization A_i = P_i*L_i*U_i returned by getrfStridedBatched.
lda – [in] int. lda >= n. The leading dimension of matrices A_i.
strideA – [in] hipblasStride. Stride from the start of one matrix A_i to the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [in] pointer to int. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_i of pivot indices returned by getrfStridedBatched.
strideP – [in] hipblasStride. Stride from the start of one vector ipiv_i to the next one ipiv_(i+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.
B – [inout] pointer to type. Array on the GPU (size depends on the value of strideB).
On entry, the right hand side matrices B_i.
On exit, the solution matrix X_i of each system in the batch.
ldb – [in] int. ldb >= n. The leading dimension of matrices B_i.
strideB – [in] hipblasStride. Stride from the start of one matrix B_i to the next one B_(i+1). There is no restriction for the value of strideB. Normal use case is strideB >= ldb*nrhs.
info – [out] pointer to an int on the host.
If info = 0, successful exit.
If info = j < 0, the argument at position -j is invalid.
batchCount – [in] int. batchCount >= 0. Number of instances (systems) in the batch.
hipblasXgetri + Batched, stridedBatched#
-
hipblasStatus_t hipblasSgetriBatched(hipblasHandle_t handle, const int n, float *const A[], const int lda, int *ipiv, float *const C[], const int ldc, int *info, const int batchCount)#
-
hipblasStatus_t hipblasDgetriBatched(hipblasHandle_t handle, const int n, double *const A[], const int lda, int *ipiv, double *const C[], const int ldc, int *info, const int batchCount)#
-
hipblasStatus_t hipblasCgetriBatched(hipblasHandle_t handle, const int n, hipComplex *const A[], const int lda, int *ipiv, hipComplex *const C[], const int ldc, int *info, const int batchCount)#
-
hipblasStatus_t hipblasZgetriBatched(hipblasHandle_t handle, const int n, hipDoubleComplex *const A[], const int lda, int *ipiv, hipDoubleComplex *const C[], const int ldc, int *info, const int batchCount)#
SOLVER API
The getriBatched functions compute the inverse \(C_i = A_i^{-1}\) of a batch of general n-by-n matrices \(A_i\).
The inverse is computed by solving the linear system
\[ A_i C_i = I \]
where I is the identity matrix and \(A_i\) is factorized as \(A_i = P_i L_i U_i\), as given by getrfBatched.
Supported precisions in rocSOLVER: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] hipblasHandle_t.
n – [in] int. n >= 0. The number of rows and columns of all matrices A_i in the batch.
A – [in] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. The factors L_i and U_i of the factorization A_i = P_i*L_i*U_i returned by getrfBatched.
lda – [in] int. lda >= n. Specifies the leading dimension of matrices A_i.
ipiv – [in] pointer to int. Array on the GPU. Contains the vectors ipiv_i of pivot indices returned by getrfBatched. ipiv can be passed in as a nullptr, in which case getrfBatched is assumed to have been called without partial pivoting.
C – [out] array of pointers to type. Each pointer points to an array on the GPU of dimension ldc*n. If info[i] = 0, the inverse of matrices A_i. Otherwise, undefined.
ldc – [in] int. ldc >= n. Specifies the leading dimension of C_i.
info – [out] pointer to int. Array of batchCount integers on the GPU.
If info[i] = 0, successful exit for inversion of A_i.
If info[i] = j > 0, U_i is singular. U_i[j,j] is the first zero pivot.
batchCount – [in] int. batchCount >= 0. Number of matrices in the batch.
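Because info is an array on the GPU, it has to be copied back to the host (for example with hipMemcpy) before it can be inspected. The following host-only sketch shows one way to scan such a copy. The function name `firstSingularMatrix` is hypothetical and is not part of hipBLAS.

```cpp
#include <cstddef>
#include <vector>

// Given a host copy of the info array (batchCount entries), return the index
// of the first matrix whose inversion failed, or -1 if every entry is zero.
int firstSingularMatrix(const std::vector<int>& hostInfo)
{
    for (std::size_t i = 0; i < hostInfo.size(); ++i)
        if (hostInfo[i] > 0) // hostInfo[i] = j means U_i[j,j] is the first zero pivot
            return static_cast<int>(i);
    return -1;
}
```

A result of -1 means every C_i holds a valid inverse. Otherwise, the contents of C_i for the reported index are undefined.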
hipblasXgeqrf + Batched, stridedBatched#
-
hipblasStatus_t hipblasSgeqrf(hipblasHandle_t handle, const int m, const int n, float *A, const int lda, float *ipiv, int *info)#
-
hipblasStatus_t hipblasDgeqrf(hipblasHandle_t handle, const int m, const int n, double *A, const int lda, double *ipiv, int *info)#
-
hipblasStatus_t hipblasCgeqrf(hipblasHandle_t handle, const int m, const int n, hipComplex *A, const int lda, hipComplex *ipiv, int *info)#
-
hipblasStatus_t hipblasZgeqrf(hipblasHandle_t handle, const int m, const int n, hipDoubleComplex *A, const int lda, hipDoubleComplex *ipiv, int *info)#
SOLVER API
The geqrf functions compute a QR factorization of a general m-by-n matrix A. The factorization has the form:
\[\begin{split} A = Q\left[\begin{array}{c} R\\ 0 \end{array}\right] \end{split}\]
where R is upper triangular (upper trapezoidal if m < n), and Q is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices:
\[ Q = H_1H_2\cdots H_k, \quad \text{with} \: k = \text{min}(m,n) \]
Each Householder matrix \(H_i\) is given by:
\[ H_i = I - \text{ipiv}[i] \cdot v_i v_i' \]
where the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
Supported precisions in rocSOLVER: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] hipblasHandle_t.
m – [in] int. m >= 0. The number of rows of the matrix A.
n – [in] int. n >= 0. The number of columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n.
On entry, the m-by-n matrix to be factored.
On exit, the elements on and above the diagonal contain the factor R. The elements below the diagonal are the last m - i elements of Householder vector v_i.
lda – [in] int. lda >= m. Specifies the leading dimension of A.
ipiv – [out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars.
info – [out] pointer to an int on the host.
If info = 0, successful exit.
If info = j < 0, the argument at position -j is invalid.
-
hipblasStatus_t hipblasSgeqrfBatched(hipblasHandle_t handle, const int m, const int n, float *const A[], const int lda, float *const ipiv[], int *info, const int batchCount)#
-
hipblasStatus_t hipblasDgeqrfBatched(hipblasHandle_t handle, const int m, const int n, double *const A[], const int lda, double *const ipiv[], int *info, const int batchCount)#
-
hipblasStatus_t hipblasCgeqrfBatched(hipblasHandle_t handle, const int m, const int n, hipComplex *const A[], const int lda, hipComplex *const ipiv[], int *info, const int batchCount)#
-
hipblasStatus_t hipblasZgeqrfBatched(hipblasHandle_t handle, const int m, const int n, hipDoubleComplex *const A[], const int lda, hipDoubleComplex *const ipiv[], int *info, const int batchCount)#
SOLVER API
The geqrfBatched functions compute the QR factorization of a batch of general m-by-n matrices.
The factorization of matrix \(A_i\) in the batch has the form:
\[\begin{split} A_i = Q_i\left[\begin{array}{c} R_i\\ 0 \end{array}\right] \end{split}\]
where \(R_i\) is upper triangular (upper trapezoidal if m < n), and \(Q_i\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices:
\[ Q_i = H_{i_1}H_{i_2}\cdots H_{i_k}, \quad \text{with} \: k = \text{min}(m,n) \]
Each Householder matrix \(H_{i_j}\) is given by:
\[ H_{i_j} = I - \text{ipiv}_i[j] \cdot v_{i_j} v_{i_j}' \]
where the first j-1 elements of Householder vector \(v_{i_j}\) are zero, and \(v_{i_j}[j] = 1\).
Supported precisions in rocSOLVER: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
- Parameters:
handle – [in] hipblasHandle_t.
m – [in] int. m >= 0. The number of rows of all the matrices A_i in the batch.
n – [in] int. n >= 0. The number of columns of all the matrices A_i in the batch.
A – [inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the m-by-n matrices A_i to be factored.
On exit, the elements on and above the diagonal contain the factor R_i. The elements below the diagonal are the last m - j elements of Householder vector v_(i_j).
lda – [in] int. lda >= m. Specifies the leading dimension of matrices A_i.
ipiv – [out] array of pointers to type. Each pointer points to an array on the GPU of dimension min(m, n). Contains the vectors ipiv_i of corresponding Householder scalars.
info – [out] pointer to an int on the host.
If info = 0, successful exit.
If info = j < 0, the argument at position -j is invalid.
batchCount – [in] int. batchCount >= 0. Number of matrices in the batch.
-
hipblasStatus_t hipblasSgeqrfStridedBatched(hipblasHandle_t handle, const int m, const int n, float *A, const int lda, const hipblasStride strideA, float *ipiv, const hipblasStride strideP, int *info, const int batchCount)#
-
hipblasStatus_t hipblasDgeqrfStridedBatched(hipblasHandle_t handle, const int m, const int n, double *A, const int lda, const hipblasStride strideA, double *ipiv, const hipblasStride strideP, int *info, const int batchCount)#
-
hipblasStatus_t hipblasCgeqrfStridedBatched(hipblasHandle_t handle, const int m, const int n, hipComplex *A, const int lda, const hipblasStride strideA, hipComplex *ipiv, const hipblasStride strideP, int *info, const int batchCount)#
-
hipblasStatus_t hipblasZgeqrfStridedBatched(hipblasHandle_t handle, const int m, const int n, hipDoubleComplex *A, const int lda, const hipblasStride strideA, hipDoubleComplex *ipiv, const hipblasStride strideP, int *info, const int batchCount)#
SOLVER API
The geqrfStridedBatched functions compute the QR factorization of a batch of general m-by-n matrices.
The factorization of matrix \(A_i\) in the batch has the form:
\[\begin{split} A_i = Q_i\left[\begin{array}{c} R_i\\ 0 \end{array}\right] \end{split}\]
where \(R_i\) is upper triangular (upper trapezoidal if m < n), and \(Q_i\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices:
\[ Q_i = H_{i_1}H_{i_2}\cdots H_{i_k}, \quad \text{with} \: k = \text{min}(m,n) \]
Each Householder matrix \(H_{i_j}\) is given by:
\[ H_{i_j} = I - \text{ipiv}_i[j] \cdot v_{i_j} v_{i_j}' \]
where the first j-1 elements of Householder vector \(v_{i_j}\) are zero, and \(v_{i_j}[j] = 1\).
Supported precisions in rocSOLVER: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] hipblasHandle_t.
m – [in] int. m >= 0. The number of rows of all the matrices A_i in the batch.
n – [in] int. n >= 0. The number of columns of all the matrices A_i in the batch.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the m-by-n matrices A_i to be factored.
On exit, the elements on and above the diagonal contain the factor R_i. The elements below the diagonal are the last m - j elements of Householder vector v_(i_j).
lda – [in] int. lda >= m. Specifies the leading dimension of matrices A_i.
strideA – [in] hipblasStride. Stride from the start of one matrix A_i to the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_i of corresponding Householder scalars.
strideP – [in] hipblasStride. Stride from the start of one vector ipiv_i to the next one ipiv_(i+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
info – [out] pointer to an int on the host.
If info = 0, successful exit.
If info = j < 0, the argument at position -j is invalid.
batchCount – [in] int. batchCount >= 0. Number of matrices in the batch.
hipblasXgels + Batched, StridedBatched#
-
hipblasStatus_t hipblasSgels(hipblasHandle_t handle, hipblasOperation_t trans, const int m, const int n, const int nrhs, float *A, const int lda, float *B, const int ldb, int *info, int *deviceInfo)#
-
hipblasStatus_t hipblasDgels(hipblasHandle_t handle, hipblasOperation_t trans, const int m, const int n, const int nrhs, double *A, const int lda, double *B, const int ldb, int *info, int *deviceInfo)#
-
hipblasStatus_t hipblasCgels(hipblasHandle_t handle, hipblasOperation_t trans, const int m, const int n, const int nrhs, hipComplex *A, const int lda, hipComplex *B, const int ldb, int *info, int *deviceInfo)#
-
hipblasStatus_t hipblasZgels(hipblasHandle_t handle, hipblasOperation_t trans, const int m, const int n, const int nrhs, hipDoubleComplex *A, const int lda, hipDoubleComplex *B, const int ldb, int *info, int *deviceInfo)#
SOLVER API
The gels functions solve an overdetermined (or underdetermined) linear system defined by an m-by-n matrix A and a corresponding matrix B, using the QR factorization computed by GEQRF (or the LQ factorization computed by GELQF).
Depending on the value of trans, the problem solved by this function is either of the form:
\[\begin{split} \begin{array}{cl} A X = B & \: \text{not transposed, or}\\ A' X = B & \: \text{transposed if real, or conjugate transposed if complex} \end{array} \end{split}\]
If m >= n (or m < n in the case of transpose/conjugate transpose), the system is overdetermined and a least-squares solution approximating X is found by minimizing:
\[ || B - A X || \quad \text{(or} \: || B - A' X ||\text{)} \]
If m < n (or m >= n in the case of transpose/conjugate transpose), the system is underdetermined and a unique solution for X is chosen such that \(|| X ||\) is minimal.
Supported precisions in rocSOLVER: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] hipblasHandle_t.
trans – [in] hipblasOperation_t. Specifies the form of the system of equations.
m – [in] int. m >= 0. The number of rows of matrix A.
n – [in] int. n >= 0. The number of columns of matrix A.
nrhs – [in] int. nrhs >= 0. The number of columns of matrices B and X, that is, the columns on the right hand side.
A – [inout] pointer to type. Array on the GPU of dimension lda*n.
On entry, the matrix A.
On exit, the QR (or LQ) factorization of A as returned by “GEQRF” (or “GELQF”).
lda – [in] int. lda >= m. Specifies the leading dimension of matrix A.
B – [inout] pointer to type. Array on the GPU of dimension ldb*nrhs.
On entry, the matrix B.
On exit, when info = 0, B is overwritten by the solution vectors (and the residuals in the overdetermined cases) stored as columns.
ldb – [in] int. ldb >= max(m,n). Specifies the leading dimension of matrix B.
info – [out] pointer to an int on the host.
If info = 0, successful exit.
If info = j < 0, the argument at position -j is invalid.
deviceInfo – [out] pointer to int on the GPU.
If deviceInfo = 0, successful exit.
If deviceInfo = i > 0, the solution could not be computed because input matrix A is rank deficient; the i-th diagonal element of its triangular factor is zero.
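For the overdetermined, non-transposed case (m >= n) with a full-rank A, the way a QR factorization yields the least-squares solution can be written out as follows. This is the standard textbook derivation, given as a sketch with the Frobenius norm assumed. The symbols \(C_1\) and \(C_2\) are introduced here for illustration, and R denotes the leading n-by-n upper triangular block. It does not describe the internal steps hipBLAS performs.

\[\begin{split} \begin{array}{l} A = Q\left[\begin{array}{c} R\\ 0 \end{array}\right], \quad Q' B = \left[\begin{array}{c} C_1\\ C_2 \end{array}\right] \: \text{with} \: C_1 \: \text{the first} \: n \: \text{rows,}\\ || B - A X ||^2 = || Q' B - Q' A X ||^2 = || C_1 - R X ||^2 + || C_2 ||^2,\\ \text{which is minimized by solving the triangular system} \: R X = C_1. \end{array} \end{split}\]

The term \(|| C_2 ||\) does not depend on X and is the norm of the residual. A zero on the diagonal of R makes the triangular system singular, which corresponds to the rank-deficient case reported through deviceInfo.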
-
hipblasStatus_t hipblasSgelsBatched(hipblasHandle_t handle, hipblasOperation_t trans, const int m, const int n, const int nrhs, float *const A[], const int lda, float *const B[], const int ldb, int *info, int *deviceInfo, const int batchCount)#
-
hipblasStatus_t hipblasDgelsBatched(hipblasHandle_t handle, hipblasOperation_t trans, const int m, const int n, const int nrhs, double *const A[], const int lda, double *const B[], const int ldb, int *info, int *deviceInfo, const int batchCount)#
-
hipblasStatus_t hipblasCgelsBatched(hipblasHandle_t handle, hipblasOperation_t trans, const int m, const int n, const int nrhs, hipComplex *const A[], const int lda, hipComplex *const B[], const int ldb, int *info, int *deviceInfo, const int batchCount)#
-
hipblasStatus_t hipblasZgelsBatched(hipblasHandle_t handle, hipblasOperation_t trans, const int m, const int n, const int nrhs, hipDoubleComplex *const A[], const int lda, hipDoubleComplex *const B[], const int ldb, int *info, int *deviceInfo, const int batchCount)#
SOLVER API
The gelsBatched functions solve a batch of overdetermined (or underdetermined) linear systems defined by a set of m-by-n matrices \(A_j\) and corresponding matrices \(B_j\), using the QR factorizations computed by GEQRF_BATCHED (or the LQ factorizations computed by GELQF_BATCHED).
For each instance in the batch, depending on the value of trans, the problem solved by this function is either of the form:
\[\begin{split} \begin{array}{cl} A_j X_j = B_j & \: \text{not transposed, or}\\ A_j' X_j = B_j & \: \text{transposed if real, or conjugate transposed if complex} \end{array} \end{split}\]
If m >= n (or m < n in the case of transpose/conjugate transpose), the system is overdetermined and a least-squares solution approximating X_j is found by minimizing:
\[ || B_j - A_j X_j || \quad \text{(or} \: || B_j - A_j' X_j ||\text{)} \]
If m < n (or m >= n in the case of transpose/conjugate transpose), the system is underdetermined and a unique solution for X_j is chosen such that \(|| X_j ||\) is minimal.
Supported precisions in rocSOLVER: s, d, c, and z.
Supported precisions in cuBLAS: s, d, c, and z.
Note that the cuBLAS backend supports only the non-transpose operation and only solves over-determined systems (m >= n).
- Parameters:
handle – [in] hipblasHandle_t.
trans – [in] hipblasOperation_t. Specifies the form of the system of equations.
m – [in] int. m >= 0. The number of rows of all matrices A_j in the batch.
n – [in] int. n >= 0. The number of columns of all matrices A_j in the batch.
nrhs – [in] int. nrhs >= 0. The number of columns of all matrices B_j and X_j in the batch, that is, the columns on the right hand side.
A – [inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the matrices A_j.
On exit, the QR (or LQ) factorizations of A_j as returned by “GEQRF_BATCHED” (or “GELQF_BATCHED”).
lda – [in] int. lda >= m. Specifies the leading dimension of matrices A_j.
B – [inout] array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*nrhs.
On entry, the matrices B_j.
On exit, when deviceInfo[j] = 0, B_j is overwritten by the solution vectors (and the residuals in the overdetermined cases) stored as columns.
ldb – [in] int. ldb >= max(m,n). Specifies the leading dimension of matrices B_j.
info – [out] pointer to an int on the host.
If info = 0, successful exit.
If info = j < 0, the argument at position -j is invalid.
deviceInfo – [out] pointer to int. Array of batchCount integers on the GPU.
If deviceInfo[j] = 0, successful exit for solution of A_j.
If deviceInfo[j] = i > 0, the solution of A_j could not be computed because input matrix A_j is rank deficient; the i-th diagonal element of its triangular factor is zero.
batchCount – [in] int. batchCount >= 0. Number of matrices in the batch.
-
hipblasStatus_t hipblasSgelsStridedBatched(hipblasHandle_t handle, hipblasOperation_t trans, const int m, const int n, const int nrhs, float *A, const int lda, const hipblasStride strideA, float *B, const int ldb, const hipblasStride strideB, int *info, int *deviceInfo, const int batchCount)#
-
hipblasStatus_t hipblasDgelsStridedBatched(hipblasHandle_t handle, hipblasOperation_t trans, const int m, const int n, const int nrhs, double *A, const int lda, const hipblasStride strideA, double *B, const int ldb, const hipblasStride strideB, int *info, int *deviceInfo, const int batchCount)#
-
hipblasStatus_t hipblasCgelsStridedBatched(hipblasHandle_t handle, hipblasOperation_t trans, const int m, const int n, const int nrhs, hipComplex *A, const int lda, const hipblasStride strideA, hipComplex *B, const int ldb, const hipblasStride strideB, int *info, int *deviceInfo, const int batchCount)#
-
hipblasStatus_t hipblasZgelsStridedBatched(hipblasHandle_t handle, hipblasOperation_t trans, const int m, const int n, const int nrhs, hipDoubleComplex *A, const int lda, const hipblasStride strideA, hipDoubleComplex *B, const int ldb, const hipblasStride strideB, int *info, int *deviceInfo, const int batchCount)#
SOLVER API
The gelsStridedBatched functions solve a batch of overdetermined (or underdetermined) linear systems defined by a set of m-by-n matrices \(A_j\) and corresponding matrices \(B_j\), using the QR factorizations computed by GEQRF_STRIDED_BATCHED (or the LQ factorizations computed by GELQF_STRIDED_BATCHED).
For each instance in the batch, depending on the value of trans, the problem solved by this function is either of the form:
\[\begin{split} \begin{array}{cl} A_j X_j = B_j & \: \text{not transposed, or}\\ A_j' X_j = B_j & \: \text{transposed if real, or conjugate transposed if complex} \end{array} \end{split}\]
If m >= n (or m < n in the case of transpose/conjugate transpose), the system is overdetermined and a least-squares solution approximating X_j is found by minimizing:
\[ || B_j - A_j X_j || \quad \text{(or} \: || B_j - A_j' X_j ||\text{)} \]
If m < n (or m >= n in the case of transpose/conjugate transpose), the system is underdetermined and a unique solution for X_j is chosen such that \(|| X_j ||\) is minimal.
Supported precisions in rocSOLVER: s, d, c, and z.
Supported precisions in cuBLAS: No support.
- Parameters:
handle – [in] hipblasHandle_t.
trans – [in] hipblasOperation_t. Specifies the form of the system of equations.
m – [in] int. m >= 0. The number of rows of all matrices A_j in the batch.
n – [in] int. n >= 0. The number of columns of all matrices A_j in the batch.
nrhs – [in] int. nrhs >= 0. The number of columns of all matrices B_j and X_j in the batch, that is, the columns on the right hand side.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the matrices A_j.
On exit, the QR (or LQ) factorizations of A_j as returned by “GEQRF_STRIDED_BATCHED” (or “GELQF_STRIDED_BATCHED”).
lda – [in] int. lda >= m. Specifies the leading dimension of matrices A_j.
strideA – [in] hipblasStride. Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
B – [inout] pointer to type. Array on the GPU (the size depends on the value of strideB).
On entry, the matrices B_j.
On exit, when deviceInfo[j] = 0, each B_j is overwritten by the solution vectors (and the residuals in the overdetermined cases) stored as columns.
ldb – [in] int. ldb >= max(m,n). Specifies the leading dimension of matrices B_j.
strideB – [in] hipblasStride. Stride from the start of one matrix B_j to the next one B_(j+1). There is no restriction for the value of strideB. Normal use case is strideB >= ldb*nrhs.
info – [out] pointer to an int on the host.
If info = 0, successful exit.
If info = j < 0, the argument at position -j is invalid.
deviceInfo – [out] pointer to int. Array of batchCount integers on the GPU.
If deviceInfo[j] = 0, successful exit for solution of A_j.
If deviceInfo[j] = i > 0, the solution of A_j could not be computed because input matrix A_j is rank deficient; the i-th diagonal element of its triangular factor is zero.
batchCount – [in] int. batchCount >= 0. Number of matrices in the batch.
Auxiliary#
hipblasCreate#
-
hipblasStatus_t hipblasCreate(hipblasHandle_t *handle)#
Create the hipBLAS handle.
hipblasDestroy#
-
hipblasStatus_t hipblasDestroy(hipblasHandle_t handle)#
Destroys the library context created using hipblasCreate().
hipblasSetStream#
-
hipblasStatus_t hipblasSetStream(hipblasHandle_t handle, hipStream_t streamId)#
Sets the stream for the handle.
hipblasGetStream#
-
hipblasStatus_t hipblasGetStream(hipblasHandle_t handle, hipStream_t *streamId)#
Gets stream[0] for the handle.
hipblasSetPointerMode#
-
hipblasStatus_t hipblasSetPointerMode(hipblasHandle_t handle, hipblasPointerMode_t mode)#
Sets hipBLAS pointer mode.
hipblasGetPointerMode#
-
hipblasStatus_t hipblasGetPointerMode(hipblasHandle_t handle, hipblasPointerMode_t *mode)#
Gets hipBLAS pointer mode.
hipblasSetVector#
-
hipblasStatus_t hipblasSetVector(int n, int elemSize, const void *x, int incx, void *y, int incy)#
Copy vector from host to device.
- Parameters:
n – [in] [int] number of elements in the vector.
elemSize – [in] [int] number of bytes per element in the vector.
x – [in] pointer to vector on the host.
incx – [in] [int] specifies the increment for the elements of the vector.
y – [out] pointer to vector on the device.
incy – [in] [int] specifies the increment for the elements of the vector.
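The way n, elemSize, incx, and incy interact can be modelled with a plain host-to-host copy. This sketch only illustrates the indexing and performs no device transfer. The function name `stridedVectorCopy` is hypothetical, and positive increments are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Host-only model of the strided copy: element i of x (stride incx) goes to
// element i of y (stride incy); every element is elemSize bytes wide.
void stridedVectorCopy(int n, int elemSize, const void* x, int incx, void* y, int incy)
{
    assert(n >= 0 && elemSize > 0 && incx > 0 && incy > 0);
    const unsigned char* src = static_cast<const unsigned char*>(x);
    unsigned char* dst = static_cast<unsigned char*>(y);
    for (int i = 0; i < n; ++i)
        std::memcpy(dst + static_cast<std::size_t>(i) * incy * elemSize,
                    src + static_cast<std::size_t>(i) * incx * elemSize,
                    static_cast<std::size_t>(elemSize));
}
```

With n = 3, incx = 2, and incy = 1, the source elements at positions 0, 2, and 4 land in three consecutive destination slots.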
hipblasGetVector#
-
hipblasStatus_t hipblasGetVector(int n, int elemSize, const void *x, int incx, void *y, int incy)#
Copy vector from device to host.
- Parameters:
n – [in] [int] number of elements in the vector.
elemSize – [in] [int] number of bytes per element in the vector.
x – [in] pointer to vector on the device.
incx – [in] [int] specifies the increment for the elements of the vector.
y – [out] pointer to vector on the host.
incy – [in] [int] specifies the increment for the elements of the vector.
hipblasSetMatrix#
-
hipblasStatus_t hipblasSetMatrix(int rows, int cols, int elemSize, const void *AP, int lda, void *BP, int ldb)#
Copy matrix from host to device.
- Parameters:
rows – [in] [int] number of rows in the matrix.
cols – [in] [int] number of columns in the matrix.
elemSize – [in] [int] number of bytes per element in the matrix.
AP – [in] pointer to matrix on the host.
lda – [in] [int] specifies the leading dimension of AP. lda >= rows.
BP – [out] pointer to matrix on the GPU.
ldb – [in] [int] specifies the leading dimension of BP. ldb >= rows.
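The leading dimensions describe how a rows-by-cols tile sits inside column-major storage. The following host-only model is a sketch of that layout and not the hipBLAS implementation: it performs no device transfer and assumes column-major storage with lda >= rows and ldb >= rows. The names columnMajorTileCopy and copyTopLeftTile are hypothetical and are not part of hipBLAS.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Host-only model (assumption, not the library code): column j of the source
// starts at element j * lda, column j of the destination at element j * ldb,
// and only the first `rows` elements of each column are copied.
void columnMajorTileCopy(int rows, int cols, int elemSize,
                         const void* AP, int lda, void* BP, int ldb)
{
    const unsigned char* src = static_cast<const unsigned char*>(AP);
    unsigned char*       dst = static_cast<unsigned char*>(BP);
    const std::size_t    bytes = static_cast<std::size_t>(elemSize);

    for (int j = 0; j < cols; ++j) {
        const std::size_t srcOffset = static_cast<std::size_t>(j) * lda * bytes;
        const std::size_t dstOffset = static_cast<std::size_t>(j) * ldb * bytes;
        std::memcpy(dst + dstOffset, src + srcOffset, static_cast<std::size_t>(rows) * bytes);
    }
}

// Hypothetical usage: copy the top-left 2x2 tile of a 3x2 column-major matrix
// (lda = 3) into a tightly packed 2x2 matrix (ldb = 2).
std::vector<float> copyTopLeftTile()
{
    const float A[6] = {1.f, 2.f, 3.f,   // column 0
                        4.f, 5.f, 6.f};  // column 1
    std::vector<float> B(4, 0.f);
    columnMajorTileCopy(2, 2, static_cast<int>(sizeof(float)), A, 3, B.data(), 2);
    return B;
}
```

Under this layout a destination buffer needs room for at least ldb * cols elements, which is the size to allocate on the device before calling hipblasSetMatrix.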
hipblasGetMatrix#
-
hipblasStatus_t hipblasGetMatrix(int rows, int cols, int elemSize, const void *AP, int lda, void *BP, int ldb)#
Copy matrix from device to host.
- Parameters:
rows – [in] [int] number of rows in the matrix.
cols – [in] [int] number of columns in the matrix.
elemSize – [in] [int] number of bytes per element in the matrix.
AP – [in] pointer to matrix on the GPU.
lda – [in] [int] specifies the leading dimension of AP. lda >= rows.
BP – [out] pointer to matrix on the host.
ldb – [in] [int] specifies the leading dimension of BP. ldb >= rows.
hipblasSetVectorAsync#
-
hipblasStatus_t hipblasSetVectorAsync(int n, int elemSize, const void *x, int incx, void *y, int incy, hipStream_t stream)#
Asynchronously copy vector from host to device.
hipblasSetVectorAsync copies a vector from pinned host memory to device memory asynchronously. Memory on the host must be allocated with hipHostMalloc or the transfer will be synchronous.
- Parameters:
n – [in] [int] number of elements in the vector.
elemSize – [in] [int] number of bytes per element in the vector.
x – [in] pointer to vector on the host.
incx – [in] [int] specifies the increment for the elements of the vector.
y – [out] pointer to vector on the device.
incy – [in] [int] specifies the increment for the elements of the vector.
stream – [in] specifies the stream into which this transfer request is queued.
hipblasGetVectorAsync#
-
hipblasStatus_t hipblasGetVectorAsync(int n, int elemSize, const void *x, int incx, void *y, int incy, hipStream_t stream)#
Asynchronously copy vector from device to host.
hipblasGetVectorAsync copies a vector from device memory to pinned host memory asynchronously. Memory on the host must be allocated with hipHostMalloc or the transfer will be synchronous.
- Parameters:
n – [in] [int] number of elements in the vector.
elemSize – [in] [int] number of bytes per element in the vector.
x – [in] pointer to vector on the device.
incx – [in] [int] specifies the increment for the elements of the vector.
y – [out] pointer to vector on the host.
incy – [in] [int] specifies the increment for the elements of the vector.
stream – [in] specifies the stream into which this transfer request is queued.
hipblasSetMatrixAsync#
-
hipblasStatus_t hipblasSetMatrixAsync(int rows, int cols, int elemSize, const void *AP, int lda, void *BP, int ldb, hipStream_t stream)#
Asynchronously copy matrix from host to device.
hipblasSetMatrixAsync copies a matrix from pinned host memory to device memory asynchronously. Memory on the host must be allocated with hipHostMalloc or the transfer will be synchronous.
- Parameters:
rows – [in] [int] number of rows in matrices.
cols – [in] [int] number of columns in matrices.
elemSize – [in] [int] number of bytes per element in the matrix.
AP – [in] pointer to matrix on the host.
lda – [in] [int] specifies the leading dimension of AP. lda >= rows.
BP – [out] pointer to matrix on the GPU.
ldb – [in] [int] specifies the leading dimension of BP. ldb >= rows.
stream – [in] specifies the stream into which this transfer request is queued.
hipblasGetMatrixAsync#
-
hipblasStatus_t hipblasGetMatrixAsync(int rows, int cols, int elemSize, const void *AP, int lda, void *BP, int ldb, hipStream_t stream)#
Asynchronously copy matrix from device to host.
hipblasGetMatrixAsync copies a matrix from device memory to pinned host memory asynchronously. Memory on the host must be allocated with hipHostMalloc or the transfer will be synchronous.
- Parameters:
rows – [in] [int] number of rows in matrices.
cols – [in] [int] number of columns in matrices.
elemSize – [in] [int] number of bytes per element in the matrix.
AP – [in] pointer to matrix on the GPU.
lda – [in] [int] specifies the leading dimension of AP. lda >= rows.
BP – [out] pointer to matrix on the host.
ldb – [in] [int] specifies the leading dimension of BP. ldb >= rows.
stream – [in] specifies the stream into which this transfer request is queued.
hipblasSetAtomicsMode#
-
hipblasStatus_t hipblasSetAtomicsMode(hipblasHandle_t handle, hipblasAtomicsMode_t atomics_mode)#
Sets the atomics mode for the handle.
hipblasGetAtomicsMode#
-
hipblasStatus_t hipblasGetAtomicsMode(hipblasHandle_t handle, hipblasAtomicsMode_t *atomics_mode)#
Gets the atomics mode of the handle.
hipblasStatusToString#
-
const char *hipblasStatusToString(hipblasStatus_t status)#
Returns a string representing the hipblasStatus_t value.
- Parameters:
status – [in] [hipblasStatus_t] hipBLAS status to convert to string.