rocBLAS Level-1 Functions#
rocBLAS Level-1 functions perform scalar, vector, and vector-vector operations. [Level1]
Level-1 functions support the ILP64 API. For more information on these _64 functions, refer to section ILP64 Interface.
rocblas_iXamax + batched, strided_batched#
- 
rocblas_status rocblas_isamax(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, rocblas_int *result)#
- 
rocblas_status rocblas_idamax(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, rocblas_int *result)#
- 
rocblas_status rocblas_icamax(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_int *result)#
- 
rocblas_status rocblas_izamax(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_int *result)#
- BLAS Level 1 API - amax finds the first index of the element of maximum magnitude of a vector x. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] the number of elements in x. 
- x – [in] device pointer storing vector x. 
- incx – [in] [rocblas_int] specifies the increment for the elements of y. 
- result – [inout] device pointer or host pointer to store the amax index. return is 0.0 if n, incx<=0. 
 
 
The amax functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_isamax_batched(rocblas_handle handle, rocblas_int n, const float *const x[], rocblas_int incx, rocblas_int batch_count, rocblas_int *result)#
- 
rocblas_status rocblas_idamax_batched(rocblas_handle handle, rocblas_int n, const double *const x[], rocblas_int incx, rocblas_int batch_count, rocblas_int *result)#
- 
rocblas_status rocblas_icamax_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *const x[], rocblas_int incx, rocblas_int batch_count, rocblas_int *result)#
- 
rocblas_status rocblas_izamax_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *const x[], rocblas_int incx, rocblas_int batch_count, rocblas_int *result)#
- BLAS Level 1 API - amax_batched finds the first index of the element of maximum magnitude of each vector x_i in a batch, for i = 1, …, batch_count. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] number of elements in each vector x_i. 
- x – [in] device array of device pointers storing each vector x_i. 
- incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. incx must be > 0. 
- batch_count – [in] [rocblas_int] number of instances in the batch. Must be > 0. 
- result – [out] device or host array of pointers of batch_count size for results. return is 0 if n, incx<=0. 
 
 
The amax_batched functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_isamax_strided_batched(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, rocblas_int *result)#
- 
rocblas_status rocblas_idamax_strided_batched(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, rocblas_int *result)#
- 
rocblas_status rocblas_icamax_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, rocblas_int *result)#
- 
rocblas_status rocblas_izamax_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, rocblas_int *result)#
- BLAS Level 1 API - amax_strided_batched finds the first index of the element of maximum magnitude of each vector x_i in a batch, for i = 1, …, batch_count. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] number of elements in each vector x_i. 
- x – [in] device pointer to the first vector x_1. 
- incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. incx must be > 0. 
- stridex – [in] [rocblas_stride] specifies the pointer increment between one x_i and the next x_(i + 1). 
- batch_count – [in] [rocblas_int] number of instances in the batch. 
- result – [out] device or host pointer for storing contiguous batch_count results. return is 0 if n <= 0, incx<=0. 
 
 
The amax_strided_batched functions support the _64 interface. Refer to section ILP64 Interface.
rocblas_iXamin + batched, strided_batched#
- 
rocblas_status rocblas_isamin(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, rocblas_int *result)#
- 
rocblas_status rocblas_idamin(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, rocblas_int *result)#
- 
rocblas_status rocblas_icamin(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_int *result)#
- 
rocblas_status rocblas_izamin(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_int *result)#
- BLAS Level 1 API - amin finds the first index of the element of minimum magnitude of a vector x. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] the number of elements in x. 
- x – [in] device pointer storing vector x. 
- incx – [in] [rocblas_int] specifies the increment for the elements of y. 
- result – [inout] device pointer or host pointer to store the amin index. return is 0.0 if n, incx<=0. 
 
 
The amin functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_isamin_batched(rocblas_handle handle, rocblas_int n, const float *const x[], rocblas_int incx, rocblas_int batch_count, rocblas_int *result)#
- 
rocblas_status rocblas_idamin_batched(rocblas_handle handle, rocblas_int n, const double *const x[], rocblas_int incx, rocblas_int batch_count, rocblas_int *result)#
- 
rocblas_status rocblas_icamin_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *const x[], rocblas_int incx, rocblas_int batch_count, rocblas_int *result)#
- 
rocblas_status rocblas_izamin_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *const x[], rocblas_int incx, rocblas_int batch_count, rocblas_int *result)#
- BLAS Level 1 API - amin_batched finds the first index of the element of minimum magnitude of each vector x_i in a batch, for i = 1, …, batch_count. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] number of elements in each vector x_i. 
- x – [in] device array of device pointers storing each vector x_i. 
- incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. incx must be > 0. 
- batch_count – [in] [rocblas_int] number of instances in the batch. Must be > 0. 
- result – [out] device or host pointers to array of batch_count size for results. return is 0 if n, incx<=0. 
 
 
The amin_batched functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_isamin_strided_batched(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, rocblas_int *result)#
- 
rocblas_status rocblas_idamin_strided_batched(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, rocblas_int *result)#
- 
rocblas_status rocblas_icamin_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, rocblas_int *result)#
- 
rocblas_status rocblas_izamin_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, rocblas_int *result)#
- BLAS Level 1 API - amin_strided_batched finds the first index of the element of minimum magnitude of each vector x_i in a batch, for i = 1, …, batch_count. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] number of elements in each vector x_i. 
- x – [in] device pointer to the first vector x_1. 
- incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. incx must be > 0. 
- stridex – [in] [rocblas_stride] specifies the pointer increment between one x_i and the next x_(i + 1). 
- batch_count – [in] [rocblas_int] number of instances in the batch. 
- result – [out] device or host pointer to array for storing contiguous batch_count results. return is 0 if n <= 0, incx<=0. 
 
 
The amin_strided_batched functions support the _64 interface. Refer to section ILP64 Interface.
rocblas_Xasum + batched, strided_batched#
- 
rocblas_status rocblas_sasum(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, float *result)#
- 
rocblas_status rocblas_dasum(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, double *result)#
- 
rocblas_status rocblas_scasum(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, float *result)#
- 
rocblas_status rocblas_dzasum(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, double *result)#
- BLAS Level 1 API - asum computes the sum of the magnitudes of elements of a real vector x, or the sum of magnitudes of the real and imaginary parts of elements if x is a complex vector. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] the number of elements in x and y. 
- x – [in] device pointer storing vector x. 
- incx – [in] [rocblas_int] specifies the increment for the elements of x. incx must be > 0. 
- result – [inout] device pointer or host pointer to store the asum product. return is 0.0 if n <= 0. 
 
 
The asum functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_sasum_batched(rocblas_handle handle, rocblas_int n, const float *const x[], rocblas_int incx, rocblas_int batch_count, float *results)#
- 
rocblas_status rocblas_dasum_batched(rocblas_handle handle, rocblas_int n, const double *const x[], rocblas_int incx, rocblas_int batch_count, double *results)#
- 
rocblas_status rocblas_scasum_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *const x[], rocblas_int incx, rocblas_int batch_count, float *results)#
- 
rocblas_status rocblas_dzasum_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *const x[], rocblas_int incx, rocblas_int batch_count, double *results)#
- BLAS Level 1 API - asum_batched computes the sum of the magnitudes of the elements in a batch of real vectors x_i, or the sum of magnitudes of the real and imaginary parts of elements if x_i is a complex vector, for i = 1, …, batch_count. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] number of elements in each vector x_i. 
- x – [in] device array of device pointers storing each vector x_i. 
- incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. incx must be > 0. 
- batch_count – [in] [rocblas_int] number of instances in the batch. 
- results – [out] device array or host array of batch_count size for results. return is 0.0 if n, incx<=0. 
 
 
The asum_batched functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_sasum_strided_batched(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, float *results)#
- 
rocblas_status rocblas_dasum_strided_batched(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, double *results)#
- 
rocblas_status rocblas_scasum_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, float *results)#
- 
rocblas_status rocblas_dzasum_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, double *results)#
- BLAS Level 1 API - asum_strided_batched computes the sum of the magnitudes of elements of a real vectors x_i, or the sum of magnitudes of the real and imaginary parts of elements if x_i is a complex vector, for i = 1, …, batch_count. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] number of elements in each vector x_i. 
- x – [in] device pointer to the first vector x_1. 
- incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. incx must be > 0. 
- stridex – [in] [rocblas_stride] stride from the start of one vector (x_i) and the next one (x_i+1). There are no restrictions placed on stride_x. However, ensure that stride_x is of appropriate size. For a typical case this means stride_x >= n * incx. 
- results – [out] device pointer or host pointer to array for storing contiguous batch_count results. return is 0.0 if n, incx<=0. 
- batch_count – [in] [rocblas_int] number of instances in the batch. 
 
 
The asum_strided_batched functions support the _64 interface. Refer to section ILP64 Interface.
rocblas_Xaxpy + batched, strided_batched#
- 
rocblas_status rocblas_saxpy(rocblas_handle handle, rocblas_int n, const float *alpha, const float *x, rocblas_int incx, float *y, rocblas_int incy)#
- 
rocblas_status rocblas_daxpy(rocblas_handle handle, rocblas_int n, const double *alpha, const double *x, rocblas_int incx, double *y, rocblas_int incy)#
- 
rocblas_status rocblas_haxpy(rocblas_handle handle, rocblas_int n, const rocblas_half *alpha, const rocblas_half *x, rocblas_int incx, rocblas_half *y, rocblas_int incy)#
- 
rocblas_status rocblas_caxpy(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *x, rocblas_int incx, rocblas_float_complex *y, rocblas_int incy)#
- 
rocblas_status rocblas_zaxpy(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *x, rocblas_int incx, rocblas_double_complex *y, rocblas_int incy)#
- BLAS Level 1 API - axpy computes constant alpha multiplied by vector x, plus vector y: - y := alpha * x + y - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] the number of elements in x and y. 
- alpha – [in] device pointer or host pointer to specify the scalar alpha. 
- x – [in] device pointer storing vector x. 
- incx – [in] [rocblas_int] specifies the increment for the elements of x. 
- y – [out] device pointer storing vector y. 
- incy – [inout] [rocblas_int] specifies the increment for the elements of y. 
 
 
The axpy functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_saxpy_batched(rocblas_handle handle, rocblas_int n, const float *alpha, const float *const x[], rocblas_int incx, float *const y[], rocblas_int incy, rocblas_int batch_count)#
- 
rocblas_status rocblas_daxpy_batched(rocblas_handle handle, rocblas_int n, const double *alpha, const double *const x[], rocblas_int incx, double *const y[], rocblas_int incy, rocblas_int batch_count)#
- 
rocblas_status rocblas_haxpy_batched(rocblas_handle handle, rocblas_int n, const rocblas_half *alpha, const rocblas_half *const x[], rocblas_int incx, rocblas_half *const y[], rocblas_int incy, rocblas_int batch_count)#
- 
rocblas_status rocblas_caxpy_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *const x[], rocblas_int incx, rocblas_float_complex *const y[], rocblas_int incy, rocblas_int batch_count)#
- 
rocblas_status rocblas_zaxpy_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *const x[], rocblas_int incx, rocblas_double_complex *const y[], rocblas_int incy, rocblas_int batch_count)#
- BLAS Level 1 API - axpy_batched compute y := alpha * x + y over a set of batched vectors. - Parameters:
- handle – [in] rocblas_handle handle to the rocblas library context queue. 
- n – [in] rocblas_int 
- alpha – [in] specifies the scalar alpha. 
- x – [in] pointer storing vector x on the GPU. 
- incx – [in] rocblas_int specifies the increment for the elements of x. 
- y – [out] pointer storing vector y on the GPU. 
- incy – [inout] rocblas_int specifies the increment for the elements of y. 
- batch_count – [in] rocblas_int number of instances in the batch. 
 
 
The axpy_batched functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_saxpy_strided_batched(rocblas_handle handle, rocblas_int n, const float *alpha, const float *x, rocblas_int incx, rocblas_stride stridex, float *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)#
- 
rocblas_status rocblas_daxpy_strided_batched(rocblas_handle handle, rocblas_int n, const double *alpha, const double *x, rocblas_int incx, rocblas_stride stridex, double *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)#
- 
rocblas_status rocblas_haxpy_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_half *alpha, const rocblas_half *x, rocblas_int incx, rocblas_stride stridex, rocblas_half *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)#
- 
rocblas_status rocblas_caxpy_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *alpha, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_float_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)#
- 
rocblas_status rocblas_zaxpy_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *alpha, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_double_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)#
- BLAS Level 1 API - axpy_strided_batched compute y := alpha * x + y over a set of strided batched vectors. - Parameters:
- handle – [in] rocblas_handle handle to the rocblas library context queue. 
- n – [in] rocblas_int. 
- alpha – [in] specifies the scalar alpha. 
- x – [in] pointer storing vector x on the GPU. 
- incx – [in] rocblas_int specifies the increment for the elements of x. 
- stridex – [in] rocblas_stride specifies the increment between vectors of x. 
- y – [out] pointer storing vector y on the GPU. 
- incy – [inout] rocblas_int specifies the increment for the elements of y. 
- stridey – [in] rocblas_stride specifies the increment between vectors of y. 
- batch_count – [in] rocblas_int number of instances in the batch. 
 
 
The axpy_strided_batched functions support the _64 interface. Refer to section ILP64 Interface.
rocblas_Xcopy + batched, strided_batched#
- 
rocblas_status rocblas_scopy(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, float *y, rocblas_int incy)#
- 
rocblas_status rocblas_dcopy(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, double *y, rocblas_int incy)#
- 
rocblas_status rocblas_ccopy(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_float_complex *y, rocblas_int incy)#
- 
rocblas_status rocblas_zcopy(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_double_complex *y, rocblas_int incy)#
- BLAS Level 1 API - copy copies each element x[i] into y[i], for i = 1 , … , n: - y := x - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] the number of elements in x to be copied to y. 
- x – [in] device pointer storing vector x. 
- incx – [in] [rocblas_int] specifies the increment for the elements of x. 
- y – [out] device pointer storing vector y. 
- incy – [in] [rocblas_int] specifies the increment for the elements of y. 
 
 
The copy functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_scopy_batched(rocblas_handle handle, rocblas_int n, const float *const x[], rocblas_int incx, float *const y[], rocblas_int incy, rocblas_int batch_count)#
- 
rocblas_status rocblas_dcopy_batched(rocblas_handle handle, rocblas_int n, const double *const x[], rocblas_int incx, double *const y[], rocblas_int incy, rocblas_int batch_count)#
- 
rocblas_status rocblas_ccopy_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *const x[], rocblas_int incx, rocblas_float_complex *const y[], rocblas_int incy, rocblas_int batch_count)#
- 
rocblas_status rocblas_zcopy_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *const x[], rocblas_int incx, rocblas_double_complex *const y[], rocblas_int incy, rocblas_int batch_count)#
- BLAS Level 1 API - copy_batched copies each element x_i[j] into y_i[j], for j = 1 , … , n; i = 1 , … , batch_count: - y_i := x_i, where (x_i, y_i) is the i-th instance of the batch. x_i and y_i are vectors. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] the number of elements in each x_i to be copied to y_i. 
- x – [in] device array of device pointers storing each vector x_i. 
- incx – [in] [rocblas_int] specifies the increment for the elements of each vector x_i. 
- y – [out] device array of device pointers storing each vector y_i. 
- incy – [in] [rocblas_int] specifies the increment for the elements of each vector y_i. 
- batch_count – [in] [rocblas_int] number of instances in the batch. 
 
 
The copy_batched functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_scopy_strided_batched(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, rocblas_stride stridex, float *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)#
- 
rocblas_status rocblas_dcopy_strided_batched(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, rocblas_stride stridex, double *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)#
- 
rocblas_status rocblas_ccopy_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_float_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)#
- 
rocblas_status rocblas_zcopy_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_double_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)#
- BLAS Level 1 API - copy_strided_batched copies each element x_i[j] into y_i[j], for j = 1 , … , n; i = 1 , … , batch_count: - y_i := x_i, where (x_i, y_i) is the i-th instance of the batch. x_i and y_i are vectors. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] the number of elements in each x_i to be copied to y_i. 
- x – [in] device pointer to the first vector (x_1) in the batch. 
- incx – [in] [rocblas_int] specifies the increments for the elements of vectors x_i. 
- stridex – [in] [rocblas_stride] stride from the start of one vector (x_i) and the next one (x_i+1). There are no restrictions placed on stride_x. However, the user should take care to ensure that stride_x is of appropriate size. For a typical case, this means stride_x >= n * incx. 
- y – [out] device pointer to the first vector (y_1) in the batch. 
- incy – [in] [rocblas_int] specifies the increment for the elements of vectors y_i. 
- stridey – [in] [rocblas_stride] stride from the start of one vector (y_i) and the next one (y_i+1). There are no restrictions placed on stride_y, However, ensure that stride_y is of appropriate size, for a typical case this means stride_y >= n * incy. stridey should be non zero. 
- batch_count – [in] [rocblas_int] number of instances in the batch. 
 
 
The copy_strided_batched functions support the _64 interface. Refer to section ILP64 Interface.
rocblas_Xdot + batched, strided_batched#
- 
rocblas_status rocblas_sdot(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, const float *y, rocblas_int incy, float *result)#
- 
rocblas_status rocblas_ddot(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, const double *y, rocblas_int incy, double *result)#
- 
rocblas_status rocblas_hdot(rocblas_handle handle, rocblas_int n, const rocblas_half *x, rocblas_int incx, const rocblas_half *y, rocblas_int incy, rocblas_half *result)#
- 
rocblas_status rocblas_bfdot(rocblas_handle handle, rocblas_int n, const rocblas_bfloat16 *x, rocblas_int incx, const rocblas_bfloat16 *y, rocblas_int incy, rocblas_bfloat16 *result)#
- 
rocblas_status rocblas_cdotu(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, const rocblas_float_complex *y, rocblas_int incy, rocblas_float_complex *result)#
- 
rocblas_status rocblas_cdotc(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, const rocblas_float_complex *y, rocblas_int incy, rocblas_float_complex *result)#
- 
rocblas_status rocblas_zdotu(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, const rocblas_double_complex *y, rocblas_int incy, rocblas_double_complex *result)#
- 
rocblas_status rocblas_zdotc(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, const rocblas_double_complex *y, rocblas_int incy, rocblas_double_complex *result)#
- BLAS Level 1 API - dot(u) performs the dot product of vectors x and y: dotc performs the dot product of the conjugate of complex vector x and complex vector y.- result = x * y; - result = conjugate (x) * y; - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] the number of elements in x and y. 
- x – [in] device pointer storing vector x. 
- incx – [in] [rocblas_int] specifies the increment for the elements of y. 
- y – [in] device pointer storing vector y. 
- incy – [in] [rocblas_int] specifies the increment for the elements of y. 
- result – [inout] device pointer or host pointer to store the dot product. return is 0.0 if n <= 0. 
 
 
The dot/c/u functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_sdot_batched(rocblas_handle handle, rocblas_int n, const float *const x[], rocblas_int incx, const float *const y[], rocblas_int incy, rocblas_int batch_count, float *result)#
- 
rocblas_status rocblas_ddot_batched(rocblas_handle handle, rocblas_int n, const double *const x[], rocblas_int incx, const double *const y[], rocblas_int incy, rocblas_int batch_count, double *result)#
- 
rocblas_status rocblas_hdot_batched(rocblas_handle handle, rocblas_int n, const rocblas_half *const x[], rocblas_int incx, const rocblas_half *const y[], rocblas_int incy, rocblas_int batch_count, rocblas_half *result)#
- 
rocblas_status rocblas_bfdot_batched(rocblas_handle handle, rocblas_int n, const rocblas_bfloat16 *const x[], rocblas_int incx, const rocblas_bfloat16 *const y[], rocblas_int incy, rocblas_int batch_count, rocblas_bfloat16 *result)#
- 
rocblas_status rocblas_cdotu_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *const x[], rocblas_int incx, const rocblas_float_complex *const y[], rocblas_int incy, rocblas_int batch_count, rocblas_float_complex *result)#
- 
rocblas_status rocblas_cdotc_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *const x[], rocblas_int incx, const rocblas_float_complex *const y[], rocblas_int incy, rocblas_int batch_count, rocblas_float_complex *result)#
- 
rocblas_status rocblas_zdotu_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *const x[], rocblas_int incx, const rocblas_double_complex *const y[], rocblas_int incy, rocblas_int batch_count, rocblas_double_complex *result)#
- 
rocblas_status rocblas_zdotc_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *const x[], rocblas_int incx, const rocblas_double_complex *const y[], rocblas_int incy, rocblas_int batch_count, rocblas_double_complex *result)#
- BLAS Level 1 API - dot_batched(u) performs a batch of dot products of vectors x and y: dotc_batched performs a batch of dot products of the conjugate of complex vector x and complex vector y- result_i = x_i * y_i; - result_i = conjugate (x_i) * y_i; where (x_i, y_i) is the i-th instance of the batch. x_i and y_i are vectors, for i = 1, ..., batch_count. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] the number of elements in each x_i and y_i. 
- x – [in] device array of device pointers storing each vector x_i. 
- incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. 
- y – [in] device array of device pointers storing each vector y_i. 
- incy – [in] [rocblas_int] specifies the increment for the elements of each y_i. 
- batch_count – [in] [rocblas_int] number of instances in the batch. 
- result – [inout] device array or host array of batch_count size to store the dot products of each batch. return 0.0 for each element if n <= 0. 
 
 
The dot/c/u_batched functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_sdot_strided_batched(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, rocblas_stride stridex, const float *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count, float *result)#
- 
rocblas_status rocblas_ddot_strided_batched(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, rocblas_stride stridex, const double *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count, double *result)#
- 
rocblas_status rocblas_hdot_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_half *x, rocblas_int incx, rocblas_stride stridex, const rocblas_half *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count, rocblas_half *result)#
- 
rocblas_status rocblas_bfdot_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_bfloat16 *x, rocblas_int incx, rocblas_stride stridex, const rocblas_bfloat16 *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count, rocblas_bfloat16 *result)#
- 
rocblas_status rocblas_cdotu_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_float_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count, rocblas_float_complex *result)#
- 
rocblas_status rocblas_cdotc_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_float_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count, rocblas_float_complex *result)#
- 
rocblas_status rocblas_zdotu_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_double_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count, rocblas_double_complex *result)#
- 
rocblas_status rocblas_zdotc_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, const rocblas_double_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count, rocblas_double_complex *result)#
- BLAS Level 1 API - dot_strided_batched(u) performs a batch of dot products of vectors x and y: dotc_strided_batched performs a batch of dot products of the conjugate of complex vector x and complex vector y- result_i = x_i * y_i; - result_i = conjugate (x_i) * y_i; where (x_i, y_i) is the i-th instance of the batch. x_i and y_i are vectors, for i = 1, ..., batch_count. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] the number of elements in each x_i and y_i. 
- x – [in] device pointer to the first vector (x_1) in the batch. 
- incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. 
- stridex – [in] [rocblas_stride] stride from the start of one vector (x_i) and the next one (x_i+1). 
- y – [in] device pointer to the first vector (y_1) in the batch. 
- incy – [in] [rocblas_int] specifies the increment for the elements of each y_i. 
- stridey – [in] [rocblas_stride] stride from the start of one vector (y_i) and the next one (y_i+1). 
- batch_count – [in] [rocblas_int] number of instances in the batch. 
- result – [inout] device array or host array of batch_count size to store the dot products of each batch. return 0.0 for each element if n <= 0. 
 
 
The dot/c/u_strided_batched functions support the _64 interface. Refer to section ILP64 Interface.
rocblas_Xnrm2 + batched, strided_batched#
- 
rocblas_status rocblas_snrm2(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, float *result)#
- 
rocblas_status rocblas_dnrm2(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, double *result)#
- 
rocblas_status rocblas_scnrm2(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, float *result)#
- 
rocblas_status rocblas_dznrm2(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, double *result)#
- BLAS Level 1 API - nrm2 computes the euclidean norm of a real or complex vector: - result := sqrt( x'*x ) for real vectors result := sqrt( x**H*x ) for complex vectors - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] the number of elements in x. 
- x – [in] device pointer storing vector x. 
- incx – [in] [rocblas_int] specifies the increment for the elements of y. 
- result – [inout] device pointer or host pointer to store the nrm2 product. return is 0.0 if n, incx<=0. 
 
 
The nrm2 functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_snrm2_batched(rocblas_handle handle, rocblas_int n, const float *const x[], rocblas_int incx, rocblas_int batch_count, float *results)#
- 
rocblas_status rocblas_dnrm2_batched(rocblas_handle handle, rocblas_int n, const double *const x[], rocblas_int incx, rocblas_int batch_count, double *results)#
- 
rocblas_status rocblas_scnrm2_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *const x[], rocblas_int incx, rocblas_int batch_count, float *results)#
- 
rocblas_status rocblas_dznrm2_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *const x[], rocblas_int incx, rocblas_int batch_count, double *results)#
- BLAS Level 1 API - nrm2_batched computes the euclidean norm over a batch of real or complex vectors: - result := sqrt( x_i'*x_i ) for real vectors x, for i = 1, ..., batch_count result := sqrt( x_i**H*x_i ) for complex vectors x, for i = 1, ..., batch_count - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] number of elements in each x_i. 
- x – [in] device array of device pointers storing each vector x_i. 
- incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. incx must be > 0. 
- batch_count – [in] [rocblas_int] number of instances in the batch. 
- results – [out] device pointer or host pointer to array of batch_count size for nrm2 results. return is 0.0 for each element if n <= 0, incx<=0. 
 
 
The nrm2_batched functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_snrm2_strided_batched(rocblas_handle handle, rocblas_int n, const float *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, float *results)#
- 
rocblas_status rocblas_dnrm2_strided_batched(rocblas_handle handle, rocblas_int n, const double *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, double *results)#
- 
rocblas_status rocblas_scnrm2_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, float *results)#
- 
rocblas_status rocblas_dznrm2_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_int batch_count, double *results)#
- BLAS Level 1 API - nrm2_strided_batched computes the euclidean norm over a batch of real or complex vectors: - result := sqrt( x_i'*x_i ) for real vectors x, for i = 1, ..., batch_count result := sqrt( x_i**H*x_i ) for complex vectors, for i = 1, ..., batch_count - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] number of elements in each x_i. 
- x – [in] device pointer to the first vector x_1. 
- incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. incx must be > 0. 
- stridex – [in] [rocblas_stride] stride from the start of one vector (x_i) and the next one (x_i+1). There are no restrictions placed on stride_x. However, ensure that stride_x is of appropriate size. For a typical case this means stride_x >= n * incx. 
- batch_count – [in] [rocblas_int] number of instances in the batch. 
- results – [out] device pointer or host pointer to array for storing contiguous batch_count results. return is 0.0 for each element if n <= 0, incx<=0. 
 
 
The nrm2_strided_batched functions support the _64 interface. Refer to section ILP64 Interface.
rocblas_Xrot + batched, strided_batched#
- 
rocblas_status rocblas_srot(rocblas_handle handle, rocblas_int n, float *x, rocblas_int incx, float *y, rocblas_int incy, const float *c, const float *s)#
- 
rocblas_status rocblas_drot(rocblas_handle handle, rocblas_int n, double *x, rocblas_int incx, double *y, rocblas_int incy, const double *c, const double *s)#
- 
rocblas_status rocblas_crot(rocblas_handle handle, rocblas_int n, rocblas_float_complex *x, rocblas_int incx, rocblas_float_complex *y, rocblas_int incy, const float *c, const rocblas_float_complex *s)#
- 
rocblas_status rocblas_csrot(rocblas_handle handle, rocblas_int n, rocblas_float_complex *x, rocblas_int incx, rocblas_float_complex *y, rocblas_int incy, const float *c, const float *s)#
- 
rocblas_status rocblas_zrot(rocblas_handle handle, rocblas_int n, rocblas_double_complex *x, rocblas_int incx, rocblas_double_complex *y, rocblas_int incy, const double *c, const rocblas_double_complex *s)#
- 
rocblas_status rocblas_zdrot(rocblas_handle handle, rocblas_int n, rocblas_double_complex *x, rocblas_int incx, rocblas_double_complex *y, rocblas_int incy, const double *c, const double *s)#
- BLAS Level 1 API - rot applies the Givens rotation matrix defined by c=cos(alpha) and s=sin(alpha) to vectors x and y. Scalars c and s may be stored in either host or device memory. Location is specified by calling rocblas_set_pointer_mode. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] number of elements in the x and y vectors. 
- x – [inout] device pointer storing vector x. 
- incx – [in] [rocblas_int] specifies the increment between elements of x. 
- y – [inout] device pointer storing vector y. 
- incy – [in] [rocblas_int] specifies the increment between elements of y. 
- c – [in] device pointer or host pointer storing scalar cosine component of the rotation matrix. 
- s – [in] device pointer or host pointer storing scalar sine component of the rotation matrix. 
 
 
The rot functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_srot_batched(rocblas_handle handle, rocblas_int n, float *const x[], rocblas_int incx, float *const y[], rocblas_int incy, const float *c, const float *s, rocblas_int batch_count)#
- 
rocblas_status rocblas_drot_batched(rocblas_handle handle, rocblas_int n, double *const x[], rocblas_int incx, double *const y[], rocblas_int incy, const double *c, const double *s, rocblas_int batch_count)#
- 
rocblas_status rocblas_crot_batched(rocblas_handle handle, rocblas_int n, rocblas_float_complex *const x[], rocblas_int incx, rocblas_float_complex *const y[], rocblas_int incy, const float *c, const rocblas_float_complex *s, rocblas_int batch_count)#
- 
rocblas_status rocblas_csrot_batched(rocblas_handle handle, rocblas_int n, rocblas_float_complex *const x[], rocblas_int incx, rocblas_float_complex *const y[], rocblas_int incy, const float *c, const float *s, rocblas_int batch_count)#
- 
rocblas_status rocblas_zrot_batched(rocblas_handle handle, rocblas_int n, rocblas_double_complex *const x[], rocblas_int incx, rocblas_double_complex *const y[], rocblas_int incy, const double *c, const rocblas_double_complex *s, rocblas_int batch_count)#
- 
rocblas_status rocblas_zdrot_batched(rocblas_handle handle, rocblas_int n, rocblas_double_complex *const x[], rocblas_int incx, rocblas_double_complex *const y[], rocblas_int incy, const double *c, const double *s, rocblas_int batch_count)#
- BLAS Level 1 API - rot_batched applies the Givens rotation matrix defined by c=cos(alpha) and s=sin(alpha) to batched vectors x_i and y_i, for i = 1, …, batch_count. Scalars c and s may be stored in either host or device memory. Location is specified by calling rocblas_set_pointer_mode. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] number of elements in each x_i and y_i vectors. 
- x – [inout] device array of deivce pointers storing each vector x_i. 
- incx – [in] [rocblas_int] specifies the increment between elements of each x_i. 
- y – [inout] device array of device pointers storing each vector y_i. 
- incy – [in] [rocblas_int] specifies the increment between elements of each y_i. 
- c – [in] device pointer or host pointer to scalar cosine component of the rotation matrix. 
- s – [in] device pointer or host pointer to scalar sine component of the rotation matrix. 
- batch_count – [in] [rocblas_int] the number of x and y arrays, i.e. the number of batches. 
 
 
The rot_batched functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_srot_strided_batched(rocblas_handle handle, rocblas_int n, float *x, rocblas_int incx, rocblas_stride stride_x, float *y, rocblas_int incy, rocblas_stride stride_y, const float *c, const float *s, rocblas_int batch_count)#
- 
rocblas_status rocblas_drot_strided_batched(rocblas_handle handle, rocblas_int n, double *x, rocblas_int incx, rocblas_stride stride_x, double *y, rocblas_int incy, rocblas_stride stride_y, const double *c, const double *s, rocblas_int batch_count)#
- 
rocblas_status rocblas_crot_strided_batched(rocblas_handle handle, rocblas_int n, rocblas_float_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_float_complex *y, rocblas_int incy, rocblas_stride stride_y, const float *c, const rocblas_float_complex *s, rocblas_int batch_count)#
- 
rocblas_status rocblas_csrot_strided_batched(rocblas_handle handle, rocblas_int n, rocblas_float_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_float_complex *y, rocblas_int incy, rocblas_stride stride_y, const float *c, const float *s, rocblas_int batch_count)#
- 
rocblas_status rocblas_zrot_strided_batched(rocblas_handle handle, rocblas_int n, rocblas_double_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_double_complex *y, rocblas_int incy, rocblas_stride stride_y, const double *c, const rocblas_double_complex *s, rocblas_int batch_count)#
- 
rocblas_status rocblas_zdrot_strided_batched(rocblas_handle handle, rocblas_int n, rocblas_double_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_double_complex *y, rocblas_int incy, rocblas_stride stride_y, const double *c, const double *s, rocblas_int batch_count)#
- BLAS Level 1 API - rot_strided_batched applies the Givens rotation matrix defined by c=cos(alpha) and s=sin(alpha) to strided batched vectors x_i and y_i, for i = 1, …, batch_count. Scalars c and s may be stored in either host or device memory, location is specified by calling rocblas_set_pointer_mode. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] number of elements in each x_i and y_i vectors. 
- x – [inout] device pointer to the first vector x_1. 
- incx – [in] [rocblas_int] specifies the increment between elements of each x_i. 
- stride_x – [in] [rocblas_stride] specifies the increment from the beginning of x_i to the beginning of x_(i+1). 
- y – [inout] device pointer to the first vector y_1. 
- incy – [in] [rocblas_int] specifies the increment between elements of each y_i. 
- stride_y – [in] [rocblas_stride] specifies the increment from the beginning of y_i to the beginning of y_(i+1) 
- c – [in] device pointer or host pointer to scalar cosine component of the rotation matrix. 
- s – [in] device pointer or host pointer to scalar sine component of the rotation matrix. 
- batch_count – [in] [rocblas_int] the number of x and y arrays, i.e. the number of batches. 
 
 
The rot_strided_batched functions support the _64 interface. Refer to section ILP64 Interface.
rocblas_Xrotg + batched, strided_batched#
- 
rocblas_status rocblas_srotg(rocblas_handle handle, float *a, float *b, float *c, float *s)#
- 
rocblas_status rocblas_drotg(rocblas_handle handle, double *a, double *b, double *c, double *s)#
- 
rocblas_status rocblas_crotg(rocblas_handle handle, rocblas_float_complex *a, rocblas_float_complex *b, float *c, rocblas_float_complex *s)#
- 
rocblas_status rocblas_zrotg(rocblas_handle handle, rocblas_double_complex *a, rocblas_double_complex *b, double *c, rocblas_double_complex *s)#
- BLAS Level 1 API - rotg creates the Givens rotation matrix for the vector (a b). Scalars a, b, c, and s may be stored in either host or device memory, location is specified by calling rocblas_set_pointer_mode. The computation uses the formulas The subroutine also computes- sigma = sgn(a) if |a| > |b| = sgn(b) if |b| >= |a| r = sigma*sqrt( a**2 + b**2 ) c = 1; s = 0 if r = 0 c = a/r; s = b/r if r != 0 This allows c and s to be reconstructed from z as follows:- z = s if |a| > |b|, = 1/c if |b| >= |a| and c != 0 = 1 if c = 0 - If z = 1, set c = 0, s = 1. If |z| < 1, set c = sqrt(1 - z**2) and s = z. If |z| > 1, set c = 1/z and s = sqrt( 1 - c**2). - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- a – [inout] pointer to a, an element in vector (a,b), overwritten with r. 
- b – [inout] pointer to b, an element in vector (a,b), overwritten with z. 
- c – [out] pointer to c, cosine element of Givens rotation. 
- s – [out] pointer to s, sine element of Givens rotation. 
 
 
The rotg functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_srotg_batched(rocblas_handle handle, float *const a[], float *const b[], float *const c[], float *const s[], rocblas_int batch_count)#
- 
rocblas_status rocblas_drotg_batched(rocblas_handle handle, double *const a[], double *const b[], double *const c[], double *const s[], rocblas_int batch_count)#
- 
rocblas_status rocblas_crotg_batched(rocblas_handle handle, rocblas_float_complex *const a[], rocblas_float_complex *const b[], float *const c[], rocblas_float_complex *const s[], rocblas_int batch_count)#
- 
rocblas_status rocblas_zrotg_batched(rocblas_handle handle, rocblas_double_complex *const a[], rocblas_double_complex *const b[], double *const c[], rocblas_double_complex *const s[], rocblas_int batch_count)#
- BLAS Level 1 API - rotg_batched creates the Givens rotation matrix for the batched vectors (a_i b_i), for i = 1, …, batch_count. a, b, c, and s are host pointers to an array of device pointers on the device, where each device pointer points to a scalar value of a_i, b_i, c_i, or s_i. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- a – [inout] a, overwritten with r. 
- b – [inout] b overwritten with z. 
- c – [out] cosine element of Givens rotation for the batch. 
- s – [out] sine element of Givens rotation for the batch. 
- batch_count – [in] [rocblas_int] number of batches (length of arrays a, b, c, and s). 
 
 
The rotg_batched functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_srotg_strided_batched(rocblas_handle handle, float *a, rocblas_stride stride_a, float *b, rocblas_stride stride_b, float *c, rocblas_stride stride_c, float *s, rocblas_stride stride_s, rocblas_int batch_count)#
- 
rocblas_status rocblas_drotg_strided_batched(rocblas_handle handle, double *a, rocblas_stride stride_a, double *b, rocblas_stride stride_b, double *c, rocblas_stride stride_c, double *s, rocblas_stride stride_s, rocblas_int batch_count)#
- 
rocblas_status rocblas_crotg_strided_batched(rocblas_handle handle, rocblas_float_complex *a, rocblas_stride stride_a, rocblas_float_complex *b, rocblas_stride stride_b, float *c, rocblas_stride stride_c, rocblas_float_complex *s, rocblas_stride stride_s, rocblas_int batch_count)#
- 
rocblas_status rocblas_zrotg_strided_batched(rocblas_handle handle, rocblas_double_complex *a, rocblas_stride stride_a, rocblas_double_complex *b, rocblas_stride stride_b, double *c, rocblas_stride stride_c, rocblas_double_complex *s, rocblas_stride stride_s, rocblas_int batch_count)#
- BLAS Level 1 API - rotg_strided_batched creates the Givens rotation matrix for the strided batched vectors (a_i b_i), for i = 1, …, batch_count. a, b, c, and s are host pointers to arrays a, b, c, s on the device. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- a – [inout] host pointer to first single input vector element a_1 on the device, overwritten with r. 
- stride_a – [in] [rocblas_stride] distance between elements of a in batch (distance between a_i and a_(i + 1)). 
- b – [inout] host pointer to first single input vector element b_1 on the device, overwritten with z. 
- stride_b – [in] [rocblas_stride] distance between elements of b in batch (distance between b_i and b_(i + 1)). 
- c – [out] host pointer to first single cosine element of Givens rotations c_1 on the device. 
- stride_c – [in] [rocblas_stride] distance between elements of c in batch (distance between c_i and c_(i + 1)). 
- s – [out] host pointer to first single sine element of Givens rotations s_1 on the device. 
- stride_s – [in] [rocblas_stride] distance between elements of s in batch (distance between s_i and s_(i + 1)). 
- batch_count – [in] [rocblas_int] number of batches (length of arrays a, b, c, and s). 
 
 
The rotg_strided_batched functions support the _64 interface. Refer to section ILP64 Interface.
rocblas_Xrotm + batched, strided_batched#
- 
rocblas_status rocblas_srotm(rocblas_handle handle, rocblas_int n, float *x, rocblas_int incx, float *y, rocblas_int incy, const float *param)#
- 
rocblas_status rocblas_drotm(rocblas_handle handle, rocblas_int n, double *x, rocblas_int incx, double *y, rocblas_int incy, const double *param)#
- BLAS Level 1 API - rotm applies the modified Givens rotation matrix defined by param to vectors x and y. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] number of elements in the x and y vectors. 
- x – [inout] device pointer storing vector x. 
- incx – [in] [rocblas_int] specifies the increment between elements of x. 
- y – [inout] device pointer storing vector y. 
- incy – [in] [rocblas_int] specifies the increment between elements of y. 
- param – [in] device vector or host vector of 5 elements defining the rotation. - param[0] = flag param[1] = H11 param[2] = H21 param[3] = H12 param[4] = H22 The flag parameter defines the form of H: flag = -1 => H = ( H11 H12 H21 H22 ) flag = 0 => H = ( 1.0 H12 H21 1.0 ) flag = 1 => H = ( H11 1.0 -1.0 H22 ) flag = -2 => H = ( 1.0 0.0 0.0 1.0 ) param may be stored in either host or device memory, location is specified by calling rocblas_set_pointer_mode. 
 
 
The rotm functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_srotm_batched(rocblas_handle handle, rocblas_int n, float *const x[], rocblas_int incx, float *const y[], rocblas_int incy, const float *const param[], rocblas_int batch_count)#
- 
rocblas_status rocblas_drotm_batched(rocblas_handle handle, rocblas_int n, double *const x[], rocblas_int incx, double *const y[], rocblas_int incy, const double *const param[], rocblas_int batch_count)#
- BLAS Level 1 API - rotm_batched applies the modified Givens rotation matrix defined by param_i to batched vectors x_i and y_i, for i = 1, …, batch_count. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] number of elements in the x and y vectors. 
- x – [inout] device array of device pointers storing each vector x_i. 
- incx – [in] [rocblas_int] specifies the increment between elements of each x_i. 
- y – [inout] device array of device pointers storing each vector y_1. 
- incy – [in] [rocblas_int] specifies the increment between elements of each y_i. 
- param – [in] device array of device vectors of 5 elements defining the rotation. - param[0] = flag param[1] = H11 param[2] = H21 param[3] = H12 param[4] = H22 The flag parameter defines the form of H: flag = -1 => H = ( H11 H12 H21 H22 ) flag = 0 => H = ( 1.0 H12 H21 1.0 ) flag = 1 => H = ( H11 1.0 -1.0 H22 ) flag = -2 => H = ( 1.0 0.0 0.0 1.0 ) param may ONLY be stored on the device for the batched version of this function. 
- batch_count – [in] [rocblas_int] the number of x and y arrays, i.e. the number of batches. 
 
 
The rotm_batched functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_srotm_strided_batched(rocblas_handle handle, rocblas_int n, float *x, rocblas_int incx, rocblas_stride stride_x, float *y, rocblas_int incy, rocblas_stride stride_y, const float *param, rocblas_stride stride_param, rocblas_int batch_count)#
- 
rocblas_status rocblas_drotm_strided_batched(rocblas_handle handle, rocblas_int n, double *x, rocblas_int incx, rocblas_stride stride_x, double *y, rocblas_int incy, rocblas_stride stride_y, const double *param, rocblas_stride stride_param, rocblas_int batch_count)#
- BLAS Level 1 API - rotm_strided_batched applies the modified Givens rotation matrix defined by param_i to strided batched vectors x_i and y_i, for i = 1, …, batch_count - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] number of elements in the x and y vectors. 
- x – [inout] device pointer pointing to first strided batched vector x_1. 
- incx – [in] [rocblas_int] specifies the increment between elements of each x_i. 
- stride_x – [in] [rocblas_stride] specifies the increment between the beginning of x_i and x_(i + 1) 
- y – [inout] device pointer pointing to first strided batched vector y_1. 
- incy – [in] [rocblas_int] specifies the increment between elements of each y_i. 
- stride_y – [in] [rocblas_stride] specifies the increment between the beginning of y_i and y_(i + 1). 
- param – [in] device pointer pointing to first array of 5 elements defining the rotation (param_1). - param[0] = flag param[1] = H11 param[2] = H21 param[3] = H12 param[4] = H22 The flag parameter defines the form of H: flag = -1 => H = ( H11 H12 H21 H22 ) flag = 0 => H = ( 1.0 H12 H21 1.0 ) flag = 1 => H = ( H11 1.0 -1.0 H22 ) flag = -2 => H = ( 1.0 0.0 0.0 1.0 ) param may ONLY be stored on the device for the strided_batched version of this function. 
- stride_param – [in] [rocblas_stride] specifies the increment between the beginning of param_i and param_(i + 1). 
- batch_count – [in] [rocblas_int] the number of x and y arrays, i.e. the number of batches. 
 
 
The rotm_strided_batched functions support the _64 interface. Refer to section ILP64 Interface.
rocblas_Xrotmg + batched, strided_batched#
- 
rocblas_status rocblas_srotmg(rocblas_handle handle, float *d1, float *d2, float *x1, const float *y1, float *param)#
- 
rocblas_status rocblas_drotmg(rocblas_handle handle, double *d1, double *d2, double *x1, const double *y1, double *param)#
- BLAS Level 1 API - rotmg creates the modified Givens rotation matrix for the vector (d1 * x1, d2 * y1). Parameters may be stored in either host or device memory. Location is specified by calling rocblas_set_pointer_mode: - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- d1 – [inout] device pointer or host pointer to input scalar that is overwritten. 
- d2 – [inout] device pointer or host pointer to input scalar that is overwritten. 
- x1 – [inout] device pointer or host pointer to input scalar that is overwritten. 
- y1 – [in] device pointer or host pointer to input scalar. 
- param – [out] device vector or host vector of five elements defining the rotation. - param[0] = flag param[1] = H11 param[2] = H21 param[3] = H12 param[4] = H22 The flag parameter defines the form of H: flag = -1 => H = ( H11 H12 H21 H22 ) flag = 0 => H = ( 1.0 H12 H21 1.0 ) flag = 1 => H = ( H11 1.0 -1.0 H22 ) flag = -2 => H = ( 1.0 0.0 0.0 1.0 ) param may be stored in either host or device memory. Location is specified by calling rocblas_set_pointer_mode. 
 
 
The rotmg functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_srotmg_batched(rocblas_handle handle, float *const d1[], float *const d2[], float *const x1[], const float *const y1[], float *const param[], rocblas_int batch_count)#
- 
rocblas_status rocblas_drotmg_batched(rocblas_handle handle, double *const d1[], double *const d2[], double *const x1[], const double *const y1[], double *const param[], rocblas_int batch_count)#
- BLAS Level 1 API - rotmg_batched creates the modified Givens rotation matrix for the batched vectors (d1_i * x1_i, d2_i * y1_i), for i = 1, …, batch_count. Parameters may be stored in either host or device memory. Location is specified by calling rocblas_set_pointer_mode: - If the pointer mode is set to rocblas_pointer_mode_host, then this function blocks the CPU until the GPU has finished and the results are available in host memory. 
- If the pointer mode is set to rocblas_pointer_mode_device, then this function returns immediately and synchronization is required to read the results. 
 - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- d1 – [inout] device batched array or host batched array of input scalars that is overwritten. 
- d2 – [inout] device batched array or host batched array of input scalars that is overwritten. 
- x1 – [inout] device batched array or host batched array of input scalars that is overwritten. 
- y1 – [in] device batched array or host batched array of input scalars. 
- param – [out] device batched array or host batched array of vectors of 5 elements defining the rotation. - param[0] = flag param[1] = H11 param[2] = H21 param[3] = H12 param[4] = H22 The flag parameter defines the form of H: flag = -1 => H = ( H11 H12 H21 H22 ) flag = 0 => H = ( 1.0 H12 H21 1.0 ) flag = 1 => H = ( H11 1.0 -1.0 H22 ) flag = -2 => H = ( 1.0 0.0 0.0 1.0 ) param may be stored in either host or device memory. Location is specified by calling rocblas_set_pointer_mode. 
- batch_count – [in] [rocblas_int] the number of instances in the batch. 
 
 
The rotmg_batched functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_srotmg_strided_batched(rocblas_handle handle, float *d1, rocblas_stride stride_d1, float *d2, rocblas_stride stride_d2, float *x1, rocblas_stride stride_x1, const float *y1, rocblas_stride stride_y1, float *param, rocblas_stride stride_param, rocblas_int batch_count)#
- 
rocblas_status rocblas_drotmg_strided_batched(rocblas_handle handle, double *d1, rocblas_stride stride_d1, double *d2, rocblas_stride stride_d2, double *x1, rocblas_stride stride_x1, const double *y1, rocblas_stride stride_y1, double *param, rocblas_stride stride_param, rocblas_int batch_count)#
- BLAS Level 1 API - rotmg_strided_batched creates the modified Givens rotation matrix for the strided batched vectors (d1_i * x1_i, d2_i * y1_i), for i = 1, …, batch_count. Parameters may be stored in either host or device memory. Location is specified by calling rocblas_set_pointer_mode: - If the pointer mode is set to rocblas_pointer_mode_host, then this function blocks the CPU until the GPU has finished and the results are available in host memory. 
- If the pointer mode is set to rocblas_pointer_mode_device, then this function returns immediately and synchronization is required to read the results. 
 - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- d1 – [inout] device strided_batched array or host strided_batched array of input scalars that is overwritten. 
- stride_d1 – [in] [rocblas_stride] specifies the increment between the beginning of d1_i and d1_(i+1). 
- d2 – [inout] device strided_batched array or host strided_batched array of input scalars that is overwritten. 
- stride_d2 – [in] [rocblas_stride] specifies the increment between the beginning of d2_i and d2_(i+1). 
- x1 – [inout] device strided_batched array or host strided_batched array of input scalars that is overwritten. 
- stride_x1 – [in] [rocblas_stride] specifies the increment between the beginning of x1_i and x1_(i+1). 
- y1 – [in] device strided_batched array or host strided_batched array of input scalars. 
- stride_y1 – [in] [rocblas_stride] specifies the increment between the beginning of y1_i and y1_(i+1). 
- param – [out] device strided_batched array or host strided_batched array of vectors of 5 elements defining the rotation. - param[0] = flag param[1] = H11 param[2] = H21 param[3] = H12 param[4] = H22 The flag parameter defines the form of H: flag = -1 => H = ( H11 H12 H21 H22 ) flag = 0 => H = ( 1.0 H12 H21 1.0 ) flag = 1 => H = ( H11 1.0 -1.0 H22 ) flag = -2 => H = ( 1.0 0.0 0.0 1.0 ) param may be stored in either host or device memory. Location is specified by calling rocblas_set_pointer_mode. 
- stride_param – [in] [rocblas_stride] specifies the increment between the beginning of param_i and param_(i + 1). 
- batch_count – [in] [rocblas_int] the number of instances in the batch. 
 
 
The rotmg_strided_batched functions support the _64 interface. Refer to section ILP64 Interface.
rocblas_Xscal + batched, strided_batched#
- 
rocblas_status rocblas_sscal(rocblas_handle handle, rocblas_int n, const float *alpha, float *x, rocblas_int incx)#
- 
rocblas_status rocblas_dscal(rocblas_handle handle, rocblas_int n, const double *alpha, double *x, rocblas_int incx)#
- 
rocblas_status rocblas_cscal(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *alpha, rocblas_float_complex *x, rocblas_int incx)#
- 
rocblas_status rocblas_zscal(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *alpha, rocblas_double_complex *x, rocblas_int incx)#
- 
rocblas_status rocblas_csscal(rocblas_handle handle, rocblas_int n, const float *alpha, rocblas_float_complex *x, rocblas_int incx)#
- 
rocblas_status rocblas_zdscal(rocblas_handle handle, rocblas_int n, const double *alpha, rocblas_double_complex *x, rocblas_int incx)#
- BLAS Level 1 API - scal scales each element of vector x with scalar alpha: - x := alpha * x - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] the number of elements in x. 
- alpha – [in] device pointer or host pointer for the scalar alpha. 
- x – [inout] device pointer storing vector x. 
- incx – [in] [rocblas_int] specifies the increment for the elements of x. 
 
 
The scal functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_sscal_batched(rocblas_handle handle, rocblas_int n, const float *alpha, float *const x[], rocblas_int incx, rocblas_int batch_count)#
- 
rocblas_status rocblas_dscal_batched(rocblas_handle handle, rocblas_int n, const double *alpha, double *const x[], rocblas_int incx, rocblas_int batch_count)#
- 
rocblas_status rocblas_cscal_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *alpha, rocblas_float_complex *const x[], rocblas_int incx, rocblas_int batch_count)#
- 
rocblas_status rocblas_zscal_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *alpha, rocblas_double_complex *const x[], rocblas_int incx, rocblas_int batch_count)#
- 
rocblas_status rocblas_csscal_batched(rocblas_handle handle, rocblas_int n, const float *alpha, rocblas_float_complex *const x[], rocblas_int incx, rocblas_int batch_count)#
- 
rocblas_status rocblas_zdscal_batched(rocblas_handle handle, rocblas_int n, const double *alpha, rocblas_double_complex *const x[], rocblas_int incx, rocblas_int batch_count)#
- BLAS Level 1 API - scal_batched scales each element of vector x_i with scalar alpha, for i = 1, … , batch_count: - x_i := alpha * x_i, where (x_i) is the i-th instance of the batch. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] the number of elements in each x_i. 
- alpha – [in] host pointer or device pointer for the scalar alpha. 
- x – [inout] device array of device pointers storing each vector x_i. 
- incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. 
- batch_count – [in] [rocblas_int] specifies the number of batches in x. 
 
 
The scal_batched functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_sscal_strided_batched(rocblas_handle handle, rocblas_int n, const float *alpha, float *x, rocblas_int incx, rocblas_stride stride_x, rocblas_int batch_count)#
- 
rocblas_status rocblas_dscal_strided_batched(rocblas_handle handle, rocblas_int n, const double *alpha, double *x, rocblas_int incx, rocblas_stride stride_x, rocblas_int batch_count)#
- 
rocblas_status rocblas_cscal_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_float_complex *alpha, rocblas_float_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_int batch_count)#
- 
rocblas_status rocblas_zscal_strided_batched(rocblas_handle handle, rocblas_int n, const rocblas_double_complex *alpha, rocblas_double_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_int batch_count)#
- 
rocblas_status rocblas_csscal_strided_batched(rocblas_handle handle, rocblas_int n, const float *alpha, rocblas_float_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_int batch_count)#
- 
rocblas_status rocblas_zdscal_strided_batched(rocblas_handle handle, rocblas_int n, const double *alpha, rocblas_double_complex *x, rocblas_int incx, rocblas_stride stride_x, rocblas_int batch_count)#
- BLAS Level 1 API - scal_strided_batched scales each element of vector x_i with scalar alpha, for i = 1, … , batch_count: - x_i := alpha * x_i, where (x_i) is the i-th instance of the batch. - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] the number of elements in each x_i. 
- alpha – [in] host pointer or device pointer for the scalar alpha. 
- x – [inout] device pointer to the first vector (x_1) in the batch. 
- incx – [in] [rocblas_int] specifies the increment for the elements of x. 
- stride_x – [in] [rocblas_stride] stride from the start of one vector (x_i) and the next one (x_i+1). There are no restrictions placed on stride_x. However, ensure that stride_x is of appropriate size, for a typical case this means stride_x >= n * incx. 
- batch_count – [in] [rocblas_int] specifies the number of batches in x. 
 
 
The scal_strided_batched functions support the _64 interface. Refer to section ILP64 Interface.
rocblas_Xswap + batched, strided_batched#
- 
rocblas_status rocblas_sswap(rocblas_handle handle, rocblas_int n, float *x, rocblas_int incx, float *y, rocblas_int incy)#
- 
rocblas_status rocblas_dswap(rocblas_handle handle, rocblas_int n, double *x, rocblas_int incx, double *y, rocblas_int incy)#
- 
rocblas_status rocblas_cswap(rocblas_handle handle, rocblas_int n, rocblas_float_complex *x, rocblas_int incx, rocblas_float_complex *y, rocblas_int incy)#
- 
rocblas_status rocblas_zswap(rocblas_handle handle, rocblas_int n, rocblas_double_complex *x, rocblas_int incx, rocblas_double_complex *y, rocblas_int incy)#
- BLAS Level 1 API - swap interchanges vectors x and y: - y := x; x := y - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] the number of elements in x and y. 
- x – [inout] device pointer storing vector x. 
- incx – [in] [rocblas_int] specifies the increment for the elements of x. 
- y – [inout] device pointer storing vector y. 
- incy – [in] [rocblas_int] specifies the increment for the elements of y. 
 
 
The swap functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_sswap_batched(rocblas_handle handle, rocblas_int n, float *const x[], rocblas_int incx, float *const y[], rocblas_int incy, rocblas_int batch_count)#
- 
rocblas_status rocblas_dswap_batched(rocblas_handle handle, rocblas_int n, double *const x[], rocblas_int incx, double *const y[], rocblas_int incy, rocblas_int batch_count)#
- 
rocblas_status rocblas_cswap_batched(rocblas_handle handle, rocblas_int n, rocblas_float_complex *const x[], rocblas_int incx, rocblas_float_complex *const y[], rocblas_int incy, rocblas_int batch_count)#
- 
rocblas_status rocblas_zswap_batched(rocblas_handle handle, rocblas_int n, rocblas_double_complex *const x[], rocblas_int incx, rocblas_double_complex *const y[], rocblas_int incy, rocblas_int batch_count)#
- BLAS Level 1 API - swap_batched interchanges vectors x_i and y_i, for i = 1 , … , batch_count: - y_i := x_i; x_i := y_i - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] the number of elements in each x_i and y_i. 
- x – [inout] device array of device pointers storing each vector x_i. 
- incx – [in] [rocblas_int] specifies the increment for the elements of each x_i. 
- y – [inout] device array of device pointers storing each vector y_i. 
- incy – [in] [rocblas_int] specifies the increment for the elements of each y_i. 
- batch_count – [in] [rocblas_int] number of instances in the batch. 
 
 
The swap_batched functions support the _64 interface. Refer to section ILP64 Interface.
- 
rocblas_status rocblas_sswap_strided_batched(rocblas_handle handle, rocblas_int n, float *x, rocblas_int incx, rocblas_stride stridex, float *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)#
- 
rocblas_status rocblas_dswap_strided_batched(rocblas_handle handle, rocblas_int n, double *x, rocblas_int incx, rocblas_stride stridex, double *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)#
- 
rocblas_status rocblas_cswap_strided_batched(rocblas_handle handle, rocblas_int n, rocblas_float_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_float_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)#
- 
rocblas_status rocblas_zswap_strided_batched(rocblas_handle handle, rocblas_int n, rocblas_double_complex *x, rocblas_int incx, rocblas_stride stridex, rocblas_double_complex *y, rocblas_int incy, rocblas_stride stridey, rocblas_int batch_count)#
- BLAS Level 1 API - swap_strided_batched interchanges vectors x_i and y_i, for i = 1 , … , batch_count: - y_i := x_i; x_i := y_i - Parameters:
- handle – [in] [rocblas_handle] handle to the rocblas library context queue. 
- n – [in] [rocblas_int] the number of elements in each x_i and y_i. 
- x – [inout] device pointer to the first vector x_1. 
- incx – [in] [rocblas_int] specifies the increment for the elements of x. 
- stridex – [in] [rocblas_stride] stride from the start of one vector (x_i) and the next one (x_i+1). There are no restrictions placed on stride_x. However, ensure that stride_x is of appropriate size. For a typical case this means stride_x >= n * incx. 
- y – [inout] device pointer to the first vector y_1. 
- incy – [in] [rocblas_int] specifies the increment for the elements of y. 
- stridey – [in] [rocblas_stride] stride from the start of one vector (y_i) and the next one (y_i+1). There are no restrictions placed on stride_x. However, ensure that stride_y is of appropriate size. For a typical case this means stride_y >= n * incy. stridey should be non zero. 
- batch_count – [in] [rocblas_int] number of instances in the batch. 
 
 
The swap_strided_batched functions support the _64 interface. Refer to section ILP64 Interface.