rocSOLVER LAPACK Functions

LAPACK routines solve complex numerical linear algebra problems. These functions are organized by category; the triangular factorizations are described below.

Note

Throughout the APIs’ descriptions, we use the following notations:

  • i, j, and k are used as general purpose indices. In some legacy LAPACK APIs, k could be a parameter indicating some problem/matrix dimension.

  • Depending on the context, when it is necessary to index rows, columns and blocks or submatrices, i is assigned to rows, j to columns and k to blocks. \(l\) is always used to index matrices/problems in a batch.

  • x[i] stands for the i-th element of vector x, while A[i,j] represents the element in the i-th row and j-th column of matrix A. Indices are 1-based, i.e. x[1] is the first element of x.

  • To identify a block in a matrix or a matrix in the batch, k and \(l\) are used as sub-indices.

  • x_i \(=x_i\); both notations are used: \(x_i\) in displayed mathematical equations, and x_i in the text describing the function parameters.

  • If X is a real vector or matrix, \(X^T\) indicates its transpose; if X is complex, then \(X^H\) represents its conjugate transpose. When X could be real or complex, we use X’ to indicate X transposed or X conjugate transposed, accordingly.

  • When a matrix A is formed as the product of several matrices, the following notation is used: A=M(1)M(2)…M(t).
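
As a small illustration of the 1-based notation, the sketch below maps A[i,j] to an offset in a 0-based C array. It assumes column-major storage with leading dimension lda (the LAPACK convention); the helper name idx1 is hypothetical and is not part of rocSOLVER.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a rocSOLVER function): offset of A[i,j] in a
   column-major array with leading dimension lda, for 1-based i and j. */
size_t idx1(size_t i, size_t j, size_t lda)
{
    assert(i >= 1 && j >= 1 && i <= lda);
    return (i - 1) + (j - 1) * lda;
}
```

With this mapping, x[1] in the text corresponds to x[0] in C, and A[i,j] corresponds to A[idx1(i, j, lda)].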

Triangular factorizations#

rocsolver_<type>potf2()#

rocblas_status rocsolver_zpotf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_cpotf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_dpotf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_spotf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)#

POTF2 computes the Cholesky factorization of a real symmetric (complex Hermitian) positive definite matrix A.

(This is the unblocked version of the algorithm.)

The factorization has the form:

\[\begin{split} \begin{array}{cl} A = U'U & \: \text{if uplo is upper, or}\\ A = LL' & \: \text{if uplo is lower.} \end{array} \end{split}\]

U is an upper triangular matrix and L is lower triangular.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A to be factored. On exit, the lower or upper triangular factor.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful factorization of matrix A. If info = i > 0, the leading minor of order i of A is not positive definite. The factorization stopped at this point.
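
To make the form A = LL' concrete, here is a minimal host-side sketch for the real, lower-triangular case. It rebuilds the product from a factor that has been copied back to the host, reading only the lower triangle, which mirrors the rule that the opposite triangle of A is not used. The function name rebuild_from_lower is hypothetical, column-major storage is assumed, and for complex types the second factor would need to be conjugated.

```c
#include <assert.h>

/* Hypothetical host-side check, not part of rocSOLVER.
   Given the factor L stored in the lower triangle of a column-major
   array (leading dimension ldl), computes C = L * L'. Entries above the
   diagonal of L are never read. C is n-by-n with leading dimension n. */
void rebuild_from_lower(int n, const double *L, int ldl, double *C)
{
    assert(n >= 0 && ldl >= n);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) {
            /* L[i,k] and L[j,k] are zero for k > min(i,j). */
            int kmax = (i < j) ? i : j;
            double s = 0.0;
            for (int k = 0; k <= kmax; ++k)
                s += L[i + k * ldl] * L[j + k * ldl];
            C[i + j * n] = s;
        }
    }
}
```

Comparing C against a saved copy of the input gives a quick sanity check of a factorization that reported info = 0.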

rocsolver_<type>potf2_batched()#

rocblas_status rocsolver_zpotf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cpotf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dpotf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_spotf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

POTF2_BATCHED computes the Cholesky factorization of a batch of real symmetric (complex Hermitian) positive definite matrices.

(This is the unblocked version of the algorithm.)

The factorization of matrix \(A_l\) in the batch has the form:

\[\begin{split} \begin{array}{cl} A_l^{} = U_l'U_l^{} & \: \text{if uplo is upper, or}\\ A_l^{} = L_l^{}L_l' & \: \text{if uplo is lower.} \end{array} \end{split}\]

\(U_l\) is an upper triangular matrix and \(L_l\) is lower triangular.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of matrix A_l.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l to be factored. On exit, the upper or lower triangular factors.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful factorization of matrix A_l. If info[l] = i > 0, the leading minor of order i of A_l is not positive definite. The l-th factorization stopped at this point.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
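
Because info holds one status per matrix, a caller typically inspects the whole array after the call. The sketch below assumes the array has already been copied from the GPU to host memory (for example with hipMemcpy) and that rocblas_int is a 32-bit int, as in default builds. The helper name count_failed is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper, not part of rocSOLVER. Scans a host copy of the
   info array and returns how many factorizations failed. If any failed,
   *first_failed receives the 0-based array position of the first one;
   otherwise it is set to -1. */
int count_failed(const int *info_host, int batch_count, int *first_failed)
{
    assert(batch_count >= 0);
    int failed = 0;
    *first_failed = -1;
    for (int l = 0; l < batch_count; ++l) {
        if (info_host[l] != 0) {
            if (failed == 0)
                *first_failed = l;
            ++failed;
        }
    }
    return failed;
}
```

A nonzero entry info_host[l] = i means the leading minor of order i of the corresponding matrix is not positive definite, so only the entries that are zero correspond to usable factors.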

rocsolver_<type>potf2_strided_batched()#

rocblas_status rocsolver_zpotf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cpotf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dpotf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_spotf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

POTF2_STRIDED_BATCHED computes the Cholesky factorization of a batch of real symmetric (complex Hermitian) positive definite matrices.

(This is the unblocked version of the algorithm.)

The factorization of matrix \(A_l\) in the batch has the form:

\[\begin{split} \begin{array}{cl} A_l^{} = U_l'U_l^{} & \: \text{if uplo is upper, or}\\ A_l^{} = L_l^{}L_l' & \: \text{if uplo is lower.} \end{array} \end{split}\]

\(U_l\) is an upper triangular matrix and \(L_l\) is lower triangular.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of matrix A_l.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l to be factored. On exit, the upper or lower triangular factors.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful factorization of matrix A_l. If info[l] = i > 0, the leading minor of order i of A_l is not positive definite. The l-th factorization stopped at this point.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
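
The strided layout can be sketched with two small helpers. The first locates a matrix in the batch; the second gives the element count needed in the normal use case strideA >= lda*n, where matrices do not overlap. Both names are hypothetical, the size formula is an inference from the layout described above rather than a documented requirement, and rocblas_stride is assumed to be a 64-bit integer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, not part of rocSOLVER. */

/* Start of matrix number l (0-based in C) in a strided batch. */
double *batch_matrix(double *A, long long strideA, int l)
{
    return A + (long long)l * strideA;
}

/* Elements needed when strideA >= lda*n: the last matrix starts at
   (batch_count-1)*strideA and occupies lda*n elements. */
size_t batch_elements(int n, int lda, long long strideA, int batch_count)
{
    assert(n >= 0 && lda >= n && batch_count >= 0);
    assert(strideA >= (long long)lda * n);
    if (batch_count == 0)
        return 0;
    return (size_t)(batch_count - 1) * (size_t)strideA
         + (size_t)lda * (size_t)n;
}
```

Choosing strideA larger than lda*n leaves unused padding between consecutive matrices, which the size computation accounts for.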

rocsolver_<type>potrf()#

rocblas_status rocsolver_zpotrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_cpotrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_dpotrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_spotrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)#

POTRF computes the Cholesky factorization of a real symmetric (complex Hermitian) positive definite matrix A.

(This is the blocked version of the algorithm.)

The factorization has the form:

\[\begin{split} \begin{array}{cl} A = U'U & \: \text{if uplo is upper, or}\\ A = LL' & \: \text{if uplo is lower.} \end{array} \end{split}\]

U is an upper triangular matrix and L is lower triangular.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A to be factored. On exit, the lower or upper triangular factor.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful factorization of matrix A. If info = i > 0, the leading minor of order i of A is not positive definite. The factorization stopped at this point.
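
For the upper case, A = U'U, a result can be checked on the host by measuring how far U'U is from a saved copy of the input. The sketch below is a hypothetical helper for real types, assuming column-major storage and that both arrays have been copied to host memory. It only reads the upper triangles, since the lower part of A is not used when uplo is upper.

```c
#include <assert.h>

/* Hypothetical host-side check, not part of rocSOLVER.
   Returns max |A0[i,j] - (U'U)[i,j]| over the upper triangle (i <= j),
   where A0 is a copy of the original matrix and U is the factor stored
   in the upper triangle. Both are column-major with leading dimension
   lda; entries below the diagonal are never read. */
double upper_residual(int n, const double *A0, const double *U, int lda)
{
    assert(n >= 0 && lda >= n);
    double worst = 0.0;
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i <= j; ++i) {
            /* (U'U)[i,j] = sum over k <= i of U[k,i] * U[k,j]. */
            double s = 0.0;
            for (int k = 0; k <= i; ++k)
                s += U[k + i * lda] * U[k + j * lda];
            double diff = A0[i + j * lda] - s;
            if (diff < 0.0)
                diff = -diff;
            if (diff > worst)
                worst = diff;
        }
    }
    return worst;
}
```

In floating point the residual is rarely exactly zero; a tolerance scaled to the matrix norm and the working precision is the usual acceptance criterion.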

rocsolver_<type>potrf_batched()#

rocblas_status rocsolver_zpotrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cpotrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dpotrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_spotrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

POTRF_BATCHED computes the Cholesky factorization of a batch of real symmetric (complex Hermitian) positive definite matrices.

(This is the blocked version of the algorithm.)

The factorization of matrix \(A_l\) in the batch has the form:

\[\begin{split} \begin{array}{cl} A_l^{} = U_l'U_l^{} & \: \text{if uplo is upper, or}\\ A_l^{} = L_l^{}L_l' & \: \text{if uplo is lower.} \end{array} \end{split}\]

\(U_l\) is an upper triangular matrix and \(L_l\) is lower triangular.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of matrix A_l.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l to be factored. On exit, the upper or lower triangular factors.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful factorization of matrix A_l. If info[l] = i > 0, the leading minor of order i of A_l is not positive definite. The l-th factorization stopped at this point.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>potrf_strided_batched()#

rocblas_status rocsolver_zpotrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cpotrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dpotrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_spotrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

POTRF_STRIDED_BATCHED computes the Cholesky factorization of a batch of real symmetric (complex Hermitian) positive definite matrices.

(This is the blocked version of the algorithm.)

The factorization of matrix \(A_l\) in the batch has the form:

\[\begin{split} \begin{array}{cl} A_l^{} = U_l'U_l^{} & \: \text{if uplo is upper, or}\\ A_l^{} = L_l^{}L_l' & \: \text{if uplo is lower.} \end{array} \end{split}\]

\(U_l\) is an upper triangular matrix and \(L_l\) is lower triangular.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of matrix A_l.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l to be factored. On exit, the upper or lower triangular factors.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful factorization of matrix A_l. If info[l] = i > 0, the leading minor of order i of A_l is not positive definite. The l-th factorization stopped at this point.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>getf2()#

rocblas_status rocsolver_zgetf2_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_double_complex *A, const int64_t lda, int64_t *ipiv, int64_t *info)#
rocblas_status rocsolver_cgetf2_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_float_complex *A, const int64_t lda, int64_t *ipiv, int64_t *info)#
rocblas_status rocsolver_dgetf2_64(rocblas_handle handle, const int64_t m, const int64_t n, double *A, const int64_t lda, int64_t *ipiv, int64_t *info)#
rocblas_status rocsolver_sgetf2_64(rocblas_handle handle, const int64_t m, const int64_t n, float *A, const int64_t lda, int64_t *ipiv, int64_t *info)#
rocblas_status rocsolver_zgetf2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_cgetf2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_dgetf2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_sgetf2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#

GETF2 computes the LU factorization of a general m-by-n matrix A using partial pivoting with row interchanges.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls may be executed for small and mid-size matrices if optimizations are enabled, which is the default. For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide.)

The factorization has the form

\[ A = PLU \]

where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix A.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[out] pointer to rocblas_int. Array on the GPU of dimension min(m,n). The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= i <= min(m,n), the row i of the matrix was interchanged with row ipiv[i]. Matrix P of the factorization can be derived from ipiv.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
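
The following host-side reference shows what an unblocked LU factorization with partial pivoting computes, and how ipiv and info are filled. It is a sketch for real types under the assumption of column-major storage, not the rocSOLVER implementation; the name ref_getf2 is hypothetical.

```c
#include <assert.h>

/* Host-side reference sketch, NOT the rocSOLVER implementation.
   Unblocked LU with partial pivoting of a column-major m-by-n matrix,
   in place. ipiv has min(m,n) entries holding 1-based row indices.
   Returns 0, or i > 0 if U[i,i] is the first zero pivot. */
int ref_getf2(int m, int n, double *A, int lda, int *ipiv)
{
    assert(m >= 0 && n >= 0 && lda >= m);
    int k = (m < n) ? m : n;
    int info = 0;
    for (int j = 0; j < k; ++j) {
        /* Pick the entry of largest magnitude in column j, rows j..m-1. */
        int p = j;
        double best = (A[j + j * lda] < 0.0) ? -A[j + j * lda] : A[j + j * lda];
        for (int i = j + 1; i < m; ++i) {
            double v = (A[i + j * lda] < 0.0) ? -A[i + j * lda] : A[i + j * lda];
            if (v > best) { best = v; p = i; }
        }
        ipiv[j] = p + 1;
        if (best == 0.0) {
            if (info == 0)
                info = j + 1;   /* record the first zero pivot, keep going */
            continue;
        }
        /* Interchange rows j and p across all n columns. */
        if (p != j) {
            for (int c = 0; c < n; ++c) {
                double t = A[j + c * lda];
                A[j + c * lda] = A[p + c * lda];
                A[p + c * lda] = t;
            }
        }
        /* Scale the multipliers, then update the trailing submatrix. */
        for (int i = j + 1; i < m; ++i)
            A[i + j * lda] /= A[j + j * lda];
        for (int c = j + 1; c < n; ++c)
            for (int i = j + 1; i < m; ++i)
                A[i + c * lda] -= A[i + j * lda] * A[j + c * lda];
    }
    return info;
}
```

For the 2-by-2 matrix with rows (2, 1) and (4, 3), the rows are interchanged first, so ipiv = {2, 2} and the array ends up holding the multiplier 0.5 below the diagonal and U with rows (4, 3) and (0, -0.5).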

rocsolver_<type>getf2_batched()#

rocblas_status rocsolver_zgetf2_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_double_complex *const A[], const int64_t lda, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
rocblas_status rocsolver_cgetf2_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_float_complex *const A[], const int64_t lda, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
rocblas_status rocsolver_dgetf2_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, double *const A[], const int64_t lda, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
rocblas_status rocsolver_sgetf2_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, float *const A[], const int64_t lda, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
rocblas_status rocsolver_zgetf2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetf2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetf2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetf2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

GETF2_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls may be executed for small and mid-size matrices if optimizations are enabled, which is the default. For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide.)

The factorization of matrix \(A_l\) in the batch has the form

\[ A_l = P_lL_lU_l \]

where \(P_l\) is a permutation matrix, \(L_l\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_l\) is upper triangular (upper trapezoidal if m < n).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all matrices A_l in the batch.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the factors L_l and U_l from the factorizations. The unit diagonal elements of L_l are not stored.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • ipiv[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). Contains the vectors of pivot indices ipiv_l (corresponding to A_l). Dimension of ipiv_l is min(m,n). Elements of ipiv_l are 1-based indices. For each instance A_l in the batch and for 1 <= i <= min(m,n), the row i of the matrix A_l was interchanged with row ipiv_l[i]. Matrix P_l of the factorization can be derived from ipiv_l.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for factorization of A_l. If info[l] = i > 0, U_l is singular. U_l[i,i] is the first zero pivot.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
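
The pivot vectors of a batch share one array, with ipiv_l starting strideP entries after the previous one. The sketch below locates ipiv_l in a host copy of that array and replays its row interchanges on a vector, which is the step a solver would apply to a right-hand side before the triangular solves. The helper name apply_pivots is hypothetical; rocblas_int is assumed to be a 32-bit int and rocblas_stride a 64-bit integer.

```c
#include <assert.h>

/* Hypothetical helper, not part of rocSOLVER. Applies the row
   interchanges recorded for batch entry l (0-based in C) to a vector x,
   in the same order used by the factorization: for i = 1..k, swap x[i]
   with x[ipiv_l[i]] (1-based). The result is P_l' * x.
   ipiv_host is a host copy of the ipiv array and k = min(m,n). */
void apply_pivots(const int *ipiv_host, long long strideP, int l,
                  int k, double *x)
{
    assert(k >= 0 && l >= 0);
    const int *ipiv_l = ipiv_host + (long long)l * strideP;
    for (int i = 0; i < k; ++i) {
        int p = ipiv_l[i] - 1;   /* convert the 1-based index */
        if (p != i) {
            double t = x[i];
            x[i] = x[p];
            x[p] = t;
        }
    }
}
```

The interchanges must be applied in increasing order of i, because each entry of ipiv_l refers to the row positions produced by the earlier swaps.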

rocsolver_<type>getf2_strided_batched()#

rocblas_status rocsolver_zgetf2_strided_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_double_complex *A, const int64_t lda, const rocblas_stride strideA, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
rocblas_status rocsolver_cgetf2_strided_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_float_complex *A, const int64_t lda, const rocblas_stride strideA, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
rocblas_status rocsolver_dgetf2_strided_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, double *A, const int64_t lda, const rocblas_stride strideA, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
rocblas_status rocsolver_sgetf2_strided_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, float *A, const int64_t lda, const rocblas_stride strideA, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
rocblas_status rocsolver_zgetf2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetf2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetf2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetf2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

GETF2_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls may be executed for small and mid-size matrices if optimizations are enabled, which is the default. For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide.)

The factorization of matrix \(A_l\) in the batch has the form

\[ A_l = P_lL_lU_l \]

where \(P_l\) is a permutation matrix, \(L_l\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_l\) is upper triangular (upper trapezoidal if m < n).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the factors L_l and U_l from the factorization. The unit diagonal elements of L_l are not stored.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). Contains the vectors of pivot indices ipiv_l (corresponding to A_l). Dimension of ipiv_l is min(m,n). Elements of ipiv_l are 1-based indices. For each instance A_l in the batch and for 1 <= i <= min(m,n), the row i of the matrix A_l was interchanged with row ipiv_l[i]. Matrix P_l of the factorization can be derived from ipiv_l.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for factorization of A_l. If info[l] = i > 0, U_l is singular. U_l[i,i] is the first zero pivot.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
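
Since each A_l returns L_l and U_l packed in one array with the unit diagonal of L_l left implicit, it can help to see the two factors separated. The sketch below does this on the host for one real matrix; for entry l of a strided batch, the packed input would start at offset l*strideA (0-based l). The helper name unpack_lu is hypothetical and column-major storage is assumed.

```c
#include <assert.h>

/* Hypothetical helper, not part of rocSOLVER. Splits a packed LU result
   (column-major, leading dimension lda) into explicit factors:
   L is m-by-k with unit diagonal, U is k-by-n, where k = min(m,n).
   L and U are stored column-major with leading dimensions m and k. */
void unpack_lu(int m, int n, const double *A, int lda, double *L, double *U)
{
    assert(m >= 0 && n >= 0 && lda >= m);
    int k = (m < n) ? m : n;
    for (int j = 0; j < k; ++j) {
        for (int i = 0; i < m; ++i) {
            if (i > j)
                L[i + j * m] = A[i + j * lda];   /* stored multipliers */
            else if (i == j)
                L[i + j * m] = 1.0;              /* implicit unit diagonal */
            else
                L[i + j * m] = 0.0;
        }
    }
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < k; ++i)
            U[i + j * k] = (i <= j) ? A[i + j * lda] : 0.0;
}
```

The shapes follow the description above: L is lower trapezoidal when m > n, and U is upper trapezoidal when m < n.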

rocsolver_<type>getrf()#

rocblas_status rocsolver_zgetrf_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_double_complex *A, const int64_t lda, int64_t *ipiv, int64_t *info)#
rocblas_status rocsolver_cgetrf_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_float_complex *A, const int64_t lda, int64_t *ipiv, int64_t *info)#
rocblas_status rocsolver_dgetrf_64(rocblas_handle handle, const int64_t m, const int64_t n, double *A, const int64_t lda, int64_t *ipiv, int64_t *info)#
rocblas_status rocsolver_sgetrf_64(rocblas_handle handle, const int64_t m, const int64_t n, float *A, const int64_t lda, int64_t *ipiv, int64_t *info)#
rocblas_status rocsolver_zgetrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_cgetrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_dgetrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_sgetrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#

GETRF computes the LU factorization of a general m-by-n matrix A using partial pivoting with row interchanges.

(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls may be executed for mid-size matrices if optimizations are enabled, which is the default. For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide.)

The factorization has the form

\[ A = PLU \]

where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix A.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[out] pointer to rocblas_int. Array on the GPU of dimension min(m,n). The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= i <= min(m,n), the row i of the matrix was interchanged with row ipiv[i]. Matrix P of the factorization can be derived from ipiv.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
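
One way to derive P from ipiv is to replay the interchanges on an identity permutation. The sketch below works on a host copy of ipiv and produces a 0-based array perm such that row i of the product LU equals row perm[i] of the original A; equivalently, P has a one in row perm[i] of column i (0-based) and zeros elsewhere. The helper name perm_from_ipiv is hypothetical and rocblas_int is assumed to be a 32-bit int.

```c
#include <assert.h>

/* Hypothetical helper, not part of rocSOLVER. Builds a 0-based row
   permutation from 1-based pivot indices: after the call, row i of the
   product L*U equals row perm[i] of the original matrix A.
   perm has m entries; ipiv has k = min(m,n) entries. */
void perm_from_ipiv(int m, int k, const int *ipiv, int *perm)
{
    assert(m >= 0 && k >= 0 && k <= m);
    for (int i = 0; i < m; ++i)
        perm[i] = i;
    for (int i = 0; i < k; ++i) {
        int p = ipiv[i] - 1;   /* convert the 1-based index */
        int t = perm[i];
        perm[i] = perm[p];
        perm[p] = t;
    }
}
```

For ipiv = {2, 3, 3} with m = 3, the result is perm = {1, 2, 0}: the first row of LU is the second row of A, the second row of LU is the third row of A, and the third row of LU is the first row of A.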

rocsolver_<type>getrf_batched()#

rocblas_status rocsolver_zgetrf_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_double_complex *const A[], const int64_t lda, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
rocblas_status rocsolver_cgetrf_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_float_complex *const A[], const int64_t lda, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
rocblas_status rocsolver_dgetrf_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, double *const A[], const int64_t lda, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
rocblas_status rocsolver_sgetrf_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, float *const A[], const int64_t lda, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
rocblas_status rocsolver_zgetrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

GETRF_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.

(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls may be executed for mid-size matrices if optimizations are enabled, which is the default. For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide.)

The factorization of matrix \(A_l\) in the batch has the form

\[ A_l = P_lL_lU_l \]

where \(P_l\) is a permutation matrix, \(L_l\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_l\) is upper triangular (upper trapezoidal if m < n).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all matrices A_l in the batch.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the factors L_l and U_l from the factorizations. The unit diagonal elements of L_l are not stored.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • ipiv[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). Contains the vectors of pivot indices ipiv_l (corresponding to A_l). Dimension of ipiv_l is min(m,n). Elements of ipiv_l are 1-based indices. For each instance A_l in the batch and for 1 <= i <= min(m,n), the row i of the matrix A_l was interchanged with row ipiv_l[i]. Matrix P_l of the factorization can be derived from ipiv_l.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for factorization of A_l. If info[l] = i > 0, U_l is singular. U_l[i,i] is the first zero pivot.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
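The statement that P_l can be derived from ipiv_l can be sketched on the host as follows. This is a minimal illustration, not part of the rocSOLVER API: the helper name pivots_to_row_map is hypothetical, int stands in for rocblas_int, and the pivot vector is assumed to have been copied from the GPU to host memory beforehand.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical host-side helper (not a rocSOLVER function).
// Converts one pivot vector ipiv_l (1-based, length min(m,n)) into a
// 0-based row map: perm[i] is the row of the original A_l that sits in
// row i once all documented row interchanges have been applied in order.
std::vector<int> pivots_to_row_map(const std::vector<int>& ipiv_l, int m)
{
    assert(m >= 0);
    std::vector<int> perm(static_cast<std::size_t>(m));
    for (std::size_t i = 0; i < perm.size(); ++i)
        perm[i] = static_cast<int>(i);

    // Replay the interchanges: row i was interchanged with row ipiv_l[i].
    for (std::size_t i = 0; i < ipiv_l.size(); ++i)
    {
        const int p = ipiv_l[i] - 1; // convert the 1-based index to 0-based
        assert(p >= 0 && p < m);
        std::swap(perm[i], perm[static_cast<std::size_t>(p)]);
    }
    return perm;
}
```

For a batch, the same helper would be applied to each slice of length min(m,n) that starts at offset l*strideP in the host copy of ipiv.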

rocsolver_<type>getrf_strided_batched()#

rocblas_status rocsolver_zgetrf_strided_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_double_complex *A, const int64_t lda, const rocblas_stride strideA, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
rocblas_status rocsolver_cgetrf_strided_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_float_complex *A, const int64_t lda, const rocblas_stride strideA, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
rocblas_status rocsolver_dgetrf_strided_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, double *A, const int64_t lda, const rocblas_stride strideA, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
rocblas_status rocsolver_sgetrf_strided_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, float *A, const int64_t lda, const rocblas_stride strideA, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
rocblas_status rocsolver_zgetrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

GETRF_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.

(This is the blocked Level-3-BLAS version of the algorithm. When optimizations are enabled, which is the default, an optimized internal implementation that makes no rocBLAS calls may be used for mid-size matrices. For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide.)

The factorization of matrix \(A_l\) in the batch has the form

\[ A_l = P_lL_lU_l \]

where \(P_l\) is a permutation matrix, \(L_l\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_l\) is upper triangular (upper trapezoidal if m < n).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the factors L_l and U_l from the factorization. The unit diagonal elements of L_l are not stored.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). Contains the vectors of pivot indices ipiv_l (corresponding to A_l). Dimension of ipiv_l is min(m,n). Elements of ipiv_l are 1-based indices. For each instance A_l in the batch and for 1 <= i <= min(m,n), the row i of the matrix A_l was interchanged with row ipiv_l[i]. Matrix P_l of the factorization can be derived from ipiv_l.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for factorization of A_l. If info[l] = i > 0, U_l is singular. U_l[i,i] is the first zero pivot.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
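The strided layout described by lda and strideA can be made concrete with a little index arithmetic. The sketch below is illustrative only: both function names are hypothetical, the batch index l is taken as 0-based, and column-major storage (which the leading-dimension convention implies) is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: position (in elements) of A_l[i,j] inside the
// strided buffer. l is a 0-based batch index; i and j are 1-based as in
// the text. Column-major storage with leading dimension lda is assumed.
std::int64_t strided_element_index(std::int64_t l, std::int64_t i, std::int64_t j,
                                   std::int64_t lda, std::int64_t strideA)
{
    assert(l >= 0 && i >= 1 && j >= 1 && i <= lda);
    return l * strideA + (j - 1) * lda + (i - 1);
}

// Hypothetical helper: smallest buffer (in elements) that holds
// batch_count matrices in the normal use case strideA >= lda*n.
std::int64_t strided_buffer_elements(std::int64_t n, std::int64_t lda,
                                     std::int64_t strideA, std::int64_t batch_count)
{
    assert(n >= 0 && strideA >= lda * n);
    if (batch_count <= 0)
        return 0;
    return (batch_count - 1) * strideA + lda * n;
}
```

The same arithmetic applies to the pivot buffer, with strideP in place of strideA and min(m,n) in place of lda*n.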

rocsolver_<type>sytf2()#

rocblas_status rocsolver_zsytf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_csytf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_dsytf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_ssytf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#

SYTF2 computes the factorization of a symmetric indefinite matrix \(A\) using Bunch-Kaufman diagonal pivoting.

(This is the unblocked version of the algorithm).

The factorization has the form

\[\begin{split} \begin{array}{cl} A = U D U^T & \: \text{or}\\ A = L D L^T & \end{array} \end{split}\]

where \(U\) or \(L\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_k\).

Specifically, \(U\) and \(L\) are computed as

\[\begin{split} \begin{array}{cl} U = P(n) U(n) \cdots P(k) U(k) \cdots & \: \text{and}\\ L = P(1) L(1) \cdots P(k) L(k) \cdots & \end{array} \end{split}\]

where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_k\), and \(P(k)\) is a permutation matrix defined by \(ipiv[k]\). If we let \(s\) denote the order of block \(D_k\), then \(U(k)\) and \(L(k)\) are unit upper/lower triangular matrices defined as

\[\begin{split} U(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]

and

\[\begin{split} L(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]

If \(s = 1\), then \(D_k\) is stored in \(A[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A\). If \(s = 2\) and uplo is upper, then \(D_k\) is stored in \(A[k-1,k-1]\), \(A[k-1,k]\), and \(A[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A\). If \(s = 2\) and uplo is lower, then \(D_k\) is stored in \(A[k,k]\), \(A[k+1,k]\), and \(A[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A\).
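As an illustrative special case of the definitions above, take \(n = 3\), \(k = 3\) and a 1-by-1 block (\(s = 1\)) with uplo set to upper. Then \(I_{k-s} = I_2\), \(I_{n-k}\) is empty, and \(U(k)\) reduces to

\[\begin{split} U(3) = \left[ \begin{array}{ccc} 1 & 0 & v_1 \\ 0 & 1 & v_2 \\ 0 & 0 & 1 \end{array} \right] \end{split}\]

where \(v = (v_1, v_2)^T\) is read from the upper part of column 3 of \(A\), and \(D_3\) is the scalar stored in \(A[3,3]\).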

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the symmetric matrix A to be factored. On exit, the block diagonal matrix D and the multipliers needed to compute U or L.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A.

  • ipiv[out] pointer to rocblas_int. Array on the GPU of dimension n. The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= k <= n, if ipiv[k] > 0 then rows and columns k and ipiv[k] were interchanged and D[k,k] is a 1-by-1 diagonal block. If, instead, ipiv[k] = ipiv[k-1] < 0 and uplo is upper (or ipiv[k] = ipiv[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv[k] (or rows and columns k+1 and -ipiv[k]) were interchanged and D[k-1,k-1] to D[k,k] (or D[k,k] to D[k+1,k+1]) is a 2-by-2 diagonal block.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, D is singular. D[i,i] is the first diagonal zero.
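The sign convention of ipiv encodes the block structure of D. A minimal host-side sketch of decoding it is given below; the names DiagBlock and decode_diagonal_blocks are hypothetical, int stands in for rocblas_int, and ipiv is assumed to be a host copy of the array produced by a successful factorization.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical description of one diagonal block of D.
struct DiagBlock
{
    int first; // 1-based index of the first row/column of the block
    int order; // 1 for a 1-by-1 block, 2 for a 2-by-2 block
};

// Hypothetical helper: walks ipiv in the documented direction (k from n
// down to 1 for upper, from 1 up to n for lower) and lists the diagonal
// blocks in increasing order of their first index.
std::vector<DiagBlock> decode_diagonal_blocks(const std::vector<int>& ipiv, bool upper)
{
    const int n = static_cast<int>(ipiv.size());
    std::vector<DiagBlock> blocks;
    if (upper)
    {
        for (int k = n; k >= 1;)
        {
            if (ipiv[k - 1] > 0)
            {
                blocks.push_back({k, 1});
                k -= 1;
            }
            else
            {
                // 2-by-2 block on rows/columns k-1 and k: ipiv[k] = ipiv[k-1] < 0
                assert(k >= 2 && ipiv[k - 2] == ipiv[k - 1]);
                blocks.push_back({k - 1, 2});
                k -= 2;
            }
        }
        std::reverse(blocks.begin(), blocks.end());
    }
    else
    {
        for (int k = 1; k <= n;)
        {
            if (ipiv[k - 1] > 0)
            {
                blocks.push_back({k, 1});
                k += 1;
            }
            else
            {
                // 2-by-2 block on rows/columns k and k+1: ipiv[k] = ipiv[k+1] < 0
                assert(k < n && ipiv[k] == ipiv[k - 1]);
                blocks.push_back({k, 2});
                k += 2;
            }
        }
    }
    return blocks;
}
```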

rocsolver_<type>sytf2_batched()#

rocblas_status rocsolver_zsytf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_csytf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dsytf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssytf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

SYTF2_BATCHED computes the factorization of a batch of symmetric indefinite matrices using Bunch-Kaufman diagonal pivoting.

(This is the unblocked version of the algorithm).

The factorization has the form

\[\begin{split} \begin{array}{cl} A_l^{} = U_l^{} D_l^{} U_l^T & \: \text{or}\\ A_l^{} = L_l^{} D_l^{} L_l^T & \end{array} \end{split}\]

where \(U_l\) or \(L_l\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D_l\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_{kl}\).

Specifically, \(U_l\) and \(L_l\) are computed as

\[\begin{split} \begin{array}{cl} U_l = P_l(n) U_l(n) \cdots P_l(k) U_l(k) \cdots & \: \text{and}\\ L_l = P_l(1) L_l(1) \cdots P_l(k) L_l(k) \cdots & \end{array} \end{split}\]

where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_{kl}\), and \(P_l(k)\) is a permutation matrix defined by \(ipiv_l[k]\). If we let \(s\) denote the order of block \(D_{kl}\), then \(U_l(k)\) and \(L_l(k)\) are unit upper/lower triangular matrices defined as

\[\begin{split} U_l(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]

and

\[\begin{split} L_l(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]

If \(s = 1\), then \(D_{kl}\) is stored in \(A_l[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A_l\). If \(s = 2\) and uplo is upper, then \(D_{kl}\) is stored in \(A_l[k-1,k-1]\), \(A_l[k-1,k]\), and \(A_l[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A_l\). If \(s = 2\) and uplo is lower, then \(D_{kl}\) is stored in \(A_l[k,k]\), \(A_l[k+1,k]\), and \(A_l[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A_l\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of each matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of all matrices A_l in the batch.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the symmetric matrices A_l to be factored. On exit, the block diagonal matrices D_l and the multipliers needed to compute U_l or L_l.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • ipiv[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). Contains the vectors of pivot indices ipiv_l (corresponding to A_l). Dimension of ipiv_l is n. Elements of ipiv_l are 1-based indices. For each instance A_l in the batch and for 1 <= k <= n, if ipiv_l[k] > 0 then rows and columns k and ipiv_l[k] were interchanged and D_l[k,k] is a 1-by-1 diagonal block. If, instead, ipiv_l[k] = ipiv_l[k-1] < 0 and uplo is upper (or ipiv_l[k] = ipiv_l[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv_l[k] (or rows and columns k+1 and -ipiv_l[k]) were interchanged and D_l[k-1,k-1] to D_l[k,k] (or D_l[k,k] to D_l[k+1,k+1]) is a 2-by-2 diagonal block.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for factorization of A_l. If info[l] = i > 0, D_l is singular. D_l[i,i] is the first diagonal zero.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
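Because info holds one status value per matrix, a caller will usually scan it after the call. The sketch below assumes the info array has already been copied to host memory; the function name singular_instances is hypothetical, int stands in for rocblas_int, and the batch index l is reported 0-based.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns one (l, i) pair for every batch instance
// with info[l] = i > 0, i.e. for every D_l reported as singular, where
// D_l[i,i] is the first diagonal zero (i is 1-based, l is 0-based).
std::vector<std::pair<std::size_t, int>> singular_instances(const std::vector<int>& info)
{
    std::vector<std::pair<std::size_t, int>> failures;
    for (std::size_t l = 0; l < info.size(); ++l)
    {
        if (info[l] > 0)
            failures.emplace_back(l, info[l]);
    }
    return failures;
}
```

An empty result means every factorization in the batch exited with info[l] = 0.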

rocsolver_<type>sytf2_strided_batched()#

rocblas_status rocsolver_zsytf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_csytf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dsytf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssytf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

SYTF2_STRIDED_BATCHED computes the factorization of a batch of symmetric indefinite matrices using Bunch-Kaufman diagonal pivoting.

(This is the unblocked version of the algorithm).

The factorization has the form

\[\begin{split} \begin{array}{cl} A_l^{} = U_l^{} D_l^{} U_l^T & \: \text{or}\\ A_l^{} = L_l^{} D_l^{} L_l^T & \end{array} \end{split}\]

where \(U_l\) or \(L_l\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D_l\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_{kl}\).

Specifically, \(U_l\) and \(L_l\) are computed as

\[\begin{split} \begin{array}{cl} U_l = P_l(n) U_l(n) \cdots P_l(k) U_l(k) \cdots & \: \text{and}\\ L_l = P_l(1) L_l(1) \cdots P_l(k) L_l(k) \cdots & \end{array} \end{split}\]

where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_{kl}\), and \(P_l(k)\) is a permutation matrix defined by \(ipiv_l[k]\). If we let \(s\) denote the order of block \(D_{kl}\), then \(U_l(k)\) and \(L_l(k)\) are unit upper/lower triangular matrices defined as

\[\begin{split} U_l(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]

and

\[\begin{split} L_l(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]

If \(s = 1\), then \(D_{kl}\) is stored in \(A_l[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A_l\). If \(s = 2\) and uplo is upper, then \(D_{kl}\) is stored in \(A_l[k-1,k-1]\), \(A_l[k-1,k]\), and \(A_l[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A_l\). If \(s = 2\) and uplo is lower, then \(D_{kl}\) is stored in \(A_l[k,k]\), \(A_l[k+1,k]\), and \(A_l[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A_l\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of each matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of all matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the symmetric matrices A_l to be factored. On exit, the block diagonal matrices D_l and the multipliers needed to compute U_l or L_l.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). Contains the vectors of pivot indices ipiv_l (corresponding to A_l). Dimension of ipiv_l is n. Elements of ipiv_l are 1-based indices. For each instance A_l in the batch and for 1 <= k <= n, if ipiv_l[k] > 0 then rows and columns k and ipiv_l[k] were interchanged and D_l[k,k] is a 1-by-1 diagonal block. If, instead, ipiv_l[k] = ipiv_l[k-1] < 0 and uplo is upper (or ipiv_l[k] = ipiv_l[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv_l[k] (or rows and columns k+1 and -ipiv_l[k]) were interchanged and D_l[k-1,k-1] to D_l[k,k] (or D_l[k,k] to D_l[k+1,k+1]) is a 2-by-2 diagonal block.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for factorization of A_l. If info[l] = i > 0, D_l is singular. D_l[i,i] is the first diagonal zero.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
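The role of strideP can be illustrated by extracting one pivot vector from the full pivot buffer. This is a host-side sketch under stated assumptions: the name pivot_vector is hypothetical, int stands in for rocblas_int, l is a 0-based batch index, and the buffer is a host copy laid out in the normal use case strideP >= n.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns a copy of ipiv_l, the n pivot indices that
// start at offset l*strideP in the host copy of the pivot buffer.
std::vector<int> pivot_vector(const std::vector<int>& ipiv, std::int64_t strideP,
                              int n, std::int64_t l)
{
    assert(n >= 0 && l >= 0 && strideP >= n);
    const std::size_t begin = static_cast<std::size_t>(l * strideP);
    const std::size_t count = static_cast<std::size_t>(n);
    assert(begin + count <= ipiv.size());

    const auto first = ipiv.begin() + static_cast<std::ptrdiff_t>(begin);
    return std::vector<int>(first, first + static_cast<std::ptrdiff_t>(count));
}
```

Entries between the end of one vector and the start of the next (when strideP > n) are padding and are skipped by this helper.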

rocsolver_<type>sytrf()#

rocblas_status rocsolver_zsytrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_csytrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_dsytrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_ssytrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#

SYTRF computes the factorization of a symmetric indefinite matrix \(A\) using Bunch-Kaufman diagonal pivoting.

(This is the blocked version of the algorithm).

The factorization has the form

\[\begin{split} \begin{array}{cl} A = U D U^T & \: \text{or}\\ A = L D L^T & \end{array} \end{split}\]

where \(U\) or \(L\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_k\).

Specifically, \(U\) and \(L\) are computed as

\[\begin{split} \begin{array}{cl} U = P(n) U(n) \cdots P(k) U(k) \cdots & \: \text{and}\\ L = P(1) L(1) \cdots P(k) L(k) \cdots & \end{array} \end{split}\]

where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_k\), and \(P(k)\) is a permutation matrix defined by \(ipiv[k]\). If we let \(s\) denote the order of block \(D_k\), then \(U(k)\) and \(L(k)\) are unit upper/lower triangular matrices defined as

\[\begin{split} U(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]

and

\[\begin{split} L(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]

If \(s = 1\), then \(D_k\) is stored in \(A[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A\). If \(s = 2\) and uplo is upper, then \(D_k\) is stored in \(A[k-1,k-1]\), \(A[k-1,k]\), and \(A[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A\). If \(s = 2\) and uplo is lower, then \(D_k\) is stored in \(A[k,k]\), \(A[k+1,k]\), and \(A[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the symmetric matrix A to be factored. On exit, the block diagonal matrix D and the multipliers needed to compute U or L.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A.

  • ipiv[out] pointer to rocblas_int. Array on the GPU of dimension n. The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= k <= n, if ipiv[k] > 0 then rows and columns k and ipiv[k] were interchanged and D[k,k] is a 1-by-1 diagonal block. If, instead, ipiv[k] = ipiv[k-1] < 0 and uplo is upper (or ipiv[k] = ipiv[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv[k] (or rows and columns k+1 and -ipiv[k]) were interchanged and D[k-1,k-1] to D[k,k] (or D[k,k] to D[k+1,k+1]) is a 2-by-2 diagonal block.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, D is singular. D[i,i] is the first diagonal zero.
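The storage rules for \(D_k\) given above can be turned into a small routine that rebuilds D as a dense matrix from the factored A. The sketch below is illustrative only: extract_block_diagonal is a hypothetical name, it handles the real double-precision case, it assumes column-major host copies of A and ipiv, and int stands in for rocblas_int.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns D as a dense n-by-n column-major matrix,
// reading 1-by-1 and 2-by-2 blocks from the factored A according to the
// sign pattern of ipiv. Entries of A that hold the multipliers v are ignored.
std::vector<double> extract_block_diagonal(const std::vector<double>& A, int n, int lda,
                                           const std::vector<int>& ipiv, bool upper)
{
    assert(n >= 0 && lda >= n);
    const std::size_t N = static_cast<std::size_t>(n);
    const std::size_t LDA = static_cast<std::size_t>(lda);
    assert(ipiv.size() == N && A.size() >= LDA * N);

    std::vector<double> D(N * N, 0.0);
    // 1-based accessors, matching the A[i,j] notation of the text.
    auto a = [&](int i, int j) {
        return A[static_cast<std::size_t>(i - 1) + static_cast<std::size_t>(j - 1) * LDA];
    };
    auto d = [&](int i, int j) -> double& {
        return D[static_cast<std::size_t>(i - 1) + static_cast<std::size_t>(j - 1) * N];
    };

    int k = upper ? n : 1;
    while (k >= 1 && k <= n)
    {
        if (ipiv[k - 1] > 0)
        {
            d(k, k) = a(k, k); // 1-by-1 block stored in A[k,k]
            k += upper ? -1 : 1;
        }
        else if (upper)
        {
            assert(k >= 2); // block stored in A[k-1,k-1], A[k-1,k], A[k,k]
            d(k - 1, k - 1) = a(k - 1, k - 1);
            d(k, k) = a(k, k);
            d(k - 1, k) = d(k, k - 1) = a(k - 1, k);
            k -= 2;
        }
        else
        {
            assert(k < n); // block stored in A[k,k], A[k+1,k], A[k+1,k+1]
            d(k, k) = a(k, k);
            d(k + 1, k + 1) = a(k + 1, k + 1);
            d(k + 1, k) = d(k, k + 1) = a(k + 1, k);
            k += 2;
        }
    }
    return D;
}
```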

rocsolver_<type>sytrf_batched()#

rocblas_status rocsolver_zsytrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_csytrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dsytrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssytrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

SYTRF_BATCHED computes the factorization of a batch of symmetric indefinite matrices using Bunch-Kaufman diagonal pivoting.

(This is the blocked version of the algorithm).

The factorization has the form

\[\begin{split} \begin{array}{cl} A_l^{} = U_l^{} D_l^{} U_l^T & \: \text{or}\\ A_l^{} = L_l^{} D_l^{} L_l^T & \end{array} \end{split}\]

where \(U_l\) or \(L_l\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D_l\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_{kl}\).

Specifically, \(U_l\) and \(L_l\) are computed as

\[\begin{split} \begin{array}{cl} U_l = P_l(n) U_l(n) \cdots P_l(k) U_l(k) \cdots & \: \text{and}\\ L_l = P_l(1) L_l(1) \cdots P_l(k) L_l(k) \cdots & \end{array} \end{split}\]

where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_{kl}\), and \(P_l(k)\) is a permutation matrix defined by \(ipiv_l[k]\). If we let \(s\) denote the order of block \(D_{kl}\), then \(U_l(k)\) and \(L_l(k)\) are unit upper/lower triangular matrices defined as

\[\begin{split} U_l(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]

and

\[\begin{split} L_l(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]

If \(s = 1\), then \(D_{kl}\) is stored in \(A_l[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A_l\). If \(s = 2\) and uplo is upper, then \(D_{kl}\) is stored in \(A_l[k-1,k-1]\), \(A_l[k-1,k]\), and \(A_l[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A_l\). If \(s = 2\) and uplo is lower, then \(D_{kl}\) is stored in \(A_l[k,k]\), \(A_l[k+1,k]\), and \(A_l[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A_l\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of each matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of all matrices A_l in the batch.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the symmetric matrices A_l to be factored. On exit, the block diagonal matrices D_l and the multipliers needed to compute U_l or L_l.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • ipiv[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). Contains the vectors of pivot indices ipiv_l (corresponding to A_l). Dimension of ipiv_l is n. Elements of ipiv_l are 1-based indices. For each instance A_l in the batch and for 1 <= k <= n, if ipiv_l[k] > 0 then rows and columns k and ipiv_l[k] were interchanged and D_l[k,k] is a 1-by-1 diagonal block. If, instead, ipiv_l[k] = ipiv_l[k-1] < 0 and uplo is upper (or ipiv_l[k] = ipiv_l[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv_l[k] (or rows and columns k+1 and -ipiv_l[k]) were interchanged and D_l[k-1,k-1] to D_l[k,k] (or D_l[k,k] to D_l[k+1,k+1]) is a 2-by-2 diagonal block.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for factorization of A_l. If info[l] = i > 0, D_l is singular. D_l[i,i] is the first diagonal zero.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
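The batched variant takes A as an array of pointers, one per matrix. A common way to build such a table is to carve a single allocation into equally sized pieces, as sketched below. The name make_pointer_table is hypothetical and the code only performs pointer arithmetic on the host; in real use the base pointer would typically be a GPU allocation of at least batch_count*lda*n elements, and the resulting table would itself need to be copied to the GPU before the call, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds the array of per-matrix pointers for a
// batched call from one contiguous buffer. Entry l points at the start of
// matrix A_l, located l*elements_per_matrix elements after base.
// The pointers are only computed here, never dereferenced.
template <typename T>
std::vector<T*> make_pointer_table(T* base, std::size_t elements_per_matrix,
                                   std::size_t batch_count)
{
    std::vector<T*> table(batch_count);
    for (std::size_t l = 0; l < batch_count; ++l)
        table[l] = base + l * elements_per_matrix;
    return table;
}
```

With elements_per_matrix = lda*n, each entry of the table satisfies the documented requirement of pointing to an array of dimension lda*n.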

rocsolver_<type>sytrf_strided_batched()#

rocblas_status rocsolver_zsytrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_csytrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dsytrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssytrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

SYTRF_STRIDED_BATCHED computes the factorization of a batch of symmetric indefinite matrices using Bunch-Kaufman diagonal pivoting.

(This is the blocked version of the algorithm).

The factorization has the form

\[\begin{split} \begin{array}{cl} A_l^{} = U_l^{} D_l^{} U_l^T & \: \text{or}\\ A_l^{} = L_l^{} D_l^{} L_l^T & \end{array} \end{split}\]

where \(U_l\) or \(L_l\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D_l\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_{kl}\).

Specifically, \(U_l\) and \(L_l\) are computed as

\[\begin{split} \begin{array}{cl} U_l = P_l(n) U_l(n) \cdots P_l(k) U_l(k) \cdots & \: \text{and}\\ L_l = P_l(1) L_l(1) \cdots P_l(k) L_l(k) \cdots & \end{array} \end{split}\]

where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_{kl}\), and \(P_l(k)\) is a permutation matrix defined by \(ipiv_l[k]\). If we let \(s\) denote the order of block \(D_{kl}\), then \(U_l(k)\) and \(L_l(k)\) are unit upper/lower triangular matrices defined as

\[\begin{split} U_l(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]

and

\[\begin{split} L_l(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]

If \(s = 1\), then \(D_{kl}\) is stored in \(A_l[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A_l\). If \(s = 2\) and uplo is upper, then \(D_{kl}\) is stored in \(A_l[k-1,k-1]\), \(A_l[k-1,k]\), and \(A_l[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A_l\). If \(s = 2\) and uplo is lower, then \(D_{kl}\) is stored in \(A_l[k,k]\), \(A_l[k+1,k]\), and \(A_l[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A_l\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of each matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of all matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the symmetric matrices A_l to be factored. On exit, the block diagonal matrices D_l and the multipliers needed to compute U_l or L_l.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). Contains the vectors of pivot indices ipiv_l, each of dimension n. Elements of ipiv_l are 1-based indices. For 1 <= k <= n, if ipiv_l[k] > 0 then rows and columns k and ipiv_l[k] were interchanged and D_l[k,k] is a 1-by-1 diagonal block. If, instead, ipiv_l[k] = ipiv_l[k-1] < 0 and uplo is upper (or ipiv_l[k] = ipiv_l[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv_l[k] (or rows and columns k+1 and -ipiv_l[k]) were interchanged and D_l[k-1,k-1] to D_l[k,k] (or D_l[k,k] to D_l[k+1,k+1]) is a 2-by-2 diagonal block.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for factorization of A_l. If info[l] = i > 0, D_l is singular. D_l[i,i] is the first diagonal zero.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
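
The pivot encoding described for ipiv can be decoded on the host once a vector ipiv_l has been copied back from the GPU. The following is a minimal sketch under that assumption; the function name count_diag_blocks and its flag upper are hypothetical and not part of rocSOLVER. It walks the diagonal in the order given above (from n down to 1 for upper, from 1 up to n for lower) and counts the 1-by-1 and 2-by-2 blocks of D_l.

```c
#include <assert.h>

/* Hypothetical host-side helper, not part of rocSOLVER.
 * Counts the 1-by-1 and 2-by-2 diagonal blocks of D_l encoded in one pivot
 * vector ipiv_l that has already been copied to host memory.
 * The 1-based entry ipiv_l[k] of the text is ipiv[k - 1] here.
 * upper != 0 selects the uplo = upper traversal (k runs from n down to 1).
 * Returns 0 on success, -1 if the encoding is inconsistent. */
int count_diag_blocks(const int *ipiv, int n, int upper, int *n1x1, int *n2x2)
{
    assert(n >= 0);
    *n1x1 = 0;
    *n2x2 = 0;
    int step = upper ? -1 : 1;
    int k = upper ? n : 1; /* 1-based position on the diagonal */
    while (k >= 1 && k <= n) {
        int p = ipiv[k - 1];
        if (p > 0) {
            /* positive entry: 1-by-1 block at position k */
            ++*n1x1;
            k += step;
        } else {
            /* negative entry: must be repeated at the neighbouring position */
            int partner = k + step;
            if (p == 0 || partner < 1 || partner > n || ipiv[partner - 1] != p)
                return -1;
            ++*n2x2;
            k += 2 * step;
        }
    }
    return 0;
}
```

For a batch, the same helper could be applied to each ipiv_l in turn, where ipiv_l starts strideP elements after the previous vector.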

Orthogonal factorizations#

rocsolver_<type>geqr2()#

rocblas_status rocsolver_zgeqr2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cgeqr2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
rocblas_status rocsolver_dgeqr2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sgeqr2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#

GEQR2 computes a QR factorization of a general m-by-n matrix A.

(This is the unblocked version of the algorithm).

The factorization has the form

\[\begin{split} A = Q\left[\begin{array}{c} R\\ 0 \end{array}\right] \end{split}\]

where R is upper triangular (upper trapezoidal if m < n), and Q is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H(1)H(2)\cdots H(k), \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H(i)\) is given by

\[ H(i) = I - \text{ipiv}[i] \cdot v_i^{} v_i' \]

where the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix A.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on and above the diagonal contain the factor R; the elements below the diagonal are the last m - i elements of Householder vector v_i.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars.
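
The packed layout of A on exit can be unpacked on the host after copying the result back. The sketch below assumes the real double precision case and column-major storage with leading dimension lda; the helper names extract_r and extract_v are hypothetical and not part of rocSOLVER.

```c
#include <stddef.h>

/* Hypothetical host-side helpers, not part of rocSOLVER. They assume the
 * real double precision case and that the GEQR2 output was copied to the host.
 * a is column-major with leading dimension lda, so A[i,j] (1-based) is
 * a[(i - 1) + (j - 1) * lda]. */

/* Copies R (min(m,n)-by-n, upper triangular or trapezoidal) into r, a
 * column-major array with leading dimension ldr >= min(m,n). */
void extract_r(const double *a, int m, int n, int lda, double *r, int ldr)
{
    int k = m < n ? m : n;
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < k; ++i)
            r[i + (size_t)j * ldr] = (i <= j) ? a[i + (size_t)j * lda] : 0.0;
}

/* Expands Householder vector v_i (1-based i, 1 <= i <= min(m,n)) to its full
 * length m: i - 1 zeros, an implicit 1, then the m - i stored entries. */
void extract_v(const double *a, int m, int lda, int i, double *v)
{
    for (int r = 0; r < m; ++r) {
        if (r < i - 1)
            v[r] = 0.0;
        else if (r == i - 1)
            v[r] = 1.0;
        else
            v[r] = a[r + (size_t)(i - 1) * lda];
    }
}
```

Note that the unit entry v_i[i] = 1 is never stored in A, because that position holds the diagonal of R.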

rocsolver_<type>geqr2_batched()#

rocblas_status rocsolver_zgeqr2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgeqr2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgeqr2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgeqr2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GEQR2_BATCHED computes the QR factorization of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_l\) in the batch has the form

\[\begin{split} A_l = Q_l\left[\begin{array}{c} R_l\\ 0 \end{array}\right] \end{split}\]

where \(R_l\) is upper triangular (upper trapezoidal if m < n), and \(Q_l\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_l = H_l(1)H_l(2)\cdots H_l(k), \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]

where the first i-1 elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and above the diagonal contain the factor R_l. The elements below the diagonal are the last m - i elements of Householder vector v_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • ipiv[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
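
The batched interface takes A as an array of pointers, one per matrix. A common way to build such an array is to carve the matrices out of one contiguous allocation. The sketch below only performs pointer arithmetic and never dereferences the data, so base may be a device pointer; the helper name fill_batch_pointers and the packing of lda*n elements per matrix are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical host-side helper, not part of rocSOLVER.
 * Fills ptrs[0 .. batch_count-1] so that ptrs[l] addresses matrix A_l inside
 * one contiguous allocation `base` in which consecutive matrices are lda * n
 * elements apart. Here l is the 0-based position in the batch. */
void fill_batch_pointers(double **ptrs, double *base, int lda, int n, int batch_count)
{
    size_t per_matrix = (size_t)lda * (size_t)n;
    for (int l = 0; l < batch_count; ++l)
        ptrs[l] = base + (size_t)l * per_matrix;
}
```

The library may additionally require the pointer array itself to reside in device memory; that copy is omitted here and should be checked against the rocSOLVER and rocBLAS documentation.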

rocsolver_<type>geqr2_strided_batched()#

rocblas_status rocsolver_zgeqr2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgeqr2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgeqr2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgeqr2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GEQR2_STRIDED_BATCHED computes the QR factorization of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_l\) in the batch has the form

\[\begin{split} A_l = Q_l\left[\begin{array}{c} R_l\\ 0 \end{array}\right] \end{split}\]

where \(R_l\) is upper triangular (upper trapezoidal if m < n), and \(Q_l\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_l = H_l(1)H_l(2)\cdots H_l(k), \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]

where the first i-1 elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and above the diagonal contain the factor R_l. The elements below the diagonal are the last m - i elements of Householder vector v_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
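
The roles of lda, strideA and strideP can be summarized as index arithmetic. The sketch below assumes column-major storage, non-negative strides, 1-based i and j as in the text, and a 0-based batch position l; all three helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of rocSOLVER. */

/* Offset (in elements) of A_l[i,j] from the start of the array A. */
int64_t strided_matrix_offset(int64_t l, int i, int j, int lda, int64_t strideA)
{
    return l * strideA + (int64_t)(j - 1) * lda + (i - 1);
}

/* Offset (in elements) of ipiv_l[i] from the start of the array ipiv. */
int64_t strided_scalar_offset(int64_t l, int i, int64_t strideP)
{
    return l * strideP + (i - 1);
}

/* Smallest element count that covers batch_count items placed `stride`
 * elements apart when each item spans item_len elements (stride >= 0). */
int64_t strided_buffer_len(int64_t stride, int64_t item_len, int batch_count)
{
    if (batch_count <= 0)
        return 0;
    return (int64_t)(batch_count - 1) * stride + item_len;
}
```

With the normal use case strideA >= lda*n and strideP >= min(m,n), the items do not overlap, and strided_buffer_len gives the allocation size for A (item_len = lda*n) and for ipiv (item_len = min(m,n)).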

rocsolver_<type>geqrf()#

rocblas_status rocsolver_zgeqrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cgeqrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
rocblas_status rocsolver_dgeqrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sgeqrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#

GEQRF computes a QR factorization of a general m-by-n matrix A.

(This is the blocked version of the algorithm).

The factorization has the form

\[\begin{split} A = Q\left[\begin{array}{c} R\\ 0 \end{array}\right] \end{split}\]

where R is upper triangular (upper trapezoidal if m < n), and Q is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H(1)H(2)\cdots H(k), \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H(i)\) is given by

\[ H(i) = I - \text{ipiv}[i] \cdot v_i^{} v_i' \]

where the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix A.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on and above the diagonal contain the factor R; the elements below the diagonal are the last m - i elements of Householder vector v_i.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars.
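
Because Q is returned only implicitly, as Householder vectors and scalars, applying it to a vector means applying the reflectors one at a time. The following host-side sketch assumes the real double precision case and that A and ipiv were copied back from the GPU; the function name apply_q is hypothetical and is not a rocSOLVER routine.

```c
#include <stddef.h>

/* Hypothetical host-side helper, not part of rocSOLVER; real double case.
 * Overwrites x (length m) with Q x = H(1) H(2) ... H(k) x, k = min(m,n),
 * using the packed Householder vectors in a (column-major, leading dimension
 * lda) and the Householder scalars in tau (the ipiv output). */
void apply_q(const double *a, int m, int n, int lda, const double *tau, double *x)
{
    int k = m < n ? m : n;
    for (int i = k - 1; i >= 0; --i) { /* H(k) is applied first */
        const double *col = a + (size_t)i * lda;
        double dot = x[i]; /* v_i[i] = 1 is implicit, not stored */
        for (int r = i + 1; r < m; ++r)
            dot += col[r] * x[r];
        double s = tau[i] * dot; /* tau_i * (v_i' x) */
        x[i] -= s;
        for (int r = i + 1; r < m; ++r)
            x[r] -= s * col[r];
    }
}
```

Each step computes x - ipiv[i] * v_i * (v_i' x), which is H(i) x without ever forming the m-by-m matrix H(i).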

rocsolver_<type>geqrf_batched()#

rocblas_status rocsolver_zgeqrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgeqrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgeqrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgeqrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GEQRF_BATCHED computes the QR factorization of a batch of general m-by-n matrices.

(This is the blocked version of the algorithm).

The factorization of matrix \(A_l\) in the batch has the form

\[\begin{split} A_l = Q_l\left[\begin{array}{c} R_l\\ 0 \end{array}\right] \end{split}\]

where \(R_l\) is upper triangular (upper trapezoidal if m < n), and \(Q_l\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_l = H_l(1)H_l(2)\cdots H_l(k), \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]

where the first i-1 elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and above the diagonal contain the factor R_l. The elements below the diagonal are the last m - i elements of Householder vector v_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • ipiv[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>geqrf_strided_batched()#

rocblas_status rocsolver_zgeqrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgeqrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgeqrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgeqrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GEQRF_STRIDED_BATCHED computes the QR factorization of a batch of general m-by-n matrices.

(This is the blocked version of the algorithm).

The factorization of matrix \(A_l\) in the batch has the form

\[\begin{split} A_l = Q_l\left[\begin{array}{c} R_l\\ 0 \end{array}\right] \end{split}\]

where \(R_l\) is upper triangular (upper trapezoidal if m < n), and \(Q_l\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_l = H_l(1)H_l(2)\cdots H_l(k), \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]

where the first i-1 elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and above the diagonal contain the factor R_l. The elements below the diagonal are the last m - i elements of Householder vector v_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>gerq2()#

rocblas_status rocsolver_zgerq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cgerq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
rocblas_status rocsolver_dgerq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sgerq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#

GERQ2 computes an RQ factorization of a general m-by-n matrix A.

(This is the unblocked version of the algorithm).

The factorization has the form

\[ A = \left[\begin{array}{cc} 0 & R \end{array}\right] Q \]

where R is upper triangular (upper trapezoidal if m > n), and Q is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H(1)'H(2)' \cdots H(k)', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrix \(H(i)\) is given by

\[ H(i) = I - \text{ipiv}[i] \cdot v_i^{} v_i' \]

where the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix A.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_i.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars.
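
The two cases in the description of A on exit (the (m-n)-th subdiagonal when m >= n, the (n-m)-th superdiagonal when n > m) reduce to a single test: position (i, j) belongs to R exactly when j - i >= n - m. The sketch below uses that test to copy R out of the packed result on the host. It assumes the real double precision case and column-major storage; both helper names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helpers, not part of rocSOLVER. */

/* Returns 1 when the 1-based position (i, j) of the m-by-n GERQ2 output
 * holds an entry of R, and 0 when it holds part of a Householder vector. */
int rq_entry_is_r(int i, int j, int m, int n)
{
    return (j - i) >= (n - m);
}

/* Copies the R part of a (column-major, leading dimension lda) into the
 * m-by-n array r (column-major, leading dimension ldr >= m) and zeroes the
 * positions that hold Householder vector entries. */
void extract_rq_r(const double *a, int m, int n, int lda, double *r, int ldr)
{
    for (int j = 1; j <= n; ++j)
        for (int i = 1; i <= m; ++i)
            r[(i - 1) + (size_t)(j - 1) * ldr] =
                rq_entry_is_r(i, j, m, n) ? a[(i - 1) + (size_t)(j - 1) * lda] : 0.0;
}
```

For m <= n the result is the matrix [0 R] from the factorization above; for m > n it is the m-by-n upper trapezoidal R.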

rocsolver_<type>gerq2_batched()#

rocblas_status rocsolver_zgerq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgerq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgerq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgerq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GERQ2_BATCHED computes the RQ factorization of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_l\) in the batch has the form

\[ A_l = \left[\begin{array}{cc} 0 & R_l \end{array}\right] Q_l \]

where \(R_l\) is upper triangular (upper trapezoidal if m > n), and \(Q_l\) is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_l = H_l(1)'H_l(2)' \cdots H_l(k)', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]

where the last n-i elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R_l; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • ipiv[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>gerq2_strided_batched()#

rocblas_status rocsolver_zgerq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgerq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgerq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgerq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GERQ2_STRIDED_BATCHED computes the RQ factorization of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_l\) in the batch has the form

\[ A_l = \left[\begin{array}{cc} 0 & R_l \end{array}\right] Q_l \]

where \(R_l\) is upper triangular (upper trapezoidal if m > n), and \(Q_l\) is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_l = H_l(1)'H_l(2)' \cdots H_l(k)', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]

where the last n-i elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R_l; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>gerqf()#

rocblas_status rocsolver_zgerqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cgerqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
rocblas_status rocsolver_dgerqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sgerqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#

GERQF computes an RQ factorization of a general m-by-n matrix A.

(This is the blocked version of the algorithm).

The factorization has the form

\[ A = \left[\begin{array}{cc} 0 & R \end{array}\right] Q \]

where R is upper triangular (upper trapezoidal if m > n), and Q is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H(1)'H(2)' \cdots H(k)', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrix \(H(i)\) is given by

\[ H(i) = I - \text{ipiv}[i] \cdot v_i^{} v_i' \]

where the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix A.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_i.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars.

rocsolver_<type>gerqf_batched()#

rocblas_status rocsolver_zgerqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgerqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgerqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgerqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GERQF_BATCHED computes the RQ factorization of a batch of general m-by-n matrices.

(This is the blocked version of the algorithm).

The factorization of matrix \(A_l\) in the batch has the form

\[ A_l = \left[\begin{array}{cc} 0 & R_l \end{array}\right] Q_l \]

where \(R_l\) is upper triangular (upper trapezoidal if m > n), and \(Q_l\) is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_l = H_l(1)'H_l(2)' \cdots H_l(k)', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]

where the last n-i elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R_l; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • ipiv[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>gerqf_strided_batched()#

rocblas_status rocsolver_zgerqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgerqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgerqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgerqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GERQF_STRIDED_BATCHED computes the RQ factorization of a batch of general m-by-n matrices.

(This is the blocked version of the algorithm).

The factorization of matrix \(A_l\) in the batch has the form

\[ A_l = \left[\begin{array}{cc} 0 & R_l \end{array}\right] Q_l \]

where \(R_l\) is upper triangular (upper trapezoidal if m > n), and \(Q_l\) is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_l = H_l(1)'H_l(2)' \cdots H_l(k)', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]

where the last n-i elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R_l; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
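
The strided layout described by strideA and strideP can be sketched with a few host-side helpers. This is only an illustration, assuming column-major storage, 1-based i and j as in the notation above, and a 0-based batch index l. The names a_offset, ipiv_offset and min_elems are hypothetical and are not part of rocSOLVER.

```c
#include <stddef.h>

/* Offset (in elements) of A_l[i,j] inside a strided-batched, column-major
   buffer. i and j are 1-based; l is a 0-based batch index (assumption). */
size_t a_offset(size_t l, size_t i, size_t j, size_t lda, size_t strideA)
{
    return l * strideA + (j - 1) * lda + (i - 1);
}

/* Offset (in elements) of the Householder scalar ipiv_l[i]; i is 1-based. */
size_t ipiv_offset(size_t l, size_t i, size_t strideP)
{
    return l * strideP + (i - 1);
}

/* Smallest element count that covers batch_count items of per_item elements
   each, spaced stride elements apart. Assumes stride >= per_item, which is
   the "normal use" case mentioned in the parameter list. */
size_t min_elems(size_t stride, size_t per_item, size_t batch_count)
{
    return batch_count == 0 ? 0 : stride * (batch_count - 1) + per_item;
}
```

For example, min_elems(strideA, lda*n, batch_count) gives a lower bound on the number of elements to allocate for A, and min_elems(strideP, min(m,n), batch_count) does the same for ipiv.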

rocsolver_<type>geql2()#

rocblas_status rocsolver_zgeql2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cgeql2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
rocblas_status rocsolver_dgeql2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sgeql2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#

GEQL2 computes a QL factorization of a general m-by-n matrix A.

(This is the unblocked version of the algorithm).

The factorization has the form

\[\begin{split} A = Q\left[\begin{array}{c} 0\\ L \end{array}\right] \end{split}\]

where L is lower triangular (lower trapezoidal if m < n), and Q is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H(k)H(k-1)\cdots H(1), \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H(i)\) is given by

\[ H(i) = I - \text{ipiv}[i] \cdot v_i^{} v_i' \]

where the last m-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix A.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_i.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars.
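
To make the formula for H(i) concrete, the following host-side sketch applies a single reflector H = I - tau * v * v' to a vector without ever forming H. It covers the real case only, and the function name apply_householder is hypothetical, not a rocSOLVER routine. Here tau plays the role of ipiv[i] and v the role of v_i.

```c
#include <stddef.h>

/* Overwrite x (length m) with H*x, where H = I - tau * v * v^T (real case). */
void apply_householder(size_t m, double tau, const double *v, double *x)
{
    double dot = 0.0;
    for (size_t r = 0; r < m; ++r)
        dot += v[r] * x[r];            /* dot = v^T x */
    for (size_t r = 0; r < m; ++r)
        x[r] -= tau * dot * v[r];      /* x = x - tau * (v^T x) * v */
}
```

In the real case H is orthogonal when tau = 2 / (v' v), so applying the same reflector twice should return the original vector, which is a convenient sanity check.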

rocsolver_<type>geql2_batched()#

rocblas_status rocsolver_zgeql2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgeql2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgeql2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgeql2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GEQL2_BATCHED computes the QL factorization of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_l\) in the batch has the form

\[\begin{split} A_l = Q_l\left[\begin{array}{c} 0\\ L_l \end{array}\right] \end{split}\]

where \(L_l\) is lower triangular (lower trapezoidal if m < n), and \(Q_l\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_l = H_l(k)H_l(k-1)\cdots H_l(1), \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]

where the last m-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L_l; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • ipiv[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
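
The batched variants take A as an array of pointers, one per matrix. A minimal sketch of building such a table from one contiguous allocation is shown below. The helper name fill_batch_pointers and the spacing parameter are hypothetical. In a real program base would typically be a device pointer and the finished table would itself have to be accessible from the GPU; that part is an assumption not spelled out in this section.

```c
#include <stddef.h>

/* Fill ptrs[l] so that it points at the start of matrix A_l inside one
   contiguous allocation whose consecutive matrices are `spacing` elements
   apart (for example spacing = lda * n). */
void fill_batch_pointers(double *base, size_t spacing, size_t batch_count,
                         double **ptrs)
{
    for (size_t l = 0; l < batch_count; ++l)
        ptrs[l] = base + l * spacing;
}
```

Only pointer arithmetic is performed here; nothing is dereferenced, so the same loop works whether base refers to host or device memory.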

rocsolver_<type>geql2_strided_batched()#

rocblas_status rocsolver_zgeql2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgeql2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgeql2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgeql2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GEQL2_STRIDED_BATCHED computes the QL factorization of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_l\) in the batch has the form

\[\begin{split} A_l = Q_l\left[\begin{array}{c} 0\\ L_l \end{array}\right] \end{split}\]

where \(L_l\) is lower triangular (lower trapezoidal if m < n), and \(Q_l\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_l = H_l(k)H_l(k-1)\cdots H_l(1), \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]

where the last m-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L_l; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>geqlf()#

rocblas_status rocsolver_zgeqlf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cgeqlf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
rocblas_status rocsolver_dgeqlf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sgeqlf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#

GEQLF computes a QL factorization of a general m-by-n matrix A.

(This is the blocked version of the algorithm).

The factorization has the form

\[\begin{split} A = Q\left[\begin{array}{c} 0\\ L \end{array}\right] \end{split}\]

where L is lower triangular (lower trapezoidal if m < n), and Q is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H(k)H(k-1)\cdots H(1), \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H(i)\) is given by

\[ H(i) = I - \text{ipiv}[i] \cdot v_i^{} v_i' \]

where the last m-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix A.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_i.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars.
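
The placement of L in the output A can be expressed as a single index test: with 1-based indices, position (i,j) holds an entry of L exactly when i - j >= m - n, which covers both the subdiagonal case (m >= n) and the superdiagonal case (n > m). The host-side sketch below uses that test to copy out the m-by-n matrix [0; L]. It assumes column-major storage and a copy of A already transferred to the host; in_l_factor and extract_l are hypothetical helper names.

```c
#include <stddef.h>

/* Nonzero if the 1-based position (i,j) of the m-by-n output holds an entry
   of L. The test i - j >= m - n is rearranged to avoid unsigned underflow. */
int in_l_factor(size_t m, size_t n, size_t i, size_t j)
{
    return i + n >= j + m;
}

/* Copy the factor out of the column-major output A (leading dimension lda)
   into the m-by-n column-major array L (leading dimension m). Entries that
   hold Householder vector data in A are written as zero in L. */
void extract_l(size_t m, size_t n, const double *A, size_t lda, double *L)
{
    for (size_t j = 1; j <= n; ++j)
        for (size_t i = 1; i <= m; ++i)
            L[(j - 1) * m + (i - 1)] =
                in_l_factor(m, n, i, j) ? A[(j - 1) * lda + (i - 1)] : 0.0;
}
```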

rocsolver_<type>geqlf_batched()#

rocblas_status rocsolver_zgeqlf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgeqlf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgeqlf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgeqlf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GEQLF_BATCHED computes the QL factorization of a batch of general m-by-n matrices.

(This is the blocked version of the algorithm).

The factorization of matrix \(A_l\) in the batch has the form

\[\begin{split} A_l = Q_l\left[\begin{array}{c} 0\\ L_l \end{array}\right] \end{split}\]

where \(L_l\) is lower triangular (lower trapezoidal if m < n), and \(Q_l\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_l = H_l(k)H_l(k-1)\cdots H_l(1), \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]

where the last m-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L_l; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • ipiv[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>geqlf_strided_batched()#

rocblas_status rocsolver_zgeqlf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgeqlf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgeqlf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgeqlf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GEQLF_STRIDED_BATCHED computes the QL factorization of a batch of general m-by-n matrices.

(This is the blocked version of the algorithm).

The factorization of matrix \(A_l\) in the batch has the form

\[\begin{split} A_l = Q_l\left[\begin{array}{c} 0\\ L_l \end{array}\right] \end{split}\]

where \(L_l\) is lower triangular (lower trapezoidal if m < n), and \(Q_l\) is an m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_l = H_l(k)H_l(k-1)\cdots H_l(1), \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]

where the last m-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L_l; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>gelq2()#

rocblas_status rocsolver_zgelq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cgelq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
rocblas_status rocsolver_dgelq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sgelq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#

GELQ2 computes an LQ factorization of a general m-by-n matrix A.

(This is the unblocked version of the algorithm).

The factorization has the form

\[ A = \left[\begin{array}{cc} L & 0 \end{array}\right] Q \]

where L is lower triangular (lower trapezoidal if m > n), and Q is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H(k)'H(k-1)' \cdots H(1)', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrix \(H(i)\) is given by

\[ H(i) = I - \text{ipiv}[i] \cdot v_i' v_i^{} \]

where the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix A.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on and below the diagonal contain the factor L; the elements above the diagonal are the last n - i elements of Householder vector v_i.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars.
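
The packed storage of the Householder vectors can be undone on the host as sketched below. The sketch assumes column-major storage and the LAPACK convention that the stored part of v_i (its last n - i elements) lies along row i of A, to the right of the diagonal. The function name unpack_lq_vector is hypothetical.

```c
#include <stddef.h>

/* Expand the Householder vector v_i (1-based i, length n) from row i of the
   column-major LQ output A with leading dimension lda. */
void unpack_lq_vector(size_t n, size_t i, const double *A, size_t lda,
                      double *v)
{
    for (size_t j = 1; j <= n; ++j) {
        if (j < i)
            v[j - 1] = 0.0;                           /* first i-1 elements are zero */
        else if (j == i)
            v[j - 1] = 1.0;                           /* implicit unit element v_i[i] */
        else
            v[j - 1] = A[(j - 1) * lda + (i - 1)];    /* stored above the diagonal */
    }
}
```

The diagonal entry A[i,i] belongs to L, which is why the unit element of v_i is implicit rather than stored.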

rocsolver_<type>gelq2_batched()#

rocblas_status rocsolver_zgelq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgelq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgelq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgelq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GELQ2_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_l\) in the batch has the form

\[ A_l = \left[\begin{array}{cc} L_l & 0 \end{array}\right] Q_l \]

where \(L_l\) is lower triangular (lower trapezoidal if m > n), and \(Q_l\) is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_l = H_l(k)'H_l(k-1)' \cdots H_l(1)', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}' v_{l_i}^{} \]

where the first i-1 elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and below the diagonal contain the factor L_l. The elements above the diagonal are the last n - i elements of Householder vector v_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • ipiv[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>gelq2_strided_batched()#

rocblas_status rocsolver_zgelq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgelq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgelq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgelq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GELQ2_STRIDED_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_l\) in the batch has the form

\[ A_l = \left[\begin{array}{cc} L_l & 0 \end{array}\right] Q_l \]

where \(L_l\) is lower triangular (lower trapezoidal if m > n), and \(Q_l\) is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_l = H_l(k)'H_l(k-1)' \cdots H_l(1)', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}' v_{l_i}^{} \]

where the first i-1 elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and below the diagonal contain the factor L_l. The elements above the diagonal are the last n - i elements of Householder vector v_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>gelqf()#

rocblas_status rocsolver_zgelqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cgelqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
rocblas_status rocsolver_dgelqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sgelqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#

GELQF computes an LQ factorization of a general m-by-n matrix A.

(This is the blocked version of the algorithm).

The factorization has the form

\[ A = \left[\begin{array}{cc} L & 0 \end{array}\right] Q \]

where L is lower triangular (lower trapezoidal if m > n), and Q is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H(k)'H(k-1)' \cdots H(1)', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrix \(H(i)\) is given by

\[ H(i) = I - \text{ipiv}[i] \cdot v_i' v_i^{} \]

where the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix A.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on and below the diagonal contain the factor L; the elements above the diagonal are the last n - i elements of Householder vector v_i.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars.
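
Because Q is only available implicitly through A and ipiv, a common task is to apply it to a vector. The host-side sketch below does this for the real case, where each H(i) is symmetric and therefore Q = H(k) ... H(1), so H(1) acts first. It assumes column-major storage, the LAPACK convention that v_i is stored along row i of A, and that A and ipiv have been copied back to the host. The name apply_lq_q is hypothetical and is not a rocSOLVER routine.

```c
#include <stddef.h>

/* Overwrite x (length n) with Q*x, where Q = H(k) ... H(1), k = min(m,n), is
   read from the packed LQ output A (column-major, leading dimension lda) and
   the Householder scalars ipiv. Real case only. */
void apply_lq_q(size_t m, size_t n, const double *A, size_t lda,
                const double *ipiv, double *x)
{
    size_t k = m < n ? m : n;
    for (size_t i = 1; i <= k; ++i) {              /* H(1) is applied first */
        /* dot = v_i . x with v_i = (0, ..., 0, 1, A[i,i+1], ..., A[i,n]) */
        double dot = x[i - 1];
        for (size_t j = i + 1; j <= n; ++j)
            dot += A[(j - 1) * lda + (i - 1)] * x[j - 1];

        /* x = x - ipiv[i] * (v_i . x) * v_i */
        double s = ipiv[i - 1] * dot;
        x[i - 1] -= s;
        for (size_t j = i + 1; j <= n; ++j)
            x[j - 1] -= s * A[(j - 1) * lda + (i - 1)];
    }
}
```

Applying the routine to each unit vector in turn would yield the columns of Q explicitly, at the cost of O(k n) work per column.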

rocsolver_<type>gelqf_batched()#

rocblas_status rocsolver_zgelqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgelqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgelqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgelqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GELQF_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.

(This is the blocked version of the algorithm).

The factorization of matrix \(A_l\) in the batch has the form

\[ A_l = \left[\begin{array}{cc} L_l & 0 \end{array}\right] Q_l \]

where \(L_l\) is lower triangular (lower trapezoidal if m > n), and \(Q_l\) is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_l = H_l(k)'H_l(k-1)' \cdots H_l(1)', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}' v_{l_i}^{} \]

where the first i-1 elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and below the diagonal contain the factor L_l. The elements above the diagonal are the last n - i elements of Householder vector v_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • ipiv[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>gelqf_strided_batched()#

rocblas_status rocsolver_zgelqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgelqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgelqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgelqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GELQF_STRIDED_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.

(This is the blocked version of the algorithm).

The factorization of matrix \(A_l\) in the batch has the form

\[ A_l = \left[\begin{array}{cc} L_l & 0 \end{array}\right] Q_l \]

where \(L_l\) is lower triangular (lower trapezoidal if m > n), and \(Q_l\) is an n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_l = H_l(k)'H_l(k-1)' \cdots H_l(1)', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}' v_{l_i}^{} \]

where the first i-1 elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and below the diagonal contain the factor L_l. The elements above the diagonal are the last n - i elements of Householder vector v_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
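
The strided layout described by strideA and strideP can be sketched on the host as plain index arithmetic. This is a minimal illustration, not part of rocSOLVER: the helper names are hypothetical, column-major storage and non-negative strides are assumed, the batch index l is taken as 0-based, and i and j are 1-based to match the notation used in this reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a rocSOLVER function): element offset of
   A_l[i,j] inside the strided buffer A, assuming column-major storage.
   l is a 0-based batch index; i and j are 1-based. */
size_t strided_matrix_offset(size_t l, size_t i, size_t j,
                             size_t lda, size_t strideA)
{
    assert(i >= 1 && j >= 1 && i <= lda);
    return l * strideA + (j - 1) * lda + (i - 1);
}

/* Hypothetical helper: element offset of ipiv_l[i] inside the buffer ipiv.
   l is a 0-based batch index; i is 1-based. */
size_t strided_vector_offset(size_t l, size_t i, size_t strideP)
{
    assert(i >= 1);
    return l * strideP + (i - 1);
}
```

For example, with lda = 3 and strideA = 12, element A_l[2,3] of the second matrix in the batch (l = 1) would sit at offset 12 + 2*3 + 1 = 19.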

Problem and matrix reductions#

rocsolver_<type>gebd2()#

rocblas_status rocsolver_zgebd2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tauq, rocblas_double_complex *taup)#
rocblas_status rocsolver_cgebd2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tauq, rocblas_float_complex *taup)#
rocblas_status rocsolver_dgebd2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, double *tauq, double *taup)#
rocblas_status rocsolver_sgebd2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, float *tauq, float *taup)#

GEBD2 computes the bidiagonal form of a general m-by-n matrix A.

(This is the unblocked version of the algorithm).

The bidiagonal form is given by:

\[ B = Q' A P \]

where B is upper bidiagonal if m >= n and lower bidiagonal if m < n, and Q and P are orthogonal/unitary matrices represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q = H(1)H(2)\cdots H(n)\: \text{and} \: P = G(1)G(2)\cdots G(n-1), & \: \text{if}\: m >= n, \:\text{or}\\ Q = H(1)H(2)\cdots H(m-1)\: \text{and} \: P = G(1)G(2)\cdots G(m), & \: \text{if}\: m < n. \end{array} \end{split}\]

Each Householder matrix \(H(i)\) and \(G(i)\) is given by

\[\begin{split} \begin{array}{cl} H(i) = I - \text{tauq}[i] \cdot v_i^{} v_i', & \: \text{and}\\ G(i) = I - \text{taup}[i] \cdot u_i' u_i^{}. \end{array} \end{split}\]

If m >= n, the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\); while the first i elements of the Householder vector \(u_i\) are zero, and \(u_i[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_i\) are zero, and \(u_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix A.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_i, and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_i. If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_i, and the elements above the diagonal are the last n - i elements of Householder vector u_i.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • D[out] pointer to real type. Array on the GPU of dimension min(m,n). The diagonal elements of B.

  • E[out] pointer to real type. Array on the GPU of dimension min(m,n)-1. The off-diagonal elements of B.

  • tauq[out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars associated with matrix Q.

  • taup[out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars associated with matrix P.
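
Since D and E hold only the nonzero entries of B, a dense copy of B can be rebuilt from them when needed. The following is a host-side sketch under stated assumptions: D and E have already been copied back from the GPU, the real double-precision case is used, storage is column-major, and assemble_bidiagonal is a hypothetical name rather than a rocSOLVER function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: expand host copies of D and E into a dense m-by-n
   column-major matrix B with leading dimension ldb. B is upper bidiagonal
   if m >= n and lower bidiagonal if m < n. */
void assemble_bidiagonal(int m, int n, const double *D, const double *E,
                         double *B, int ldb)
{
    assert(m >= 0 && n >= 0 && ldb >= m);
    int k = m < n ? m : n;

    /* Start from an all-zero matrix. */
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i)
            B[(size_t)j * ldb + i] = 0.0;

    /* Main diagonal: min(m,n) entries. */
    for (int i = 0; i < k; ++i)
        B[(size_t)i * ldb + i] = D[i];

    /* Off-diagonal: min(m,n)-1 entries. */
    for (int i = 0; i + 1 < k; ++i) {
        if (m >= n)
            B[(size_t)(i + 1) * ldb + i] = E[i]; /* superdiagonal entry */
        else
            B[(size_t)i * ldb + (i + 1)] = E[i]; /* subdiagonal entry */
    }
}
```

For the complex variants D and E are still real arrays, so the same idea would apply with B declared as a complex matrix.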

rocsolver_<type>gebd2_batched()#

rocblas_status rocsolver_zgebd2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tauq, const rocblas_stride strideQ, rocblas_double_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgebd2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tauq, const rocblas_stride strideQ, rocblas_float_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgebd2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tauq, const rocblas_stride strideQ, double *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgebd2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tauq, const rocblas_stride strideQ, float *taup, const rocblas_stride strideP, const rocblas_int batch_count)#

GEBD2_BATCHED computes the bidiagonal form of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

For each instance in the batch, the bidiagonal form is given by:

\[ B_l^{} = Q_l' A_l^{} P_l^{} \]

where \(B_l\) is upper bidiagonal if m >= n and lower bidiagonal if m < n, and \(Q_l\) and \(P_l\) are orthogonal/unitary matrices represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n)\: \text{and} \: P_l = G_l(1)G_l(2)\cdots G_l(n-1), & \: \text{if}\: m >= n, \:\text{or}\\ Q_l = H_l(1)H_l(2)\cdots H_l(m-1)\: \text{and} \: P_l = G_l(1)G_l(2)\cdots G_l(m), & \: \text{if}\: m < n. \end{array} \end{split}\]

Each Householder matrix \(H_l(i)\) and \(G_l(i)\) is given by

\[\begin{split} \begin{array}{cl} H_l^{}(i) = I - \text{tauq}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}', & \: \text{and}\\ G_l^{}(i) = I - \text{taup}_l^{}[i] \cdot u_{l_i}' u_{l_i}^{}. \end{array} \end{split}\]

If m >= n, the first i-1 elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\); while the first i elements of the Householder vector \(u_{l_i}\) are zero, and \(u_{l_i}[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_{l_i}\) are zero, and \(u_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B_l. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_(l_i), and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_(l_i). If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_(l_i), and the elements above the diagonal are the last n - i elements of Householder vector u_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • D[out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of B_l.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= min(m,n).

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of B_l.

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)-1.

  • tauq[out] pointer to type. Array on the GPU (the size depends on the value of strideQ). Contains the vectors tauq_l of Householder scalars associated with matrices Q_l.

  • strideQ[in] rocblas_stride. Stride from the start of one vector tauq_l to the next one tauq_(l+1). There is no restriction for the value of strideQ. Normal use is strideQ >= min(m,n).

  • taup[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors taup_l of Householder scalars associated with matrices P_l.

  • strideP[in] rocblas_stride. Stride from the start of one vector taup_l to the next one taup_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
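
The batched interface takes A as an array of pointers, one per matrix. One possible way to build that array is to carve equally sized blocks out of a single allocation, sketched below with ordinary host pointers. The helper name is hypothetical, and the contiguous layout is only an assumption of this example, since the matrices are not required to be adjacent in memory. In an actual call each pointer would have to refer to GPU memory, and the pointer array itself would presumably also need to be accessible from the GPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: set ptrs[l] to the start of the l-th lda-by-n block
   of one contiguous buffer, so that ptrs can play the role of the
   array-of-pointers argument A. */
void fill_batch_pointers(double *base, int lda, int n, int batch_count,
                         double **ptrs)
{
    assert(lda >= 0 && n >= 0 && batch_count >= 0);
    size_t block = (size_t)lda * (size_t)n; /* elements per matrix */
    for (int l = 0; l < batch_count; ++l)
        ptrs[l] = base + (size_t)l * block;
}
```

The vectors D_l, E_l, tauq_l and taup_l are not passed this way; they remain in single strided buffers addressed through strideD, strideE, strideQ and strideP.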

rocsolver_<type>gebd2_strided_batched()#

rocblas_status rocsolver_zgebd2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tauq, const rocblas_stride strideQ, rocblas_double_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgebd2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tauq, const rocblas_stride strideQ, rocblas_float_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgebd2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tauq, const rocblas_stride strideQ, double *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgebd2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tauq, const rocblas_stride strideQ, float *taup, const rocblas_stride strideP, const rocblas_int batch_count)#

GEBD2_STRIDED_BATCHED computes the bidiagonal form of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

For each instance in the batch, the bidiagonal form is given by:

\[ B_l^{} = Q_l' A_l^{} P_l^{} \]

where \(B_l\) is upper bidiagonal if m >= n and lower bidiagonal if m < n, and \(Q_l\) and \(P_l\) are orthogonal/unitary matrices represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n)\: \text{and} \: P_l = G_l(1)G_l(2)\cdots G_l(n-1), & \: \text{if}\: m >= n, \:\text{or}\\ Q_l = H_l(1)H_l(2)\cdots H_l(m-1)\: \text{and} \: P_l = G_l(1)G_l(2)\cdots G_l(m), & \: \text{if}\: m < n. \end{array} \end{split}\]

Each Householder matrix \(H_l(i)\) and \(G_l(i)\) is given by

\[\begin{split} \begin{array}{cl} H_l^{}(i) = I - \text{tauq}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}', & \: \text{and}\\ G_l^{}(i) = I - \text{taup}_l^{}[i] \cdot u_{l_i}' u_{l_i}^{}. \end{array} \end{split}\]

If m >= n, the first i-1 elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\); while the first i elements of the Householder vector \(u_{l_i}\) are zero, and \(u_{l_i}[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_{l_i}\) are zero, and \(u_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B_l. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_(l_i), and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_(l_i). If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_(l_i), and the elements above the diagonal are the last n - i elements of Householder vector u_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • D[out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of B_l.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= min(m,n).

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of B_l.

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)-1.

  • tauq[out] pointer to type. Array on the GPU (the size depends on the value of strideQ). Contains the vectors tauq_l of Householder scalars associated with matrices Q_l.

  • strideQ[in] rocblas_stride. Stride from the start of one vector tauq_l to the next one tauq_(l+1). There is no restriction for the value of strideQ. Normal use is strideQ >= min(m,n).

  • taup[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors taup_l of Householder scalars associated with matrices P_l.

  • strideP[in] rocblas_stride. Stride from the start of one vector taup_l to the next one taup_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
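
Because the size of each strided array depends on its stride, it can help to compute the smallest element count that covers the whole batch before allocating. The sketch below assumes non-negative strides; strided_buffer_length is a hypothetical helper, not a rocSOLVER function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest number of elements a strided buffer must
   hold so that batch_count items of len elements each, placed stride
   elements apart, all fit. Assumes a non-negative stride. */
size_t strided_buffer_length(long long len, long long stride,
                             long long batch_count)
{
    assert(len >= 0 && stride >= 0 && batch_count >= 0);
    if (len == 0 || batch_count == 0)
        return 0;
    /* The last item starts at (batch_count - 1) * stride. */
    return (size_t)((batch_count - 1) * stride + len);
}
```

For m = 5, n = 3 and batch_count = 4, taking strideD = strideE = 3 would give 12 elements for D and 11 for E, while A with lda = 5 and strideA = 20 would need 3*20 + 15 = 75 elements.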

rocsolver_<type>gebrd()#

rocblas_status rocsolver_zgebrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tauq, rocblas_double_complex *taup)#
rocblas_status rocsolver_cgebrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tauq, rocblas_float_complex *taup)#
rocblas_status rocsolver_dgebrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, double *tauq, double *taup)#
rocblas_status rocsolver_sgebrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, float *tauq, float *taup)#

GEBRD computes the bidiagonal form of a general m-by-n matrix A.

(This is the blocked version of the algorithm).

The bidiagonal form is given by:

\[ B = Q' A P \]

where B is upper bidiagonal if m >= n and lower bidiagonal if m < n, and Q and P are orthogonal/unitary matrices represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q = H(1)H(2)\cdots H(n)\: \text{and} \: P = G(1)G(2)\cdots G(n-1), & \: \text{if}\: m >= n, \:\text{or}\\ Q = H(1)H(2)\cdots H(m-1)\: \text{and} \: P = G(1)G(2)\cdots G(m), & \: \text{if}\: m < n. \end{array} \end{split}\]

Each Householder matrix \(H(i)\) and \(G(i)\) is given by

\[\begin{split} \begin{array}{cl} H(i) = I - \text{tauq}[i] \cdot v_i^{} v_i', & \: \text{and}\\ G(i) = I - \text{taup}[i] \cdot u_i' u_i^{}. \end{array} \end{split}\]

If m >= n, the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\); while the first i elements of the Householder vector \(u_i\) are zero, and \(u_i[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_i\) are zero, and \(u_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix A.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_i, and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_i. If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_i, and the elements above the diagonal are the last n - i elements of Householder vector u_i.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • D[out] pointer to real type. Array on the GPU of dimension min(m,n). The diagonal elements of B.

  • E[out] pointer to real type. Array on the GPU of dimension min(m,n)-1. The off-diagonal elements of B.

  • tauq[out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars associated with matrix Q.

  • taup[out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars associated with matrix P.
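
The Householder vectors are stored only implicitly in A, so the leading zeros and the unit entry have to be restored before a vector can be used explicitly. The sketch below covers the case m >= n for real double-precision data on a host copy of the output. It assumes the LAPACK reference convention, in which v_i occupies column i below the diagonal and u_i occupies row i to the right of the superdiagonal. The names unpack_v and unpack_u are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: rebuild v_i (length m) from column i of a host copy
   of the output matrix A, for the case m >= n. i is 1-based, 1 <= i <= n. */
void unpack_v(const double *A, int lda, int m, int n, int i, double *v)
{
    assert(m >= n && i >= 1 && i <= n && lda >= m);
    for (int r = 1; r <= m; ++r) {
        if (r < i)
            v[r - 1] = 0.0;                                 /* first i-1 zeros */
        else if (r == i)
            v[r - 1] = 1.0;                                 /* v_i[i] = 1 */
        else
            v[r - 1] = A[(size_t)(i - 1) * lda + (r - 1)];  /* A[r,i] */
    }
}

/* Hypothetical helper: rebuild u_i (length n) from row i of a host copy of
   the output matrix A, for the case m >= n. i is 1-based, 1 <= i <= n-1. */
void unpack_u(const double *A, int lda, int m, int n, int i, double *u)
{
    assert(m >= n && i >= 1 && i <= n - 1 && lda >= m);
    for (int c = 1; c <= n; ++c) {
        if (c <= i)
            u[c - 1] = 0.0;                                 /* first i zeros */
        else if (c == i + 1)
            u[c - 1] = 1.0;                                 /* u_i[i+1] = 1 */
        else
            u[c - 1] = A[(size_t)(c - 1) * lda + (i - 1)];  /* A[i,c] */
    }
}
```

The case m < n would follow the same pattern with the roles of the diagonal and the off-diagonal exchanged, as described for parameter A.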

rocsolver_<type>gebrd_batched()#

rocblas_status rocsolver_zgebrd_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tauq, const rocblas_stride strideQ, rocblas_double_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgebrd_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tauq, const rocblas_stride strideQ, rocblas_float_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgebrd_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tauq, const rocblas_stride strideQ, double *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgebrd_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tauq, const rocblas_stride strideQ, float *taup, const rocblas_stride strideP, const rocblas_int batch_count)#

GEBRD_BATCHED computes the bidiagonal form of a batch of general m-by-n matrices.

(This is the blocked version of the algorithm).

For each instance in the batch, the bidiagonal form is given by:

\[ B_l^{} = Q_l' A_l^{} P_l^{} \]

where \(B_l\) is upper bidiagonal if m >= n and lower bidiagonal if m < n, and \(Q_l\) and \(P_l\) are orthogonal/unitary matrices represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n)\: \text{and} \: P_l = G_l(1)G_l(2)\cdots G_l(n-1), & \: \text{if}\: m >= n, \:\text{or}\\ Q_l = H_l(1)H_l(2)\cdots H_l(m-1)\: \text{and} \: P_l = G_l(1)G_l(2)\cdots G_l(m), & \: \text{if}\: m < n. \end{array} \end{split}\]

Each Householder matrix \(H_l(i)\) and \(G_l(i)\) is given by

\[\begin{split} \begin{array}{cl} H_l^{}(i) = I - \text{tauq}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}', & \: \text{and}\\ G_l^{}(i) = I - \text{taup}_l^{}[i] \cdot u_{l_i}' u_{l_i}^{}. \end{array} \end{split}\]

If m >= n, the first i-1 elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\); while the first i elements of the Householder vector \(u_{l_i}\) are zero, and \(u_{l_i}[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_{l_i}\) are zero, and \(u_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B_l. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_(l_i), and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_(l_i). If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_(l_i), and the elements above the diagonal are the last n - i elements of Householder vector u_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • D[out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of B_l.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= min(m,n).

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of B_l.

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)-1.

  • tauq[out] pointer to type. Array on the GPU (the size depends on the value of strideQ). Contains the vectors tauq_l of Householder scalars associated with matrices Q_l.

  • strideQ[in] rocblas_stride. Stride from the start of one vector tauq_l to the next one tauq_(l+1). There is no restriction for the value of strideQ. Normal use is strideQ >= min(m,n).

  • taup[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors taup_l of Householder scalars associated with matrices P_l.

  • strideP[in] rocblas_stride. Stride from the start of one vector taup_l to the next one taup_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>gebrd_strided_batched()#

rocblas_status rocsolver_zgebrd_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tauq, const rocblas_stride strideQ, rocblas_double_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgebrd_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tauq, const rocblas_stride strideQ, rocblas_float_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgebrd_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tauq, const rocblas_stride strideQ, double *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgebrd_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tauq, const rocblas_stride strideQ, float *taup, const rocblas_stride strideP, const rocblas_int batch_count)#

GEBRD_STRIDED_BATCHED computes the bidiagonal form of a batch of general m-by-n matrices.

(This is the blocked version of the algorithm).

For each instance in the batch, the bidiagonal form is given by:

\[ B_l^{} = Q_l' A_l^{} P_l^{} \]

where \(B_l\) is upper bidiagonal if m >= n and lower bidiagonal if m < n, and \(Q_l\) and \(P_l\) are orthogonal/unitary matrices represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n)\: \text{and} \: P_l = G_l(1)G_l(2)\cdots G_l(n-1), & \: \text{if}\: m >= n, \:\text{or}\\ Q_l = H_l(1)H_l(2)\cdots H_l(m-1)\: \text{and} \: P_l = G_l(1)G_l(2)\cdots G_l(m), & \: \text{if}\: m < n. \end{array} \end{split}\]

Each Householder matrix \(H_l(i)\) and \(G_l(i)\) is given by

\[\begin{split} \begin{array}{cl} H_l^{}(i) = I - \text{tauq}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}', & \: \text{and}\\ G_l^{}(i) = I - \text{taup}_l^{}[i] \cdot u_{l_i}' u_{l_i}^{}. \end{array} \end{split}\]

If m >= n, the first i-1 elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\); while the first i elements of the Householder vector \(u_{l_i}\) are zero, and \(u_{l_i}[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_{l_i}\) are zero, and \(u_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B_l. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_(l_i), and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_(l_i). If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_(l_i), and the elements above the diagonal are the last n - i elements of Householder vector u_(l_i).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • D[out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of B_l.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= min(m,n).

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of B_l.

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)-1.

  • tauq[out] pointer to type. Array on the GPU (the size depends on the value of strideQ). Contains the vectors tauq_l of Householder scalars associated with matrices Q_l.

  • strideQ[in] rocblas_stride. Stride from the start of one vector tauq_l to the next one tauq_(l+1). There is no restriction for the value of strideQ. Normal use is strideQ >= min(m,n).

  • taup[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors taup_l of Householder scalars associated with matrices P_l.

  • strideP[in] rocblas_stride. Stride from the start of one vector taup_l to the next one taup_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
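
The matrices \(H_l(i)\) are never formed explicitly; a reflector of the form I - tau * v * v' can be applied to a vector x with one dot product and one scaled update, since (I - tau v v') x = x - tau (v' x) v. The following host-side sketch shows this for the real case. The name apply_reflector is hypothetical, and the code is meant as an illustration of the formula rather than of how the library works internally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: overwrite x with H*x, where H = I - tau * v * v'
   and v, x are real vectors of length n. */
void apply_reflector(int n, double tau, const double *v, double *x)
{
    assert(n >= 0);
    double dot = 0.0;
    for (int i = 0; i < n; ++i)
        dot += v[i] * x[i];          /* dot = v' * x */
    for (int i = 0; i < n; ++i)
        x[i] -= tau * dot * v[i];    /* x = x - tau * (v' * x) * v */
}
```

With v = (1, 1) and tau = 1, the vector x = (3, 4) is mapped to (-4, -3), and applying the same reflector again restores (3, 4).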

rocsolver_<type>sytd2()#

rocblas_status rocsolver_dsytd2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, double *tau)#
rocblas_status rocsolver_ssytd2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, float *tau)#

SYTD2 computes the tridiagonal form of a real symmetric matrix A.

(This is the unblocked version of the algorithm).

The tridiagonal form is given by:

\[ T = Q' A Q \]

where T is symmetric tridiagonal and Q is an orthogonal matrix represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q = H(1)H(2)\cdots H(n-1) & \: \text{if uplo indicates lower, or}\\ Q = H(n-1)H(n-2)\cdots H(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]

Each Householder matrix \(H(i)\) is given by

\[ H(i) = I - \text{tau}[i] \cdot v_i^{} v_i' \]

where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_i stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_i stored as columns.

  • lda[in] rocblas_int. lda >= n. The leading dimension of A.

  • D[out] pointer to type. Array on the GPU of dimension n. The diagonal elements of T.

  • E[out] pointer to type. Array on the GPU of dimension n-1. The off-diagonal elements of T.

  • tau[out] pointer to type. Array on the GPU of dimension n-1. The Householder scalars.
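
As an illustration of the storage scheme described for A, and assuming the LAPACK reference convention carries over, the contents of A on exit for n = 4 would look as follows when uplo indicates upper (left) and lower (right). Here \(d_i\) and \(e_i\) denote D[i] and E[i], \(v_i\) marks a stored element of the Householder vector \(v_i\), and blank entries belong to the part of A that is not used.

\[\begin{split} \begin{array}{cc} \left[\begin{array}{cccc} d_1 & e_1 & v_2 & v_3 \\ & d_2 & e_2 & v_3 \\ & & d_3 & e_3 \\ & & & d_4 \end{array}\right] & \left[\begin{array}{cccc} d_1 & & & \\ e_1 & d_2 & & \\ v_1 & e_2 & d_3 & \\ v_1 & v_2 & e_3 & d_4 \end{array}\right] \end{array} \end{split}\]

In the upper case the first i-1 elements of \(v_i\) sit in column i+1, so \(v_1\) has no stored elements; in the lower case the last n-i-1 elements of \(v_i\) sit in column i, so \(v_3\) has none.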

rocsolver_<type>sytd2_batched()#

rocblas_status rocsolver_dsytd2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_ssytd2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tau, const rocblas_stride strideP, const rocblas_int batch_count)#

SYTD2_BATCHED computes the tridiagonal form of a batch of real symmetric matrices A_l.

(This is the unblocked version of the algorithm).

The tridiagonal form of \(A_l\) is given by:

\[ T_l^{} = Q_l' A_l^{} Q_l^{} \]

where \(T_l\) is symmetric tridiagonal and \(Q_l\) is an orthogonal matrix represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n-1) & \: \text{if uplo indicates lower, or}\\ Q_l = H_l(n-1)H_l(n-2)\cdots H_l(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{tau}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]

where \(\text{tau}_l[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrices A_l.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_l; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(l_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_l; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(l_i) stored as columns.

  • lda[in] rocblas_int. lda >= n. The leading dimension of A_l.

  • D[out] pointer to type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of T_l.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.

  • E[out] pointer to type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of T_l.

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.

  • tau[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors tau_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector tau_l to the next one tau_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n-1.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
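
The batched interface takes A as an array of pointers, one per matrix. The sketch below shows one way to fill such an array when the matrices happen to sit back to back in a single buffer. The helper name `fill_batch_pointers` is hypothetical and the code runs purely on the host. In an actual call the entries would be GPU addresses, and the pointer array itself is typically expected to be accessible from the GPU as well; check the rocSOLVER memory model documentation before relying on this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side helper, not part of rocSOLVER.
 * Sets ptrs[l] to the start of the l-th lda-by-n matrix inside one
 * contiguous buffer, producing the array-of-pointers shape that the
 * batched interface takes for A (l is a 0-based batch index here). */
void fill_batch_pointers(double *base, int lda, int n, int batch_count,
                         double **ptrs)
{
    assert(n >= 0 && lda >= n && batch_count >= 0);
    for (int l = 0; l < batch_count; ++l)
        ptrs[l] = base + (size_t)l * (size_t)lda * (size_t)n;
}
```

Nothing forces the matrices to be contiguous: each entry of the pointer array may refer to an independent allocation, which is the main difference from the strided batched interface.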

rocsolver_<type>sytd2_strided_batched()#

rocblas_status rocsolver_dsytd2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_ssytd2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tau, const rocblas_stride strideP, const rocblas_int batch_count)#

SYTD2_STRIDED_BATCHED computes the tridiagonal form of a batch of real symmetric matrices A_l.

(This is the unblocked version of the algorithm).

The tridiagonal form of \(A_l\) is given by:

\[ T_l^{} = Q_l' A_l^{} Q_l^{} \]

where \(T_l\) is symmetric tridiagonal and \(Q_l\) is an orthogonal matrix represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n-1) & \: \text{if uplo indicates lower, or}\\ Q_l = H_l(n-1)H_l(n-2)\cdots H_l(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{tau}_l[i] \cdot v_{l_i}^{} v_{l_i}' \]

where \(\text{tau}_l[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrices A_l.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_l; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(l_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_l; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(l_i) stored as columns.

  • lda[in] rocblas_int. lda >= n. The leading dimension of A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • D[out] pointer to type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of T_l.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.

  • E[out] pointer to type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of T_l.

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.

  • tau[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors tau_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector tau_l to the next one tau_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n-1.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
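
In the strided batched layout, every matrix lives in one array and is located through lda and strideA. As a sketch, assuming column-major storage (the LAPACK convention) and a 0-based batch index, the offset of an element can be computed as follows. The helper name `strided_elem_offset` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of rocSOLVER.
 * Element offset of A_l[i,j] inside the single array A, with 1-based
 * i and j (as in the text), 0-based batch index l, and column-major
 * storage with leading dimension lda. */
int64_t strided_elem_offset(int64_t i, int64_t j, int64_t l,
                            int64_t lda, int64_t strideA)
{
    assert(i >= 1 && j >= 1 && i <= lda && l >= 0);
    return l * strideA + (j - 1) * lda + (i - 1);
}
```

For example, with lda = 4 and strideA = 16, the element A_l[2,3] of the first matrix is at offset 9, and the first element of the second matrix is at offset 16.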

rocsolver_<type>hetd2()#

rocblas_status rocsolver_zhetd2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tau)#
rocblas_status rocsolver_chetd2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tau)#

HETD2 computes the tridiagonal form of a complex Hermitian matrix A.

(This is the unblocked version of the algorithm).

The tridiagonal form is given by:

\[ T = Q' A Q \]

where T is Hermitian tridiagonal and Q is a unitary matrix represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q = H(1)H(2)\cdots H(n-1) & \: \text{if uplo indicates lower, or}\\ Q = H(n-1)H(n-2)\cdots H(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]

Each Householder matrix \(H(i)\) is given by

\[ H(i) = I - \text{tau}[i] \cdot v_i^{} v_i' \]

where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the Hermitian matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_i stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_i stored as columns.

  • lda[in] rocblas_int. lda >= n. The leading dimension of A.

  • D[out] pointer to real type. Array on the GPU of dimension n. The diagonal elements of T.

  • E[out] pointer to real type. Array on the GPU of dimension n-1. The off-diagonal elements of T.

  • tau[out] pointer to type. Array on the GPU of dimension n-1. The Householder scalars.
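
In the complex case, v' in the definition of H(i) denotes the conjugate transpose of v. The host-side sketch below applies such a reflector to a vector using C99 complex arithmetic. The helper name `apply_householder` is hypothetical, and `double complex` is only a stand-in for `rocblas_double_complex`.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical host-side helper, not part of rocSOLVER.
 * Overwrites x with H*x, where H = I - tau * v * v^H and v^H is the
 * conjugate transpose of v. Both vectors have length n. */
void apply_householder(int n, double complex tau,
                       const double complex *v, double complex *x)
{
    assert(n >= 0);
    /* s = v^H x */
    double complex s = 0.0;
    for (int k = 0; k < n; ++k)
        s += conj(v[k]) * x[k];
    /* x := x - tau * s * v */
    for (int k = 0; k < n; ++k)
        x[k] -= tau * s * v[k];
}
```

Applying H costs one inner product and one vector update, so Q never has to be formed explicitly in order to multiply by it.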

rocsolver_<type>hetd2_batched()#

rocblas_status rocsolver_zhetd2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_chetd2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#

HETD2_BATCHED computes the tridiagonal form of a batch of complex Hermitian matrices A_l.

(This is the unblocked version of the algorithm).

The tridiagonal form of \(A_l\) is given by:

\[ T_l^{} = Q_l' A_l^{} Q_l^{} \]

where \(T_l\) is Hermitian tridiagonal and \(Q_l\) is a unitary matrix represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n-1) & \: \text{if uplo indicates lower, or}\\ Q_l = H_l(n-1)H_l(n-2)\cdots H_l(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{tau}_l[i] \cdot v_{l_i}^{} v_{l_i}' \]

where \(\text{tau}_l[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the Hermitian matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrices A_l.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_l; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(l_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_l; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(l_i) stored as columns.

  • lda[in] rocblas_int. lda >= n. The leading dimension of A_l.

  • D[out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of T_l.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of T_l.

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.

  • tau[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors tau_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector tau_l to the next one tau_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n-1.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
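
The sizes of D, E and tau depend on their strides, and in the Hermitian routines D and E hold real values while tau holds complex ones. The sketch below estimates the byte counts for the double precision variant in the normal use case where each stride is at least the vector length. All names are hypothetical, and `sizeof(double complex)` is used as a stand-in for the size of `rocblas_double_complex`, which is an assumption.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of rocSOLVER.
 * Smallest element count that holds batch_count vectors of length len
 * placed stride elements apart, assuming stride >= len. */
size_t batched_vector_elems(int64_t len, int64_t stride, int64_t batch_count)
{
    assert(len >= 0 && stride >= len && batch_count >= 0);
    if (batch_count == 0 || len == 0)
        return 0;
    return (size_t)((batch_count - 1) * stride + len);
}

typedef struct {
    size_t d_bytes;   /* real diagonal entries */
    size_t e_bytes;   /* real off-diagonal entries */
    size_t tau_bytes; /* complex Householder scalars */
} hetd2_batch_bytes;

/* Byte counts for the three output arrays of a double precision batch. */
hetd2_batch_bytes hetd2_batched_output_bytes(int64_t n, int64_t strideD,
                                             int64_t strideE, int64_t strideP,
                                             int64_t batch_count)
{
    int64_t m = n > 0 ? n - 1 : 0; /* length of each E_l and tau_l */
    hetd2_batch_bytes b;
    b.d_bytes = batched_vector_elems(n, strideD, batch_count) * sizeof(double);
    b.e_bytes = batched_vector_elems(m, strideE, batch_count) * sizeof(double);
    b.tau_bytes = batched_vector_elems(m, strideP, batch_count) * sizeof(double complex);
    return b;
}
```

The last vector in each array does not need a full stride of space, which is why the element count is (batch_count - 1) * stride + len and not batch_count * stride.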

rocsolver_<type>hetd2_strided_batched()#

rocblas_status rocsolver_zhetd2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_chetd2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#

HETD2_STRIDED_BATCHED computes the tridiagonal form of a batch of complex Hermitian matrices A_l.

(This is the unblocked version of the algorithm).

The tridiagonal form of \(A_l\) is given by:

\[ T_l^{} = Q_l' A_l^{} Q_l^{} \]

where \(T_l\) is Hermitian tridiagonal and \(Q_l\) is a unitary matrix represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n-1) & \: \text{if uplo indicates lower, or}\\ Q_l = H_l(n-1)H_l(n-2)\cdots H_l(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{tau}_l[i] \cdot v_{l_i}^{} v_{l_i}' \]

where \(\text{tau}_l[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the Hermitian matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrices A_l.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_l; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(l_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_l; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(l_i) stored as columns.

  • lda[in] rocblas_int. lda >= n. The leading dimension of A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • D[out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of T_l.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of T_l.

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.

  • tau[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors tau_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector tau_l to the next one tau_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n-1.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
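
As an illustration of the output layout described for A, consider a single matrix of the batch with n = 4 (the batch index l is dropped for readability). Assuming the LAPACK storage convention, in which the stored part of \(v_i\) occupies column i when uplo indicates lower and column i+1 when uplo indicates upper, the contents of the matrix on exit can be pictured as follows, where \(d_i\) and \(e_i\) are the entries of D and E and a dot marks an element that is not used:

\[ \text{lower:} \quad \begin{pmatrix} d_1 & \cdot & \cdot & \cdot \\ e_1 & d_2 & \cdot & \cdot \\ v_1[3] & e_2 & d_3 & \cdot \\ v_1[4] & v_2[4] & e_3 & d_4 \end{pmatrix} \qquad \text{upper:} \quad \begin{pmatrix} d_1 & e_1 & v_2[1] & v_3[1] \\ \cdot & d_2 & e_2 & v_3[2] \\ \cdot & \cdot & d_3 & e_3 \\ \cdot & \cdot & \cdot & d_4 \end{pmatrix} \]

The implicit entries (the leading or trailing zeros and the unit element of each \(v_i\)) are not stored, which is why only n-i-1 elements appear for lower and i-1 elements for upper.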

rocsolver_<type>sytrd()#

rocblas_status rocsolver_dsytrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, double *tau)#
rocblas_status rocsolver_ssytrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, float *tau)#

SYTRD computes the tridiagonal form of a real symmetric matrix A.

(This is the blocked version of the algorithm).

The tridiagonal form is given by:

\[ T = Q' A Q \]

where T is symmetric tridiagonal and Q is an orthogonal matrix represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q = H(1)H(2)\cdots H(n-1) & \: \text{if uplo indicates lower, or}\\ Q = H(n-1)H(n-2)\cdots H(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]

Each Householder matrix \(H(i)\) is given by

\[ H(i) = I - \text{tau}[i] \cdot v_i^{} v_i' \]

where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_i stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_i stored as columns.

  • lda[in] rocblas_int. lda >= n. The leading dimension of A.

  • D[out] pointer to type. Array on the GPU of dimension n. The diagonal elements of T.

  • E[out] pointer to type. Array on the GPU of dimension n-1. The off-diagonal elements of T.

  • tau[out] pointer to type. Array on the GPU of dimension n-1. The Householder scalars.
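
After the factored matrix has been copied back to the host, a full-length Householder vector can be rebuilt from the stored entries plus the implicit zeros and the unit element. The sketch below assumes the LAPACK storage convention (the stored part of v_i sits in column i for lower and in column i+1 for upper). The helper name `expand_householder` is hypothetical, and the integer flag `upper` is a stand-in for a `rocblas_fill` value.

```c
#include <assert.h>

/* Hypothetical host-side helper, not part of rocSOLVER.
 * Expands Householder vector v_i (1-based i, 1 <= i <= n-1) into the
 * length-n array v, reading the factored column-major matrix A.
 * Assumes the LAPACK storage convention: for lower, the tail of v_i is in
 * column i below the subdiagonal; for upper, the head of v_i is in
 * column i+1 above the superdiagonal. */
void expand_householder(int upper, int n, const double *A, int lda,
                        int i, double *v)
{
    assert(n >= 2 && lda >= n && i >= 1 && i <= n - 1);
    for (int r = 0; r < n; ++r)
        v[r] = 0.0;
    if (upper) {
        /* v_i[i] = 1 and v_i[1..i-1] is read from A[1..i-1, i+1]. */
        v[i - 1] = 1.0;
        for (int r = 1; r <= i - 1; ++r)
            v[r - 1] = A[(r - 1) + i * lda];
    } else {
        /* v_i[i+1] = 1 and v_i[i+2..n] is read from A[i+2..n, i]. */
        v[i] = 1.0;
        for (int r = i + 2; r <= n; ++r)
            v[r - 1] = A[(r - 1) + (i - 1) * lda];
    }
}
```

Together with tau[i], the expanded vector fully determines H(i), so Q can be applied or accumulated one reflector at a time.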

rocsolver_<type>sytrd_batched()#

rocblas_status rocsolver_dsytrd_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_ssytrd_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tau, const rocblas_stride strideP, const rocblas_int batch_count)#

SYTRD_BATCHED computes the tridiagonal form of a batch of real symmetric matrices A_l.

(This is the blocked version of the algorithm).

The tridiagonal form of \(A_l\) is given by:

\[ T_l^{} = Q_l' A_l^{} Q_l^{} \]

where \(T_l\) is symmetric tridiagonal and \(Q_l\) is an orthogonal matrix represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n-1) & \: \text{if uplo indicates lower, or}\\ Q_l = H_l(n-1)H_l(n-2)\cdots H_l(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{tau}_l[i] \cdot v_{l_i}^{} v_{l_i}' \]

where \(\text{tau}_l[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrices A_l.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_l; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(l_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_l; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(l_i) stored as columns.

  • lda[in] rocblas_int. lda >= n. The leading dimension of A_l.

  • D[out] pointer to type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of T_l.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.

  • E[out] pointer to type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of T_l.

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.

  • tau[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors tau_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector tau_l to the next one tau_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n-1.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
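
The arrays D and E describe each T_l compactly. When a dense copy is needed on the host, for instance to compare against a reference result, it can be assembled as in the sketch below. The helper name `build_tridiagonal` is hypothetical; for matrix l of a batch, the caller would pass D + l*strideD and E + l*strideE (with a 0-based l).

```c
#include <assert.h>

/* Hypothetical host-side helper, not part of rocSOLVER.
 * Writes the dense n-by-n symmetric tridiagonal matrix T (column-major,
 * leading dimension ldt) defined by the diagonal D[0..n-1] and the
 * off-diagonal E[0..n-2]. */
void build_tridiagonal(int n, const float *D, const float *E,
                       float *T, int ldt)
{
    assert(n >= 0 && ldt >= n);
    for (int c = 0; c < n; ++c)
        for (int r = 0; r < n; ++r)
            T[r + c * ldt] = 0.0f;
    for (int k = 0; k < n; ++k)
        T[k + k * ldt] = D[k];
    for (int k = 0; k + 1 < n; ++k) {
        T[(k + 1) + k * ldt] = E[k]; /* subdiagonal */
        T[k + (k + 1) * ldt] = E[k]; /* superdiagonal */
    }
}
```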

rocsolver_<type>sytrd_strided_batched()#

rocblas_status rocsolver_dsytrd_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_ssytrd_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tau, const rocblas_stride strideP, const rocblas_int batch_count)#

SYTRD_STRIDED_BATCHED computes the tridiagonal form of a batch of real symmetric matrices A_l.

(This is the blocked version of the algorithm).

The tridiagonal form of \(A_l\) is given by:

\[ T_l^{} = Q_l' A_l^{} Q_l^{} \]

where \(T_l\) is symmetric tridiagonal and \(Q_l\) is an orthogonal matrix represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n-1) & \: \text{if uplo indicates lower, or}\\ Q_l = H_l(n-1)H_l(n-2)\cdots H_l(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{tau}_l[i] \cdot v_{l_i}^{} v_{l_i}' \]

where \(\text{tau}_l[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrices A_l.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_l; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(l_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_l; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(l_i) stored as columns.

  • lda[in] rocblas_int. lda >= n. The leading dimension of A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • D[out] pointer to type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of T_l.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.

  • E[out] pointer to type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of T_l.

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.

  • tau[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors tau_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector tau_l to the next one tau_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n-1.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>hetrd()#

rocblas_status rocsolver_zhetrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tau)#
rocblas_status rocsolver_chetrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tau)#

HETRD computes the tridiagonal form of a complex Hermitian matrix A.

(This is the blocked version of the algorithm).

The tridiagonal form is given by:

\[ T = Q' A Q \]

where T is Hermitian tridiagonal and Q is a unitary matrix represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q = H(1)H(2)\cdots H(n-1) & \: \text{if uplo indicates lower, or}\\ Q = H(n-1)H(n-2)\cdots H(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]

Each Householder matrix \(H(i)\) is given by

\[ H(i) = I - \text{tau}[i] \cdot v_i^{} v_i' \]

where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the Hermitian matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_i stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_i stored as columns.

  • lda[in] rocblas_int. lda >= n. The leading dimension of A.

  • D[out] pointer to real type. Array on the GPU of dimension n. The diagonal elements of T.

  • E[out] pointer to real type. Array on the GPU of dimension n-1. The off-diagonal elements of T.

  • tau[out] pointer to type. Array on the GPU of dimension n-1. The Householder scalars.
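
Since D and E are both arrays of real type, the Hermitian tridiagonal matrix T returned for a complex input is in fact real symmetric. Writing \(d_i = \text{D}[i]\) and \(e_i = \text{E}[i]\), it has the form

\[ T = \begin{pmatrix} d_1 & e_1 & & & \\ e_1 & d_2 & e_2 & & \\ & e_2 & \ddots & \ddots & \\ & & \ddots & d_{n-1} & e_{n-1} \\ & & & e_{n-1} & d_n \end{pmatrix}, \qquad d_i, e_i \in \mathbb{R}. \]

Only tau and the stored parts of the Householder vectors remain complex.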

rocsolver_<type>hetrd_batched()#

rocblas_status rocsolver_zhetrd_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_chetrd_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#

HETRD_BATCHED computes the tridiagonal form of a batch of complex Hermitian matrices A_l.

(This is the blocked version of the algorithm).

The tridiagonal form of \(A_l\) is given by:

\[ T_l^{} = Q_l' A_l^{} Q_l^{} \]

where \(T_l\) is Hermitian tridiagonal and \(Q_l\) is a unitary matrix represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n-1) & \: \text{if uplo indicates lower, or}\\ Q_l = H_l(n-1)H_l(n-2)\cdots H_l(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{tau}_l[i] \cdot v_{l_i}^{} v_{l_i}' \]

where \(\text{tau}_l[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the Hermitian matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrices A_l.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_l; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(l_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_l; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(l_i) stored as columns.

  • lda[in] rocblas_int. lda >= n. The leading dimension of A_l.

  • D[out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of T_l.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of T_l.

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.

  • tau[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors tau_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector tau_l to the next one tau_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n-1.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
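The outputs D and E store only the diagonal and off-diagonal entries of each T_l. As a minimal host-side sketch, assuming D and E have already been copied back from the GPU, the dense tridiagonal matrix of batch instance l could be rebuilt as follows. The helper name `assemble_tridiagonal` is hypothetical and not part of rocSOLVER, and the code uses 0-based indices, unlike the 1-based notation of the text.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of rocSOLVER: rebuilds the dense n-by-n
// tridiagonal matrix T_l (column-major, leading dimension n) from host
// copies of the D and E outputs. Indices are 0-based here.
std::vector<double> assemble_tridiagonal(const std::vector<double>& D, std::ptrdiff_t strideD,
                                         const std::vector<double>& E, std::ptrdiff_t strideE,
                                         int n, int l)
{
    std::vector<double> T(static_cast<std::size_t>(n) * n, 0.0);
    const double* d = D.data() + l * strideD; // start of D_l
    const double* e = E.data() + l * strideE; // start of E_l
    for (int i = 0; i < n; ++i)
        T[i + i * n] = d[i]; // diagonal
    for (int i = 0; i + 1 < n; ++i) {
        T[(i + 1) + i * n] = e[i]; // subdiagonal
        T[i + (i + 1) * n] = e[i]; // superdiagonal (E is real, so T_l is symmetric)
    }
    return T;
}
```

Because D and E are real arrays, the rebuilt T_l is real symmetric even though the input matrices are complex.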

rocsolver_<type>hetrd_strided_batched()#

rocblas_status rocsolver_zhetrd_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_chetrd_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#

HETRD_STRIDED_BATCHED computes the tridiagonal form of a batch of complex Hermitian matrices A_l.

(This is the blocked version of the algorithm).

The tridiagonal form of \(A_l\) is given by:

\[ T_l^{} = Q_l' A_l^{} Q_l^{} \]

where \(T_l\) is Hermitian tridiagonal and \(Q_l\) is a unitary matrix represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n-1) & \: \text{if uplo indicates lower, or}\\ Q_l = H_l(n-1)H_l(n-2)\cdots H_l(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]

Each Householder matrix \(H_l(i)\) is given by

\[ H_l^{}(i) = I - \text{tau}_l[i] \cdot v_{l_i}^{} v_{l_i}' \]

where \(\text{tau}_l[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the Hermitian matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrices A_l.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_l; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(l_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_l; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(l_i) stored as columns.

  • lda[in] rocblas_int. lda >= n. The leading dimension of A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • D[out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of T_l.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of T_l.

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.

  • tau[out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors tau_l of corresponding Householder scalars.

  • strideP[in] rocblas_stride. Stride from the start of one vector tau_l to the next one tau_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n-1.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
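Several parameters above note that the array size depends on the stride. The sketch below makes that concrete for the normal use case, where every stride is at least as large as one instance. The helpers `strided_extent` and `hetrd_strided_sizes` are hypothetical names, and the assumption that each E_l and tau_l holds n-1 entries is inferred from the stride recommendations above.

```cpp
#include <cstddef>

// Hypothetical helper, not part of rocSOLVER: number of elements needed so
// that batch_count instances, each occupying `extent` elements and spaced
// `stride` elements apart, fit in one buffer. Assumes stride >= extent.
std::size_t strided_extent(std::size_t stride, std::size_t extent, std::size_t batch_count)
{
    if (batch_count == 0 || extent == 0)
        return 0;
    return stride * (batch_count - 1) + extent;
}

// Element counts for the four arrays of a HETRD_STRIDED_BATCHED call.
struct HetrdSizes {
    std::size_t A, D, E, tau;
};

HetrdSizes hetrd_strided_sizes(std::size_t n, std::size_t lda, std::size_t strideA,
                               std::size_t strideD, std::size_t strideE, std::size_t strideP,
                               std::size_t batch_count)
{
    const std::size_t offdiag = n > 0 ? n - 1 : 0; // assumed length of E_l and tau_l
    return {strided_extent(strideA, lda * n, batch_count),
            strided_extent(strideD, n, batch_count),
            strided_extent(strideE, offdiag, batch_count),
            strided_extent(strideP, offdiag, batch_count)};
}
```

Only the last instance is counted at its true extent, so padding between instances does not add trailing elements to the buffer.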

rocsolver_<type>sygs2()#

rocblas_status rocsolver_dsygs2(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *B, const rocblas_int ldb)#
rocblas_status rocsolver_ssygs2(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *B, const rocblas_int ldb)#

SYGS2 reduces a real symmetric-definite generalized eigenproblem to standard form.

(This is the unblocked version of the algorithm).

The problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A X = \lambda B X & \: \text{1st form,}\\ A B X = \lambda X & \: \text{2nd form, or}\\ B A X = \lambda X & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype.

If the problem is of the 1st form, then A is overwritten with

\[\begin{split} \begin{array}{cl} U^{-T} A U^{-1}, & \: \text{or}\\ L^{-1} A L^{-T}, \end{array} \end{split}\]

where the symmetric-definite matrix B has been factorized as either \(U^T U\) or \(L L^T\) as returned by POTRF, depending on the value of uplo.

If the problem is of the 2nd or 3rd form, then A is overwritten with

\[\begin{split} \begin{array}{cl} U A U^T, & \: \text{or}\\ L^T A L, \end{array} \end{split}\]

also depending on the value of uplo.
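As a brief derivation of why these overwritten matrices define an equivalent standard problem, take the lower case \(B = L L^T\) and write the reduced problem as \(C Y = \lambda Y\). The symbols C and Y are introduced here for illustration only and do not appear in the API.

\[\begin{split} \begin{array}{ll} A X = \lambda L L^T X \;\Rightarrow\; (L^{-1} A L^{-T})(L^T X) = \lambda (L^T X), & \: C = L^{-1} A L^{-T},\ Y = L^T X \quad \text{(1st form)}\\ A L L^T X = \lambda X \;\Rightarrow\; (L^T A L)(L^T X) = \lambda (L^T X), & \: C = L^T A L,\ Y = L^T X \quad \text{(2nd form)}\\ L L^T A X = \lambda X \;\Rightarrow\; (L^T A L)(L^{-1} X) = \lambda (L^{-1} X), & \: C = L^T A L,\ Y = L^{-1} X \quad \text{(3rd form)} \end{array} \end{split}\]

The upper case \(B = U^T U\) follows by replacing \(L\) with \(U^T\). In every case the eigenvalues \(\lambda\) are unchanged, and the original eigenvectors X can be recovered from Y with a triangular solve (1st and 2nd forms) or a triangular product (3rd form) using the factor of B.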

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblem.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the matrix A is stored, and whether the factorization applied to B was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) parts of A and B are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A. On exit, the transformed matrix associated with the equivalent standard eigenvalue problem.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A.

  • B[in] pointer to type. Array on the GPU of dimension ldb*n. The triangular factor of the matrix B, as returned by POTRF.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B.
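For the 1st form with a lower factor, the overwritten matrix is \(L^{-1} A L^{-T}\). The following host-side reference is a minimal sketch of that formula using two triangular substitutions. It is not rocSOLVER code: the function name `reduce_form1_lower` is hypothetical, column-major storage with leading dimension n is assumed, and it works on full matrices rather than on a single triangle.

```cpp
#include <vector>

// Host-side reference sketch (not rocSOLVER code): computes
// C = L^{-1} A L^{-T} for dense column-major n-by-n matrices, where L is the
// lower Cholesky factor of B. Unlike SYGS2, it reads and writes the full
// matrix rather than one triangle.
std::vector<double> reduce_form1_lower(const std::vector<double>& A,
                                       const std::vector<double>& L, int n)
{
    std::vector<double> C(A);
    // Step 1: C <- L^{-1} C, forward substitution on every column.
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < n; ++i) {
            double s = C[i + j * n];
            for (int k = 0; k < i; ++k)
                s -= L[i + k * n] * C[k + j * n];
            C[i + j * n] = s / L[i + i * n];
        }
    // Step 2: C <- C L^{-T}, the same substitution applied to every row.
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            double s = C[i + j * n];
            for (int k = 0; k < j; ++k)
                s -= C[i + k * n] * L[j + k * n];
            C[i + j * n] = s / L[j + j * n];
        }
    return C;
}
```

For example, with A = [[4, 2], [2, 3]] and L = [[2, 0], [1, 1]] (so that B = L L^T = [[4, 2], [2, 2]]), the result is the diagonal matrix with entries 1 and 2, which are the generalized eigenvalues of the pair (A, B).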

rocsolver_<type>sygs2_batched()#

rocblas_status rocsolver_dsygs2_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, double *const B[], const rocblas_int ldb, const rocblas_int batch_count)#
rocblas_status rocsolver_ssygs2_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, float *const B[], const rocblas_int ldb, const rocblas_int batch_count)#

SYGS2_BATCHED reduces a batch of real symmetric-definite generalized eigenproblems to standard form.

(This is the unblocked version of the algorithm).

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype.

If the problem is of the 1st form, then \(A_l\) is overwritten with

\[\begin{split} \begin{array}{cl} U_l^{-T} A_l^{} U_l^{-1}, & \: \text{or}\\ L_l^{-1} A_l^{} L_l^{-T}, \end{array} \end{split}\]

where the symmetric-definite matrix \(B_l\) has been factorized as either \(U_l^T U_l^{}\) or \(L_l^{} L_l^T\) as returned by POTRF, depending on the value of uplo.

If the problem is of the 2nd or 3rd form, then \(A_l\) is overwritten with

\[\begin{split} \begin{array}{cl} U_l^{} A_l^{} U_l^T, & \: \text{or}\\ L_l^T A_l^{} L_l^{}, \end{array} \end{split}\]

also depending on the value of uplo.

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of each matrix A_l is stored, and whether the factorization applied to B_l was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l. On exit, the transformed matrices associated with the equivalent standard eigenvalue problems.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • B[in] array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*n. The triangular factors of the matrices B_l, as returned by POTRF_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
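The batched interface takes A and B as arrays of pointers, one pointer per matrix. The sketch below shows, on the host only, how such a pointer array could be laid over one contiguous allocation. The helper name `make_batch_pointers` is hypothetical; in a real call each pointer must refer to GPU memory, as stated in the parameter descriptions above.

```cpp
#include <cstddef>
#include <vector>

// Host-side illustration only: builds the "array of pointers" view over one
// contiguous allocation that holds batch_count matrices of lda*n elements.
// In a real call the pointers must refer to GPU memory.
std::vector<double*> make_batch_pointers(double* base, int n, int lda, int batch_count)
{
    std::vector<double*> ptrs(static_cast<std::size_t>(batch_count));
    const std::size_t per_matrix = static_cast<std::size_t>(lda) * n;
    for (int l = 0; l < batch_count; ++l)
        ptrs[l] = base + l * per_matrix; // start of matrix l
    return ptrs;
}
```

The pointers do not have to come from a single allocation; packing them this way simply mirrors the layout of the strided-batched variant with a stride of lda*n.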

rocsolver_<type>sygs2_strided_batched()#

rocblas_status rocsolver_dsygs2_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#
rocblas_status rocsolver_ssygs2_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#

SYGS2_STRIDED_BATCHED reduces a batch of real symmetric-definite generalized eigenproblems to standard form.

(This is the unblocked version of the algorithm).

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype.

If the problem is of the 1st form, then \(A_l\) is overwritten with

\[\begin{split} \begin{array}{cl} U_l^{-T} A_l^{} U_l^{-1}, & \: \text{or}\\ L_l^{-1} A_l^{} L_l^{-T}, \end{array} \end{split}\]

where the symmetric-definite matrix \(B_l\) has been factorized as either \(U_l^T U_l^{}\) or \(L_l^{} L_l^T\) as returned by POTRF, depending on the value of uplo.

If the problem is of the 2nd or 3rd form, then \(A_l\) is overwritten with

\[\begin{split} \begin{array}{cl} U_l^{} A_l^{} U_l^T, & \: \text{or}\\ L_l^T A_l^{} L_l^{}, \end{array} \end{split}\]

also depending on the value of uplo.

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of each matrix A_l is stored, and whether the factorization applied to B_l was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l. On exit, the transformed matrices associated with the equivalent standard eigenvalue problems.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • B[in] pointer to type. Array on the GPU (the size depends on the value of strideB). The triangular factors of the matrices B_l, as returned by POTRF_STRIDED_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • strideB[in] rocblas_stride. Stride from the start of one matrix B_l to the next one B_(l+1). There is no restriction for the value of strideB. Normal use case is strideB >= ldb*n.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>hegs2()#

rocblas_status rocsolver_zhegs2(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *B, const rocblas_int ldb)#
rocblas_status rocsolver_chegs2(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *B, const rocblas_int ldb)#

HEGS2 reduces a Hermitian-definite generalized eigenproblem to standard form.

(This is the unblocked version of the algorithm).

The problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A X = \lambda B X & \: \text{1st form,}\\ A B X = \lambda X & \: \text{2nd form, or}\\ B A X = \lambda X & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype.

If the problem is of the 1st form, then A is overwritten with

\[\begin{split} \begin{array}{cl} U^{-H} A U^{-1}, & \: \text{or}\\ L^{-1} A L^{-H}, \end{array} \end{split}\]

where the Hermitian-definite matrix B has been factorized as either \(U^H U\) or \(L L^H\) as returned by POTRF, depending on the value of uplo.

If the problem is of the 2nd or 3rd form, then A is overwritten with

\[\begin{split} \begin{array}{cl} U A U^H, & \: \text{or}\\ L^H A L, \end{array} \end{split}\]

also depending on the value of uplo.

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblem.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the matrix A is stored, and whether the factorization applied to B was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) parts of A and B are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A. On exit, the transformed matrix associated with the equivalent standard eigenvalue problem.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A.

  • B[in] pointer to type. Array on the GPU of dimension ldb*n. The triangular factor of the matrix B, as returned by POTRF.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B.
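For the 2nd and 3rd forms with a lower factor, the overwritten matrix is \(L^H A L\). The host-side sketch below evaluates that product with plain loops over `std::complex` values. It is an illustration under stated assumptions, not rocSOLVER code: the name `reduce_form23_lower` is hypothetical, storage is assumed column-major with leading dimension n, and full matrices are used instead of one triangle.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Host-side reference sketch (not rocSOLVER code): computes C = L^H A L for
// dense column-major n-by-n matrices, i.e. the lower-triangular case of the
// 2nd and 3rd forms. It uses the full matrices instead of one triangle.
std::vector<cplx> reduce_form23_lower(const std::vector<cplx>& A,
                                      const std::vector<cplx>& L, int n)
{
    const std::size_t total = static_cast<std::size_t>(n) * n;
    std::vector<cplx> W(total), C(total);
    // W = A L
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < n; ++i)
            for (int k = 0; k < n; ++k)
                W[i + j * n] += A[i + k * n] * L[k + j * n];
    // C = L^H W, where (L^H)[i,k] = conj(L[k,i])
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < n; ++i)
            for (int k = 0; k < n; ++k)
                C[i + j * n] += std::conj(L[k + i * n]) * W[k + j * n];
    return C;
}
```

With A = [[2, i], [-i, 3]] and L = [[1, 0], [i, 2]], this yields [[3, -4i], [4i, 12]], which is again Hermitian, as the reduction requires.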

rocsolver_<type>hegs2_batched()#

rocblas_status rocsolver_zhegs2_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *const B[], const rocblas_int ldb, const rocblas_int batch_count)#
rocblas_status rocsolver_chegs2_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *const B[], const rocblas_int ldb, const rocblas_int batch_count)#

HEGS2_BATCHED reduces a batch of Hermitian-definite generalized eigenproblems to standard form.

(This is the unblocked version of the algorithm).

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype.

If the problem is of the 1st form, then \(A_l\) is overwritten with

\[\begin{split} \begin{array}{cl} U_l^{-H} A_l^{} U_l^{-1}, & \: \text{or}\\ L_l^{-1} A_l^{} L_l^{-H}, \end{array} \end{split}\]

where the Hermitian-definite matrix \(B_l\) has been factorized as either \(U_l^H U_l^{}\) or \(L_l^{} L_l^H\) as returned by POTRF, depending on the value of uplo.

If the problem is of the 2nd or 3rd form, then \(A_l\) is overwritten with

\[\begin{split} \begin{array}{cl} U_l^{} A_l^{} U_l^H, & \: \text{or}\\ L_l^H A_l^{} L_l^{}, \end{array} \end{split}\]

also depending on the value of uplo.

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of each matrix A_l is stored, and whether the factorization applied to B_l was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l. On exit, the transformed matrices associated with the equivalent standard eigenvalue problems.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • B[in] array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*n. The triangular factors of the matrices B_l, as returned by POTRF_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>hegs2_strided_batched()#

rocblas_status rocsolver_zhegs2_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#
rocblas_status rocsolver_chegs2_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#

HEGS2_STRIDED_BATCHED reduces a batch of Hermitian-definite generalized eigenproblems to standard form.

(This is the unblocked version of the algorithm).

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype.

If the problem is of the 1st form, then \(A_l\) is overwritten with

\[\begin{split} \begin{array}{cl} U_l^{-H} A_l^{} U_l^{-1}, & \: \text{or}\\ L_l^{-1} A_l^{} L_l^{-H}, \end{array} \end{split}\]

where the Hermitian-definite matrix \(B_l\) has been factorized as either \(U_l^H U_l^{}\) or \(L_l^{} L_l^H\) as returned by POTRF, depending on the value of uplo.

If the problem is of the 2nd or 3rd form, then \(A_l\) is overwritten with

\[\begin{split} \begin{array}{cl} U_l^{} A_l^{} U_l^H, & \: \text{or}\\ L_l^H A_l^{} L_l^{}, \end{array} \end{split}\]

also depending on the value of uplo.

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of each matrix A_l is stored, and whether the factorization applied to B_l was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l. On exit, the transformed matrices associated with the equivalent standard eigenvalue problems.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • B[in] pointer to type. Array on the GPU (the size depends on the value of strideB). The triangular factors of the matrices B_l, as returned by POTRF_STRIDED_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • strideB[in] rocblas_stride. Stride from the start of one matrix B_l to the next one B_(l+1). There is no restriction for the value of strideB. Normal use case is strideB >= ldb*n.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>sygst()#

rocblas_status rocsolver_dsygst(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *B, const rocblas_int ldb)#
rocblas_status rocsolver_ssygst(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *B, const rocblas_int ldb)#

SYGST reduces a real symmetric-definite generalized eigenproblem to standard form.

(This is the blocked version of the algorithm).

The problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A X = \lambda B X & \: \text{1st form,}\\ A B X = \lambda X & \: \text{2nd form, or}\\ B A X = \lambda X & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype.

If the problem is of the 1st form, then A is overwritten with

\[\begin{split} \begin{array}{cl} U^{-T} A U^{-1}, & \: \text{or}\\ L^{-1} A L^{-T}, \end{array} \end{split}\]

where the symmetric-definite matrix B has been factorized as either \(U^T U\) or \(L L^T\) as returned by POTRF, depending on the value of uplo.

If the problem is of the 2nd or 3rd form, then A is overwritten with

\[\begin{split} \begin{array}{cl} U A U^T, & \: \text{or}\\ L^T A L, \end{array} \end{split}\]

also depending on the value of uplo.

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblem.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the matrix A is stored, and whether the factorization applied to B was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) parts of A and B are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A. On exit, the transformed matrix associated with the equivalent standard eigenvalue problem.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A.

  • B[in] pointer to type. Array on the GPU of dimension ldb*n. The triangular factor of the matrix B, as returned by POTRF.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B.
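Whichever variant performs the reduction, the standard-form matrix is expected to have the same eigenvalues as the original generalized problem. The 2-by-2 host-side sketch below checks this numerically for the 1st form. Both function names are hypothetical, and the reduced matrix in the usage note was worked out by hand rather than produced by rocSOLVER.

```cpp
#include <cmath>
#include <utility>

// Eigenvalues of the symmetric 2-by-2 matrix [[a, b], [b, d]], smallest first.
std::pair<double, double> sym2_eigenvalues(double a, double b, double d)
{
    const double mean = 0.5 * (a + d);
    const double radius = std::sqrt(0.25 * (a - d) * (a - d) + b * b);
    return {mean - radius, mean + radius};
}

// det(A - lambda*B) for symmetric 2-by-2 A = [[a11, a12], [a12, a22]] and
// B = [[b11, b12], [b12, b22]]; it vanishes when lambda is a generalized
// eigenvalue of the pair (A, B).
double pencil_det(double a11, double a12, double a22,
                  double b11, double b12, double b22, double lambda)
{
    const double m11 = a11 - lambda * b11;
    const double m12 = a12 - lambda * b12;
    const double m22 = a22 - lambda * b22;
    return m11 * m22 - m12 * m12;
}
```

Take A = [[4, 2], [2, 3]] and B = diag(4, 1), so that L = diag(2, 1) and \(L^{-1} A L^{-T}\) = [[1, 1], [1, 3]]. Calling `sym2_eigenvalues(1, 1, 3)` gives the eigenvalues of the reduced matrix, and `pencil_det(4, 2, 3, 4, 0, 1, lambda)` evaluates to zero, up to rounding, at each of them.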

rocsolver_<type>sygst_batched()#

rocblas_status rocsolver_dsygst_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, double *const B[], const rocblas_int ldb, const rocblas_int batch_count)#
rocblas_status rocsolver_ssygst_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, float *const B[], const rocblas_int ldb, const rocblas_int batch_count)#

SYGST_BATCHED reduces a batch of real symmetric-definite generalized eigenproblems to standard form.

(This is the blocked version of the algorithm).

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype.

If the problem is of the 1st form, then \(A_l\) is overwritten with

\[\begin{split} \begin{array}{cl} U_l^{-T} A_l^{} U_l^{-1}, & \: \text{or}\\ L_l^{-1} A_l^{} L_l^{-T}, \end{array} \end{split}\]

where the symmetric-definite matrix \(B_l\) has been factorized as either \(U_l^T U_l^{}\) or \(L_l^{} L_l^T\) as returned by POTRF, depending on the value of uplo.

If the problem is of the 2nd or 3rd form, then \(A_l\) is overwritten with

\[\begin{split} \begin{array}{cl} U_l^{} A_l^{} U_l^T, & \: \text{or}\\ L_l^T A_l^{} L_l^{}, \end{array} \end{split}\]

also depending on the value of uplo.

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of each matrix A_l is stored, and whether the factorization applied to B_l was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l. On exit, the transformed matrices associated with the equivalent standard eigenvalue problems.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • B[in] array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*n. The triangular factors of the matrices B_l, as returned by POTRF_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>sygst_strided_batched()#

rocblas_status rocsolver_dsygst_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#
rocblas_status rocsolver_ssygst_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#

SYGST_STRIDED_BATCHED reduces a batch of real symmetric-definite generalized eigenproblems to standard form.

(This is the blocked version of the algorithm).

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype.

If the problem is of the 1st form, then \(A_l\) is overwritten with

\[\begin{split} \begin{array}{cl} U_l^{-T} A_l^{} U_l^{-1}, & \: \text{or}\\ L_l^{-1} A_l^{} L_l^{-T}, \end{array} \end{split}\]

where the symmetric-definite matrix \(B_l\) has been factorized as either \(U_l^T U_l^{}\) or \(L_l^{} L_l^T\) as returned by POTRF, depending on the value of uplo.

If the problem is of the 2nd or 3rd form, then \(A_l\) is overwritten with

\[\begin{split} \begin{array}{cl} U_l^{} A_l^{} U_l^T, & \: \text{or}\\ L_l^T A_l^{} L_l^{}, \end{array} \end{split}\]

also depending on the value of uplo.

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of each matrix A_l is stored, and whether the factorization applied to B_l was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l. On exit, the transformed matrices associated with the equivalent standard eigenvalue problems.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • B[in] pointer to type. Array on the GPU (the size depends on the value of strideB). The triangular factors of the matrices B_l, as returned by POTRF_STRIDED_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • strideB[in] rocblas_stride. Stride from the start of one matrix B_l to the next one B_(l+1). There is no restriction for the value of strideB. Normal use case is strideB >= ldb*n.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>hegst()#

rocblas_status rocsolver_zhegst(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *B, const rocblas_int ldb)#
rocblas_status rocsolver_chegst(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *B, const rocblas_int ldb)#

HEGST reduces a Hermitian-definite generalized eigenproblem to standard form.

(This is the blocked version of the algorithm).

The problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A X = \lambda B X & \: \text{1st form,}\\ A B X = \lambda X & \: \text{2nd form, or}\\ B A X = \lambda X & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype.

If the problem is of the 1st form, then A is overwritten with

\[\begin{split} \begin{array}{cl} U^{-H} A U^{-1}, & \: \text{or}\\ L^{-1} A L^{-H}, \end{array} \end{split}\]

where the Hermitian-definite matrix B has been factorized as either \(U^H U\) or \(L L^H\) as returned by POTRF, depending on the value of uplo.

If the problem is of the 2nd or 3rd form, then A is overwritten with

\[\begin{split} \begin{array}{cl} U A U^H, & \: \text{or}\\ L^H A L, \end{array} \end{split}\]

also depending on the value of uplo.

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblem.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the matrix A is stored, and whether the factorization applied to B was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) parts of A and B are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A. On exit, the transformed matrix associated with the equivalent standard eigenvalue problem.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A.

  • B[in] pointer to type. Array on the GPU of dimension ldb*n. The triangular factor of the matrix B, as returned by POTRF.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B.
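
The following is a minimal CPU sketch of what the 1st form with a lower factor computes, i.e. \(L^{-1} A L^{-H}\). It is not the rocSOLVER implementation and does not call the library. The helper names `at`, `forward_solve_lower` and `reduce_to_standard_lower` are hypothetical. For brevity it assumes lda = ldb = n and a Hermitian A stored in full, whereas HEGST only reads the triangle selected by uplo.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Column-major position of element (i, j), 0-based, with leading dimension n.
inline std::size_t at(int i, int j, int n)
{
    return static_cast<std::size_t>(j) * n + i;
}

// Solves L * Y = X by forward substitution and overwrites X with Y.
// Only the lower triangle of L is read; X holds n right-hand-side columns.
void forward_solve_lower(int n, const std::vector<cplx>& L, std::vector<cplx>& X)
{
    for(int j = 0; j < n; ++j)
        for(int i = 0; i < n; ++i)
        {
            cplx s = X[at(i, j, n)];
            for(int k = 0; k < i; ++k)
                s -= L[at(i, k, n)] * X[at(k, j, n)];
            X[at(i, j, n)] = s / L[at(i, i, n)];
        }
}

// Hypothetical reference helper (not a rocSOLVER function):
// returns C = L^{-1} A L^{-H} for a Hermitian A stored in full.
std::vector<cplx> reduce_to_standard_lower(int n,
                                           const std::vector<cplx>& A,
                                           const std::vector<cplx>& L)
{
    const std::size_t nn = static_cast<std::size_t>(n) * n;
    assert(A.size() == nn && L.size() == nn);

    std::vector<cplx> W = A;
    forward_solve_lower(n, L, W); // W = L^{-1} A

    std::vector<cplx> C(nn);
    for(int j = 0; j < n; ++j)
        for(int i = 0; i < n; ++i)
            C[at(j, i, n)] = std::conj(W[at(i, j, n)]); // C = W^H = A L^{-H}, since A^H = A

    forward_solve_lower(n, L, C); // C = L^{-1} A L^{-H}
    return C;
}
```

A quick sanity check is to pass A equal to \(L L^H\): the reduced matrix should then be the identity, because the generalized problem \(B X = \lambda B X\) has all eigenvalues equal to one.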

rocsolver_<type>hegst_batched()#

rocblas_status rocsolver_zhegst_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *const B[], const rocblas_int ldb, const rocblas_int batch_count)#
rocblas_status rocsolver_chegst_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *const B[], const rocblas_int ldb, const rocblas_int batch_count)#

HEGST_BATCHED reduces a batch of hermitian-definite generalized eigenproblems to standard form.

(This is the blocked version of the algorithm).

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype.

If the problem is of the 1st form, then \(A_l\) is overwritten with

\[\begin{split} \begin{array}{cl} U_l^{-H} A_l^{} U_l^{-1}, & \: \text{or}\\ L_l^{-1} A_l^{} L_l^{-H}, \end{array} \end{split}\]

where the hermitian-definite matrix \(B_l\) has been factorized as either \(U_l^H U_l^{}\) or \(L_l^{} L_l^H\) as returned by POTRF, depending on the value of uplo.

If the problem is of the 2nd or 3rd form, then \(A_l\) is overwritten with

\[\begin{split} \begin{array}{cl} U_l^{} A_l^{} U_l^H, & \: \text{or}\\ L_l^H A_l^{} L_l^{}, \end{array} \end{split}\]

also depending on the value of uplo.

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of each matrix A_l is stored, and whether the factorization applied to B_l was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l. On exit, the transformed matrices associated with the equivalent standard eigenvalue problems.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • B[in] array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*n. The triangular factors of the matrices B_l, as returned by POTRF_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>hegst_strided_batched()#

rocblas_status rocsolver_zhegst_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#
rocblas_status rocsolver_chegst_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#

HEGST_STRIDED_BATCHED reduces a batch of hermitian-definite generalized eigenproblems to standard form.

(This is the blocked version of the algorithm).

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype.

If the problem is of the 1st form, then \(A_l\) is overwritten with

\[\begin{split} \begin{array}{cl} U_l^{-H} A_l^{} U_l^{-1}, & \: \text{or}\\ L_l^{-1} A_l^{} L_l^{-H}, \end{array} \end{split}\]

where the hermitian-definite matrix \(B_l\) has been factorized as either \(U_l^H U_l^{}\) or \(L_l^{} L_l^H\) as returned by POTRF, depending on the value of uplo.

If the problem is of the 2nd or 3rd form, then \(A_l\) is overwritten with

\[\begin{split} \begin{array}{cl} U_l^{} A_l^{} U_l^H, & \: \text{or}\\ L_l^H A_l^{} L_l^{}, \end{array} \end{split}\]

also depending on the value of uplo.

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of each matrix A_l is stored, and whether the factorization applied to B_l was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l. On exit, the transformed matrices associated with the equivalent standard eigenvalue problems.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • B[in] pointer to type. Array on the GPU (the size depends on the value of strideB). The triangular factors of the matrices B_l, as returned by POTRF_STRIDED_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • strideB[in] rocblas_stride. Stride from the start of one matrix B_l to the next one B_(l+1). There is no restriction for the value of strideB. Normal use case is strideB >= ldb*n.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
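
The strided layout described by lda, strideA, ldb and strideB can be sketched as plain index arithmetic. The helpers below are hypothetical illustrations, not rocSOLVER functions. They use 0-based indices (the A[i,j] notation in this document is 1-based) and assume column-major storage with the "normal use case" stride >= ld*n.

```cpp
#include <cassert>
#include <cstdint>

// Offset, in elements, of entry (i, j) of matrix l inside a strided batch.
// All three indices are 0-based in this sketch.
inline std::int64_t strided_offset(std::int64_t l,
                                   std::int64_t i,
                                   std::int64_t j,
                                   std::int64_t ld,
                                   std::int64_t stride)
{
    assert(l >= 0 && i >= 0 && i < ld && j >= 0);
    return l * stride + j * ld + i;
}

// Number of elements a buffer must provide to hold batch_count matrices of
// dimension ld*n when consecutive matrices are `stride` elements apart
// (assumes stride >= ld * n, so that matrices do not overlap).
inline std::int64_t strided_buffer_size(std::int64_t n,
                                        std::int64_t ld,
                                        std::int64_t stride,
                                        std::int64_t batch_count)
{
    if(n == 0 || batch_count == 0)
        return 0;
    assert(stride >= ld * n);
    return (batch_count - 1) * stride + ld * n;
}
```

For example, with ld = 4 and stride = 20, entry (1, 3) of the third matrix (l = 2) sits at offset 2*20 + 3*4 + 1 = 53.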

Linear-systems solvers#

rocsolver_<type>trtri()#

rocblas_status rocsolver_ztrtri(rocblas_handle handle, const rocblas_fill uplo, const rocblas_diagonal diag, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_ctrtri(rocblas_handle handle, const rocblas_fill uplo, const rocblas_diagonal diag, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_dtrtri(rocblas_handle handle, const rocblas_fill uplo, const rocblas_diagonal diag, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_strtri(rocblas_handle handle, const rocblas_fill uplo, const rocblas_diagonal diag, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)#

TRTRI inverts a triangular n-by-n matrix A.

A can be upper or lower triangular, depending on the value of uplo, and unit or non-unit triangular, depending on the value of diag.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • diag[in] rocblas_diagonal. If diag indicates unit, then the diagonal elements of A are not referenced and assumed to be one.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the triangular matrix. On exit, the inverse of A if info = 0.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, A is singular. A[i,i] is the first zero element in the diagonal.
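
To make the meaning of diag and info concrete, here is a minimal CPU sketch of an in-place inversion of a lower triangular matrix. It is not the rocSOLVER implementation. The name `invert_lower_triangular` is hypothetical, and leaving A untouched when a zero diagonal entry is found is a choice of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inverts the lower triangle of a column-major n-by-n matrix in place.
// Returns 0 on success, or the 1-based index i of the first zero diagonal
// entry, mirroring the info convention described above.
int invert_lower_triangular(int n, std::vector<double>& A, int lda, bool unit_diag)
{
    assert(n >= 0 && lda >= n);
    assert(A.size() >= static_cast<std::size_t>(lda) * n);
    auto a = [&](int i, int j) -> double& { return A[static_cast<std::size_t>(j) * lda + i]; };

    // With a unit diagonal the stored diagonal is never read, so A cannot be singular.
    if(!unit_diag)
        for(int i = 0; i < n; ++i)
            if(a(i, i) == 0.0)
                return i + 1;

    // Solve L * X = I column by column into a temporary n-by-n array.
    std::vector<double> X(static_cast<std::size_t>(n) * n, 0.0);
    auto x = [&](int i, int j) -> double& { return X[static_cast<std::size_t>(j) * n + i]; };

    for(int j = 0; j < n; ++j)
        for(int i = j; i < n; ++i)
        {
            double s = (i == j) ? 1.0 : 0.0;
            for(int k = j; k < i; ++k)
                s -= a(i, k) * x(k, j);
            x(i, j) = unit_diag ? s : s / a(i, i);
        }

    // Copy the result back; the upper part of A and, for a unit diagonal,
    // the diagonal itself are left as they were.
    for(int j = 0; j < n; ++j)
        for(int i = (unit_diag ? j + 1 : j); i < n; ++i)
            a(i, j) = x(i, j);
    return 0;
}
```

For the 2-by-2 matrix with rows (2, 0) and (4, 8), the sketch returns 0 and leaves rows (0.5, 0) and (-0.25, 0.125) in A. If the second diagonal entry were zero, it would return 2.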

rocsolver_<type>trtri_batched()#

rocblas_status rocsolver_ztrtri_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_diagonal diag, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ctrtri_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_diagonal diag, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dtrtri_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_diagonal diag, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_strtri_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_diagonal diag, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

TRTRI_BATCHED inverts a batch of triangular n-by-n matrices \(A_l\).

\(A_l\) can be upper or lower triangular, depending on the value of uplo, and unit or non-unit triangular, depending on the value of diag.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of each matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • diag[in] rocblas_diagonal. If diag indicates unit, then the diagonal elements of matrices A_l are not referenced and assumed to be one.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of all matrices A_l in the batch.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the triangular matrices A_l. On exit, the inverses of A_l if info[l] = 0.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for inversion of A_l. If info[l] = i > 0, A_l is singular. A_l[i,i] is the first zero element in the diagonal.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
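
The batched variants take an array of pointers, one per matrix, and report one info value per instance. The sketch below shows one way to build such a pointer array over a single contiguous buffer and to scan the info array afterwards. Both helper names are hypothetical, and the code runs on host memory only. In a real call the matrices live in GPU memory and, to our understanding, the pointer array must itself be accessible from the device; that step is omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds the per-matrix pointer array for a batch stored back to back in one
// buffer. `stride` (elements between consecutive matrices) is an assumption of
// this sketch; the batched API itself only sees the resulting pointers.
template <typename T>
std::vector<T*> make_batch_pointers(T* base, std::size_t stride, std::size_t batch_count)
{
    std::vector<T*> ptrs(batch_count);
    for(std::size_t l = 0; l < batch_count; ++l)
        ptrs[l] = base + l * stride;
    return ptrs;
}

// Returns the 0-based position of the first instance whose info entry is
// nonzero, or -1 if every instance reported success.
inline long first_failed_instance(const std::vector<int>& info)
{
    for(std::size_t l = 0; l < info.size(); ++l)
        if(info[l] != 0)
            return static_cast<long>(l);
    return -1;
}
```

Checking info per instance matters because a singular A_l only affects its own entry: the other matrices in the batch are still inverted.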

rocsolver_<type>trtri_strided_batched()#

rocblas_status rocsolver_ztrtri_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_diagonal diag, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ctrtri_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_diagonal diag, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dtrtri_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_diagonal diag, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_strtri_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_diagonal diag, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

TRTRI_STRIDED_BATCHED inverts a batch of triangular n-by-n matrices \(A_l\).

\(A_l\) can be upper or lower triangular, depending on the value of uplo, and unit or non-unit triangular, depending on the value of diag.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of each matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • diag[in] rocblas_diagonal. If diag indicates unit, then the diagonal elements of matrices A_l are not referenced and assumed to be one.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of all matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the triangular matrices A_l. On exit, the inverses of A_l if info[l] = 0.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for inversion of A_l. If info[l] = i > 0, A_l is singular. A_l[i,i] is the first zero element in the diagonal.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>getri()#

rocblas_status rocsolver_zgetri(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_cgetri(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_dgetri(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_sgetri(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#

GETRI inverts a general n-by-n matrix A using the LU factorization computed by GETRF.

The inverse is computed by solving the linear system

\[ A^{-1}L = U^{-1} \]

where L is the lower triangular factor of A with unit diagonal elements, and U is the upper triangular factor.
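
For readers who want the role of the pivots made explicit, the system above can be related to the full factorization \(A = P L U\) returned by GETRF. The intermediate matrix \(X\) is our notation. The steps follow the reference LAPACK procedure, and whether rocSOLVER organizes the computation in exactly this way is an assumption:

\[\begin{split} \begin{array}{cl} A = P L U & \: \Rightarrow \quad A^{-1} = U^{-1} L^{-1} P^T,\\ X = A^{-1} P & \: \Rightarrow \quad X L = U^{-1},\\ A^{-1} = X P^T. & \end{array} \end{split}\]

That is, \(U\) is inverted first, the triangular system \(X L = U^{-1}\) is solved for \(X\), and the columns of \(X\) are then interchanged according to ipiv to obtain \(A^{-1}\).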

Parameters:
  • handle[in] rocblas_handle.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the factors L and U of the factorization A = P*L*U returned by GETRF. On exit, the inverse of A if info = 0; otherwise undefined.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A.

  • ipiv[in] pointer to rocblas_int. Array on the GPU of dimension n. The pivot indices returned by GETRF.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.

rocsolver_<type>getri_batched()#

rocblas_status rocsolver_zgetri_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetri_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetri_batched(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetri_batched(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

GETRI_BATCHED inverts a batch of general n-by-n matrices using the LU factorization computed by GETRF_BATCHED.

The inverse of matrix \(A_l\) in the batch is computed by solving the linear system

\[ A_l^{-1} L_l^{} = U_l^{-1} \]

where \(L_l\) is the lower triangular factor of \(A_l\) with unit diagonal elements, and \(U_l\) is the upper triangular factor.

Parameters:
  • handle[in] rocblas_handle.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of all matrices A_l in the batch.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the factors L_l and U_l of the factorization A_l = P_l*L_l*U_l returned by GETRF_BATCHED. On exit, the inverses of A_l if info[l] = 0; otherwise undefined.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • ipiv[in] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). The pivot indices returned by GETRF_BATCHED.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for inversion of A_l. If info[l] = i > 0, U_l is singular. U_l[i,i] is the first zero pivot.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
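
The pivot vectors are stored one after another, strideP elements apart, and each entry is a 1-based row index as produced by GETRF (row i was interchanged with row ipiv[i]). The hypothetical helper below, which is not part of rocSOLVER, locates ipiv_l inside the strided array and expands it into an explicit 0-based row permutation. It runs on host memory and uses int in place of rocblas_int.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Returns perm such that row i of the row-permuted matrix (the one equal to
// L*U) is row perm[i] of the original matrix A_l. Indices in perm are 0-based.
std::vector<int> pivots_to_permutation(const std::vector<int>& ipiv,
                                       std::size_t strideP,
                                       std::size_t l,
                                       int n)
{
    assert(n >= 0);
    assert(l * strideP + static_cast<std::size_t>(n) <= ipiv.size());
    const int* p = ipiv.data() + l * strideP; // start of ipiv_l

    std::vector<int> perm(static_cast<std::size_t>(n));
    std::iota(perm.begin(), perm.end(), 0);

    // Replay the interchanges in the order GETRF performed them.
    for(int i = 0; i < n; ++i)
    {
        assert(p[i] >= i + 1 && p[i] <= n); // 1-based, never above the current row
        std::swap(perm[static_cast<std::size_t>(i)], perm[static_cast<std::size_t>(p[i] - 1)]);
    }
    return perm;
}
```

With strideP = 4 and n = 3, the array {3, 3, 3, 0, 1, 2, 3, 0} holds two pivot vectors: the first expands to the permutation (2, 0, 1) and the second to the identity. The unused fourth slot of each vector is padding introduced by the stride.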

rocsolver_<type>getri_strided_batched()#

rocblas_status rocsolver_zgetri_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetri_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetri_strided_batched(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetri_strided_batched(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

GETRI_STRIDED_BATCHED inverts a batch of general n-by-n matrices using the LU factorization computed by GETRF_STRIDED_BATCHED.

The inverse of matrix \(A_l\) in the batch is computed by solving the linear system

\[ A_l^{-1} L_l^{} = U_l^{-1} \]

where \(L_l\) is the lower triangular factor of \(A_l\) with unit diagonal elements, and \(U_l\) is the upper triangular factor.

Parameters:
  • handle[in] rocblas_handle.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of all matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the factors L_l and U_l of the factorization A_l = P_l*L_l*U_l returned by GETRF_STRIDED_BATCHED. On exit, the inverses of A_l if info[l] = 0; otherwise undefined.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[in] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). The pivot indices returned by GETRF_STRIDED_BATCHED.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for inversion of A_l. If info[l] = i > 0, U_l is singular. U_l[i,i] is the first zero pivot.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>getrs()#

rocblas_status rocsolver_zgetrs_64(rocblas_handle handle, const rocblas_operation trans, const int64_t n, const int64_t nrhs, rocblas_double_complex *A, const int64_t lda, const int64_t *ipiv, rocblas_double_complex *B, const int64_t ldb)#
rocblas_status rocsolver_cgetrs_64(rocblas_handle handle, const rocblas_operation trans, const int64_t n, const int64_t nrhs, rocblas_float_complex *A, const int64_t lda, const int64_t *ipiv, rocblas_float_complex *B, const int64_t ldb)#
rocblas_status rocsolver_dgetrs_64(rocblas_handle handle, const rocblas_operation trans, const int64_t n, const int64_t nrhs, double *A, const int64_t lda, const int64_t *ipiv, double *B, const int64_t ldb)#
rocblas_status rocsolver_sgetrs_64(rocblas_handle handle, const rocblas_operation trans, const int64_t n, const int64_t nrhs, float *A, const int64_t lda, const int64_t *ipiv, float *B, const int64_t ldb)#
rocblas_status rocsolver_zgetrs(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *A, const rocblas_int lda, const rocblas_int *ipiv, rocblas_double_complex *B, const rocblas_int ldb)#
rocblas_status rocsolver_cgetrs(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *A, const rocblas_int lda, const rocblas_int *ipiv, rocblas_float_complex *B, const rocblas_int ldb)#
rocblas_status rocsolver_dgetrs(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, double *A, const rocblas_int lda, const rocblas_int *ipiv, double *B, const rocblas_int ldb)#
rocblas_status rocsolver_sgetrs(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, float *A, const rocblas_int lda, const rocblas_int *ipiv, float *B, const rocblas_int ldb)#

GETRS solves a system of n linear equations in n variables in its factorized form.

It solves one of the following systems, depending on the value of trans:

\[\begin{split} \begin{array}{cl} A X = B & \: \text{not transposed,}\\ A^T X = B & \: \text{transposed, or}\\ A^H X = B & \: \text{conjugate transposed.} \end{array} \end{split}\]

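Matrix A is defined by its triangular factors as returned by GETRF. The _64 variants take the same arguments, with int64_t in place of rocblas_int for the integer parameters and for the entries of ipiv.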

Parameters:
  • handle[in] rocblas_handle.

  • trans[in] rocblas_operation. Specifies the form of the system of equations.

  • n[in] rocblas_int. n >= 0. The order of the system, i.e. the number of columns and rows of A.

  • nrhs[in] rocblas_int. nrhs >= 0. The number of right hand sides, i.e., the number of columns of the matrix B.

  • A[in] pointer to type. Array on the GPU of dimension lda*n. The factors L and U of the factorization A = P*L*U returned by GETRF.

  • lda[in] rocblas_int. lda >= n. The leading dimension of A.

  • ipiv[in] pointer to rocblas_int. Array on the GPU of dimension n. The pivot indices returned by GETRF.

  • B[inout] pointer to type. Array on the GPU of dimension ldb*nrhs. On entry, the right hand side matrix B. On exit, the solution matrix X.

  • ldb[in] rocblas_int. ldb >= n. The leading dimension of B.
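
The non-transposed case can be sketched on the CPU in three steps: apply the row interchanges recorded in ipiv to B, forward-substitute with the unit lower factor L, then back-substitute with U. The function below is a hypothetical reference helper, not the rocSOLVER implementation. It assumes real double-precision data on the host and the packed L/U layout returned by GETRF.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Solves A X = B for A = P*L*U. LU holds the packed factors (unit lower L
// strictly below the diagonal, U on and above it) in column-major order.
// ipiv holds 1-based pivot indices. B (n-by-nrhs) is overwritten with X.
void lu_solve(int n, int nrhs,
              const std::vector<double>& LU, int lda,
              const std::vector<int>& ipiv,
              std::vector<double>& B, int ldb)
{
    assert(n >= 0 && nrhs >= 0 && lda >= n && ldb >= n);
    assert(ipiv.size() >= static_cast<std::size_t>(n));
    auto a = [&](int i, int j) { return LU[static_cast<std::size_t>(j) * lda + i]; };

    for(int c = 0; c < nrhs; ++c)
    {
        double* b = B.data() + static_cast<std::size_t>(c) * ldb;

        // 1. Row interchanges, in the order GETRF performed them.
        for(int i = 0; i < n; ++i)
            std::swap(b[i], b[ipiv[static_cast<std::size_t>(i)] - 1]);

        // 2. Forward substitution with L (unit diagonal, so no division).
        for(int i = 0; i < n; ++i)
            for(int k = 0; k < i; ++k)
                b[i] -= a(i, k) * b[k];

        // 3. Back substitution with U.
        for(int i = n - 1; i >= 0; --i)
        {
            for(int k = i + 1; k < n; ++k)
                b[i] -= a(i, k) * b[k];
            b[i] /= a(i, i);
        }
    }
}
```

As a usage example, the matrix with rows (2, 1) and (4, 4) factorizes with ipiv = (2, 2) into the packed column-major array {4, 0.5, 4, -1}. Passing the right-hand side (4, 12) then yields the solution (1, 2).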

rocsolver_<type>getrs_batched()#

rocblas_status rocsolver_zgetrs_batched_64(rocblas_handle handle, const rocblas_operation trans, const int64_t n, const int64_t nrhs, rocblas_double_complex *const A[], const int64_t lda, const int64_t *ipiv, const rocblas_stride strideP, rocblas_double_complex *const B[], const int64_t ldb, const int64_t batch_count)#
rocblas_status rocsolver_cgetrs_batched_64(rocblas_handle handle, const rocblas_operation trans, const int64_t n, const int64_t nrhs, rocblas_float_complex *const A[], const int64_t lda, const int64_t *ipiv, const rocblas_stride strideP, rocblas_float_complex *const B[], const int64_t ldb, const int64_t batch_count)#
rocblas_status rocsolver_dgetrs_batched_64(rocblas_handle handle, const rocblas_operation trans, const int64_t n, const int64_t nrhs, double *const A[], const int64_t lda, const int64_t *ipiv, const rocblas_stride strideP, double *const B[], const int64_t ldb, const int64_t batch_count)#
rocblas_status rocsolver_sgetrs_batched_64(rocblas_handle handle, const rocblas_operation trans, const int64_t n, const int64_t nrhs, float *const A[], const int64_t lda, const int64_t *ipiv, const rocblas_stride strideP, float *const B[], const int64_t ldb, const int64_t batch_count)#
rocblas_status rocsolver_zgetrs_batched(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *const A[], const rocblas_int lda, const rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *const B[], const rocblas_int ldb, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetrs_batched(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *const A[], const rocblas_int lda, const rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *const B[], const rocblas_int ldb, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetrs_batched(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, double *const A[], const rocblas_int lda, const rocblas_int *ipiv, const rocblas_stride strideP, double *const B[], const rocblas_int ldb, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetrs_batched(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, float *const A[], const rocblas_int lda, const rocblas_int *ipiv, const rocblas_stride strideP, float *const B[], const rocblas_int ldb, const rocblas_int batch_count)#

GETRS_BATCHED solves a batch of systems of n linear equations in n variables in their factorized forms.

For each instance l in the batch, it solves one of the following systems, depending on the value of trans:

\[\begin{split} \begin{array}{cl} A_l X_l = B_l & \: \text{not transposed,}\\ A_l^T X_l^{} = B_l^{} & \: \text{transposed, or}\\ A_l^H X_l^{} = B_l^{} & \: \text{conjugate transposed.} \end{array} \end{split}\]

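Matrix \(A_l\) is defined by its triangular factors as returned by GETRF_BATCHED. The _64 variants take the same arguments, with int64_t in place of rocblas_int for the integer parameters and for the entries of ipiv.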

Parameters:
  • handle[in] rocblas_handle.

  • trans[in] rocblas_operation. Specifies the form of the system of equations of each instance in the batch.

  • n[in] rocblas_int. n >= 0. The order of the system, i.e. the number of columns and rows of all A_l matrices.

  • nrhs[in] rocblas_int. nrhs >= 0. The number of right hand sides, i.e., the number of columns of all the matrices B_l.

  • A[in] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. The factors L_l and U_l of the factorization A_l = P_l*L_l*U_l returned by GETRF_BATCHED.

  • lda[in] rocblas_int. lda >= n. The leading dimension of matrices A_l.

  • ipiv[in] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of pivot indices returned by GETRF_BATCHED.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • B[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*nrhs. On entry, the right hand side matrices B_l. On exit, the solution matrix X_l of each system in the batch.

  • ldb[in] rocblas_int. ldb >= n. The leading dimension of matrices B_l.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of instances (systems) in the batch.

rocsolver_<type>getrs_strided_batched()#

rocblas_status rocsolver_zgetrs_strided_batched_64(rocblas_handle handle, const rocblas_operation trans, const int64_t n, const int64_t nrhs, rocblas_double_complex *A, const int64_t lda, const rocblas_stride strideA, const int64_t *ipiv, const rocblas_stride strideP, rocblas_double_complex *B, const int64_t ldb, const rocblas_stride strideB, const int64_t batch_count)#
rocblas_status rocsolver_cgetrs_strided_batched_64(rocblas_handle handle, const rocblas_operation trans, const int64_t n, const int64_t nrhs, rocblas_float_complex *A, const int64_t lda, const rocblas_stride strideA, const int64_t *ipiv, const rocblas_stride strideP, rocblas_float_complex *B, const int64_t ldb, const rocblas_stride strideB, const int64_t batch_count)#
rocblas_status rocsolver_dgetrs_strided_batched_64(rocblas_handle handle, const rocblas_operation trans, const int64_t n, const int64_t nrhs, double *A, const int64_t lda, const rocblas_stride strideA, const int64_t *ipiv, const rocblas_stride strideP, double *B, const int64_t ldb, const rocblas_stride strideB, const int64_t batch_count)#
rocblas_status rocsolver_sgetrs_strided_batched_64(rocblas_handle handle, const rocblas_operation trans, const int64_t n, const int64_t nrhs, float *A, const int64_t lda, const rocblas_stride strideA, const int64_t *ipiv, const rocblas_stride strideP, float *B, const int64_t ldb, const rocblas_stride strideB, const int64_t batch_count)#
rocblas_status rocsolver_zgetrs_strided_batched(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, const rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetrs_strided_batched(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, const rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetrs_strided_batched(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, double *A, const rocblas_int lda, const rocblas_stride strideA, const rocblas_int *ipiv, const rocblas_stride strideP, double *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetrs_strided_batched(rocblas_handle handle, const rocblas_operation trans, const rocblas_int n, const rocblas_int nrhs, float *A, const rocblas_int lda, const rocblas_stride strideA, const rocblas_int *ipiv, const rocblas_stride strideP, float *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#

GETRS_STRIDED_BATCHED solves a batch of systems of n linear equations in n variables in their factorized forms.

For each instance l in the batch, it solves one of the following systems, depending on the value of trans:

\[\begin{split} \begin{array}{cl} A_l X_l = B_l & \: \text{not transposed,}\\ A_l^T X_l^{} = B_l^{} & \: \text{transposed, or}\\ A_l^H X_l^{} = B_l^{} & \: \text{conjugate transposed.} \end{array} \end{split}\]

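Matrix \(A_l\) is defined by its triangular factors as returned by GETRF_STRIDED_BATCHED. The _64 variants take the same arguments, with int64_t in place of rocblas_int for the integer parameters and for the entries of ipiv.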

Parameters:
  • handle[in] rocblas_handle.

  • trans[in] rocblas_operation. Specifies the form of the system of equations of each instance in the batch.

  • n[in] rocblas_int. n >= 0. The order of the system, i.e. the number of columns and rows of all A_l matrices.

  • nrhs[in] rocblas_int. nrhs >= 0. The number of right hand sides, i.e., the number of columns of all the matrices B_l.

  • A[in] pointer to type. Array on the GPU (the size depends on the value of strideA). The factors L_l and U_l of the factorization A_l = P_l*L_l*U_l returned by GETRF_STRIDED_BATCHED.

  • lda[in] rocblas_int. lda >= n. The leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[in] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of pivot indices returned by GETRF_STRIDED_BATCHED.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • B[inout] pointer to type. Array on the GPU (the size depends on the value of strideB). On entry, the right hand side matrices B_l. On exit, the solution matrix X_l of each system in the batch.

  • ldb[in] rocblas_int. ldb >= n. The leading dimension of matrices B_l.

  • strideB[in] rocblas_stride. Stride from the start of one matrix B_l to the next one B_(l+1). There is no restriction for the value of strideB. Normal use case is strideB >= ldb*nrhs.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of instances (systems) in the batch.
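
The stride parameters above determine where each instance of the batch starts inside the flat arrays A, ipiv and B. The following host-side sketch shows the offset arithmetic only. It assumes column-major storage (as implied by the leading-dimension parameters), 0-based C++ indices, and non-negative strides; the helper names `matrix_offset`, `pivot_offset` and `min_batch_elements` are hypothetical and are not part of rocSOLVER.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; not part of the rocSOLVER API.

// Offset of element (i, j) of matrix number l in a strided batch.
// Indices are 0-based here; storage is assumed to be column-major.
inline std::int64_t matrix_offset(std::int64_t l, std::int64_t i, std::int64_t j,
                                  std::int64_t ld, std::int64_t stride)
{
    assert(l >= 0 && i >= 0 && j >= 0 && i < ld);
    return l * stride + i + j * ld;
}

// Offset of pivot entry i of the pivot vector number l.
inline std::int64_t pivot_offset(std::int64_t l, std::int64_t i, std::int64_t strideP)
{
    assert(l >= 0 && i >= 0);
    return l * strideP + i;
}

// Smallest element count that holds batch_count instances of per_instance
// elements each, assuming a non-negative stride.
inline std::int64_t min_batch_elements(std::int64_t batch_count, std::int64_t stride,
                                       std::int64_t per_instance)
{
    assert(stride >= 0 && per_instance >= 0);
    if (batch_count <= 0)
        return 0;
    return (batch_count - 1) * stride + per_instance;
}
```

For example, with lda = 4, n = 4 and strideA = 20, `min_batch_elements(3, 20, 4 * 4)` gives the number of elements a buffer for three matrices would need under these assumptions.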

rocsolver_<type>gesv()#

rocblas_status rocsolver_zgesv(rocblas_handle handle, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_double_complex *B, const rocblas_int ldb, rocblas_int *info)#
rocblas_status rocsolver_cgesv(rocblas_handle handle, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_float_complex *B, const rocblas_int ldb, rocblas_int *info)#
rocblas_status rocsolver_dgesv(rocblas_handle handle, const rocblas_int n, const rocblas_int nrhs, double *A, const rocblas_int lda, rocblas_int *ipiv, double *B, const rocblas_int ldb, rocblas_int *info)#
rocblas_status rocsolver_sgesv(rocblas_handle handle, const rocblas_int n, const rocblas_int nrhs, float *A, const rocblas_int lda, rocblas_int *ipiv, float *B, const rocblas_int ldb, rocblas_int *info)#

GESV solves a general system of n linear equations on n variables.

The linear system is of the form

\[ A X = B \]

where A is a general n-by-n matrix. Matrix A is first factorized in triangular factors L and U using GETRF; then, the solution is computed with GETRS.

Parameters:
  • handle[in] rocblas_handle.

  • n[in] rocblas_int. n >= 0. The order of the system, i.e. the number of columns and rows of A.

  • nrhs[in] rocblas_int. nrhs >= 0. The number of right hand sides, i.e., the number of columns of the matrix B.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A. On exit, if info = 0, the factors L and U of the LU decomposition of A returned by GETRF.

  • lda[in] rocblas_int. lda >= n. The leading dimension of A.

  • ipiv[out] pointer to rocblas_int. Array on the GPU of dimension n. The pivot indices returned by GETRF.

  • B[inout] pointer to type. Array on the GPU of dimension ldb*nrhs. On entry, the right hand side matrix B. On exit, the solution matrix X.

  • ldb[in] rocblas_int. ldb >= n. The leading dimension of B.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, U is singular, and the solution could not be computed. U[i,i] is the first zero element in the diagonal.
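
To make the two-step description above (factorize, then solve) and the meaning of info concrete, here is a simplified host-side reference for real double-precision data. It is only a sketch of the mathematics, not the rocSOLVER implementation: the function name `reference_gesv` is hypothetical, storage is assumed to be column-major, and pivot indices are stored 1-based to match the notation of this document.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Hypothetical CPU reference for illustration; not the rocSOLVER implementation.
// A is n-by-n (column-major, leading dimension lda), B is n-by-nrhs (ldb).
// On return A holds L (unit diagonal, below the diagonal) and U, ipiv holds
// 1-based pivot rows, and B holds X when the returned info is 0.
int reference_gesv(int n, int nrhs, std::vector<double>& A, int lda,
                   std::vector<int>& ipiv, std::vector<double>& B, int ldb)
{
    assert(n >= 0 && nrhs >= 0 && lda >= n && ldb >= n);
    int info = 0;
    // Step 1: LU factorization with partial pivoting (the GETRF step).
    for (int k = 0; k < n; ++k) {
        int p = k;
        for (int i = k + 1; i < n; ++i)
            if (std::fabs(A[i + k * lda]) > std::fabs(A[p + k * lda]))
                p = i;
        ipiv[k] = p + 1;
        if (A[p + k * lda] == 0.0) {
            if (info == 0)
                info = k + 1; // first zero on the diagonal of U (1-based)
            continue;
        }
        if (p != k)
            for (int j = 0; j < n; ++j)
                std::swap(A[k + j * lda], A[p + j * lda]);
        for (int i = k + 1; i < n; ++i) {
            A[i + k * lda] /= A[k + k * lda];
            for (int j = k + 1; j < n; ++j)
                A[i + j * lda] -= A[i + k * lda] * A[k + j * lda];
        }
    }
    if (info != 0)
        return info; // U is singular; B is left unsolved
    // Step 2: apply the row swaps, then solve L*Y = P'*B and U*X = Y (the GETRS step).
    for (int c = 0; c < nrhs; ++c) {
        double* b = B.data() + c * ldb;
        for (int k = 0; k < n; ++k)
            std::swap(b[k], b[ipiv[k] - 1]);
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < i; ++j)
                b[i] -= A[i + j * lda] * b[j];
        for (int i = n - 1; i >= 0; --i) {
            for (int j = i + 1; j < n; ++j)
                b[i] -= A[i + j * lda] * b[j];
            b[i] /= A[i + i * lda];
        }
    }
    return 0;
}
```

A reference of this kind can be useful for checking small problems against the GPU result; it makes no attempt at blocking or performance.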

rocsolver_<type>gesv_batched()#

rocblas_status rocsolver_zgesv_batched(rocblas_handle handle, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *const B[], const rocblas_int ldb, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgesv_batched(rocblas_handle handle, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *const B[], const rocblas_int ldb, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgesv_batched(rocblas_handle handle, const rocblas_int n, const rocblas_int nrhs, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, double *const B[], const rocblas_int ldb, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgesv_batched(rocblas_handle handle, const rocblas_int n, const rocblas_int nrhs, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, float *const B[], const rocblas_int ldb, rocblas_int *info, const rocblas_int batch_count)#

GESV_BATCHED solves a batch of general systems of n linear equations on n variables.

The linear systems are of the form

\[ A_l X_l = B_l \]

where \(A_l\) is a general n-by-n matrix. Matrix \(A_l\) is first factorized in triangular factors \(L_l\) and \(U_l\) using GETRF_BATCHED; then, the solutions are computed with GETRS_BATCHED.

Parameters:
  • handle[in] rocblas_handle.

  • n[in] rocblas_int. n >= 0. The order of the system, i.e. the number of columns and rows of all A_l matrices.

  • nrhs[in] rocblas_int. nrhs >= 0. The number of right hand sides, i.e., the number of columns of all the matrices B_l.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l. On exit, if info[l] = 0, the factors L_l and U_l of the LU decomposition of A_l returned by GETRF_BATCHED.

  • lda[in] rocblas_int. lda >= n. The leading dimension of matrices A_l.

  • ipiv[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). The vectors ipiv_l of pivot indices returned by GETRF_BATCHED.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • B[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*nrhs. On entry, the right hand side matrices B_l. On exit, the solution matrix X_l of each system in the batch.

  • ldb[in] rocblas_int. ldb >= n. The leading dimension of matrices B_l.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for A_l. If info[l] = i > 0, U_l is singular, and the solution could not be computed. U_l[i,i] is the first zero element in the diagonal.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of instances (systems) in the batch.
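
The batched variant takes arrays of pointers for A and B rather than a base pointer plus a stride. The following sketch shows one way to build such a pointer array when the matrices happen to be stored back to back in a single allocation. The helper name `make_batch_pointers` is hypothetical; in a real program `base` would be a GPU pointer, and how the pointer array itself is made available to the GPU is outside the scope of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of the rocSOLVER API.
// Returns one pointer per instance: instance l starts at base + l * stride.
template <typename T>
std::vector<T*> make_batch_pointers(T* base, std::ptrdiff_t stride, int batch_count)
{
    assert(batch_count >= 0);
    std::vector<T*> ptrs(static_cast<std::size_t>(batch_count));
    for (int l = 0; l < batch_count; ++l)
        ptrs[static_cast<std::size_t>(l)] = base + l * stride;
    return ptrs;
}
```

With lda = n, a stride of lda*n elements places the matrices contiguously; the pointers do not have to be evenly spaced in general, which is the main difference from the strided-batched interface.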

rocsolver_<type>gesv_strided_batched()#

rocblas_status rocsolver_zgesv_strided_batched(rocblas_handle handle, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *B, const rocblas_int ldb, const rocblas_stride strideB, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgesv_strided_batched(rocblas_handle handle, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *B, const rocblas_int ldb, const rocblas_stride strideB, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgesv_strided_batched(rocblas_handle handle, const rocblas_int n, const rocblas_int nrhs, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, double *B, const rocblas_int ldb, const rocblas_stride strideB, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgesv_strided_batched(rocblas_handle handle, const rocblas_int n, const rocblas_int nrhs, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, float *B, const rocblas_int ldb, const rocblas_stride strideB, rocblas_int *info, const rocblas_int batch_count)#

GESV_STRIDED_BATCHED solves a batch of general systems of n linear equations on n variables.

The linear systems are of the form

\[ A_l X_l = B_l \]

where \(A_l\) is a general n-by-n matrix. Matrix \(A_l\) is first factorized in triangular factors \(L_l\) and \(U_l\) using GETRF_STRIDED_BATCHED; then, the solutions are computed with GETRS_STRIDED_BATCHED.

Parameters:
  • handle[in] rocblas_handle.

  • n[in] rocblas_int. n >= 0. The order of the system, i.e. the number of columns and rows of all A_l matrices.

  • nrhs[in] rocblas_int. nrhs >= 0. The number of right hand sides, i.e., the number of columns of all the matrices B_l.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l. On exit, if info[l] = 0, the factors L_l and U_l of the LU decomposition of A_l returned by GETRF_STRIDED_BATCHED.

  • lda[in] rocblas_int. lda >= n. The leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). The vectors ipiv_l of pivot indices returned by GETRF_STRIDED_BATCHED.

  • strideP[in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • B[inout] pointer to type. Array on the GPU (size depends on the value of strideB). On entry, the right hand side matrices B_l. On exit, the solution matrix X_l of each system in the batch.

  • ldb[in] rocblas_int. ldb >= n. The leading dimension of matrices B_l.

  • strideB[in] rocblas_stride. Stride from the start of one matrix B_l to the next one B_(l+1). There is no restriction for the value of strideB. Normal use case is strideB >= ldb*nrhs.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for A_l. If info[l] = i > 0, U_l is singular, and the solution could not be computed. U_l[i,i] is the first zero element in the diagonal.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of instances (systems) in the batch.
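
Because info holds one entry per instance, a caller typically inspects it after the call to see which systems were not solved. The sketch below assumes the info array has already been copied from the GPU into a host vector; the function name `failed_instances` is hypothetical, and the returned positions are 0-based C++ indices into the batch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of the rocSOLVER API.
// info_host is a host copy of the info array (batch_count entries).
// Returns the 0-based positions of the instances with info > 0, i.e. the
// systems whose factor U was singular and whose solution was not computed.
std::vector<std::size_t> failed_instances(const std::vector<int>& info_host)
{
    std::vector<std::size_t> failed;
    for (std::size_t l = 0; l < info_host.size(); ++l)
        if (info_host[l] > 0)
            failed.push_back(l);
    return failed;
}
```

For each reported position, the value stored in info gives the 1-based index i of the first zero diagonal element of the corresponding U factor.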

rocsolver_<type>potri()#

rocblas_status rocsolver_zpotri(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_cpotri(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_dpotri(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_spotri(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)#

POTRI inverts a symmetric/hermitian positive definite matrix A.

The inverse of matrix \(A\) is computed as

\[\begin{split} \begin{array}{cl} A^{-1} = U^{-1} {U^{-1}}' & \: \text{if uplo is upper, or}\\ A^{-1} = {L^{-1}}' L^{-1} & \: \text{if uplo is lower.} \end{array} \end{split}\]

where \(U\) or \(L\) is the triangular factor of the Cholesky factorization of \(A\) returned by POTRF.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the factor L or U of the Cholesky factorization of A returned by POTRF. On exit, the inverse of A if info = 0.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit for inversion of A. If info = i > 0, A is singular. L[i,i] or U[i,i] is zero.
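
As an illustration of the formula for the lower case, the following host-side sketch computes \(A^{-1} = {L^{-1}}' L^{-1}\) from a real lower Cholesky factor. It is a plain reference for small matrices, not the rocSOLVER implementation. The function name `reference_potri_lower` is hypothetical, storage is assumed to be column-major, and the sketch writes only the lower triangle of the result, which is an assumption consistent with the upper part not being used.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for illustration; not the rocSOLVER implementation.
// On entry, the lower triangle of A (column-major, leading dimension lda) holds L.
// On return with info = 0, the lower triangle of A holds the lower triangle of
// inverse(L * L'). Returns i > 0 (1-based) if L[i,i] is zero.
int reference_potri_lower(int n, std::vector<double>& A, int lda)
{
    assert(n >= 0 && lda >= n);
    for (int i = 0; i < n; ++i)
        if (A[i + i * lda] == 0.0)
            return i + 1;

    // Step 1: invert the triangular factor, column by column (L * Linv = I).
    std::vector<double> Linv(static_cast<std::size_t>(n) * n, 0.0);
    for (int j = 0; j < n; ++j) {
        Linv[j + j * n] = 1.0 / A[j + j * lda];
        for (int i = j + 1; i < n; ++i) {
            double s = 0.0;
            for (int k = j; k < i; ++k)
                s += A[i + k * lda] * Linv[k + j * n];
            Linv[i + j * n] = -s / A[i + i * lda];
        }
    }

    // Step 2: form the product Linv' * Linv; only entries with i >= j are written.
    for (int j = 0; j < n; ++j)
        for (int i = j; i < n; ++i) {
            double s = 0.0;
            for (int k = i; k < n; ++k)
                s += Linv[k + i * n] * Linv[k + j * n];
            A[i + j * lda] = s;
        }
    return 0;
}
```

The upper case follows the same pattern with \(U^{-1} {U^{-1}}'\); a complex version would use conjugate transposes.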

rocsolver_<type>potri_batched()#

rocblas_status rocsolver_zpotri_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cpotri_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dpotri_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_spotri_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

POTRI_BATCHED inverts a batch of symmetric/hermitian positive definite matrices \(A_l\).

The inverse of matrix \(A_l\) in the batch is computed as

\[\begin{split} \begin{array}{cl} A_l^{-1} = U_l^{-1} {U_l^{-1}}' & \: \text{if uplo is upper, or}\\ A_l^{-1} = {L_l^{-1}}' L_l^{-1} & \: \text{if uplo is lower.} \end{array} \end{split}\]

where \(U_l\) or \(L_l\) is the triangular factor of the Cholesky factorization of \(A_l\) returned by POTRF_BATCHED.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of matrix A_l.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the factor L_l or U_l of the Cholesky factorization of A_l returned by POTRF_BATCHED. On exit, the inverses of A_l if info[l] = 0.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for inversion of A_l. If info[l] = i > 0, A_l is singular. L_l[i,i] or U_l[i,i] is zero.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>potri_strided_batched()#

rocblas_status rocsolver_zpotri_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cpotri_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dpotri_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_spotri_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

POTRI_STRIDED_BATCHED inverts a batch of symmetric/hermitian positive definite matrices \(A_l\).

The inverse of matrix \(A_l\) in the batch is computed as

\[\begin{split} \begin{array}{cl} A_l^{-1} = U_l^{-1} {U_l^{-1}}' & \: \text{if uplo is upper, or}\\ A_l^{-1} = {L_l^{-1}}' L_l^{-1} & \: \text{if uplo is lower.} \end{array} \end{split}\]

where \(U_l\) or \(L_l\) is the triangular factor of the Cholesky factorization of \(A_l\) returned by POTRF_STRIDED_BATCHED.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of matrix A_l.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the factor L_l or U_l of the Cholesky factorization of A_l returned by POTRF_STRIDED_BATCHED. On exit, the inverses of A_l if info[l] = 0.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for inversion of A_l. If info[l] = i > 0, A_l is singular. L_l[i,i] or U_l[i,i] is zero.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>potrs()#

rocblas_status rocsolver_zpotrs(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *B, const rocblas_int ldb)#
rocblas_status rocsolver_cpotrs(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *B, const rocblas_int ldb)#
rocblas_status rocsolver_dpotrs(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, double *A, const rocblas_int lda, double *B, const rocblas_int ldb)#
rocblas_status rocsolver_spotrs(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, float *A, const rocblas_int lda, float *B, const rocblas_int ldb)#

POTRS solves a symmetric/hermitian system of n linear equations on n variables in its factorized form.

It solves the system

\[ A X = B \]

where A is a real symmetric (complex hermitian) positive definite matrix defined by its triangular factor

\[\begin{split} \begin{array}{cl} A = U'U & \: \text{if uplo is upper, or}\\ A = LL' & \: \text{if uplo is lower.} \end{array} \end{split}\]

as returned by POTRF.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. The order of the system, i.e. the number of columns and rows of A.

  • nrhs[in] rocblas_int. nrhs >= 0. The number of right hand sides, i.e., the number of columns of the matrix B.

  • A[in] pointer to type. Array on the GPU of dimension lda*n. The factor L or U of the Cholesky factorization of A returned by POTRF.

  • lda[in] rocblas_int. lda >= n. The leading dimension of A.

  • B[inout] pointer to type. Array on the GPU of dimension ldb*nrhs. On entry, the right hand side matrix B. On exit, the solution matrix X.

  • ldb[in] rocblas_int. ldb >= n. The leading dimension of B.
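
Given the factor, solving \(A X = B\) reduces to two triangular solves. The sketch below shows this for the lower case with real data: first \(L Y = B\) by forward substitution, then \(L' X = Y\) by back substitution. It is a host-side reference under stated assumptions (column-major storage, a nonsingular factor), and the function name `reference_potrs_lower` is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical CPU reference for illustration; not the rocSOLVER implementation.
// The lower triangle of A (column-major, leading dimension lda) holds L with
// A_original = L * L'. B is n-by-nrhs (leading dimension ldb) and is
// overwritten with the solution X. The diagonal of L is assumed to be nonzero.
void reference_potrs_lower(int n, int nrhs, const std::vector<double>& A, int lda,
                           std::vector<double>& B, int ldb)
{
    assert(n >= 0 && nrhs >= 0 && lda >= n && ldb >= n);
    for (int c = 0; c < nrhs; ++c) {
        double* b = B.data() + c * ldb;
        // Forward substitution: solve L * y = b.
        for (int i = 0; i < n; ++i) {
            for (int j = 0; j < i; ++j)
                b[i] -= A[i + j * lda] * b[j];
            b[i] /= A[i + i * lda];
        }
        // Back substitution: solve L' * x = y, where L'[i,j] = L[j,i].
        for (int i = n - 1; i >= 0; --i) {
            for (int j = i + 1; j < n; ++j)
                b[i] -= A[j + i * lda] * b[j];
            b[i] /= A[i + i * lda];
        }
    }
}
```

Note that the routine has no info argument: the factor is taken as given, so the sketch likewise performs no singularity check.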

rocsolver_<type>potrs_batched()#

rocblas_status rocsolver_zpotrs_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *const B[], const rocblas_int ldb, const rocblas_int batch_count)#
rocblas_status rocsolver_cpotrs_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *const B[], const rocblas_int ldb, const rocblas_int batch_count)#
rocblas_status rocsolver_dpotrs_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, double *const A[], const rocblas_int lda, double *const B[], const rocblas_int ldb, const rocblas_int batch_count)#
rocblas_status rocsolver_spotrs_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, float *const A[], const rocblas_int lda, float *const B[], const rocblas_int ldb, const rocblas_int batch_count)#

POTRS_BATCHED solves a batch of symmetric/hermitian systems of n linear equations on n variables in their factorized forms.

For each instance l in the batch, it solves the system

\[ A_l X_l = B_l \]

where \(A_l\) is a real symmetric (complex hermitian) positive definite matrix defined by its triangular factor

\[\begin{split} \begin{array}{cl} A_l^{} = U_l'U_l^{} & \: \text{if uplo is upper, or}\\ A_l^{} = L_l^{}L_l' & \: \text{if uplo is lower.} \end{array} \end{split}\]

as returned by POTRF_BATCHED.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The order of the system, i.e. the number of columns and rows of all A_l matrices.

  • nrhs[in] rocblas_int. nrhs >= 0. The number of right hand sides, i.e., the number of columns of all the matrices B_l.

  • A[in] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. The factor L_l or U_l of the Cholesky factorization of A_l returned by POTRF_BATCHED.

  • lda[in] rocblas_int. lda >= n. The leading dimension of matrices A_l.

  • B[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*nrhs. On entry, the right hand side matrices B_l. On exit, the solution matrix X_l of each system in the batch.

  • ldb[in] rocblas_int. ldb >= n. The leading dimension of matrices B_l.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of instances (systems) in the batch.

rocsolver_<type>potrs_strided_batched()#

rocblas_status rocsolver_zpotrs_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#
rocblas_status rocsolver_cpotrs_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#
rocblas_status rocsolver_dpotrs_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, double *A, const rocblas_int lda, const rocblas_stride strideA, double *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#
rocblas_status rocsolver_spotrs_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, float *A, const rocblas_int lda, const rocblas_stride strideA, float *B, const rocblas_int ldb, const rocblas_stride strideB, const rocblas_int batch_count)#

POTRS_STRIDED_BATCHED solves a batch of symmetric/hermitian systems of n linear equations on n variables in their factorized forms.

For each instance l in the batch, it solves the system

\[ A_l X_l = B_l \]

where \(A_l\) is a real symmetric (complex hermitian) positive definite matrix defined by its triangular factor

\[\begin{split} \begin{array}{cl} A_l^{} = U_l'U_l^{} & \: \text{if uplo is upper, or}\\ A_l^{} = L_l^{}L_l' & \: \text{if uplo is lower.} \end{array} \end{split}\]

as returned by POTRF_STRIDED_BATCHED.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The order of the system, i.e. the number of columns and rows of all A_l matrices.

  • nrhs[in] rocblas_int. nrhs >= 0. The number of right hand sides, i.e., the number of columns of all the matrices B_l.

  • A[in] pointer to type. Array on the GPU (the size depends on the value of strideA). The factor L_l or U_l of the Cholesky factorization of A_l returned by POTRF_STRIDED_BATCHED.

  • lda[in] rocblas_int. lda >= n. The leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • B[inout] pointer to type. Array on the GPU (size depends on the value of strideB). On entry, the right hand side matrices B_l. On exit, the solution matrix X_l of each system in the batch.

  • ldb[in] rocblas_int. ldb >= n. The leading dimension of matrices B_l.

  • strideB[in] rocblas_stride. Stride from the start of one matrix B_l to the next one B_(l+1). There is no restriction for the value of strideB. Normal use case is strideB >= ldb*nrhs.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of instances (systems) in the batch.

rocsolver_<type>posv()#

rocblas_status rocsolver_zposv(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *B, const rocblas_int ldb, rocblas_int *info)#
rocblas_status rocsolver_cposv(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *B, const rocblas_int ldb, rocblas_int *info)#
rocblas_status rocsolver_dposv(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, double *A, const rocblas_int lda, double *B, const rocblas_int ldb, rocblas_int *info)#
rocblas_status rocsolver_sposv(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, float *A, const rocblas_int lda, float *B, const rocblas_int ldb, rocblas_int *info)#

POSV solves a symmetric/hermitian system of n linear equations on n variables.

It solves the system

\[ A X = B \]

where A is a real symmetric (complex hermitian) positive definite matrix. Matrix A is first factorized as \(A=LL'\) or \(A=U'U\), depending on the value of uplo, using POTRF; then, the solution is computed with POTRS.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. The order of the system, i.e. the number of columns and rows of A.

  • nrhs[in] rocblas_int. nrhs >= 0. The number of right hand sides, i.e., the number of columns of the matrix B.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the symmetric/hermitian matrix A. On exit, if info = 0, the factor L or U of the Cholesky factorization of A returned by POTRF.

  • lda[in] rocblas_int. lda >= n. The leading dimension of A.

  • B[inout] pointer to type. Array on the GPU of dimension ldb*nrhs. On entry, the right hand side matrix B. On exit, the solution matrix X.

  • ldb[in] rocblas_int. ldb >= n. The leading dimension of B.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, the leading minor of order i of A is not positive definite. The solution could not be computed.
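
The meaning of info here differs from GESV: it reports the order of the first leading minor that is not positive definite. The following host-side sketch for the lower, real case shows where that value comes from, namely the first column at which the Cholesky pivot is not strictly positive. It is a simplified reference, not the rocSOLVER implementation; the function name `reference_posv_lower` is hypothetical and storage is assumed to be column-major.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical CPU reference for illustration; not the rocSOLVER implementation.
// On entry, the lower triangle of A (column-major, lda) holds the symmetric
// matrix and B (n-by-nrhs, ldb) the right hand sides. On return with info = 0,
// the lower triangle of A holds L and B holds X. Returns i > 0 (1-based) if
// the leading minor of order i is not positive definite; B is then unsolved.
int reference_posv_lower(int n, int nrhs, std::vector<double>& A, int lda,
                         std::vector<double>& B, int ldb)
{
    assert(n >= 0 && nrhs >= 0 && lda >= n && ldb >= n);
    // Step 1: Cholesky factorization A = L * L' (the POTRF step).
    for (int j = 0; j < n; ++j) {
        double d = A[j + j * lda];
        for (int k = 0; k < j; ++k)
            d -= A[j + k * lda] * A[j + k * lda];
        if (!(d > 0.0))
            return j + 1; // pivot not positive: stop and report the order
        d = std::sqrt(d);
        A[j + j * lda] = d;
        for (int i = j + 1; i < n; ++i) {
            double s = A[i + j * lda];
            for (int k = 0; k < j; ++k)
                s -= A[i + k * lda] * A[j + k * lda];
            A[i + j * lda] = s / d;
        }
    }
    // Step 2: solve L * Y = B, then L' * X = Y (the POTRS step).
    for (int c = 0; c < nrhs; ++c) {
        double* b = B.data() + c * ldb;
        for (int i = 0; i < n; ++i) {
            for (int j = 0; j < i; ++j)
                b[i] -= A[i + j * lda] * b[j];
            b[i] /= A[i + i * lda];
        }
        for (int i = n - 1; i >= 0; --i) {
            for (int j = i + 1; j < n; ++j)
                b[i] -= A[j + i * lda] * b[j];
            b[i] /= A[i + i * lda];
        }
    }
    return 0;
}
```

Only the lower triangle of A is read or written in this sketch, matching the statement that the other triangle is not used when uplo indicates lower.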

rocsolver_<type>posv_batched()#

rocblas_status rocsolver_zposv_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *const B[], const rocblas_int ldb, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cposv_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *const B[], const rocblas_int ldb, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dposv_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, double *const A[], const rocblas_int lda, double *const B[], const rocblas_int ldb, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sposv_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, float *const A[], const rocblas_int lda, float *const B[], const rocblas_int ldb, rocblas_int *info, const rocblas_int batch_count)#

POSV_BATCHED solves a batch of symmetric/hermitian systems of n linear equations on n variables.

For each instance l in the batch, it solves the system

\[ A_l X_l = B_l \]

where \(A_l\) is a real symmetric (complex hermitian) positive definite matrix. Matrix \(A_l\) is first factorized as \(A_l^{}=L_l^{}L_l'\) or \(A_l^{}=U_l'U_l^{}\), depending on the value of uplo, using POTRF_BATCHED; then, the solution is computed with POTRS_BATCHED.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The order of the system, i.e. the number of columns and rows of all A_l matrices.

  • nrhs[in] rocblas_int. nrhs >= 0. The number of right hand sides, i.e., the number of columns of all the matrices B_l.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the symmetric/hermitian matrices A_l. On exit, if info[l] = 0, the factor L_l or U_l of the Cholesky factorization of A_l returned by POTRF_BATCHED.

  • lda[in] rocblas_int. lda >= n. The leading dimension of matrices A_l.

  • B[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*nrhs. On entry, the right hand side matrices B_l. On exit, the solution matrix X_l of each system in the batch.

  • ldb[in] rocblas_int. ldb >= n. The leading dimension of matrices B_l.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit. If info[l] = i > 0, the leading minor of order i of A_l is not positive definite. The l-th solution could not be computed.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of instances (systems) in the batch.

rocsolver_<type>posv_strided_batched()#

rocblas_status rocsolver_zposv_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *B, const rocblas_int ldb, const rocblas_stride strideB, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cposv_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *B, const rocblas_int ldb, const rocblas_stride strideB, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dposv_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, double *A, const rocblas_int lda, const rocblas_stride strideA, double *B, const rocblas_int ldb, const rocblas_stride strideB, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sposv_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nrhs, float *A, const rocblas_int lda, const rocblas_stride strideA, float *B, const rocblas_int ldb, const rocblas_stride strideB, rocblas_int *info, const rocblas_int batch_count)#

POSV_STRIDED_BATCHED solves a batch of symmetric/hermitian systems of n linear equations on n variables.

For each instance l in the batch, it solves the system

\[ A_l X_l = B_l \]

where \(A_l\) is a real symmetric (complex hermitian) positive definite matrix. Matrix \(A_l\) is first factorized as \(A_l^{}=L_l^{}L_l'\) or \(A_l^{}=U_l'U_l^{}\), depending on the value of uplo, using POTRF_STRIDED_BATCHED; then, the solution is computed with POTRS_STRIDED_BATCHED.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. The order of the system, i.e. the number of columns and rows of all A_l matrices.

  • nrhs[in] rocblas_int. nrhs >= 0. The number of right hand sides, i.e., the number of columns of all the matrices B_l.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the symmetric/hermitian matrices A_l. On exit, if info[l] = 0, the factor L_l or U_l of the Cholesky factorization of A_l returned by POTRF_STRIDED_BATCHED.

  • lda[in] rocblas_int. lda >= n. The leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • B[inout] pointer to type. Array on the GPU (size depends on the value of strideB). On entry, the right hand side matrices B_l. On exit, the solution matrix X_l of each system in the batch.

  • ldb[in] rocblas_int. ldb >= n. The leading dimension of matrices B_l.

  • strideB[in] rocblas_stride. Stride from the start of one matrix B_l to the next one B_(l+1). There is no restriction for the value of strideB. Normal use case is strideB >= ldb*nrhs.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit. If info[l] = i > 0, the leading minor of order i of A_l is not positive definite. The l-th solution could not be computed.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of instances (systems) in the batch.
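The strided layout described by strideA and strideB can be illustrated with a small host-side sketch. It assumes the column-major storage that LAPACK-style routines use, 1-based row and column indices as in the notation of this document, and a batch index l counted from 0. The helper names strided_offset and strided_min_elems are hypothetical and not part of rocSOLVER.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a rocSOLVER function): offset, in elements, of
   entry [i,j] (1-based) of the l-th matrix (l counted from 0) in a strided
   batch stored column-major with leading dimension ld. */
size_t strided_offset(size_t l, size_t stride, size_t ld, size_t i, size_t j)
{
    assert(i >= 1 && j >= 1 && i <= ld);
    return l * stride + (i - 1) + (j - 1) * ld;
}

/* Hypothetical helper: smallest element count that holds batch_count
   matrices of `cols` columns each, assuming the "normal use case"
   stride >= ld*cols so that consecutive matrices do not overlap. */
size_t strided_min_elems(size_t stride, size_t ld, size_t cols, size_t batch_count)
{
    assert(stride >= ld * cols);
    if (batch_count == 0)
        return 0;
    return (batch_count - 1) * stride + ld * cols;
}
```

For A one would pass stride = strideA, ld = lda and cols = n; for B, stride = strideB, ld = ldb and cols = nrhs.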

Least-squares solvers#

rocsolver_<type>gels()#

rocblas_status rocsolver_zgels(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *B, const rocblas_int ldb, rocblas_int *info)#
rocblas_status rocsolver_cgels(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *B, const rocblas_int ldb, rocblas_int *info)#
rocblas_status rocsolver_dgels(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, double *A, const rocblas_int lda, double *B, const rocblas_int ldb, rocblas_int *info)#
rocblas_status rocsolver_sgels(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, float *A, const rocblas_int lda, float *B, const rocblas_int ldb, rocblas_int *info)#

GELS solves an overdetermined (or underdetermined) linear system defined by an m-by-n matrix A, and a corresponding matrix B, using the QR factorization computed by GEQRF (or the LQ factorization computed by GELQF).

Depending on the value of trans, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A X = B & \: \text{not transposed, or}\\ A' X = B & \: \text{transposed if real, or conjugate transposed if complex} \end{array} \end{split}\]

If m >= n (or m < n in the case of transpose/conjugate transpose), the system is overdetermined and a least-squares solution approximating X is found by minimizing

\[ || B - A X || \quad \text{(or} \: || B - A' X ||\text{)} \]

If m < n (or m >= n in the case of transpose/conjugate transpose), the system is underdetermined and the minimum-norm solution is returned, i.e. among all solutions, the X for which \(|| X ||\) is minimal.

Parameters:
  • handle[in] rocblas_handle.

  • trans[in] rocblas_operation. Specifies the form of the system of equations.

  • m[in] rocblas_int. m >= 0. The number of rows of matrix A.

  • n[in] rocblas_int. n >= 0. The number of columns of matrix A.

  • nrhs[in] rocblas_int. nrhs >= 0. The number of columns of matrices B and X; i.e., the columns on the right hand side.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A. On exit, the QR (or LQ) factorization of A as returned by GEQRF (or GELQF).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrix A.

  • B[inout] pointer to type. Array on the GPU of dimension ldb*nrhs. On entry, the matrix B. On exit, when info = 0, B is overwritten by the solution vectors (and the residuals in the overdetermined cases) stored as columns.

  • ldb[in] rocblas_int. ldb >= max(m,n). Specifies the leading dimension of matrix B.

  • info[out] pointer to rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, the solution could not be computed because input matrix A is rank deficient; the i-th diagonal element of its triangular factor is zero.
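The dimension rules above can be summarized in a short host-side sketch. Since A is m-by-n, solving A X = B means B has m rows and X has n rows, while solving A' X = B swaps the two; this is presumably why ldb must be at least max(m,n), as B holds the right-hand sides on entry and the solution on exit. The type gels_shape and the function gels_describe are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of a GELS problem; not a rocSOLVER type. */
typedef struct {
    int rows_b;          /* rows of B holding the right-hand sides on entry */
    int rows_x;          /* rows of B holding the solution on exit */
    bool overdetermined; /* true: least-squares, false: minimum-norm */
} gels_shape;

/* transposed == false describes A X = B; transposed == true describes
   A' X = B. A is m-by-n in both cases. The overdetermined test follows
   the wording of the description above. */
gels_shape gels_describe(bool transposed, int m, int n)
{
    assert(m >= 0 && n >= 0);
    gels_shape s;
    s.rows_b = transposed ? n : m;
    s.rows_x = transposed ? m : n;
    s.overdetermined = transposed ? (m < n) : (m >= n);
    return s;
}
```

With m = 5 and n = 3, the non-transposed problem is overdetermined with a 3-row solution, and the transposed problem is underdetermined with a 5-row solution.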

rocsolver_<type>gels_batched()#

rocblas_status rocsolver_zgels_batched(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *const B[], const rocblas_int ldb, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgels_batched(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *const B[], const rocblas_int ldb, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgels_batched(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, double *const A[], const rocblas_int lda, double *const B[], const rocblas_int ldb, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgels_batched(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, float *const A[], const rocblas_int lda, float *const B[], const rocblas_int ldb, rocblas_int *info, const rocblas_int batch_count)#

GELS_BATCHED solves a batch of overdetermined (or underdetermined) linear systems defined by a set of m-by-n matrices \(A_l\), and corresponding matrices \(B_l\), using the QR factorizations computed by GEQRF_BATCHED (or the LQ factorizations computed by GELQF_BATCHED).

For each instance in the batch, depending on the value of trans, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = B_l & \: \text{not transposed, or}\\ A_l' X_l^{} = B_l^{} & \: \text{transposed if real, or conjugate transposed if complex} \end{array} \end{split}\]

If m >= n (or m < n in the case of transpose/conjugate transpose), the system is overdetermined and a least-squares solution approximating X_l is found by minimizing

\[ || B_l - A_l X_l || \quad \text{(or} \: || B_l^{} - A_l' X_l^{} ||\text{)} \]

If m < n (or m >= n in the case of transpose/conjugate transpose), the system is underdetermined and the minimum-norm solution is returned, i.e. among all solutions, the X_l for which \(|| X_l ||\) is minimal.

Parameters:
  • handle[in] rocblas_handle.

  • trans[in] rocblas_operation. Specifies the form of the system of equations.

  • m[in] rocblas_int. m >= 0. The number of rows of all matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all matrices A_l in the batch.

  • nrhs[in] rocblas_int. nrhs >= 0. The number of columns of all matrices B_l and X_l in the batch; i.e., the columns on the right hand side.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l. On exit, the QR (or LQ) factorizations of A_l as returned by GEQRF_BATCHED (or GELQF_BATCHED).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • B[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*nrhs. On entry, the matrices B_l. On exit, when info[l] = 0, B_l is overwritten by the solution vectors (and the residuals in the overdetermined cases) stored as columns.

  • ldb[in] rocblas_int. ldb >= max(m,n). Specifies the leading dimension of matrices B_l.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for solution of A_l. If info[l] = i > 0, the solution of A_l could not be computed because input matrix A_l is rank deficient; the i-th diagonal element of its triangular factor is zero.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
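The batched variants take an array of pointers, one per matrix. A common way to build such an array is to carve the matrices out of one contiguous allocation, as in the following sketch. The helper fill_batch_pointers is a hypothetical name, and the code runs on ordinary host memory purely to show the pointer arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a rocSOLVER function): set ptrs[l] to the start
   of the l-th matrix inside one contiguous buffer `base`, where
   consecutive matrices are `stride` elements apart. */
void fill_batch_pointers(double **ptrs, double *base, size_t stride, int batch_count)
{
    assert(batch_count >= 0);
    for (int l = 0; l < batch_count; ++l)
        ptrs[l] = base + (size_t)l * stride;
}
```

In an actual program, base would be a device allocation of at least batch_count*lda*n elements, and the pointer array itself would typically also have to be copied to the GPU before it is passed as A or B; both points are assumptions about usage rather than statements taken from the text above.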

rocsolver_<type>gels_strided_batched()#

rocblas_status rocsolver_zgels_strided_batched(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *B, const rocblas_int ldb, const rocblas_stride strideB, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgels_strided_batched(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *B, const rocblas_int ldb, const rocblas_stride strideB, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgels_strided_batched(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, double *A, const rocblas_int lda, const rocblas_stride strideA, double *B, const rocblas_int ldb, const rocblas_stride strideB, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgels_strided_batched(rocblas_handle handle, rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int nrhs, float *A, const rocblas_int lda, const rocblas_stride strideA, float *B, const rocblas_int ldb, const rocblas_stride strideB, rocblas_int *info, const rocblas_int batch_count)#

GELS_STRIDED_BATCHED solves a batch of overdetermined (or underdetermined) linear systems defined by a set of m-by-n matrices \(A_l\), and corresponding matrices \(B_l\), using the QR factorizations computed by GEQRF_STRIDED_BATCHED (or the LQ factorizations computed by GELQF_STRIDED_BATCHED).

For each instance in the batch, depending on the value of trans, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = B_l & \: \text{not transposed, or}\\ A_l' X_l^{} = B_l^{} & \: \text{transposed if real, or conjugate transposed if complex} \end{array} \end{split}\]

If m >= n (or m < n in the case of transpose/conjugate transpose), the system is overdetermined and a least-squares solution approximating X_l is found by minimizing

\[ || B_l - A_l X_l || \quad \text{(or} \: || B_l^{} - A_l' X_l^{} ||\text{)} \]

If m < n (or m >= n in the case of transpose/conjugate transpose), the system is underdetermined and the minimum-norm solution is returned, i.e. among all solutions, the X_l for which \(|| X_l ||\) is minimal.

Parameters:
  • handle[in] rocblas_handle.

  • trans[in] rocblas_operation. Specifies the form of the system of equations.

  • m[in] rocblas_int. m >= 0. The number of rows of all matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all matrices A_l in the batch.

  • nrhs[in] rocblas_int. nrhs >= 0. The number of columns of all matrices B_l and X_l in the batch; i.e., the columns on the right hand side.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l. On exit, the QR (or LQ) factorizations of A_l as returned by GEQRF_STRIDED_BATCHED (or GELQF_STRIDED_BATCHED).

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • B[inout] pointer to type. Array on the GPU (the size depends on the value of strideB). On entry, the matrices B_l. On exit, when info[l] = 0, each B_l is overwritten by the solution vectors (and the residuals in the overdetermined cases) stored as columns.

  • ldb[in] rocblas_int. ldb >= max(m,n). Specifies the leading dimension of matrices B_l.

  • strideB[in] rocblas_stride. Stride from the start of one matrix B_l to the next one B_(l+1). There is no restriction for the value of strideB. Normal use case is strideB >= ldb*nrhs.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for solution of A_l. If info[l] = i > 0, the solution of A_l could not be computed because input matrix A_l is rank deficient; the i-th diagonal element of its triangular factor is zero.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
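Because info is an array with one entry per instance, a caller usually needs to scan it after the call. The sketch below does this on a host copy of the array (obtained, for example, with hipMemcpy). It assumes rocblas_int is a plain int, which is an assumption about the build configuration, and the helper name count_failed_instances is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper (not a rocSOLVER function): count the instances with
   a nonzero info value in a host copy of the info array, and report the
   index of the first one through *first_failed (-1 if every instance
   succeeded). Indices are counted from 0. */
int count_failed_instances(const int *h_info, int batch_count, int *first_failed)
{
    assert(batch_count >= 0);
    int failed = 0;
    *first_failed = -1;
    for (int l = 0; l < batch_count; ++l) {
        if (h_info[l] != 0) {
            if (failed == 0)
                *first_failed = l;
            ++failed;
        }
    }
    return failed;
}
```

For a failed instance, the corresponding B_l should not be interpreted as a solution.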

Symmetric eigensolvers#

rocsolver_<type>syev()#

rocblas_status rocsolver_dsyev(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, rocblas_int *info)#
rocblas_status rocsolver_ssyev(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, rocblas_int *info)#

SYEV computes the eigenvalues and optionally the eigenvectors of a real symmetric matrix A.

The eigenvalues are returned in ascending order. Whether the eigenvectors are computed depends on the value of evect. The computed eigenvectors are orthonormal.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A. On exit, the eigenvectors of A if they were computed and the algorithm converged; otherwise the contents of A are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrix A.

  • D[out] pointer to type. Array on the GPU of dimension n. The eigenvalues of A in increasing order.

  • E[out] pointer to type. Array on the GPU of dimension n. This array is used to work internally with the tridiagonal matrix T associated with A. On exit, if info > 0, it contains the unconverged off-diagonal elements of T (or properly speaking, a tridiagonal matrix equivalent to T). The diagonal elements of this matrix are in D; those that converged correspond to a subset of the eigenvalues of A (not necessarily ordered).

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, the algorithm did not converge. i elements of E did not converge to zero.
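The statement that the computed eigenvectors are orthonormal can be checked on the host once A has been copied back. The sketch below assumes that the eigenvectors are stored as the columns of A in column-major order, as in LAPACK; the function name orthonormality_error is hypothetical. It returns the largest absolute entry of V'V - I, which should be close to rounding error for a successful run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side check (not a rocSOLVER function): largest
   |(V'V - I)[i,j]| for an n-by-n column-major matrix V with leading
   dimension ldv. */
double orthonormality_error(const double *V, int n, int ldv)
{
    assert(n >= 0 && ldv >= n);
    double worst = 0.0;
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            double dot = 0.0;
            for (int k = 0; k < n; ++k)
                dot += V[k + (size_t)i * ldv] * V[k + (size_t)j * ldv];
            double err = dot - (i == j ? 1.0 : 0.0);
            if (err < 0.0)
                err = -err;
            if (err > worst)
                worst = err;
        }
    }
    return worst;
}
```

A suitable tolerance depends on n and on the precision in use; a small multiple of n times the machine epsilon is a common choice.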

rocsolver_<type>syev_batched()#

rocblas_status rocsolver_dsyev_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssyev_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#

SYEV_BATCHED computes the eigenvalues and optionally the eigenvectors of a batch of real symmetric matrices A_l.

The eigenvalues are returned in ascending order. Whether the eigenvectors are computed depends on the value of evect. The computed eigenvectors are orthonormal.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrices A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrices A_l.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l. On exit, the eigenvectors of A_l if they were computed and the algorithm converged; otherwise the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • D[out] pointer to type. Array on the GPU (the size depends on the value of strideD). The eigenvalues of A_l in increasing order.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.

  • E[out] pointer to type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the tridiagonal matrix T_l associated with A_l. On exit, if info[l] > 0, E_l contains the unconverged off-diagonal elements of T_l (or properly speaking, a tridiagonal matrix equivalent to T_l). The diagonal elements of this matrix are in D_l; those that converged correspond to a subset of the eigenvalues of A_l (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for matrix A_l. If info[l] = i > 0, the algorithm did not converge. i elements of E_l did not converge to zero.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>syev_strided_batched()#

rocblas_status rocsolver_dsyev_strided_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssyev_strided_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#

SYEV_STRIDED_BATCHED computes the eigenvalues and optionally the eigenvectors of a batch of real symmetric matrices A_l.

The eigenvalues are returned in ascending order. Whether the eigenvectors are computed depends on the value of evect. The computed eigenvectors are orthonormal.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrices A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrices A_l.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l. On exit, the eigenvectors of A_l if they were computed and the algorithm converged; otherwise the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • D[out] pointer to type. Array on the GPU (the size depends on the value of strideD). The eigenvalues of A_l in increasing order.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.

  • E[out] pointer to type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the tridiagonal matrix T_l associated with A_l. On exit, if info[l] > 0, E_l contains the unconverged off-diagonal elements of T_l (or properly speaking, a tridiagonal matrix equivalent to T_l). The diagonal elements of this matrix are in D_l; those that converged correspond to a subset of the eigenvalues of A_l (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for matrix A_l. If info[l] = i > 0, the algorithm did not converge. i elements of E_l did not converge to zero.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
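Since the eigenvalues of each matrix are returned in increasing order and consecutive vectors D_l are strideD elements apart, the smallest and largest eigenvalue of every instance can be read directly from a host copy of D. The following sketch assumes the normal use case strideD >= n and a batch index counted from 0; the function name batch_extreme_eigenvalues is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a rocSOLVER function): for each instance l,
   read the smallest and largest eigenvalue from a host copy of D. Relies
   on the eigenvalues being sorted in increasing order, so it is only
   meaningful for instances with info[l] == 0. */
void batch_extreme_eigenvalues(const double *h_D, size_t strideD, int n,
                               int batch_count, double *min_out, double *max_out)
{
    assert(n >= 1 && batch_count >= 0 && strideD >= (size_t)n);
    for (int l = 0; l < batch_count; ++l) {
        const double *D_l = h_D + (size_t)l * strideD;
        min_out[l] = D_l[0];
        max_out[l] = D_l[n - 1];
    }
}
```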

rocsolver_<type>heev()#

rocblas_status rocsolver_zheev(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_int *info)#
rocblas_status rocsolver_cheev(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_int *info)#

HEEV computes the eigenvalues and optionally the eigenvectors of a Hermitian matrix A.

The eigenvalues are returned in ascending order. Whether the eigenvectors are computed depends on the value of evect. The computed eigenvectors are orthonormal.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the Hermitian matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A. On exit, the eigenvectors of A if they were computed and the algorithm converged; otherwise the contents of A are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrix A.

  • D[out] pointer to real type. Array on the GPU of dimension n. The eigenvalues of A in increasing order.

  • E[out] pointer to real type. Array on the GPU of dimension n. This array is used to work internally with the tridiagonal matrix T associated with A. On exit, if info > 0, it contains the unconverged off-diagonal elements of T (or properly speaking, a tridiagonal matrix equivalent to T). The diagonal elements of this matrix are in D; those that converged correspond to a subset of the eigenvalues of A (not necessarily ordered).

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, the algorithm did not converge. i elements of E did not converge to zero.
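A Hermitian matrix has real eigenvalues, which is why D and E are real arrays even though A is complex. As a sanity check on input data, the sketch below tests on the host whether a column-major matrix is exactly Hermitian. The struct cplx is a hypothetical stand-in for a double-precision complex value (no claim is made about the layout of rocblas_double_complex), and is_hermitian is likewise a hypothetical name. Note that HEEV itself only reads the triangle selected by uplo, so this check is optional.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a double-precision complex number. */
typedef struct {
    double re;
    double im;
} cplx;

/* Hypothetical host-side check (not a rocSOLVER function): true if the
   n-by-n column-major matrix A with leading dimension lda has a real
   diagonal and satisfies A[i,j] == conj(A[j,i]) exactly. */
bool is_hermitian(const cplx *A, int n, int lda)
{
    assert(n >= 0 && lda >= n);
    for (int j = 0; j < n; ++j) {
        if (A[j + (size_t)j * lda].im != 0.0)
            return false;
        for (int i = j + 1; i < n; ++i) {
            cplx lower = A[i + (size_t)j * lda];
            cplx upper = A[j + (size_t)i * lda];
            if (lower.re != upper.re || lower.im != -upper.im)
                return false;
        }
    }
    return true;
}
```

The comparison is exact; data produced by floating-point computation would normally call for a tolerance instead.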

rocsolver_<type>heev_batched()#

rocblas_status rocsolver_zheev_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cheev_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#

HEEV_BATCHED computes the eigenvalues and optionally the eigenvectors of a batch of Hermitian matrices A_l.

The eigenvalues are returned in ascending order. Whether the eigenvectors are computed depends on the value of evect. The computed eigenvectors are orthonormal.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the Hermitian matrices A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrices A_l.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l. On exit, the eigenvectors of A_l if they were computed and the algorithm converged; otherwise the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • D[out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The eigenvalues of A_l in increasing order.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the tridiagonal matrix T_l associated with A_l. On exit, if info[l] > 0, E_l contains the unconverged off-diagonal elements of T_l (or properly speaking, a tridiagonal matrix equivalent to T_l). The diagonal elements of this matrix are in D_l; those that converged correspond to a subset of the eigenvalues of A_l (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for matrix A_l. If info[l] = i > 0, the algorithm did not converge. i elements of E_l did not converge to zero.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>heev_strided_batched()#

rocblas_status rocsolver_zheev_strided_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cheev_strided_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#

HEEV_STRIDED_BATCHED computes the eigenvalues and optionally the eigenvectors of a batch of Hermitian matrices A_l.

The eigenvalues are returned in ascending order. Whether the eigenvectors are computed depends on the value of evect. The computed eigenvectors are orthonormal.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the Hermitian matrices A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrices A_l.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l. On exit, the eigenvectors of A_l if they were computed and the algorithm converged; otherwise the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • D[out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The eigenvalues of A_l in increasing order.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the tridiagonal matrix T_l associated with A_l. On exit, if info[l] > 0, E_l contains the unconverged off-diagonal elements of T_l (or properly speaking, a tridiagonal matrix equivalent to T_l). The diagonal elements of this matrix are in D_l; those that converged correspond to a subset of the eigenvalues of A_l (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for matrix A_l. If info[l] = i > 0, the algorithm did not converge. i elements of E_l did not converge to zero.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>syevd()#

rocblas_status rocsolver_dsyevd(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, rocblas_int *info)#
rocblas_status rocsolver_ssyevd(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, rocblas_int *info)#

SYEVD computes the eigenvalues and optionally the eigenvectors of a real symmetric matrix A.

The eigenvalues are returned in ascending order. If requested through evect, the eigenvectors are computed using a divide-and-conquer algorithm. The computed eigenvectors are orthonormal.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A. On exit, the eigenvectors of A if they were computed and the algorithm converged; otherwise the contents of A are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrix A.

  • D[out] pointer to type. Array on the GPU of dimension n. The eigenvalues of A in increasing order.

  • E[out] pointer to type. Array on the GPU of dimension n. This array is used to work internally with the tridiagonal matrix T associated with A. On exit, if info > 0, it contains the unconverged off-diagonal elements of T (or properly speaking, a tridiagonal matrix equivalent to T). The diagonal elements of this matrix are in D; those that converged correspond to a subset of the eigenvalues of A (not necessarily ordered).

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0 and evect is rocblas_evect_none, the algorithm did not converge. i elements of E did not converge to zero. If info = i > 0 and evect is rocblas_evect_original, the algorithm failed to compute an eigenvalue in the submatrix from [i/(n+1), i/(n+1)] to [i%(n+1), i%(n+1)].
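When evect is rocblas_evect_original, a positive info value packs two indices into one integer. The sketch below unpacks them, assuming that / denotes truncating integer division and % the remainder, as in C. The type syevd_block and the function syevd_failed_block are hypothetical names.

```c
#include <assert.h>

/* Hypothetical result type: first and last row/column index of the
   submatrix in which the eigenvalue computation failed. */
typedef struct {
    int first;
    int last;
} syevd_block;

/* Hypothetical helper (not a rocSOLVER function): decode a positive info
   value returned with evect = rocblas_evect_original for a matrix of
   order n. */
syevd_block syevd_failed_block(int info, int n)
{
    assert(n >= 1 && info > 0);
    syevd_block b;
    b.first = info / (n + 1);
    b.last = info % (n + 1);
    return b;
}
```

For example, with n = 10 an info value of 40 decodes to the submatrix from [3, 3] to [7, 7], since 40 = 3*11 + 7.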

rocsolver_<type>syevd_batched()#

rocblas_status rocsolver_dsyevd_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssyevd_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#

SYEVD_BATCHED computes the eigenvalues and optionally the eigenvectors of a batch of real symmetric matrices A_l.

The eigenvalues are returned in ascending order. If evect requests them, the eigenvectors are computed using a divide-and-conquer algorithm; the computed eigenvectors are orthonormal.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrices A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrices A_l.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l. On exit, the eigenvectors of A_l if they were computed and the algorithm converged; otherwise the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • D[out] pointer to type. Array on the GPU (the size depends on the value of strideD). The eigenvalues of A_l in increasing order.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.

  • E[out] pointer to type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the tridiagonal matrix T_l associated with A_l. On exit, if info[l] > 0, E_l contains the unconverged off-diagonal elements of T_l (or properly speaking, a tridiagonal matrix equivalent to T_l). The diagonal elements of this matrix are in D_l; those that converged correspond to a subset of the eigenvalues of A_l (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for matrix A_l. If info[l] = i > 0 and evect is rocblas_evect_none, the algorithm did not converge. i elements of E_l did not converge to zero. If info[l] = i > 0 and evect is rocblas_evect_original, the algorithm failed to compute an eigenvalue in the submatrix from [i/(n+1), i/(n+1)] to [i%(n+1), i%(n+1)].

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
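
The strides strideD and strideE determine where each per-matrix vector starts and how large the D and E allocations need to be. The sketch below illustrates that arithmetic only; the function names are hypothetical, the batch index l is assumed to start at 0 as in C arrays, and the size formula assumes the normal use case stride >= n.

```c
#include <assert.h>
#include <stdint.h>

/* Offset, in elements, of vector l inside a strided array:
   D_l starts at D + l * strideD (l counted from 0 here). */
int64_t batch_vector_offset(int64_t l, int64_t stride)
{
    return l * stride;
}

/* Smallest element count that holds batch_count vectors of length n
   laid out with the given stride, assuming stride >= n. */
int64_t batch_vector_min_elems(int64_t n, int64_t stride, int64_t batch_count)
{
    assert(n >= 0 && batch_count >= 0 && stride >= n);
    if (n == 0 || batch_count == 0)
        return 0;
    /* The last vector only needs n elements, not a full stride. */
    return (batch_count - 1) * stride + n;
}
```

With n = 10, strideD = 16 and batch_count = 4, the fourth vector starts at offset 48 and the array needs at least 58 elements.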

rocsolver_<type>syevd_strided_batched()#

rocblas_status rocsolver_dsyevd_strided_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssyevd_strided_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#

SYEVD_STRIDED_BATCHED computes the eigenvalues and optionally the eigenvectors of a batch of real symmetric matrices A_l.

The eigenvalues are returned in ascending order. If evect requests them, the eigenvectors are computed using a divide-and-conquer algorithm; the computed eigenvectors are orthonormal.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrices A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrices A_l.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l. On exit, the eigenvectors of A_l if they were computed and the algorithm converged; otherwise the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • D[out] pointer to type. Array on the GPU (the size depends on the value of strideD). The eigenvalues of A_l in increasing order.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.

  • E[out] pointer to type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the tridiagonal matrix T_l associated with A_l. On exit, if info[l] > 0, E_l contains the unconverged off-diagonal elements of T_l (or properly speaking, a tridiagonal matrix equivalent to T_l). The diagonal elements of this matrix are in D_l; those that converged correspond to a subset of the eigenvalues of A_l (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for matrix A_l. If info[l] = i > 0 and evect is rocblas_evect_none, the algorithm did not converge. i elements of E_l did not converge to zero. If info[l] = i > 0 and evect is rocblas_evect_original, the algorithm failed to compute an eigenvalue in the submatrix from [i/(n+1), i/(n+1)] to [i%(n+1), i%(n+1)].

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
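
In the strided-batched layout, all matrices live in one array and strideA separates consecutive matrices. As an illustration, the sketch below computes the linear position of element A_l[i,j], assuming column-major storage as in LAPACK, 1-based i and j as in the notation of this document, and a 0-based batch index l. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Linear index of A_l[i,j] in a strided batch of column-major matrices.
   i and j are 1-based; l is 0-based. */
int64_t strided_elem_index(int64_t l, int64_t i, int64_t j,
                           int64_t lda, int64_t strideA)
{
    assert(l >= 0 && i >= 1 && j >= 1 && i <= lda);
    /* l * strideA skips whole matrices, (j - 1) * lda skips whole columns. */
    return l * strideA + (j - 1) * lda + (i - 1);
}
```

For 3-by-3 matrices stored with lda = 4 and strideA = 12, element [2,3] of the third matrix (l = 2) sits at index 33.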

rocsolver_<type>heevd()#

rocblas_status rocsolver_zheevd(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_int *info)#
rocblas_status rocsolver_cheevd(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_int *info)#

HEEVD computes the eigenvalues and optionally the eigenvectors of a Hermitian matrix A.

The eigenvalues are returned in ascending order. If evect requests them, the eigenvectors are computed using a divide-and-conquer algorithm; the computed eigenvectors are orthonormal.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the Hermitian matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A. On exit, the eigenvectors of A if they were computed and the algorithm converged; otherwise the contents of A are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrix A.

  • D[out] pointer to real type. Array on the GPU of dimension n. The eigenvalues of A in increasing order.

  • E[out] pointer to real type. Array on the GPU of dimension n. This array is used to work internally with the tridiagonal matrix T associated with A. On exit, if info > 0, it contains the unconverged off-diagonal elements of T (or properly speaking, a tridiagonal matrix equivalent to T). The diagonal elements of this matrix are in D; those that converged correspond to a subset of the eigenvalues of A (not necessarily ordered).

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0 and evect is rocblas_evect_none, the algorithm did not converge. i elements of E did not converge to zero. If info = i > 0 and evect is rocblas_evect_original, the algorithm failed to compute an eigenvalue in the submatrix from [i/(n+1), i/(n+1)] to [i%(n+1), i%(n+1)].
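
Note that A holds complex elements while D and E hold real ones, so the buffers differ in element size. The sketch below computes the byte counts for the double-precision case. It is only an illustration: complex_f64 is a hypothetical stand-in for rocblas_double_complex, assumed here to be two packed doubles, and the function names are not part of rocSOLVER.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for rocblas_double_complex (assumed layout:
   real part followed by imaginary part, no padding). */
typedef struct {
    double re;
    double im;
} complex_f64;

/* Bytes for the complex matrix A of dimension lda*n. */
size_t heevd_A_bytes(size_t n, size_t lda)
{
    assert(lda >= n);
    return lda * n * sizeof(complex_f64);
}

/* Bytes for each of the real vectors D and E of dimension n. */
size_t heevd_DE_bytes(size_t n)
{
    return n * sizeof(double);
}
```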

rocsolver_<type>heevd_batched()#

rocblas_status rocsolver_zheevd_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cheevd_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#

HEEVD_BATCHED computes the eigenvalues and optionally the eigenvectors of a batch of Hermitian matrices A_l.

The eigenvalues are returned in ascending order. If evect requests them, the eigenvectors are computed using a divide-and-conquer algorithm; the computed eigenvectors are orthonormal.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the Hermitian matrices A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrices A_l.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l. On exit, the eigenvectors of A_l if they were computed and the algorithm converged; otherwise the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • D[out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The eigenvalues of A_l in increasing order.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the tridiagonal matrix T_l associated with A_l. On exit, if info[l] > 0, E_l contains the unconverged off-diagonal elements of T_l (or properly speaking, a tridiagonal matrix equivalent to T_l). The diagonal elements of this matrix are in D_l; those that converged correspond to a subset of the eigenvalues of A_l (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for matrix A_l. If info[l] = i > 0 and evect is rocblas_evect_none, the algorithm did not converge. i elements of E_l did not converge to zero. If info[l] = i > 0 and evect is rocblas_evect_original, the algorithm failed to compute an eigenvalue in the submatrix from [i/(n+1), i/(n+1)] to [i%(n+1), i%(n+1)].

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>heevd_strided_batched()#

rocblas_status rocsolver_zheevd_strided_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cheevd_strided_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#

HEEVD_STRIDED_BATCHED computes the eigenvalues and optionally the eigenvectors of a batch of Hermitian matrices A_l.

The eigenvalues are returned in ascending order. If evect requests them, the eigenvectors are computed using a divide-and-conquer algorithm; the computed eigenvectors are orthonormal.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the Hermitian matrices A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrices A_l.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l. On exit, the eigenvectors of A_l if they were computed and the algorithm converged; otherwise the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • D[out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The eigenvalues of A_l in increasing order.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the tridiagonal matrix T_l associated with A_l. On exit, if info[l] > 0, E_l contains the unconverged off-diagonal elements of T_l (or properly speaking, a tridiagonal matrix equivalent to T_l). The diagonal elements of this matrix are in D_l; those that converged correspond to a subset of the eigenvalues of A_l (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for matrix A_l. If info[l] = i > 0 and evect is rocblas_evect_none, the algorithm did not converge. i elements of E_l did not converge to zero. If info[l] = i > 0 and evect is rocblas_evect_original, the algorithm failed to compute an eigenvalue in the submatrix from [i/(n+1), i/(n+1)] to [i%(n+1), i%(n+1)].

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>syevx()#

rocblas_status rocsolver_dsyevx(rocblas_handle handle, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, const double abstol, rocblas_int *nev, double *W, double *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)#
rocblas_status rocsolver_ssyevx(rocblas_handle handle, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, const float abstol, rocblas_int *nev, float *W, float *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)#

SYEVX computes a set of the eigenvalues and optionally the corresponding eigenvectors of a real symmetric matrix A.

This function computes all the eigenvalues of A, all the eigenvalues in the half-open interval \((vl, vu]\), or the il-th through iu-th eigenvalues, depending on the value of erange. If evect is rocblas_evect_original, the eigenvectors for these eigenvalues will be computed as well.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • erange[in] rocblas_erange. Specifies the type of range or interval of the eigenvalues to be computed.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A. On exit, the contents of A are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrix A.

  • vl[in] type. vl < vu. The lower bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A or the eigenvalues within a set of indices.

  • vu[in] type. vl < vu. The upper bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A or the eigenvalues within a set of indices.

  • il[in] rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise. The index of the smallest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A or the eigenvalues in a half-open interval.

  • iu[in] rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise. The index of the largest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A or the eigenvalues in a half-open interval.

  • abstol[in] type. The absolute tolerance. An eigenvalue is considered to be located if it lies in an interval whose width is <= abstol. If abstol is negative, then machine-epsilon times the 1-norm of the tridiagonal form of A will be used as tolerance. If abstol=0, then the tolerance will be set to twice the underflow threshold; this is the tolerance that could get the most accurate results.

  • nev[out] pointer to a rocblas_int on the GPU. The total number of eigenvalues found. If erange is rocblas_erange_all, nev = n. If erange is rocblas_erange_index, nev = iu - il + 1. Otherwise, 0 <= nev <= n.

  • W[out] pointer to type. Array on the GPU of dimension n. The first nev elements contain the computed eigenvalues. (The remaining elements can be used as workspace for internal computations).

  • Z[out] pointer to type. Array on the GPU of dimension ldz*nev. On exit, if evect is not rocblas_evect_none and info = 0, the first nev columns contain the eigenvectors of A corresponding to the output eigenvalues. Not referenced if evect is rocblas_evect_none. Note: If erange is rocblas_erange_value, then the value of nev is not known in advance. The user should ensure that Z is large enough to hold n columns, as all n columns can be used as workspace for internal computations.

  • ldz[in] rocblas_int. ldz >= n. Specifies the leading dimension of matrix Z.

  • ifail[out] pointer to rocblas_int. Array on the GPU of dimension n. If info = 0, the first nev elements of ifail are zero. Otherwise, it contains the indices of those eigenvectors that failed to converge. Not referenced if evect is rocblas_evect_none.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, the algorithm did not converge. i columns of Z did not converge.
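
The number of columns to allocate for Z follows from erange, as described for nev and Z above. The sketch below restates that rule in code. It is illustrative only: range_kind is a hypothetical local stand-in for rocblas_erange (its enumerator names and values are not the library's), and syevx_z_columns is not a rocSOLVER function.

```c
#include <assert.h>

/* Hypothetical local stand-in for rocblas_erange. */
typedef enum {
    RANGE_ALL,   /* all eigenvalues               */
    RANGE_VALUE, /* eigenvalues in (vl, vu]       */
    RANGE_INDEX  /* il-th through iu-th eigenvalue */
} range_kind;

/* Number of columns to reserve for Z. */
int syevx_z_columns(range_kind erange, int n, int il, int iu)
{
    assert(n >= 0);
    switch (erange) {
    case RANGE_INDEX:
        /* nev = iu - il + 1 is known in advance (0 when n = 0, il = 1, iu = 0). */
        return iu - il + 1;
    case RANGE_VALUE:
        /* nev is not known in advance, so reserve all n columns. */
        return n;
    case RANGE_ALL:
    default:
        /* nev = n. */
        return n;
    }
}
```

The allocation for Z is then ldz times this column count, with ldz >= n.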

rocsolver_<type>syevx_batched()#

rocblas_status rocsolver_dsyevx_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, const double abstol, rocblas_int *nev, double *W, const rocblas_stride strideW, double *const Z[], const rocblas_int ldz, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssyevx_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, const float abstol, rocblas_int *nev, float *W, const rocblas_stride strideW, float *const Z[], const rocblas_int ldz, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#

SYEVX_BATCHED computes a set of the eigenvalues and optionally the corresponding eigenvectors of a batch of real symmetric matrices A_l.

This function computes all the eigenvalues of A_l, all the eigenvalues in the half-open interval \((vl, vu]\), or the il-th through iu-th eigenvalues, depending on the value of erange. If evect is rocblas_evect_original, the eigenvectors for these eigenvalues will be computed as well.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • erange[in] rocblas_erange. Specifies the type of range or interval of the eigenvalues to be computed.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrices A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrices A_l.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l. On exit, the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • vl[in] type. vl < vu. The lower bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues within a set of indices.

  • vu[in] type. vl < vu. The upper bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues within a set of indices.

  • il[in] rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise. The index of the smallest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues in a half-open interval.

  • iu[in] rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise. The index of the largest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues in a half-open interval.

  • abstol[in] type. The absolute tolerance. An eigenvalue is considered to be located if it lies in an interval whose width is <= abstol. If abstol is negative, then machine-epsilon times the 1-norm of the tridiagonal form of A_l will be used as tolerance. If abstol=0, then the tolerance will be set to twice the underflow threshold; this is the tolerance that could get the most accurate results.

  • nev[out] pointer to rocblas_int. Array of batch_count integers on the GPU. The total number of eigenvalues found. If erange is rocblas_erange_all, nev[l] = n. If erange is rocblas_erange_index, nev[l] = iu - il + 1. Otherwise, 0 <= nev[l] <= n.

  • W[out] pointer to type. Array on the GPU (the size depends on the value of strideW). The first nev[l] elements contain the computed eigenvalues. (The remaining elements can be used as workspace for internal computations).

  • strideW[in] rocblas_stride. Stride from the start of one vector W_l to the next one W_(l+1). There is no restriction for the value of strideW. Normal use case is strideW >= n.

  • Z[out] Array of pointers to type. Each pointer points to an array on the GPU of dimension ldz*nev[l]. On exit, if evect is not rocblas_evect_none and info[l] = 0, the first nev[l] columns contain the eigenvectors of A_l corresponding to the output eigenvalues. Not referenced if evect is rocblas_evect_none. Note: If erange is rocblas_erange_value, then the values of nev[l] are not known in advance. The user should ensure that Z_l is large enough to hold n columns, as all n columns can be used as workspace for internal computations.

  • ldz[in] rocblas_int. ldz >= n. Specifies the leading dimension of matrices Z_l.

  • ifail[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideF). If info[l] = 0, the first nev[l] elements of ifail_l are zero. Otherwise, ifail_l contains the indices of those eigenvectors that failed to converge. Not referenced if evect is rocblas_evect_none.

  • strideF[in] rocblas_stride. Stride from the start of one vector ifail_l to the next one ifail_(l+1). There is no restriction for the value of strideF. Normal use case is strideF >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for matrix A_l. If info[l] = i > 0, the algorithm did not converge. i columns of Z_l did not converge.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
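
The constraints stated for vl, vu, il and iu can be checked on the host before the call. The following is a small sketch of such a check; the function names are hypothetical and it encodes only the conditions listed in the parameter descriptions above, not any further checks the library may perform.

```c
#include <assert.h>
#include <stdbool.h>

/* Index range check: il = 1 and iu = 0 if n = 0; 1 <= il <= iu otherwise. */
bool syevx_index_range_ok(int n, int il, int iu)
{
    assert(n >= 0);
    if (n == 0)
        return il == 1 && iu == 0;
    return 1 <= il && il <= iu;
}

/* Value range check: the half-open interval (vl, vu] requires vl < vu. */
bool syevx_value_range_ok(double vl, double vu)
{
    return vl < vu;
}
```

Only the check matching the chosen erange is relevant, since the other pair of arguments is ignored.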

rocsolver_<type>syevx_strided_batched()#

rocblas_status rocsolver_dsyevx_strided_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, const double abstol, rocblas_int *nev, double *W, const rocblas_stride strideW, double *Z, const rocblas_int ldz, const rocblas_stride strideZ, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssyevx_strided_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, const float abstol, rocblas_int *nev, float *W, const rocblas_stride strideW, float *Z, const rocblas_int ldz, const rocblas_stride strideZ, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#

SYEVX_STRIDED_BATCHED computes a set of the eigenvalues and optionally the corresponding eigenvectors of a batch of real symmetric matrices A_l.

This function computes all the eigenvalues of A_l, all the eigenvalues in the half-open interval \((vl, vu]\), or the il-th through iu-th eigenvalues, depending on the value of erange. If evect is rocblas_evect_original, the eigenvectors for these eigenvalues will be computed as well.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • erange[in] rocblas_erange. Specifies the type of range or interval of the eigenvalues to be computed.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrices A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrices A_l.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l. On exit, the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • vl[in] type. vl < vu. The lower bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues within a set of indices.

  • vu[in] type. vl < vu. The upper bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues within a set of indices.

  • il[in] rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise. The index of the smallest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues in a half-open interval.

  • iu[in] rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise. The index of the largest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues in a half-open interval.

  • abstol[in] type. The absolute tolerance. An eigenvalue is considered to be located if it lies in an interval whose width is <= abstol. If abstol is negative, then machine-epsilon times the 1-norm of the tridiagonal form of A_l will be used as tolerance. If abstol=0, then the tolerance will be set to twice the underflow threshold; this is the tolerance that could get the most accurate results.

  • nev[out] pointer to rocblas_int. Array of batch_count integers on the GPU. The total number of eigenvalues found. If erange is rocblas_erange_all, nev[l] = n. If erange is rocblas_erange_index, nev[l] = iu - il + 1. Otherwise, 0 <= nev[l] <= n.

  • W[out] pointer to type. Array on the GPU (the size depends on the value of strideW). The first nev[l] elements contain the computed eigenvalues. (The remaining elements can be used as workspace for internal computations).

  • strideW[in] rocblas_stride. Stride from the start of one vector W_l to the next one W_(l+1). There is no restriction for the value of strideW. Normal use case is strideW >= n.

  • Z[out] pointer to type. Array on the GPU (the size depends on the value of strideZ). On exit, if evect is not rocblas_evect_none and info[l] = 0, the first nev[l] columns contain the eigenvectors of A_l corresponding to the output eigenvalues. Not referenced if evect is rocblas_evect_none.

  • ldz[in] rocblas_int. ldz >= n. Specifies the leading dimension of matrices Z_l.

  • strideZ[in] rocblas_stride. Stride from the start of one matrix Z_l to the next one Z_(l+1). There is no restriction for the value of strideZ. Normal use case is strideZ >= ldz*nev[l]. Note: If erange is rocblas_erange_value, then the values of nev[l] are not known in advance. The user should ensure that Z_l is large enough to hold n columns, as all n columns can be used as workspace for internal computations.

  • ifail[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideF). If info[l] = 0, the first nev[l] elements of ifail_l are zero. Otherwise, ifail_l contains the indices of those eigenvectors that failed to converge. Not referenced if evect is rocblas_evect_none.

  • strideF[in] rocblas_stride. Stride from the start of one vector ifail_l to the next one ifail_(l+1). There is no restriction for the value of strideF. Normal use case is strideF >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for matrix A_l. If info[l] = i > 0, the algorithm did not converge. i columns of Z_l did not converge.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
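
The stride parameters above determine how large each strided-batched output buffer has to be. The following is a minimal host-side sketch, not part of rocSOLVER: the helper names `strided_elems` and `evx_strided_sizes` and the struct `evx_buffer_sizes` are hypothetical, strides are assumed to be non-negative, and Z is sized for n columns per instance, as the note on strideZ recommends.

```c
#include <assert.h>
#include <stddef.h>

/* Minimum number of elements a strided-batched buffer must hold so that
   the last instance (index batch_count - 1) still fits completely.
   per_instance: elements needed by one instance.
   stride: distance between the starts of consecutive instances. */
size_t strided_elems(size_t per_instance, size_t stride, size_t batch_count)
{
    if (batch_count == 0 || per_instance == 0)
        return 0;
    return (batch_count - 1) * stride + per_instance;
}

/* Hypothetical container for the three output buffer sizes (in elements). */
typedef struct {
    size_t w_elems;     /* eigenvalues: n per instance */
    size_t z_elems;     /* eigenvectors: ldz * n per instance (n columns) */
    size_t ifail_elems; /* failure indices: n per instance */
} evx_buffer_sizes;

evx_buffer_sizes evx_strided_sizes(size_t n, size_t ldz, size_t strideW,
                                   size_t strideZ, size_t strideF,
                                   size_t batch_count)
{
    evx_buffer_sizes s;
    assert(ldz >= n); /* documented constraint on ldz */
    s.w_elems = strided_elems(n, strideW, batch_count);
    s.z_elems = strided_elems(ldz * n, strideZ, batch_count);
    s.ifail_elems = strided_elems(n, strideF, batch_count);
    return s;
}
```

Multiplying each count by the size of the element type gives the number of bytes to allocate on the GPU.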

rocsolver_<type>heevx()#

rocblas_status rocsolver_zheevx(rocblas_handle handle, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, const double abstol, rocblas_int *nev, double *W, rocblas_double_complex *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)#
rocblas_status rocsolver_cheevx(rocblas_handle handle, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, const float abstol, rocblas_int *nev, float *W, rocblas_float_complex *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)#

HEEVX computes a set of the eigenvalues and optionally the corresponding eigenvectors of a Hermitian matrix A.

This function computes all the eigenvalues of A, all the eigenvalues in the half-open interval \((vl, vu]\), or the il-th through iu-th eigenvalues, depending on the value of erange. If evect is rocblas_evect_original, the eigenvectors for these eigenvalues will be computed as well.
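
As an illustration of the three selection modes, the sketch below counts how many eigenvalues each mode would select from a host array of eigenvalues. It only models the selection rule described above; the enum `range_mode` and the function `count_selected` are hypothetical stand-ins (the real enum is rocblas_erange), and the check iu <= n is an assumption carried over from LAPACK.

```c
#include <assert.h>

/* Hypothetical host-side mirror of the three range modes. */
typedef enum { RANGE_ALL, RANGE_VALUE, RANGE_INDEX } range_mode;

/* Count how many of the n eigenvalues in w would be selected.
   il and iu are 1-based, as in the API description. */
int count_selected(const double *w, int n, range_mode mode,
                   double vl, double vu, int il, int iu)
{
    int count = 0;
    switch (mode) {
    case RANGE_ALL:
        return n;
    case RANGE_INDEX:
        /* iu <= n is assumed here, following LAPACK */
        assert(n == 0 ? (il == 1 && iu == 0)
                      : (1 <= il && il <= iu && iu <= n));
        return iu - il + 1;
    case RANGE_VALUE:
        assert(vl < vu);
        for (int i = 0; i < n; ++i)
            if (w[i] > vl && w[i] <= vu) /* half-open: vl excluded, vu included */
                ++count;
        return count;
    }
    return count;
}
```

For eigenvalues {1, 2, 3, 4} and the interval (1, 3], the value mode selects 2 and 3 but not 1, because the lower bound is excluded.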

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • erange[in] rocblas_erange. Specifies the type of range or interval of the eigenvalues to be computed.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the Hermitian matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A. On exit, the contents of A are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrix A.

  • vl[in] real type. vl < vu. The lower bound of the search interval (vl, vu]. Ignored if erange selects all the eigenvalues of A or the eigenvalues within a set of indices.

  • vu[in] real type. vl < vu. The upper bound of the search interval (vl, vu]. Ignored if erange selects all the eigenvalues of A or the eigenvalues within a set of indices.

  • il[in] rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise. The index of the smallest eigenvalue to be computed. Ignored if erange selects all the eigenvalues of A or the eigenvalues in a half-open interval.

  • iu[in] rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise. The index of the largest eigenvalue to be computed. Ignored if erange selects all the eigenvalues of A or the eigenvalues in a half-open interval.

  • abstol[in] real type. The absolute tolerance. An eigenvalue is considered to be located if it lies in an interval whose width is <= abstol. If abstol is negative, then machine-epsilon times the 1-norm of the tridiagonal form of A will be used as tolerance. If abstol=0, then the tolerance will be set to twice the underflow threshold; this is the tolerance that could get the most accurate results.

  • nev[out] pointer to a rocblas_int on the GPU. The total number of eigenvalues found. If erange is rocblas_erange_all, nev = n. If erange is rocblas_erange_index, nev = iu - il + 1. Otherwise, 0 <= nev <= n.

  • W[out] pointer to real type. Array on the GPU of dimension n. The first nev elements contain the computed eigenvalues. (The remaining elements can be used as workspace for internal computations).

  • Z[out] pointer to type. Array on the GPU of dimension ldz*nev. On exit, if evect is not rocblas_evect_none and info = 0, the first nev columns contain the eigenvectors of A corresponding to the output eigenvalues. Not referenced if evect is rocblas_evect_none. Note: If erange is rocblas_erange_value, then the value of nev is not known in advance. The user should ensure that Z is large enough to hold n columns, as all n columns can be used as workspace for internal computations.

  • ldz[in] rocblas_int. ldz >= n. Specifies the leading dimension of matrix Z.

  • ifail[out] pointer to rocblas_int. Array on the GPU of dimension n. If info = 0, the first nev elements of ifail are zero. Otherwise, contains the indices of those eigenvectors that failed to converge. Not referenced if evect is rocblas_evect_none.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, the algorithm did not converge. i columns of Z did not converge.
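
After copying info and ifail back to the host, the failed eigenvectors can be listed as in the sketch below. The function `collect_failed` is hypothetical, and it assumes that failed eigenvectors appear as non-zero, 1-based entries among the first nev elements of ifail; the exact position of those entries is not stated in the description above, so the loop scans rather than relying on a fixed layout.

```c
#include <assert.h>

/* Collect the indices of eigenvectors that failed to converge from a host
   copy of ifail. Returns the number of indices written to failed, which
   must have room for nev entries. */
int collect_failed(const int *ifail, int nev, int info, int *failed)
{
    int count = 0;
    assert(info >= 0);
    if (info == 0)
        return 0; /* on success the first nev entries of ifail are zero */
    for (int i = 0; i < nev && count < info; ++i)
        if (ifail[i] > 0)
            failed[count++] = ifail[i];
    return count;
}
```

If the returned count differs from info, the host copies of ifail and info are likely out of sync, for example because they were read before the computation finished.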

rocsolver_<type>heevx_batched()#

rocblas_status rocsolver_zheevx_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, const double abstol, rocblas_int *nev, double *W, const rocblas_stride strideW, rocblas_double_complex *const Z[], const rocblas_int ldz, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cheevx_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, const float abstol, rocblas_int *nev, float *W, const rocblas_stride strideW, rocblas_float_complex *const Z[], const rocblas_int ldz, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#

HEEVX_BATCHED computes a set of the eigenvalues and optionally the corresponding eigenvectors of a batch of Hermitian matrices A_l.

This function computes all the eigenvalues of A_l, all the eigenvalues in the half-open interval \((vl, vu]\), or the il-th through iu-th eigenvalues, depending on the value of erange. If evect is rocblas_evect_original, the eigenvectors for these eigenvalues will be computed as well.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • erange[in] rocblas_erange. Specifies the type of range or interval of the eigenvalues to be computed.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the Hermitian matrices A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrices A_l.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l. On exit, the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • vl[in] real type. vl < vu. The lower bound of the search interval (vl, vu]. Ignored if erange selects all the eigenvalues of A_l or the eigenvalues within a set of indices.

  • vu[in] real type. vl < vu. The upper bound of the search interval (vl, vu]. Ignored if erange selects all the eigenvalues of A_l or the eigenvalues within a set of indices.

  • il[in] rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise. The index of the smallest eigenvalue to be computed. Ignored if erange selects all the eigenvalues of A_l or the eigenvalues in a half-open interval.

  • iu[in] rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise. The index of the largest eigenvalue to be computed. Ignored if erange selects all the eigenvalues of A_l or the eigenvalues in a half-open interval.

  • abstol[in] real type. The absolute tolerance. An eigenvalue is considered to be located if it lies in an interval whose width is <= abstol. If abstol is negative, then machine-epsilon times the 1-norm of the tridiagonal form of A_l will be used as tolerance. If abstol=0, then the tolerance will be set to twice the underflow threshold; this is the tolerance that could get the most accurate results.

  • nev[out] pointer to rocblas_int. Array of batch_count integers on the GPU. The total number of eigenvalues found. If erange is rocblas_erange_all, nev[l] = n. If erange is rocblas_erange_index, nev[l] = iu - il + 1. Otherwise, 0 <= nev[l] <= n.

  • W[out] pointer to real type. Array on the GPU (the size depends on the value of strideW). The first nev[l] elements contain the computed eigenvalues. (The remaining elements can be used as workspace for internal computations).

  • strideW[in] rocblas_stride. Stride from the start of one vector W_l to the next one W_(l+1). There is no restriction for the value of strideW. Normal use case is strideW >= n.

  • Z[out] Array of pointers to type. Each pointer points to an array on the GPU of dimension ldz*nev[l]. On exit, if evect is not rocblas_evect_none and info[l] = 0, the first nev[l] columns contain the eigenvectors of A_l corresponding to the output eigenvalues. Not referenced if evect is rocblas_evect_none. Note: If erange is rocblas_erange_value, then the values of nev[l] are not known in advance. The user should ensure that Z_l is large enough to hold n columns, as all n columns can be used as workspace for internal computations.

  • ldz[in] rocblas_int. ldz >= n. Specifies the leading dimension of matrices Z_l.

  • ifail[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideF). If info[l] = 0, the first nev[l] elements of ifail_l are zero. Otherwise, contains the indices of those eigenvectors that failed to converge. Not referenced if evect is rocblas_evect_none.

  • strideF[in] rocblas_stride. Stride from the start of one vector ifail_l to the next one ifail_(l+1). There is no restriction for the value of strideF. Normal use case is strideF >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for matrix A_l. If info[l] = i > 0, the algorithm did not converge. i columns of Z_l did not converge.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
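
In the batched variant, A and Z are passed as arrays of pointers, one per matrix. A common way to build such an array is to carve the matrices out of one contiguous allocation. The sketch below shows only the pointer arithmetic, on ordinary host memory; `fill_batch_pointers` is a hypothetical helper, and in a real program `base` would be a device allocation and the pointer array would typically also have to be copied to the GPU before the call.

```c
#include <assert.h>
#include <stddef.h>

/* Fill ptrs[0..batch_count-1] so that ptrs[l] addresses the l-th matrix
   inside one contiguous allocation starting at base. Each matrix occupies
   ld * cols elements of elem_size bytes (column-major, leading dimension ld). */
void fill_batch_pointers(void *base, size_t elem_size, size_t ld, size_t cols,
                         size_t batch_count, void **ptrs)
{
    unsigned char *p = (unsigned char *)base;
    for (size_t l = 0; l < batch_count; ++l)
        ptrs[l] = p + l * ld * cols * elem_size;
}
```

For Z, passing cols = n rather than cols = nev[l] keeps every Z_l large enough to hold n columns, as the note on Z recommends.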

rocsolver_<type>heevx_strided_batched()#

rocblas_status rocsolver_zheevx_strided_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, const double abstol, rocblas_int *nev, double *W, const rocblas_stride strideW, rocblas_double_complex *Z, const rocblas_int ldz, const rocblas_stride strideZ, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cheevx_strided_batched(rocblas_handle handle, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, const float abstol, rocblas_int *nev, float *W, const rocblas_stride strideW, rocblas_float_complex *Z, const rocblas_int ldz, const rocblas_stride strideZ, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#

HEEVX_STRIDED_BATCHED computes a set of the eigenvalues and optionally the corresponding eigenvectors of a batch of Hermitian matrices A_l.

This function computes all the eigenvalues of A_l, all the eigenvalues in the half-open interval \((vl, vu]\), or the il-th through iu-th eigenvalues, depending on the value of erange. If evect is rocblas_evect_original, the eigenvectors for these eigenvalues will be computed as well.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • erange[in] rocblas_erange. Specifies the type of range or interval of the eigenvalues to be computed.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the Hermitian matrices A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.

  • n[in] rocblas_int. n >= 0. Number of rows and columns of matrices A_l.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l. On exit, the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • vl[in] real type. vl < vu. The lower bound of the search interval (vl, vu]. Ignored if erange selects all the eigenvalues of A_l or the eigenvalues within a set of indices.

  • vu[in] real type. vl < vu. The upper bound of the search interval (vl, vu]. Ignored if erange selects all the eigenvalues of A_l or the eigenvalues within a set of indices.

  • il[in] rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise. The index of the smallest eigenvalue to be computed. Ignored if erange selects all the eigenvalues of A_l or the eigenvalues in a half-open interval.

  • iu[in] rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise. The index of the largest eigenvalue to be computed. Ignored if erange selects all the eigenvalues of A_l or the eigenvalues in a half-open interval.

  • abstol[in] real type. The absolute tolerance. An eigenvalue is considered to be located if it lies in an interval whose width is <= abstol. If abstol is negative, then machine-epsilon times the 1-norm of the tridiagonal form of A_l will be used as tolerance. If abstol=0, then the tolerance will be set to twice the underflow threshold; this is the tolerance that could get the most accurate results.

  • nev[out] pointer to rocblas_int. Array of batch_count integers on the GPU. The total number of eigenvalues found. If erange is rocblas_erange_all, nev[l] = n. If erange is rocblas_erange_index, nev[l] = iu - il + 1. Otherwise, 0 <= nev[l] <= n.

  • W[out] pointer to real type. Array on the GPU (the size depends on the value of strideW). The first nev[l] elements contain the computed eigenvalues. (The remaining elements can be used as workspace for internal computations).

  • strideW[in] rocblas_stride. Stride from the start of one vector W_l to the next one W_(l+1). There is no restriction for the value of strideW. Normal use case is strideW >= n.

  • Z[out] pointer to type. Array on the GPU (the size depends on the value of strideZ). On exit, if evect is not rocblas_evect_none and info[l] = 0, the first nev[l] columns contain the eigenvectors of A_l corresponding to the output eigenvalues. Not referenced if evect is rocblas_evect_none.

  • ldz[in] rocblas_int. ldz >= n. Specifies the leading dimension of matrices Z_l.

  • strideZ[in] rocblas_stride. Stride from the start of one matrix Z_l to the next one Z_(l+1). There is no restriction for the value of strideZ. Normal use case is strideZ >= ldz*nev[l]. Note: If erange is rocblas_erange_value, then the values of nev[l] are not known in advance. The user should ensure that Z_l is large enough to hold n columns, as all n columns can be used as workspace for internal computations.

  • ifail[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideF). If info[l] = 0, the first nev[l] elements of ifail_l are zero. Otherwise, contains the indices of those eigenvectors that failed to converge. Not referenced if evect is rocblas_evect_none.

  • strideF[in] rocblas_stride. Stride from the start of one vector ifail_l to the next one ifail_(l+1). There is no restriction for the value of strideF. Normal use case is strideF >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for matrix A_l. If info[l] = i > 0, the algorithm did not converge. i columns of Z_l did not converge.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
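
To read a single eigenvector component from a host copy of the strided-batched Z buffer, the batch stride and the leading dimension combine as in the sketch below. The helper `strided_index` is hypothetical; it uses the 1-based row and column indices of this document's notation, and it assumes a 0-based batch index l.

```c
#include <assert.h>
#include <stddef.h>

/* Linear offset of element [i,j] (1-based, column-major) of matrix l
   (0-based batch index, an assumption of this sketch) in a
   strided-batched buffer with the given stride and leading dimension. */
size_t strided_index(size_t l, size_t stride, size_t ld, size_t i, size_t j)
{
    assert(i >= 1 && j >= 1 && i <= ld);
    return l * stride + (j - 1) * ld + (i - 1);
}
```

With strideZ = 12 and ldz = 4, element [2,3] of the second matrix (l = 1) sits at offset 12 + 2*4 + 1 = 21.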

rocsolver_<type>sygv()#

rocblas_status rocsolver_dsygv(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *B, const rocblas_int ldb, double *D, double *E, rocblas_int *info)#
rocblas_status rocsolver_ssygv(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *B, const rocblas_int ldb, float *D, float *E, rocblas_int *info)#

SYGV computes the eigenvalues and (optionally) eigenvectors of a real generalized symmetric-definite eigenproblem.

The problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A X = \lambda B X & \: \text{1st form,}\\ A B X = \lambda X & \: \text{2nd form, or}\\ B A X = \lambda X & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. The eigenvectors are computed depending on the value of evect.

When computed, the matrix Z of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z^T B Z=I & \: \text{if 1st or 2nd form, or}\\ Z^T B^{-1} Z=I & \: \text{if 3rd form.} \end{array} \end{split}\]
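
The normalization for the 1st and 2nd forms can be checked on the host after copying the results back. The following is a minimal sketch with a hypothetical helper `is_b_orthonormal`; it assumes column-major host arrays and a full copy of the original B saved before the call, since B is overwritten with its triangular factor on exit.

```c
#include <assert.h>

/* Returns 1 if Z^T * B * Z equals the n-by-n identity within tol, else 0.
   Z and B are column-major host arrays with leading dimensions ldz and ldb;
   B is read in full (both triangles must be filled in). */
int is_b_orthonormal(int n, const double *Z, int ldz,
                     const double *B, int ldb, double tol)
{
    assert(ldz >= n && ldb >= n);
    for (int p = 0; p < n; ++p) {
        for (int q = 0; q < n; ++q) {
            double s = 0.0; /* s accumulates z_p^T * B * z_q */
            for (int i = 0; i < n; ++i)
                for (int j = 0; j < n; ++j)
                    s += Z[i + p * ldz] * B[i + j * ldb] * Z[j + q * ldz];
            double d = s - (p == q ? 1.0 : 0.0);
            if (d < 0.0)
                d = -d;
            if (d > tol)
                return 0;
        }
    }
    return 1;
}
```

For example, with B = diag(4, 1) the vectors (0.5, 0) and (0, 1) satisfy the condition, whereas the unit vectors (1, 0) and (0, 1) do not. The 3rd form would need the inverse of B in place of B.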

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblem.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A and B are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A and B are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the symmetric matrix A. On exit, if evect is original, the normalized matrix Z of eigenvectors. If evect is none, then the upper or lower triangular part of the matrix A (including the diagonal) is destroyed, depending on the value of uplo.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A.

  • B[inout] pointer to type. Array on the GPU of dimension ldb*n. On entry, the symmetric positive definite matrix B. On exit, the triangular factor of B as returned by POTRF.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B.

  • D[out] pointer to type. Array on the GPU of dimension n. On exit, the eigenvalues in increasing order.

  • E[out] pointer to type. Array on the GPU of dimension n. This array is used to work internally with the tridiagonal matrix T associated with the reduced eigenvalue problem. On exit, if 0 < info <= n, it contains the unconverged off-diagonal elements of T (or properly speaking, a tridiagonal matrix equivalent to T). The diagonal elements of this matrix are in D; those that converged correspond to a subset of the eigenvalues (not necessarily ordered).

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i <= n, i off-diagonal elements of an intermediate tridiagonal form did not converge to zero. If info = n + i, the leading minor of order i of B is not positive definite.
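
The three ranges of info can be told apart on the host as in the following sketch. The enum `sygv_outcome` and the function `decode_sygv_info` are hypothetical names; the classification itself follows the description of info above.

```c
#include <assert.h>

typedef enum {
    SYGV_OK,            /* info == 0 */
    SYGV_NOT_CONVERGED, /* 1 <= info <= n */
    SYGV_B_NOT_POSDEF   /* info == n + i, with 1 <= i <= n */
} sygv_outcome;

/* Classify a host copy of info for a problem of size n.
   On SYGV_NOT_CONVERGED, *detail is the number of off-diagonal elements
   that did not converge to zero; on SYGV_B_NOT_POSDEF, *detail is the
   order i of the leading minor of B that is not positive definite. */
sygv_outcome decode_sygv_info(int info, int n, int *detail)
{
    assert(info >= 0 && info <= 2 * n);
    if (info == 0) {
        *detail = 0;
        return SYGV_OK;
    }
    if (info <= n) {
        *detail = info;
        return SYGV_NOT_CONVERGED;
    }
    *detail = info - n;
    return SYGV_B_NOT_POSDEF;
}
```

With n = 4, info = 3 reports three unconverged off-diagonal elements, while info = 6 reports that the leading minor of order 2 of B is not positive definite.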

rocsolver_<type>sygv_batched()#

rocblas_status rocsolver_dsygv_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, double *const B[], const rocblas_int ldb, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssygv_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, float *const B[], const rocblas_int ldb, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#

SYGV_BATCHED computes the eigenvalues and (optionally) eigenvectors of a batch of real generalized symmetric-definite eigenproblems.

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. The eigenvectors are computed depending on the value of evect.

When computed, the matrix \(Z_l\) of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z_l^T B_l^{} Z_l^{}=I & \: \text{if 1st or 2nd form, or}\\ Z_l^T B_l^{-1} Z_l^{}=I & \: \text{if 3rd form.} \end{array} \end{split}\]

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A_l and B_l are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the symmetric matrices A_l. On exit, if evect is original, the normalized matrix Z_l of eigenvectors. If evect is none, then the upper or lower triangular part of the matrices A_l (including the diagonal) are destroyed, depending on the value of uplo.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • B[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*n. On entry, the symmetric positive definite matrices B_l. On exit, the triangular factor of B_l as returned by POTRF_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • D[out] pointer to type. Array on the GPU (the size depends on the value of strideD). On exit, the eigenvalues in increasing order.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use is strideD >= n.

  • E[out] pointer to type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the tridiagonal matrix T_l associated with the l-th reduced eigenvalue problem. On exit, if 0 < info[l] <= n, E_l contains the unconverged off-diagonal elements of T_l (or properly speaking, a tridiagonal matrix equivalent to T_l). The diagonal elements of this matrix are in D_l; those that converged correspond to a subset of the eigenvalues (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use is strideE >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit of batch instance l. If info[l] = i <= n, i off-diagonal elements of an intermediate tridiagonal form did not converge to zero. If info[l] = n + i, the leading minor of order i of B_l is not positive definite.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>sygv_strided_batched()#

rocblas_status rocsolver_dsygv_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *B, const rocblas_int ldb, const rocblas_stride strideB, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssygv_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *B, const rocblas_int ldb, const rocblas_stride strideB, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)#

SYGV_STRIDED_BATCHED computes the eigenvalues and (optionally) eigenvectors of a batch of real generalized symmetric-definite eigenproblems.

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. The eigenvectors are computed depending on the value of evect.

When computed, the matrix \(Z_l\) of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z_l^T B_l^{} Z_l^{}=I & \: \text{if 1st or 2nd form, or}\\ Z_l^T B_l^{-1} Z_l^{}=I & \: \text{if 3rd form.} \end{array} \end{split}\]

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A_l and B_l are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the symmetric matrices A_l. On exit, if evect is original, the normalized matrix Z_l of eigenvectors. If evect is none, then the upper or lower triangular part of the matrices A_l (including the diagonal) are destroyed, depending on the value of uplo.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use is strideA >= lda*n.

  • B[inout] pointer to type. Array on the GPU (the size depends on the value of strideB). On entry, the symmetric positive definite matrices B_l. On exit, the triangular factor of B_l as returned by POTRF_STRIDED_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • strideB[in] rocblas_stride. Stride from the start of one matrix B_l to the next one B_(l+1). There is no restriction for the value of strideB. Normal use is strideB >= ldb*n.

  • D[out] pointer to type. Array on the GPU (the size depends on the value of strideD). On exit, the eigenvalues in increasing order.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use is strideD >= n.

  • E[out] pointer to type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the tridiagonal matrix T_l associated with the l-th reduced eigenvalue problem. On exit, if 0 < info[l] <= n, it contains the unconverged off-diagonal elements of T_l (or properly speaking, a tridiagonal matrix equivalent to T_l). The diagonal elements of this matrix are in D_l; those that converged correspond to a subset of the eigenvalues (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use is strideE >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit of batch instance l. If info[l] = i <= n, i off-diagonal elements of an intermediate tridiagonal form did not converge to zero. If info[l] = n + i, the leading minor of order i of B_l is not positive definite.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.

rocsolver_<type>hegv()#

rocblas_status rocsolver_zhegv(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *B, const rocblas_int ldb, double *D, double *E, rocblas_int *info)#
rocblas_status rocsolver_chegv(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *B, const rocblas_int ldb, float *D, float *E, rocblas_int *info)#

HEGV computes the eigenvalues and (optionally) eigenvectors of a complex generalized Hermitian-definite eigenproblem.

The problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A X = \lambda B X & \: \text{1st form,}\\ A B X = \lambda X & \: \text{2nd form, or}\\ B A X = \lambda X & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. The eigenvectors are computed depending on the value of evect.

When computed, the matrix Z of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z^H B Z=I & \: \text{if 1st or 2nd form, or}\\ Z^H B^{-1} Z=I & \: \text{if 3rd form.} \end{array} \end{split}\]
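
For the 1st form, this normalization follows from the usual Cholesky-based reduction to a standard eigenproblem. The derivation below is a sketch under the assumption that the lower factor \(B = L L^H\) is used and that \(\Lambda\) denotes the diagonal matrix of eigenvalues; neither \(L\), \(C\), \(Y\) nor \(\Lambda\) is an API parameter.

\[\begin{split} \begin{array}{cl} B = L L^H, \quad C = L^{-1} A L^{-H} & \: \text{reduction to standard form,}\\ C Y = Y \Lambda, \quad Y^H Y = I & \: \text{standard Hermitian eigenproblem,}\\ Z = L^{-H} Y \;\Rightarrow\; A Z = L Y \Lambda = B Z \Lambda & \: \text{back-transformation, and}\\ Z^H B Z = Y^H L^{-1} L L^H L^{-H} Y = Y^H Y = I & \: \text{resulting normalization.} \end{array} \end{split}\]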

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblem.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A and B are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A and B are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the Hermitian matrix A. On exit, if evect is original, the normalized matrix Z of eigenvectors. If evect is none, then the upper or lower triangular part of the matrix A (including the diagonal) is destroyed, depending on the value of uplo.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A.

  • B[inout] pointer to type. Array on the GPU of dimension ldb*n. On entry, the Hermitian positive definite matrix B. On exit, the triangular factor of B as returned by POTRF.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B.

  • D[out] pointer to real type. Array on the GPU of dimension n. On exit, the eigenvalues in increasing order.

  • E[out] pointer to real type. Array on the GPU of dimension n. This array is used to work internally with the tridiagonal matrix T associated with the reduced eigenvalue problem. On exit, if 0 < info <= n, it contains the unconverged off-diagonal elements of T (or properly speaking, a tridiagonal matrix equivalent to T). The diagonal elements of this matrix are in D; those that converged correspond to a subset of the eigenvalues (not necessarily ordered).

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i with 0 < i <= n, i off-diagonal elements of an intermediate tridiagonal form did not converge to zero. If info = n + i, the leading minor of order i of B is not positive definite.
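
The three ranges of info described above can be told apart on the host once the value has been copied back from the GPU. The following is a minimal sketch; GvStatus, GvInfo and decode_gv_info are hypothetical names introduced for illustration and are not part of rocSOLVER:

```cpp
#include <cassert>

// Outcome categories taken from the info description above.
enum class GvStatus { Success, NotConverged, BNotPositiveDefinite };

struct GvInfo
{
    GvStatus status;
    int index; // the value i from the description; 0 on success
};

// Decode a host copy of info for a problem of order n.
inline GvInfo decode_gv_info(int info, int n)
{
    assert(n >= 0 && info >= 0);
    if(info == 0)
        return {GvStatus::Success, 0};
    if(info <= n)
        return {GvStatus::NotConverged, info}; // info off-diagonal elements did not converge
    return {GvStatus::BNotPositiveDefinite, info - n}; // order of the failing leading minor of B
}
```

For example, with n = 4 a returned info of 6 decodes to a leading minor of order 2 of B that is not positive definite.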

rocsolver_<type>hegv_batched()

rocblas_status rocsolver_zhegv_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *const B[], const rocblas_int ldb, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_chegv_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *const B[], const rocblas_int ldb, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)

HEGV_BATCHED computes the eigenvalues and (optionally) eigenvectors of a batch of complex generalized hermitian-definite eigenproblems.

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. The eigenvectors are computed depending on the value of evect.

When computed, the matrix \(Z_l\) of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z_l^H B_l^{} Z_l^{}=I & \: \text{if 1st or 2nd form, or}\\ Z_l^H B_l^{-1} Z_l^{}=I & \: \text{if 3rd form.} \end{array} \end{split}\]

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A_l and B_l are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the hermitian matrices A_l. On exit, if evect is original, the normalized matrices Z_l of eigenvectors. If evect is none, then the upper or lower triangular parts of the matrices A_l (including the diagonal) are destroyed, depending on the value of uplo.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • B[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*n. On entry, the hermitian positive definite matrices B_l. On exit, the triangular factor of B_l as returned by POTRF_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • D[out] pointer to real type. Array on the GPU (the size depends on the value of strideD). On exit, the eigenvalues in increasing order.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use is strideD >= n.

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the tridiagonal matrix T_l associated with the l-th reduced eigenvalue problem. On exit, if 0 < info[l] <= n, it contains the unconverged off-diagonal elements of T_l (or properly speaking, a tridiagonal matrix equivalent to T_l). The diagonal elements of this matrix are in D_l; those that converged correspond to a subset of the eigenvalues (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use is strideE >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for batch instance l. If info[l] = i with 0 < i <= n, i off-diagonal elements of an intermediate tridiagonal form did not converge to zero. If info[l] = n + i, the leading minor of order i of B_l is not positive definite.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
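
The batched variant takes A and B as arrays of pointers, one pointer per matrix. A minimal sketch of how such a pointer array could be built when the matrices sit back to back in a single allocation follows. make_batch_pointers is a hypothetical helper, base stands in for a GPU allocation, and the batch index l is 0-based here. The sketch covers only the address arithmetic; whether the pointer array itself must also be copied to device memory is an assumption to check against the rocSOLVER user guide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build one pointer per matrix for a batch stored contiguously,
// each matrix occupying ld * n elements.
template <typename T>
std::vector<T*> make_batch_pointers(T* base, int n, int ld, int batch_count)
{
    assert(n >= 0 && ld >= n && batch_count >= 0);
    std::vector<T*> ptrs(static_cast<std::size_t>(batch_count));
    const std::size_t matrix_elems = static_cast<std::size_t>(ld) * static_cast<std::size_t>(n);
    for(std::size_t l = 0; l < ptrs.size(); ++l)
        ptrs[l] = base + l * matrix_elems; // start of matrix l
    return ptrs;
}
```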

rocsolver_<type>hegv_strided_batched()

rocblas_status rocsolver_zhegv_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *B, const rocblas_int ldb, const rocblas_stride strideB, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_chegv_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *B, const rocblas_int ldb, const rocblas_stride strideB, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)

HEGV_STRIDED_BATCHED computes the eigenvalues and (optionally) eigenvectors of a batch of complex generalized hermitian-definite eigenproblems.

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. The eigenvectors are computed depending on the value of evect.

When computed, the matrix \(Z_l\) of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z_l^H B_l^{} Z_l^{}=I & \: \text{if 1st or 2nd form, or}\\ Z_l^H B_l^{-1} Z_l^{}=I & \: \text{if 3rd form.} \end{array} \end{split}\]

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A_l and B_l are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the hermitian matrices A_l. On exit, if evect is original, the normalized matrices Z_l of eigenvectors. If evect is none, then the upper or lower triangular parts of the matrices A_l (including the diagonal) are destroyed, depending on the value of uplo.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use is strideA >= lda*n.

  • B[inout] pointer to type. Array on the GPU (the size depends on the value of strideB). On entry, the hermitian positive definite matrices B_l. On exit, the triangular factor of B_l as returned by POTRF_STRIDED_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • strideB[in] rocblas_stride. Stride from the start of one matrix B_l to the next one B_(l+1). There is no restriction for the value of strideB. Normal use is strideB >= ldb*n.

  • D[out] pointer to real type. Array on the GPU (the size depends on the value of strideD). On exit, the eigenvalues in increasing order.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use is strideD >= n.

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the tridiagonal matrix T_l associated with the l-th reduced eigenvalue problem. On exit, if 0 < info[l] <= n, it contains the unconverged off-diagonal elements of T_l (or properly speaking, a tridiagonal matrix equivalent to T_l). The diagonal elements of this matrix are in D_l; those that converged correspond to a subset of the eigenvalues (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use is strideE >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for batch instance l. If info[l] = i with 0 < i <= n, i off-diagonal elements of an intermediate tridiagonal form did not converge to zero. If info[l] = n + i, the leading minor of order i of B_l is not positive definite.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
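
In the strided layout, every matrix or vector of the batch starts a fixed number of elements after the previous one. The sketch below shows the resulting index arithmetic under two assumptions: column-major storage (as implied by the leading dimensions lda and ldb) and the "normal use" case where the stride is at least the size of one item. strided_offset and strided_elems are hypothetical helpers; i and j are 1-based as in the notation of this page, while l is 0-based.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Element offset of entry [i,j] of matrix l inside a strided, column-major batch.
inline std::int64_t strided_offset(std::int64_t stride, int ld, int l, int i, int j)
{
    assert(l >= 0 && i >= 1 && j >= 1 && i <= ld);
    return l * stride + (i - 1) + static_cast<std::int64_t>(j - 1) * ld;
}

// Smallest element count that holds the whole batch when stride >= per_item.
// per_item is lda*n (or ldb*n) for A and B, and n for D and E.
inline std::size_t strided_elems(std::int64_t stride, std::size_t per_item, int batch_count)
{
    assert(batch_count >= 0 && stride >= 0 && static_cast<std::size_t>(stride) >= per_item);
    if(batch_count == 0)
        return 0;
    return static_cast<std::size_t>(batch_count - 1) * static_cast<std::size_t>(stride) + per_item;
}
```

With lda = 4, n = 4, strideA = 20 and batch_count = 3, the buffer for A needs at least 2*20 + 16 = 56 elements.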

rocsolver_<type>sygvd()

rocblas_status rocsolver_dsygvd(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *B, const rocblas_int ldb, double *D, double *E, rocblas_int *info)
rocblas_status rocsolver_ssygvd(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *B, const rocblas_int ldb, float *D, float *E, rocblas_int *info)

SYGVD computes the eigenvalues and (optionally) eigenvectors of a real generalized symmetric-definite eigenproblem.

The problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A X = \lambda B X & \: \text{1st form,}\\ A B X = \lambda X & \: \text{2nd form, or}\\ B A X = \lambda X & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. The eigenvectors are computed using a divide-and-conquer algorithm, depending on the value of evect.

When computed, the matrix Z of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z^T B Z=I & \: \text{if 1st or 2nd form, or}\\ Z^T B^{-1} Z=I & \: \text{if 3rd form.} \end{array} \end{split}\]

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblem.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A and B are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A and B are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the symmetric matrix A. On exit, if evect is original, the normalized matrix Z of eigenvectors. If evect is none, then the upper or lower triangular part of the matrix A (including the diagonal) is destroyed, depending on the value of uplo.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A.

  • B[inout] pointer to type. Array on the GPU of dimension ldb*n. On entry, the symmetric positive definite matrix B. On exit, the triangular factor of B as returned by POTRF.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B.

  • D[out] pointer to type. Array on the GPU of dimension n. On exit, the eigenvalues in increasing order.

  • E[out] pointer to type. Array on the GPU of dimension n. This array is used to work internally with the tridiagonal matrix T associated with the reduced eigenvalue problem. On exit, if 0 < info <= n, it contains the unconverged off-diagonal elements of T (or properly speaking, a tridiagonal matrix equivalent to T). The diagonal elements of this matrix are in D; those that converged correspond to a subset of the eigenvalues (not necessarily ordered).

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i with 0 < i <= n and evect is rocblas_evect_none, i off-diagonal elements of an intermediate tridiagonal form did not converge to zero. If info = i with 0 < i <= n and evect is rocblas_evect_original, the algorithm failed to compute an eigenvalue in the submatrix from [i/(n+1), i/(n+1)] to [i%(n+1), i%(n+1)]. If info = n + i, the leading minor of order i of B is not positive definite.
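
The submatrix bounds in the rocblas_evect_original case are plain integer division and remainder. A small sketch of that arithmetic, assuming info has been copied to the host and lies in the range 0 < info <= n; dc_failed_block is a hypothetical helper, not a rocSOLVER function:

```cpp
#include <cassert>
#include <utility>

// Returns {info / (n+1), info % (n+1)}, the two indices named in the
// info description for the divide-and-conquer failure case.
inline std::pair<int, int> dc_failed_block(int info, int n)
{
    assert(n >= 1 && info > 0 && info <= n);
    return std::make_pair(info / (n + 1), info % (n + 1));
}
```

Because info <= n in this range, the quotient is always 0 and the remainder equals info itself.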

rocsolver_<type>sygvd_batched()

rocblas_status rocsolver_dsygvd_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, double *const B[], const rocblas_int ldb, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_ssygvd_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, float *const B[], const rocblas_int ldb, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)

SYGVD_BATCHED computes the eigenvalues and (optionally) eigenvectors of a batch of real generalized symmetric-definite eigenproblems.

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. The eigenvectors are computed using a divide-and-conquer algorithm, depending on the value of evect.

When computed, the matrix \(Z_l\) of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z_l^T B_l^{} Z_l^{}=I & \: \text{if 1st or 2nd form, or}\\ Z_l^T B_l^{-1} Z_l^{}=I & \: \text{if 3rd form.} \end{array} \end{split}\]

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A_l and B_l are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the symmetric matrices A_l. On exit, if evect is original, the normalized matrices Z_l of eigenvectors. If evect is none, then the upper or lower triangular parts of the matrices A_l (including the diagonal) are destroyed, depending on the value of uplo.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • B[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*n. On entry, the symmetric positive definite matrices B_l. On exit, the triangular factor of B_l as returned by POTRF_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • D[out] pointer to type. Array on the GPU (the size depends on the value of strideD). On exit, the eigenvalues in increasing order.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use is strideD >= n.

  • E[out] pointer to type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the tridiagonal matrix T_l associated with the l-th reduced eigenvalue problem. On exit, if 0 < info[l] <= n, it contains the unconverged off-diagonal elements of T_l (or properly speaking, a tridiagonal matrix equivalent to T_l). The diagonal elements of this matrix are in D_l; those that converged correspond to a subset of the eigenvalues (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use is strideE >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for batch instance l. If info[l] = i with 0 < i <= n and evect is rocblas_evect_none, i off-diagonal elements of an intermediate tridiagonal form did not converge to zero. If info[l] = i with 0 < i <= n and evect is rocblas_evect_original, the algorithm failed to compute an eigenvalue in the submatrix from [i/(n+1), i/(n+1)] to [i%(n+1), i%(n+1)]. If info[l] = n + i, the leading minor of order i of B_l is not positive definite.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
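
Since each batch instance reports its own info[l], a caller usually needs to scan the whole array after the call. A minimal host-side sketch, assuming info has already been copied from the GPU into a std::vector; BatchReport and scan_batch_info are hypothetical names, and batch indices are 0-based here:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Batch indices grouped by the kind of failure reported in info[l].
struct BatchReport
{
    std::vector<int> eigen_failure;      // 0 < info[l] <= n
    std::vector<int> b_not_pos_definite; // info[l] > n
};

inline BatchReport scan_batch_info(const std::vector<int>& info, int n)
{
    assert(n >= 0);
    BatchReport report;
    for(std::size_t l = 0; l < info.size(); ++l)
    {
        if(info[l] == 0)
            continue; // instance l succeeded
        if(info[l] <= n)
            report.eigen_failure.push_back(static_cast<int>(l));
        else
            report.b_not_pos_definite.push_back(static_cast<int>(l));
    }
    return report;
}
```

Instances listed in b_not_pos_definite never reached the eigenvalue computation, so they can be reported separately from convergence failures.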

rocsolver_<type>sygvd_strided_batched()

rocblas_status rocsolver_dsygvd_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *B, const rocblas_int ldb, const rocblas_stride strideB, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_ssygvd_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *B, const rocblas_int ldb, const rocblas_stride strideB, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)

SYGVD_STRIDED_BATCHED computes the eigenvalues and (optionally) eigenvectors of a batch of real generalized symmetric-definite eigenproblems.

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. The eigenvectors are computed using a divide-and-conquer algorithm, depending on the value of evect.

When computed, the matrix \(Z_l\) of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z_l^T B_l^{} Z_l^{}=I & \: \text{if 1st or 2nd form, or}\\ Z_l^T B_l^{-1} Z_l^{}=I & \: \text{if 3rd form.} \end{array} \end{split}\]

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A_l and B_l are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the symmetric matrices A_l. On exit, if evect is original, the normalized matrices Z_l of eigenvectors. If evect is none, then the upper or lower triangular parts of the matrices A_l (including the diagonal) are destroyed, depending on the value of uplo.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use is strideA >= lda*n.

  • B[inout] pointer to type. Array on the GPU (the size depends on the value of strideB). On entry, the symmetric positive definite matrices B_l. On exit, the triangular factor of B_l as returned by POTRF_STRIDED_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • strideB[in] rocblas_stride. Stride from the start of one matrix B_l to the next one B_(l+1). There is no restriction for the value of strideB. Normal use is strideB >= ldb*n.

  • D[out] pointer to type. Array on the GPU (the size depends on the value of strideD). On exit, the eigenvalues in increasing order.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use is strideD >= n.

  • E[out] pointer to type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the tridiagonal matrix T_l associated with the l-th reduced eigenvalue problem. On exit, if 0 < info[l] <= n, it contains the unconverged off-diagonal elements of T_l (or properly speaking, a tridiagonal matrix equivalent to T_l). The diagonal elements of this matrix are in D_l; those that converged correspond to a subset of the eigenvalues (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use is strideE >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for batch instance l. If info[l] = i with 0 < i <= n and evect is rocblas_evect_none, i off-diagonal elements of an intermediate tridiagonal form did not converge to zero. If info[l] = i with 0 < i <= n and evect is rocblas_evect_original, the algorithm failed to compute an eigenvalue in the submatrix from [i/(n+1), i/(n+1)] to [i%(n+1), i%(n+1)]. If info[l] = n + i, the leading minor of order i of B_l is not positive definite.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
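
The normalization \(Z_l^T B_l Z_l = I\) stated for the 1st and 2nd forms can be checked on the host for a single instance. The sketch below assumes full n-by-n column-major matrices with leading dimension n, and it needs a copy of the original B_l saved before the call, because B is overwritten with its triangular factor on exit. normalization_error is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns max over i,j of |(Z^T B Z)[i,j] - I[i,j]|.
inline double normalization_error(const std::vector<double>& Z, const std::vector<double>& B, int n)
{
    assert(n >= 0);
    assert(Z.size() == static_cast<std::size_t>(n) * static_cast<std::size_t>(n));
    assert(B.size() == Z.size());
    double worst = 0.0;
    for(int i = 0; i < n; ++i)
        for(int j = 0; j < n; ++j)
        {
            double s = 0.0; // accumulates (Z^T B Z)[i,j]
            for(int p = 0; p < n; ++p)
                for(int q = 0; q < n; ++q)
                    s += Z[p + i * n] * B[p + q * n] * Z[q + j * n];
            worst = std::max(worst, std::fabs(s - (i == j ? 1.0 : 0.0)));
        }
    return worst;
}
```

A result close to machine precision (scaled by the size of the entries of B) suggests the eigenvectors are normalized as documented; the quadruple loop is only meant for small test matrices.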

rocsolver_<type>hegvd()

rocblas_status rocsolver_zhegvd(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *B, const rocblas_int ldb, double *D, double *E, rocblas_int *info)
rocblas_status rocsolver_chegvd(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *B, const rocblas_int ldb, float *D, float *E, rocblas_int *info)

HEGVD computes the eigenvalues and (optionally) eigenvectors of a complex generalized hermitian-definite eigenproblem.

The problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A X = \lambda B X & \: \text{1st form,}\\ A B X = \lambda X & \: \text{2nd form, or}\\ B A X = \lambda X & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. The eigenvectors are computed using a divide-and-conquer algorithm, depending on the value of evect.

When computed, the matrix Z of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z^H B Z=I & \: \text{if 1st or 2nd form, or}\\ Z^H B^{-1} Z=I & \: \text{if 3rd form.} \end{array} \end{split}\]

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblem.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A and B are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A and B are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the hermitian matrix A. On exit, if evect is original, the normalized matrix Z of eigenvectors. If evect is none, then the upper or lower triangular part of the matrix A (including the diagonal) is destroyed, depending on the value of uplo.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A.

  • B[inout] pointer to type. Array on the GPU of dimension ldb*n. On entry, the hermitian positive definite matrix B. On exit, the triangular factor of B as returned by POTRF.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B.

  • D[out] pointer to real type. Array on the GPU of dimension n. On exit, the eigenvalues in increasing order.

  • E[out] pointer to real type. Array on the GPU of dimension n. This array is used to work internally with the tridiagonal matrix T associated with the reduced eigenvalue problem. On exit, if 0 < info <= n, it contains the unconverged off-diagonal elements of T (or properly speaking, a tridiagonal matrix equivalent to T). The diagonal elements of this matrix are in D; those that converged correspond to a subset of the eigenvalues (not necessarily ordered).

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i with 0 < i <= n and evect is rocblas_evect_none, i off-diagonal elements of an intermediate tridiagonal form did not converge to zero. If info = i with 0 < i <= n and evect is rocblas_evect_original, the algorithm failed to compute an eigenvalue in the submatrix from [i/(n+1), i/(n+1)] to [i%(n+1), i%(n+1)]. If info = n + i, the leading minor of order i of B is not positive definite.
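
As the uplo parameter explains, only one triangle of A and B is read. When a full hermitian matrix is needed on the host, for instance to verify results, the stored triangle can be mirrored into the other one. A minimal sketch for the lower-triangle case, assuming column-major storage; cplx and mirror_lower_hermitian are names introduced here for illustration:

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Overwrite the strictly upper triangle so that A[j,i] = conj(A[i,j]) for i > j.
// A is column-major with leading dimension lda; indices are 0-based here.
inline void mirror_lower_hermitian(std::vector<cplx>& A, int n, int lda)
{
    assert(n >= 0 && lda >= n);
    assert(A.size() >= static_cast<std::size_t>(lda) * static_cast<std::size_t>(n));
    for(int j = 0; j < n; ++j)
        for(int i = j + 1; i < n; ++i)
            A[j + i * lda] = std::conj(A[i + j * lda]); // row j, column i <- conj(row i, column j)
}
```

The diagonal is left untouched; for a hermitian matrix its imaginary parts are expected to be zero.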

rocsolver_<type>hegvd_batched()

rocblas_status rocsolver_zhegvd_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *const B[], const rocblas_int ldb, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_chegvd_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *const B[], const rocblas_int ldb, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)

HEGVD_BATCHED computes the eigenvalues and (optionally) eigenvectors of a batch of complex generalized hermitian-definite eigenproblems.

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. The eigenvectors are computed using a divide-and-conquer algorithm, depending on the value of evect.

When computed, the matrix \(Z_l\) of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z_l^H B_l^{} Z_l^{}=I & \: \text{if 1st or 2nd form, or}\\ Z_l^H B_l^{-1} Z_l^{}=I & \: \text{if 3rd form.} \end{array} \end{split}\]

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A_l and B_l are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the hermitian matrices A_l. On exit, if evect is original, the normalized matrices Z_l of eigenvectors. If evect is none, then the upper or lower triangular parts of the matrices A_l (including the diagonal) are destroyed, depending on the value of uplo.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • B[inout] array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*n. On entry, the hermitian positive definite matrices B_l. On exit, the triangular factor of B_l as returned by POTRF_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • D[out] pointer to real type. Array on the GPU (the size depends on the value of strideD). On exit, the eigenvalues in increasing order.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use is strideD >= n.

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the tridiagonal matrix T_l associated with the l-th reduced eigenvalue problem. On exit, if 0 < info[l] <= n, it contains the unconverged off-diagonal elements of T_l (or properly speaking, a tridiagonal matrix equivalent to T_l). The diagonal elements of this matrix are in D_l; those that converged correspond to a subset of the eigenvalues (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use is strideE >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for batch instance l. If info[l] = i with 0 < i <= n and evect is rocblas_evect_none, i off-diagonal elements of an intermediate tridiagonal form did not converge to zero. If info[l] = i with 0 < i <= n and evect is rocblas_evect_original, the algorithm failed to compute an eigenvalue in the submatrix from [i/(n+1), i/(n+1)] to [i%(n+1), i%(n+1)]. If info[l] = n + i, the leading minor of order i of B_l is not positive definite.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
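
Which of the three problem forms was solved determines how a computed eigenpair should be checked. The following host-side sketch evaluates the residual of one eigenpair for each form. It assumes full column-major n-by-n copies of the original A_l and B_l with leading dimension n; the form is passed as a plain integer 1, 2 or 3 rather than a rocblas_eform value, and matvec and eigen_residual are hypothetical helpers.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

using cplx = std::complex<double>;
using vec = std::vector<cplx>;

// y = M x for a full column-major n-by-n matrix M.
inline vec matvec(const vec& M, const vec& x, int n)
{
    vec y(n, cplx(0.0, 0.0));
    for(int j = 0; j < n; ++j)
        for(int i = 0; i < n; ++i)
            y[i] += M[i + j * n] * x[j];
    return y;
}

// Largest absolute entry of lhs - lambda * rhs for the chosen form.
inline double eigen_residual(int form, const vec& A, const vec& B, const vec& x, double lambda, int n)
{
    assert(form >= 1 && form <= 3 && n >= 0);
    vec lhs, rhs;
    if(form == 1) { lhs = matvec(A, x, n); rhs = matvec(B, x, n); }       // A x = lambda B x
    else if(form == 2) { lhs = matvec(A, matvec(B, x, n), n); rhs = x; } // A B x = lambda x
    else { lhs = matvec(B, matvec(A, x, n), n); rhs = x; }               // B A x = lambda x
    double worst = 0.0;
    for(int i = 0; i < n; ++i)
        worst = std::max(worst, std::abs(lhs[i] - lambda * rhs[i]));
    return worst;
}
```

For diagonal A = diag(2, 3) and B = diag(4, 5), the first unit vector is an eigenvector with eigenvalue 0.5 in the 1st form and 8 in the 2nd and 3rd forms.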

rocsolver_<type>hegvd_strided_batched()

rocblas_status rocsolver_zhegvd_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *B, const rocblas_int ldb, const rocblas_stride strideB, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_chegvd_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *B, const rocblas_int ldb, const rocblas_stride strideB, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_int *info, const rocblas_int batch_count)

HEGVD_STRIDED_BATCHED computes the eigenvalues and (optionally) eigenvectors of a batch of complex generalized hermitian-definite eigenproblems.

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. When requested through evect, the eigenvectors are computed using a divide-and-conquer algorithm.

When computed, the matrix \(Z_l\) of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z_l^H B_l^{} Z_l^{}=I & \: \text{if 1st or 2nd form, or}\\ Z_l^H B_l^{-1} Z_l^{}=I & \: \text{if 3rd form.} \end{array} \end{split}\]
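For intuition, each of the three forms can be reduced to a standard hermitian eigenproblem through the Cholesky factor of \(B_l\), which is consistent with B holding the POTRF factor on exit. The sketch below is the textbook reduction used by reference LAPACK, written for the lower-triangular case \(B = L L^H\) with the batch index dropped for readability; that rocSOLVER performs exactly these steps internally is an assumption. Here \(C\) is the reduced matrix, \(Y\) its orthonormal eigenvectors and \(\Lambda\) the diagonal matrix of eigenvalues.

\[\begin{split} \begin{array}{cl} C = L^{-1} A L^{-H}, \quad C Y = Y \Lambda, \quad Z = L^{-H} Y & \: \text{1st form,}\\ C = L^{H} A L, \quad C Y = Y \Lambda, \quad Z = L^{-H} Y & \: \text{2nd form, or}\\ C = L^{H} A L, \quad C Y = Y \Lambda, \quad Z = L Y & \: \text{3rd form.} \end{array} \end{split}\]

Since \(Y^H Y = I\), the first two forms give \(Z^H B Z = Y^H L^{-1} L L^H L^{-H} Y = I\), and the third form gives \(Z^H B^{-1} Z = Y^H L^H L^{-H} L^{-1} L Y = I\), which matches the normalization stated above.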

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A_l and B_l are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the hermitian matrices A_l. On exit, if evect is rocblas_evect_original, the normalized matrix Z_l of eigenvectors. If evect is rocblas_evect_none, then the upper or lower triangular parts of the matrices A_l (including the diagonal) are destroyed, depending on the value of uplo.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use is strideA >= lda*n.

  • B[inout] pointer to type. Array on the GPU (the size depends on the value of strideB). On entry, the hermitian positive definite matrices B_l. On exit, the triangular factor of B_l as returned by POTRF_STRIDED_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • strideB[in] rocblas_stride. Stride from the start of one matrix B_l to the next one B_(l+1). There is no restriction for the value of strideB. Normal use is strideB >= ldb*n.

  • D[out] pointer to real type. Array on the GPU (the size depends on the value of strideD). On exit, the eigenvalues in increasing order.

  • strideD[in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use is strideD >= n.

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the tridiagonal matrix T_l associated with the l-th reduced eigenvalue problem. On exit, if 0 < info[l] <= n, it contains the unconverged off-diagonal elements of T_l (or properly speaking, a tridiagonal matrix equivalent to T_l). The diagonal elements of this matrix are in D_l; those that converged correspond to a subset of the eigenvalues (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use is strideE >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit of batch instance l. If info[l] = i <= n and evect is rocblas_evect_none, i off-diagonal elements of an intermediate tridiagonal form did not converge to zero. If info[l] = i <= n and evect is rocblas_evect_original, the algorithm failed to compute an eigenvalue in the submatrix from [i/(n+1), i/(n+1)] to [i%(n+1), i%(n+1)]. If info[l] = n + i, the leading minor of order i of B_l is not positive definite.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
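The stride parameters above determine how many elements each device array must hold. The following is a minimal sketch of that arithmetic, assuming the "normal use" case in which every stride is at least as large as one instance; the names strided_elems, hegvd_sb_sizes and hegvd_sb_element_counts are hypothetical and not part of rocSOLVER.

```c
#include <stddef.h>

/* Minimum number of elements for a strided batch whose instances each
   occupy `instance_elems` elements and start `stride` elements apart.
   Assumes stride >= instance_elems. */
size_t strided_elems(size_t instance_elems, size_t stride, size_t batch_count)
{
    if (batch_count == 0)
        return 0;
    return stride * (batch_count - 1) + instance_elems;
}

/* Element counts (not bytes) for the arrays of a strided-batched HEGVD call. */
typedef struct {
    size_t A, B, D, E, info;
} hegvd_sb_sizes;

hegvd_sb_sizes hegvd_sb_element_counts(int n, int lda, size_t strideA,
                                       int ldb, size_t strideB,
                                       size_t strideD, size_t strideE,
                                       int batch_count)
{
    hegvd_sb_sizes s;
    size_t bc = (size_t)batch_count;
    s.A = strided_elems((size_t)lda * (size_t)n, strideA, bc); /* lda*n per matrix */
    s.B = strided_elems((size_t)ldb * (size_t)n, strideB, bc); /* ldb*n per matrix */
    s.D = strided_elems((size_t)n, strideD, bc);               /* n eigenvalues */
    s.E = strided_elems((size_t)n, strideE, bc);               /* n work elements */
    s.info = bc;                                               /* one flag per instance */
    return s;
}
```

Allocating stride * batch_count elements instead is slightly larger and equally safe; multiply each count by the size of the element type to obtain bytes.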

rocsolver_<type>sygvx()#

rocblas_status rocsolver_dsygvx(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *B, const rocblas_int ldb, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, const double abstol, rocblas_int *nev, double *W, double *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)#
rocblas_status rocsolver_ssygvx(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *B, const rocblas_int ldb, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, const float abstol, rocblas_int *nev, float *W, float *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)#

SYGVX computes a set of the eigenvalues and optionally the corresponding eigenvectors of a real generalized symmetric-definite eigenproblem.

The problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A X = \lambda B X & \: \text{1st form,}\\ A B X = \lambda X & \: \text{2nd form, or}\\ B A X = \lambda X & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. The eigenvectors are computed depending on the value of evect.

When computed, the matrix Z of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z^T B Z=I & \: \text{if 1st or 2nd form, or}\\ Z^T B^{-1} Z=I & \: \text{if 3rd form.} \end{array} \end{split}\]
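The normalization for the 1st and 2nd forms can be checked on the host after copying the results back. The following is a minimal sketch with a hypothetical helper, b_orthonormality_error, that returns the largest entry of |Z^T B Z - I| for column-major arrays. It assumes B is a saved copy of the original, fully populated symmetric matrix, since the B passed to the solver is overwritten with its triangular factor.

```c
/* Largest absolute deviation of Z^T * B * Z from the identity.
   B is n-by-n, Z is n-by-nev, both column-major on the host. */
double b_orthonormality_error(int n, int nev,
                              const double *B, int ldb,
                              const double *Z, int ldz)
{
    double worst = 0.0;
    for (int p = 0; p < nev; ++p) {
        for (int q = 0; q < nev; ++q) {
            double s = 0.0; /* accumulates entry (p, q) of Z^T B Z */
            for (int i = 0; i < n; ++i) {
                double bz = 0.0; /* entry i of B * Z(:, q) */
                for (int j = 0; j < n; ++j)
                    bz += B[i + j * ldb] * Z[j + q * ldz];
                s += Z[i + p * ldz] * bz;
            }
            double dev = s - (p == q ? 1.0 : 0.0);
            if (dev < 0.0)
                dev = -dev;
            if (dev > worst)
                worst = dev;
        }
    }
    return worst;
}
```

A result on the order of machine precision times the size of the entries of B suggests correctly normalized eigenvectors; for the 3rd form the same loop would need the inverse of B instead.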

This function computes all the eigenvalues, all the eigenvalues in the half-open interval \((vl, vu]\), or the il-th through iu-th eigenvalues, depending on the value of erange. If evect is rocblas_evect_original, the eigenvectors for these eigenvalues will be computed as well.

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblem.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • erange[in] rocblas_erange. Specifies the type of range or interval of the eigenvalues to be computed.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A and B are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A and B are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A. On exit, the contents of A are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrix A.

  • B[inout] pointer to type. Array on the GPU of dimension ldb*n. On entry, the symmetric positive definite matrix B. On exit, the triangular factor of B as returned by POTRF.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B.

  • vl[in] type. vl < vu. The lower bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A or the eigenvalues within a set of indices.

  • vu[in] type. vl < vu. The upper bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A or the eigenvalues within a set of indices.

  • il[in] rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise. The index of the smallest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A or the eigenvalues in a half-open interval.

  • iu[in] rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise. The index of the largest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A or the eigenvalues in a half-open interval.

  • abstol[in] type. The absolute tolerance. An eigenvalue is considered to be located if it lies in an interval whose width is <= abstol. If abstol is negative, then machine-epsilon times the 1-norm of the tridiagonal form of A will be used as tolerance. If abstol=0, then the tolerance will be set to twice the underflow threshold; this is the tolerance that could get the most accurate results.

  • nev[out] pointer to a rocblas_int on the GPU. The total number of eigenvalues found. If erange is rocblas_erange_all, nev = n. If erange is rocblas_erange_index, nev = iu - il + 1. Otherwise, 0 <= nev <= n.

  • W[out] pointer to type. Array on the GPU of dimension n. The first nev elements contain the computed eigenvalues. (The remaining elements can be used as workspace for internal computations).

  • Z[out] pointer to type. Array on the GPU of dimension ldz*nev. On exit, if evect is not rocblas_evect_none and info = 0, the first nev columns contain the eigenvectors of A corresponding to the output eigenvalues. Not referenced if evect is rocblas_evect_none. Note: If erange is rocblas_erange_value, then the value of nev is not known in advance. The user should ensure that Z is large enough to hold n columns, as all n columns can be used as workspace for internal computations.

  • ldz[in] rocblas_int. ldz >= n. Specifies the leading dimension of matrix Z.

  • ifail[out] pointer to rocblas_int. Array on the GPU of dimension n. If info = 0, the first nev elements of ifail are zero. If info = i <= n, ifail contains the indices of the i eigenvectors that failed to converge. Not referenced if evect is rocblas_evect_none.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i <= n, i columns of Z did not converge. If info = n + i, the leading minor of order i of B is not positive definite.
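The relationship between erange, nev and the size of Z described above can be sketched on the host before allocating memory. The enum range_mode and the functions expected_nev and z_columns below are hypothetical stand-ins for illustration; real code would pass rocblas_erange values to the solver.

```c
/* Hypothetical mirror of the three erange modes. */
typedef enum { RANGE_ALL, RANGE_VALUE, RANGE_INDEX } range_mode;

/* Number of eigenvalues known before the call, or -1 when the
   documentation only guarantees 0 <= nev <= n (interval search). */
int expected_nev(range_mode mode, int n, int il, int iu)
{
    switch (mode) {
    case RANGE_ALL:
        return n;
    case RANGE_INDEX:
        return iu - il + 1;
    default:
        return -1;
    }
}

/* Columns to allocate for Z: n when nev is unknown in advance,
   otherwise exactly the expected number of eigenvectors. */
int z_columns(range_mode mode, int n, int il, int iu)
{
    int nev = expected_nev(mode, n, il, iu);
    return nev < 0 ? n : nev;
}
```

The Z array would then hold ldz * z_columns(...) elements. With n = 0 the documented values il = 1 and iu = 0 give zero columns in index mode.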

rocsolver_<type>sygvx_batched()#

rocblas_status rocsolver_dsygvx_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, double *const B[], const rocblas_int ldb, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, const double abstol, rocblas_int *nev, double *W, const rocblas_stride strideW, double *const Z[], const rocblas_int ldz, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssygvx_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, float *const B[], const rocblas_int ldb, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, const float abstol, rocblas_int *nev, float *W, const rocblas_stride strideW, float *const Z[], const rocblas_int ldz, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#

SYGVX_BATCHED computes a set of the eigenvalues and optionally the corresponding eigenvectors of a batch of real generalized symmetric-definite eigenproblems.

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. The eigenvectors are computed depending on the value of evect.

When computed, the matrix \(Z_l\) of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z_l^T B_l^{} Z_l^{}=I & \: \text{if 1st or 2nd form, or}\\ Z_l^T B_l^{-1} Z_l^{}=I & \: \text{if 3rd form.} \end{array} \end{split}\]

This function computes all the eigenvalues, all the eigenvalues in the half-open interval \((vl, vu]\), or the il-th through iu-th eigenvalues, depending on the value of erange. If evect is rocblas_evect_original, the eigenvectors for these eigenvalues will be computed as well.

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • erange[in] rocblas_erange. Specifies the type of range or interval of the eigenvalues to be computed.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A_l and B_l are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l. On exit, the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • B[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*n. On entry, the symmetric positive definite matrices B_l. On exit, the triangular factor of B_l as returned by POTRF_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • vl[in] type. vl < vu. The lower bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues within a set of indices.

  • vu[in] type. vl < vu. The upper bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues within a set of indices.

  • il[in] rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise. The index of the smallest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues in a half-open interval.

  • iu[in] rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise. The index of the largest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues in a half-open interval.

  • abstol[in] type. The absolute tolerance. An eigenvalue is considered to be located if it lies in an interval whose width is <= abstol. If abstol is negative, then machine-epsilon times the 1-norm of the tridiagonal form of A_l will be used as tolerance. If abstol=0, then the tolerance will be set to twice the underflow threshold; this is the tolerance that could get the most accurate results.

  • nev[out] pointer to rocblas_int. Array of batch_count integers on the GPU. The total number of eigenvalues found. If erange is rocblas_erange_all, nev[l] = n. If erange is rocblas_erange_index, nev[l] = iu - il + 1. Otherwise, 0 <= nev[l] <= n.

  • W[out] pointer to type. Array on the GPU (the size depends on the value of strideW). The first nev[l] elements contain the computed eigenvalues. (The remaining elements can be used as workspace for internal computations).

  • strideW[in] rocblas_stride. Stride from the start of one vector W_l to the next one W_(l+1). There is no restriction for the value of strideW. Normal use case is strideW >= n.

  • Z[out] Array of pointers to type. Each pointer points to an array on the GPU of dimension ldz*nev[l]. On exit, if evect is not rocblas_evect_none and info[l] = 0, the first nev[l] columns contain the eigenvectors of A_l corresponding to the output eigenvalues. Not referenced if evect is rocblas_evect_none. Note: If erange is rocblas_erange_value, then the values of nev[l] are not known in advance. The user should ensure that Z_l is large enough to hold n columns, as all n columns can be used as workspace for internal computations.

  • ldz[in] rocblas_int. ldz >= n. Specifies the leading dimension of matrices Z_l.

  • ifail[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideF). If info[l] = 0, the first nev[l] elements of ifail_l are zero. If info[l] = i <= n, ifail_l contains the indices of the i eigenvectors that failed to converge. Not referenced if evect is rocblas_evect_none.

  • strideF[in] rocblas_stride. Stride from the start of one vector ifail_l to the next one ifail_(l+1). There is no restriction for the value of strideF. Normal use case is strideF >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit of batch instance l. If info[l] = i <= n, i columns of Z_l did not converge. If info[l] = n + i, the leading minor of order i of B_l is not positive definite.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
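Once W and nev have been copied to the host, the eigenvalues of one batch instance can be located through strideW. The following is a minimal sketch; eig_view and instance_eigenvalues are hypothetical names, W_host and nev_host are assumed to be host copies of the device arrays, and the index l is 0-based as usual in C (the text above counts instances from 1).

```c
#include <stddef.h>

/* Read-only view of the eigenvalues computed for one batch instance. */
typedef struct {
    const double *values; /* start of W_l */
    int count;            /* nev[l]: number of valid leading entries */
} eig_view;

/* Locate W_l inside the strided array W. Only the first nev[l] entries
   are eigenvalues; the rest of the slot may hold workspace data. */
eig_view instance_eigenvalues(const double *W_host, ptrdiff_t strideW,
                              const int *nev_host, int l)
{
    eig_view v;
    v.values = W_host + (ptrdiff_t)l * strideW;
    v.count = nev_host[l];
    return v;
}
```

The same offset rule, with strideF in place of strideW, would locate ifail_l inside ifail.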

rocsolver_<type>sygvx_strided_batched()#

rocblas_status rocsolver_dsygvx_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *B, const rocblas_int ldb, const rocblas_stride strideB, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, const double abstol, rocblas_int *nev, double *W, const rocblas_stride strideW, double *Z, const rocblas_int ldz, const rocblas_stride strideZ, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssygvx_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *B, const rocblas_int ldb, const rocblas_stride strideB, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, const float abstol, rocblas_int *nev, float *W, const rocblas_stride strideW, float *Z, const rocblas_int ldz, const rocblas_stride strideZ, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#

SYGVX_STRIDED_BATCHED computes a set of the eigenvalues and optionally the corresponding eigenvectors of a batch of real generalized symmetric-definite eigenproblems.

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. The eigenvectors are computed depending on the value of evect.

When computed, the matrix \(Z_l\) of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z_l^T B_l^{} Z_l^{}=I & \: \text{if 1st or 2nd form, or}\\ Z_l^T B_l^{-1} Z_l^{}=I & \: \text{if 3rd form.} \end{array} \end{split}\]

This function computes all the eigenvalues, all the eigenvalues in the half-open interval \((vl, vu]\), or the il-th through iu-th eigenvalues, depending on the value of erange. If evect is rocblas_evect_original, the eigenvectors for these eigenvalues will be computed as well.

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • erange[in] rocblas_erange. Specifies the type of range or interval of the eigenvalues to be computed.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A_l and B_l are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l. On exit, the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • B[inout] pointer to type. Array on the GPU (the size depends on the value of strideB). On entry, the symmetric positive definite matrices B_l. On exit, the triangular factor of B_l as returned by POTRF_STRIDED_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • strideB[in] rocblas_stride. Stride from the start of one matrix B_l to the next one B_(l+1). There is no restriction for the value of strideB. Normal use is strideB >= ldb*n.

  • vl[in] type. vl < vu. The lower bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues within a set of indices.

  • vu[in] type. vl < vu. The upper bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues within a set of indices.

  • il[in] rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise. The index of the smallest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues in a half-open interval.

  • iu[in] rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise. The index of the largest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues in a half-open interval.

  • abstol[in] type. The absolute tolerance. An eigenvalue is considered to be located if it lies in an interval whose width is <= abstol. If abstol is negative, then machine-epsilon times the 1-norm of the tridiagonal form of A_l will be used as tolerance. If abstol=0, then the tolerance will be set to twice the underflow threshold; this is the tolerance that could get the most accurate results.

  • nev[out] pointer to rocblas_int. Array of batch_count integers on the GPU. The total number of eigenvalues found. If erange is rocblas_erange_all, nev[l] = n. If erange is rocblas_erange_index, nev[l] = iu - il + 1. Otherwise, 0 <= nev[l] <= n.

  • W[out] pointer to type. Array on the GPU (the size depends on the value of strideW). The first nev[l] elements contain the computed eigenvalues. (The remaining elements can be used as workspace for internal computations).

  • strideW[in] rocblas_stride. Stride from the start of one vector W_l to the next one W_(l+1). There is no restriction for the value of strideW. Normal use case is strideW >= n.

  • Z[out] pointer to type. Array on the GPU (the size depends on the value of strideZ). On exit, if evect is not rocblas_evect_none and info[l] = 0, the first nev[l] columns contain the eigenvectors of A_l corresponding to the output eigenvalues. Not referenced if evect is rocblas_evect_none.

  • ldz[in] rocblas_int. ldz >= n. Specifies the leading dimension of matrices Z_l.

  • strideZ[in] rocblas_stride. Stride from the start of one matrix Z_l to the next one Z_(l+1). There is no restriction for the value of strideZ. Normal use case is strideZ >= ldz*nev[l]. Note: If erange is rocblas_erange_value, then the values of nev[l] are not known in advance. The user should ensure that Z_l is large enough to hold n columns, as all n columns can be used as workspace for internal computations.

  • ifail[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideF). If info[l] = 0, the first nev[l] elements of ifail_l are zero. If info[l] = i <= n, ifail_l contains the indices of the i eigenvectors that failed to converge. Not referenced if evect is rocblas_evect_none.

  • strideF[in] rocblas_stride. Stride from the start of one vector ifail_l to the next one ifail_(l+1). There is no restriction for the value of strideF. Normal use case is strideF >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit of batch instance l. If info[l] = i <= n, i columns of Z_l did not converge. If info[l] = n + i, the leading minor of order i of B_l is not positive definite.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
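The three cases in the abstol description above can be summarized as a small selection rule. The following is a minimal sketch for double precision; effective_abstol is a hypothetical name, t_norm1 stands for the 1-norm of the tridiagonal form of A_l, and DBL_EPSILON and DBL_MIN are used as stand-ins for machine-epsilon and the underflow threshold. The exact constants used inside the library may differ.

```c
#include <float.h>

/* Tolerance actually applied for a given abstol argument, following the
   documented rule: positive values are used as given, a negative value
   selects eps * ||T||_1, and zero selects twice the underflow threshold. */
double effective_abstol(double abstol, double t_norm1)
{
    if (abstol < 0.0)
        return DBL_EPSILON * t_norm1; /* assumption: DBL_EPSILON as machine-epsilon */
    if (abstol == 0.0)
        return 2.0 * DBL_MIN;         /* assumption: DBL_MIN as underflow threshold */
    return abstol;
}
```

Passing abstol = 0 therefore requests the tightest tolerance, which the description above identifies as the setting that could give the most accurate results.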

rocsolver_<type>hegvx()#

rocblas_status rocsolver_zhegvx(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *B, const rocblas_int ldb, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, const double abstol, rocblas_int *nev, double *W, rocblas_double_complex *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)#
rocblas_status rocsolver_chegvx(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *B, const rocblas_int ldb, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, const float abstol, rocblas_int *nev, float *W, rocblas_float_complex *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)#

HEGVX computes a set of the eigenvalues and optionally the corresponding eigenvectors of a complex generalized hermitian-definite eigenproblem.

The problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A X = \lambda B X & \: \text{1st form,}\\ A B X = \lambda X & \: \text{2nd form, or}\\ B A X = \lambda X & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. The eigenvectors are computed depending on the value of evect.

When computed, the matrix Z of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z^H B Z=I & \: \text{if 1st or 2nd form, or}\\ Z^H B^{-1} Z=I & \: \text{if 3rd form.} \end{array} \end{split}\]

This function computes all the eigenvalues, all the eigenvalues in the half-open interval \((vl, vu]\), or the il-th through iu-th eigenvalues, depending on the value of erange. If evect is rocblas_evect_original, the eigenvectors for these eigenvalues will be computed as well.

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblem.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • erange[in] rocblas_erange. Specifies the type of range or interval of the eigenvalues to be computed.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A and B are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A and B are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A. On exit, the contents of A are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrix A.

  • B[inout] pointer to type. Array on the GPU of dimension ldb*n. On entry, the hermitian positive definite matrix B. On exit, the triangular factor of B as returned by POTRF.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B.

  • vl[in] real type. vl < vu. The lower bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A or the eigenvalues within a set of indices.

  • vu[in] real type. vl < vu. The upper bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A or the eigenvalues within a set of indices.

  • il[in] rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise. The index of the smallest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A or the eigenvalues in a half-open interval.

  • iu[in] rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise. The index of the largest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A or the eigenvalues in a half-open interval.

  • abstol[in] real type. The absolute tolerance. An eigenvalue is considered to be located if it lies in an interval whose width is <= abstol. If abstol is negative, then machine-epsilon times the 1-norm of the tridiagonal form of A will be used as tolerance. If abstol=0, then the tolerance will be set to twice the underflow threshold; this is the tolerance that could get the most accurate results.

  • nev[out] pointer to a rocblas_int on the GPU. The total number of eigenvalues found. If erange is rocblas_erange_all, nev = n. If erange is rocblas_erange_index, nev = iu - il + 1. Otherwise, 0 <= nev <= n.

  • W[out] pointer to real type. Array on the GPU of dimension n. The first nev elements contain the computed eigenvalues. (The remaining elements can be used as workspace for internal computations).

  • Z[out] pointer to type. Array on the GPU of dimension ldz*nev. On exit, if evect is not rocblas_evect_none and info = 0, the first nev columns contain the eigenvectors of A corresponding to the output eigenvalues. Not referenced if evect is rocblas_evect_none. Note: If erange is rocblas_erange_value, then the value of nev is not known in advance. The user should ensure that Z is large enough to hold n columns, as all n columns can be used as workspace for internal computations.

  • ldz[in] rocblas_int. ldz >= n. Specifies the leading dimension of matrix Z.

  • ifail[out] pointer to rocblas_int. Array on the GPU of dimension n. If info = 0, the first nev elements of ifail are zero. If info = i <= n, ifail contains the indices of the i eigenvectors that failed to converge. Not referenced if evect is rocblas_evect_none.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i <= n, i columns of Z did not converge. If info = n + i, the leading minor of order i of B is not positive definite.
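The info and ifail outputs described above can be interpreted together on the host. The following is a minimal sketch with hypothetical names (gvx_status, classify_gvx_info, collect_failed); it assumes ifail_host is a host copy of ifail and that entries belonging to converged eigenvectors are zero, so scanning the first nev entries for nonzero values recovers the failed indices wherever they are stored.

```c
/* Hypothetical classification of the info value; not rocSOLVER API. */
typedef enum {
    GVX_OK,                    /* info == 0 */
    GVX_VECTORS_NOT_CONVERGED, /* 0 < info <= n: info columns of Z failed */
    GVX_B_NOT_POSDEF,          /* info == n + i */
    GVX_OUT_OF_RANGE           /* value not covered by the description */
} gvx_status;

gvx_status classify_gvx_info(int info, int n)
{
    if (info == 0)
        return GVX_OK;
    if (info > 0 && info <= n)
        return GVX_VECTORS_NOT_CONVERGED;
    if (info > n && info <= 2 * n)
        return GVX_B_NOT_POSDEF;
    return GVX_OUT_OF_RANGE;
}

/* Copy the nonzero entries among the first nev elements of ifail into
   `failed` (which must hold at least nev ints) and return how many
   were found. */
int collect_failed(const int *ifail_host, int nev, int *failed)
{
    int count = 0;
    for (int k = 0; k < nev; ++k) {
        if (ifail_host[k] != 0)
            failed[count++] = ifail_host[k];
    }
    return count;
}
```

When classify_gvx_info reports GVX_VECTORS_NOT_CONVERGED, the count returned by collect_failed would be expected to equal info; a mismatch suggests the assumption about zero entries does not hold for that result.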

rocsolver_<type>hegvx_batched()#

rocblas_status rocsolver_zhegvx_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *const B[], const rocblas_int ldb, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, const double abstol, rocblas_int *nev, double *W, const rocblas_stride strideW, rocblas_double_complex *const Z[], const rocblas_int ldz, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_chegvx_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *const B[], const rocblas_int ldb, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, const float abstol, rocblas_int *nev, float *W, const rocblas_stride strideW, rocblas_float_complex *const Z[], const rocblas_int ldz, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#

HEGVX_BATCHED computes a set of the eigenvalues and optionally the corresponding eigenvectors of a batch of complex generalized hermitian-definite eigenproblems.

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. The eigenvectors are computed depending on the value of evect.

When computed, the matrix \(Z_l\) of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z_l^H B_l^{} Z_l^{}=I & \: \text{if 1st or 2nd form, or}\\ Z_l^H B_l^{-1} Z_l^{}=I & \: \text{if 3rd form.} \end{array} \end{split}\]

This function computes all the eigenvalues, all the eigenvalues in the half-open interval \((vl, vu]\), or the il-th through iu-th eigenvalues, depending on the value of erange. If evect is rocblas_evect_original, the eigenvectors for these eigenvalues will be computed as well.
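The three selection modes can be illustrated with a small host-side sketch. It does not show how rocSOLVER computes eigenvalues; it only mimics which values would be reported for a given erange. The names `ERange` and `select_eigenvalues` are hypothetical, and the sketch assumes the eigenvalues are indexed in ascending order, as in LAPACK.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for rocblas_erange so the sketch builds without rocBLAS headers.
enum class ERange { All, Value, Index };

// Given eigenvalues sorted in ascending order (an assumption of this sketch),
// return the ones selected by erange. il and iu are 1-based, as in this reference.
std::vector<double> select_eigenvalues(const std::vector<double>& sorted_eigs,
                                       ERange erange,
                                       double vl, double vu,
                                       int il, int iu)
{
    // vl < vu is only required when a value interval is requested.
    assert(erange != ERange::Value || vl < vu);

    std::vector<double> out;
    const int n = static_cast<int>(sorted_eigs.size());
    for (int k = 1; k <= n; ++k) {
        const double w = sorted_eigs[k - 1];
        bool keep = false;
        switch (erange) {
        case ERange::All:   keep = true; break;
        case ERange::Value: keep = (vl < w && w <= vu); break;   // half-open (vl, vu]
        case ERange::Index: keep = (il <= k && k <= iu); break;  // il-th through iu-th
        }
        if (keep) out.push_back(w);
    }
    return out;
}
```

Note that with a value interval the lower bound is excluded and the upper bound is included, so an eigenvalue exactly equal to vl is not reported.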

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • erange[in] rocblas_erange. Specifies the type of range or interval of the eigenvalues to be computed.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A_l and B_l are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l. On exit, the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • B[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension ldb*n. On entry, the hermitian positive definite matrices B_l. On exit, the triangular factor of B_l as returned by POTRF_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • vl[in] real type. vl < vu. The lower bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues within a set of indices.

  • vu[in] real type. vl < vu. The upper bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues within a set of indices.

  • il[in] rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise. The index of the smallest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues in a half-open interval.

  • iu[in] rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise. The index of the largest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues in a half-open interval.

  • abstol[in] real type. The absolute tolerance. An eigenvalue is considered to be located if it lies in an interval whose width is <= abstol. If abstol is negative, then machine-epsilon times the 1-norm of the tridiagonal form of A_l will be used as tolerance. If abstol=0, then the tolerance will be set to twice the underflow threshold; this is the tolerance that could get the most accurate results.

  • nev[out] pointer to rocblas_int. Array of batch_count integers on the GPU. The total number of eigenvalues found. If erange is rocblas_erange_all, nev[l] = n. If erange is rocblas_erange_index, nev[l] = iu - il + 1. Otherwise, 0 <= nev[l] <= n.

  • W[out] pointer to real type. Array on the GPU (the size depends on the value of strideW). The first nev[l] elements contain the computed eigenvalues. (The remaining elements can be used as workspace for internal computations).

  • strideW[in] rocblas_stride. Stride from the start of one vector W_l to the next one W_(l+1). There is no restriction for the value of strideW. Normal use case is strideW >= n.

  • Z[out] Array of pointers to type. Each pointer points to an array on the GPU of dimension ldz*nev[l]. On exit, if evect is not rocblas_evect_none and info[l] = 0, the first nev[l] columns contain the eigenvectors of A_l corresponding to the output eigenvalues. Not referenced if evect is rocblas_evect_none. Note: If erange is rocblas_erange_value, then the values of nev[l] are not known in advance. The user should ensure that Z_l is large enough to hold n columns, as all n columns can be used as workspace for internal computations.

  • ldz[in] rocblas_int. ldz >= n. Specifies the leading dimension of matrices Z_l.

  • ifail[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideF). If info[l] = 0, the first nev[l] elements of ifail_l are zero. If info[l] = i <= n, ifail_l contains the indices of the i eigenvectors that failed to converge. Not referenced if evect is rocblas_evect_none.

  • strideF[in] rocblas_stride. Stride from the start of one vector ifail_l to the next one ifail_(l+1). There is no restriction for the value of strideF. Normal use case is strideF >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit of batch instance l. If info[l] = i <= n, i columns of Z_l did not converge. If info[l] = n + i, the leading minor of order i of B_l is not positive definite.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
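The encoding of info[l] described above can be unpacked on the host once the array has been copied back from the GPU. The following is a minimal sketch; `describe_hegvx_info` is a hypothetical helper, not part of rocSOLVER.

```cpp
#include <cassert>
#include <string>

// Interprets a single info[l] value (already copied to the host)
// for a problem of dimension n, following the parameter description above.
std::string describe_hegvx_info(int info, int n)
{
    assert(n >= 0 && info >= 0);
    if (info == 0)
        return "success";
    if (info <= n)  // info = i <= n: i columns of Z_l did not converge
        return std::to_string(info) + " eigenvector(s) did not converge";
    // info = n + i: the leading minor of order i of B_l is not positive definite
    return "leading minor of order " + std::to_string(info - n)
           + " of B is not positive definite";
}
```

In the first failure case, the corresponding ifail_l entries identify which eigenvectors failed; in the second case, the factorization of B_l could not be completed, so no eigenvalues are available for that instance.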

rocsolver_<type>hegvx_strided_batched()

rocblas_status rocsolver_zhegvx_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *B, const rocblas_int ldb, const rocblas_stride strideB, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, const double abstol, rocblas_int *nev, double *W, const rocblas_stride strideW, rocblas_double_complex *Z, const rocblas_int ldz, const rocblas_stride strideZ, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_chegvx_strided_batched(rocblas_handle handle, const rocblas_eform itype, const rocblas_evect evect, const rocblas_erange erange, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *B, const rocblas_int ldb, const rocblas_stride strideB, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, const float abstol, rocblas_int *nev, float *W, const rocblas_stride strideW, rocblas_float_complex *Z, const rocblas_int ldz, const rocblas_stride strideZ, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)

HEGVX_STRIDED_BATCHED computes a set of the eigenvalues and optionally the corresponding eigenvectors of a batch of complex generalized hermitian-definite eigenproblems.

For each instance in the batch, the problem solved by this function is either of the form

\[\begin{split} \begin{array}{cl} A_l X_l = \lambda B_l X_l & \: \text{1st form,}\\ A_l B_l X_l = \lambda X_l & \: \text{2nd form, or}\\ B_l A_l X_l = \lambda X_l & \: \text{3rd form,} \end{array} \end{split}\]

depending on the value of itype. The eigenvectors are computed depending on the value of evect.

When computed, the matrix \(Z_l\) of eigenvectors is normalized as follows:

\[\begin{split} \begin{array}{cl} Z_l^H B_l^{} Z_l^{}=I & \: \text{if 1st or 2nd form, or}\\ Z_l^H B_l^{-1} Z_l^{}=I & \: \text{if 3rd form.} \end{array} \end{split}\]
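As a sketch of where this normalization comes from, assume the standard LAPACK-style reduction to a standard eigenproblem through the lower Cholesky factor \(B_l = L_l L_l^H\) (the upper case is analogous, and the internal steps of rocSOLVER may differ in detail). If \(Y_l\) holds orthonormal eigenvectors of the reduced matrix \(C_l\), so that \(Y_l^H Y_l = I\), then:

\[ \begin{array}{ll} \text{1st form:} & C_l = L_l^{-1} A_l L_l^{-H}, \quad Z_l = L_l^{-H} Y_l, \quad Z_l^H B_l Z_l = Y_l^H L_l^{-1} \left(L_l L_l^H\right) L_l^{-H} Y_l = I \\ \text{2nd form:} & C_l = L_l^H A_l L_l, \quad Z_l = L_l^{-H} Y_l, \quad Z_l^H B_l Z_l = Y_l^H L_l^{-1} \left(L_l L_l^H\right) L_l^{-H} Y_l = I \\ \text{3rd form:} & C_l = L_l^H A_l L_l, \quad Z_l = L_l Y_l, \quad Z_l^H B_l^{-1} Z_l = Y_l^H L_l^H \left(L_l L_l^H\right)^{-1} L_l Y_l = I \end{array} \]

In each case the back-transformation applied to \(Y_l\) is what turns ordinary orthonormality into the B-weighted normalization stated above.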

This function computes all the eigenvalues, all the eigenvalues in the half-open interval \((vl, vu]\), or the il-th through iu-th eigenvalues, depending on the value of erange. If evect is rocblas_evect_original, the eigenvectors for these eigenvalues will be computed as well.

Parameters:
  • handle[in] rocblas_handle.

  • itype[in] rocblas_eform. Specifies the form of the generalized eigenproblems.

  • evect[in] rocblas_evect. Specifies whether the eigenvectors are to be computed. If evect is rocblas_evect_original, then the eigenvectors are computed. rocblas_evect_tridiagonal is not supported.

  • erange[in] rocblas_erange. Specifies the type of range or interval of the eigenvalues to be computed.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower parts of the matrices A_l and B_l are stored. If uplo indicates lower (or upper), then the upper (or lower) parts of A_l and B_l are not used.

  • n[in] rocblas_int. n >= 0. The matrix dimensions.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l. On exit, the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • B[inout] pointer to type. Array on the GPU (the size depends on the value of strideB). On entry, the hermitian positive definite matrices B_l. On exit, the triangular factor of B_l as returned by POTRF_STRIDED_BATCHED.

  • ldb[in] rocblas_int. ldb >= n. Specifies the leading dimension of B_l.

  • strideB[in] rocblas_stride. Stride from the start of one matrix B_l to the next one B_(l+1). There is no restriction for the value of strideB. Normal use case is strideB >= ldb*n.

  • vl[in] real type. vl < vu. The lower bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues within a set of indices.

  • vu[in] real type. vl < vu. The upper bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues within a set of indices.

  • il[in] rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise. The index of the smallest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues in a half-open interval.

  • iu[in] rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise. The index of the largest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of A_l or the eigenvalues in a half-open interval.

  • abstol[in] real type. The absolute tolerance. An eigenvalue is considered to be located if it lies in an interval whose width is <= abstol. If abstol is negative, then machine-epsilon times the 1-norm of the tridiagonal form of A_l will be used as tolerance. If abstol=0, then the tolerance will be set to twice the underflow threshold; this is the tolerance that could get the most accurate results.

  • nev[out] pointer to rocblas_int. Array of batch_count integers on the GPU. The total number of eigenvalues found. If erange is rocblas_erange_all, nev[l] = n. If erange is rocblas_erange_index, nev[l] = iu - il + 1. Otherwise, 0 <= nev[l] <= n.

  • W[out] pointer to real type. Array on the GPU (the size depends on the value of strideW). The first nev[l] elements contain the computed eigenvalues. (The remaining elements can be used as workspace for internal computations).

  • strideW[in] rocblas_stride. Stride from the start of one vector W_l to the next one W_(l+1). There is no restriction for the value of strideW. Normal use case is strideW >= n.

  • Z[out] pointer to type. Array on the GPU (the size depends on the value of strideZ). On exit, if evect is not rocblas_evect_none and info[l] = 0, the first nev[l] columns contain the eigenvectors of A_l corresponding to the output eigenvalues. Not referenced if evect is rocblas_evect_none.

  • ldz[in] rocblas_int. ldz >= n. Specifies the leading dimension of matrices Z_l.

  • strideZ[in] rocblas_stride. Stride from the start of one matrix Z_l to the next one Z_(l+1). There is no restriction for the value of strideZ. Normal use case is strideZ >= ldz*nev[l]. Note: If erange is rocblas_erange_value, then the values of nev[l] are not known in advance. The user should ensure that Z_l is large enough to hold n columns, as all n columns can be used as workspace for internal computations.

  • ifail[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideF). If info[l] = 0, the first nev[l] elements of ifail_l are zero. If info[l] = i <= n, ifail_l contains the indices of the i eigenvectors that failed to converge. Not referenced if evect is rocblas_evect_none.

  • strideF[in] rocblas_stride. Stride from the start of one vector ifail_l to the next one ifail_(l+1). There is no restriction for the value of strideF. Normal use case is strideF >= n.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit of batch instance l. If info[l] = i <= n, i columns of Z_l did not converge. If info[l] = n + i, the leading minor of order i of B_l is not positive definite.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
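The "normal use case" strides listed above can be collected into a small host-side helper when the batch is stored contiguously. This is a sketch under stated assumptions: `HegvxStridedLayout`, `packed_hegvx_layout` and `batch_elements` are hypothetical names, `std::int64_t` stands in for rocblas_stride, and strideZ is sized for n columns so that Z_l is large enough even when nev[l] is not known in advance.

```cpp
#include <cassert>
#include <cstdint>

// Strides (in elements) for a tightly packed strided-batched HEGVX call.
struct HegvxStridedLayout {
    std::int64_t strideA, strideB, strideW, strideZ, strideF;
};

HegvxStridedLayout packed_hegvx_layout(int n, int lda, int ldb, int ldz)
{
    assert(n >= 0 && lda >= n && ldb >= n && ldz >= n);
    HegvxStridedLayout s;
    s.strideA = static_cast<std::int64_t>(lda) * n;  // one lda-by-n matrix A_l
    s.strideB = static_cast<std::int64_t>(ldb) * n;  // one ldb-by-n matrix B_l
    s.strideW = n;                                   // up to n eigenvalues per instance
    s.strideZ = static_cast<std::int64_t>(ldz) * n;  // room for n columns of Z_l
    s.strideF = n;                                   // up to n entries of ifail_l
    return s;
}

// Number of elements to allocate for a buffer with the given stride.
std::int64_t batch_elements(std::int64_t stride, int batch_count)
{
    assert(stride >= 0 && batch_count >= 0);
    return stride * batch_count;
}
```

For example, with n = 3, ldz = 5 and 10 matrices in the batch, the Z buffer would need 5 * 3 * 10 = 150 elements. Larger strides are also valid if padding between instances is wanted.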

Singular value decomposition

rocsolver_<type>gesvd()

rocblas_status rocsolver_zgesvd(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *S, rocblas_double_complex *U, const rocblas_int ldu, rocblas_double_complex *V, const rocblas_int ldv, double *E, const rocblas_workmode fast_alg, rocblas_int *info)
rocblas_status rocsolver_cgesvd(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *S, rocblas_float_complex *U, const rocblas_int ldu, rocblas_float_complex *V, const rocblas_int ldv, float *E, const rocblas_workmode fast_alg, rocblas_int *info)
rocblas_status rocsolver_dgesvd(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *S, double *U, const rocblas_int ldu, double *V, const rocblas_int ldv, double *E, const rocblas_workmode fast_alg, rocblas_int *info)
rocblas_status rocsolver_sgesvd(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *S, float *U, const rocblas_int ldu, float *V, const rocblas_int ldv, float *E, const rocblas_workmode fast_alg, rocblas_int *info)

GESVD computes the singular values and optionally the singular vectors of a general m-by-n matrix A (Singular Value Decomposition).

The SVD of matrix A is given by:

\[ A = U S V' \]

where the m-by-n matrix S is zero except, possibly, for its min(m,n) diagonal elements, which are the singular values of A. U and V are orthogonal (unitary) matrices. The first min(m,n) columns of U and V are the left and right singular vectors of A, respectively.
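As an illustration of the shapes involved, take m = 3 and n = 2, so that min(m,n) = 2. The singular values are written here as \(\sigma_1\) and \(\sigma_2\), a notation introduced only for this example:

\[ \underbrace{A}_{3 \times 2} = \underbrace{U}_{3 \times 3} \; \underbrace{\begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \\ 0 & 0 \end{bmatrix}}_{S,\; 3 \times 2} \; \underbrace{V'}_{2 \times 2}, \qquad \sigma_1 \ge \sigma_2 \ge 0 \]

Only the first two columns of U multiply nonzero entries of S, which is why they alone are called the left singular vectors of A.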

The computation of the singular vectors is optional and it is controlled by the function arguments left_svect and right_svect as described below. When computed, this function returns the transpose (or transpose conjugate) of the right singular vectors, i.e. the rows of V’.

left_svect and right_svect are rocblas_svect enums that can take the following values:

  • rocblas_svect_all: the entire matrix U (or V’) is computed,

  • rocblas_svect_singular: only the singular vectors (first min(m,n) columns of U or rows of V’) are computed,

  • rocblas_svect_overwrite: the first columns (or rows) of A are overwritten with the singular vectors, or

  • rocblas_svect_none: no columns (or rows) of U (or V’) are computed, i.e. no singular vectors.

left_svect and right_svect cannot both be set to overwrite. When neither is set to overwrite, the contents of A are destroyed by the time the function returns.
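Because the routine returns V’ rather than V, the k-th right singular vector is found along a row of the output buffer, not a column. The sketch below reads one such row from a host copy of the buffer. It assumes real data, column-major storage with leading dimension ldv (the LAPACK convention), and a 0-based index k; `right_singular_vector` is a hypothetical helper. For complex data the entries of the row would additionally need to be conjugated to recover the vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Extracts row k (0-based) of the matrix V' stored column-major in V_host
// with leading dimension ldv and n columns. For real data this row is the
// k-th right singular vector.
std::vector<double> right_singular_vector(const std::vector<double>& V_host,
                                          int ldv, int n, int k)
{
    assert(n >= 0 && k >= 0 && k < ldv);
    assert(V_host.size() >= static_cast<std::size_t>(ldv) * n);
    std::vector<double> v(static_cast<std::size_t>(n));
    for (int j = 0; j < n; ++j)
        v[j] = V_host[k + static_cast<std::size_t>(j) * ldv];  // element (k, j) of V'
    return v;
}
```

Consecutive entries of the vector are ldv elements apart in memory, so gathering several vectors at once may be better done with a transposed copy.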

Note

When m >> n (or n >> m) the algorithm could be sped up by compressing the matrix A via a QR (or LQ) factorization, and working with the triangular factor afterwards (thin-SVD). If the singular vectors are also requested, their computation could be sped up as well by executing some intermediate operations out-of-place and relying more on matrix multiplications (GEMMs); this will require, however, a larger memory workspace. The parameter fast_alg controls whether the fast algorithm is executed or not. For more details, see the “Tuning rocSOLVER performance” and “Memory model” sections of the documentation.

Parameters:
  • handle[in] rocblas_handle.

  • left_svect[in] rocblas_svect. Specifies how the left singular vectors are computed.

  • right_svect[in] rocblas_svect. Specifies how the right singular vectors are computed.

  • m[in] rocblas_int. m >= 0. The number of rows of matrix A.

  • n[in] rocblas_int. n >= 0. The number of columns of matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A. On exit, if left_svect (or right_svect) is equal to overwrite, the first columns (or rows) contain the left (or right) singular vectors; otherwise, the contents of A are destroyed.

  • lda[in] rocblas_int. lda >= m. The leading dimension of A.

  • S[out] pointer to real type. Array on the GPU of dimension min(m,n). The singular values of A in decreasing order.

  • U[out] pointer to type. Array on the GPU of dimension ldu*min(m,n) if left_svect is set to singular, or ldu*m when left_svect is equal to all. The matrix of left singular vectors stored as columns. Not referenced if left_svect is set to overwrite or none.

  • ldu[in] rocblas_int. ldu >= m if left_svect is all or singular; ldu >= 1 otherwise. The leading dimension of U.

  • V[out] pointer to type. Array on the GPU of dimension ldv*n. The matrix of right singular vectors stored as rows (transposed / conjugate-transposed). Not referenced if right_svect is set to overwrite or none.

  • ldv[in] rocblas_int. ldv >= n if right_svect is all; ldv >= min(m,n) if right_svect is set to singular; or ldv >= 1 otherwise. The leading dimension of V.

  • E[out] pointer to real type. Array on the GPU of dimension min(m,n)-1. This array is used to work internally with the bidiagonal matrix B associated with A (using BDSQR). On exit, if info > 0, it contains the unconverged off-diagonal elements of B (or properly speaking, a bidiagonal matrix orthogonally equivalent to B). The diagonal elements of this matrix are in S; those that converged correspond to a subset of the singular values of A (not necessarily ordered).

  • fast_alg[in] rocblas_workmode. If set to rocblas_outofplace, the function will execute the fast thin-SVD version of the algorithm when possible.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, BDSQR did not converge. i elements of E did not converge to zero.
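The size rules for S, U, V and E given above can be gathered into one host-side helper. This is a minimal sketch: `SVect`, `GesvdSizes` and `gesvd_sizes` are hypothetical names, the leading dimensions returned are the smallest ones the parameter list allows, and they are clamped to 1 as a choice of this sketch so that they stay usable when m or n is 0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for rocblas_svect so the sketch builds without rocBLAS headers.
enum class SVect { All, Singular, Overwrite, None };

struct GesvdSizes {
    int min_ldu;          // smallest leading dimension accepted for U
    int min_ldv;          // smallest leading dimension accepted for V
    std::size_t u_elems;  // elements to allocate for U (0 if U is not referenced)
    std::size_t v_elems;  // elements to allocate for V (0 if V is not referenced)
    std::size_t s_elems;  // min(m,n) singular values
    std::size_t e_elems;  // min(m,n)-1 off-diagonal elements, clamped at 0
};

GesvdSizes gesvd_sizes(int m, int n, SVect left, SVect right)
{
    assert(m >= 0 && n >= 0);
    const int k = std::min(m, n);
    const bool u_used = (left == SVect::All || left == SVect::Singular);
    const bool v_used = (right == SVect::All || right == SVect::Singular);

    GesvdSizes r{};
    r.min_ldu = u_used ? std::max(m, 1) : 1;
    r.min_ldv = (right == SVect::All) ? std::max(n, 1)
              : (right == SVect::Singular) ? std::max(k, 1) : 1;
    // U is ldu*m when all vectors are requested, ldu*min(m,n) for singular only.
    r.u_elems = u_used ? static_cast<std::size_t>(r.min_ldu) * (left == SVect::All ? m : k) : 0;
    // V is documented as ldv*n whenever it is referenced.
    r.v_elems = v_used ? static_cast<std::size_t>(r.min_ldv) * n : 0;
    r.s_elems = static_cast<std::size_t>(k);
    r.e_elems = static_cast<std::size_t>(k > 0 ? k - 1 : 0);
    return r;
}
```

For a 5-by-3 matrix with left_svect set to singular and right_svect set to all, this gives 15 elements for U, 9 for V, 3 for S and 2 for E.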

rocsolver_<type>gesvd_batched()

rocblas_status rocsolver_zgesvd_batched(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *S, const rocblas_stride strideS, rocblas_double_complex *U, const rocblas_int ldu, const rocblas_stride strideU, rocblas_double_complex *V, const rocblas_int ldv, const rocblas_stride strideV, double *E, const rocblas_stride strideE, const rocblas_workmode fast_alg, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_cgesvd_batched(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *S, const rocblas_stride strideS, rocblas_float_complex *U, const rocblas_int ldu, const rocblas_stride strideU, rocblas_float_complex *V, const rocblas_int ldv, const rocblas_stride strideV, float *E, const rocblas_stride strideE, const rocblas_workmode fast_alg, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_dgesvd_batched(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *S, const rocblas_stride strideS, double *U, const rocblas_int ldu, const rocblas_stride strideU, double *V, const rocblas_int ldv, const rocblas_stride strideV, double *E, const rocblas_stride strideE, const rocblas_workmode fast_alg, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_sgesvd_batched(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *S, const rocblas_stride strideS, float *U, const rocblas_int ldu, const rocblas_stride strideU, float *V, const rocblas_int ldv, const rocblas_stride strideV, float *E, const rocblas_stride strideE, const rocblas_workmode fast_alg, rocblas_int *info, const rocblas_int batch_count)

GESVD_BATCHED computes the singular values and optionally the singular vectors of a batch of general m-by-n matrices A_l (Singular Value Decomposition).

The SVD of matrix A_l in the batch is given by:

\[ A_l^{} = U_l^{} S_l^{} V_l' \]

where the m-by-n matrix \(S_l\) is zero except, possibly, for its min(m,n) diagonal elements, which are the singular values of \(A_l\). \(U_l\) and \(V_l\) are orthogonal (unitary) matrices. The first min(m,n) columns of \(U_l\) and \(V_l\) are the left and right singular vectors of \(A_l\), respectively.

The computation of the singular vectors is optional and it is controlled by the function arguments left_svect and right_svect as described below. When computed, this function returns the transpose (or transpose conjugate) of the right singular vectors, i.e. the rows of \(V_l'\).

left_svect and right_svect are rocblas_svect enums that can take the following values:

  • rocblas_svect_all: the entire matrix \(U_l\) (or \(V_l'\)) is computed,

  • rocblas_svect_singular: only the singular vectors (first min(m,n) columns of \(U_l\) or rows of \(V_l'\)) are computed,

  • rocblas_svect_overwrite: the first columns (or rows) of \(A_l\) are overwritten with the singular vectors, or

  • rocblas_svect_none: no columns (or rows) of \(U_l\) (or \(V_l'\)) are computed, i.e. no singular vectors.

left_svect and right_svect cannot both be set to overwrite. When neither is set to overwrite, the contents of \(A_l\) are destroyed by the time the function returns.

Note

When m >> n (or n >> m) the algorithm could be sped up by compressing the matrix \(A_l\) via a QR (or LQ) factorization, and working with the triangular factor afterwards (thin-SVD). If the singular vectors are also requested, their computation could be sped up as well by executing some intermediate operations out-of-place and relying more on matrix multiplications (GEMMs); this will require, however, a larger memory workspace. The parameter fast_alg controls whether the fast algorithm is executed or not. For more details, see the “Tuning rocSOLVER performance” and “Memory model” sections of the documentation.

Parameters:
  • handle[in] rocblas_handle.

  • left_svect[in] rocblas_svect. Specifies how the left singular vectors are computed.

  • right_svect[in] rocblas_svect. Specifies how the right singular vectors are computed.

  • m[in] rocblas_int. m >= 0. The number of rows of all matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all matrices A_l in the batch.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l. On exit, if left_svect (or right_svect) is equal to overwrite, the first columns (or rows) of A_l contain the left (or right) corresponding singular vectors; otherwise, the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= m. The leading dimension of A_l.

  • S[out] pointer to real type. Array on the GPU (the size depends on the value of strideS). The singular values of A_l in decreasing order.

  • strideS[in] rocblas_stride. Stride from the start of one vector S_l to the next one S_(l+1). There is no restriction for the value of strideS. Normal use case is strideS >= min(m,n).

  • U[out] pointer to type. Array on the GPU (the size depends on the value of strideU). The matrices U_l of left singular vectors stored as columns. Not referenced if left_svect is set to overwrite or none.

  • ldu[in] rocblas_int. ldu >= m if left_svect is all or singular; ldu >= 1 otherwise. The leading dimension of U_l.

  • strideU[in] rocblas_stride. Stride from the start of one matrix U_l to the next one U_(l+1). There is no restriction for the value of strideU. Normal use case is strideU >= ldu*min(m,n) if left_svect is set to singular, or strideU >= ldu*m when left_svect is equal to all.

  • V[out] pointer to type. Array on the GPU (the size depends on the value of strideV). The matrices V_l of right singular vectors stored as rows (transposed / conjugate-transposed). Not referenced if right_svect is set to overwrite or none.

  • ldv[in] rocblas_int. ldv >= n if right_svect is all; ldv >= min(m,n) if right_svect is set to singular; or ldv >= 1 otherwise. The leading dimension of V_l.

  • strideV[in] rocblas_stride. Stride from the start of one matrix V_l to the next one V_(l+1). There is no restriction for the value of strideV. Normal use case is strideV >= ldv*n.

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the bidiagonal matrix B_l associated with A_l (using BDSQR). On exit, if info[l] > 0, E_l contains the unconverged off-diagonal elements of B_l (or properly speaking, a bidiagonal matrix orthogonally equivalent to B_l). The diagonal elements of this matrix are in S_l; those that converged correspond to a subset of the singular values of A_l (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)-1.

  • fast_alg[in] rocblas_workmode. If set to rocblas_outofplace, the function will execute the fast thin-SVD version of the algorithm when possible.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for matrix A_l. If info[l] = i > 0, BDSQR did not converge. i elements of E_l did not converge to zero.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
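Although A is passed as an array of pointers, the outputs S, U, V and E are single buffers addressed through strides. The sketch below shows how the singular values of one batch instance could be located in a host copy of S. It assumes double precision, a 0-based batch index l, and `std::int64_t` as a stand-in for rocblas_stride; `singular_values_of` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the min(m,n) singular values of batch instance l (0-based)
// from a host copy of the strided output buffer S.
std::vector<double> singular_values_of(const std::vector<double>& S_host,
                                       std::int64_t strideS,
                                       int m, int n, int l)
{
    const int k = std::min(m, n);
    const std::int64_t first = static_cast<std::int64_t>(l) * strideS;  // start of S_l
    assert(l >= 0 && first >= 0);
    assert(first + k <= static_cast<std::int64_t>(S_host.size()));
    return std::vector<double>(S_host.begin() + first, S_host.begin() + first + k);
}
```

The same offset arithmetic applies to U_l, V_l and E_l with strideU, strideV and strideE, respectively.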

rocsolver_<type>gesvd_strided_batched()

rocblas_status rocsolver_zgesvd_strided_batched(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *S, const rocblas_stride strideS, rocblas_double_complex *U, const rocblas_int ldu, const rocblas_stride strideU, rocblas_double_complex *V, const rocblas_int ldv, const rocblas_stride strideV, double *E, const rocblas_stride strideE, const rocblas_workmode fast_alg, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_cgesvd_strided_batched(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *S, const rocblas_stride strideS, rocblas_float_complex *U, const rocblas_int ldu, const rocblas_stride strideU, rocblas_float_complex *V, const rocblas_int ldv, const rocblas_stride strideV, float *E, const rocblas_stride strideE, const rocblas_workmode fast_alg, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_dgesvd_strided_batched(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *S, const rocblas_stride strideS, double *U, const rocblas_int ldu, const rocblas_stride strideU, double *V, const rocblas_int ldv, const rocblas_stride strideV, double *E, const rocblas_stride strideE, const rocblas_workmode fast_alg, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_sgesvd_strided_batched(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *S, const rocblas_stride strideS, float *U, const rocblas_int ldu, const rocblas_stride strideU, float *V, const rocblas_int ldv, const rocblas_stride strideV, float *E, const rocblas_stride strideE, const rocblas_workmode fast_alg, rocblas_int *info, const rocblas_int batch_count)

GESVD_STRIDED_BATCHED computes the singular values and optionally the singular vectors of a batch of general m-by-n matrices A_l (Singular Value Decomposition).

The SVD of matrix A_l in the batch is given by:

\[ A_l^{} = U_l^{} S_l^{} V_l' \]

where the m-by-n matrix \(S_l\) is zero except, possibly, for its min(m,n) diagonal elements, which are the singular values of \(A_l\). \(U_l\) and \(V_l\) are orthogonal (unitary) matrices. The first min(m,n) columns of \(U_l\) and \(V_l\) are the left and right singular vectors of \(A_l\), respectively.

The computation of the singular vectors is optional and it is controlled by the function arguments left_svect and right_svect as described below. When computed, this function returns the transpose (or transpose conjugate) of the right singular vectors, i.e. the rows of \(V_l'\).

left_svect and right_svect are rocblas_svect enums that can take the following values:

  • rocblas_svect_all: the entire matrix \(U_l\) (or \(V_l'\)) is computed,

  • rocblas_svect_singular: only the singular vectors (first min(m,n) columns of \(U_l\) or rows of \(V_l'\)) are computed,

  • rocblas_svect_overwrite: the first columns (or rows) of \(A_l\) are overwritten with the singular vectors, or

  • rocblas_svect_none: no columns (or rows) of \(U_l\) (or \(V_l'\)) are computed, i.e. no singular vectors.

left_svect and right_svect cannot both be set to overwrite. When neither is set to overwrite, the contents of \(A_l\) are destroyed by the time the function returns.

Note

When m >> n (or n >> m) the algorithm could be sped up by compressing the matrix \(A_l\) via a QR (or LQ) factorization, and working with the triangular factor afterwards (thin-SVD). If the singular vectors are also requested, their computation could be sped up as well by executing some intermediate operations out-of-place and relying more on matrix multiplications (GEMMs); this will require, however, a larger memory workspace. The parameter fast_alg controls whether the fast algorithm is executed or not. For more details, see the “Tuning rocSOLVER performance” and “Memory model” sections of the documentation.

Parameters:
  • handle[in] rocblas_handle.

  • left_svect[in] rocblas_svect. Specifies how the left singular vectors are computed.

  • right_svect[in] rocblas_svect. Specifies how the right singular vectors are computed.

  • m[in] rocblas_int. m >= 0. The number of rows of all matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l. On exit, if left_svect (or right_svect) is equal to overwrite, the first columns (or rows) of A_l contain the left (or right) corresponding singular vectors; otherwise, the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= m. The leading dimension of A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • S[out] pointer to real type. Array on the GPU (the size depends on the value of strideS). The singular values of A_l in decreasing order.

  • strideS[in] rocblas_stride. Stride from the start of one vector S_l to the next one S_(l+1). There is no restriction for the value of strideS. Normal use case is strideS >= min(m,n).

  • U[out] pointer to type. Array on the GPU (the size depends on the value of strideU). The matrices U_l of left singular vectors stored as columns. Not referenced if left_svect is set to overwrite or none.

  • ldu[in] rocblas_int. ldu >= m if left_svect is all or singular; ldu >= 1 otherwise. The leading dimension of U_l.

  • strideU[in] rocblas_stride. Stride from the start of one matrix U_l to the next one U_(l+1). There is no restriction for the value of strideU. Normal use case is strideU >= ldu*min(m,n) if left_svect is set to singular, or strideU >= ldu*m when left_svect is equal to all.

  • V[out] pointer to type. Array on the GPU (the size depends on the value of strideV). The matrices V_l of right singular vectors stored as rows (transposed / conjugate-transposed). Not referenced if right_svect is set to overwrite or none.

  • ldv[in] rocblas_int. ldv >= n if right_svect is all; ldv >= min(m,n) if right_svect is set to singular; or ldv >= 1 otherwise. The leading dimension of V_l.

  • strideV[in] rocblas_stride. Stride from the start of one matrix V_l to the next one V_(l+1). There is no restriction for the value of strideV. Normal use case is strideV >= ldv*n.

  • E[out] pointer to real type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the bidiagonal matrix B_l associated with A_l (using BDSQR). On exit, if info[l] > 0, E_l contains the unconverged off-diagonal elements of B_l (or properly speaking, a bidiagonal matrix orthogonally equivalent to B_l). The diagonal elements of this matrix are in S_l; those that converged correspond to a subset of the singular values of A_l (not necessarily ordered).

  • strideE[in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)-1.

  • fast_alg[in] rocblas_workmode. If set to rocblas_outofplace, the function will execute the fast thin-SVD version of the algorithm when possible.

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for matrix A_l. If info[l] = i > 0, BDSQR did not converge: i elements of E_l did not converge to zero.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
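
The "normal use case" strides listed above can be collected in a small host-side helper. The following is a minimal sketch, not part of rocSOLVER: the enum `svect_mode`, the struct `gesvd_strides` and the function `gesvd_min_strides` are hypothetical names, `int` stands in for rocblas_int and `int64_t` for rocblas_stride. Reporting a stride of 0 for U when it is not referenced is an assumption of this sketch, based only on the statement that strideU is unrestricted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the rocblas_svect enum (not the real definition). */
typedef enum { SVECT_ALL, SVECT_SINGULAR, SVECT_OVERWRITE, SVECT_NONE } svect_mode;

/* Strides, in elements, between consecutive batch entries. */
typedef struct {
    int64_t strideA, strideS, strideU, strideV, strideE;
} gesvd_strides;

static int64_t min_i64(int64_t a, int64_t b) { return a < b ? a : b; }

/* Smallest strides matching the "normal use case" bounds in the parameter list. */
gesvd_strides gesvd_min_strides(svect_mode left, int m, int n,
                                int lda, int ldu, int ldv)
{
    gesvd_strides s;
    int64_t k = min_i64(m, n);

    assert(m >= 0 && n >= 0 && lda >= m && ldu >= 1 && ldv >= 1);

    s.strideA = (int64_t)lda * n;              /* strideA >= lda*n */
    s.strideS = k;                             /* strideS >= min(m,n) */
    s.strideU = left == SVECT_ALL      ? (int64_t)ldu * m   /* all m columns */
              : left == SVECT_SINGULAR ? (int64_t)ldu * k   /* min(m,n) columns */
              : 0;                             /* U not referenced (assumption) */
    s.strideV = (int64_t)ldv * n;              /* strideV >= ldv*n */
    s.strideE = k > 0 ? k - 1 : 0;             /* strideE >= min(m,n)-1 */
    return s;
}
```

For instance, with m = 6, n = 4, lda = ldu = 6, ldv = 4 and the left singular vectors set to singular, the sketch gives strideA = 24, strideS = 4, strideU = 24, strideV = 16 and strideE = 3. Multiplying each stride by batch_count gives the number of elements to allocate for the corresponding array.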

rocsolver_<type>gesvdx()#

rocblas_status rocsolver_zgesvdx(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_srange srange, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, rocblas_int *nsv, double *S, rocblas_double_complex *U, const rocblas_int ldu, rocblas_double_complex *V, const rocblas_int ldv, rocblas_int *ifail, rocblas_int *info)#
rocblas_status rocsolver_cgesvdx(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_srange srange, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, rocblas_int *nsv, float *S, rocblas_float_complex *U, const rocblas_int ldu, rocblas_float_complex *V, const rocblas_int ldv, rocblas_int *ifail, rocblas_int *info)#
rocblas_status rocsolver_dgesvdx(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_srange srange, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, rocblas_int *nsv, double *S, double *U, const rocblas_int ldu, double *V, const rocblas_int ldv, rocblas_int *ifail, rocblas_int *info)#
rocblas_status rocsolver_sgesvdx(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_srange srange, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, rocblas_int *nsv, float *S, float *U, const rocblas_int ldu, float *V, const rocblas_int ldv, rocblas_int *ifail, rocblas_int *info)#

GESVDX computes a set of singular values and optionally the corresponding singular vectors of a general m-by-n matrix A (partial Singular Value Decomposition).

This function computes all the singular values of A, all the singular values in the half-open interval \([vl, vu)\), or the il-th through iu-th singular values, depending on the value of srange.
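
The three selection modes can be illustrated on the host with an already known, descending list of singular values. The sketch below only mimics the selection rule described above; it does not compute an SVD and is not how rocSOLVER works internally. The enum `srange_mode` and the function `select_singular_values` are hypothetical names introduced for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for the rocblas_srange enum (not the real definition). */
typedef enum { SRANGE_ALL, SRANGE_VALUE, SRANGE_INDEX } srange_mode;

/*
 * sv holds k singular values in descending order. The selected values are
 * copied to out (which must have room for k values) and their count, i.e.
 * the value GESVDX would report in nsv, is returned.
 */
int select_singular_values(srange_mode srange, const double *sv, int k,
                           double vl, double vu, int il, int iu, double *out)
{
    int nsv = 0;
    int i;

    assert(k >= 0);
    for (i = 0; i < k; ++i) {
        int idx = i + 1; /* indices are 1-based in this documentation */
        int keep = srange == SRANGE_ALL
                || (srange == SRANGE_VALUE && sv[i] >= vl && sv[i] < vu)
                || (srange == SRANGE_INDEX && idx >= il && idx <= iu);
        if (keep)
            out[nsv++] = sv[i];
    }
    return nsv;
}
```

With singular values {5, 3, 1}, the value range [2, 6) selects {5, 3}, the value range [1, 5) selects {3, 1} because the upper bound is excluded, and the index range il = 2, iu = 3 also selects {3, 1}.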

The full SVD of matrix A is given by:

\[ A = U S V' \]

where the m-by-n matrix S is zero except, possibly, for its min(m,n) diagonal elements, which are the singular values of A. U and V are orthogonal (unitary) matrices. The first min(m,n) columns of U and V are the left and right singular vectors of A, respectively.

The computation of the singular vectors is optional and is controlled by the function arguments left_svect and right_svect as described below. When computed, this function returns the transpose (or conjugate transpose) of the right singular vectors, i.e. the rows of V’.

left_svect and right_svect are rocblas_svect enums that, for this function, can take the following values:

  • rocblas_svect_singular: the singular vectors (first min(m,n) columns of U or rows of V’) corresponding to the computed singular values are computed,

  • rocblas_svect_none: no columns (or rows) of U (or V’) are computed, i.e. no singular vectors.
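
In terms of the returned quantities, and assuming both sets of singular vectors were requested and info = 0, each computed triplet should satisfy the usual singular value relations. Here \(\sigma_j\) denotes S[j], \(u_j\) the j-th column of U, and \(v_j\) the j-th right singular vector, whose transpose (or conjugate transpose) is the j-th row of V’; these symbols are introduced only for this illustration:

\[ A v_j = \sigma_j u_j, \qquad A' u_j = \sigma_j v_j, \qquad j = 1, \dots, nsv \]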

Parameters:
  • handle[in] rocblas_handle.

  • left_svect[in] rocblas_svect. Specifies if the left singular vectors are computed.

  • right_svect[in] rocblas_svect. Specifies if the right singular vectors are computed.

  • srange[in] rocblas_srange. Specifies the type of range or interval of the singular values to be computed.

  • m[in] rocblas_int. m >= 0. The number of rows of matrix A.

  • n[in] rocblas_int. n >= 0. The number of columns of matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A. On exit, the contents of A are destroyed.

  • lda[in] rocblas_int. lda >= m. The leading dimension of A.

  • vl[in] real type. 0 <= vl < vu. The lower bound of the search interval [vl, vu). Ignored if srange indicates to look for all the singular values of A or the singular values within a set of indices.

  • vu[in] real type. 0 <= vl < vu. The upper bound of the search interval [vl, vu). Ignored if srange indicates to look for all the singular values of A or the singular values within a set of indices.

  • il[in] rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise. The index of the largest singular value to be computed. Ignored if srange indicates to look for all the singular values of A or the singular values in a half-open interval.

  • iu[in] rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise. The index of the smallest singular value to be computed. Ignored if srange indicates to look for all the singular values of A or the singular values in a half-open interval.

  • nsv[out] pointer to a rocblas_int on the GPU. The total number of singular values found. If srange is rocblas_srange_all, nsv = min(m,n). If srange is rocblas_srange_index, nsv = iu - il + 1. Otherwise, 0 <= nsv <= min(m,n).

  • S[out] pointer to real type. Array on the GPU of dimension nsv. The first nsv elements contain the computed singular values in descending order. Note: If srange is rocblas_srange_value, then the value of nsv is not known in advance. In this case, the user should ensure that S is large enough to hold min(m,n) values.

  • U[out] pointer to type. Array on the GPU of dimension ldu*nsv. The matrix of left singular vectors stored as columns. Not referenced if left_svect is set to none. Note: If srange is rocblas_srange_value, then the value of nsv is not known in advance. In this case, the user should ensure that U is large enough to hold min(m,n) columns.

  • ldu[in] rocblas_int. ldu >= m if left_svect is set to singular; ldu >= 1 otherwise. The leading dimension of U.

  • V[out] pointer to type. Array on the GPU of dimension ldv*n. The matrix of right singular vectors stored as rows (transposed / conjugate-transposed). Not referenced if right_svect is set to none.

  • ldv[in] rocblas_int. ldv >= nsv if right_svect is set to singular; or ldv >= 1 otherwise. The leading dimension of V. Note: If srange is rocblas_srange_value, then the value of nsv is not known in advance. In this case, the user should ensure that V is large enough to hold min(m,n) rows.

  • ifail[out] pointer to rocblas_int. Array on the GPU of dimension min(m,n). If info = 0, the first nsv elements of ifail are zero. Otherwise, ifail contains the indices of those eigenvectors that failed to converge, as returned by BDSVDX.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, i eigenvectors did not converge in BDSVDX; their indices are stored in ifail.
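
Because nsv is only known after the call when srange is rocblas_srange_value, output buffers are usually sized from an upper bound on nsv. The following host-side sketch derives element counts from the dimensions given in the parameter list, assuming both left and right singular vectors are requested and the smallest valid leading dimensions are used. `srange_mode`, `gesvdx_sizes`, `gesvdx_max_nsv` and `gesvdx_buffer_sizes` are hypothetical names, not rocSOLVER symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the rocblas_srange enum (not the real definition). */
typedef enum { SRANGE_ALL, SRANGE_VALUE, SRANGE_INDEX } srange_mode;

/* Element counts for the output arrays, plus the leading dimensions used. */
typedef struct {
    size_t S, U, V, ifail;
    int ldu, ldv;
} gesvdx_sizes;

/* Largest value nsv can take, known before the call. */
int gesvdx_max_nsv(srange_mode srange, int m, int n, int il, int iu)
{
    assert(m >= 0 && n >= 0);
    if (srange == SRANGE_INDEX)
        return iu - il + 1;      /* exact count for an index range */
    return m < n ? m : n;        /* all values, or worst case for a value range */
}

gesvdx_sizes gesvdx_buffer_sizes(srange_mode srange, int m, int n, int il, int iu)
{
    gesvdx_sizes sz;
    int k = m < n ? m : n;
    int nsv_max = gesvdx_max_nsv(srange, m, n, il, iu);

    sz.ldu = m > 1 ? m : 1;                 /* ldu >= m, and at least 1 */
    sz.ldv = nsv_max > 1 ? nsv_max : 1;     /* ldv >= nsv, and at least 1 */
    sz.S = (size_t)nsv_max;                 /* nsv singular values */
    sz.U = (size_t)sz.ldu * (size_t)nsv_max; /* ldu*nsv */
    sz.V = (size_t)sz.ldv * (size_t)n;      /* ldv*n */
    sz.ifail = (size_t)k;                   /* min(m,n) */
    return sz;
}
```

For a 100-by-20 matrix with a value range, the sketch reserves 20 entries for S, 2000 for U (ldu = 100), 400 for V (ldv = 20) and 20 for ifail. With the index range il = 1, iu = 5 the same matrix needs 5, 500, 100 and 20 entries, respectively.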

rocsolver_<type>gesvdx_batched()#

rocblas_status rocsolver_zgesvdx_batched(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_srange srange, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, rocblas_int *nsv, double *S, const rocblas_stride strideS, rocblas_double_complex *U, const rocblas_int ldu, const rocblas_stride strideU, rocblas_double_complex *V, const rocblas_int ldv, const rocblas_stride strideV, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgesvdx_batched(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_srange srange, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, rocblas_int *nsv, float *S, const rocblas_stride strideS, rocblas_float_complex *U, const rocblas_int ldu, const rocblas_stride strideU, rocblas_float_complex *V, const rocblas_int ldv, const rocblas_stride strideV, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgesvdx_batched(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_srange srange, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, rocblas_int *nsv, double *S, const rocblas_stride strideS, double *U, const rocblas_int ldu, const rocblas_stride strideU, double *V, const rocblas_int ldv, const rocblas_stride strideV, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgesvdx_batched(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_srange srange, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, rocblas_int *nsv, float *S, const rocblas_stride strideS, float *U, const rocblas_int ldu, const rocblas_stride strideU, float *V, const rocblas_int ldv, const rocblas_stride strideV, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#

GESVDX_BATCHED computes a set of singular values and optionally the corresponding singular vectors of a batch of general m-by-n matrices \(A_l\) (partial Singular Value Decomposition).

This function computes all the singular values of \(A_l\), all the singular values in the half-open interval \([vl, vu)\), or the il-th through iu-th singular values, depending on the value of srange.

The full SVD of matrix \(A_l\) is given by:

\[ A_l = U_l S_l V_l' \]

where the m-by-n matrix \(S_l\) is zero except, possibly, for its min(m,n) diagonal elements, which are the singular values of \(A_l\) . \(U_l\) and \(V_l\) are orthogonal (unitary) matrices. The first min(m,n) columns of \(U_l\) and \(V_l\) are the left and right singular vectors of \(A_l\) , respectively.

The computation of the singular vectors is optional and is controlled by the function arguments left_svect and right_svect as described below. When computed, this function returns the transpose (or conjugate transpose) of the right singular vectors, i.e. the rows of \(V_l'\).

left_svect and right_svect are rocblas_svect enums that, for this function, can take the following values:

  • rocblas_svect_singular: the singular vectors (first min(m,n) columns of \(U_l\) or rows of \(V_l'\) ) corresponding to the computed singular values are computed,

  • rocblas_svect_none: no columns (or rows) of \(U_l\) (or \(V_l'\) ) are computed, i.e. no singular vectors.

Parameters:
  • handle[in] rocblas_handle.

  • left_svect[in] rocblas_svect. Specifies if the left singular vectors are computed.

  • right_svect[in] rocblas_svect. Specifies if the right singular vectors are computed.

  • srange[in] rocblas_srange. Specifies the type of range or interval of the singular values to be computed.

  • m[in] rocblas_int. m >= 0. The number of rows of all matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all matrices A_l in the batch.

  • A[inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l. On exit, the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= m. The leading dimension of A_l.

  • vl[in] real type. 0 <= vl < vu. The lower bound of the search interval [vl, vu). Ignored if srange indicates to look for all the singular values of A_l or the singular values within a set of indices.

  • vu[in] real type. 0 <= vl < vu. The upper bound of the search interval [vl, vu). Ignored if srange indicates to look for all the singular values of A_l or the singular values within a set of indices.

  • il[in] rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise. The index of the largest singular value to be computed. Ignored if srange indicates to look for all the singular values of A_l or the singular values in a half-open interval.

  • iu[in] rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise. The index of the smallest singular value to be computed. Ignored if srange indicates to look for all the singular values of A_l or the singular values in a half-open interval.

  • nsv[out] pointer to rocblas_int. Array of batch_count integers on the GPU. The total number of singular values found. If srange is rocblas_srange_all, nsv[l] = min(m,n). If srange is rocblas_srange_index, nsv[l] = iu - il + 1. Otherwise, 0 <= nsv[l] <= min(m,n).

  • S[out] pointer to real type. Array on the GPU (the size depends on the value of strideS). The first nsv_l elements contain the computed singular values in descending order. (The remaining elements may be used as workspace for internal computations).

  • strideS[in] rocblas_stride. Stride from the start of one vector S_l to the next one S_(l+1). There is no restriction for the value of strideS. Normal use case is strideS >= nsv_l. Note: If srange is rocblas_srange_value, then the value of nsv_l is not known in advance. In this case, the user should ensure that S_l is large enough to hold min(m,n) values.

  • U[out] pointer to type. Array on the GPU (the size depends on the value of strideU). The matrix U_l of left singular vectors stored as columns. Not referenced if left_svect is set to none.

  • ldu[in] rocblas_int. ldu >= m if left_svect is set to singular; ldu >= 1 otherwise. The leading dimension of U_l.

  • strideU[in] rocblas_stride. Stride from the start of one matrix U_l to the next one U_(l+1). There is no restriction for the value of strideU. Normal use case is strideU >= ldu*nsv_l. Note: If srange is rocblas_srange_value, then the value of nsv_l is not known in advance. In this case, the user should ensure that U_l is large enough to hold min(m,n) columns.

  • V[out] pointer to type. Array on the GPU (the size depends on the value of strideV). The matrix V_l of right singular vectors stored as rows (transposed / conjugate-transposed). Not referenced if right_svect is set to none.

  • ldv[in] rocblas_int. ldv >= nsv_l if right_svect is set to singular; or ldv >= 1 otherwise. The leading dimension of V_l. Note: If srange is rocblas_srange_value, then the value of nsv_l is not known in advance. In this case, the user should ensure that V_l is large enough to hold min(m,n) rows.

  • strideV[in] rocblas_stride. Stride from the start of one matrix V_l to the next one V_(l+1). There is no restriction for the value of strideV. Normal use case is strideV >= ldv*n.

  • ifail[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideF). If info[l] = 0, the first nsv[l] elements of ifail_l are zero. Otherwise, ifail_l contains the indices of those eigenvectors that failed to converge, as returned by BDSVDX.

  • strideF[in] rocblas_stride. Stride from the start of one vector ifail_l to the next one ifail_(l+1). There is no restriction for the value of strideF. Normal use case is strideF >= min(m,n).

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for matrix A_l. If info[l] = i > 0, i eigenvectors did not converge in BDSVDX; their indices are stored in ifail_l.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
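
In the batched variant, A is an array of pointers, one per matrix. A common way to build it is to carve equally sized blocks of lda*n elements out of one contiguous allocation. The sketch below shows only that pointer arithmetic using ordinary host memory; in a real program the matrices reside on the GPU and the pointer array would typically be copied there as well, which is not shown. `fill_batch_pointers` is a hypothetical helper, not a rocSOLVER function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fill A[0..batch_count-1] so that A[l] points to the l-th column-major
 * block of lda*n elements inside the contiguous buffer starting at base.
 */
void fill_batch_pointers(double *base, int lda, int n, int batch_count, double **A)
{
    size_t block = (size_t)lda * (size_t)n; /* elements per matrix */
    int l;

    assert(base != NULL && A != NULL);
    assert(lda >= 1 && n >= 0 && batch_count >= 0);

    for (l = 0; l < batch_count; ++l)
        A[l] = base + (size_t)l * block;
}
```

With lda = 4, n = 2 and batch_count = 3, the buffer must hold 24 elements and consecutive pointers are 8 elements apart. Nothing requires the matrices to be contiguous; any set of valid, non-overlapping blocks could be used instead.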

rocsolver_<type>gesvdx_strided_batched()#

rocblas_status rocsolver_zgesvdx_strided_batched(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_srange srange, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, rocblas_int *nsv, double *S, const rocblas_stride strideS, rocblas_double_complex *U, const rocblas_int ldu, const rocblas_stride strideU, rocblas_double_complex *V, const rocblas_int ldv, const rocblas_stride strideV, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgesvdx_strided_batched(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_srange srange, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, rocblas_int *nsv, float *S, const rocblas_stride strideS, rocblas_float_complex *U, const rocblas_int ldu, const rocblas_stride strideU, rocblas_float_complex *V, const rocblas_int ldv, const rocblas_stride strideV, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgesvdx_strided_batched(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_srange srange, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, rocblas_int *nsv, double *S, const rocblas_stride strideS, double *U, const rocblas_int ldu, const rocblas_stride strideU, double *V, const rocblas_int ldv, const rocblas_stride strideV, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgesvdx_strided_batched(rocblas_handle handle, const rocblas_svect left_svect, const rocblas_svect right_svect, const rocblas_srange srange, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, rocblas_int *nsv, float *S, const rocblas_stride strideS, float *U, const rocblas_int ldu, const rocblas_stride strideU, float *V, const rocblas_int ldv, const rocblas_stride strideV, rocblas_int *ifail, const rocblas_stride strideF, rocblas_int *info, const rocblas_int batch_count)#

GESVDX_STRIDED_BATCHED computes a set of singular values and optionally the corresponding singular vectors of a batch of general m-by-n matrices \(A_l\) (partial Singular Value Decomposition).

This function computes all the singular values of \(A_l\), all the singular values in the half-open interval \([vl, vu)\), or the il-th through iu-th singular values, depending on the value of srange.

The full SVD of matrix \(A_l\) is given by:

\[ A_l = U_l S_l V_l' \]

where the m-by-n matrix \(S_l\) is zero except, possibly, for its min(m,n) diagonal elements, which are the singular values of \(A_l\) . \(U_l\) and \(V_l\) are orthogonal (unitary) matrices. The first min(m,n) columns of \(U_l\) and \(V_l\) are the left and right singular vectors of \(A_l\) , respectively.

The computation of the singular vectors is optional and is controlled by the function arguments left_svect and right_svect as described below. When computed, this function returns the transpose (or conjugate transpose) of the right singular vectors, i.e. the rows of \(V_l'\).

left_svect and right_svect are rocblas_svect enums that, for this function, can take the following values:

  • rocblas_svect_singular: the singular vectors (first min(m,n) columns of \(U_l\) or rows of \(V_l'\) ) corresponding to the computed singular values are computed,

  • rocblas_svect_none: no columns (or rows) of \(U_l\) (or \(V_l'\) ) are computed, i.e. no singular vectors.

Parameters:
  • handle[in] rocblas_handle.

  • left_svect[in] rocblas_svect. Specifies if the left singular vectors are computed.

  • right_svect[in] rocblas_svect. Specifies if the right singular vectors are computed.

  • srange[in] rocblas_srange. Specifies the type of range or interval of the singular values to be computed.

  • m[in] rocblas_int. m >= 0. The number of rows of all matrices A_l in the batch.

  • n[in] rocblas_int. n >= 0. The number of columns of all matrices A_l in the batch.

  • A[inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l. On exit, the contents of A_l are destroyed.

  • lda[in] rocblas_int. lda >= m. The leading dimension of A_l.

  • strideA[in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • vl[in] real type. 0 <= vl < vu. The lower bound of the search interval [vl, vu). Ignored if srange indicates to look for all the singular values of A_l or the singular values within a set of indices.

  • vu[in] real type. 0 <= vl < vu. The upper bound of the search interval [vl, vu). Ignored if srange indicates to look for all the singular values of A_l or the singular values within a set of indices.

  • il[in] rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise. The index of the largest singular value to be computed. Ignored if srange indicates to look for all the singular values of A_l or the singular values in a half-open interval.

  • iu[in] rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise. The index of the smallest singular value to be computed. Ignored if srange indicates to look for all the singular values of A_l or the singular values in a half-open interval.

  • nsv[out] pointer to rocblas_int. Array of batch_count integers on the GPU. The total number of singular values found. If srange is rocblas_srange_all, nsv[l] = min(m,n). If srange is rocblas_srange_index, nsv[l] = iu - il + 1. Otherwise, 0 <= nsv[l] <= min(m,n).

  • S[out] pointer to real type. Array on the GPU (the size depends on the value of strideS). The first nsv_l elements contain the computed singular values in descending order. (The remaining elements may be used as workspace for internal computations).

  • strideS[in] rocblas_stride. Stride from the start of one vector S_l to the next one S_(l+1). There is no restriction for the value of strideS. Normal use case is strideS >= nsv_l. Note: If srange is rocblas_srange_value, then the value of nsv_l is not known in advance. In this case, the user should ensure that S_l is large enough to hold min(m,n) values.

  • U[out] pointer to type. Array on the GPU (the size depends on the value of strideU). The matrix U_l of left singular vectors stored as columns. Not referenced if left_svect is set to none.

  • ldu[in] rocblas_int. ldu >= m if left_svect is set to singular; ldu >= 1 otherwise. The leading dimension of U_l.

  • strideU[in] rocblas_stride. Stride from the start of one matrix U_l to the next one U_(l+1). There is no restriction for the value of strideU. Normal use case is strideU >= ldu*nsv_l. Note: If srange is rocblas_srange_value, then the value of nsv_l is not known in advance. In this case, the user should ensure that U_l is large enough to hold min(m,n) columns.

  • V[out] pointer to type. Array on the GPU (the size depends on the value of strideV). The matrix V_l of right singular vectors stored as rows (transposed / conjugate-transposed). Not referenced if right_svect is set to none.

  • ldv[in] rocblas_int. ldv >= nsv_l if right_svect is set to singular; or ldv >= 1 otherwise. The leading dimension of V_l. Note: If srange is rocblas_srange_value, then the value of nsv_l is not known in advance. In this case, the user should ensure that V_l is large enough to hold min(m,n) rows.

  • strideV[in] rocblas_stride. Stride from the start of one matrix V_l to the next one V_(l+1). There is no restriction for the value of strideV. Normal use case is strideV >= ldv*n.

  • ifail[out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideF). If info[l] = 0, the first nsv[l] elements of ifail_l are zero. Otherwise, ifail_l contains the indices of those eigenvectors that failed to converge, as returned by BDSVDX.

  • strideF[in] rocblas_stride. Stride from the start of one vector ifail_l to the next one ifail_(l+1). There is no restriction for the value of strideF. Normal use case is strideF >= min(m,n).

  • info[out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for matrix A_l. If info[l] = i > 0, i eigenvectors did not converge in BDSVDX; their indices are stored in ifail_l.

  • batch_count[in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
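
Once the results have been copied back to the host, individual entries of a strided batch can be located from the stride and the leading dimension. The sketch below assumes column-major storage, real (double) data and 0-based indices in the code, whereas the text uses 1-based indices. `strided_offset` and `copy_right_svect` are hypothetical helpers, not rocSOLVER functions.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the 0-based element (i, j) of matrix l in a column-major strided batch. */
int64_t strided_offset(int64_t l, int64_t stride, int i, int j, int ld)
{
    assert(l >= 0 && i >= 0 && j >= 0 && i < ld);
    return l * stride + i + (int64_t)j * ld;
}

/*
 * Copy the r-th (0-based) right singular vector of batch entry l into out.
 * The right singular vectors are stored as rows of V_l', so row r is read.
 * For complex types the stored row holds the conjugated entries.
 */
void copy_right_svect(const double *V, int64_t l, int64_t strideV,
                      int ldv, int n, int r, double *out)
{
    int j;

    for (j = 0; j < n; ++j)
        out[j] = V[strided_offset(l, strideV, r, j, ldv)];
}
```

With ldv = 2, n = 3 and strideV = 6, row 1 of the second batch entry (l = 1) is read from offsets 7, 9 and 11 of V. The same offset helper applies to U_l with ldu and strideU, where singular vector c is column c, i.e. the elements (0, c) through (m-1, c).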