rocSOLVER LAPACK Functions#
LAPACK routines solve complex Numerical Linear Algebra problems. These functions are organized in the following categories:
Triangular factorizations. Based on Gaussian elimination.
Orthogonal factorizations. Based on Householder reflections.
Problem and matrix reductions. Transformation of matrices and problems into equivalent forms.
Linear-systems solvers. Based on triangular factorizations.
Least-squares solvers. Based on orthogonal factorizations.
Symmetric eigensolvers. Eigenproblems for symmetric matrices.
Singular value decomposition. Singular values and related problems for general matrices.
Note
Throughout the APIs’ descriptions, we use the following notations:
i, j, and k are used as general purpose indices. In some legacy LAPACK APIs, k could be a parameter indicating some problem/matrix dimension.
Depending on the context, when it is necessary to index rows, columns and blocks or submatrices, i is assigned to rows, j to columns and k to blocks. \(l\) is always used to index matrices/problems in a batch.
x[i] stands for the i-th element of vector x, while A[i,j] represents the element in the i-th row and j-th column of matrix A. Indices are 1-based, i.e. x[1] is the first element of x.
To identify a block in a matrix or a matrix in the batch, k and \(l\) are used as sub-indices
x_i \(=x_i\); we sometimes use both notations, \(x_i\) when displaying mathematical equations, and x_i in the text describing the function parameters.
If X is a real vector or matrix, \(X^T\) indicates its transpose; if X is complex, then \(X^H\) represents its conjugate transpose. When X could be real or complex, we use X’ to indicate X transposed or X conjugate transposed, accordingly.
When a matrix A is formed as the product of several matrices, the following notation is used: A=M(1)M(2)…M(t).
Triangular factorizations#
rocsolver_<type>potf2()#
-
rocblas_status rocsolver_zpotf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)#
-
rocblas_status rocsolver_cpotf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)#
-
rocblas_status rocsolver_dpotf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)#
-
rocblas_status rocsolver_spotf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)#
POTF2 computes the Cholesky factorization of a real symmetric (complex Hermitian) positive definite matrix A.
(This is the unblocked version of the algorithm).
The factorization has the form:
\[\begin{split} \begin{array}{cl} A = U'U & \: \text{if uplo is upper, or}\\ A = LL' & \: \text{if uplo is lower.} \end{array} \end{split}\]U is an upper triangular matrix and L is lower triangular.
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A to be factored. On exit, the lower or upper triangular factor.
lda – [in] rocblas_int. lda >= n. Specifies the leading dimension of A.
info – [out] pointer to a rocblas_int on the GPU. If info = 0, successful factorization of matrix A. If info = i > 0, the leading minor of order i of A is not positive definite. The factorization stopped at this point.
rocsolver_<type>potf2_batched()#
-
rocblas_status rocsolver_zpotf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cpotf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dpotf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_spotf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
POTF2_BATCHED computes the Cholesky factorization of a batch of real symmetric (complex Hermitian) positive definite matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form:
\[\begin{split} \begin{array}{cl} A_l^{} = U_l'U_l^{} & \: \text{if uplo is upper, or}\\ A_l^{} = L_l^{}L_l' & \: \text{if uplo is lower.} \end{array} \end{split}\]\(U_l\) is an upper triangular matrix and \(L_l\) is lower triangular.
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of matrix A_l.
A – [inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l to be factored. On exit, the upper or lower triangular factors.
lda – [in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.
info – [out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful factorization of matrix A_l. If info[l] = i > 0, the leading minor of order i of A_l is not positive definite. The l-th factorization stopped at this point.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>potf2_strided_batched()#
-
rocblas_status rocsolver_zpotf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cpotf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dpotf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_spotf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
POTF2_STRIDED_BATCHED computes the Cholesky factorization of a batch of real symmetric (complex Hermitian) positive definite matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form:
\[\begin{split} \begin{array}{cl} A_l^{} = U_l'U_l^{} & \: \text{if uplo is upper, or}\\ A_l^{} = L_l^{}L_l' & \: \text{if uplo is lower.} \end{array} \end{split}\]\(U_l\) is an upper triangular matrix and \(L_l\) is lower triangular.
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of matrix A_l.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l to be factored. On exit, the upper or lower triangular factors.
lda – [in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
info – [out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful factorization of matrix A_l. If info[l] = i > 0, the leading minor of order i of A_l is not positive definite. The l-th factorization stopped at this point.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>potrf()#
-
rocblas_status rocsolver_zpotrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)#
-
rocblas_status rocsolver_cpotrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)#
-
rocblas_status rocsolver_dpotrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)#
-
rocblas_status rocsolver_spotrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)#
POTRF computes the Cholesky factorization of a real symmetric (complex Hermitian) positive definite matrix A.
(This is the blocked version of the algorithm).
The factorization has the form:
\[\begin{split} \begin{array}{cl} A = U'U & \: \text{if uplo is upper, or}\\ A = LL' & \: \text{if uplo is lower.} \end{array} \end{split}\]U is an upper triangular matrix and L is lower triangular.
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A to be factored. On exit, the lower or upper triangular factor.
lda – [in] rocblas_int. lda >= n. Specifies the leading dimension of A.
info – [out] pointer to a rocblas_int on the GPU. If info = 0, successful factorization of matrix A. If info = i > 0, the leading minor of order i of A is not positive definite. The factorization stopped at this point.
rocsolver_<type>potrf_batched()#
-
rocblas_status rocsolver_zpotrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cpotrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dpotrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_spotrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
POTRF_BATCHED computes the Cholesky factorization of a batch of real symmetric (complex Hermitian) positive definite matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form:
\[\begin{split} \begin{array}{cl} A_l^{} = U_l'U_l^{} & \: \text{if uplo is upper, or}\\ A_l^{} = L_l^{}L_l' & \: \text{if uplo is lower.} \end{array} \end{split}\]\(U_l\) is an upper triangular matrix and \(L_l\) is lower triangular.
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of matrix A_l.
A – [inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l to be factored. On exit, the upper or lower triangular factors.
lda – [in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.
info – [out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful factorization of matrix A_l. If info[l] = i > 0, the leading minor of order i of A_l is not positive definite. The l-th factorization stopped at this point.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>potrf_strided_batched()#
-
rocblas_status rocsolver_zpotrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cpotrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dpotrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_spotrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
POTRF_STRIDED_BATCHED computes the Cholesky factorization of a batch of real symmetric (complex Hermitian) positive definite matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form:
\[\begin{split} \begin{array}{cl} A_l^{} = U_l'U_l^{} & \: \text{if uplo is upper, or}\\ A_l^{} = L_l^{}L_l' & \: \text{if uplo is lower.} \end{array} \end{split}\]\(U_l\) is an upper triangular matrix and \(L_l\) is lower triangular.
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of matrix A_l.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l to be factored. On exit, the upper or lower triangular factors.
lda – [in] rocblas_int. lda >= n. Specifies the leading dimension of A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
info – [out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful factorization of matrix A_l. If info[l] = i > 0, the leading minor of order i of A_l is not positive definite. The l-th factorization stopped at this point.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>getf2()#
-
rocblas_status rocsolver_zgetf2_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_double_complex *A, const int64_t lda, int64_t *ipiv, int64_t *info)#
-
rocblas_status rocsolver_cgetf2_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_float_complex *A, const int64_t lda, int64_t *ipiv, int64_t *info)#
-
rocblas_status rocsolver_dgetf2_64(rocblas_handle handle, const int64_t m, const int64_t n, double *A, const int64_t lda, int64_t *ipiv, int64_t *info)#
-
rocblas_status rocsolver_sgetf2_64(rocblas_handle handle, const int64_t m, const int64_t n, float *A, const int64_t lda, int64_t *ipiv, int64_t *info)#
-
rocblas_status rocsolver_zgetf2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
-
rocblas_status rocsolver_cgetf2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
-
rocblas_status rocsolver_dgetf2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
-
rocblas_status rocsolver_sgetf2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
GETF2 computes the LU factorization of a general m-by-n matrix A using partial pivoting with row interchanges.
(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization has the form
\[ A = PLU \]where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of the matrix A.
n – [in] rocblas_int. n >= 0. The number of columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of A.
ipiv – [out] pointer to rocblas_int. Array on the GPU of dimension min(m,n). The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= i <= min(m,n), the row i of the matrix was interchanged with row ipiv[i]. Matrix P of the factorization can be derived from ipiv.
info – [out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
rocsolver_<type>getf2_batched()#
-
rocblas_status rocsolver_zgetf2_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_double_complex *const A[], const int64_t lda, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
-
rocblas_status rocsolver_cgetf2_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_float_complex *const A[], const int64_t lda, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
-
rocblas_status rocsolver_dgetf2_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, double *const A[], const int64_t lda, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
-
rocblas_status rocsolver_sgetf2_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, float *const A[], const int64_t lda, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
-
rocblas_status rocsolver_zgetf2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgetf2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgetf2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgetf2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
GETF2_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.
(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization of matrix \(A_l\) in the batch has the form
\[ A_l = P_lL_lU_l \]where \(P_l\) is a permutation matrix, \(L_l\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_l\) is upper triangular (upper trapezoidal if m < n).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all matrices A_l in the batch.
A – [inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the factors L_l and U_l from the factorizations. The unit diagonal elements of L_l are not stored.
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
ipiv – [out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). Contains the vectors of pivot indices ipiv_l (corresponding to A_l). Dimension of ipiv_l is min(m,n). Elements of ipiv_l are 1-based indices. For each instance A_l in the batch and for 1 <= i <= min(m,n), the row i of the matrix A_l was interchanged with row ipiv_l[i]. Matrix P_l of the factorization can be derived from ipiv_l.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
info – [out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for factorization of A_l. If info[l] = i > 0, U_l is singular. U_l[i,i] is the first zero pivot.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>getf2_strided_batched()#
-
rocblas_status rocsolver_zgetf2_strided_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_double_complex *A, const int64_t lda, const rocblas_stride strideA, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
-
rocblas_status rocsolver_cgetf2_strided_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_float_complex *A, const int64_t lda, const rocblas_stride strideA, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
-
rocblas_status rocsolver_dgetf2_strided_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, double *A, const int64_t lda, const rocblas_stride strideA, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
-
rocblas_status rocsolver_sgetf2_strided_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, float *A, const int64_t lda, const rocblas_stride strideA, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
-
rocblas_status rocsolver_zgetf2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgetf2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgetf2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgetf2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
GETF2_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.
(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization of matrix \(A_l\) in the batch has the form
\[ A_l = P_lL_lU_l \]where \(P_l\) is a permutation matrix, \(L_l\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_l\) is upper triangular (upper trapezoidal if m < n).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all matrices A_l in the batch.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the factors L_l and U_l from the factorization. The unit diagonal elements of L_l are not stored.
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
ipiv – [out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). Contains the vectors of pivots indices ipiv_l (corresponding to A_l). Dimension of ipiv_l is min(m,n). Elements of ipiv_l are 1-based indices. For each instance A_l in the batch and for 1 <= i <= min(m,n), the row i of the matrix A_l was interchanged with row ipiv_l[i]. Matrix P_l of the factorization can be derived from ipiv_l.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
info – [out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for factorization of A_l. If info[l] = i > 0, U_l is singular. U_l[i,i] is the first zero pivot.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>getrf()#
-
rocblas_status rocsolver_zgetrf_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_double_complex *A, const int64_t lda, int64_t *ipiv, int64_t *info)#
-
rocblas_status rocsolver_cgetrf_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_float_complex *A, const int64_t lda, int64_t *ipiv, int64_t *info)#
-
rocblas_status rocsolver_dgetrf_64(rocblas_handle handle, const int64_t m, const int64_t n, double *A, const int64_t lda, int64_t *ipiv, int64_t *info)#
-
rocblas_status rocsolver_sgetrf_64(rocblas_handle handle, const int64_t m, const int64_t n, float *A, const int64_t lda, int64_t *ipiv, int64_t *info)#
-
rocblas_status rocsolver_zgetrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
-
rocblas_status rocsolver_cgetrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
-
rocblas_status rocsolver_dgetrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
-
rocblas_status rocsolver_sgetrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
GETRF computes the LU factorization of a general m-by-n matrix A using partial pivoting with row interchanges.
(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization has the form
\[ A = PLU \]where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of the matrix A.
n – [in] rocblas_int. n >= 0. The number of columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of A.
ipiv – [out] pointer to rocblas_int. Array on the GPU of dimension min(m,n). The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= i <= min(m,n), the row i of the matrix was interchanged with row ipiv[i]. Matrix P of the factorization can be derived from ipiv.
info – [out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
rocsolver_<type>getrf_batched()#
-
rocblas_status rocsolver_zgetrf_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_double_complex *const A[], const int64_t lda, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
-
rocblas_status rocsolver_cgetrf_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_float_complex *const A[], const int64_t lda, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
-
rocblas_status rocsolver_dgetrf_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, double *const A[], const int64_t lda, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
-
rocblas_status rocsolver_sgetrf_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, float *const A[], const int64_t lda, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
-
rocblas_status rocsolver_zgetrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgetrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgetrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgetrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
GETRF_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.
(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization of matrix \(A_l\) in the batch has the form
\[ A_l = P_lL_lU_l \]where \(P_l\) is a permutation matrix, \(L_l\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_l\) is upper triangular (upper trapezoidal if m < n).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all matrices A_l in the batch.
A – [inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the factors L_l and U_l from the factorizations. The unit diagonal elements of L_l are not stored.
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
ipiv – [out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). Contains the vectors of pivot indices ipiv_l (corresponding to A_l). Dimension of ipiv_l is min(m,n). Elements of ipiv_l are 1-based indices. For each instance A_l in the batch and for 1 <= i <= min(m,n), the row i of the matrix A_l was interchanged with row ipiv_l[i]. Matrix P_l of the factorization can be derived from ipiv_l.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
info – [out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for factorization of A_l. If info[l] = i > 0, U_l is singular. U_l[i,i] is the first zero pivot.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>getrf_strided_batched()#
-
rocblas_status rocsolver_zgetrf_strided_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_double_complex *A, const int64_t lda, const rocblas_stride strideA, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
-
rocblas_status rocsolver_cgetrf_strided_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, rocblas_float_complex *A, const int64_t lda, const rocblas_stride strideA, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
-
rocblas_status rocsolver_dgetrf_strided_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, double *A, const int64_t lda, const rocblas_stride strideA, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
-
rocblas_status rocsolver_sgetrf_strided_batched_64(rocblas_handle handle, const int64_t m, const int64_t n, float *A, const int64_t lda, const rocblas_stride strideA, int64_t *ipiv, const rocblas_stride strideP, int64_t *info, const int64_t batch_count)#
-
rocblas_status rocsolver_zgetrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgetrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgetrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgetrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
GETRF_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.
(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization of matrix \(A_l\) in the batch has the form
\[ A_l = P_lL_lU_l \]where \(P_l\) is a permutation matrix, \(L_l\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_l\) is upper triangular (upper trapezoidal if m < n).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all matrices A_l in the batch.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the factors L_l and U_l from the factorization. The unit diagonal elements of L_l are not stored.
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
ipiv – [out] pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP). Contains the vectors of pivots indices ipiv_l (corresponding to A_l). Dimension of ipiv_l is min(m,n). Elements of ipiv_l are 1-based indices. For each instance A_l in the batch and for 1 <= i <= min(m,n), the row i of the matrix A_l was interchanged with row ipiv_l[i]. Matrix P_l of the factorization can be derived from ipiv_l.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).
info – [out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for factorization of A_l. If info[l] = i > 0, U_l is singular. U_l[i,i] is the first zero pivot.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>sytf2()#
-
rocblas_status rocsolver_zsytf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
-
rocblas_status rocsolver_csytf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
-
rocblas_status rocsolver_dsytf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
-
rocblas_status rocsolver_ssytf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
SYTF2 computes the factorization of a symmetric indefinite matrix \(A\) using Bunch-Kaufman diagonal pivoting.
(This is the unblocked version of the algorithm).
The factorization has the form
\[\begin{split} \begin{array}{cl} A = U D U^T & \: \text{or}\\ A = L D L^T & \end{array} \end{split}\]where \(U\) or \(L\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_k\).
Specifically, \(U\) and \(L\) are computed as
\[\begin{split} \begin{array}{cl} U = P(n) U(n) \cdots P(k) U(k) \cdots & \: \text{and}\\ L = P(1) L(1) \cdots P(k) L(k) \cdots & \end{array} \end{split}\]where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_k\), and \(P(k)\) is a permutation matrix defined by \(ipiv[k]\). If we let \(s\) denote the order of block \(D_k\), then \(U(k)\) and \(L(k)\) are unit upper/lower triangular matrices defined as
\[\begin{split} U(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]and
\[\begin{split} L(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]If \(s = 1\), then \(D_k\) is stored in \(A[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A\). If \(s = 2\) and uplo is upper, then \(D_k\) is stored in \(A[k-1,k-1]\), \(A[k-1,k]\), and \(A[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A\). If \(s = 2\) and uplo is lower, then \(D_k\) is stored in \(A[k,k]\), \(A[k+1,k]\), and \(A[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A\).
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the upper or lower part of the matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the symmetric matrix A to be factored. On exit, the block diagonal matrix D and the multipliers needed to compute U or L.
lda – [in] rocblas_int. lda >= n. Specifies the leading dimension of A.
ipiv – [out] pointer to rocblas_int. Array on the GPU of dimension n. The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= k <= n, if ipiv[k] > 0 then rows and columns k and ipiv[k] were interchanged and D[k,k] is a 1-by-1 diagonal block. If, instead, ipiv[k] = ipiv[k-1] < 0 and uplo is upper (or ipiv[k] = ipiv[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv[k] (or rows and columns k+1 and -ipiv[k]) were interchanged and D[k-1,k-1] to D[k,k] (or D[k,k] to D[k+1,k+1]) is a 2-by-2 diagonal block.
info – [out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, D is singular. D[i,i] is the first diagonal zero.
rocsolver_<type>sytf2_batched()#
-
rocblas_status rocsolver_zsytf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_csytf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dsytf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_ssytf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
SYTF2_BATCHED computes the factorization of a batch of symmetric indefinite matrices using Bunch-Kaufman diagonal pivoting.
(This is the unblocked version of the algorithm).
The factorization has the form
\[\begin{split} \begin{array}{cl} A_l^{} = U_l^{} D_l^{} U_l^T & \: \text{or}\\ A_l^{} = L_l^{} D_l^{} L_l^T & \end{array} \end{split}\]where \(U_l\) or \(L_l\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D_l\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_{kl}\).
Specifically, \(U_l\) and \(L_l\) are computed as
\[\begin{split} \begin{array}{cl} U_l = P_l(n) U_l(n) \cdots P_l(k) U_l(k) \cdots & \: \text{and}\\ L_l = P_l(1) L_l(1) \cdots P_l(k) L_l(k) \cdots & \end{array} \end{split}\]where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_{kl}\), and \(P_l(k)\) is a permutation matrix defined by \(ipiv_l[k]\). If we let \(s\) denote the order of block \(D_{kl}\), then \(U_l(k)\) and \(L_l(k)\) are unit upper/lower triangular matrices defined as
\[\begin{split} U_l(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]and
\[\begin{split} L_l(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]If \(s = 1\), then \(D_{kl}\) is stored in \(A_l[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A_l\). If \(s = 2\) and uplo is upper, then \(D_{kl}\) is stored in \(A_l[k-1,k-1]\), \(A_l[k-1,k]\), and \(A_l[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A_l\). If \(s = 2\) and uplo is lower, then \(D_{kl}\) is stored in \(A_l[k,k]\), \(A_l[k+1,k]\), and \(A_l[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A_l\).
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the upper or lower part of the matrices A_l are stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of all matrices A_l in the batch.
A – [inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the symmetric matrices A_l to be factored. On exit, the block diagonal matrices D_l and the multipliers needed to compute U_l or L_l.
lda – [in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.
ipiv – [out] pointer to rocblas_int. Array on the GPU of dimension n. The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= k <= n, if ipiv_l[k] > 0 then rows and columns k and ipiv_l[k] were interchanged and D_l[k,k] is a 1-by-1 diagonal block. If, instead, ipiv_l[k] = ipiv_l[k-1] < 0 and uplo is upper (or ipiv_l[k] = ipiv_l[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv_l[k] (or rows and columns k+1 and -ipiv_l[k]) were interchanged and D_l[k-1,k-1] to D_l[k,k] (or D_l[k,k] to D_l[k+1,k+1]) is a 2-by-2 diagonal block.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.
info – [out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for factorization of A_l. If info[l] = i > 0, D_l is singular. D_l[i,i] is the first diagonal zero.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>sytf2_strided_batched()#
-
rocblas_status rocsolver_zsytf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_csytf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dsytf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_ssytf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
SYTF2_STRIDED_BATCHED computes the factorization of a batch of symmetric indefinite matrices using Bunch-Kaufman diagonal pivoting.
(This is the unblocked version of the algorithm).
The factorization has the form
\[\begin{split} \begin{array}{cl} A_l^{} = U_l^{} D_l^{} U_l^T & \: \text{or}\\ A_l^{} = L_l^{} D_l^{} L_l^T & \end{array} \end{split}\]where \(U_l\) or \(L_l\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D_l\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_{kl}\).
Specifically, \(U_l\) and \(L_l\) are computed as
\[\begin{split} \begin{array}{cl} U_l = P_l(n) U_l(n) \cdots P_l(k) U_l(k) \cdots & \: \text{and}\\ L_l = P_l(1) L_l(1) \cdots P_l(k) L_l(k) \cdots & \end{array} \end{split}\]where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_{kl}\), and \(P_l(k)\) is a permutation matrix defined by \(ipiv_l[k]\). If we let \(s\) denote the order of block \(D_{kl}\), then \(U_l(k)\) and \(L_l(k)\) are unit upper/lower triangular matrices defined as
\[\begin{split} U_l(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]and
\[\begin{split} L_l(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]If \(s = 1\), then \(D_{kl}\) is stored in \(A_l[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A_l\). If \(s = 2\) and uplo is upper, then \(D_{kl}\) is stored in \(A_l[k-1,k-1]\), \(A_l[k-1,k]\), and \(A_l[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A_l\). If \(s = 2\) and uplo is lower, then \(D_{kl}\) is stored in \(A_l[k,k]\), \(A_l[k+1,k]\), and \(A_l[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A_l\).
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the upper or lower part of the matrices A_l are stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of all matrices A_l in the batch.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the symmetric matrices A_l to be factored. On exit, the block diagonal matrices D_l and the multipliers needed to compute U_l or L_l.
lda – [in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
ipiv – [out] pointer to rocblas_int. Array on the GPU of dimension n. The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= k <= n, if ipiv_l[k] > 0 then rows and columns k and ipiv_l[k] were interchanged and D_l[k,k] is a 1-by-1 diagonal block. If, instead, ipiv_l[k] = ipiv_l[k-1] < 0 and uplo is upper (or ipiv_l[k] = ipiv_l[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv_l[k] (or rows and columns k+1 and -ipiv_l[k]) were interchanged and D_l[k-1,k-1] to D_l[k,k] (or D_l[k,k] to D_l[k+1,k+1]) is a 2-by-2 diagonal block.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.
info – [out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for factorization of A_l. If info[l] = i > 0, D_l is singular. D_l[i,i] is the first diagonal zero.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>sytrf()#
-
rocblas_status rocsolver_zsytrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
-
rocblas_status rocsolver_csytrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
-
rocblas_status rocsolver_dsytrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
-
rocblas_status rocsolver_ssytrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
SYTRF computes the factorization of a symmetric indefinite matrix \(A\) using Bunch-Kaufman diagonal pivoting.
(This is the blocked version of the algorithm).
The factorization has the form
\[\begin{split} \begin{array}{cl} A = U D U^T & \: \text{or}\\ A = L D L^T & \end{array} \end{split}\]where \(U\) or \(L\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_k\).
Specifically, \(U\) and \(L\) are computed as
\[\begin{split} \begin{array}{cl} U = P(n) U(n) \cdots P(k) U(k) \cdots & \: \text{and}\\ L = P(1) L(1) \cdots P(k) L(k) \cdots & \end{array} \end{split}\]where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_k\), and \(P(k)\) is a permutation matrix defined by \(ipiv[k]\). If we let \(s\) denote the order of block \(D_k\), then \(U(k)\) and \(L(k)\) are unit upper/lower triangular matrices defined as
\[\begin{split} U(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]and
\[\begin{split} L(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]If \(s = 1\), then \(D_k\) is stored in \(A[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A\). If \(s = 2\) and uplo is upper, then \(D_k\) is stored in \(A[k-1,k-1]\), \(A[k-1,k]\), and \(A[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A\). If \(s = 2\) and uplo is lower, then \(D_k\) is stored in \(A[k,k]\), \(A[k+1,k]\), and \(A[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A\).
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the upper or lower part of the matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the symmetric matrix A to be factored. On exit, the block diagonal matrix D and the multipliers needed to compute U or L.
lda – [in] rocblas_int. lda >= n. Specifies the leading dimension of A.
ipiv – [out] pointer to rocblas_int. Array on the GPU of dimension n. The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= k <= n, if ipiv[k] > 0 then rows and columns k and ipiv[k] were interchanged and D[k,k] is a 1-by-1 diagonal block. If, instead, ipiv[k] = ipiv[k-1] < 0 and uplo is upper (or ipiv[k] = ipiv[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv[k] (or rows and columns k+1 and -ipiv[k]) were interchanged and D[k-1,k-1] to D[k,k] (or D[k,k] to D[k+1,k+1]) is a 2-by-2 diagonal block.
info – [out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, D is singular. D[i,i] is the first diagonal zero.
rocsolver_<type>sytrf_batched()#
-
rocblas_status rocsolver_zsytrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_csytrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dsytrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_ssytrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
SYTRF_BATCHED computes the factorization of a batch of symmetric indefinite matrices using Bunch-Kaufman diagonal pivoting.
(This is the blocked version of the algorithm).
The factorization has the form
\[\begin{split} \begin{array}{cl} A_l^{} = U_l^{} D_l^{} U_l^T & \: \text{or}\\ A_l^{} = L_l^{} D_l^{} L_l^T & \end{array} \end{split}\]where \(U_l\) or \(L_l\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D_l\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_{kl}\).
Specifically, \(U_l\) and \(L_l\) are computed as
\[\begin{split} \begin{array}{cl} U_l = P_l(n) U_l(n) \cdots P_l(k) U_l(k) \cdots & \: \text{and}\\ L_l = P_l(1) L_l(1) \cdots P_l(k) L_l(k) \cdots & \end{array} \end{split}\]where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_{kl}\), and \(P_l(k)\) is a permutation matrix defined by \(ipiv_l[k]\). If we let \(s\) denote the order of block \(D_{kl}\), then \(U_l(k)\) and \(L_l(k)\) are unit upper/lower triangular matrices defined as
\[\begin{split} U_l(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]and
\[\begin{split} L_l(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]If \(s = 1\), then \(D_{kl}\) is stored in \(A_l[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A_l\). If \(s = 2\) and uplo is upper, then \(D_{kl}\) is stored in \(A_l[k-1,k-1]\), \(A_l[k-1,k]\), and \(A_l[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A_l\). If \(s = 2\) and uplo is lower, then \(D_{kl}\) is stored in \(A_l[k,k]\), \(A_l[k+1,k]\), and \(A_l[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A_l\).
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the upper or lower part of the matrices A_l are stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of all matrices A_l in the batch.
A – [inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the symmetric matrices A_l to be factored. On exit, the block diagonal matrices D_l and the multipliers needed to compute U_l or L_l.
lda – [in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.
ipiv – [out] pointer to rocblas_int. Array on the GPU of dimension n. The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= k <= n, if ipiv_l[k] > 0 then rows and columns k and ipiv_l[k] were interchanged and D_l[k,k] is a 1-by-1 diagonal block. If, instead, ipiv_l[k] = ipiv_l[k-1] < 0 and uplo is upper (or ipiv_l[k] = ipiv_l[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv_l[k] (or rows and columns k+1 and -ipiv_l[k]) were interchanged and D_l[k-1,k-1] to D_l[k,k] (or D_l[k,k] to D_l[k+1,k+1]) is a 2-by-2 diagonal block.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.
info – [out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for factorization of A_l. If info[l] = i > 0, D_l is singular. D_l[i,i] is the first diagonal zero.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>sytrf_strided_batched()#
-
rocblas_status rocsolver_zsytrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_csytrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dsytrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_ssytrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
SYTRF_STRIDED_BATCHED computes the factorization of a batch of symmetric indefinite matrices using Bunch-Kaufman diagonal pivoting.
(This is the blocked version of the algorithm).
The factorization has the form
\[\begin{split} \begin{array}{cl} A_l^{} = U_l^{} D_l^{} U_l^T & \: \text{or}\\ A_l^{} = L_l^{} D_l^{} L_l^T & \end{array} \end{split}\]where \(U_l\) or \(L_l\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D_l\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_{kl}\).
Specifically, \(U_l\) and \(L_l\) are computed as
\[\begin{split} \begin{array}{cl} U_l = P_l(n) U_l(n) \cdots P_l(k) U_l(k) \cdots & \: \text{and}\\ L_l = P_l(1) L_l(1) \cdots P_l(k) L_l(k) \cdots & \end{array} \end{split}\]where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_{kl}\), and \(P_l(k)\) is a permutation matrix defined by \(ipiv_l[k]\). If we let \(s\) denote the order of block \(D_{kl}\), then \(U_l(k)\) and \(L_l(k)\) are unit upper/lower triangular matrices defined as
\[\begin{split} U_l(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]and
\[\begin{split} L_l(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]If \(s = 1\), then \(D_{kl}\) is stored in \(A_l[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A_l\). If \(s = 2\) and uplo is upper, then \(D_{kl}\) is stored in \(A_l[k-1,k-1]\), \(A_l[k-1,k]\), and \(A_l[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A_l\). If \(s = 2\) and uplo is lower, then \(D_l(k)\) is stored in \(A_l[k,k]\), \(A_l[k+1,k]\), and \(A_l[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A_l\).
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the upper or lower part of the matrices A_l are stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of all matrices A_l in the batch.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the symmetric matrices A_l to be factored. On exit, the block diagonal matrices D_l and the multipliers needed to compute U_l or L_l.
lda – [in] rocblas_int. lda >= n. Specifies the leading dimension of matrices A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
ipiv – [out] pointer to rocblas_int. Array on the GPU of dimension n. The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= k <= n, if ipiv_l[k] > 0 then rows and columns k and ipiv_l[k] were interchanged and D_l[k,k] is a 1-by-1 diagonal block. If, instead, ipiv_l[k] = ipiv_l[k-1] < 0 and uplo is upper (or ipiv_l[k] = ipiv_l[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv_l[k] (or rows and columns k+1 and -ipiv_l[k]) were interchanged and D_l[k-1,k-1] to D_l[k,k] (or D_l[k,k] to D_l[k+1,k+1]) is a 2-by-2 diagonal block.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.
info – [out] pointer to rocblas_int. Array of batch_count integers on the GPU. If info[l] = 0, successful exit for factorization of A_l. If info[l] = i > 0, D_l is singular. D_l[i,i] is the first diagonal zero.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
Orthogonal factorizations#
rocsolver_<type>geqr2()#
-
rocblas_status rocsolver_zgeqr2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
-
rocblas_status rocsolver_cgeqr2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
-
rocblas_status rocsolver_dgeqr2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
-
rocblas_status rocsolver_sgeqr2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#
GEQR2 computes a QR factorization of a general m-by-n matrix A.
(This is the unblocked version of the algorithm).
The factorization has the form
\[\begin{split} A = Q\left[\begin{array}{c} R\\ 0 \end{array}\right] \end{split}\]where R is upper triangular (upper trapezoidal if m < n), and Q is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q = H(1)H(2)\cdots H(k), \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H(i)\) is given by
\[ H(i) = I - \text{ipiv}[i] \cdot v_i^{} v_i' \]where the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of the matrix A.
n – [in] rocblas_int. n >= 0. The number of columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on and above the diagonal contain the factor R; the elements below the diagonal are the last m - i elements of Householder vector v_i.
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of A.
ipiv – [out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars.
rocsolver_<type>geqr2_batched()#
-
rocblas_status rocsolver_zgeqr2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgeqr2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgeqr2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgeqr2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GEQR2_BATCHED computes the QR factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form
\[\begin{split} A_l = Q_l\left[\begin{array}{c} R_l\\ 0 \end{array}\right] \end{split}\]where \(R_l\) is upper triangular (upper trapezoidal if m < n), and \(Q_l\) is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_l = H_l(1)H_l(2)\cdots H_l(k), \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]where the first i-1 elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and above the diagonal contain the factor R_l. The elements below the diagonal are the last m - i elements of Householder vector v_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
ipiv – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>geqr2_strided_batched()#
-
rocblas_status rocsolver_zgeqr2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgeqr2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgeqr2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgeqr2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GEQR2_STRIDED_BATCHED computes the QR factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form
\[\begin{split} A_l = Q_l\left[\begin{array}{c} R_l\\ 0 \end{array}\right] \end{split}\]where \(R_l\) is upper triangular (upper trapezoidal if m < n), and \(Q_l\) is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_l = H_l(1)H_l(2)\cdots H_l(k), \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]where the first i-1 elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and above the diagonal contain the factor R_l. The elements below the diagonal are the last m - i elements of Householder vector v_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>geqrf()#
-
rocblas_status rocsolver_zgeqrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
-
rocblas_status rocsolver_cgeqrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
-
rocblas_status rocsolver_dgeqrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
-
rocblas_status rocsolver_sgeqrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#
GEQRF computes a QR factorization of a general m-by-n matrix A.
(This is the blocked version of the algorithm).
The factorization has the form
\[\begin{split} A = Q\left[\begin{array}{c} R\\ 0 \end{array}\right] \end{split}\]where R is upper triangular (upper trapezoidal if m < n), and Q is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q = H(1)H(2)\cdots H(k), \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H(i)\) is given by
\[ H(i) = I - \text{ipiv}[i] \cdot v_i^{} v_i' \]where the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of the matrix A.
n – [in] rocblas_int. n >= 0. The number of columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on and above the diagonal contain the factor R; the elements below the diagonal are the last m - i elements of Householder vector v_i.
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of A.
ipiv – [out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars.
rocsolver_<type>geqrf_batched()#
-
rocblas_status rocsolver_zgeqrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgeqrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgeqrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgeqrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GEQRF_BATCHED computes the QR factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form
\[\begin{split} A_l = Q_l\left[\begin{array}{c} R_l\\ 0 \end{array}\right] \end{split}\]where \(R_l\) is upper triangular (upper trapezoidal if m < n), and \(Q_l\) is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_l = H_l(1)H_l(2)\cdots H_l(k), \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]where the first i-1 elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and above the diagonal contain the factor R_l. The elements below the diagonal are the last m - i elements of Householder vector v_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
ipiv – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>geqrf_strided_batched()#
-
rocblas_status rocsolver_zgeqrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgeqrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgeqrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgeqrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GEQRF_STRIDED_BATCHED computes the QR factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form
\[\begin{split} A_l = Q_l\left[\begin{array}{c} R_l\\ 0 \end{array}\right] \end{split}\]where \(R_l\) is upper triangular (upper trapezoidal if m < n), and \(Q_l\) is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_l = H_l(1)H_l(2)\cdots H_l(k), \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]where the first i-1 elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and above the diagonal contain the factor R_l. The elements below the diagonal are the last m - i elements of Householder vector v_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>gerq2()#
-
rocblas_status rocsolver_zgerq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
-
rocblas_status rocsolver_cgerq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
-
rocblas_status rocsolver_dgerq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
-
rocblas_status rocsolver_sgerq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#
GERQ2 computes a RQ factorization of a general m-by-n matrix A.
(This is the unblocked version of the algorithm).
The factorization has the form
\[ A = \left[\begin{array}{cc} 0 & R \end{array}\right] Q \]where R is upper triangular (upper trapezoidal if m > n), and Q is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q = H(1)'H(2)' \cdots H(k)', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrix \(H(i)\) is given by
\[ H(i) = I - \text{ipiv}[i] \cdot v_i^{} v_i' \]where the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of the matrix A.
n – [in] rocblas_int. n >= 0. The number of columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_i.
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of A.
ipiv – [out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars.
rocsolver_<type>gerq2_batched()#
-
rocblas_status rocsolver_zgerq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgerq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgerq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgerq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GERQ2_BATCHED computes the RQ factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form
\[ A_l = \left[\begin{array}{cc} 0 & R_l \end{array}\right] Q_l \]where \(R_l\) is upper triangular (upper trapezoidal if m > n), and \(Q_l\) is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_l = H_l(1)'H_l(2)' \cdots H_l(k)', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrices \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]where the last n-i elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R_l; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
ipiv – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>gerq2_strided_batched()#
-
rocblas_status rocsolver_zgerq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgerq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgerq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgerq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GERQ2_STRIDED_BATCHED computes the RQ factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form
\[ A_l = \left[\begin{array}{cc} 0 & R_l \end{array}\right] Q_l \]where \(R_l\) is upper triangular (upper trapezoidal if m > n), and \(Q_l\) is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_l = H_l(1)'H_l(2)' \cdots H_l(k)', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrices \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]where the last n-i elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R_l; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>gerqf()#
-
rocblas_status rocsolver_zgerqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
-
rocblas_status rocsolver_cgerqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
-
rocblas_status rocsolver_dgerqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
-
rocblas_status rocsolver_sgerqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#
GERQF computes a RQ factorization of a general m-by-n matrix A.
(This is the blocked version of the algorithm).
The factorization has the form
\[ A = \left[\begin{array}{cc} 0 & R \end{array}\right] Q \]where R is upper triangular (upper trapezoidal if m > n), and Q is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q = H(1)'H(2)' \cdots H(k)', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrix \(H(i)\) is given by
\[ H(i) = I - \text{ipiv}[i] \cdot v_i^{} v_i' \]where the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of the matrix A.
n – [in] rocblas_int. n >= 0. The number of columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_i.
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of A.
ipiv – [out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars.
rocsolver_<type>gerqf_batched()#
-
rocblas_status rocsolver_zgerqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgerqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgerqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgerqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GERQF_BATCHED computes the RQ factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form
\[ A_l = \left[\begin{array}{cc} 0 & R_l \end{array}\right] Q_l \]where \(R_l\) is upper triangular (upper trapezoidal if m > n), and \(Q_l\) is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_l = H_l(1)'H_l(2)' \cdots H_l(k)', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrices \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]where the last n-i elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R_l; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
ipiv – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>gerqf_strided_batched()#
-
rocblas_status rocsolver_zgerqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgerqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgerqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgerqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GERQF_STRIDED_BATCHED computes the RQ factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form
\[ A_l = \left[\begin{array}{cc} 0 & R_l \end{array}\right] Q_l \]where \(R_l\) is upper triangular (upper trapezoidal if m > n), and \(Q_l\) is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_l = H_l(1)'H_l(2)' \cdots H_l(k)', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrices \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]where the last n-i elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R_l; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>geql2()#
-
rocblas_status rocsolver_zgeql2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
-
rocblas_status rocsolver_cgeql2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
-
rocblas_status rocsolver_dgeql2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
-
rocblas_status rocsolver_sgeql2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#
GEQL2 computes a QL factorization of a general m-by-n matrix A.
(This is the unblocked version of the algorithm).
The factorization has the form
\[\begin{split} A = Q\left[\begin{array}{c} 0\\ L \end{array}\right] \end{split}\]where L is lower triangular (lower trapezoidal if m < n), and Q is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q = H(k)H(k-1)\cdots H(1), \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H(i)\) is given by
\[ H(i) = I - \text{ipiv}[i] \cdot v_i^{} v_i' \]where the last m-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of the matrix A.
n – [in] rocblas_int. n >= 0. The number of columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_i.
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of A.
ipiv – [out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars.
rocsolver_<type>geql2_batched()#
-
rocblas_status rocsolver_zgeql2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgeql2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgeql2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgeql2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GEQL2_BATCHED computes the QL factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form
\[\begin{split} A_l = Q_l\left[\begin{array}{c} 0\\ L_l \end{array}\right] \end{split}\]where \(L_l\) is lower triangular (lower trapezoidal if m < n), and \(Q_l\) is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_l = H_l(k)H_l(k-1)\cdots H_l(1), \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]where the last m-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L_l; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
ipiv – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>geql2_strided_batched()#
-
rocblas_status rocsolver_zgeql2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgeql2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgeql2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgeql2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GEQL2_STRIDED_BATCHED computes the QL factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form
\[\begin{split} A_l = Q_l\left[\begin{array}{c} 0\\ L_l \end{array}\right] \end{split}\]where \(L_l\) is lower triangular (lower trapezoidal if m < n), and \(Q_l\) is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_l = H_l(k)H_l(k-1)\cdots H_l(1), \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]where the last m-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L_l; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>geqlf()#
-
rocblas_status rocsolver_zgeqlf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
-
rocblas_status rocsolver_cgeqlf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
-
rocblas_status rocsolver_dgeqlf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
-
rocblas_status rocsolver_sgeqlf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#
GEQLF computes a QL factorization of a general m-by-n matrix A.
(This is the blocked version of the algorithm).
The factorization has the form
\[\begin{split} A = Q\left[\begin{array}{c} 0\\ L \end{array}\right] \end{split}\]where L is lower triangular (lower trapezoidal if m < n), and Q is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q = H(k)H(k-1)\cdots H(1), \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H(i)\) is given by
\[ H(i) = I - \text{ipiv}[i] \cdot v_i^{} v_i' \]where the last m-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of the matrix A.
n – [in] rocblas_int. n >= 0. The number of columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_i.
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of A.
ipiv – [out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars.
rocsolver_<type>geqlf_batched()#
-
rocblas_status rocsolver_zgeqlf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgeqlf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgeqlf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgeqlf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GEQLF_BATCHED computes the QL factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form
\[\begin{split} A_l = Q_l\left[\begin{array}{c} 0\\ L_l \end{array}\right] \end{split}\]where \(L_l\) is lower triangular (lower trapezoidal if m < n), and \(Q_l\) is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_l = H_l(k)H_l(k-1)\cdots H_l(1), \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]where the last m-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L_l; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
ipiv – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>geqlf_strided_batched()#
-
rocblas_status rocsolver_zgeqlf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgeqlf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgeqlf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgeqlf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GEQLF_STRIDED_BATCHED computes the QL factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form
\[\begin{split} A_l = Q_l\left[\begin{array}{c} 0\\ L_l \end{array}\right] \end{split}\]where \(L_l\) is lower triangular (lower trapezoidal if m < n), and \(Q_l\) is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_l = H_l(k)H_l(k-1)\cdots H_l(1), \quad \text{with} \: k = \text{min}(m,n) \]Each Householder matrix \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]where the last m-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L_l; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>gelq2()#
-
rocblas_status rocsolver_zgelq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
-
rocblas_status rocsolver_cgelq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
-
rocblas_status rocsolver_dgelq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
-
rocblas_status rocsolver_sgelq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#
GELQ2 computes a LQ factorization of a general m-by-n matrix A.
(This is the unblocked version of the algorithm).
The factorization has the form
\[ A = \left[\begin{array}{cc} L & 0 \end{array}\right] Q \]where L is lower triangular (lower trapezoidal if m > n), and Q is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q = H(k)'H(k-1)' \cdots H(1)', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrix \(H(i)\) is given by
\[ H(i) = I - \text{ipiv}[i] \cdot v_i' v_i^{} \]where the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of the matrix A.
n – [in] rocblas_int. n >= 0. The number of columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on and below the diagonal contain the factor L; the elements above the diagonal are the last n - i elements of Householder vector v_i.
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of A.
ipiv – [out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars.
rocsolver_<type>gelq2_batched()#
-
rocblas_status rocsolver_zgelq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgelq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgelq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgelq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GELQ2_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form
\[ A_l = \left[\begin{array}{cc} L_l & 0 \end{array}\right] Q_l \]where \(L_l\) is lower triangular (lower trapezoidal if m > n), and \(Q_l\) is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_l = H_l(k)'H_l(k-1)' \cdots H_l(1)', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrices \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}' v_{l_i}^{} \]where the first i-1 elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and below the diagonal contain the factor L_l. The elements above the diagonal are the last n - i elements of Householder vector v_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
ipiv – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>gelq2_strided_batched()#
-
rocblas_status rocsolver_zgelq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgelq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgelq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgelq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GELQ2_STRIDED_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form
\[ A_l = \left[\begin{array}{cc} L_l & 0 \end{array}\right] Q_l \]where \(L_l\) is lower triangular (lower trapezoidal if m > n), and \(Q_l\) is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_l = H_l(k)'H_l(k-1)' \cdots H_l(1)', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrices \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}' v_{l_i}^{} \]where the first i-1 elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and below the diagonal contain the factor L_l. The elements above the diagonal are the last n - i elements of Householder vector v_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>gelqf()#
-
rocblas_status rocsolver_zgelqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
-
rocblas_status rocsolver_cgelqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
-
rocblas_status rocsolver_dgelqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
-
rocblas_status rocsolver_sgelqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#
GELQF computes a LQ factorization of a general m-by-n matrix A.
(This is the blocked version of the algorithm).
The factorization has the form
\[ A = \left[\begin{array}{cc} L & 0 \end{array}\right] Q \]where L is lower triangular (lower trapezoidal if m > n), and Q is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q = H(k)'H(k-1)' \cdots H(1)', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrix \(H(i)\) is given by
\[ H(i) = I - \text{ipiv}[i] \cdot v_i' v_i^{} \]where the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of the matrix A.
n – [in] rocblas_int. n >= 0. The number of columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on and below the diagonal contain the factor L; the elements above the diagonal are the last n - i elements of Householder vector v_i.
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of A.
ipiv – [out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars.
rocsolver_<type>gelqf_batched()#
-
rocblas_status rocsolver_zgelqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgelqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgelqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgelqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GELQF_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form
\[ A_l = \left[\begin{array}{cc} L_l & 0 \end{array}\right] Q_l \]where \(L_l\) is lower triangular (lower trapezoidal if m > n), and \(Q_l\) is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_l = H_l(k)'H_l(k-1)' \cdots H_l(1)', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrices \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}' v_{l_i}^{} \]where the first i-1 elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and below the diagonal contain the factor L_l. The elements above the diagonal are the last n - i elements of Householder vector v_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
ipiv – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>gelqf_strided_batched()#
-
rocblas_status rocsolver_zgelqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgelqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgelqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgelqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
GELQF_STRIDED_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
The factorization of matrix \(A_l\) in the batch has the form
\[ A_l = \left[\begin{array}{cc} L_l & 0 \end{array}\right] Q_l \]where \(L_l\) is lower triangular (lower trapezoidal if m > n), and \(Q_l\) is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices
\[ Q_l = H_l(k)'H_l(k-1)' \cdots H_l(1)', \quad \text{with} \: k = \text{min}(m,n). \]Each Householder matrices \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{ipiv}_l^{}[i] \cdot v_{l_i}' v_{l_i}^{} \]where the first i-1 elements of Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on and below the diagonal contain the factor L_l. The elements above the diagonal are the last n - i elements of Householder vector v_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors ipiv_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector ipiv_l to the next one ipiv_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
Problem and matrix reductions#
rocsolver_<type>gebd2()#
-
rocblas_status rocsolver_zgebd2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tauq, rocblas_double_complex *taup)#
-
rocblas_status rocsolver_cgebd2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tauq, rocblas_float_complex *taup)#
-
rocblas_status rocsolver_dgebd2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, double *tauq, double *taup)#
-
rocblas_status rocsolver_sgebd2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, float *tauq, float *taup)#
GEBD2 computes the bidiagonal form of a general m-by-n matrix A.
(This is the unblocked version of the algorithm).
The bidiagonal form is given by:
\[ B = Q' A P \]where B is upper bidiagonal if m >= n and lower bidiagonal if m < n, and Q and P are orthogonal/unitary matrices represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q = H(1)H(2)\cdots H(n)\: \text{and} \: P = G(1)G(2)\cdots G(n-1), & \: \text{if}\: m >= n, \:\text{or}\\ Q = H(1)H(2)\cdots H(m-1)\: \text{and} \: P = G(1)G(2)\cdots G(m), & \: \text{if}\: m < n. \end{array} \end{split}\]Each Householder matrix \(H(i)\) and \(G(i)\) is given by
\[\begin{split} \begin{array}{cl} H(i) = I - \text{tauq}[i] \cdot v_i^{} v_i', & \: \text{and}\\ G(i) = I - \text{taup}[i] \cdot u_i' u_i^{}. \end{array} \end{split}\]If m >= n, the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\); while the first i elements of the Householder vector \(u_i\) are zero, and \(u_i[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_i\) are zero, and \(u_i[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of the matrix A.
n – [in] rocblas_int. n >= 0. The number of columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_i, and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_i. If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_i, and the elements above the diagonal are the last n - i elements of Householder vector u_i.
lda – [in] rocblas_int. lda >= m. specifies the leading dimension of A.
D – [out] pointer to real type. Array on the GPU of dimension min(m,n). The diagonal elements of B.
E – [out] pointer to real type. Array on the GPU of dimension min(m,n)-1. The off-diagonal elements of B.
tauq – [out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars associated with matrix Q.
taup – [out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars associated with matrix P.
rocsolver_<type>gebd2_batched()#
-
rocblas_status rocsolver_zgebd2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tauq, const rocblas_stride strideQ, rocblas_double_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgebd2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tauq, const rocblas_stride strideQ, rocblas_float_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgebd2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tauq, const rocblas_stride strideQ, double *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgebd2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tauq, const rocblas_stride strideQ, float *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
GEBD2_BATCHED computes the bidiagonal form of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
For each instance in the batch, the bidiagonal form is given by:
\[ B_l^{} = Q_l' A_l^{} P_l^{} \]where \(B_l\) is upper bidiagonal if m >= n and lower bidiagonal if m < n, and \(Q_l\) and \(P_l\) are orthogonal/unitary matrices represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n)\: \text{and} \: P_l = G_l(1)G_l(2)\cdots G_l(n-1), & \: \text{if}\: m >= n, \:\text{or}\\ Q_l = H_l(1)H_l(2)\cdots H_l(m-1)\: \text{and} \: P_l = G_l(1)G_l(2)\cdots G_l(m), & \: \text{if}\: m < n. \end{array} \end{split}\]Each Householder matrix \(H_l(i)\) and \(G_l(i)\) is given by
\[\begin{split} \begin{array}{cl} H_l^{}(i) = I - \text{tauq}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}', & \: \text{and}\\ G_l^{}(i) = I - \text{taup}_l^{}[i] \cdot u_{l_i}' u_{l_i}^{}. \end{array} \end{split}\]If m >= n, the first i-1 elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\); while the first i elements of the Householder vector \(u_{l_i}\) are zero, and \(u_{l_i}[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_{l_i}\) are zero, and \(u_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B_l. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_(l_i), and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_(l_i). If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_(l_i), and the elements above the diagonal are the last n - i elements of Householder vector u_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
D – [out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of B_l.
strideD – [in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= min(m,n).
E – [out] pointer to real type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of B_l.
strideE – [in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)-1.
tauq – [out] pointer to type. Array on the GPU (the size depends on the value of strideQ). Contains the vectors tauq_l of Householder scalars associated with matrices Q_l.
strideQ – [in] rocblas_stride. Stride from the start of one vector tauq_l to the next one tauq_(l+1). There is no restriction for the value of strideQ. Normal use is strideQ >= min(m,n).
taup – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors taup_l of Householder scalars associated with matrices P_l.
strideP – [in] rocblas_stride. Stride from the start of one vector taup_l to the next one taup_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>gebd2_strided_batched()#
-
rocblas_status rocsolver_zgebd2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tauq, const rocblas_stride strideQ, rocblas_double_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgebd2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tauq, const rocblas_stride strideQ, rocblas_float_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgebd2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tauq, const rocblas_stride strideQ, double *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgebd2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tauq, const rocblas_stride strideQ, float *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
GEBD2_STRIDED_BATCHED computes the bidiagonal form of a batch of general m-by-n matrices.
(This is the unblocked version of the algorithm).
For each instance in the batch, the bidiagonal form is given by:
\[ B_l^{} = Q_l' A_l^{} P_l^{} \]where \(B_l\) is upper bidiagonal if m >= n and lower bidiagonal if m < n, and \(Q_l\) and \(P_l\) are orthogonal/unitary matrices represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n)\: \text{and} \: P_1 = G_l(1)G_l(2)\cdots G_l(n-1), & \: \text{if}\: m >= n, \:\text{or}\\ Q_l = H_l(1)H_l(2)\cdots H_l(m-1)\: \text{and} \: P_1 = G_l(1)G_l(2)\cdots G_l(m), & \: \text{if}\: m < n. \end{array} \end{split}\]Each Householder matrix \(H_l(i)\) and \(G_l(i)\) is given by
\[\begin{split} \begin{array}{cl} H_l^{}(i) = I - \text{tauq}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}', & \: \text{and}\\ G_l^{}(i) = I - \text{taup}_l^{}[i] \cdot u_{l_i}' u_{l_i}^{}. \end{array} \end{split}\]If m >= n, the first i-1 elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\); while the first i elements of the Householder vector \(u_{l_i}\) are zero, and \(u_{l_i}[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_{l_i}\) are zero, and \(u_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B_l. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_(l_i), and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_(l_i). If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_(l_i), and the elements above the diagonal are the last n - i elements of Householder vector u_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
D – [out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of B_l.
strideD – [in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= min(m,n).
E – [out] pointer to real type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of B_l.
strideE – [in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)-1.
tauq – [out] pointer to type. Array on the GPU (the size depends on the value of strideQ). Contains the vectors tauq_l of Householder scalars associated with matrices Q_l.
strideQ – [in] rocblas_stride. Stride from the start of one vector tauq_l to the next one tauq_(l+1). There is no restriction for the value of strideQ. Normal use is strideQ >= min(m,n).
taup – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors taup_l of Householder scalars associated with matrices P_l.
strideP – [in] rocblas_stride. Stride from the start of one vector taup_l to the next one taup_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>gebrd()#
-
rocblas_status rocsolver_zgebrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tauq, rocblas_double_complex *taup)#
-
rocblas_status rocsolver_cgebrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tauq, rocblas_float_complex *taup)#
-
rocblas_status rocsolver_dgebrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, double *tauq, double *taup)#
-
rocblas_status rocsolver_sgebrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, float *tauq, float *taup)#
GEBRD computes the bidiagonal form of a general m-by-n matrix A.
(This is the blocked version of the algorithm).
The bidiagonal form is given by:
\[ B = Q' A P \]where B is upper bidiagonal if m >= n and lower bidiagonal if m < n, and Q and P are orthogonal/unitary matrices represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q = H(1)H(2)\cdots H(n)\: \text{and} \: P = G(1)G(2)\cdots G(n-1), & \: \text{if}\: m >= n, \:\text{or}\\ Q = H(1)H(2)\cdots H(m-1)\: \text{and} \: P = G(1)G(2)\cdots G(m), & \: \text{if}\: m < n. \end{array} \end{split}\]Each Householder matrix \(H(i)\) and \(G(i)\) is given by
\[\begin{split} \begin{array}{cl} H(i) = I - \text{tauq}[i] \cdot v_i^{} v_i', & \: \text{and}\\ G(i) = I - \text{taup}[i] \cdot u_i' u_i^{}. \end{array} \end{split}\]If m >= n, the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\); while the first i elements of the Householder vector \(u_i\) are zero, and \(u_i[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_i\) are zero, and \(u_i[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of the matrix A.
n – [in] rocblas_int. n >= 0. The number of columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_i, and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_i. If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_i, and the elements above the diagonal are the last n - i elements of Householder vector u_i.
lda – [in] rocblas_int. lda >= m. specifies the leading dimension of A.
D – [out] pointer to real type. Array on the GPU of dimension min(m,n). The diagonal elements of B.
E – [out] pointer to real type. Array on the GPU of dimension min(m,n)-1. The off-diagonal elements of B.
tauq – [out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars associated with matrix Q.
taup – [out] pointer to type. Array on the GPU of dimension min(m,n). The Householder scalars associated with matrix P.
rocsolver_<type>gebrd_batched()#
-
rocblas_status rocsolver_zgebrd_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tauq, const rocblas_stride strideQ, rocblas_double_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgebrd_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tauq, const rocblas_stride strideQ, rocblas_float_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgebrd_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tauq, const rocblas_stride strideQ, double *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgebrd_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tauq, const rocblas_stride strideQ, float *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
GEBRD_BATCHED computes the bidiagonal form of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
For each instance in the batch, the bidiagonal form is given by:
\[ B_l^{} = Q_l' A_l^{} P_l^{} \]where \(B_l\) is upper bidiagonal if m >= n and lower bidiagonal if m < n, and \(Q_l\) and \(P_l\) are orthogonal/unitary matrices represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n)\: \text{and} \: P_l = G_l(1)G_l(2)\cdots G_l(n-1), & \: \text{if}\: m >= n, \:\text{or}\\ Q_l = H_l(1)H_l(2)\cdots H_l(m-1)\: \text{and} \: P_l = G_l(1)G_l(2)\cdots G_l(m), & \: \text{if}\: m < n. \end{array} \end{split}\]Each Householder matrix \(H_l(i)\) and \(G_l(i)\) is given by
\[\begin{split} \begin{array}{cl} H_l^{}(i) = I - \text{tauq}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}', & \: \text{and}\\ G_l^{}(i) = I - \text{taup}_l^{}[i] \cdot u_{l_i}' u_{l_i}^{}. \end{array} \end{split}\]If m >= n, the first i-1 elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\); while the first i elements of the Householder vector \(u_{l_i}\) are zero, and \(u_{l_i}[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_{l_i}\) are zero, and \(u_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the m-by-n matrices A_l to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B_l. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_(l_i), and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_(l_i). If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_(l_i), and the elements above the diagonal are the last n - i elements of Householder vector u_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
D – [out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of B_l.
strideD – [in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= min(m,n).
E – [out] pointer to real type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of B_l.
strideE – [in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)-1.
tauq – [out] pointer to type. Array on the GPU (the size depends on the value of strideQ). Contains the vectors tauq_l of Householder scalars associated with matrices Q_l.
strideQ – [in] rocblas_stride. Stride from the start of one vector tauq_l to the next one tauq_(l+1). There is no restriction for the value of strideQ. Normal use is strideQ >= min(m,n).
taup – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors taup_l of Householder scalars associated with matrices P_l.
strideP – [in] rocblas_stride. Stride from the start of one vector taup_l to the next one taup_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>gebrd_strided_batched()#
-
rocblas_status rocsolver_zgebrd_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tauq, const rocblas_stride strideQ, rocblas_double_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgebrd_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tauq, const rocblas_stride strideQ, rocblas_float_complex *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgebrd_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tauq, const rocblas_stride strideQ, double *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgebrd_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tauq, const rocblas_stride strideQ, float *taup, const rocblas_stride strideP, const rocblas_int batch_count)#
GEBRD_STRIDED_BATCHED computes the bidiagonal form of a batch of general m-by-n matrices.
(This is the blocked version of the algorithm).
For each instance in the batch, the bidiagonal form is given by:
\[ B_l^{} = Q_l' A_l^{} P_l^{} \]where \(B_l\) is upper bidiagonal if m >= n and lower bidiagonal if m < n, and \(Q_l\) and \(P_l\) are orthogonal/unitary matrices represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n)\: \text{and} \: P_l = G_l(1)G_l(2)\cdots G_l(n-1), & \: \text{if}\: m >= n, \:\text{or}\\ Q_l = H_l(1)H_l(2)\cdots H_l(m-1)\: \text{and} \: P_l = G_l(1)G_l(2)\cdots G_l(m), & \: \text{if}\: m < n. \end{array} \end{split}\]Each Householder matrix \(H_l(i)\) and \(G_l(i)\) is given by
\[\begin{split} \begin{array}{cl} H_l^{}(i) = I - \text{tauq}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}', & \: \text{and}\\ G_l^{}(i) = I - \text{taup}_l^{}[i] \cdot u_{l_i}' u_{l_i}^{}. \end{array} \end{split}\]If m >= n, the first i-1 elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\); while the first i elements of the Householder vector \(u_{l_i}\) are zero, and \(u_{l_i}[i+1] = 1\). If m < n, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\); while the first i-1 elements of the Householder vector \(u_{l_i}\) are zero, and \(u_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
m – [in] rocblas_int. m >= 0. The number of rows of all the matrices A_l in the batch.
n – [in] rocblas_int. n >= 0. The number of columns of all the matrices A_l in the batch.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the m-by-n matrices A_l to be factored. On exit, the elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n) contain the bidiagonal form B_l. If m >= n, the elements below the diagonal are the last m - i elements of Householder vector v_(l_i), and the elements above the superdiagonal are the last n - i - 1 elements of Householder vector u_(l_i). If m < n, the elements below the subdiagonal are the last m - i - 1 elements of Householder vector v_(l_i), and the elements above the diagonal are the last n - i elements of Householder vector u_(l_i).
lda – [in] rocblas_int. lda >= m. Specifies the leading dimension of matrices A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
D – [out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of B_l.
strideD – [in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= min(m,n).
E – [out] pointer to real type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of B_l.
strideE – [in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)-1.
tauq – [out] pointer to type. Array on the GPU (the size depends on the value of strideQ). Contains the vectors tauq_l of Householder scalars associated with matrices Q_l.
strideQ – [in] rocblas_stride. Stride from the start of one vector tauq_l to the next one tauq_(l+1). There is no restriction for the value of strideQ. Normal use is strideQ >= min(m,n).
taup – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors taup_l of Householder scalars associated with matrices P_l.
strideP – [in] rocblas_stride. Stride from the start of one vector taup_l to the next one taup_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>sytd2()#
-
rocblas_status rocsolver_dsytd2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, double *tau)#
-
rocblas_status rocsolver_ssytd2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, float *tau)#
SYTD2 computes the tridiagonal form of a real symmetric matrix A.
(This is the unblocked version of the algorithm).
The tridiagonal form is given by:
\[ T = Q' A Q \]where T is symmetric tridiagonal and Q is an orthogonal matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q = H(1)H(2)\cdots H(n-1) & \: \text{if uplo indicates lower, or}\\ Q = H(n-1)H(n-2)\cdots H(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H(i)\) is given by
\[ H(i) = I - \text{tau}[i] \cdot v_i^{} v_i' \]where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_i stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_i stored as columns.
lda – [in] rocblas_int. lda >= n. The leading dimension of A.
D – [out] pointer to type. Array on the GPU of dimension n. The diagonal elements of T.
E – [out] pointer to type. Array on the GPU of dimension n-1. The off-diagonal elements of T.
tau – [out] pointer to type. Array on the GPU of dimension n-1. The Householder scalars.
rocsolver_<type>sytd2_batched()#
-
rocblas_status rocsolver_dsytd2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_ssytd2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
SYTD2_BATCHED computes the tridiagonal form of a batch of real symmetric matrices A_l.
(This is the unblocked version of the algorithm).
The tridiagonal form of \(A_l\) is given by:
\[ T_l^{} = Q_l' A_l^{} Q_l^{} \]where \(T_l\) is symmetric tridiagonal and \(Q_l\) is an orthogonal matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n-1) & \: \text{if uplo indicates lower, or}\\ Q_l = H_l(n-1)H_l(n-2)\cdots H_l(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{tau}_l^{}[i] \cdot v_{l_i}^{} v_{l_i}' \]where \(\text{tau}_l[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of the matrices A_l.
A – [inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_l; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(l_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_l; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(l_i) stored as columns.
lda – [in] rocblas_int. lda >= n. The leading dimension of A_l.
D – [out] pointer to type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of T_l.
strideD – [in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
E – [out] pointer to type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of T_l.
strideE – [in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.
tau – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors tau_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector tau_l to the next one tau_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= n-1.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>sytd2_strided_batched()#
-
rocblas_status rocsolver_dsytd2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_ssytd2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
SYTD2_STRIDED_BATCHED computes the tridiagonal form of a batch of real symmetric matrices A_l.
(This is the unblocked version of the algorithm).
The tridiagonal form of \(A_l\) is given by:
\[ T_l^{} = Q_l' A_l^{} Q_l^{} \]where \(T_l\) is symmetric tridiagonal and \(Q_l\) is an orthogonal matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n-1) & \: \text{if uplo indicates lower, or}\\ Q_l = H_l(n-1)H_l(n-2)\cdots H_l(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{tau}_l[i] \cdot v_{l_i}^{} v_{l_i}' \]where \(\text{tau}_l[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of the matrices A_l.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_l; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(l_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_l; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(l_i) stored as columns.
lda – [in] rocblas_int. lda >= n. The leading dimension of A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
D – [out] pointer to type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of T_l.
strideD – [in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
E – [out] pointer to type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of T_l.
strideE – [in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.
tau – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors tau_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector tau_l to the next one tau_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= n-1.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>hetd2()#
-
rocblas_status rocsolver_zhetd2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tau)#
-
rocblas_status rocsolver_chetd2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tau)#
HETD2 computes the tridiagonal form of a complex hermitian matrix A.
(This is the unblocked version of the algorithm).
The tridiagonal form is given by:
\[ T = Q' A Q \]where T is hermitian tridiagonal and Q is an unitary matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q = H(1)H(2)\cdots H(n-1) & \: \text{if uplo indicates lower, or}\\ Q = H(n-1)H(n-2)\cdots H(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H(i)\) is given by
\[ H(i) = I - \text{tau}[i] \cdot v_i^{} v_i' \]where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the upper or lower part of the hermitian matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T; the elements above the superdiagonal contain the first i-1 elements of the Householders vector v_i stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_i stored as columns.
lda – [in] rocblas_int. lda >= n. The leading dimension of A.
D – [out] pointer to real type. Array on the GPU of dimension n. The diagonal elements of T.
E – [out] pointer to real type. Array on the GPU of dimension n-1. The off-diagonal elements of T.
tau – [out] pointer to type. Array on the GPU of dimension n-1. The Householder scalars.
rocsolver_<type>hetd2_batched()#
-
rocblas_status rocsolver_zhetd2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_chetd2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
HETD2_BATCHED computes the tridiagonal form of a batch of complex hermitian matrices A_l.
(This is the unblocked version of the algorithm).
The tridiagonal form of \(A_l\) is given by:
\[ T_l^{} = Q_l' A_l^{} Q_l^{} \]where \(T_l\) is Hermitian tridiagonal and \(Q_l\) is a unitary matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n-1) & \: \text{if uplo indicates lower, or}\\ Q_l = H_l(n-1)H_l(n-2)\cdots H_l(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{tau}_l[i] \cdot v_{l_i}^{} v_{l_i}' \]where \(\text{tau}_l[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the upper or lower part of the hermitian matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of the matrices A_l.
A – [inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_l; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(l_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_l; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(l_i) stored as columns.
lda – [in] rocblas_int. lda >= n. The leading dimension of A_l.
D – [out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of T_l.
strideD – [in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
E – [out] pointer to real type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of T_l.
strideE – [in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.
tau – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors tau_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector tau_l to the next one tau_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= n-1.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>hetd2_strided_batched()#
-
rocblas_status rocsolver_zhetd2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_chetd2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
HETD2_STRIDED_BATCHED computes the tridiagonal form of a batch of complex hermitian matrices A_l.
(This is the unblocked version of the algorithm).
The tridiagonal form of \(A_l\) is given by:
\[ T_l^{} = Q_l' A_l^{} Q_l^{} \]where \(T_l\) is Hermitian tridiagonal and \(Q_l\) is a unitary matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n-1) & \: \text{if uplo indicates lower, or}\\ Q_l = H_l(n-1)H_l(n-2)\cdots H_l(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{tau}_l[i] \cdot v_{l_i}^{} v_{l_i}' \]where \(\text{tau}_l[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the upper or lower part of the hermitian matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of the matrices A_l.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_l; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(l_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_l; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(l_i) stored as columns.
lda – [in] rocblas_int. lda >= n. The leading dimension of A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
D – [out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of T_l.
strideD – [in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
E – [out] pointer to real type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of T_l.
strideE – [in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.
tau – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors tau_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector tau_l to the next one tau_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= n-1.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>sytrd()#
-
rocblas_status rocsolver_dsytrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *D, double *E, double *tau)#
-
rocblas_status rocsolver_ssytrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *D, float *E, float *tau)#
SYTRD computes the tridiagonal form of a real symmetric matrix A.
(This is the blocked version of the algorithm).
The tridiagonal form is given by:
\[ T = Q' A Q \]where T is symmetric tridiagonal and Q is an orthogonal matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q = H(1)H(2)\cdots H_(n-1) & \: \text{if uplo indicates lower, or}\\ Q = H(n-1)H(n-2)\cdots H(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H(i)\) is given by
\[ H(i) = I - \text{tau}[i] \cdot v_i^{} v_i' \]where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_i stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_i stored as columns.
lda – [in] rocblas_int. lda >= n. The leading dimension of A.
D – [out] pointer to type. Array on the GPU of dimension n. The diagonal elements of T.
E – [out] pointer to type. Array on the GPU of dimension n-1. The off-diagonal elements of T.
tau – [out] pointer to type. Array on the GPU of dimension n-1. The Householder scalars.
rocsolver_<type>sytrd_batched()#
-
rocblas_status rocsolver_dsytrd_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_ssytrd_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
SYTRD_BATCHED computes the tridiagonal form of a batch of real symmetric matrices A_l.
(This is the blocked version of the algorithm).
The tridiagonal form of \(A_l\) is given by:
\[ T_l^{} = Q_l' A_l^{} Q_l^{} \]where \(T_l\) is symmetric tridiagonal and \(Q_l\) is an orthogonal matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n-1) & \: \text{if uplo indicates lower, or}\\ Q_l = H_l(n-1)H_l(n-2)\cdots H_l(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{tau}_l[i] \cdot v_{l_i}^{} v_{l_i}' \]where \(\text{tau}_l[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of the matrices A_l.
A – [inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_l; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(l_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_l; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(l_i) stored as columns.
lda – [in] rocblas_int. lda >= n. The leading dimension of A_l.
D – [out] pointer to type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of T_l.
strideD – [in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
E – [out] pointer to type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of T_l.
strideE – [in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.
tau – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors tau_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector tau_l to the next one tau_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= n-1.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>sytrd_strided_batched()#
-
rocblas_status rocsolver_dsytrd_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, double *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_ssytrd_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, float *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
SYTRD_STRIDED_BATCHED computes the tridiagonal form of a batch of real symmetric matrices A_l.
(This is the blocked version of the algorithm).
The tridiagonal form of \(A_l\) is given by:
\[ T_l^{} = Q_l' A_l^{} Q_l^{} \]where \(T_l\) is symmetric tridiagonal and \(Q_l\) is an orthogonal matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n-1) & \: \text{if uplo indicates lower, or}\\ Q_l = H_l(n-1)H_l(n-2)\cdots H_l(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{tau}_l[i] \cdot v_{l_i}^{} v_{l_i}' \]where \(\text{tau}_l[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the upper or lower part of the symmetric matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of the matrices A_l.
A – [inout] pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_l to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_l; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(l_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_l; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(l_i) stored as columns.
lda – [in] rocblas_int. lda >= n. The leading dimension of A_l.
strideA – [in] rocblas_stride. Stride from the start of one matrix A_l to the next one A_(l+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
D – [out] pointer to type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of T_l.
strideD – [in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
E – [out] pointer to type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of T_l.
strideE – [in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.
tau – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors tau_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector tau_l to the next one tau_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= n-1.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>hetrd()#
-
rocblas_status rocsolver_zhetrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tau)#
-
rocblas_status rocsolver_chetrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tau)#
HETRD computes the tridiagonal form of a complex hermitian matrix A.
(This is the blocked version of the algorithm).
The tridiagonal form is given by:
\[ T = Q' A Q \]where T is hermitian tridiagonal and Q is an unitary matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q = H(1)H(2)\cdots H(n-1) & \: \text{if uplo indicates lower, or}\\ Q = H(n-1)H(n-2)\cdots H(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H(i)\) is given by
\[ H(i) = I - \text{tau}[i] \cdot v_i^{} v_i' \]where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the upper or lower part of the hermitian matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of the matrix A.
A – [inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_i stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_i stored as columns.
lda – [in] rocblas_int. lda >= n. The leading dimension of A.
D – [out] pointer to real type. Array on the GPU of dimension n. The diagonal elements of T.
E – [out] pointer to real type. Array on the GPU of dimension n-1. The off-diagonal elements of T.
tau – [out] pointer to type. Array on the GPU of dimension n-1. The Householder scalars.
rocsolver_<type>hetrd_batched()#
-
rocblas_status rocsolver_zhetrd_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
-
rocblas_status rocsolver_chetrd_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
HETRD_BATCHED computes the tridiagonal form of a batch of complex hermitian matrices A_l.
(This is the blocked version of the algorithm).
The tridiagonal form of \(A_l\) is given by:
\[ T_l^{} = Q_l' A_l^{} Q_l^{} \]where \(T_l\) is Hermitian tridiagonal and \(Q_l\) is a unitary matrix represented as the product of Householder matrices
\[\begin{split} \begin{array}{cl} Q_l = H_l(1)H_l(2)\cdots H_l(n-1) & \: \text{if uplo indicates lower, or}\\ Q_l = H_l(n-1)H_l(n-2)\cdots H_l(1) & \: \text{if uplo indicates upper.} \end{array} \end{split}\]Each Householder matrix \(H_l(i)\) is given by
\[ H_l^{}(i) = I - \text{tau}_l[i] \cdot v_{l_i}^{} v_{l_i}' \]where \(\text{tau}_l[i]\) is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i+1] = 1\). If uplo indicates upper, the last n-i elements of the Householder vector \(v_{l_i}\) are zero, and \(v_{l_i}[i] = 1\).
- Parameters:
handle – [in] rocblas_handle.
uplo – [in] rocblas_fill. Specifies whether the upper or lower part of the hermitian matrix A_l is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_l is not used.
n – [in] rocblas_int. n >= 0. The number of rows and columns of the matrices A_l.
A – [inout] array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n. On entry, the matrices A_l to be factored. On exit, if upper, then the elements on the diagonal and superdiagonal contain the tridiagonal form T_l; the elements above the superdiagonal contain the first i-1 elements of the Householder vectors v_(l_i) stored as columns. If lower, then the elements on the diagonal and subdiagonal contain the tridiagonal form T_l; the elements below the subdiagonal contain the last n-i-1 elements of the Householder vectors v_(l_i) stored as columns.
lda – [in] rocblas_int. lda >= n. The leading dimension of A_l.
D – [out] pointer to real type. Array on the GPU (the size depends on the value of strideD). The diagonal elements of T_l.
strideD – [in] rocblas_stride. Stride from the start of one vector D_l to the next one D_(l+1). There is no restriction for the value of strideD. Normal use case is strideD >= n.
E – [out] pointer to real type. Array on the GPU (the size depends on the value of strideE). The off-diagonal elements of T_l.
strideE – [in] rocblas_stride. Stride from the start of one vector E_l to the next one E_(l+1). There is no restriction for the value of strideE. Normal use case is strideE >= n-1.
tau – [out] pointer to type. Array on the GPU (the size depends on the value of strideP). Contains the vectors tau_l of corresponding Householder scalars.
strideP – [in] rocblas_stride. Stride from the start of one vector tau_l to the next one tau_(l+1). There is no restriction for the value of strideP. Normal use is strideP >= n-1.
batch_count – [in] rocblas_int. batch_count >= 0. Number of matrices in the batch.
rocsolver_<type>hetrd_strided_batched()#
-
rocblas_status rocsolver_zhetrd_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, double *D, const rocblas_stride strideD, double *E, const rocblas_stride strideE, rocblas_double_complex *tau, const rocblas_stride strideP, const rocblas_int batch_count)#
- rocblas_status rocsolver_chetrd_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, float *D, const rocblas_stride strideD, float *E, const rocblas_stride strideE, rocblas_float_complex *tau,