LAPACK Functions

Contents

3.3. LAPACK Functions#

LAPACK routines solve complex Numerical Linear Algebra problems. These functions are organized in the following categories:

Note

Throughout the APIs’ descriptions, we use the following notations:

  • x[i] stands for the i-th element of vector x, while A[i,j] represents the element in the i-th row and j-th column of matrix A. Indices are 1-based, i.e. x[1] is the first element of x.

  • If X is a real vector or matrix, \(X^T\) indicates its transpose; if X is complex, then \(X^H\) represents its conjugate transpose. When X could be real or complex, we use X’ to indicate X transposed or X conjugate transposed, accordingly.

  • x_i \(=x_i\); we sometimes use both notations, \(x_i\) when displaying mathematical equations, and x_i in the text describing the function parameters.

3.3.1. Triangular factorizations#

3.3.1.1. rocsolver_<type>potf2()#

rocblas_status rocsolver_zpotf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_cpotf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_dpotf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_spotf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)#

POTF2 computes the Cholesky factorization of a real symmetric (complex Hermitian) positive definite matrix A.

(This is the unblocked version of the algorithm).

The factorization has the form:

\[\begin{split} \begin{array}{cl} A = U'U & \: \text{if uplo is upper, or}\\ A = LL' & \: \text{if uplo is lower.} \end{array} \end{split}\]

U is an upper triangular matrix and L is lower triangular.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in]

    rocblas_fill.

    Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of matrix A.

  • A[inout]

    pointer to type. Array on the GPU of dimension lda*n.

    On entry, the matrix A to be factored. On exit, the lower or upper triangular factor.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of A.

  • info[out]

    pointer to a rocblas_int on the GPU.

    If info = 0, successful factorization of matrix A. If info = i > 0, the leading minor of order i of A is not positive definite. The factorization stopped at this point.

3.3.1.2. rocsolver_<type>potf2_batched()#

rocblas_status rocsolver_zpotf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cpotf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dpotf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_spotf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

POTF2_BATCHED computes the Cholesky factorization of a batch of real symmetric (complex Hermitian) positive definite matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form:

\[\begin{split} \begin{array}{cl} A_j = U_j'U_j & \: \text{if uplo is upper, or}\\ A_j = L_jL_j' & \: \text{if uplo is lower.} \end{array} \end{split}\]

\(U_j\) is an upper triangular matrix and \(L_j\) is lower triangular.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in]

    rocblas_fill.

    Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of matrix A_j.

  • A[inout]

    array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the matrices A_j to be factored. On exit, the upper or lower triangular factors.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of A_j.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful factorization of matrix A_j. If info[j] = i > 0, the leading minor of order i of A_j is not positive definite. The j-th factorization stopped at this point.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.1.3. rocsolver_<type>potf2_strided_batched()#

rocblas_status rocsolver_zpotf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cpotf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dpotf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_spotf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

POTF2_STRIDED_BATCHED computes the Cholesky factorization of a batch of real symmetric (complex Hermitian) positive definite matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form:

\[\begin{split} \begin{array}{cl} A_j = U_j'U_j & \: \text{if uplo is upper, or}\\ A_j = L_jL_j' & \: \text{if uplo is lower.} \end{array} \end{split}\]

\(U_j\) is an upper triangular matrix and \(L_j\) is lower triangular.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in]

    rocblas_fill.

    Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of matrix A_j.

  • A[inout]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the matrices A_j to be factored. On exit, the upper or lower triangular factors.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of A_j.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful factorization of matrix A_j. If info[j] = i > 0, the leading minor of order i of A_j is not positive definite. The j-th factorization stopped at this point.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.1.4. rocsolver_<type>potrf()#

rocblas_status rocsolver_zpotrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_cpotrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_dpotrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_spotrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)#

POTRF computes the Cholesky factorization of a real symmetric (complex Hermitian) positive definite matrix A.

(This is the blocked version of the algorithm).

The factorization has the form:

\[\begin{split} \begin{array}{cl} A = U'U & \: \text{if uplo is upper, or}\\ A = LL' & \: \text{if uplo is lower.} \end{array} \end{split}\]

U is an upper triangular matrix and L is lower triangular.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in]

    rocblas_fill.

    Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of matrix A.

  • A[inout]

    pointer to type. Array on the GPU of dimension lda*n.

    On entry, the matrix A to be factored. On exit, the lower or upper triangular factor.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of A.

  • info[out]

    pointer to a rocblas_int on the GPU.

    If info = 0, successful factorization of matrix A. If info = i > 0, the leading minor of order i of A is not positive definite. The factorization stopped at this point.

3.3.1.5. rocsolver_<type>potrf_batched()#

rocblas_status rocsolver_zpotrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cpotrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dpotrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_spotrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

POTRF_BATCHED computes the Cholesky factorization of a batch of real symmetric (complex Hermitian) positive definite matrices.

(This is the blocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form:

\[\begin{split} \begin{array}{cl} A_j = U_j'U_j & \: \text{if uplo is upper, or}\\ A_j = L_jL_j' & \: \text{if uplo is lower.} \end{array} \end{split}\]

\(U_j\) is an upper triangular matrix and \(L_j\) is lower triangular.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in]

    rocblas_fill.

    Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of matrix A_j.

  • A[inout]

    array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the matrices A_j to be factored. On exit, the upper or lower triangular factors.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of A_j.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful factorization of matrix A_j. If info[j] = i > 0, the leading minor of order i of A_j is not positive definite. The j-th factorization stopped at this point.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.1.6. rocsolver_<type>potrf_strided_batched()#

rocblas_status rocsolver_zpotrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cpotrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dpotrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_spotrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

POTRF_STRIDED_BATCHED computes the Cholesky factorization of a batch of real symmetric (complex Hermitian) positive definite matrices.

(This is the blocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form:

\[\begin{split} \begin{array}{cl} A_j = U_j'U_j & \: \text{if uplo is upper, or}\\ A_j = L_jL_j' & \: \text{if uplo is lower.} \end{array} \end{split}\]

\(U_j\) is an upper triangular matrix and \(L_j\) is lower triangular.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in]

    rocblas_fill.

    Specifies whether the factorization is upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of matrix A_j.

  • A[inout]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the matrices A_j to be factored. On exit, the upper or lower triangular factors.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of A_j.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful factorization of matrix A_j. If info[j] = i > 0, the leading minor of order i of A_j is not positive definite. The j-th factorization stopped at this point.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.1.7. rocsolver_<type>getf2()#

rocblas_status rocsolver_zgetf2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_cgetf2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_dgetf2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_sgetf2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#

GETF2 computes the LU factorization of a general m-by-n matrix A using partial pivoting with row interchanges.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization has the form

\[ A = PLU \]

where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of the matrix A.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of the matrix A.

  • A[inout]

    pointer to type. Array on the GPU of dimension lda*n.

    On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • ipiv[out]

    pointer to rocblas_int. Array on the GPU of dimension min(m,n).

    The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= i <= min(m,n), the row i of the matrix was interchanged with row ipiv[i]. Matrix P of the factorization can be derived from ipiv.

  • info[out]

    pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.

3.3.1.8. rocsolver_<type>getf2_batched()#

rocblas_status rocsolver_zgetf2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetf2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetf2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetf2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

GETF2_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix \(A_j\) in the batch has the form

\[ A_j = P_jL_jU_j \]

where \(P_j\) is a permutation matrix, \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all matrices A_j in the batch.

  • A[inout]

    array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorizations. The unit diagonal elements of L_j are not stored.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • ipiv[out]

    pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors of pivot indices ipiv_j (corresponding to A_j). Dimension of ipiv_j is min(m,n). Elements of ipiv_j are 1-based indices. For each instance A_j in the batch and for 1 <= i <= min(m,n), the row i of the matrix A_j was interchanged with row ipiv_j[i]. Matrix P_j of the factorization can be derived from ipiv_j.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.1.9. rocsolver_<type>getf2_strided_batched()#

rocblas_status rocsolver_zgetf2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetf2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetf2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetf2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

GETF2_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix \(A_j\) in the batch has the form

\[ A_j = P_jL_jU_j \]

where \(P_j\) is a permutation matrix, \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all matrices A_j in the batch.

  • A[inout]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorization. The unit diagonal elements of L_j are not stored.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n

  • ipiv[out]

    pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors of pivots indices ipiv_j (corresponding to A_j). Dimension of ipiv_j is min(m,n). Elements of ipiv_j are 1-based indices. For each instance A_j in the batch and for 1 <= i <= min(m,n), the row i of the matrix A_j was interchanged with row ipiv_j[i]. Matrix P_j of the factorization can be derived from ipiv_j.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.1.10. rocsolver_<type>getrf()#

rocblas_status rocsolver_zgetrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_cgetrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_dgetrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_sgetrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#

GETRF computes the LU factorization of a general m-by-n matrix A using partial pivoting with row interchanges.

(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization has the form

\[ A = PLU \]

where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of the matrix A.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of the matrix A.

  • A[inout]

    pointer to type. Array on the GPU of dimension lda*n.

    On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • ipiv[out]

    pointer to rocblas_int. Array on the GPU of dimension min(m,n).

    The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= i <= min(m,n), the row i of the matrix was interchanged with row ipiv[i]. Matrix P of the factorization can be derived from ipiv.

  • info[out]

    pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.

3.3.1.11. rocsolver_<type>getrf_batched()#

rocblas_status rocsolver_zgetrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

GETRF_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.

(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix \(A_j\) in the batch has the form

\[ A_j = P_jL_jU_j \]

where \(P_j\) is a permutation matrix, \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all matrices A_j in the batch.

  • A[inout]

    array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorizations. The unit diagonal elements of L_j are not stored.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • ipiv[out]

    pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors of pivot indices ipiv_j (corresponding to A_j). Dimension of ipiv_j is min(m,n). Elements of ipiv_j are 1-based indices. For each instance A_j in the batch and for 1 <= i <= min(m,n), the row i of the matrix A_j was interchanged with row ipiv_j[i]. Matrix P_j of the factorization can be derived from ipiv_j.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.1.12. rocsolver_<type>getrf_strided_batched()#

rocblas_status rocsolver_zgetrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

GETRF_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices using partial pivoting with row interchanges.

(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix \(A_j\) in the batch has the form

\[ A_j = P_jL_jU_j \]

where \(P_j\) is a permutation matrix, \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all matrices A_j in the batch.

  • A[inout]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorization. The unit diagonal elements of L_j are not stored.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n

  • ipiv[out]

    pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors of pivots indices ipiv_j (corresponding to A_j). Dimension of ipiv_j is min(m,n). Elements of ipiv_j are 1-based indices. For each instance A_j in the batch and for 1 <= i <= min(m,n), the row i of the matrix A_j was interchanged with row ipiv_j[i]. Matrix P_j of the factorization can be derived from ipiv_j.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= min(m,n).

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.1.13. rocsolver_<type>sytf2()#

rocblas_status rocsolver_zsytf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_csytf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_dsytf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_ssytf2(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#

SYTF2 computes the factorization of a symmetric indefinite matrix \(A\) using Bunch-Kaufman diagonal pivoting.

(This is the unblocked version of the algorithm).

The factorization has the form

\[\begin{split} \begin{array}{cl} A = U D U^T & \: \text{or}\\ A = L D L^T & \end{array} \end{split}\]

where \(U\) or \(L\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D(k)\).

Specifically, \(U\) and \(L\) are computed as

\[\begin{split} \begin{array}{cl} U = P(n) U(n) \cdots P(k) U(k) \cdots & \: \text{and}\\ L = P(1) L(1) \cdots P(k) L(k) \cdots & \end{array} \end{split}\]

where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D(k)\), and \(P(k)\) is a permutation matrix defined by \(ipiv[k]\). If we let \(s\) denote the order of block \(D(k)\), then \(U(k)\) and \(L(k)\) are unit upper/lower triangular matrices defined as

\[\begin{split} U(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]

and

\[\begin{split} L(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]

If \(s = 1\), then \(D(k)\) is stored in \(A[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A\). If \(s = 2\) and uplo is upper, then \(D(k)\) is stored in \(A[k-1,k-1]\), \(A[k-1,k]\), and \(A[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A\). If \(s = 2\) and uplo is lower, then \(D(k)\) is stored in \(A[k,k]\), \(A[k+1,k]\), and \(A[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in]

    rocblas_fill.

    Specifies whether the upper or lower part of the matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of the matrix A.

  • A[inout]

    pointer to type. Array on the GPU of dimension lda*n.

    On entry, the symmetric matrix A to be factored. On exit, the block diagonal matrix D and the multipliers needed to compute U or L.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of A.

  • ipiv[out]

    pointer to rocblas_int. Array on the GPU of dimension n.

    The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= k <= n, if ipiv[k] > 0 then rows and columns k and ipiv[k] were interchanged and D[k,k] is a 1-by-1 diagonal block. If, instead, ipiv[k] = ipiv[k-1] < 0 and uplo is upper (or ipiv[k] = ipiv[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv[k] (or rows and columns k+1 and -ipiv[k]) were interchanged and D[k-1,k-1] to D[k,k] (or D[k,k] to D[k+1,k+1]) is a 2-by-2 diagonal block.

  • info[out]

    pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = i > 0, D is singular. D[i,i] is the first diagonal zero.

3.3.1.14. rocsolver_<type>sytf2_batched()#

rocblas_status rocsolver_zsytf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_csytf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dsytf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssytf2_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

SYTF2_BATCHED computes the factorization of a batch of symmetric indefinite matrices using Bunch-Kaufman diagonal pivoting.

(This is the unblocked version of the algorithm).

The factorization has the form

\[\begin{split} \begin{array}{cl} A_j = U_j D_j U_j^T & \: \text{or}\\ A_j = L_j D_j L_j^T & \end{array} \end{split}\]

where \(U_j\) or \(L_j\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D_j\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_j(k)\).

Specifically, \(U_j\) and \(L_j\) are computed as

\[\begin{split} \begin{array}{cl} U_j = P_j(n) U_j(n) \cdots P_j(k) U_j(k) \cdots & \: \text{and}\\ L_j = P_j(1) L_j(1) \cdots P_j(k) L_j(k) \cdots & \end{array} \end{split}\]

where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_j(k)\), and \(P_j(k)\) is a permutation matrix defined by \(ipiv_j[k]\). If we let \(s\) denote the order of block \(D_j(k)\), then \(U_j(k)\) and \(L_j(k)\) are unit upper/lower triangular matrices defined as

\[\begin{split} U_j(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]

and

\[\begin{split} L_j(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]

If \(s = 1\), then \(D_j(k)\) is stored in \(A_j[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A_j\). If \(s = 2\) and uplo is upper, then \(D_j(k)\) is stored in \(A_j[k-1,k-1]\), \(A_j[k-1,k]\), and \(A_j[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A_j\). If \(s = 2\) and uplo is lower, then \(D_j(k)\) is stored in \(A_j[k,k]\), \(A_j[k+1,k]\), and \(A_j[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A_j\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in]

    rocblas_fill.

    Specifies whether the upper or lower part of the matrices A_j are stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_j is not used.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of all matrices A_j in the batch.

  • A[inout]

    array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the symmetric matrices A_j to be factored. On exit, the block diagonal matrices D_j and the multipliers needed to compute U_j or L_j.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of matrices A_j.

  • ipiv[out]

    pointer to rocblas_int. Array on the GPU of dimension n.

    The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= k <= n, if ipiv_j[k] > 0 then rows and columns k and ipiv_j[k] were interchanged and D_j[k,k] is a 1-by-1 diagonal block. If, instead, ipiv_j[k] = ipiv_j[k-1] < 0 and uplo is upper (or ipiv_j[k] = ipiv_j[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv_j[k] (or rows and columns k+1 and -ipiv_j[k]) were interchanged and D_j[k-1,k-1] to D_j[k,k] (or D_j[k,k] to D_j[k+1,k+1]) is a 2-by-2 diagonal block.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, D_j is singular. D_j[i,i] is the first diagonal zero.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.1.15. rocsolver_<type>sytf2_strided_batched()#

rocblas_status rocsolver_zsytf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_csytf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dsytf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssytf2_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

SYTF2_STRIDED_BATCHED computes the factorization of a batch of symmetric indefinite matrices using Bunch-Kaufman diagonal pivoting.

(This is the unblocked version of the algorithm).

The factorization has the form

\[\begin{split} \begin{array}{cl} A_j = U_j D_j U_j^T & \: \text{or}\\ A_j = L_j D_j L_j^T & \end{array} \end{split}\]

where \(U_j\) or \(L_j\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D_j\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_j(k)\).

Specifically, \(U_j\) and \(L_j\) are computed as

\[\begin{split} \begin{array}{cl} U_j = P_j(n) U_j(n) \cdots P_j(k) U_j(k) \cdots & \: \text{and}\\ L_j = P_j(1) L_j(1) \cdots P_j(k) L_j(k) \cdots & \end{array} \end{split}\]

where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_j(k)\), and \(P_j(k)\) is a permutation matrix defined by \(ipiv_j[k]\). If we let \(s\) denote the order of block \(D_j(k)\), then \(U_j(k)\) and \(L_j(k)\) are unit upper/lower triangular matrices defined as

\[\begin{split} U_j(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]

and

\[\begin{split} L_j(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]

If \(s = 1\), then \(D_j(k)\) is stored in \(A_j[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A_j\). If \(s = 2\) and uplo is upper, then \(D_j(k)\) is stored in \(A_j[k-1,k-1]\), \(A_j[k-1,k]\), and \(A_j[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A_j\). If \(s = 2\) and uplo is lower, then \(D_j(k)\) is stored in \(A_j[k,k]\), \(A_j[k+1,k]\), and \(A_j[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A_j\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in]

    rocblas_fill.

    Specifies whether the upper or lower part of the matrices A_j are stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_j is not used.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of all matrices A_j in the batch.

  • A[inout]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the symmetric matrices A_j to be factored. On exit, the block diagonal matrices D_j and the multipliers needed to compute U_j or L_j.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of matrices A_j.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n

  • ipiv[out]

    pointer to rocblas_int. Array on the GPU of dimension n.

    The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= k <= n, if ipiv_j[k] > 0 then rows and columns k and ipiv_j[k] were interchanged and D_j[k,k] is a 1-by-1 diagonal block. If, instead, ipiv_j[k] = ipiv_j[k-1] < 0 and uplo is upper (or ipiv_j[k] = ipiv_j[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv_j[k] (or rows and columns k+1 and -ipiv_j[k]) were interchanged and D_j[k-1,k-1] to D_j[k,k] (or D_j[k,k] to D_j[k+1,k+1]) is a 2-by-2 diagonal block.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, D_j is singular. D_j[i,i] is the first diagonal zero.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.1.16. rocsolver_<type>sytrf()#

rocblas_status rocsolver_zsytrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_csytrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_dsytrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_ssytrf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#

SYTRF computes the factorization of a symmetric indefinite matrix \(A\) using Bunch-Kaufman diagonal pivoting.

(This is the blocked version of the algorithm).

The factorization has the form

\[\begin{split} \begin{array}{cl} A = U D U^T & \: \text{or}\\ A = L D L^T & \end{array} \end{split}\]

where \(U\) or \(L\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D(k)\).

Specifically, \(U\) and \(L\) are computed as

\[\begin{split} \begin{array}{cl} U = P(n) U(n) \cdots P(k) U(k) \cdots & \: \text{and}\\ L = P(1) L(1) \cdots P(k) L(k) \cdots & \end{array} \end{split}\]

where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D(k)\), and \(P(k)\) is a permutation matrix defined by \(ipiv[k]\). If we let \(s\) denote the order of block \(D(k)\), then \(U(k)\) and \(L(k)\) are unit upper/lower triangular matrices defined as

\[\begin{split} U(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]

and

\[\begin{split} L(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]

If \(s = 1\), then \(D(k)\) is stored in \(A[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A\). If \(s = 2\) and uplo is upper, then \(D(k)\) is stored in \(A[k-1,k-1]\), \(A[k-1,k]\), and \(A[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A\). If \(s = 2\) and uplo is lower, then \(D(k)\) is stored in \(A[k,k]\), \(A[k+1,k]\), and \(A[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in]

    rocblas_fill.

    Specifies whether the upper or lower part of the matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of the matrix A.

  • A[inout]

    pointer to type. Array on the GPU of dimension lda*n.

    On entry, the symmetric matrix A to be factored. On exit, the block diagonal matrix D and the multipliers needed to compute U or L.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of A.

  • ipiv[out]

    pointer to rocblas_int. Array on the GPU of dimension n.

    The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= k <= n, if ipiv[k] > 0 then rows and columns k and ipiv[k] were interchanged and D[k,k] is a 1-by-1 diagonal block. If, instead, ipiv[k] = ipiv[k-1] < 0 and uplo is upper (or ipiv[k] = ipiv[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv[k] (or rows and columns k+1 and -ipiv[k]) were interchanged and D[k-1,k-1] to D[k,k] (or D[k,k] to D[k+1,k+1]) is a 2-by-2 diagonal block.

  • info[out]

    pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = i > 0, D is singular. D[i,i] is the first diagonal zero.

3.3.1.17. rocsolver_<type>sytrf_batched()#

rocblas_status rocsolver_zsytrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_csytrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dsytrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssytrf_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

SYTRF_BATCHED computes the factorization of a batch of symmetric indefinite matrices using Bunch-Kaufman diagonal pivoting.

(This is the blocked version of the algorithm).

The factorization has the form

\[\begin{split} \begin{array}{cl} A_j = U_j D_j U_j^T & \: \text{or}\\ A_j = L_j D_j L_j^T & \end{array} \end{split}\]

where \(U_j\) or \(L_j\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D_j\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_j(k)\).

Specifically, \(U_j\) and \(L_j\) are computed as

\[\begin{split} \begin{array}{cl} U_j = P_j(n) U_j(n) \cdots P_j(k) U_j(k) \cdots & \: \text{and}\\ L_j = P_j(1) L_j(1) \cdots P_j(k) L_j(k) \cdots & \end{array} \end{split}\]

where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_j(k)\), and \(P_j(k)\) is a permutation matrix defined by \(ipiv_j[k]\). If we let \(s\) denote the order of block \(D_j(k)\), then \(U_j(k)\) and \(L_j(k)\) are unit upper/lower triangular matrices defined as

\[\begin{split} U_j(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]

and

\[\begin{split} L_j(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]

If \(s = 1\), then \(D_j(k)\) is stored in \(A_j[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A_j\). If \(s = 2\) and uplo is upper, then \(D_j(k)\) is stored in \(A_j[k-1,k-1]\), \(A_j[k-1,k]\), and \(A_j[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A_j\). If \(s = 2\) and uplo is lower, then \(D_j(k)\) is stored in \(A_j[k,k]\), \(A_j[k+1,k]\), and \(A_j[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A_j\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in]

    rocblas_fill.

    Specifies whether the upper or lower part of the matrices A_j are stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_j is not used.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of all matrices A_j in the batch.

  • A[inout]

    array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the symmetric matrices A_j to be factored. On exit, the block diagonal matrices D_j and the multipliers needed to compute U_j or L_j.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of matrices A_j.

  • ipiv[out]

    pointer to rocblas_int. Array on the GPU of dimension n.

    The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= k <= n, if ipiv_j[k] > 0 then rows and columns k and ipiv_j[k] were interchanged and D_j[k,k] is a 1-by-1 diagonal block. If, instead, ipiv_j[k] = ipiv_j[k-1] < 0 and uplo is upper (or ipiv_j[k] = ipiv_j[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv_j[k] (or rows and columns k+1 and -ipiv_j[k]) were interchanged and D_j[k-1,k-1] to D_j[k,k] (or D_j[k,k] to D_j[k+1,k+1]) is a 2-by-2 diagonal block.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, D_j is singular. D_j[i,i] is the first diagonal zero.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.1.18. rocsolver_<type>sytrf_strided_batched()#

rocblas_status rocsolver_zsytrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_csytrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dsytrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_ssytrf_strided_batched(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_int *info, const rocblas_int batch_count)#

SYTRF_STRIDED_BATCHED computes the factorization of a batch of symmetric indefinite matrices using Bunch-Kaufman diagonal pivoting.

(This is the blocked version of the algorithm).

The factorization has the form

\[\begin{split} \begin{array}{cl} A_j = U_j D_j U_j^T & \: \text{or}\\ A_j = L_j D_j L_j^T & \end{array} \end{split}\]

where \(U_j\) or \(L_j\) is a product of permutation and unit upper/lower triangular matrices (depending on the value of uplo), and \(D_j\) is a symmetric block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks \(D_j(k)\).

Specifically, \(U_j\) and \(L_j\) are computed as

\[\begin{split} \begin{array}{cl} U_j = P_j(n) U_j(n) \cdots P_j(k) U_j(k) \cdots & \: \text{and}\\ L_j = P_j(1) L_j(1) \cdots P_j(k) L_j(k) \cdots & \end{array} \end{split}\]

where \(k\) decreases from \(n\) to 1 (increases from 1 to \(n\)) in steps of 1 or 2, depending on the order of block \(D_j(k)\), and \(P_j(k)\) is a permutation matrix defined by \(ipiv_j[k]\). If we let \(s\) denote the order of block \(D_j(k)\), then \(U_j(k)\) and \(L_j(k)\) are unit upper/lower triangular matrices defined as

\[\begin{split} U_j(k) = \left[ \begin{array}{ccc} I_{k-s} & v & 0 \\ 0 & I_s & 0 \\ 0 & 0 & I_{n-k} \end{array} \right] \end{split}\]

and

\[\begin{split} L_j(k) = \left[ \begin{array}{ccc} I_{k-1} & 0 & 0 \\ 0 & I_s & 0 \\ 0 & v & I_{n-k-s+1} \end{array} \right]. \end{split}\]

If \(s = 1\), then \(D_j(k)\) is stored in \(A_j[k,k]\) and \(v\) is stored in the upper/lower part of column \(k\) of \(A_j\). If \(s = 2\) and uplo is upper, then \(D_j(k)\) is stored in \(A_j[k-1,k-1]\), \(A_j[k-1,k]\), and \(A_j[k,k]\), and \(v\) is stored in the upper parts of columns \(k-1\) and \(k\) of \(A_j\). If \(s = 2\) and uplo is lower, then \(D_j(k)\) is stored in \(A_j[k,k]\), \(A_j[k+1,k]\), and \(A_j[k+1,k+1]\), and \(v\) is stored in the lower parts of columns \(k\) and \(k+1\) of \(A_j\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in]

    rocblas_fill.

    Specifies whether the upper or lower part of the matrices A_j are stored. If uplo indicates lower (or upper), then the upper (or lower) part of A_j is not used.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of all matrices A_j in the batch.

  • A[inout]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the symmetric matrices A_j to be factored. On exit, the block diagonal matrices D_j and the multipliers needed to compute U_j or L_j.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of matrices A_j.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n

  • ipiv[out]

    pointer to rocblas_int. Array on the GPU of dimension n.

    The vector of pivot indices. Elements of ipiv are 1-based indices. For 1 <= k <= n, if ipiv_j[k] > 0 then rows and columns k and ipiv_j[k] were interchanged and D_j[k,k] is a 1-by-1 diagonal block. If, instead, ipiv_j[k] = ipiv_j[k-1] < 0 and uplo is upper (or ipiv_j[k] = ipiv_j[k+1] < 0 and uplo is lower), then rows and columns k-1 and -ipiv_j[k] (or rows and columns k+1 and -ipiv_j[k]) were interchanged and D_j[k-1,k-1] to D_j[k,k] (or D_j[k,k] to D_j[k+1,k+1]) is a 2-by-2 diagonal block.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, D_j is singular. D_j[i,i] is the first diagonal zero.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.2. Orthogonal factorizations#

3.3.2.1. rocsolver_<type>geqr2()#

rocblas_status rocsolver_zgeqr2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cgeqr2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
rocblas_status rocsolver_dgeqr2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sgeqr2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#

GEQR2 computes a QR factorization of a general m-by-n matrix A.

(This is the unblocked version of the algorithm).

The factorization has the form

\[\begin{split} A = Q\left[\begin{array}{c} R\\ 0 \end{array}\right] \end{split}\]

where R is upper triangular (upper trapezoidal if m < n), and Q is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H_1H_2\cdots H_k, \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_i\) is given by

\[ H_i = I - \text{ipiv}[i] \cdot v_i v_i' \]

where the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of the matrix A.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of the matrix A.

  • A[inout]

    pointer to type. Array on the GPU of dimension lda*n.

    On entry, the m-by-n matrix to be factored. On exit, the elements on and above the diagonal contain the factor R; the elements below the diagonal are the last m - i elements of Householder vector v_i.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • ipiv[out]

    pointer to type. Array on the GPU of dimension min(m,n).

    The Householder scalars.

3.3.2.2. rocsolver_<type>geqr2_batched()#

rocblas_status rocsolver_zgeqr2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgeqr2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgeqr2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgeqr2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GEQR2_BATCHED computes the QR factorization of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form

\[\begin{split} A_j = Q_j\left[\begin{array}{c} R_j\\ 0 \end{array}\right] \end{split}\]

where \(R_j\) is upper triangular (upper trapezoidal if m < n), and \(Q_j\) is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_j = H_{j_1}H_{j_2}\cdots H_{j_k}, \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_{j_i}\) is given by

\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]

where the first i-1 elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all the matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all the matrices A_j in the batch.

  • A[inout]

    Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the diagonal contain the factor R_j. The elements below the diagonal are the last m - i elements of Householder vector v_(j_i).

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • ipiv[out]

    pointer to type. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors ipiv_j of corresponding Householder scalars.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.2.3. rocsolver_<type>geqr2_strided_batched()#

rocblas_status rocsolver_zgeqr2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgeqr2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgeqr2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgeqr2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GEQR2_STRIDED_BATCHED computes the QR factorization of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form

\[\begin{split} A_j = Q_j\left[\begin{array}{c} R_j\\ 0 \end{array}\right] \end{split}\]

where \(R_j\) is upper triangular (upper trapezoidal if m < n), and \(Q_j\) is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_j = H_{j_1}H_{j_2}\cdots H_{j_k}, \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_{j_i}\) is given by

\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]

where the first i-1 elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all the matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all the matrices A_j in the batch.

  • A[inout]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the diagonal contain the factor R_j. The elements below the diagonal are the last m - i elements of Householder vector v_(j_i).

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out]

    pointer to type. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors ipiv_j of corresponding Householder scalars.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.2.4. rocsolver_<type>geqrf()#

rocblas_status rocsolver_zgeqrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cgeqrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
rocblas_status rocsolver_dgeqrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sgeqrf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#

GEQRF computes a QR factorization of a general m-by-n matrix A.

(This is the blocked version of the algorithm).

The factorization has the form

\[\begin{split} A = Q\left[\begin{array}{c} R\\ 0 \end{array}\right] \end{split}\]

where R is upper triangular (upper trapezoidal if m < n), and Q is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H_1H_2\cdots H_k, \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_i\) is given by

\[ H_i = I - \text{ipiv}[i] \cdot v_i v_i' \]

where the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of the matrix A.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of the matrix A.

  • A[inout]

    pointer to type. Array on the GPU of dimension lda*n.

    On entry, the m-by-n matrix to be factored. On exit, the elements on and above the diagonal contain the factor R; the elements below the diagonal are the last m - i elements of Householder vector v_i.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • ipiv[out]

    pointer to type. Array on the GPU of dimension min(m,n).

    The Householder scalars.

3.3.2.5. rocsolver_<type>geqrf_batched()#

rocblas_status rocsolver_zgeqrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgeqrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgeqrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgeqrf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GEQRF_BATCHED computes the QR factorization of a batch of general m-by-n matrices.

(This is the blocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form

\[\begin{split} A_j = Q_j\left[\begin{array}{c} R_j\\ 0 \end{array}\right] \end{split}\]

where \(R_j\) is upper triangular (upper trapezoidal if m < n), and \(Q_j\) is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_j = H_{j_1}H_{j_2}\cdots H_{j_k}, \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_{j_i}\) is given by

\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]

where the first i-1 elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all the matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all the matrices A_j in the batch.

  • A[inout]

    Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the diagonal contain the factor R_j. The elements below the diagonal are the last m - i elements of Householder vector v_(j_i).

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • ipiv[out]

    pointer to type. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors ipiv_j of corresponding Householder scalars.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.2.6. rocsolver_<type>geqrf_strided_batched()#

rocblas_status rocsolver_zgeqrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgeqrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgeqrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgeqrf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GEQRF_STRIDED_BATCHED computes the QR factorization of a batch of general m-by-n matrices.

(This is the blocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form

\[\begin{split} A_j = Q_j\left[\begin{array}{c} R_j\\ 0 \end{array}\right] \end{split}\]

where \(R_j\) is upper triangular (upper trapezoidal if m < n), and \(Q_j\) is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_j = H_{j_1}H_{j_2}\cdots H_{j_k}, \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_{j_i}\) is given by

\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]

where the first i-1 elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all the matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all the matrices A_j in the batch.

  • A[inout]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the diagonal contain the factor R_j. The elements below the diagonal are the last m - i elements of Householder vector v_(j_i).

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out]

    pointer to type. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors ipiv_j of corresponding Householder scalars.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.2.7. rocsolver_<type>gerq2()#

rocblas_status rocsolver_zgerq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cgerq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
rocblas_status rocsolver_dgerq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sgerq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#

GERQ2 computes a RQ factorization of a general m-by-n matrix A.

(This is the unblocked version of the algorithm).

The factorization has the form

\[ A = \left[\begin{array}{cc} 0 & R \end{array}\right] Q \]

where R is upper triangular (upper trapezoidal if m > n), and Q is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H_1'H_2' \cdots H_k', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrix \(H_i\) is given by

\[ H_i = I - \text{ipiv}[i] \cdot v_i v_i' \]

where the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of the matrix A.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of the matrix A.

  • A[inout]

    pointer to type. Array on the GPU of dimension lda*n.

    On entry, the m-by-n matrix to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_i.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • ipiv[out]

    pointer to type. Array on the GPU of dimension min(m,n).

    The Householder scalars.

3.3.2.8. rocsolver_<type>gerq2_batched()#

rocblas_status rocsolver_zgerq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgerq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgerq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgerq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GERQ2_BATCHED computes the RQ factorization of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form

\[ A_j = \left[\begin{array}{cc} 0 & R_j \end{array}\right] Q_j \]

where \(R_j\) is upper triangular (upper trapezoidal if m > n), and \(Q_j\) is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_j = H_{j_1}'H_{j_2}' \cdots H_{j_k}', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrices \(H_{j_i}\) is given by

\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]

where the last n-i elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all the matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all the matrices A_j in the batch.

  • A[inout]

    Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R_j; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_(j_i).

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • ipiv[out]

    pointer to type. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors ipiv_j of corresponding Householder scalars.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.2.9. rocsolver_<type>gerq2_strided_batched()#

rocblas_status rocsolver_zgerq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgerq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgerq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgerq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GERQ2_STRIDED_BATCHED computes the RQ factorization of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form

\[ A_j = \left[\begin{array}{cc} 0 & R_j \end{array}\right] Q_j \]

where \(R_j\) is upper triangular (upper trapezoidal if m > n), and \(Q_j\) is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_j = H_{j_1}'H_{j_2}' \cdots H_{j_k}', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrices \(H_{j_i}\) is given by

\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]

where the last n-i elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all the matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all the matrices A_j in the batch.

  • A[inout]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R_j; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_(j_i).

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out]

    pointer to type. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors ipiv_j of corresponding Householder scalars.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.2.10. rocsolver_<type>gerqf()#

rocblas_status rocsolver_zgerqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cgerqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
rocblas_status rocsolver_dgerqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sgerqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#

GERQF computes a RQ factorization of a general m-by-n matrix A.

(This is the blocked version of the algorithm).

The factorization has the form

\[ A = \left[\begin{array}{cc} 0 & R \end{array}\right] Q \]

where R is upper triangular (upper trapezoidal if m > n), and Q is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H_1'H_2' \cdots H_k', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrix \(H_i\) is given by

\[ H_i = I - \text{ipiv}[i] \cdot v_i v_i' \]

where the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of the matrix A.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of the matrix A.

  • A[inout]

    pointer to type. Array on the GPU of dimension lda*n.

    On entry, the m-by-n matrix to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_i.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • ipiv[out]

    pointer to type. Array on the GPU of dimension min(m,n).

    The Householder scalars.

3.3.2.11. rocsolver_<type>gerqf_batched()#

rocblas_status rocsolver_zgerqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgerqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgerqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgerqf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GERQF_BATCHED computes the RQ factorization of a batch of general m-by-n matrices.

(This is the blocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form

\[ A_j = \left[\begin{array}{cc} 0 & R_j \end{array}\right] Q_j \]

where \(R_j\) is upper triangular (upper trapezoidal if m > n), and \(Q_j\) is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_j = H_{j_1}'H_{j_2}' \cdots H_{j_k}', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrices \(H_{j_i}\) is given by

\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]

where the last n-i elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all the matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all the matrices A_j in the batch.

  • A[inout]

    Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R_j; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_(j_i).

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • ipiv[out]

    pointer to type. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors ipiv_j of corresponding Householder scalars.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.2.12. rocsolver_<type>gerqf_strided_batched()#

rocblas_status rocsolver_zgerqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgerqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgerqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgerqf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GERQF_STRIDED_BATCHED computes the RQ factorization of a batch of general m-by-n matrices.

(This is the blocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form

\[ A_j = \left[\begin{array}{cc} 0 & R_j \end{array}\right] Q_j \]

where \(R_j\) is upper triangular (upper trapezoidal if m > n), and \(Q_j\) is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_j = H_{j_1}'H_{j_2}' \cdots H_{j_k}', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrices \(H_{j_i}\) is given by

\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]

where the last n-i elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all the matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all the matrices A_j in the batch.

  • A[inout]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and above the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor R_j; the elements below the sub/superdiagonal are the first i - 1 elements of Householder vector v_(j_i).

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out]

    pointer to type. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors ipiv_j of corresponding Householder scalars.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.2.13. rocsolver_<type>geql2()#

rocblas_status rocsolver_zgeql2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cgeql2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
rocblas_status rocsolver_dgeql2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sgeql2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#

GEQL2 computes a QL factorization of a general m-by-n matrix A.

(This is the unblocked version of the algorithm).

The factorization has the form

\[\begin{split} A = Q\left[\begin{array}{c} 0\\ L \end{array}\right] \end{split}\]

where L is lower triangular (lower trapezoidal if m < n), and Q is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H_kH_{k-1}\cdots H_1, \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_i\) is given by

\[ H_i = I - \text{ipiv}[i] \cdot v_i v_i' \]

where the last m-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of the matrix A.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of the matrix A.

  • A[inout]

    pointer to type. Array on the GPU of dimension lda*n.

    On entry, the m-by-n matrix to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_i.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • ipiv[out]

    pointer to type. Array on the GPU of dimension min(m,n).

    The Householder scalars.

3.3.2.14. rocsolver_<type>geql2_batched()#

rocblas_status rocsolver_zgeql2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgeql2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgeql2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgeql2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GEQL2_BATCHED computes the QL factorization of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form

\[\begin{split} A_j = Q_j\left[\begin{array}{c} 0\\ L_j \end{array}\right] \end{split}\]

where \(L_j\) is lower triangular (lower trapezoidal if m < n), and \(Q_j\) is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H_{j_k}H_{j_{k-1}}\cdots H_{j_1}, \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_{j_i}\) is given by

\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]

where the last m-i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all the matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all the matrices A_j in the batch.

  • A[inout]

    Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L_j; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_(j_i).

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • ipiv[out]

    pointer to type. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors ipiv_j of corresponding Householder scalars.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.2.15. rocsolver_<type>geql2_strided_batched()#

rocblas_status rocsolver_zgeql2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgeql2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgeql2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgeql2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GEQL2_STRIDED_BATCHED computes the QL factorization of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form

\[\begin{split} A_j = Q_j\left[\begin{array}{c} 0\\ L_j \end{array}\right] \end{split}\]

where \(L_j\) is lower triangular (lower trapezoidal if m < n), and \(Q_j\) is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H_{j_k}H_{j_{k-1}}\cdots H_{j_1}, \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_{j_i}\) is given by

\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]

where the last m-i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all the matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all the matrices A_j in the batch.

  • A[inout]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L_j; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_(j_i).

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out]

    pointer to type. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors ipiv_j of corresponding Householder scalars.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.2.16. rocsolver_<type>geqlf()#

rocblas_status rocsolver_zgeqlf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cgeqlf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
rocblas_status rocsolver_dgeqlf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sgeqlf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#

GEQLF computes a QL factorization of a general m-by-n matrix A.

(This is the blocked version of the algorithm).

The factorization has the form

\[\begin{split} A = Q\left[\begin{array}{c} 0\\ L \end{array}\right] \end{split}\]

where L is lower triangular (lower trapezoidal if m < n), and Q is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H_kH_{k-1}\cdots H_1, \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_i\) is given by

\[ H_i = I - \text{ipiv}[i] \cdot v_i v_i' \]

where the last m-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of the matrix A.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of the matrix A.

  • A[inout]

    pointer to type. Array on the GPU of dimension lda*n.

    On entry, the m-by-n matrix to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_i.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • ipiv[out]

    pointer to type. Array on the GPU of dimension min(m,n).

    The Householder scalars.

3.3.2.17. rocsolver_<type>geqlf_batched()#

rocblas_status rocsolver_zgeqlf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgeqlf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgeqlf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgeqlf_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GEQLF_BATCHED computes the QL factorization of a batch of general m-by-n matrices.

(This is the blocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form

\[\begin{split} A_j = Q_j\left[\begin{array}{c} 0\\ L_j \end{array}\right] \end{split}\]

where \(L_j\) is lower triangular (lower trapezoidal if m < n), and \(Q_j\) is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H_{j_k}H_{j_{k-1}}\cdots H_{j_1}, \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_{j_i}\) is given by

\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]

where the last m-i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all the matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all the matrices A_j in the batch.

  • A[inout]

    Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L_j; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_(j_i).

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • ipiv[out]

    pointer to type. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors ipiv_j of corresponding Householder scalars.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.2.18. rocsolver_<type>geqlf_strided_batched()#

rocblas_status rocsolver_zgeqlf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgeqlf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgeqlf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgeqlf_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GEQLF_STRIDED_BATCHED computes the QL factorization of a batch of general m-by-n matrices.

(This is the blocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form

\[\begin{split} A_j = Q_j\left[\begin{array}{c} 0\\ L_j \end{array}\right] \end{split}\]

where \(L_j\) is lower triangular (lower trapezoidal if m < n), and \(Q_j\) is a m-by-m orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H_{j_k}H_{j_{k-1}}\cdots H_{j_1}, \quad \text{with} \: k = \text{min}(m,n) \]

Each Householder matrix \(H_{j_i}\) is given by

\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i} v_{j_i}' \]

where the last m-i elements of the Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all the matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all the matrices A_j in the batch.

  • A[inout]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the (m-n)-th subdiagonal (when m >= n) or the (n-m)-th superdiagonal (when n > m) contain the factor L_j; the elements above the sub/superdiagonal are the first i - 1 elements of Householder vector v_(j_i).

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out]

    pointer to type. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors ipiv_j of corresponding Householder scalars.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.2.19. rocsolver_<type>gelq2()#

rocblas_status rocsolver_zgelq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cgelq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
rocblas_status rocsolver_dgelq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sgelq2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#

GELQ2 computes a LQ factorization of a general m-by-n matrix A.

(This is the unblocked version of the algorithm).

The factorization has the form

\[ A = \left[\begin{array}{cc} L & 0 \end{array}\right] Q \]

where L is lower triangular (lower trapezoidal if m > n), and Q is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q = H_k'H_{k-1}' \cdots H_1', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrix \(H_i\) is given by

\[ H_i = I - \text{ipiv}[i] \cdot v_i' v_i \]

where the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of the matrix A.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of the matrix A.

  • A[inout]

    pointer to type. Array on the GPU of dimension lda*n.

    On entry, the m-by-n matrix to be factored. On exit, the elements on and below the diagonal contain the factor L; the elements above the diagonal are the last n - i elements of Householder vector v_i.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • ipiv[out]

    pointer to type. Array on the GPU of dimension min(m,n).

    The Householder scalars.

3.3.2.20. rocsolver_<type>gelq2_batched()#

rocblas_status rocsolver_zgelq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgelq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgelq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgelq2_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GELQ2_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form

\[ A_j = \left[\begin{array}{cc} L_j & 0 \end{array}\right] Q_j \]

where \(L_j\) is lower triangular (lower trapezoidal if m > n), and \(Q_j\) is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_j = H_{j_k}'H_{j_{k-1}}' \cdots H_{j_1}', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrices \(H_{j_i}\) is given by

\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i}' v_{j_i} \]

where the first i-1 elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all the matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all the matrices A_j in the batch.

  • A[inout]

    Array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the diagonal contain the factor L_j. The elements above the diagonal are the last n - i elements of Householder vector v_(j_i).

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • ipiv[out]

    pointer to type. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors ipiv_j of corresponding Householder scalars.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.2.21. rocsolver_<type>gelq2_strided_batched()#

rocblas_status rocsolver_zgelq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_cgelq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_dgelq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#
rocblas_status rocsolver_sgelq2_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *ipiv, const rocblas_stride strideP, const rocblas_int batch_count)#

GELQ2_STRIDED_BATCHED computes the LQ factorization of a batch of general m-by-n matrices.

(This is the unblocked version of the algorithm).

The factorization of matrix \(A_j\) in the batch has the form

\[ A_j = \left[\begin{array}{cc} L_j & 0 \end{array}\right] Q_j \]

where \(L_j\) is lower triangular (lower trapezoidal if m > n), and \(Q_j\) is a n-by-n orthogonal/unitary matrix represented as the product of Householder matrices

\[ Q_j = H_{j_k}'H_{j_{k-1}}' \cdots H_{j_1}', \quad \text{with} \: k = \text{min}(m,n). \]

Each Householder matrices \(H_{j_i}\) is given by

\[ H_{j_i} = I - \text{ipiv}_j[i] \cdot v_{j_i}' v_{j_i} \]

where the first i-1 elements of Householder vector \(v_{j_i}\) are zero, and \(v_{j_i}[i] = 1\).

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all the matrices A_j in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all the matrices A_j in the batch.

  • A[inout]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the m-by-n matrices A_j to be factored. On exit, the elements on and below the diagonal contain the factor L_j. The elements above the diagonal are the last n - i elements of Householder vector v_(j_i).

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_j.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[out]

    pointer to type. Array on the GPU (the size depends on the value of strideP).

    Contains the vectors ipiv_j of corresponding Householder scalars.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use is strideP >= min(m,n).

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.3.2.22. rocsolver_<type>gelqf()#

rocblas_status rocsolver_zgelqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cgelqf(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#
rocblas_status rocsolver_dgelqf(