Lapack-like Functions#

Other Lapack-like routines provided by rocSOLVER. These are divided into subcategories, including the triangular factorizations and linear-systems solvers documented below.

Note

Throughout the APIs’ descriptions, we use the following notations:

  • x[i] stands for the i-th element of vector x, while A[i,j] represents the element in the i-th row and j-th column of matrix A. Indices are 1-based, i.e. x[1] is the first element of x.

  • If X is a real vector or matrix, X^T indicates its transpose; if X is complex, then X^H represents its conjugate transpose. When X could be real or complex, we use X’ to indicate X transposed or X conjugate transposed, accordingly.

  • x_i denotes x with subscript i; we sometimes use both notations, the subscripted form when displaying mathematical equations, and x_i in the text describing the function parameters.
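
The 1-based notation A[i,j] can be related to a 0-based array offset as sketched below. This is a minimal illustration that assumes the column-major (LAPACK-style) storage with leading dimension lda; the helper name `idx` is hypothetical and not part of rocSOLVER.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a rocSOLVER function): 0-based offset of the
// 1-based element A[i,j] in a column-major array with leading dimension lda.
inline std::size_t idx(int i, int j, int lda)
{
    assert(i >= 1 && j >= 1 && lda >= i);
    return static_cast<std::size_t>(i - 1)
         + static_cast<std::size_t>(j - 1) * static_cast<std::size_t>(lda);
}
```

With lda = 4, A[1,1] maps to offset 0 and A[3,2] maps to offset 6.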

Triangular factorizations#

rocsolver_<type>getf2_npvt()#

rocblas_status rocsolver_zgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_cgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_dgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_sgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)#

GETF2_NPVT computes the LU factorization of a general m-by-n matrix A without partial pivoting.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization has the form

A=LU

where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2 routines instead.

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of the matrix A.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of the matrix A.

  • A[inout]

    pointer to type. Array on the GPU of dimension lda*n.

    On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • info[out]

    pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = j > 0, U is singular. U[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.
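
To make the output layout and the meaning of info concrete, here is a CPU reference sketch of an unblocked LU factorization without pivoting. It is not rocSOLVER's implementation; the function name `lu_nopivot_ref`, the use of std::vector, and the choice to skip a column with a zero pivot are assumptions of this sketch. Column-major storage is also assumed.

```cpp
#include <cassert>
#include <vector>

// CPU reference sketch (not rocSOLVER code): in-place LU without pivoting of a
// column-major m-by-n matrix. Returns 0 on success, or the 1-based index j of
// the first zero diagonal element U[j,j].
int lu_nopivot_ref(int m, int n, std::vector<double>& A, int lda)
{
    assert(m >= 0 && n >= 0 && lda >= m);
    int info = 0;
    const int k = (m < n) ? m : n;
    for (int j = 0; j < k; ++j) {
        const double pivot = A[j + j * lda];
        if (pivot == 0.0) {
            if (info == 0)
                info = j + 1;
            continue; // assumption of this sketch: skip the column, result is incomplete
        }
        for (int i = j + 1; i < m; ++i)
            A[i + j * lda] /= pivot;               // multipliers: column j of L
        for (int c = j + 1; c < n; ++c)            // rank-1 update of the trailing block
            for (int i = j + 1; i < m; ++i)
                A[i + c * lda] -= A[i + j * lda] * A[j + c * lda];
    }
    return info;
}
```

For the 2-by-2 matrix with rows (4, 3) and (6, 3), the packed result holds U = (4, 3; 0, -1.5) in the upper triangle and the multiplier 1.5 below the diagonal.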

rocsolver_<type>getf2_npvt_batched()#

rocblas_status rocsolver_zgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

GETF2_NPVT_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix A_i in the batch has the form

A_i = L_i U_i

where L_i is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U_i is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2_BATCHED routines instead.

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all matrices A_i in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all matrices A_i in the batch.

  • A[inout]

    array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the m-by-n matrices A_i to be factored. On exit, the factors L_i and U_i from the factorizations. The unit diagonal elements of L_i are not stored.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_i.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[i] = 0, successful exit for factorization of A_i. If info[i] = j > 0, U_i is singular. U_i[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.
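
The array-of-pointers layout used by the batched variants can be sketched as follows. The example uses host memory only to illustrate the layout; for the real call the pointers must refer to GPU arrays, as stated above. The helper name `make_batch_pointers` and the back-to-back placement of the matrices are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side helper: builds one pointer per matrix for a batch
// stored back to back in a single buffer, each matrix occupying lda*n elements.
std::vector<double*> make_batch_pointers(std::vector<double>& buffer,
                                         int n, int lda, int batch_count)
{
    assert(n >= 0 && lda >= 0 && batch_count >= 0);
    const std::size_t per_matrix = static_cast<std::size_t>(lda) * n;
    assert(buffer.size() >= per_matrix * static_cast<std::size_t>(batch_count));
    std::vector<double*> ptrs(static_cast<std::size_t>(batch_count));
    for (int b = 0; b < batch_count; ++b)
        ptrs[b] = buffer.data() + per_matrix * b;
    return ptrs;
}
```

Because each matrix is reached through its own pointer, the matrices do not have to be contiguous; the contiguous buffer here is only a convenient way to allocate them.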

rocsolver_<type>getf2_npvt_strided_batched()#

rocblas_status rocsolver_zgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

GETF2_NPVT_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix A_i in the batch has the form

A_i = L_i U_i

where L_i is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U_i is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2_STRIDED_BATCHED routines instead.

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all matrices A_i in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all matrices A_i in the batch.

  • A[inout]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the m-by-n matrices A_i to be factored. On exit, the factors L_i and U_i from the factorization. The unit diagonal elements of L_i are not stored.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_i.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_i to the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[i] = 0, successful exit for factorization of A_i. If info[i] = j > 0, U_i is singular. U_i[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.
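
The strided-batched addressing can be illustrated with two small helpers. Both names are hypothetical, the batch index b is 0-based, column-major storage is assumed, and the size formula covers only the normal use case strideA >= lda*n described above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: offset of the 1-based element [i,j] of matrix b
// (0-based batch index) inside a strided-batched array.
std::int64_t strided_offset(int b, int i, int j, int lda, std::int64_t strideA)
{
    assert(b >= 0 && i >= 1 && j >= 1 && lda >= i);
    return b * strideA + (i - 1) + static_cast<std::int64_t>(j - 1) * lda;
}

// Hypothetical helper: smallest element count that covers the whole batch,
// valid for the normal use case strideA >= lda*n.
std::int64_t strided_min_size(int n, int lda, std::int64_t strideA, int batch_count)
{
    assert(n >= 0 && batch_count >= 0);
    assert(strideA >= static_cast<std::int64_t>(lda) * n);
    if (batch_count == 0)
        return 0;
    return (batch_count - 1) * strideA + static_cast<std::int64_t>(lda) * n;
}
```

With lda = 4, n = 2, strideA = 10 and three matrices, the last matrix starts at offset 20 and the buffer needs at least 28 elements.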

rocsolver_<type>getrf_npvt()#

rocblas_status rocsolver_zgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_cgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_dgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_sgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)#

GETRF_NPVT computes the LU factorization of a general m-by-n matrix A without partial pivoting.

(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization has the form

A=LU

where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF routines instead.

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of the matrix A.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of the matrix A.

  • A[inout]

    pointer to type. Array on the GPU of dimension lda*n.

    On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • info[out]

    pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = j > 0, U is singular. U[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.
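
Since the unit diagonal of L is not stored, the packed output has to be split before L and U can be used as separate matrices. The sketch below shows one way to do this on the CPU for any m and n, which also covers the trapezoidal cases; the helper name `unpack_lu` and the tightly packed output layout are assumptions, and column-major storage is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU helper: splits the packed factors into L (m-by-k, unit
// diagonal) and U (k-by-n), with k = min(m, n). L and U are column-major
// with leading dimensions m and k respectively.
void unpack_lu(int m, int n, const std::vector<double>& A, int lda,
               std::vector<double>& L, std::vector<double>& U)
{
    assert(m >= 0 && n >= 0 && lda >= m);
    const int k = (m < n) ? m : n;
    L.assign(static_cast<std::size_t>(m) * k, 0.0);
    U.assign(static_cast<std::size_t>(k) * n, 0.0);
    for (int j = 0; j < k; ++j) {
        L[j + j * m] = 1.0;                    // implicit unit diagonal
        for (int i = j + 1; i < m; ++i)
            L[i + j * m] = A[i + j * lda];     // strictly lower part
    }
    for (int j = 0; j < n; ++j)
        for (int i = 0; i <= j && i < k; ++i)
            U[i + j * k] = A[i + j * lda];     // upper part including the diagonal
}
```

Multiplying the unpacked L and U on the host and comparing against the original matrix is a simple way to check whether the lack of pivoting has hurt accuracy.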

rocsolver_<type>getrf_npvt_batched()#

rocblas_status rocsolver_zgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

GETRF_NPVT_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.

(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix A_i in the batch has the form

A_i = L_i U_i

where L_i is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U_i is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF_BATCHED routines instead.

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all matrices A_i in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all matrices A_i in the batch.

  • A[inout]

    array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the m-by-n matrices A_i to be factored. On exit, the factors L_i and U_i from the factorizations. The unit diagonal elements of L_i are not stored.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_i.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[i] = 0, successful exit for factorization of A_i. If info[i] = j > 0, U_i is singular. U_i[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.
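
After a batched call, the info array is typically inspected on the host to find the matrices whose factorization hit a zero diagonal element. A minimal sketch, assuming the array has already been copied from the GPU; the helper name `singular_batches` is hypothetical and the returned batch indices are 0-based.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: given a host copy of the info array, returns the
// 0-based batch indices whose entry is positive (zero element found in U).
std::vector<int> singular_batches(const int* info, int batch_count)
{
    assert(batch_count >= 0);
    std::vector<int> failed;
    for (int b = 0; b < batch_count; ++b)
        if (info[b] > 0)
            failed.push_back(b);
    return failed;
}
```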

rocsolver_<type>getrf_npvt_strided_batched()#

rocblas_status rocsolver_zgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

GETRF_NPVT_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.

(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix A_i in the batch has the form

A_i = L_i U_i

where L_i is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U_i is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF_STRIDED_BATCHED routines instead.

Parameters:
  • handle[in] rocblas_handle.

  • m[in]

    rocblas_int. m >= 0.

    The number of rows of all matrices A_i in the batch.

  • n[in]

    rocblas_int. n >= 0.

    The number of columns of all matrices A_i in the batch.

  • A[inout]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the m-by-n matrices A_i to be factored. On exit, the factors L_i and U_i from the factorization. The unit diagonal elements of L_i are not stored.

  • lda[in]

    rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_i.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_i to the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[i] = 0, successful exit for factorization of A_i. If info[i] = j > 0, U_i is singular. U_i[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

Linear-systems solvers#

rocsolver_<type>getri_npvt()#

rocblas_status rocsolver_zgetri_npvt(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_cgetri_npvt(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_dgetri_npvt(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)#
rocblas_status rocsolver_sgetri_npvt(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)#

GETRI_NPVT inverts a general n-by-n matrix A using the LU factorization computed by GETRF_NPVT.

The inverse is computed by solving the linear system

A^{-1} L = U^{-1}

where L is the lower triangular factor of A with unit diagonal elements, and U is the upper triangular factor.

Parameters:
  • handle[in] rocblas_handle.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of the matrix A.

  • A[inout]

    pointer to type. Array on the GPU of dimension lda*n.

    On entry, the factors L and U of the factorization A = L*U returned by GETRF_NPVT. On exit, the inverse of A if info = 0; otherwise undefined.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of A.

  • info[out]

    pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
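
The two-step idea behind the system A^{-1} L = U^{-1} can be sketched on the CPU as follows: first invert U, then solve X L = U^{-1} for X = A^{-1}. This is a reference sketch under stated assumptions, not rocSOLVER's implementation; the function name `invert_from_lu_ref`, the temporary copies of L and U, and the column-major storage are choices made for clarity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU reference sketch (not rocSOLVER code). On entry A holds the packed L and
// U factors; on exit it holds the inverse of L*U when 0 is returned. A positive
// return value i is the 1-based index of the first zero pivot U[i,i].
int invert_from_lu_ref(int n, std::vector<double>& A, int lda)
{
    assert(n >= 0 && lda >= n);
    for (int i = 0; i < n; ++i)
        if (A[i + i * lda] == 0.0)
            return i + 1;

    const std::size_t nn = static_cast<std::size_t>(n) * n;
    std::vector<double> L(nn, 0.0), U(nn, 0.0), V(nn, 0.0);
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < n; ++i) {
            if (i > j)
                L[i + j * n] = A[i + j * lda];   // multipliers (unit diagonal implied)
            else
                U[i + j * n] = A[i + j * lda];
        }

    // V = U^{-1}: back substitution on U V = I, one column at a time.
    for (int j = 0; j < n; ++j)
        for (int i = j; i >= 0; --i) {
            double s = (i == j) ? 1.0 : 0.0;
            for (int k = i + 1; k <= j; ++k)
                s -= U[i + k * n] * V[k + j * n];
            V[i + j * n] = s / U[i + i * n];
        }

    // Solve X L = V for X, from the last column to the first. X overwrites A.
    for (int j = n - 1; j >= 0; --j)
        for (int i = 0; i < n; ++i) {
            double s = V[i + j * n];
            for (int k = j + 1; k < n; ++k)
                s -= A[i + k * lda] * L[k + j * n];
            A[i + j * lda] = s;
        }
    return 0;
}
```

The last loop works backwards because column j of X L only involves columns j and higher of X, so the final column of X equals the final column of U^{-1}.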

rocsolver_<type>getri_npvt_batched()#

rocblas_status rocsolver_zgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

GETRI_NPVT_BATCHED inverts a batch of general n-by-n matrices using the LU factorization computed by GETRF_NPVT_BATCHED.

The inverse of matrix A_j in the batch is computed by solving the linear system

A_j^{-1} L_j = U_j^{-1}

where L_j is the lower triangular factor of A_j with unit diagonal elements, and U_j is the upper triangular factor.

Parameters:
  • handle[in] rocblas_handle.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of all matrices A_j in the batch.

  • A[inout]

    array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the factors L_j and U_j of the factorization A_j = L_j*U_j returned by GETRF_NPVT_BATCHED. On exit, the inverses of A_j if info[j] = 0; otherwise undefined.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of matrices A_j.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

rocsolver_<type>getri_npvt_strided_batched()#

rocblas_status rocsolver_zgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

GETRI_NPVT_STRIDED_BATCHED inverts a batch of general n-by-n matrices using the LU factorization computed by GETRF_NPVT_STRIDED_BATCHED.

The inverse of matrix A_j in the batch is computed by solving the linear system

A_j^{-1} L_j = U_j^{-1}

where L_j is the lower triangular factor of A_j with unit diagonal elements, and U_j is the upper triangular factor.

Parameters:
  • handle[in] rocblas_handle.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of all matrices A_j in the batch.

  • A[inout]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the factors L_j and U_j of the factorization A_j = L_j*U_j returned by GETRF_NPVT_STRIDED_BATCHED. On exit, the inverses of A_j if info[j] = 0; otherwise undefined.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of matrices A_j.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

rocsolver_<type>getri_outofplace()#

rocblas_status rocsolver_zgetri_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)#
rocblas_status rocsolver_cgetri_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)#
rocblas_status rocsolver_dgetri_outofplace(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, double *C, const rocblas_int ldc, rocblas_int *info)#
rocblas_status rocsolver_sgetri_outofplace(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, float *C, const rocblas_int ldc, rocblas_int *info)#

GETRI_OUTOFPLACE computes the inverse C = A^{-1} of a general n-by-n matrix A.

The inverse is computed by solving the linear system

A C = I

where I is the identity matrix, and A is factorized as A = P L U as given by GETRF.

Parameters:
  • handle[in] rocblas_handle.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of the matrix A.

  • A[in]

    pointer to type. Array on the GPU of dimension lda*n.

    The factors L and U of the factorization A = P*L*U returned by GETRF.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of A.

  • ipiv[in]

    pointer to rocblas_int. Array on the GPU of dimension n.

    The pivot indices returned by GETRF.

  • C[out]

    pointer to type. Array on the GPU of dimension ldc*n.

    If info = 0, the inverse of A. Otherwise, undefined.

  • ldc[in]

    rocblas_int. ldc >= n.

    Specifies the leading dimension of C.

  • info[out]

    pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
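
Solving A C = I with A = P L U amounts to permuting the identity and then running a forward and a backward substitution per column. The CPU sketch below illustrates this; it is not rocSOLVER's implementation. It assumes column-major storage and LAPACK-style pivots (1-based, with row i interchanged with row ipiv[i] in order), and the function name `inverse_outofplace_ref` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// CPU reference sketch (not rocSOLVER code): C = A^{-1} from the packed factors
// of A = P*L*U and the 1-based pivot indices. A is left untouched. Returns 0 on
// success, or the 1-based index of the first zero pivot U[i,i].
int inverse_outofplace_ref(int n, const std::vector<double>& A, int lda,
                           const std::vector<int>& ipiv,
                           std::vector<double>& C, int ldc)
{
    assert(n >= 0 && lda >= n && ldc >= n);
    assert(ipiv.size() >= static_cast<std::size_t>(n));
    assert(C.size() >= static_cast<std::size_t>(ldc) * n);
    for (int i = 0; i < n; ++i)
        if (A[i + i * lda] == 0.0)
            return i + 1;

    for (int j = 0; j < n; ++j)                  // C = I
        for (int i = 0; i < n; ++i)
            C[i + j * ldc] = (i == j) ? 1.0 : 0.0;

    for (int i = 0; i < n; ++i) {                // apply the row interchanges to C
        const int p = ipiv[i] - 1;
        if (p != i)
            for (int j = 0; j < n; ++j)
                std::swap(C[i + j * ldc], C[p + j * ldc]);
    }

    for (int j = 0; j < n; ++j) {                // each column is one right-hand side
        for (int i = 0; i < n; ++i)              // forward: L y = b (unit diagonal)
            for (int k = 0; k < i; ++k)
                C[i + j * ldc] -= A[i + k * lda] * C[k + j * ldc];
        for (int i = n - 1; i >= 0; --i) {       // backward: U x = y
            for (int k = i + 1; k < n; ++k)
                C[i + j * ldc] -= A[i + k * lda] * C[k + j * ldc];
            C[i + j * ldc] /= A[i + i * lda];
        }
    }
    return 0;
}
```

Keeping the factors in A intact is what distinguishes the out-of-place variant: the same factorization can be reused afterwards, for example to solve further systems.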

rocsolver_<type>getri_outofplace_batched()#

rocblas_status rocsolver_zgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, double *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, float *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#

GETRI_OUTOFPLACE_BATCHED computes the inverse C_j = A_j^{-1} of a batch of general n-by-n matrices A_j.

The inverse is computed by solving the linear system

A_j C_j = I

where I is the identity matrix, and A_j is factorized as A_j = P_j L_j U_j as given by GETRF_BATCHED.

Parameters:
  • handle[in] rocblas_handle.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of all matrices A_j in the batch.

  • A[in]

    array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    The factors L_j and U_j of the factorization A_j = P_j*L_j*U_j returned by GETRF_BATCHED.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of matrices A_j.

  • ipiv[in]

    pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).

    The pivot indices returned by GETRF_BATCHED.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • C[out]

    array of pointers to type. Each pointer points to an array on the GPU of dimension ldc*n.

    If info[j] = 0, the inverse of matrix A_j. Otherwise, undefined.

  • ldc[in]

    rocblas_int. ldc >= n.

    Specifies the leading dimension of C_j.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

rocsolver_<type>getri_outofplace_strided_batched()#

rocblas_status rocsolver_zgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, double *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, float *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#

GETRI_OUTOFPLACE_STRIDED_BATCHED computes the inverse C_j = A_j^{-1} of a batch of general n-by-n matrices A_j.

The inverse is computed by solving the linear system

A_j C_j = I

where I is the identity matrix, and A_j is factorized as A_j = P_j L_j U_j as given by GETRF_STRIDED_BATCHED.

Parameters:
  • handle[in] rocblas_handle.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of all matrices A_j in the batch.

  • A[in]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    The factors L_j and U_j of the factorization A_j = P_j*L_j*U_j returned by GETRF_STRIDED_BATCHED.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of matrices A_j.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • ipiv[in]

    pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).

    The pivot indices returned by GETRF_STRIDED_BATCHED.

  • strideP[in]

    rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • C[out]

    pointer to type. Array on the GPU (the size depends on the value of strideC).

    If info[j] = 0, the inverse of matrix A_j. Otherwise, undefined.

  • ldc[in]

    rocblas_int. ldc >= n.

    Specifies the leading dimension of C_j.

  • strideC[in]

    rocblas_stride.

    Stride from the start of one matrix C_j to the next one C_(j+1). There is no restriction for the value of strideC. Normal use case is strideC >= ldc*n.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

rocsolver_<type>getri_npvt_outofplace()#

rocblas_status rocsolver_zgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)#
rocblas_status rocsolver_cgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)#
rocblas_status rocsolver_dgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, double *C, const rocblas_int ldc, rocblas_int *info)#
rocblas_status rocsolver_sgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, float *C, const rocblas_int ldc, rocblas_int *info)#

GETRI_NPVT_OUTOFPLACE computes the inverse C = A^{-1} of a general n-by-n matrix A without partial pivoting.

The inverse is computed by solving the linear system

AC=I

where I is the identity matrix, and A is factorized as A=LU as given by GETRF_NPVT.

Parameters:
  • handle[in] rocblas_handle.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of the matrix A.

  • A[in]

    pointer to type. Array on the GPU of dimension lda*n.

    The factors L and U of the factorization A = L*U returned by GETRF_NPVT.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of A.

  • C[out]

    pointer to type. Array on the GPU of dimension ldc*n.

    If info = 0, the inverse of A. Otherwise, undefined.

  • ldc[in]

    rocblas_int. ldc >= n.

    Specifies the leading dimension of C.

  • info[out]

    pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
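
The parameter semantics above can be illustrated with a small CPU reference in C. This is a minimal sketch, not rocSOLVER code: the function name ref_getri_npvt_outofplace is hypothetical, plain int and double stand in for rocblas_int and the real double-precision type, and info is returned by value instead of being written to GPU memory. It assumes column-major storage with leading dimensions lda and ldc, with L (unit diagonal implied) stored below the diagonal of A and U stored on and above it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference (not part of rocSOLVER).
 * A   : packed LU factors, column-major, leading dimension lda.
 *       Strictly lower part holds L (unit diagonal implied),
 *       diagonal and upper part hold U.
 * C   : receives the inverse, column-major, leading dimension ldc.
 * Returns 0 on success, or the 1-based index i of the first zero
 * pivot U[i,i]; in that case C is left untouched. */
static int ref_getri_npvt_outofplace(int n, const double *A, int lda,
                                     double *C, int ldc)
{
    assert(n >= 0 && lda >= n && ldc >= n);

    /* Report the first zero pivot, mirroring the documented info value. */
    for (int i = 0; i < n; ++i)
        if (A[(size_t)i + (size_t)i * lda] == 0.0)
            return i + 1;

    for (int col = 0; col < n; ++col) {
        double *c = C + (size_t)col * ldc;

        /* Start from column `col` of the identity matrix. */
        for (int i = 0; i < n; ++i)
            c[i] = (i == col) ? 1.0 : 0.0;

        /* Forward substitution with unit-diagonal L. */
        for (int i = 0; i < n; ++i)
            for (int k = 0; k < i; ++k)
                c[i] -= A[(size_t)i + (size_t)k * lda] * c[k];

        /* Back substitution with U. */
        for (int i = n - 1; i >= 0; --i) {
            for (int k = i + 1; k < n; ++k)
                c[i] -= A[(size_t)i + (size_t)k * lda] * c[k];
            c[i] /= A[(size_t)i + (size_t)i * lda];
        }
    }
    return 0;
}
```

For example, the matrix with rows (4, 3) and (6, 3) factors without pivoting into L with rows (1, 0), (1.5, 1) and U with rows (4, 3), (0, -1.5), so the packed column-major input is {4, 1.5, 3, -1.5} with lda = 2.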

rocsolver_<type>getri_npvt_outofplace_batched()#

rocblas_status rocsolver_zgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, double *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, float *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#

GETRI_NPVT_OUTOFPLACE_BATCHED computes the inverse C_j = A_j^{-1} of a batch of general n-by-n matrices A_j without partial pivoting.

The inverse is computed by solving the linear system

A_j C_j = I

where I is the identity matrix, and A_j is factorized as A_j = L_j U_j as given by GETRF_NPVT_BATCHED.

Parameters:
  • handle[in] rocblas_handle.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of all matrices A_j in the batch.

  • A[in]

    array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    The factors L_j and U_j of the factorization A_j = L_j*U_j returned by GETRF_NPVT_BATCHED.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of matrices A_j.

  • C[out]

    array of pointers to type. Each pointer points to an array on the GPU of dimension ldc*n.

    If info[j] = 0, the inverse of matrix A_j. Otherwise, undefined.

  • ldc[in]

    rocblas_int. ldc >= n.

    Specifies the leading dimension of C_j.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.
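
Because info is an array on the GPU, host code would normally copy it back before inspecting it. The sketch below assumes that copy has already been made into a host array. The helper name count_singular is hypothetical, and plain int stands in for rocblas_int. It shows one way to interpret the per-matrix status values described above.

```c
#include <assert.h>

/* Hypothetical host-side helper (not a rocSOLVER function).
 * info_host   : batch_count status values already copied from the GPU.
 * first_bad   : receives the 0-based array position of the first entry
 *               with info > 0, or -1 if every inversion succeeded.
 * Returns the number of matrices reported as singular. */
static int count_singular(const int *info_host, int batch_count,
                          int *first_bad)
{
    assert(batch_count >= 0);

    int count = 0;
    *first_bad = -1;
    for (int j = 0; j < batch_count; ++j) {
        if (info_host[j] > 0) {      /* U_j has a zero pivot */
            if (*first_bad < 0)
                *first_bad = j;
            ++count;
        }
    }
    return count;
}
```

Any C_j whose status value is nonzero is documented as undefined, so a caller would skip those outputs rather than use them.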

rocsolver_<type>getri_npvt_outofplace_strided_batched()#

rocblas_status rocsolver_zgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_cgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_dgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#
rocblas_status rocsolver_sgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#

GETRI_NPVT_OUTOFPLACE_STRIDED_BATCHED computes the inverse C_j = A_j^{-1} of a batch of general n-by-n matrices A_j without partial pivoting.

The inverse is computed by solving the linear system

A_j C_j = I

where I is the identity matrix, and A_j is factorized as A_j = L_j U_j as given by GETRF_NPVT_STRIDED_BATCHED.

Parameters:
  • handle[in] rocblas_handle.

  • n[in]

    rocblas_int. n >= 0.

    The number of rows and columns of all matrices A_j in the batch.

  • A[in]

    pointer to type. Array on the GPU (the size depends on the value of strideA).

    The factors L_j and U_j of the factorization A_j = L_j*U_j returned by GETRF_NPVT_STRIDED_BATCHED.

  • lda[in]

    rocblas_int. lda >= n.

    Specifies the leading dimension of matrices A_j.

  • strideA[in]

    rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction on the value of strideA. Normal use case is strideA >= lda*n.

  • C[out]

    pointer to type. Array on the GPU (the size depends on the value of strideC).

    If info[j] = 0, the inverse of matrix A_j. Otherwise, undefined.

  • ldc[in]

    rocblas_int. ldc >= n.

    Specifies the leading dimension of C_j.

  • strideC[in]

    rocblas_stride.

    Stride from the start of one matrix C_j to the next one C_(j+1). There is no restriction on the value of strideC. Normal use case is strideC >= ldc*n.

  • info[out]

    pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.

  • batch_count[in]

    rocblas_int. batch_count >= 0.

    Number of matrices in the batch.
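
The strided layout described by lda, strideA, ldc and strideC can be sketched with a little index arithmetic. The helpers below are hypothetical and cover only the normal use case (stride >= ld*n, column-major storage). int and int64_t stand in for rocblas_int and rocblas_stride. The batch index j is 0-based as in C, while row r and column c are 1-based to match the notation used in this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: linear index of element [r,c] (1-based) of
 * matrix j (0-based) in a strided batch stored column-major. */
static size_t strided_elem_index(int j, int r, int c, int ld,
                                 int64_t stride)
{
    assert(j >= 0 && r >= 1 && c >= 1 && ld >= r && stride >= 0);
    return (size_t)j * (size_t)stride
         + (size_t)(r - 1)
         + (size_t)(c - 1) * (size_t)ld;
}

/* Hypothetical helper: smallest number of elements the buffer must
 * hold in the normal use case stride >= ld*n. The last matrix only
 * needs ld*n elements, not a full stride. */
static size_t strided_min_elems(int n, int ld, int64_t stride,
                                int batch_count)
{
    assert(n >= 0 && ld >= n && batch_count >= 0);
    assert(stride >= (int64_t)ld * n);
    if (n == 0 || batch_count == 0)
        return 0;
    return (size_t)(batch_count - 1) * (size_t)stride
         + (size_t)ld * (size_t)n;
}
```

With n = 3, lda = 4, strideA = 12 and batch_count = 5, for instance, the A buffer needs at least 4*12 + 4*3 = 60 elements. Strides outside the normal use case are permitted by the API but are not covered by this size formula.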