Lapack-like Functions

Lapack-like Functions#

Other Lapack-like routines provided by rocSOLVER. These are divided into the following subcategories:

Triangular factorizations. Based on Gaussian elimination.
Linear-systems solvers. Based on triangular factorizations.

Note

Throughout the APIs’ descriptions, we use the following notations:

x[i] stands for the i-th element of vector x, while A[i,j] represents the element in the i-th row and j-th column of matrix A. Indices are 1-based, i.e. x[1] is the first element of x.
If X is a real vector or matrix, \(X^T\) indicates its transpose; if X is complex, then \(X^H\) represents its conjugate transpose. When X could be real or complex, we use X’ to indicate X transposed or X conjugate transposed, accordingly.
x_i \(=x_i\); we sometimes use both notations, \(x_i\) when displaying mathematical equations, and x_i in the text describing the function parameters.

Triangular factorizations#

rocsolver_<type>getf2_npvt()#

rocblas_status rocsolver_zgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)#

rocblas_status rocsolver_cgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)#

rocblas_status rocsolver_dgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)#

rocblas_status rocsolver_sgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)#

GETF2_NPVT computes the LU factorization of a general m-by-n matrix A without partial pivoting.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization has the form

\[ A = LU \]

where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2 routines instead.

Parameters:

handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.

The number of rows of the matrix A.
n – [in]
rocblas_int. n >= 0.

The number of columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.

On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
lda – [in]
rocblas_int. lda >= m.

Specifies the leading dimension of A.
info – [out]
pointer to a rocblas_int on the GPU.

If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.

rocsolver_<type>getf2_npvt_batched()#

rocblas_status rocsolver_zgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_cgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_dgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_sgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

GETF2_NPVT_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix \(A_j\) in the batch has the form

\[ A_j = L_jU_j \]

where \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2_BATCHED routines instead.

Parameters:

handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.

The number of rows of all matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.

The number of columns of all matrices A_j in the batch.
A – [inout]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorizations. The unit diagonal elements of L_j are not stored.
lda – [in]
rocblas_int. lda >= m.

Specifies the leading dimension of matrices A_j.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.

If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.
batch_count – [in]
rocblas_int. batch_count >= 0.

Number of matrices in the batch.

rocsolver_<type>getf2_npvt_strided_batched()#

rocblas_status rocsolver_zgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_cgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_dgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_sgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

GETF2_NPVT_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix \(A_j\) in the batch has the form

\[ A_j = L_jU_j \]

where \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2_STRIDED_BATCHED routines instead.

Parameters:

handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.

The number of rows of all matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.

The number of columns of all matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).

On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorization. The unit diagonal elements of L_j are not stored.
lda – [in]
rocblas_int. lda >= m.

Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.

Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.

If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.
batch_count – [in]
rocblas_int. batch_count >= 0.

Number of matrices in the batch.

rocsolver_<type>getrf_npvt()#

rocblas_status rocsolver_zgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)#

rocblas_status rocsolver_cgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)#

rocblas_status rocsolver_dgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)#

rocblas_status rocsolver_sgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)#

GETRF_NPVT computes the LU factorization of a general m-by-n matrix A without partial pivoting.

(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization has the form

\[ A = LU \]

where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF routines instead.

Parameters:

handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.

The number of rows of the matrix A.
n – [in]
rocblas_int. n >= 0.

The number of columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.

On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
lda – [in]
rocblas_int. lda >= m.

Specifies the leading dimension of A.
info – [out]
pointer to a rocblas_int on the GPU.

If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.

rocsolver_<type>getrf_npvt_batched()#

rocblas_status rocsolver_zgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_cgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_dgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_sgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

GETRF_NPVT_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.

(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix \(A_j\) in the batch has the form

\[ A_j = L_jU_j \]

where \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF_BATCHED routines instead.

Parameters:

handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.

The number of rows of all matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.

The number of columns of all matrices A_j in the batch.
A – [inout]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorizations. The unit diagonal elements of L_j are not stored.
lda – [in]
rocblas_int. lda >= m.

Specifies the leading dimension of matrices A_j.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.

If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.
batch_count – [in]
rocblas_int. batch_count >= 0.

Number of matrices in the batch.

rocsolver_<type>getrf_npvt_strided_batched()#

rocblas_status rocsolver_zgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_cgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_dgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_sgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

GETRF_NPVT_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.

(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix \(A_j\) in the batch has the form

\[ A_j = L_jU_j \]

where \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF_STRIDED_BATCHED routines instead.

Parameters:

handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.

The number of rows of all matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.

The number of columns of all matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).

On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorization. The unit diagonal elements of L_j are not stored.
lda – [in]
rocblas_int. lda >= m.

Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.

Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.

If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.
batch_count – [in]
rocblas_int. batch_count >= 0.

Number of matrices in the batch.

Linear-systems solvers#

rocsolver_<type>getri_npvt()#

rocblas_status rocsolver_zgetri_npvt(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)#

rocblas_status rocsolver_cgetri_npvt(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)#

rocblas_status rocsolver_dgetri_npvt(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)#

rocblas_status rocsolver_sgetri_npvt(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)#

GETRI_NPVT inverts a general n-by-n matrix A using the LU factorization computed by GETRF_NPVT.

The inverse is computed by solving the linear system

\[ A^{-1}L = U^{-1} \]

where L is the lower triangular factor of A with unit diagonal elements, and U is the upper triangular factor.

Parameters:

handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.

The number of rows and columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.

On entry, the factors L and U of the factorization A = L*U returned by
GETRF_NPVT. On exit, the inverse of A if info = 0; otherwise undefined.
lda – [in]
rocblas_int. lda >= n.

Specifies the leading dimension of A.
info – [out]
pointer to a rocblas_int on the GPU.

If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.

rocsolver_<type>getri_npvt_batched()#

rocblas_status rocsolver_zgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_cgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_dgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_sgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#

GETRI_NPVT_BATCHED inverts a batch of general n-by-n matrices using the LU factorization computed by GETRF_NPVT_BATCHED.

The inverse of matrix \(A_j\) in the batch is computed by solving the linear system

\[ A_j^{-1} L_j = U_j^{-1} \]

where \(L_j\) is the lower triangular factor of \(A_j\) with unit diagonal elements, and \(U_j\) is the upper triangular factor.

Parameters:

handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.

The number of rows and columns of all matrices A_j in the batch.
A – [inout]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

On entry, the factors L_j and U_j of the factorization A = L_j*U_j returned by
GETRF_NPVT_BATCHED. On exit, the inverses of A_j if info[j] = 0; otherwise undefined.
lda – [in]
rocblas_int. lda >= n.

Specifies the leading dimension of matrices A_j.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.

If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
batch_count – [in]
rocblas_int. batch_count >= 0.

Number of matrices in the batch.

rocsolver_<type>getri_npvt_strided_batched()#

rocblas_status rocsolver_zgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_cgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_dgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_sgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#

GETRI_NPVT_STRIDED_BATCHED inverts a batch of general n-by-n matrices using the LU factorization computed by GETRF_NPVT_STRIDED_BATCHED.

The inverse of matrix \(A_j\) in the batch is computed by solving the linear system

\[ A_j^{-1} L_j = U_j^{-1} \]

where \(L_j\) is the lower triangular factor of \(A_j\) with unit diagonal elements, and \(U_j\) is the upper triangular factor.

Parameters:

handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.

The number of rows and columns of all matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).

On entry, the factors L_j and U_j of the factorization A_j = L_j*U_j returned by
GETRF_NPVT_STRIDED_BATCHED. On exit, the inverses of A_j if info[j] = 0; otherwise undefined.
lda – [in]
rocblas_int. lda >= n.

Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.

Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.

If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
batch_count – [in]
rocblas_int. batch_count >= 0.

Number of matrices in the batch.

rocsolver_<type>getri_outofplace()#

rocblas_status rocsolver_zgetri_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)#

rocblas_status rocsolver_cgetri_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)#

rocblas_status rocsolver_dgetri_outofplace(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, double *C, const rocblas_int ldc, rocblas_int *info)#

rocblas_status rocsolver_sgetri_outofplace(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, float *C, const rocblas_int ldc, rocblas_int *info)#

GETRI_OUTOFPLACE computes the inverse \(C = A^{-1}\) of a general n-by-n matrix A.

The inverse is computed by solving the linear system

\[ AC = I \]

where I is the identity matrix, and A is factorized as \(A = PLU\) as given by GETRF.

Parameters:

handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.

The number of rows and columns of the matrix A.
A – [in]
pointer to type. Array on the GPU of dimension lda*n.

The factors L and U of the factorization A = P*L*U returned by
GETRF.
lda – [in]
rocblas_int. lda >= n.

Specifies the leading dimension of A.
ipiv – [in]
pointer to rocblas_int. Array on the GPU of dimension n.

The pivot indices returned by
GETRF.
C – [out]
pointer to type. Array on the GPU of dimension ldc*n.

If info = 0, the inverse of A. Otherwise, undefined.
ldc – [in]
rocblas_int. ldc >= n.

Specifies the leading dimension of C.
info – [out]
pointer to a rocblas_int on the GPU.

If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.

rocsolver_<type>getri_outofplace_batched()#

rocblas_status rocsolver_zgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_cgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_dgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, double *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_sgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, float *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#

GETRI_OUTOFPLACE_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\).

The inverse is computed by solving the linear system

\[ A_j C_j = I \]

where I is the identity matrix, and \(A_j\) is factorized as \(A_j = P_j L_j U_j\) as given by GETRF_BATCHED.

Parameters:

handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.

The number of rows and columns of all matrices A_j in the batch.
A – [in]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

The factors L_j and U_j of the factorization A_j = P_j*L_j*U_j returned by
GETRF_BATCHED.
lda – [in]
rocblas_int. lda >= n.

Specifies the leading dimension of matrices A_j.
ipiv – [in]
pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).

The pivot indices returned by
GETRF_BATCHED.
strideP – [in]
rocblas_stride.

Stride from the start of one vector ipiv_j to the next one ipiv_(i+j). There is no restriction for the value of strideP. Normal use case is strideP >= n.
C – [out]
array of pointers to type. Each pointer points to an array on the GPU of dimension ldc*n.

If info[j] = 0, the inverse of matrices A_j. Otherwise, undefined.
ldc – [in]
rocblas_int. ldc >= n.

Specifies the leading dimension of C_j.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.

If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
batch_count – [in]
rocblas_int. batch_count >= 0.

Number of matrices in the batch.

rocsolver_<type>getri_outofplace_strided_batched()#

rocblas_status rocsolver_zgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_cgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_dgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, double *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_sgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, float *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#

GETRI_OUTOFPLACE_STRIDED_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\).

The inverse is computed by solving the linear system

\[ A_j C_j = I \]

where I is the identity matrix, and \(A_j\) is factorized as \(A_j = P_j L_j U_j\) as given by GETRF_STRIDED_BATCHED.

Parameters:

handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.

The number of rows and columns of all matrices A_j in the batch.
A – [in]
pointer to type. Array on the GPU (the size depends on the value of strideA).

The factors L_j and U_j of the factorization A_j = P_j*L_j*U_j returned by
GETRF_STRIDED_BATCHED.
lda – [in]
rocblas_int. lda >= n.

Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.

Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
ipiv – [in]
pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).

The pivot indices returned by
GETRF_STRIDED_BATCHED.
strideP – [in]
rocblas_stride.

Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.
C – [out]
pointer to type. Array on the GPU (the size depends on the value of strideC).

If info[j] = 0, the inverse of matrices A_j. Otherwise, undefined.
ldc – [in]
rocblas_int. ldc >= n.

Specifies the leading dimension of C_j.
strideC – [in]
rocblas_stride.

Stride from the start of one matrix C_j to the next one C_(j+1). There is no restriction for the value of strideC. Normal use case is strideC >= ldc*n
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.

If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
batch_count – [in]
rocblas_int. batch_count >= 0.

Number of matrices in the batch.

rocsolver_<type>getri_npvt_outofplace()#

rocblas_status rocsolver_zgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)#

rocblas_status rocsolver_cgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)#

rocblas_status rocsolver_dgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, double *C, const rocblas_int ldc, rocblas_int *info)#

rocblas_status rocsolver_sgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, float *C, const rocblas_int ldc, rocblas_int *info)#

GETRI_NPVT_OUTOFPLACE computes the inverse \(C = A^{-1}\) of a general n-by-n matrix A without partial pivoting.

The inverse is computed by solving the linear system

\[ AC = I \]

where I is the identity matrix, and A is factorized as \(A = LU\) as given by GETRF_NPVT.

Parameters:

handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.

The number of rows and columns of the matrix A.
A – [in]
pointer to type. Array on the GPU of dimension lda*n.

The factors L and U of the factorization A = L*U returned by
GETRF_NPVT.
lda – [in]
rocblas_int. lda >= n.

Specifies the leading dimension of A.
C – [out]
pointer to type. Array on the GPU of dimension ldc*n.

If info = 0, the inverse of A. Otherwise, undefined.
ldc – [in]
rocblas_int. ldc >= n.

Specifies the leading dimension of C.
info – [out]
pointer to a rocblas_int on the GPU.

If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.

rocsolver_<type>getri_npvt_outofplace_batched()#

rocblas_status rocsolver_zgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_cgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_dgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, double *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_sgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, float *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#

GETRI_NPVT_OUTOFPLACE_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\) without partial pivoting.

The inverse is computed by solving the linear system

\[ A_j C_j = I \]

where I is the identity matrix, and \(A_j\) is factorized as \(A_j = L_j U_j\) as given by GETRF_NPVT_BATCHED.

Parameters:

handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.

The number of rows and columns of all matrices A_j in the batch.
A – [in]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

The factors L_j and U_j of the factorization A_j = L_j*U_j returned by
GETRF_NPVT_BATCHED.
lda – [in]
rocblas_int. lda >= n.

Specifies the leading dimension of matrices A_j.
C – [out]
array of pointers to type. Each pointer points to an array on the GPU of dimension ldc*n.

If info[j] = 0, the inverse of matrices A_j. Otherwise, undefined.
ldc – [in]
rocblas_int. ldc >= n.

Specifies the leading dimension of C_j.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.

If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
batch_count – [in]
rocblas_int. batch_count >= 0.

Number of matrices in the batch.

rocsolver_<type>getri_npvt_outofplace_strided_batched()#

rocblas_status rocsolver_zgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_cgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_dgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#

rocblas_status rocsolver_sgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#

GETRI_NPVT_OUTOFPLACE_STRIDED_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\) without partial pivoting.

The inverse is computed by solving the linear system

\[ A_j C_j = I \]

where I is the identity matrix, and \(A_j\) is factorized as \(A_j = L_j U_j\) as given by GETRF_NPVT_STRIDED_BATCHED.

Parameters:

handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.

The number of rows and columns of all matrices A_j in the batch.
A – [in]
pointer to type. Array on the GPU (the size depends on the value of strideA).

The factors L_j and U_j of the factorization A_j = L_j*U_j returned by
GETRF_NPVT_STRIDED_BATCHED.
lda – [in]
rocblas_int. lda >= n.

Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.

Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
C – [out]
pointer to type. Array on the GPU (the size depends on the value of strideC).

If info[j] = 0, the inverse of matrices A_j. Otherwise, undefined.
ldc – [in]
rocblas_int. ldc >= n.

Specifies the leading dimension of C_j.
strideC – [in]
rocblas_stride.

Stride from the start of one matrix C_j to the next one C_(j+1). There is no restriction for the value of strideC. Normal use case is strideC >= ldc*n
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.

If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
batch_count – [in]
rocblas_int. batch_count >= 0.

Number of matrices in the batch.

Lapack-like Functions

Contents

Lapack-like Functions#

Triangular factorizations#

Linear-systems solvers#