Lapack-like Functions#
Other Lapack-like routines provided by rocSOLVER. These are divided into the following subcategories:
Triangular factorizations. Based on Gaussian elimination.
Linear-systems solvers. Based on triangular factorizations.
Note
Throughout the APIs’ descriptions, we use the following notations:
x[i] stands for the i-th element of vector x, while A[i,j] represents the element in the i-th row and j-th column of matrix A. Indices are 1-based, i.e. x[1] is the first element of x.
If X is a real vector or matrix, \(X^T\) indicates its transpose; if X is complex, then \(X^H\) represents its conjugate transpose. When X could be real or complex, we use X’ to indicate X transposed or X conjugate transposed, accordingly.
x_i = \(x_i\); we sometimes use both notations: \(x_i\) when displaying mathematical equations, and x_i in the text describing the function parameters.
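To connect the 1-based notation above with array storage, the following is a minimal sketch of how A[i,j] maps to a memory offset. It assumes the LAPACK column-major convention implied by the leading-dimension parameters used throughout this page; the helper name elem_offset is hypothetical and not part of rocSOLVER.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a rocSOLVER function): offset of the 1-based
   element A[i,j] in a column-major array with leading dimension lda. */
size_t elem_offset(int i, int j, int lda)
{
    assert(i >= 1 && j >= 1 && lda >= i);
    return (size_t)(i - 1) + (size_t)(j - 1) * (size_t)lda;
}
```

For example, with lda = 4 the element A[3,2] sits at offset 6 from the start of the array.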
Triangular factorizations#
rocsolver_<type>getf2_npvt()#
-
rocblas_status rocsolver_zgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)#
-
rocblas_status rocsolver_cgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)#
-
rocblas_status rocsolver_dgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)#
-
rocblas_status rocsolver_sgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)#
GETF2_NPVT computes the LU factorization of a general m-by-n matrix A without partial pivoting.
(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization has the form
\[ A = LU \]where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2 routines instead.
- Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of the matrix A.
n – [in]
rocblas_int. n >= 0.
The number of columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of A.
info – [out]
pointer to a rocblas_int on the GPU.
If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.
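The packed output and the meaning of info described above can be illustrated with a small CPU reference sketch. This is not the rocSOLVER implementation: it runs on the host, uses plain int and double in place of rocblas_int and the templated type, and the name lu_nopivot_ref is hypothetical. It also assumes column-major storage and simply stops at the first zero pivot.

```c
#include <assert.h>

/* CPU reference sketch (not rocSOLVER code): in-place LU without pivoting.
   A is column-major, m-by-n, with leading dimension lda.
   On return, the strictly lower part holds L (unit diagonal not stored)
   and the upper part holds U.
   Returns 0 on success, or the 1-based index i of the first zero U[i,i];
   in that case this sketch stops and leaves the rest unfactored. */
int lu_nopivot_ref(int m, int n, double *A, int lda)
{
    assert(m >= 0 && n >= 0 && lda >= m);
    int k_max = (m < n) ? m : n;

    for (int k = 0; k < k_max; ++k) {
        double pivot = A[k + k * lda];
        if (pivot == 0.0)
            return k + 1;

        /* Scale the column below the pivot to form column k of L. */
        for (int i = k + 1; i < m; ++i)
            A[i + k * lda] /= pivot;

        /* Rank-1 update of the trailing submatrix. */
        for (int j = k + 1; j < n; ++j)
            for (int i = k + 1; i < m; ++i)
                A[i + j * lda] -= A[i + k * lda] * A[k + j * lda];
    }
    return 0;
}
```

For the 2-by-2 matrix with rows (4, 3) and (6, 3), the sketch leaves 1.5 below the diagonal (the only stored entry of L) and the rows (4, 3) and (0, -1.5) of U in the upper part.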
rocsolver_<type>getf2_npvt_batched()#
-
rocblas_status rocsolver_zgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
GETF2_NPVT_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.
(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = L_jU_j \]where \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2_BATCHED routines instead.
- Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all matrices A_j in the batch.
A – [inout]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorizations. The unit diagonal elements of L_j are not stored.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
rocsolver_<type>getf2_npvt_strided_batched()#
-
rocblas_status rocsolver_zgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
GETF2_NPVT_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.
(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = L_jU_j \]where \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2_STRIDED_BATCHED routines instead.
- Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorization. The unit diagonal elements of L_j are not stored.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
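The strided layout described by strideA can be sketched as pointer arithmetic. The helpers below are hypothetical (not rocSOLVER functions); they assume a 0-based batch index j, use int64_t as a stand-in for rocblas_stride, and the size formula only covers the normal use case strideA >= lda*n.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: start of matrix A_j (0-based j) in a strided batch. */
double *strided_matrix(double *A, int64_t strideA, int j)
{
    return A + (int64_t)j * strideA;
}

/* Hypothetical helper: number of elements the buffer must hold,
   assuming the normal use case strideA >= lda*n. */
size_t strided_batch_elems(int n, int lda, int64_t strideA, int batch_count)
{
    assert(n >= 0 && lda >= 0 && batch_count >= 0);
    assert(strideA >= (int64_t)lda * n);
    if (batch_count == 0)
        return 0;
    return (size_t)strideA * (size_t)(batch_count - 1) + (size_t)lda * (size_t)n;
}
```

With n = 3, lda = 3, strideA = 10 and two matrices, the second matrix starts 10 elements into the buffer and 19 elements are needed in total.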
rocsolver_<type>getrf_npvt()#
-
rocblas_status rocsolver_zgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)#
-
rocblas_status rocsolver_cgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)#
-
rocblas_status rocsolver_dgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)#
-
rocblas_status rocsolver_sgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)#
GETRF_NPVT computes the LU factorization of a general m-by-n matrix A without partial pivoting.
(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization has the form
\[ A = LU \]where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF routines instead.
- Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of the matrix A.
n – [in]
rocblas_int. n >= 0.
The number of columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of A.
info – [out]
pointer to a rocblas_int on the GPU.
If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.
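The stability caveat in the note above can be seen on a 2-by-2 example. The sketch below is a host-side illustration in double precision, not rocSOLVER code, and both function names are hypothetical. It factors the matrix with rows (eps, 1) and (1, 1), once as given and once with the rows swapped as partial pivoting would do, and returns the (2,2) entry of the product L*U, which should equal 1 in both cases.

```c
/* Hypothetical illustration (not rocSOLVER code).
   A = [[eps, 1], [1, 1]] factored without pivoting.
   Returns (L*U)[2,2], which is 1 in exact arithmetic. */
double reconstructed_a22_nopivot(double eps)
{
    double l21 = 1.0 / eps;        /* multiplier; huge when eps is tiny */
    double u22 = 1.0 - l21 * 1.0;  /* the 1.0 is lost to rounding if l21 is huge */
    return l21 * 1.0 + u22;
}

/* Same matrix with rows swapped first: [[1, 1], [eps, 1]].
   Returns (L*U)[2,2], which is 1 in exact arithmetic. */
double reconstructed_a22_pivot(double eps)
{
    double l21 = eps / 1.0;        /* multiplier is now at most 1 in magnitude */
    double u22 = 1.0 - l21 * 1.0;
    return l21 * 1.0 + u22;
}
```

With eps = 1e-20, the factors computed without pivoting reproduce 0 instead of 1 in that entry, while the row-swapped variant reproduces 1. Neither call would report a zero pivot through info, so a successful info value does not by itself indicate an accurate factorization.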
rocsolver_<type>getrf_npvt_batched()#
-
rocblas_status rocsolver_zgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
GETRF_NPVT_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.
(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = L_jU_j \]where \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF_BATCHED routines instead.
- Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all matrices A_j in the batch.
A – [inout]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorizations. The unit diagonal elements of L_j are not stored.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
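Because info is an array on the GPU, it has to be copied to the host (for example with hipMemcpy) before it can be inspected. A minimal sketch of the host-side check follows; the helper name first_failed_matrix is hypothetical, it assumes rocblas_int is a plain int, and it returns a 0-based batch index.

```c
#include <assert.h>

/* Hypothetical helper (not a rocSOLVER function): scan a host copy of the
   info array and return the 0-based index of the first matrix that reported
   a zero pivot, or -1 if every factorization succeeded. */
int first_failed_matrix(const int *info_host, int batch_count)
{
    assert(batch_count >= 0);
    for (int j = 0; j < batch_count; ++j)
        if (info_host[j] > 0)
            return j;
    return -1;
}
```

When the helper returns an index j, info_host[j] is the position i of the first zero element U_j[i,i].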
rocsolver_<type>getrf_npvt_strided_batched()#
-
rocblas_status rocsolver_zgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
GETRF_NPVT_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.
(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).
The factorization of matrix \(A_j\) in the batch has the form
\[ A_j = L_jU_j \]where \(L_j\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_j\) is upper triangular (upper trapezoidal if m < n).
Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF_STRIDED_BATCHED routines instead.
- Parameters:
handle – [in] rocblas_handle.
m – [in]
rocblas_int. m >= 0.
The number of rows of all matrices A_j in the batch.
n – [in]
rocblas_int. n >= 0.
The number of columns of all matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the m-by-n matrices A_j to be factored. On exit, the factors L_j and U_j from the factorization. The unit diagonal elements of L_j are not stored.
lda – [in]
rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for factorization of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero element in the diagonal. The factorization from this point might be incomplete.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
Linear-systems solvers#
rocsolver_<type>getri_npvt()#
-
rocblas_status rocsolver_zgetri_npvt(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)#
-
rocblas_status rocsolver_cgetri_npvt(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)#
-
rocblas_status rocsolver_dgetri_npvt(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)#
-
rocblas_status rocsolver_sgetri_npvt(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)#
GETRI_NPVT inverts a general n-by-n matrix A using the LU factorization computed by GETRF_NPVT.
The inverse is computed by solving the linear system
\[ A^{-1}L = U^{-1} \]where L is the lower triangular factor of A with unit diagonal elements, and U is the upper triangular factor.
- Parameters:
handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of the matrix A.
A – [inout]
pointer to type. Array on the GPU of dimension lda*n.
On entry, the factors L and U of the factorization A = L*U returned by GETRF_NPVT. On exit, the inverse of A if info = 0; otherwise undefined.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of A.
info – [out]
pointer to a rocblas_int on the GPU.
If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
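The linear system stated for GETRI_NPVT follows directly from the factorization. A short derivation, assuming U is nonsingular (info = 0):

\[ A = LU \;\Longrightarrow\; A^{-1} = U^{-1}L^{-1} \;\Longrightarrow\; A^{-1}L = U^{-1} \]

One way to solve it, and the approach taken by LAPACK's GETRI, is to invert the triangular factor U first and then solve the triangular system with L for \(A^{-1}\). Whether rocSOLVER follows exactly the same steps internally is not stated here.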
rocsolver_<type>getri_npvt_batched()#
-
rocblas_status rocsolver_zgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)#
GETRI_NPVT_BATCHED inverts a batch of general n-by-n matrices using the LU factorization computed by GETRF_NPVT_BATCHED.
The inverse of matrix \(A_j\) in the batch is computed by solving the linear system
\[ A_j^{-1} L_j = U_j^{-1} \]where \(L_j\) is the lower triangular factor of \(A_j\) with unit diagonal elements, and \(U_j\) is the upper triangular factor.
- Parameters:
handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of all matrices A_j in the batch.
A – [inout]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the factors L_j and U_j of the factorization A_j = L_j*U_j returned by GETRF_NPVT_BATCHED. On exit, the inverses of A_j if info[j] = 0; otherwise undefined.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of matrices A_j.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
rocsolver_<type>getri_npvt_strided_batched()#
-
rocblas_status rocsolver_zgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)#
GETRI_NPVT_STRIDED_BATCHED inverts a batch of general n-by-n matrices using the LU factorization computed by GETRF_NPVT_STRIDED_BATCHED.
The inverse of matrix \(A_j\) in the batch is computed by solving the linear system
\[ A_j^{-1} L_j = U_j^{-1} \]where \(L_j\) is the lower triangular factor of \(A_j\) with unit diagonal elements, and \(U_j\) is the upper triangular factor.
- Parameters:
handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of all matrices A_j in the batch.
A – [inout]
pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the factors L_j and U_j of the factorization A_j = L_j*U_j returned by GETRF_NPVT_STRIDED_BATCHED. On exit, the inverses of A_j if info[j] = 0; otherwise undefined.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
rocsolver_<type>getri_outofplace()#
-
rocblas_status rocsolver_zgetri_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)#
-
rocblas_status rocsolver_cgetri_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)#
-
rocblas_status rocsolver_dgetri_outofplace(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, double *C, const rocblas_int ldc, rocblas_int *info)#
-
rocblas_status rocsolver_sgetri_outofplace(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, float *C, const rocblas_int ldc, rocblas_int *info)#
GETRI_OUTOFPLACE computes the inverse \(C = A^{-1}\) of a general n-by-n matrix A.
The inverse is computed by solving the linear system
\[ AC = I \]where I is the identity matrix, and A is factorized as \(A = PLU\) as given by GETRF.
- Parameters:
handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of the matrix A.
A – [in]
pointer to type. Array on the GPU of dimension lda*n.
The factors L and U of the factorization A = P*L*U returned by GETRF.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of A.
ipiv – [in]
pointer to rocblas_int. Array on the GPU of dimension n.
The pivot indices returned by GETRF.
C – [out]
pointer to type. Array on the GPU of dimension ldc*n.
If info = 0, the inverse of A. Otherwise, undefined.
ldc – [in]
rocblas_int. ldc >= n.
Specifies the leading dimension of C.
info – [out]
pointer to a rocblas_int on the GPU.
If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
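To see how the pivoted factorization enters the system \(AC = I\), here is a short derivation. It assumes U is nonsingular (info = 0) and uses the fact that a permutation matrix satisfies \(P^{-1} = P^T\):

\[ PLU\,C = I \;\Longrightarrow\; LU\,C = P^{T} \;\Longrightarrow\; C = U^{-1}L^{-1}P^{T} \]

In other words, C can be obtained by applying the row interchanges recorded in ipiv to the identity matrix and then performing one triangular solve with L and one with U.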
rocsolver_<type>getri_outofplace_batched()#
-
rocblas_status rocsolver_zgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, double *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, float *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#
GETRI_OUTOFPLACE_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\).
The inverse is computed by solving the linear system
\[ A_j C_j = I \]where I is the identity matrix, and \(A_j\) is factorized as \(A_j = P_j L_j U_j\) as given by GETRF_BATCHED.
- Parameters:
handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of all matrices A_j in the batch.
A – [in]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
The factors L_j and U_j of the factorization A_j = P_j*L_j*U_j returned by GETRF_BATCHED.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of matrices A_j.
ipiv – [in]
pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).
The pivot indices returned by GETRF_BATCHED.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.
C – [out]
array of pointers to type. Each pointer points to an array on the GPU of dimension ldc*n.
If info[j] = 0, the inverse of matrix A_j. Otherwise, undefined.
ldc – [in]
rocblas_int. ldc >= n.
Specifies the leading dimension of C_j.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
rocsolver_<type>getri_outofplace_strided_batched()#
-
rocblas_status rocsolver_zgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, double *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, float *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#
GETRI_OUTOFPLACE_STRIDED_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\).
The inverse is computed by solving the linear system
\[ A_j C_j = I \]where I is the identity matrix, and \(A_j\) is factorized as \(A_j = P_j L_j U_j\) as given by GETRF_STRIDED_BATCHED.
- Parameters:
handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of all matrices A_j in the batch.
A – [in]
pointer to type. Array on the GPU (the size depends on the value of strideA).
The factors L_j and U_j of the factorization A_j = P_j*L_j*U_j returned by GETRF_STRIDED_BATCHED.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
ipiv – [in]
pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).
The pivot indices returned by GETRF_STRIDED_BATCHED.
strideP – [in]
rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.
C – [out]
pointer to type. Array on the GPU (the size depends on the value of strideC).
If info[j] = 0, the inverse of matrix A_j. Otherwise, undefined.
ldc – [in]
rocblas_int. ldc >= n.
Specifies the leading dimension of C_j.
strideC – [in]
rocblas_stride.
Stride from the start of one matrix C_j to the next one C_(j+1). There is no restriction for the value of strideC. Normal use case is strideC >= ldc*n.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
rocsolver_<type>getri_npvt_outofplace()#
-
rocblas_status rocsolver_zgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)#
-
rocblas_status rocsolver_cgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)#
-
rocblas_status rocsolver_dgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, double *C, const rocblas_int ldc, rocblas_int *info)#
-
rocblas_status rocsolver_sgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, float *C, const rocblas_int ldc, rocblas_int *info)#
GETRI_NPVT_OUTOFPLACE computes the inverse \(C = A^{-1}\) of a general n-by-n matrix A without partial pivoting.
The inverse is computed by solving the linear system
\[ AC = I \]where I is the identity matrix, and A is factorized as \(A = LU\) as given by GETRF_NPVT.
- Parameters:
handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of the matrix A.
A – [in]
pointer to type. Array on the GPU of dimension lda*n.
The factors L and U of the factorization A = L*U returned by GETRF_NPVT.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of A.
C – [out]
pointer to type. Array on the GPU of dimension ldc*n.
If info = 0, the inverse of A. Otherwise, undefined.
ldc – [in]
rocblas_int. ldc >= n.
Specifies the leading dimension of C.
info – [out]
pointer to a rocblas_int on the GPU.
If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
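The out-of-place inversion from unpivoted factors can be sketched on the CPU as one forward and one back substitution per column of the identity. This is only an illustration under stated assumptions, not the rocSOLVER implementation: it runs on the host, uses int and double in place of rocblas_int and the templated type, assumes column-major storage, and the name inverse_from_lu_ref is hypothetical.

```c
#include <assert.h>

/* CPU reference sketch (not rocSOLVER code): given the packed factors of
   A = L*U (unit lower L below the diagonal, U on and above it, column-major,
   leading dimension lda), write inverse(A) into C (leading dimension ldc).
   The input factors are left untouched.
   Returns 0 on success, or the 1-based index of the first zero U[i,i]. */
int inverse_from_lu_ref(int n, const double *LU, int lda, double *C, int ldc)
{
    assert(n >= 0 && lda >= n && ldc >= n);

    for (int i = 0; i < n; ++i)
        if (LU[i + i * lda] == 0.0)
            return i + 1;

    for (int col = 0; col < n; ++col) {
        double *x = C + col * ldc;

        /* Start from column col of the identity matrix. */
        for (int i = 0; i < n; ++i)
            x[i] = (i == col) ? 1.0 : 0.0;

        /* Forward substitution with L (unit diagonal). */
        for (int k = 0; k < n; ++k)
            for (int i = k + 1; i < n; ++i)
                x[i] -= LU[i + k * lda] * x[k];

        /* Back substitution with U. */
        for (int k = n - 1; k >= 0; --k) {
            x[k] /= LU[k + k * lda];
            for (int i = 0; i < k; ++i)
                x[i] -= LU[i + k * lda] * x[k];
        }
    }
    return 0;
}
```

As with the library routine, the factors are read-only input and the result goes to a separate array, so the same factorization can be reused afterwards.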
rocsolver_<type>getri_npvt_outofplace_batched()#
-
rocblas_status rocsolver_zgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, double *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, float *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)#
GETRI_NPVT_OUTOFPLACE_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\) without partial pivoting.
The inverse is computed by solving the linear system
\[ A_j C_j = I \]
where I is the identity matrix, and \(A_j\) is factorized as \(A_j = L_j U_j\) as given by GETRF_NPVT_BATCHED.
- Parameters:
handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of all matrices A_j in the batch.
A – [in]
array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
The factors L_j and U_j of the factorization A_j = L_j*U_j returned by GETRF_NPVT_BATCHED.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of matrices A_j.
C – [out]
array of pointers to type. Each pointer points to an array on the GPU of dimension ldc*n.
If info[j] = 0, the inverse of matrix A_j. Otherwise, undefined.
ldc – [in]
rocblas_int. ldc >= n.
Specifies the leading dimension of C_j.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
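Because each entry of info reports on one matrix, a caller typically inspects the whole array before using any C_j. The sketch below shows one way to do that on the host. It is a hypothetical helper, not part of rocSOLVER, and it assumes that info has already been copied from the GPU into host memory and that rocblas_int is a 32-bit int (the default build).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side helper, not part of rocSOLVER.
   info_host: host copy of the device info array, one entry per matrix.
   Returns the 0-based position of the first nonzero entry, i.e. the
   first matrix whose inversion failed, or -1 if every entry is zero. */
int first_failed_inversion(const int *info_host, int batch_count)
{
    assert(batch_count >= 0);
    assert(batch_count == 0 || info_host != NULL);

    for (int j = 0; j < batch_count; ++j)
        if (info_host[j] != 0)
            return j;
    return -1;
}
```

When the helper returns a position j, the value info_host[j] is the 1-based index i of the first zero pivot in the corresponding U factor, and the matching C matrix should be treated as undefined.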
rocsolver_<type>getri_npvt_outofplace_strided_batched()#
-
rocblas_status rocsolver_zgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_cgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_dgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#
-
rocblas_status rocsolver_sgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)#
GETRI_NPVT_OUTOFPLACE_STRIDED_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\) without partial pivoting.
The inverse is computed by solving the linear system
\[ A_j C_j = I \]
where I is the identity matrix, and \(A_j\) is factorized as \(A_j = L_j U_j\) as given by GETRF_NPVT_STRIDED_BATCHED.
- Parameters:
handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.
The number of rows and columns of all matrices A_j in the batch.
A – [in]
pointer to type. Array on the GPU (the size depends on the value of strideA).
The factors L_j and U_j of the factorization A_j = L_j*U_j returned by GETRF_NPVT_STRIDED_BATCHED.
lda – [in]
rocblas_int. lda >= n.
Specifies the leading dimension of matrices A_j.
strideA – [in]
rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction on the value of strideA. The normal use case is strideA >= lda*n.
C – [out]
pointer to type. Array on the GPU (the size depends on the value of strideC).
If info[j] = 0, the inverse of matrix A_j. Otherwise, undefined.
ldc – [in]
rocblas_int. ldc >= n.
Specifies the leading dimension of C_j.
strideC – [in]
rocblas_stride.
Stride from the start of one matrix C_j to the next one C_(j+1). There is no restriction on the value of strideC. The normal use case is strideC >= ldc*n.
info – [out]
pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
batch_count – [in]
rocblas_int. batch_count >= 0.
Number of matrices in the batch.
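The strided-batched layout can be summarized with two small index computations, sketched below in C. Both helpers are hypothetical and not part of rocSOLVER. They assume column-major storage, a non-negative stride in the normal range (stride >= ld*n), a 0-based batch index j, and the 1-based row and column indices used throughout this document.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, not part of rocSOLVER. */

/* Offset, in elements, of A_j[i,k] from the start of the buffer.
   j is a 0-based batch index; i and k are 1-based row and column
   indices; ld is the leading dimension (lda or ldc). */
size_t strided_offset(size_t j, size_t stride, int i, int k, int ld)
{
    assert(i >= 1 && k >= 1 && ld >= i);
    return j * stride + (size_t)(i - 1) + (size_t)(k - 1) * (size_t)ld;
}

/* Number of elements to allocate for batch_count matrices with n
   columns each, counting ld*n elements for the last matrix and
   assuming the normal use case stride >= ld*n. */
size_t strided_min_elements(int n, int ld, size_t stride, int batch_count)
{
    assert(n >= 0 && ld >= n && batch_count >= 0);
    if (n == 0 || batch_count == 0)
        return 0;
    return (size_t)(batch_count - 1) * stride + (size_t)ld * (size_t)n;
}
```

The same two functions apply to both buffers: pass lda and strideA for the factors, and ldc and strideC for the inverses.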