rocsolver_sgesvd_strided_batched Interface Reference
GESVD_STRIDED_BATCHED computes the singular values and, optionally, the singular vectors of a batch of general m-by-n matrices A_j (Singular Value Decomposition).
Public Member Functions
integer(kind(rocblas_status_success)) function rocsolver_sgesvd_strided_batched_ (handle, left_svect, right_svect, m, n, A, lda, strideA, S, strideS, U, ldu, strideU, V, ldv, strideV, E, strideE, fast_alg, myInfo, batch_count)
integer(kind(rocblas_status_success)) function rocsolver_sgesvd_strided_batched_full_rank (handle, left_svect, right_svect, m, n, A, lda, strideA, S, strideS, U, ldu, strideU, V, ldv, strideV, E, strideE, fast_alg, myInfo, batch_count)
integer(kind(rocblas_status_success)) function rocsolver_sgesvd_strided_batched_rank_0 (handle, left_svect, right_svect, m, n, A, lda, strideA, S, strideS, U, ldu, strideU, V, ldv, strideV, E, strideE, fast_alg, myInfo, batch_count)
integer(kind(rocblas_status_success)) function rocsolver_sgesvd_strided_batched_rank_1 (handle, left_svect, right_svect, m, n, A, lda, strideA, S, strideS, U, ldu, strideU, V, ldv, strideV, E, strideE, fast_alg, myInfo, batch_count)
Detailed Description
GESVD_STRIDED_BATCHED computes the singular values and, optionally, the singular vectors of a batch of general m-by-n matrices A_j (Singular Value Decomposition).
The SVD of matrix A_j in the batch is given by:
\[ A_j = U_j S_j V_j' \]
where the m-by-n matrix \(S_j\) is zero except, possibly, for its min(m,n) diagonal elements, which are the singular values of \(A_j\). \(U_j\) and \(V_j\) are orthogonal (unitary) matrices. The first min(m,n) columns of \(U_j\) and \(V_j\) are the left and right singular vectors of \(A_j\), respectively.
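As a small illustration of the factorization above, the following sketch checks \(A = U S V'\) for one hand-picked 3-by-2 matrix in plain Python. The matrices and the `matmul` helper are illustrative assumptions chosen for this example; they are not output of rocSOLVER and no GPU call is made.

```python
def matmul(x, y):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]

# Hand-picked example with m = 3, n = 2 (not produced by rocSOLVER).
A = [[3.0, 0.0],
     [0.0, 4.0],
     [0.0, 0.0]]

# U is m-by-m and orthogonal (here a permutation matrix).
U = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]

# S is m-by-n; its min(m,n) diagonal entries are the singular
# values in decreasing order: 4, 3.
S = [[4.0, 0.0],
     [0.0, 3.0],
     [0.0, 0.0]]

# Vt holds V', i.e. the right singular vectors stored as rows.
Vt = [[0.0, 1.0],
      [1.0, 0.0]]

reconstructed = matmul(matmul(U, S), Vt)
print(reconstructed == A)
```

Only the first min(m,n) = 2 columns of `U` pair with a singular value; the third column merely completes the orthogonal basis.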
The computation of the singular vectors is optional and is controlled by the function arguments left_svect and right_svect, as described below. When computed, this function returns the transpose (or conjugate transpose) of the right singular vectors, i.e. the rows of \(V_j'\).
left_svect and right_svect are rocblas_svect enums that can take the following values:
- rocblas_svect_all: the entire matrix \(U_j\) (or \(V_j'\)) is computed,
- rocblas_svect_singular: only the singular vectors (first min(m,n) columns of \(U_j\) or rows of \(V_j'\)) are computed,
- rocblas_svect_overwrite: the first columns (or rows) of \(A_j\) are overwritten with the singular vectors, or
- rocblas_svect_none: no columns (or rows) of \(U_j\) (or \(V_j'\)) are computed, i.e. no singular vectors.
left_svect and right_svect cannot both be set to overwrite. When neither is set to overwrite, the contents of \(A_j\) are destroyed by the time the function returns.
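The effect of these options on the output shapes can be summarized with a short sketch. The function name `svect_shapes` and the string labels are hypothetical stand-ins for the rocblas_svect enum values; the shapes follow from the option descriptions above, and `None` marks an output array that is not referenced.

```python
def svect_shapes(left_svect, right_svect, m, n):
    """Return (shape of U_j, shape of V_j') as (rows, cols) tuples.

    The strings "all", "singular", "overwrite" and "none" are
    hypothetical stand-ins for the rocblas_svect_* enum values.
    None means the corresponding array is not referenced.
    """
    valid = {"all", "singular", "overwrite", "none"}
    if left_svect not in valid or right_svect not in valid:
        raise ValueError("unknown svect option")
    if left_svect == "overwrite" and right_svect == "overwrite":
        raise ValueError("left_svect and right_svect cannot both be overwrite")

    k = min(m, n)
    # "all" yields the full square factor; "singular" only the
    # first min(m,n) columns of U_j or rows of V_j'.
    u_shape = {"all": (m, m), "singular": (m, k)}.get(left_svect)
    vt_shape = {"all": (n, n), "singular": (k, n)}.get(right_svect)
    return u_shape, vt_shape


print(svect_shapes("all", "singular", 5, 3))
print(svect_shapes("singular", "none", 2, 6))
```

With overwrite or none, the corresponding array (U or V) is not referenced, which is why the sketch returns `None` for it.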
- Note
- When m >> n (or n >> m), the algorithm can be sped up by compressing the matrix \(A_j\) via a QR (or LQ) factorization and working with the triangular factor afterwards (thin-SVD). If the singular vectors are also requested, their computation can be sped up as well by executing some intermediate operations out-of-place and relying more on matrix multiplications (GEMMs); this requires, however, a larger memory workspace. The parameter fast_alg controls whether the fast algorithm is executed. For more details, see the "Tuning rocSOLVER performance" and "Memory model" sections of the documentation.
- Parameters
  - [in] handle: rocblas_handle.
  - [in] left_svect: rocblas_svect. Specifies how the left singular vectors are computed.
  - [in] right_svect: rocblas_svect. Specifies how the right singular vectors are computed.
  - [in] m: rocblas_int. m >= 0. The number of rows of all matrices A_j in the batch.
  - [in] n: rocblas_int. n >= 0. The number of columns of all matrices A_j in the batch.
  - [in,out] A: pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_j. On exit, if left_svect (or right_svect) is equal to overwrite, the first columns (or rows) of A_j contain the corresponding left (or right) singular vectors; otherwise, the contents of A_j are destroyed.
  - [in] lda: rocblas_int. lda >= m. The leading dimension of A_j.
  - [in] strideA: rocblas_stride. Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
  - [out] S: pointer to real type. Array on the GPU (the size depends on the value of strideS). The singular values of A_j in decreasing order.
  - [in] strideS: rocblas_stride. Stride from the start of one vector S_j to the next one S_(j+1). There is no restriction for the value of strideS. Normal use case is strideS >= min(m,n).
  - [out] U: pointer to type. Array on the GPU (the size depends on the value of strideU). The matrices U_j of left singular vectors stored as columns. Not referenced if left_svect is set to overwrite or none.
  - [in] ldu: rocblas_int. ldu >= m if left_svect is all or singular; ldu >= 1 otherwise. The leading dimension of U_j.
  - [in] strideU: rocblas_stride. Stride from the start of one matrix U_j to the next one U_(j+1). There is no restriction for the value of strideU. Normal use case is strideU >= ldu*min(m,n) if left_svect is set to singular, or strideU >= ldu*m if left_svect is set to all.
  - [out] V: pointer to type. Array on the GPU (the size depends on the value of strideV). The matrices V_j of right singular vectors stored as rows (transposed or conjugate-transposed). Not referenced if right_svect is set to overwrite or none.
  - [in] ldv: rocblas_int. ldv >= n if right_svect is all; ldv >= min(m,n) if right_svect is set to singular; ldv >= 1 otherwise. The leading dimension of V_j.
  - [in] strideV: rocblas_stride. Stride from the start of one matrix V_j to the next one V_(j+1). There is no restriction for the value of strideV. Normal use case is strideV >= ldv*n.
  - [out] E: pointer to real type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the bidiagonal matrix B_j associated with A_j (using BDSQR). On exit, if info[j] > 0, E_j contains the unconverged off-diagonal elements of B_j (or, properly speaking, of a bidiagonal matrix orthogonally equivalent to B_j). The diagonal elements of this matrix are in S_j; those that converged correspond to a subset of the singular values of A_j (not necessarily ordered).
  - [in] strideE: rocblas_stride. Stride from the start of one vector E_j to the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)-1.
  - [in] fast_alg: rocblas_workmode. If set to rocblas_outofplace, the function will execute the fast thin-SVD version of the algorithm when possible.
  - [out] info: pointer to a rocblas_int on the GPU (named myInfo in the Fortran interfaces on this page). If info[j] = 0, successful exit. If info[j] = i > 0, BDSQR did not converge: i elements of E_j did not converge to zero.
  - [in] batch_count: rocblas_int. batch_count >= 0. Number of matrices in the batch.
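The leading-dimension and stride parameters describe a column-major, strided batch layout. The sketch below shows how they translate into element offsets and a conventional allocation size, and how a host copy of info could be scanned. It assumes 0-based indices and the "normal use case" stride >= ld*cols; the helper names `element_offset`, `min_elements` and `unconverged_batches` are hypothetical and not part of rocSOLVER or hipfort.

```python
def element_offset(j, row, col, ld, stride):
    """0-based offset of element (row, col) of matrix j in a
    column-major strided batch with leading dimension ld."""
    return j * stride + col * ld + row


def min_elements(batch_count, ld, cols, stride):
    """Conventional number of elements to allocate for the batch,
    assuming stride >= ld * cols (the normal use case)."""
    if batch_count == 0:
        return 0
    # The last matrix starts at (batch_count - 1) * stride and is
    # given a full ld * cols block.
    return (batch_count - 1) * stride + ld * cols


def unconverged_batches(info):
    """Indices j with info[j] > 0, i.e. where BDSQR did not converge.
    `info` is assumed to be a host copy of the GPU array."""
    return [j for j, value in enumerate(info) if value > 0]


# Example: 3 matrices of size 4-by-3, lda = 4, tightly packed.
m, n, lda, batch_count = 4, 3, 4, 3
strideA = lda * n
print(element_offset(1, 2, 1, lda, strideA))
print(min_elements(batch_count, lda, n, strideA))
print(unconverged_batches([0, 2, 0]))
```

The same two helpers apply to U and V by substituting ldu/strideU or ldv/strideV and the column count that matches the chosen svect option.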
Member Function/Subroutine Documentation
rocsolver_sgesvd_strided_batched_()
integer(kind(rocblas_status_success)) function hipfort_rocsolver::rocsolver_sgesvd_strided_batched::rocsolver_sgesvd_strided_batched_ (
    type(c_ptr), value handle,
    integer(kind(rocblas_svect_all)), value left_svect,
    integer(kind(rocblas_svect_all)), value right_svect,
    integer(c_int), value m,
    integer(c_int), value n,
    type(c_ptr), value A,
    integer(c_int), value lda,
    integer(c_int64_t), value strideA,
    type(c_ptr), value S,
    integer(c_int64_t), value strideS,
    type(c_ptr), value U,
    integer(c_int), value ldu,
    integer(c_int64_t), value strideU,
    type(c_ptr), value V,
    integer(c_int), value ldv,
    integer(c_int64_t), value strideV,
    type(c_ptr), value E,
    integer(c_int64_t), value strideE,
    integer(kind(rocblas_outofplace)), value fast_alg,
    integer(c_int) myInfo,
    integer(c_int), value batch_count
)
rocsolver_sgesvd_strided_batched_full_rank()
integer(kind(rocblas_status_success)) function hipfort_rocsolver::rocsolver_sgesvd_strided_batched::rocsolver_sgesvd_strided_batched_full_rank (
    type(c_ptr) handle,
    integer(kind(rocblas_svect_all)) left_svect,
    integer(kind(rocblas_svect_all)) right_svect,
    integer(c_int) m,
    integer(c_int) n,
    real(c_float), dimension(:,:), target A,
    integer(c_int) lda,
    integer(c_int64_t) strideA,
    real(c_float), dimension(:), target S,
    integer(c_int64_t) strideS,
    real(c_float), dimension(:,:), target U,
    integer(c_int) ldu,
    integer(c_int64_t) strideU,
    real(c_float), dimension(:,:), target V,
    integer(c_int) ldv,
    integer(c_int64_t) strideV,
    real(c_float), dimension(:), target E,
    integer(c_int64_t) strideE,
    integer(kind(rocblas_outofplace)) fast_alg,
    integer(c_int) myInfo,
    integer(c_int) batch_count
)
rocsolver_sgesvd_strided_batched_rank_0()
integer(kind(rocblas_status_success)) function hipfort_rocsolver::rocsolver_sgesvd_strided_batched::rocsolver_sgesvd_strided_batched_rank_0 (
    type(c_ptr) handle,
    integer(kind(rocblas_svect_all)) left_svect,
    integer(kind(rocblas_svect_all)) right_svect,
    integer(c_int) m,
    integer(c_int) n,
    real(c_float), target A,
    integer(c_int) lda,
    integer(c_int64_t) strideA,
    real(c_float), target S,
    integer(c_int64_t) strideS,
    real(c_float), target U,
    integer(c_int) ldu,
    integer(c_int64_t) strideU,
    real(c_float), target V,
    integer(c_int) ldv,
    integer(c_int64_t) strideV,
    real(c_float), target E,
    integer(c_int64_t) strideE,
    integer(kind(rocblas_outofplace)) fast_alg,
    integer(c_int) myInfo,
    integer(c_int) batch_count
)
rocsolver_sgesvd_strided_batched_rank_1()
integer(kind(rocblas_status_success)) function hipfort_rocsolver::rocsolver_sgesvd_strided_batched::rocsolver_sgesvd_strided_batched_rank_1 (
    type(c_ptr) handle,
    integer(kind(rocblas_svect_all)) left_svect,
    integer(kind(rocblas_svect_all)) right_svect,
    integer(c_int) m,
    integer(c_int) n,
    real(c_float), dimension(:), target A,
    integer(c_int) lda,
    integer(c_int64_t) strideA,
    real(c_float), dimension(:), target S,
    integer(c_int64_t) strideS,
    real(c_float), dimension(:), target U,
    integer(c_int) ldu,
    integer(c_int64_t) strideU,
    real(c_float), dimension(:), target V,
    integer(c_int) ldv,
    integer(c_int64_t) strideV,
    real(c_float), dimension(:), target E,
    integer(c_int64_t) strideE,
    integer(kind(rocblas_outofplace)) fast_alg,
    integer(c_int) myInfo,
    integer(c_int) batch_count
)