rocsolver_sgesvd_strided_batched Interface Reference

HIPFORT API Reference: hipfort_rocsolver::rocsolver_sgesvd_strided_batched Interface Reference

GESVD_STRIDED_BATCHED computes the singular values and, optionally, the singular vectors of a batch of general m-by-n matrices A_j (Singular Value Decomposition).

Public Member Functions
integer(kind(rocblas_status_success)) function	rocsolver_sgesvd_strided_batched_ (handle, left_svect, right_svect, m, n, A, lda, strideA, S, strideS, U, ldu, strideU, V, ldv, strideV, E, strideE, fast_alg, myInfo, batch_count)

integer(kind(rocblas_status_success)) function	rocsolver_sgesvd_strided_batched_full_rank (handle, left_svect, right_svect, m, n, A, lda, strideA, S, strideS, U, ldu, strideU, V, ldv, strideV, E, strideE, fast_alg, myInfo, batch_count)

integer(kind(rocblas_status_success)) function	rocsolver_sgesvd_strided_batched_rank_0 (handle, left_svect, right_svect, m, n, A, lda, strideA, S, strideS, U, ldu, strideU, V, ldv, strideV, E, strideE, fast_alg, myInfo, batch_count)

integer(kind(rocblas_status_success)) function	rocsolver_sgesvd_strided_batched_rank_1 (handle, left_svect, right_svect, m, n, A, lda, strideA, S, strideS, U, ldu, strideU, V, ldv, strideV, E, strideE, fast_alg, myInfo, batch_count)

Detailed Description

GESVD_STRIDED_BATCHED computes the singular values and, optionally, the singular vectors of a batch of general m-by-n matrices A_j (Singular Value Decomposition).

The SVD of matrix A_j in the batch is given by:

\[ A_j = U_j S_j V_j' \]

where the m-by-n matrix \(S_j\) is zero except, possibly, for its min(m,n) diagonal elements, which are the singular values of \(A_j\). \(U_j\) and \(V_j\) are orthogonal (unitary) matrices. The first min(m,n) columns of \(U_j\) and \(V_j\) are the left and right singular vectors of \(A_j\), respectively.

The computation of the singular vectors is optional and is controlled by the function arguments left_svect and right_svect, as described below. When computed, the right singular vectors are returned transposed (or conjugate-transposed), i.e. as the rows of \(V_j'\).
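To make the factorization concrete, the following is a minimal host-side sketch in plain Python that checks \(A_j = U_j S_j V_j'\) for one hand-picked 2-by-2 matrix. The matrix, its factors, and the helper `matmul` are illustrative assumptions, not part of rocSOLVER or hipfort; the point is only that the columns of U and the rows of V' hold the singular vectors.

```python
import math

def matmul(X, Y):
    """Multiply two dense matrices given as lists of rows (illustrative helper)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Hand-picked example matrix (an assumption for illustration only).
A = [[3.0, 0.0],
     [4.0, 5.0]]

r10 = math.sqrt(10.0)
r2 = math.sqrt(2.0)

# Columns of U are the left singular vectors.
U = [[1 / r10, -3 / r10],
     [3 / r10,  1 / r10]]

# Diagonal of S holds the singular values in decreasing order.
S = [[3 * math.sqrt(5.0), 0.0],
     [0.0, math.sqrt(5.0)]]

# Rows of Vt (that is, V') are the right singular vectors.
Vt = [[ 1 / r2, 1 / r2],
      [-1 / r2, 1 / r2]]

# Rebuild A from its factors and measure the largest entrywise error.
A_rebuilt = matmul(matmul(U, S), Vt)
max_err = max(abs(A_rebuilt[i][j] - A[i][j])
              for i in range(2) for j in range(2))
```

Here `max_err` should be at the level of floating-point round-off, which confirms that the three factors reproduce the original matrix.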

left_svect and right_svect are rocblas_svect enums that can take the following values:

rocblas_svect_all: the entire matrix \(U_j\) (or \(V_j'\)) is computed,
rocblas_svect_singular: only the singular vectors (first min(m,n) columns of \(U_j\) or rows of \(V_j'\)) are computed,
rocblas_svect_overwrite: the first columns (or rows) of \(A_j\) are overwritten with the singular vectors, or
rocblas_svect_none: no columns (or rows) of \(U_j\) (or \(V_j'\)) are computed, i.e. no singular vectors.

left_svect and right_svect cannot both be set to overwrite. When neither is set to overwrite, the contents of \(A_j\) are destroyed by the time the function returns.
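The effect of the four modes on the output arrays can be summarized with a small host-side sketch. The mode strings below are hypothetical stand-ins for the rocblas_svect enum values, and both helper functions are illustrative only; they encode nothing beyond the rules stated above.

```python
def svect_output_shape(mode, m, n, side):
    """Return (rows, cols) of the computed U_j (side='left') or V_j'
    (side='right'), or None when the output array is not referenced.

    `mode` is a hypothetical string stand-in for a rocblas_svect value.
    """
    k = min(m, n)
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    if mode == "all":
        # Entire orthogonal factor: U_j is m-by-m, V_j' is n-by-n.
        return (m, m) if side == "left" else (n, n)
    if mode == "singular":
        # Only the first min(m,n) columns of U_j or rows of V_j'.
        return (m, k) if side == "left" else (k, n)
    if mode in ("overwrite", "none"):
        # Vectors go into A_j (overwrite) or are not computed (none).
        return None
    raise ValueError("unknown mode: " + repr(mode))

def modes_are_valid(left_svect, right_svect):
    """left_svect and right_svect cannot both be set to overwrite."""
    return not (left_svect == "overwrite" and right_svect == "overwrite")
```

For example, with m = 5 and n = 3, the mode `"singular"` yields a 5-by-3 U_j and a 3-by-3 V_j', while `"all"` yields a 5-by-5 U_j and a 3-by-3 V_j'.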

Note: When m >> n (or n >> m), the algorithm can be sped up by first compressing the matrix \(A_j\) via a QR (or LQ) factorization and then working with the triangular factor (thin-SVD). If the singular vectors are also requested, their computation can be sped up as well by executing some intermediate operations out-of-place and relying more on matrix multiplications (GEMMs); this requires, however, a larger memory workspace. The parameter fast_alg controls whether the fast algorithm is executed. For more details, see the "Tuning rocSOLVER performance" and "Memory model" sections of the documentation.

Parameters

[in]	handle	rocblas_handle.
[in]	left_svect	rocblas_svect. Specifies how the left singular vectors are computed.
[in]	right_svect	rocblas_svect. Specifies how the right singular vectors are computed.
[in]	m	rocblas_int. m >= 0. The number of rows of all matrices A_j in the batch.
[in]	n	rocblas_int. n >= 0. The number of columns of all matrices A_j in the batch.
[in,out]	A	pointer to type. Array on the GPU (the size depends on the value of strideA). On entry, the matrices A_j. On exit, if left_svect (or right_svect) is equal to overwrite, the first columns (or rows) of A_j contain the corresponding left (or right) singular vectors; otherwise, the contents of A_j are destroyed.
[in]	lda	rocblas_int. lda >= m. The leading dimension of A_j.
[in]	strideA	rocblas_stride. Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.
[out]	S	pointer to real type. Array on the GPU (the size depends on the value of strideS). The singular values of A_j in decreasing order.
[in]	strideS	rocblas_stride. Stride from the start of one vector S_j to the next one S_(j+1). There is no restriction for the value of strideS. Normal use case is strideS >= min(m,n).
[out]	U	pointer to type. Array on the GPU (the size depends on the value of strideU). The matrices U_j of left singular vectors stored as columns. Not referenced if left_svect is set to overwrite or none.
[in]	ldu	rocblas_int. ldu >= m if left_svect is all or singular; ldu >= 1 otherwise. The leading dimension of U_j.
[in]	strideU	rocblas_stride. Stride from the start of one matrix U_j to the next one U_(j+1). There is no restriction for the value of strideU. Normal use case is strideU >= ldu*min(m,n) if left_svect is set to singular, or strideU >= ldu*m when left_svect is equal to all.
[out]	V	pointer to type. Array on the GPU (the size depends on the value of strideV). The matrices V_j of right singular vectors stored as rows (transposed / conjugate-transposed). Not referenced if right_svect is set to overwrite or none.
[in]	ldv	rocblas_int. ldv >= n if right_svect is all; ldv >= min(m,n) if right_svect is set to singular; or ldv >= 1 otherwise. The leading dimension of V_j.
[in]	strideV	rocblas_stride. Stride from the start of one matrix V_j to the next one V_(j+1). There is no restriction for the value of strideV. Normal use case is strideV >= ldv*n.
[out]	E	pointer to real type. Array on the GPU (the size depends on the value of strideE). This array is used to work internally with the bidiagonal matrix B_j associated with A_j (using BDSQR). On exit, if info > 0, E_j contains the unconverged off-diagonal elements of B_j (or properly speaking, a bidiagonal matrix orthogonally equivalent to B_j). The diagonal elements of this matrix are in S_j; those that converged correspond to a subset of the singular values of A_j (not necessarily ordered).
[in]	strideE	rocblas_stride. Stride from the start of one vector E_j to the next one E_(j+1). There is no restriction for the value of strideE. Normal use case is strideE >= min(m,n)-1.
[in]	fast_alg	rocblas_workmode. If set to rocblas_outofplace, the function will execute the fast thin-SVD version of the algorithm when possible.
[out]	info	pointer to a rocblas_int on the GPU (this argument appears as myInfo in the interface signatures above). If info[j] = 0, successful exit. If info[j] = i > 0, BDSQR did not converge: i elements of E_j did not converge to zero.
[in]	batch_count	rocblas_int. batch_count >= 0. Number of matrices in the batch.
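The "normal use case" stride bounds listed above can be collected into a short host-side sketch. This is plain Python arithmetic, not a rocSOLVER or hipfort call; the function names and the string values for left_svect are hypothetical, and the offset formula assumes the column-major, 0-based layout implied by the leading-dimension constraints (lda >= m).

```python
def min_strides(m, n, lda, ldu, ldv, left_svect="singular"):
    """Smallest strides (in elements) for the normal use case described above.

    `left_svect` is a hypothetical string stand-in for the rocblas_svect value.
    """
    k = min(m, n)
    return {
        "strideA": lda * n,
        "strideS": k,
        # ldu*m when all of U_j is computed, ldu*min(m,n) for singular.
        "strideU": ldu * (m if left_svect == "all" else k),
        "strideV": ldv * n,
        # min(m,n)-1 off-diagonal elements; clamped at 0 here as an assumption.
        "strideE": max(k - 1, 0),
    }

def a_offset(j, row, col, lda, strideA):
    """0-based linear offset of element (row, col) of A_j in the batch array,
    assuming column-major storage within each matrix."""
    return j * strideA + col * lda + row

# Example: a batch of two 4-by-3 matrices, tightly packed.
m, n, batch_count = 4, 3, 2
strides = min_strides(m, n, lda=4, ldu=4, ldv=3)

# Element counts for allocating each array with these strides.
sizes = {name: stride * batch_count for name, stride in strides.items()}

# Element (2, 1) of the second matrix (j = 1).
offset = a_offset(1, 2, 1, lda=4, strideA=strides["strideA"])
```

Larger strides are also valid, for instance to pad each matrix for alignment; the values computed here are only the lower bounds for non-overlapping storage.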

Member Function/Subroutine Documentation