rocsolver_sgesvd_batched Interface Reference

HIPFORT API Reference: hipfort_rocsolver::rocsolver_sgesvd_batched Interface Reference

GESVD_BATCHED computes the singular values and, optionally, the singular vectors of a batch of general m-by-n matrices A_j (Singular Value Decomposition).

Public Member Functions

integer(kind(rocblas_status_success)) function rocsolver_sgesvd_batched_ (handle, left_svect, right_svect, m, n, A, lda, S, strideS, U, ldu, strideU, V, ldv, strideV, E, strideE, fast_alg, myInfo, batch_count)
 
integer(kind(rocblas_status_success)) function rocsolver_sgesvd_batched_full_rank (handle, left_svect, right_svect, m, n, A, lda, S, strideS, U, ldu, strideU, V, ldv, strideV, E, strideE, fast_alg, myInfo, batch_count)
 
integer(kind(rocblas_status_success)) function rocsolver_sgesvd_batched_rank_0 (handle, left_svect, right_svect, m, n, A, lda, S, strideS, U, ldu, strideU, V, ldv, strideV, E, strideE, fast_alg, myInfo, batch_count)
 
integer(kind(rocblas_status_success)) function rocsolver_sgesvd_batched_rank_1 (handle, left_svect, right_svect, m, n, A, lda, S, strideS, U, ldu, strideU, V, ldv, strideV, E, strideE, fast_alg, myInfo, batch_count)
 

Detailed Description

GESVD_BATCHED computes the singular values and, optionally, the singular vectors of a batch of general m-by-n matrices A_j (Singular Value Decomposition).

The SVD of matrix A_j in the batch is given by:

\[ A_j = U_j S_j V_j' \]

where the m-by-n matrix \(S_j\) is zero except, possibly, for its min(m,n) diagonal elements, which are the singular values of \(A_j\). \(U_j\) and \(V_j\) are orthogonal (unitary) matrices. The first min(m,n) columns of \(U_j\) and \(V_j\) are the left and right singular vectors of \(A_j\), respectively.

The computation of the singular vectors is optional and is controlled by the function arguments left_svect and right_svect, as described below. When computed, the right singular vectors are returned transposed (or conjugate-transposed), i.e. as the rows of \(V_j'\).
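
Because the routine returns \(V_j'\) rather than \(V_j\), a host-side check of \(A_j = U_j S_j V_j'\) needs no extra transpose. The following is a minimal sketch for a single matrix, assuming U_j, S_j and V_j' have already been copied back to the host. The module name svd_reconstruct_sketch and the function reconstruct are hypothetical helpers for illustration and are not part of hipfort.

```fortran
module svd_reconstruct_sketch
  use iso_c_binding, only: c_float
  implicit none
contains
  ! Rebuild A = U * diag(S) * Vt from host copies of one SVD result.
  ! Vt holds the right singular vectors as rows (already transposed),
  ! which is the layout described above, so no transpose is applied here.
  ! Only the first k = size(S) columns of U and rows of Vt are used.
  function reconstruct(U, S, Vt) result(A)
    real(c_float), intent(in) :: U(:,:), S(:), Vt(:,:)
    real(c_float) :: A(size(U,1), size(Vt,2))
    integer :: k
    k = size(S)
    ! spread() repeats S along the rows, so column j of U is scaled by S(j).
    A = matmul(U(:,1:k) * spread(S, dim=1, ncopies=size(U,1)), Vt(1:k,:))
  end function reconstruct
end module svd_reconstruct_sketch
```

Comparing the result with a saved copy of the input is one way to validate a call, bearing in mind that A_j itself is overwritten or destroyed on exit, so the copy has to be taken beforehand.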

left_svect and right_svect are rocblas_svect enums that can take the following values:

  • rocblas_svect_all: the entire matrix \(U_j\) (or \(V_j'\)) is computed,
  • rocblas_svect_singular: only the singular vectors (first min(m,n) columns of \(U_j\) or rows of \(V_j'\)) are computed,
  • rocblas_svect_overwrite: the first columns (or rows) of \(A_j\) are overwritten with the singular vectors, or
  • rocblas_svect_none: no columns (or rows) of \(U_j\) (or \(V_j'\)) are computed, i.e. no singular vectors.

left_svect and right_svect cannot both be set to overwrite. When neither is set to overwrite, the contents of \(A_j\) are destroyed by the time the function returns.

Note
When m >> n (or n >> m), the algorithm can be sped up by compressing the matrix \(A_j\) via a QR (or LQ) factorization and then working with the triangular factor (thin-SVD). If the singular vectors are also requested, their computation can be sped up as well by executing some intermediate operations out-of-place and relying more on matrix multiplications (GEMMs); this requires, however, a larger memory workspace. The parameter fast_alg controls whether the fast algorithm is executed. For more details, see the "Tuning rocSOLVER performance" and "Memory model" sections of the documentation.
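For the tall case (m >= n), the compression described in this note rests on a standard identity. It is sketched here assuming a thin QR factorization; the symbols \(Q_j\), \(R_j\) and \(\tilde{U}_j\) are introduced only for illustration, and the source does not spell out the internal steps rocSOLVER takes.

\[ A_j = Q_j R_j, \qquad R_j = \tilde{U}_j S_j V_j' \quad \Longrightarrow \quad A_j = (Q_j \tilde{U}_j) \, S_j V_j' \]

Here \(Q_j\) is m-by-n with orthonormal columns, \(R_j\) is n-by-n upper triangular, and \(S_j\) denotes the n-by-n diagonal block holding the singular values. The decomposition is therefore carried out on an n-by-n matrix, and the left singular vectors of \(A_j\) are recovered as the product \(Q_j \tilde{U}_j\). The wide case (n >> m) is analogous, with an LQ factorization in place of QR.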
Parameters
[in] handle: rocblas_handle.
[in] left_svect: rocblas_svect.
Specifies how the left singular vectors are computed.
[in] right_svect: rocblas_svect.
Specifies how the right singular vectors are computed.
[in] m: rocblas_int. m >= 0.
The number of rows of all matrices A_j in the batch.
[in] n: rocblas_int. n >= 0.
The number of columns of all matrices A_j in the batch.
[in,out] A: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the matrices A_j. On exit, if left_svect (or right_svect) is equal to overwrite, the first columns (or rows) of A_j contain the corresponding left (or right) singular vectors; otherwise, the contents of A_j are destroyed.
[in] lda: rocblas_int. lda >= m.
The leading dimension of A_j.
[out] S: pointer to real type. Array on the GPU (the size depends on the value of strideS).
The singular values of A_j in decreasing order.
[in] strideS: rocblas_stride.
Stride from the start of one vector S_j to the next one S_(j+1). There is no restriction on the value of strideS. Normal use case is strideS >= min(m,n).
[out] U: pointer to type. Array on the GPU (the size depends on the value of strideU).
The matrices U_j of left singular vectors stored as columns. Not referenced if left_svect is set to overwrite or none.
[in] ldu: rocblas_int. ldu >= m if left_svect is all or singular; ldu >= 1 otherwise.
The leading dimension of U_j.
[in] strideU: rocblas_stride.
Stride from the start of one matrix U_j to the next one U_(j+1). There is no restriction on the value of strideU. Normal use case is strideU >= ldu*min(m,n) if left_svect is set to singular, or strideU >= ldu*m if left_svect is set to all.
[out] V: pointer to type. Array on the GPU (the size depends on the value of strideV).
The matrices V_j of right singular vectors stored as rows (transposed or conjugate-transposed). Not referenced if right_svect is set to overwrite or none.
[in] ldv: rocblas_int. ldv >= n if right_svect is all; ldv >= min(m,n) if right_svect is set to singular; or ldv >= 1 otherwise.
The leading dimension of V_j.
[in] strideV: rocblas_stride.
Stride from the start of one matrix V_j to the next one V_(j+1). There is no restriction on the value of strideV. Normal use case is strideV >= ldv*n.
[out] E: pointer to real type. Array on the GPU (the size depends on the value of strideE).
This array is used to work internally with the bidiagonal matrix B_j associated with A_j (using BDSQR). On exit, if info[j] > 0, E_j contains the unconverged off-diagonal elements of B_j (or, properly speaking, of a bidiagonal matrix orthogonally equivalent to B_j). The diagonal elements of this matrix are in S_j; those that converged correspond to a subset of the singular values of A_j (not necessarily ordered).
[in] strideE: rocblas_stride.
Stride from the start of one vector E_j to the next one E_(j+1). There is no restriction on the value of strideE. Normal use case is strideE >= min(m,n)-1.
[in] fast_alg: rocblas_workmode.
If set to rocblas_outofplace, the function will execute the fast thin-SVD version of the algorithm when possible.
[out] info: pointer to a rocblas_int on the GPU (named myInfo in the Fortran signatures).
If info[j] = 0, successful exit. If info[j] = i > 0, BDSQR did not converge: i elements of E_j did not converge to zero.
[in] batch_count: rocblas_int. batch_count >= 0.
Number of matrices in the batch.

Member Function/Subroutine Documentation

◆ rocsolver_sgesvd_batched_()

integer(kind(rocblas_status_success)) function hipfort_rocsolver::rocsolver_sgesvd_batched::rocsolver_sgesvd_batched_ ( type(c_ptr), value  handle,
integer(kind(rocblas_svect_all)), value  left_svect,
integer(kind(rocblas_svect_all)), value  right_svect,
integer(c_int), value  m,
integer(c_int), value  n,
type(c_ptr)  A,
integer(c_int), value  lda,
type(c_ptr), value  S,
integer(c_int64_t), value  strideS,
type(c_ptr), value  U,
integer(c_int), value  ldu,
integer(c_int64_t), value  strideU,
type(c_ptr), value  V,
integer(c_int), value  ldv,
integer(c_int64_t), value  strideV,
type(c_ptr), value  E,
integer(c_int64_t), value  strideE,
integer(kind(rocblas_outofplace)), value  fast_alg,
integer(c_int)  myInfo,
integer(c_int), value  batch_count 
)

◆ rocsolver_sgesvd_batched_full_rank()

integer(kind(rocblas_status_success)) function hipfort_rocsolver::rocsolver_sgesvd_batched::rocsolver_sgesvd_batched_full_rank ( type(c_ptr)  handle,
integer(kind(rocblas_svect_all))  left_svect,
integer(kind(rocblas_svect_all))  right_svect,
integer(c_int)  m,
integer(c_int)  n,
real(c_float), dimension(:,:,:), target  A,
integer(c_int)  lda,
real(c_float), dimension(:), target  S,
integer(c_int64_t)  strideS,
real(c_float), dimension(:,:), target  U,
integer(c_int)  ldu,
integer(c_int64_t)  strideU,
real(c_float), dimension(:,:), target  V,
integer(c_int)  ldv,
integer(c_int64_t)  strideV,
real(c_float), dimension(:), target  E,
integer(c_int64_t)  strideE,
integer(kind(rocblas_outofplace))  fast_alg,
integer(c_int)  myInfo,
integer(c_int)  batch_count 
)

◆ rocsolver_sgesvd_batched_rank_0()

integer(kind(rocblas_status_success)) function hipfort_rocsolver::rocsolver_sgesvd_batched::rocsolver_sgesvd_batched_rank_0 ( type(c_ptr)  handle,
integer(kind(rocblas_svect_all))  left_svect,
integer(kind(rocblas_svect_all))  right_svect,
integer(c_int)  m,
integer(c_int)  n,
real(c_float), target  A,
integer(c_int)  lda,
real(c_float), target  S,
integer(c_int64_t)  strideS,
real(c_float), target  U,
integer(c_int)  ldu,
integer(c_int64_t)  strideU,
real(c_float), target  V,
integer(c_int)  ldv,
integer(c_int64_t)  strideV,
real(c_float), target  E,
integer(c_int64_t)  strideE,
integer(kind(rocblas_outofplace))  fast_alg,
integer(c_int)  myInfo,
integer(c_int)  batch_count 
)

◆ rocsolver_sgesvd_batched_rank_1()

integer(kind(rocblas_status_success)) function hipfort_rocsolver::rocsolver_sgesvd_batched::rocsolver_sgesvd_batched_rank_1 ( type(c_ptr)  handle,
integer(kind(rocblas_svect_all))  left_svect,
integer(kind(rocblas_svect_all))  right_svect,
integer(c_int)  m,
integer(c_int)  n,
real(c_float), dimension(:), target  A,
integer(c_int)  lda,
real(c_float), dimension(:), target  S,
integer(c_int64_t)  strideS,
real(c_float), dimension(:), target  U,
integer(c_int)  ldu,
integer(c_int64_t)  strideU,
real(c_float), dimension(:), target  V,
integer(c_int)  ldv,
integer(c_int64_t)  strideV,
real(c_float), dimension(:), target  E,
integer(c_int64_t)  strideE,
integer(kind(rocblas_outofplace))  fast_alg,
integer(c_int)  myInfo,
integer(c_int)  batch_count 
)

The documentation for this interface was generated from the following file: