rocSOLVER LAPACK Auxiliary Functions#

These are functions that support more advanced LAPACK routines. The auxiliary functions are divided into the following categories:

Note

Throughout the APIs’ descriptions, we use the following notations:

  • i, j, and k are used as general purpose indices. In some legacy LAPACK APIs, k could be a parameter indicating some problem/matrix dimension.

  • Depending on the context, when it is necessary to index rows, columns and blocks or submatrices, i is assigned to rows, j to columns and k to blocks. l is always used to index matrices/problems in a batch.

  • x[i] stands for the i-th element of vector x, while A[i,j] represents the element in the i-th row and j-th column of matrix A. Indices are 1-based, i.e. x[1] is the first element of x.

  • To identify a block in a matrix or a matrix in the batch, k and l are used as sub-indices

  • x_i \(=x_i\); we sometimes use both notations, \(x_i\) when displaying mathematical equations, and x_i in the text describing the function parameters.

  • If X is a real vector or matrix, \(X^T\) indicates its transpose; if X is complex, then \(X^H\) represents its conjugate transpose. When X could be real or complex, we use X’ to indicate X transposed or X conjugate transposed, accordingly.

  • When a matrix A is formed as the product of several matrices, the following notation is used: A=M(1)M(2)…M(t).

Vector and Matrix manipulations#

rocsolver_<type>lacgv()#

rocblas_status rocsolver_zlacgv(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *x, const rocblas_int incx)#
rocblas_status rocsolver_clacgv(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *x, const rocblas_int incx)#

LACGV conjugates the complex vector x.

It conjugates the n entries of a complex vector x with increment incx.

Parameters:
  • handle[in] rocblas_handle.

  • n[in] rocblas_int. n >= 0. The dimension of vector x.

  • x[inout] pointer to type. Array on the GPU of size at least n (size depends on the value of incx). On entry, the vector x. On exit, each entry is overwritten with its conjugate value.

  • incx[in] rocblas_int. incx != 0. The distance between two consecutive elements of x. If incx is negative, the elements of x are indexed in reverse order.

rocsolver_<type>laswp()#

rocblas_status rocsolver_zlaswp(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)#
rocblas_status rocsolver_claswp(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)#
rocblas_status rocsolver_dlaswp(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)#
rocblas_status rocsolver_slaswp(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)#

LASWP performs a series of row interchanges on the matrix A.

Row interchanges are done one by one. If \(\text{ipiv}[k_1 + (j - k_1) \cdot \text{abs}(\text{incx})] = r\), then the j-th row of A will be interchanged with the r-th row of A, for \(j = k_1,k_1+1,\dots,k_2\). Indices \(k_1\) and \(k_2\) are 1-based indices.

Parameters:
  • handle[in] rocblas_handle.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix to which the row interchanges will be applied. On exit, the resulting permuted matrix.

  • lda[in] rocblas_int. lda > 0. The leading dimension of the array A.

  • k1[in] rocblas_int. k1 > 0. The k_1 index. It is the first element of ipiv for which a row interchange will be done. This is a 1-based index.

  • k2[in] rocblas_int. k2 > k1 > 0. The k_2 index. k_2 - k_1 + 1 is the number of elements of ipiv for which a row interchange will be done. This is a 1-based index.

  • ipiv[in] pointer to rocblas_int. Array on the GPU of dimension at least \(k_1 + (k_2 - k_1)\cdot \text{abs}(\text{incx})\). The vector of pivot indices. Only the elements in positions \(k_1\) through \(k_1 + (k_2 - k_1)\cdot \text{abs}(\text{incx})\) of this vector are accessed. Elements of ipiv are considered 1-based.

  • incx[in] rocblas_int. incx != 0. The distance between successive values of ipiv. If incx is negative, the pivots are applied in reverse order.

rocsolver_<type>lauum()#

rocblas_status rocsolver_zlauum(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda)#
rocblas_status rocsolver_clauum(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda)#
rocblas_status rocsolver_dlauum(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda)#
rocblas_status rocsolver_slauum(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda)#

LAUUM computes the product of the upper (or lower) triangular part U (or L) of a symmetric/Hemitian matrix A with its transpose.

If uplo indicates upper, then \(UU'\) is computed. If uplo indicates lower, then \(L'L\) is computed instead.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower triangular part of A will be used. If uplo indicates lower (or upper), then the upper (or lower) part of A is not referenced.

  • n[in] rocblas_int. n >= 0. The number of columns and rows of the matrix A.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, it contains the upper (or lower) part of the symmetric/Hermitian matrix. On exit, the upper (or lower) part is overwritten with the result of U*U’ (or L’*L).

  • lda[in] rocblas_int. lda >= n. The leading dimension of the array A.

Householder reflections#

rocsolver_<type>larfg()#

rocblas_status rocsolver_zlarfg(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *alpha, rocblas_double_complex *x, const rocblas_int incx, rocblas_double_complex *tau)#
rocblas_status rocsolver_clarfg(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *alpha, rocblas_float_complex *x, const rocblas_int incx, rocblas_float_complex *tau)#
rocblas_status rocsolver_dlarfg(rocblas_handle handle, const rocblas_int n, double *alpha, double *x, const rocblas_int incx, double *tau)#
rocblas_status rocsolver_slarfg(rocblas_handle handle, const rocblas_int n, float *alpha, float *x, const rocblas_int incx, float *tau)#

LARFG generates a Householder reflector H of order n.

The reflector H is such that

\[\begin{split} H'\left[\begin{array}{c} \text{alpha}\\ x \end{array}\right]=\left[\begin{array}{c} \text{beta}\\ 0 \end{array}\right] \end{split}\]

where x is an n-1 vector, and alpha and beta are scalars. Matrix H can be generated as

\[\begin{split} H = I - \text{tau}\left[\begin{array}{c} 1\\ v \end{array}\right]\left[\begin{array}{cc} 1 & v' \end{array}\right] \end{split}\]

where v is an n-1 vector, and tau is a scalar known as the Householder scalar. The vector

\[\begin{split} \bar{v}=\left[\begin{array}{c} 1\\ v \end{array}\right] \end{split}\]

is the Householder vector associated with the reflection.

Note

The matrix H is orthogonal/unitary (i.e. \(H'H=HH'=I\)). It is symmetric when real (i.e. \(H^T=H\)), but not Hermitian when complex (i.e. \(H^H\neq H\) in general).

Parameters:
  • handle[in] rocblas_handle.

  • n[in] rocblas_int. n >= 0. The order (size) of reflector H.

  • alpha[inout] pointer to type. A scalar on the GPU. On entry, the scalar alpha. On exit, it is overwritten with beta.

  • x[inout] pointer to type. Array on the GPU of size at least n-1 (size depends on the value of incx). On entry, the vector x, On exit, it is overwritten with vector v.

  • incx[in] rocblas_int. incx > 0. The distance between two consecutive elements of x.

  • tau[out] pointer to type. A scalar on the GPU. The Householder scalar tau.

rocsolver_<type>larft()#

rocblas_status rocsolver_zlarft(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, rocblas_double_complex *V, const rocblas_int ldv, rocblas_double_complex *tau, rocblas_double_complex *T, const rocblas_int ldt)#
rocblas_status rocsolver_clarft(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, rocblas_float_complex *V, const rocblas_int ldv, rocblas_float_complex *tau, rocblas_float_complex *T, const rocblas_int ldt)#
rocblas_status rocsolver_dlarft(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, double *V, const rocblas_int ldv, double *tau, double *T, const rocblas_int ldt)#
rocblas_status rocsolver_slarft(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, float *V, const rocblas_int ldv, float *tau, float *T, const rocblas_int ldt)#

LARFT generates the triangular factor T of a block reflector H of order n.

The block reflector H is defined as the product of k Householder matrices

\[\begin{split} \begin{array}{cl} H = H(1)H(2)\cdots H(k) & \: \text{if direct indicates forward direction, or} \\ H = H(k)\cdots H(2)H(1) & \: \text{if direct indicates backward direction} \end{array} \end{split}\]

The triangular factor T is upper triangular in the forward direction and lower triangular in the backward direction. If storev is column-wise, then

\[ H = I - VTV' \]

where the \(j\)th column of matrix V contains the Householder vector associated with \(H(j)\). If storev is row-wise, then

\[ H = I - V'TV \]

where the \(i\)th row of matrix V contains the Householder vector associated with \(H(i)\).

Parameters:
  • handle[in] rocblas_handle.

  • direct[in] rocblas_direct. Specifies the direction in which the Householder matrices are applied.

  • storev[in] rocblas_storev. Specifies how the Householder vectors are stored in matrix V.

  • n[in] rocblas_int. n >= 0. The order (size) of the block reflector.

  • k[in] rocblas_int. k >= 1. The number of Householder matrices forming H.

  • V[in] pointer to type. Array on the GPU of size ldv*k if column-wise, or ldv*n if row-wise. The matrix of Householder vectors.

  • ldv[in] rocblas_int. ldv >= n if column-wise, or ldv >= k if row-wise. Leading dimension of V.

  • tau[in] pointer to type. Array of k scalars on the GPU. The vector of all the Householder scalars.

  • T[out] pointer to type. Array on the GPU of dimension ldt*k. The triangular factor. T is upper triangular if direct indicates forward direction, otherwise it is lower triangular. The rest of the array is not used.

  • ldt[in] rocblas_int. ldt >= k. The leading dimension of T.

rocsolver_<type>larf()#

rocblas_status rocsolver_zlarf(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, rocblas_double_complex *x, const rocblas_int incx, const rocblas_double_complex *alpha, rocblas_double_complex *A, const rocblas_int lda)#
rocblas_status rocsolver_clarf(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, rocblas_float_complex *x, const rocblas_int incx, const rocblas_float_complex *alpha, rocblas_float_complex *A, const rocblas_int lda)#
rocblas_status rocsolver_dlarf(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, double *x, const rocblas_int incx, const double *alpha, double *A, const rocblas_int lda)#
rocblas_status rocsolver_slarf(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, float *x, const rocblas_int incx, const float *alpha, float *A, const rocblas_int lda)#

LARF applies a Householder reflector H to a general matrix A.

The Householder reflector H, of order m or n, is to be applied to an m-by-n matrix A from the left or the right, depending on the value of side. H is given by

\[ H = I - \text{alpha}\cdot xx' \]

where alpha is the Householder scalar and x is a Householder vector. H is never actually computed.

Parameters:
  • handle[in] rocblas_handle.

  • side[in] rocblas_side. Determines whether H is applied from the left or the right.

  • m[in] rocblas_int. m >= 0. Number of rows of A.

  • n[in] rocblas_int. n >= 0. Number of columns of A.

  • x[in] pointer to type. Array on the GPU of size at least 1 + (m-1)*abs(incx) if left side, or at least 1 + (n-1)*abs(incx) if right side. The Householder vector x.

  • incx[in] rocblas_int. incx != 0. Distance between two consecutive elements of x. If incx < 0, the elements of x are indexed in reverse order.

  • alpha[in] pointer to type. A scalar on the GPU. The Householder scalar. If alpha = 0, then H = I (A will remain the same; x is never used)

  • A[inout] pointer to type. Array on the GPU of size lda*n. On entry, the matrix A. On exit, it is overwritten with H*A (or A*H).

  • lda[in] rocblas_int. lda >= m. Leading dimension of A.

rocsolver_<type>larfb()#

rocblas_status rocsolver_zlarfb(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *V, const rocblas_int ldv, rocblas_double_complex *T, const rocblas_int ldt, rocblas_double_complex *A, const rocblas_int lda)#
rocblas_status rocsolver_clarfb(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *V, const rocblas_int ldv, rocblas_float_complex *T, const rocblas_int ldt, rocblas_float_complex *A, const rocblas_int lda)#
rocblas_status rocsolver_dlarfb(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *V, const rocblas_int ldv, double *T, const rocblas_int ldt, double *A, const rocblas_int lda)#
rocblas_status rocsolver_slarfb(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *V, const rocblas_int ldv, float *T, const rocblas_int ldt, float *A, const rocblas_int lda)#

LARFB applies a block reflector H to a general m-by-n matrix A.

The block reflector H is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} HA & \: \text{(No transpose from the left),}\\ H'A & \: \text{(Transpose or conjugate transpose from the left),}\\ AH & \: \text{(No transpose from the right), or}\\ AH' & \: \text{(Transpose or conjugate transpose from the right).} \end{array} \end{split}\]

The block reflector H is defined as the product of k Householder matrices as

\[\begin{split} \begin{array}{cl} H = H(1)H(2)\cdots H(k) & \: \text{if direct indicates forward direction, or} \\ H = H(k)\cdots H(2)H(1) & \: \text{if direct indicates backward direction} \end{array} \end{split}\]

H is never stored. It is calculated as

\[ H = I - VTV' \]

where the \(j\)th column of matrix V contains the Householder vector associated with \(H(j)\), if storev is column-wise; or

\[ H = I - V'TV \]

where the \(i\)th row of matrix V contains the Householder vector associated with \(H(i)\), if storev is row-wise. T is the associated triangular factor as computed by LARFT.

Parameters:
  • handle[in] rocblas_handle.

  • side[in] rocblas_side. Specifies from which side to apply H.

  • trans[in] rocblas_operation. Specifies whether the block reflector or its transpose/conjugate transpose is to be applied.

  • direct[in] rocblas_direct. Specifies the direction in which the Householder matrices are to be applied to generate H.

  • storev[in] rocblas_storev. Specifies how the Householder vectors are stored in matrix V.

  • m[in] rocblas_int. m >= 0. Number of rows of matrix A.

  • n[in] rocblas_int. n >= 0. Number of columns of matrix A.

  • k[in] rocblas_int. k >= 1. The number of Householder matrices.

  • V[in] pointer to type. Array on the GPU of size ldv*k if column-wise, ldv*n if row-wise and applying from the right, or ldv*m if row-wise and applying from the left. The matrix of Householder vectors.

  • ldv[in] rocblas_int. ldv >= k if row-wise, ldv >= m if column-wise and applying from the left, or ldv >= n if column-wise and applying from the right. Leading dimension of V.

  • T[in] pointer to type. Array on the GPU of dimension ldt*k. The triangular factor of the block reflector.

  • ldt[in] rocblas_int. ldt >= k. The leading dimension of T.

  • A[inout] pointer to type. Array on the GPU of size lda*n. On entry, the matrix A. On exit, it is overwritten with H*A, A*H, H’*A, or A*H’.

  • lda[in] rocblas_int. lda >= m. Leading dimension of A.

Bidiagonal forms#

rocsolver_<type>labrd()#

rocblas_status rocsolver_zlabrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tauq, rocblas_double_complex *taup, rocblas_double_complex *X, const rocblas_int ldx, rocblas_double_complex *Y, const rocblas_int ldy)#
rocblas_status rocsolver_clabrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tauq, rocblas_float_complex *taup, rocblas_float_complex *X, const rocblas_int ldx, rocblas_float_complex *Y, const rocblas_int ldy)#
rocblas_status rocsolver_dlabrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *D, double *E, double *tauq, double *taup, double *X, const rocblas_int ldx, double *Y, const rocblas_int ldy)#
rocblas_status rocsolver_slabrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *D, float *E, float *tauq, float *taup, float *X, const rocblas_int ldx, float *Y, const rocblas_int ldy)#

LABRD computes the bidiagonal form of the first k rows and columns of a general m-by-n matrix A, as well as the matrices X and Y needed to reduce the remaining part of A.

The reduced form is given by:

\[ B = Q'AP \]

where the leading k-by-k block of B is upper bidiagonal if m >= n, or lower bidiagonal if m < n. Q and P are orthogonal/unitary matrices represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q = H(1)H(2)\cdots H(k), & \text{and} \\ P = G(1)G(2)\cdots G(k). \end{array} \end{split}\]

Each Householder matrix \(H(i)\) and \(G(i)\) is given by

\[\begin{split} \begin{array}{cl} H(i) = I - \text{tauq}[i]\cdot v_i^{}v_i', & \text{and} \\ G(i) = I - \text{taup}[i]\cdot u_i^{}u_i'. \end{array} \end{split}\]

If m >= n, the first \(i-1\) elements of the Householder vector \(v_i\) are zero, and \(v_i[i]=1\); while the first \(i\) elements of the Householder vector \(u_i\) are zero, and \(u_i[i+1]=1\). If m < n, the first \(i\) elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1]=1\); while the first \(i-1\) elements of the Householder vector \(u_i\) are zero, and \(u_i[i]=1\).

The unreduced part of the matrix A can be updated using the block update

\[ A = A - VY' - XU' \]

where V and U are the m-by-k and n-by-k matrices formed with the vectors \(v_i\) and \(u_i\), respectively.

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix A.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix A.

  • k[in] rocblas_int. min(m,n) >= k >= 0. The number of leading rows and columns of matrix A that will be reduced.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the m-by-n matrix to be reduced. On exit, the first k elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n), contain the bidiagonal form B. If m >= n, the elements below the diagonal of the first k columns are the possibly non-zero elements of the Householder vectors associated with Q, while the elements above the superdiagonal of the first k rows are the n - i - 1 possibly non-zero elements of the Householder vectors related to P. If m < n, the elements below the subdiagonal of the first k columns are the m - i - 1 possibly non-zero elements of the Householder vectors related to Q, while the elements above the diagonal of the first k rows are the n - i possibly non-zero elements of the vectors associated with P.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • D[out] pointer to real type. Array on the GPU of dimension k. The diagonal elements of B.

  • E[out] pointer to real type. Array on the GPU of dimension k. The off-diagonal elements of B.

  • tauq[out] pointer to type. Array on the GPU of dimension k. The Householder scalars associated with matrix Q.

  • taup[out] pointer to type. Array on the GPU of dimension k. The Householder scalars associated with matrix P.

  • X[out] pointer to type. Array on the GPU of dimension ldx*k. The m-by-k matrix needed to update the unreduced part of A.

  • ldx[in] rocblas_int. ldx >= m. The leading dimension of X.

  • Y[out] pointer to type. Array on the GPU of dimension ldy*k. The n-by-k matrix needed to update the unreduced part of A.

  • ldy[in] rocblas_int. ldy >= n. The leading dimension of Y.

rocsolver_<type>bdsqr()#

rocblas_status rocsolver_zbdsqr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nv, const rocblas_int nu, const rocblas_int nc, double *D, double *E, rocblas_double_complex *V, const rocblas_int ldv, rocblas_double_complex *U, const rocblas_int ldu, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)#
rocblas_status rocsolver_cbdsqr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nv, const rocblas_int nu, const rocblas_int nc, float *D, float *E, rocblas_float_complex *V, const rocblas_int ldv, rocblas_float_complex *U, const rocblas_int ldu, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)#
rocblas_status rocsolver_dbdsqr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nv, const rocblas_int nu, const rocblas_int nc, double *D, double *E, double *V, const rocblas_int ldv, double *U, const rocblas_int ldu, double *C, const rocblas_int ldc, rocblas_int *info)#
rocblas_status rocsolver_sbdsqr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nv, const rocblas_int nu, const rocblas_int nc, float *D, float *E, float *V, const rocblas_int ldv, float *U, const rocblas_int ldu, float *C, const rocblas_int ldc, rocblas_int *info)#

BDSQR computes the singular value decomposition (SVD) of an n-by-n bidiagonal matrix B, using the implicit QR algorithm.

The SVD of B has the form:

\[ B = QSP' \]

where S is the n-by-n diagonal matrix of singular values of B, the columns of Q are the left singular vectors of B, and the columns of P are its right singular vectors.

The computation of the singular vectors is optional; this function accepts input matrices U (of size nu-by-n) and V (of size n-by-nv) that are overwritten with \(UQ\) and \(P'V\). If nu = 0 no left vectors are computed; if nv = 0 no right vectors are computed.

Optionally, this function can also compute \(Q'C\) for a given n-by-nc input matrix C.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether B is upper or lower bidiagonal.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of matrix B.

  • nv[in] rocblas_int. nv >= 0. The number of columns of matrix V.

  • nu[in] rocblas_int. nu >= 0. The number of rows of matrix U.

  • nc[in] rocblas_int. nu >= 0. The number of columns of matrix C.

  • D[inout] pointer to real type. Array on the GPU of dimension n. On entry, the diagonal elements of B. On exit, if info = 0, the singular values of B in decreasing order; if info > 0, the diagonal elements of a bidiagonal matrix orthogonally equivalent to B.

  • E[inout] pointer to real type. Array on the GPU of dimension n-1. On entry, the off-diagonal elements of B. On exit, if info > 0, the off-diagonal elements of a bidiagonal matrix orthogonally equivalent to B (if info = 0 this matrix converges to zero).

  • V[inout] pointer to type. Array on the GPU of dimension ldv*nv. On entry, the matrix V. On exit, it is overwritten with P’*V. (Not referenced if nv = 0).

  • ldv[in] rocblas_int. ldv >= n if nv > 0, or ldv >=1 if nv = 0. The leading dimension of V.

  • U[inout] pointer to type. Array on the GPU of dimension ldu*n. On entry, the matrix U. On exit, it is overwritten with U*Q. (Not referenced if nu = 0).

  • ldu[in] rocblas_int. ldu >= nu. The leading dimension of U.

  • C[inout] pointer to type. Array on the GPU of dimension ldc*nc. On entry, the matrix C. On exit, it is overwritten with Q’*C. (Not referenced if nc = 0).

  • ldc[in] rocblas_int. ldc >= n if nc > 0, or ldc >=1 if nc = 0. The leading dimension of C.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, i elements of E have not converged to zero.

rocsolver_<type>bdsvdx()#

rocblas_status rocsolver_dbdsvdx(rocblas_handle handle, const rocblas_fill uplo, const rocblas_svect svect, const rocblas_srange srange, const rocblas_int n, double *D, double *E, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, rocblas_int *nsv, double *S, double *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)#
rocblas_status rocsolver_sbdsvdx(rocblas_handle handle, const rocblas_fill uplo, const rocblas_svect svect, const rocblas_srange srange, const rocblas_int n, float *D, float *E, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, rocblas_int *nsv, float *S, float *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)#

BDSVDX computes a set of singular values of a bidiagonal matrix B.

This function computes all the singular values of B, all the singular values in the half-open interval \([vl, vu)\), or the il-th through iu-th singular values, depending on the value of srange.

Depending on the value of svect, the corresponding singular vectors will be computed and stored as blocks in the output matrix Z. That is,

\[\begin{split} Z = \left[\begin{array}{c} U\\ V \end{array}\right] \end{split}\]

where U contains the corresponding left singular vectors of B, and V contains the corresponding right singular vectors.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether B is upper or lower bidiagonal.

  • svect[in] rocblas_svect. Specifies how the singular vectors are computed. Only rocblas_svect_none and rocblas_svect_singular are accepted.

  • srange[in] rocblas_srange. Specifies the type of range or interval of the singular values to be computed.

  • n[in] rocblas_int. n >= 0. The order of the bidiagonal matrix B.

  • D[in] pointer to real type. Array on the GPU of dimension n. The diagonal elements of the bidiagonal matrix.

  • E[in] pointer to real type. Array on the GPU of dimension n-1. The off-diagonal elements of the bidiagonal matrix.

  • vl[in] real type. 0 <= vl < vu. The lower bound of the search interval [vl, vu). Ignored if srange indicates to look for all the singular values of B or the singular values within a set of indices.

  • vu[in] real type. 0 <= vl < vu. The upper bound of the search interval [vl, vu). Ignored if srange indicates to look for all the singular values of B or the singular values within a set of indices.

  • il[in] rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise. The index of the largest singular value to be computed. Ignored if srange indicates to look for all the singular values of B or the singular values in a half-open interval.

  • iu[in] rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise. The index of the smallest singular value to be computed. Ignored if srange indicates to look for all the singular values of B or the singular values in a half-open interval.

  • nsv[out] pointer to a rocblas_int on the GPU. The total number of singular values found. If srange is rocblas_srange_all, nsv = n. If srange is rocblas_srange_index, nsv = iu - il + 1. Otherwise, 0 <= nsv <= n.

  • S[out] pointer to real type. Array on the GPU of dimension nsv. The first nsv elements contain the computed singular values in descending order. Note: If srange is rocblas_srange_value, then the value of nsv is not known in advance. In this case, the user should ensure that S is large enough to hold n values.

  • Z[out] pointer to real type. Array on the GPU of dimension ldz*nsv. If info = 0, the first nsv columns contain the computed singular vectors corresponding to the singular values in S. The first n rows of Z contain the matrix U, and the next n rows contain the matrix V. Not referenced if svect is rocblas_svect_none. Note: If srange is rocblas_srange_value, then the value of nsv is not known in advance. In this case, the user should ensure that Z is large enough to hold n columns.

  • ldz[in] rocblas_int. ldz >= 2*n if svect is rocblas_svect_singular; ldz >= 1 otherwise. Specifies the leading dimension of Z.

  • ifail[out] pointer to rocblas_int. Array on the GPU of dimension n. If info = 0, the first nsv elements of ifail are zero. Otherwise, contains the indices of those eigenvectors that failed to converge, as returned by STEIN. Not referenced if svect is rocblas_svect_none.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, i eigenvectors did not converge in STEIN; their indices are stored in ifail.

Tridiagonal forms#

rocsolver_<type>latrd()#

rocblas_status rocsolver_zlatrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, double *E, rocblas_double_complex *tau, rocblas_double_complex *W, const rocblas_int ldw)#
rocblas_status rocsolver_clatrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, float *E, rocblas_float_complex *tau, rocblas_float_complex *W, const rocblas_int ldw)#
rocblas_status rocsolver_dlatrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *E, double *tau, double *W, const rocblas_int ldw)#
rocblas_status rocsolver_slatrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *E, float *tau, float *W, const rocblas_int ldw)#

LATRD computes the tridiagonal form of k rows and columns of a symmetric/hermitian matrix A, as well as the matrix W needed to update the remaining part of A.

The reduced form is given by:

\[ T = Q'AQ \]

If uplo is lower, the first k rows and columns of T form the tridiagonal block. If uplo is upper, then the last k rows and columns of T form the tridiagonal block. Q is an orthogonal/unitary matrix represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q = H(1)H(2)\cdots H(k) & \text{if uplo indicates lower, or}\\ Q = H(n)H(n-1)\cdots H(n-k+1) & \text{if uplo is upper}. \end{array} \end{split}\]

Each Householder matrix \(H(i)\) is given by

\[ H(i) = I - \text{tau}[i]\cdot v_i^{}v_i' \]

where tau[ \(i\)] is the corresponding Householder scalar. When uplo indicates lower, the first \(i\) elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\). If uplo is upper, the last n- \(i\) elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

The unreduced part of the matrix A can be updated using a rank update of the form:

\[ A = A - VW' - WV' \]

where V is the n-by-k matrix formed by the vectors \(v_i\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrix A.

  • k[in] rocblas_int. 0 <= k <= n. The number of rows and columns of the matrix A to be reduced.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the n-by-n matrix to be reduced. On exit, if uplo is lower, the first k columns have been reduced to tridiagonal form (given in the diagonal elements of A and the array E), the elements below the diagonal contain the possibly non-zero entries of the Householder vectors associated with Q, stored as columns. If uplo is upper, the last k columns have been reduced to tridiagonal form (given in the diagonal elements of A and the array E), the elements above the diagonal contain the possibly non-zero entries of the Householder vectors associated with Q, stored as columns.

  • lda[in] rocblas_int. lda >= n. The leading dimension of A.

  • E[out] pointer to real type. Array on the GPU of dimension n-1. If upper (lower), the last (first) k elements of E are the off-diagonal elements of the computed tridiagonal block.

  • tau[out] pointer to type. Array on the GPU of dimension n-1. If upper (lower), the last (first) k elements of tau are the Householder scalars related to Q.

  • W[out] pointer to type. Array on the GPU of dimension ldw*k. The n-by-k matrix needed to update the unreduced part of A.

  • ldw[in] rocblas_int. ldw >= n. The leading dimension of W.

rocsolver_<type>sterf()#

rocblas_status rocsolver_dsterf(rocblas_handle handle, const rocblas_int n, double *D, double *E, rocblas_int *info)#
rocblas_status rocsolver_ssterf(rocblas_handle handle, const rocblas_int n, float *D, float *E, rocblas_int *info)#

STERF computes the eigenvalues of a symmetric tridiagonal matrix.

The eigenvalues of the symmetric tridiagonal matrix are computed by the Pal-Walker-Kahan variant of the QL/QR algorithm, and returned in increasing order.

The matrix is not represented explicitly, but rather as the array of diagonal elements D and the array of symmetric off-diagonal elements E.

Parameters:
  • handle[in] rocblas_handle.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the tridiagonal matrix.

  • D[inout] pointer to real type. Array on the GPU of dimension n. On entry, the diagonal elements of the tridiagonal matrix. On exit, if info = 0, the eigenvalues in increasing order. If info > 0, the diagonal elements of a tridiagonal matrix that is similar to the original matrix (i.e. has the same eigenvalues).

  • E[inout] pointer to real type. Array on the GPU of dimension n-1. On entry, the off-diagonal elements of the tridiagonal matrix. On exit, if info = 0, this array converges to zero. If info > 0, the off-diagonal elements of a tridiagonal matrix that is similar to the original matrix (i.e. has the same eigenvalues).

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, STERF did not converge. i elements of E did not converge to zero.

rocsolver_<type>stebz()#

rocblas_status rocsolver_dstebz(rocblas_handle handle, const rocblas_erange erange, const rocblas_eorder eorder, const rocblas_int n, const double vl, const double vu, const rocblas_int il, const rocblas_int iu, const double abstol, double *D, double *E, rocblas_int *nev, rocblas_int *nsplit, double *W, rocblas_int *iblock, rocblas_int *isplit, rocblas_int *info)#
rocblas_status rocsolver_sstebz(rocblas_handle handle, const rocblas_erange erange, const rocblas_eorder eorder, const rocblas_int n, const float vl, const float vu, const rocblas_int il, const rocblas_int iu, const float abstol, float *D, float *E, rocblas_int *nev, rocblas_int *nsplit, float *W, rocblas_int *iblock, rocblas_int *isplit, rocblas_int *info)#

STEBZ computes a set of eigenvalues of a symmetric tridiagonal matrix T.

This function computes all the eigenvalues of T, all the eigenvalues in the half-open interval (vl, vu], or the il-th through iu-th eigenvalues, depending on the value of erange.

The eigenvalues are returned in increasing order either for the entire matrix, or grouped by independent diagonal blocks (if they exist), depending on the value of eorder.

Parameters:
  • handle[in] rocblas_handle.

  • erange[in] rocblas_erange. Specifies the type of range or interval of the eigenvalues to be computed.

  • eorder[in] rocblas_eorder. Specifies whether the computed eigenvalues will be ordered by their position in the entire spectrum, or grouped by independent diagonal (split off) blocks.

  • n[in] rocblas_int. n >= 0. The order of the tridiagonal matrix T.

  • vl[in] real type. vl < vu. The lower bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of T or the eigenvalues within a set of indices.

  • vu[in] real type. vl < vu. The upper bound of the search interval (vl, vu]. Ignored if erange indicates to look for all the eigenvalues of T or the eigenvalues within a set of indices.

  • il[in] rocblas_int. il = 1 if n = 0; 1 <= il <= iu otherwise. The index of the smallest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of T or the eigenvalues in a half-open interval.

  • iu[in] rocblas_int. iu = 0 if n = 0; 1 <= il <= iu otherwise. The index of the largest eigenvalue to be computed. Ignored if erange indicates to look for all the eigenvalues of T or the eigenvalues in a half-open interval.

  • abstol[in] real type. The absolute tolerance. An eigenvalue is considered to be located if it lies in an interval whose width is <= abstol. If abstol is negative, then machine-epsilon times the 1-norm of the tridiagonal form of A will be used as tolerance. If abstol=0, then the tolerance will be set to twice the underflow threshold; this is the tolerance that could get the most accurate results.

  • D[in] pointer to real type. Array on the GPU of dimension n. The diagonal elements of the tridiagonal matrix.

  • E[in] pointer to real type. Array on the GPU of dimension n-1. The off-diagonal elements of the tridiagonal matrix.

  • nev[out] pointer to a rocblas_int on the GPU. The total number of eigenvalues found.

  • nsplit[out] pointer to a rocblas_int on the GPU. The number of split off blocks in the matrix.

  • W[out] pointer to real type. Array on the GPU of dimension n. The first nev elements contain the computed eigenvalues. (The remaining elements can be used as workspace for internal computations).

  • iblock[out] pointer to rocblas_int. Array on the GPU of dimension n. The block indices corresponding to each eigenvalue. When matrix T has split off blocks (nsplit > 1), then if iblock[i] = k, the eigenvalue W[i] belongs to the k-th diagonal block from the top.

  • isplit[out] pointer to rocblas_int. Array on the GPU of dimension n. The splitting indices that divide the tridiagonal matrix into diagonal blocks. The k-th block stretches from the end of the (k-1)-th block (or the top left corner of the tridiagonal matrix, in the case of the 1st block) to the isplit[k]-th row/column.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = 1, the bisection did not converge for some eigenvalues, i.e. the returned values are not as accurate as the given tolerance. The non-converged eigenvalues are flagged by negative entries in iblock.

rocsolver_<type>steqr()#

rocblas_status rocsolver_zsteqr(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, double *D, double *E, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)#
rocblas_status rocsolver_csteqr(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, float *D, float *E, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)#
rocblas_status rocsolver_dsteqr(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, double *D, double *E, double *C, const rocblas_int ldc, rocblas_int *info)#
rocblas_status rocsolver_ssteqr(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, float *D, float *E, float *C, const rocblas_int ldc, rocblas_int *info)#

STEQR computes the eigenvalues and (optionally) eigenvectors of a symmetric tridiagonal matrix.

The eigenvalues of the symmetric tridiagonal matrix are computed by the implicit QL/QR algorithm, and returned in increasing order.

The matrix is not represented explicitly, but rather as the array of diagonal elements D and the array of symmetric off-diagonal elements E. When D and E correspond to the tridiagonal form of a full symmetric/Hermitian matrix, as returned by, e.g., SYTRD or HETRD, the eigenvectors of the original matrix can also be computed, depending on the value of evect.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies how the eigenvectors are computed.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the tridiagonal matrix.

  • D[inout] pointer to real type. Array on the GPU of dimension n. On entry, the diagonal elements of the tridiagonal matrix. On exit, if info = 0, the eigenvalues in increasing order. If info > 0, the diagonal elements of a tridiagonal matrix that is similar to the original matrix (i.e. has the same eigenvalues).

  • E[inout] pointer to real type. Array on the GPU of dimension n-1. On entry, the off-diagonal elements of the tridiagonal matrix. On exit, if info = 0, this array converges to zero. If info > 0, the off-diagonal elements of a tridiagonal matrix that is similar to the original matrix (i.e. has the same eigenvalues).

  • C[inout] pointer to type. Array on the GPU of dimension ldc*n. On entry, if evect is original, the orthogonal/unitary matrix used for the reduction to tridiagonal form as returned by, e.g., ORGTR or UNGTR. On exit, it is overwritten with the eigenvectors of the original symmetric/Hermitian matrix (if evect is original), or the eigenvectors of the tridiagonal matrix (if evect is tridiagonal). (Not referenced if evect is none).

  • ldc[in] rocblas_int. ldc >= n if evect is original or tridiagonal. Specifies the leading dimension of C. (Not referenced if evect is none).

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, STEQR did not converge. i elements of E did not converge to zero.

rocsolver_<type>stedc()#

rocblas_status rocsolver_zstedc(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, double *D, double *E, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)#
rocblas_status rocsolver_cstedc(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, float *D, float *E, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)#
rocblas_status rocsolver_dstedc(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, double *D, double *E, double *C, const rocblas_int ldc, rocblas_int *info)#
rocblas_status rocsolver_sstedc(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, float *D, float *E, float *C, const rocblas_int ldc, rocblas_int *info)#

STEDC computes the eigenvalues and (optionally) eigenvectors of a symmetric tridiagonal matrix.

This function uses the divide and conquer method to compute the eigenvectors. The eigenvalues are returned in increasing order.

The matrix is not represented explicitly, but rather as the array of diagonal elements D and the array of symmetric off-diagonal elements E. When D and E correspond to the tridiagonal form of a full symmetric/Hermitian matrix, as returned by, e.g., SYTRD or HETRD, the eigenvectors of the original matrix can also be computed, depending on the value of evect.

Parameters:
  • handle[in] rocblas_handle.

  • evect[in] rocblas_evect. Specifies how the eigenvectors are computed.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the tridiagonal matrix.

  • D[inout] pointer to real type. Array on the GPU of dimension n. On entry, the diagonal elements of the tridiagonal matrix. On exit, if info = 0, the eigenvalues in increasing order.

  • E[inout] pointer to real type. Array on the GPU of dimension n-1. On entry, the off-diagonal elements of the tridiagonal matrix. On exit, if info = 0, the values of this array are destroyed.

  • C[inout] pointer to type. Array on the GPU of dimension ldc*n. On entry, if evect is original, the orthogonal/unitary matrix used for the reduction to tridiagonal form as returned by, e.g., ORGTR or UNGTR. On exit, if info = 0, it is overwritten with the eigenvectors of the original symmetric/Hermitian matrix (if evect is original), or the eigenvectors of the tridiagonal matrix (if evect is tridiagonal). (Not referenced if evect is none).

  • ldc[in] rocblas_int. ldc >= n if evect is original or tridiagonal. Specifies the leading dimension of C. (Not referenced if evect is none).

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, STEDC failed to compute an eigenvalue on the sub-matrix formed by the rows and columns info/(n+1) through mod(info,n+1).

rocsolver_<type>stein()#

rocblas_status rocsolver_zstein(rocblas_handle handle, const rocblas_int n, double *D, double *E, rocblas_int *nev, double *W, rocblas_int *iblock, rocblas_int *isplit, rocblas_double_complex *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)#
rocblas_status rocsolver_cstein(rocblas_handle handle, const rocblas_int n, float *D, float *E, rocblas_int *nev, float *W, rocblas_int *iblock, rocblas_int *isplit, rocblas_float_complex *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)#
rocblas_status rocsolver_dstein(rocblas_handle handle, const rocblas_int n, double *D, double *E, rocblas_int *nev, double *W, rocblas_int *iblock, rocblas_int *isplit, double *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)#
rocblas_status rocsolver_sstein(rocblas_handle handle, const rocblas_int n, float *D, float *E, rocblas_int *nev, float *W, rocblas_int *iblock, rocblas_int *isplit, float *Z, const rocblas_int ldz, rocblas_int *ifail, rocblas_int *info)#

STEIN computes the eigenvectors associated with a set of provided eigenvalues of a symmetric tridiagonal matrix.

The eigenvectors of the symmetric tridiagonal matrix are computed using inverse iteration.

The matrix is not represented explicitly, but rather as the array of diagonal elements D and the array of symmetric off-diagonal elements E. The eigenvalues must be provided in the array W, as returned by STEBZ.

Parameters:
  • handle[in] rocblas_handle.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the tridiagonal matrix.

  • D[in] pointer to real type. Array on the GPU of dimension n. The diagonal elements of the tridiagonal matrix.

  • E[in] pointer to real type. Array on the GPU of dimension n-1. The off-diagonal elements of the tridiagonal matrix.

  • nev[in] pointer to a rocblas_int on the GPU. 0 <= nev <= n. The number of provided eigenvalues, and the number of eigenvectors to be computed.

  • W[in] pointer to real type. Array on the GPU of dimension >= nev. A subset of nev eigenvalues of the tridiagonal matrix, as returned by STEBZ.

  • iblock[in] pointer to rocblas_int. Array on the GPU of dimension n. The block indices corresponding to each eigenvalue, as returned by STEBZ. If iblock[i] = k, then eigenvalue W[i] belongs to the k-th block from the top.

  • isplit[in] pointer to rocblas_int. Array on the GPU of dimension n. The splitting indices that divide the tridiagonal matrix into diagonal blocks, as returned by STEBZ. The k-th block stretches from the end of the (k-1)-th block (or the top left corner of the tridiagonal matrix, in the case of the 1st block) to the isplit[k]-th row/column.

  • Z[out] pointer to type. Array on the GPU of dimension ldz*nev. On exit, contains the eigenvectors of the tridiagonal matrix corresponding to the provided eigenvalues, stored by columns.

  • ldz[in] rocblas_int. ldz >= n. Specifies the leading dimension of Z.

  • ifail[out] pointer to rocblas_int. Array on the GPU of dimension n. If info = 0, the first nev elements of ifail are zero. Otherwise, contains the indices of those eigenvectors that failed to converge.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, i eigenvectors did not converge; their indices are stored in IFAIL.

Symmetric matrices#

rocsolver_<type>lasyf()#

rocblas_status rocsolver_zlasyf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nb, rocblas_int *kb, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_clasyf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nb, rocblas_int *kb, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_dlasyf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nb, rocblas_int *kb, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#
rocblas_status rocsolver_slasyf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nb, rocblas_int *kb, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)#

LASYF computes a partial factorization of a symmetric matrix \(A\) using Bunch-Kaufman diagonal pivoting.

The partial factorization has the form

\[\begin{split} A = \left[ \begin{array}{cc} I & U_{12} \\ 0 & U_{22} \end{array} \right] \left[ \begin{array}{cc} A_{11} & 0 \\ 0 & D \end{array} \right] \left[ \begin{array}{cc} I & 0 \\ U_{12}^T & U_{22}^T \end{array} \right] \end{split}\]

or

\[\begin{split} A = \left[ \begin{array}{cc} L_{11} & 0 \\ L_{21} & I \end{array} \right] \left[ \begin{array}{cc} D & 0 \\ 0 & A_{22} \end{array} \right] \left[ \begin{array}{cc} L_{11}^T & L_{21}^T \\ 0 & I \end{array} \right] \end{split}\]

depending on the value of uplo. The order of the block diagonal matrix \(D\) is either \(nb\) or \(nb-1\), and is returned in the argument \(kb\).

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the upper or lower part of the matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrix A.

  • nb[in] rocblas_int. 2 <= nb <= n. The number of columns of A to be factored.

  • kb[out] pointer to a rocblas_int on the GPU. The number of columns of A that were actually factored (either nb or nb-1).

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the symmetric matrix A to be factored. On exit, the partially factored matrix.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A.

  • ipiv[out] pointer to rocblas_int. Array on the GPU of dimension n. The vector of pivot indices. Elements of ipiv are 1-based indices. If uplo is upper, then only the last kb elements of ipiv will be set. For n - kb < k <= n, if ipiv[k] > 0 then rows and columns k and ipiv[k] were interchanged and D[k,k] is a 1-by-1 diagonal block. If, instead, ipiv[k] = ipiv[k-1] < 0, then rows and columns k-1 and -ipiv[k] were interchanged and D[k-1,k-1] to D[k,k] is a 2-by-2 diagonal block. If uplo is lower, then only the first kb elements of ipiv will be set. For 1 <= k <= kb, if ipiv[k] > 0 then rows and columns k and ipiv[k] were interchanged and D[k,k] is a 1-by-1 diagonal block. If, instead, ipiv[k] = ipiv[k+1] < 0, then rows and columns k+1 and -ipiv[k] were interchanged and D[k,k] to D[k+1,k+1] is a 2-by-2 diagonal block.

  • info[out] pointer to a rocblas_int on the GPU. If info = 0, successful exit. If info = i > 0, D is singular. D[i,i] is the first diagonal zero.

Orthonormal matrices#

rocsolver_<type>org2r()#

rocblas_status rocsolver_dorg2r(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sorg2r(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)#

ORG2R generates an m-by-n Matrix Q with orthonormal columns.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m

\[ Q = H(1)H(2)\cdots H(k). \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEQRF.

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix Q.

  • n[in] rocblas_int. 0 <= n <= m. The number of columns of the matrix Q.

  • k[in] rocblas_int. 0 <= k <= n. The number of Householder reflectors.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A as returned by GEQRF, with the Householder vectors in the first k columns. On exit, the computed matrix Q.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GEQRF.

rocsolver_<type>orgqr()#

rocblas_status rocsolver_dorgqr(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sorgqr(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)#

ORGQR generates an m-by-n Matrix Q with orthonormal columns.

(This is the blocked version of the algorithm).

The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m

\[ Q = H(1)H(2)\cdots H(k) \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEQRF.

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix Q.

  • n[in] rocblas_int. 0 <= n <= m. The number of columns of the matrix Q.

  • k[in] rocblas_int. 0 <= k <= n. The number of Householder reflectors.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A as returned by GEQRF, with the Householder vectors in the first k columns. On exit, the computed matrix Q.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GEQRF.

rocsolver_<type>orgl2()#

rocblas_status rocsolver_dorgl2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sorgl2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)#

ORGL2 generates an m-by-n Matrix Q with orthonormal rows.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n

\[ Q = H(k)H(k-1)\cdots H(1) \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GELQF.

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. 0 <= m <= n. The number of rows of the matrix Q.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix Q.

  • k[in] rocblas_int. 0 <= k <= m. The number of Householder reflectors.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A as returned by GELQF, with the Householder vectors in the first k rows. On exit, the computed matrix Q.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GELQF.

rocsolver_<type>orglq()#

rocblas_status rocsolver_dorglq(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sorglq(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)#

ORGLQ generates an m-by-n Matrix Q with orthonormal rows.

(This is the blocked version of the algorithm).

The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n

\[ Q = H(k)H(k-1)\cdots H(1) \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GELQF.

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. 0 <= m <= n. The number of rows of the matrix Q.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix Q.

  • k[in] rocblas_int. 0 <= k <= m. The number of Householder reflectors.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A as returned by GELQF, with the Householder vectors in the first k rows. On exit, the computed matrix Q.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GELQF.

rocsolver_<type>org2l()#

rocblas_status rocsolver_dorg2l(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sorg2l(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)#

ORG2L generates an m-by-n Matrix Q with orthonormal columns.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the last n columns of the product of k Householder reflectors of order m

\[ Q = H(k)H(k-1)\cdots H(1) \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEQLF.

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix Q.

  • n[in] rocblas_int. 0 <= n <= m. The number of columns of the matrix Q.

  • k[in] rocblas_int. 0 <= k <= n. The number of Householder reflectors.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A as returned by GEQLF, with the Householder vectors in the last k columns. On exit, the computed matrix Q.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GEQLF.

rocsolver_<type>orgql()#

rocblas_status rocsolver_dorgql(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sorgql(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)#

ORGQL generates an m-by-n Matrix Q with orthonormal columns.

(This is the blocked version of the algorithm).

The matrix Q is defined as the last n column of the product of k Householder reflectors of order m

\[ Q = H(k)H(k-1)\cdots H(1) \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEQLF.

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix Q.

  • n[in] rocblas_int. 0 <= n <= m. The number of columns of the matrix Q.

  • k[in] rocblas_int. 0 <= k <= n. The number of Householder reflectors.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A as returned by GEQLF, with the Householder vectors in the last k columns. On exit, the computed matrix Q.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GEQLF.

rocsolver_<type>orgbr()#

rocblas_status rocsolver_dorgbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sorgbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)#

ORGBR generates an m-by-n Matrix Q with orthonormal rows or columns.

If storev is column-wise, then the matrix Q has orthonormal columns. If m >= k, Q is defined as the first n columns of the product of k Householder reflectors of order m

\[ Q = H(1)H(2)\cdots H(k) \]

If m < k, Q is defined as the product of Householder reflectors of order m

\[ Q = H(1)H(2)\cdots H(m-1) \]

On the other hand, if storev is row-wise, then the matrix Q has orthonormal rows. If n > k, Q is defined as the first m rows of the product of k Householder reflectors of order n

\[ Q = H(k)H(k-1)\cdots H(1) \]

If n <= k, Q is defined as the product of Householder reflectors of order n

\[ Q = H(n-1)H(n-2)\cdots H(1) \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEBRD in its arguments A and tauq or taup.

Parameters:
  • handle[in] rocblas_handle.

  • storev[in] rocblas_storev. Specifies whether to work column-wise or row-wise.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix Q. If row-wise, then min(n,k) <= m <= n.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix Q. If column-wise, then min(m,k) <= n <= m.

  • k[in] rocblas_int. k >= 0. The number of columns (if storev is column-wise) or rows (if row-wise) of the original matrix reduced by GEBRD.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the Householder vectors as returned by GEBRD. On exit, the computed matrix Q.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension min(m,k) if column-wise, or min(n,k) if row-wise. The Householder scalars as returned by GEBRD.

rocsolver_<type>orgtr()#

rocblas_status rocsolver_dorgtr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)#
rocblas_status rocsolver_sorgtr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)#

ORGTR generates an n-by-n orthogonal Matrix Q.

Q is defined as the product of n-1 Householder reflectors of order n. If uplo indicates upper, then Q has the form

\[ Q = H(n-1)H(n-2)\cdots H(1) \]

On the other hand, if uplo indicates lower, then Q has the form

\[ Q = H(1)H(2)\cdots H(n-1) \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by SYTRD in its arguments A and tau.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the SYTRD factorization was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrix Q.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the Householder vectors as returned by SYTRD. On exit, the computed matrix Q.

  • lda[in] rocblas_int. lda >= n. Specifies the leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension n-1. The Householder scalars as returned by SYTRD.

rocsolver_<type>orm2r()#

rocblas_status rocsolver_dorm2r(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)#
rocblas_status rocsolver_sorm2r(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)#

ORM2R multiplies a matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H(1)H(2) \cdots H(k) \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.

Parameters:
  • handle[in] rocblas_handle.

  • side[in] rocblas_side. Specifies from which side to apply Q.

  • trans[in] rocblas_operation. Specifies whether the matrix Q or its transpose is to be applied.

  • m[in] rocblas_int. m >= 0. Number of rows of matrix C.

  • n[in] rocblas_int. n >= 0. Number of columns of matrix C.

  • k[in] rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right. The number of Householder reflectors that form Q.

  • A[in] pointer to type. Array on the GPU of size lda*k. The Householder vectors as returned by GEQRF in the first k columns of its argument A.

  • lda[in] rocblas_int. lda >= m if side is left, or lda >= n if side is right. Leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GEQRF.

  • C[inout] pointer to type. Array on the GPU of size ldc*n. On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • ldc[in] rocblas_int. ldc >= m. Leading dimension of C.

rocsolver_<type>ormqr()#

rocblas_status rocsolver_dormqr(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)#
rocblas_status rocsolver_sormqr(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)#

ORMQR multiplies a matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H(1)H(2)\cdots H(k) \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.

Parameters:
  • handle[in] rocblas_handle.

  • side[in] rocblas_side. Specifies from which side to apply Q.

  • trans[in] rocblas_operation. Specifies whether the matrix Q or its transpose is to be applied.

  • m[in] rocblas_int. m >= 0. Number of rows of matrix C.

  • n[in] rocblas_int. n >= 0. Number of columns of matrix C.

  • k[in] rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right. The number of Householder reflectors that form Q.

  • A[in] pointer to type. Array on the GPU of size lda*k. The Householder vectors as returned by GEQRF in the first k columns of its argument A.

  • lda[in] rocblas_int. lda >= m if side is left, or lda >= n if side is right. Leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GEQRF.

  • C[inout] pointer to type. Array on the GPU of size ldc*n. On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • ldc[in] rocblas_int. ldc >= m. Leading dimension of C.

rocsolver_<type>orml2()#

rocblas_status rocsolver_dorml2(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)#
rocblas_status rocsolver_sorml2(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)#

ORML2 multiplies a matrix Q with orthonormal rows by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H(k)H(k-1)\cdots H(1) \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.

Parameters:
  • handle[in] rocblas_handle.

  • side[in] rocblas_side. Specifies from which side to apply Q.

  • trans[in] rocblas_operation. Specifies whether the matrix Q or its transpose is to be applied.

  • m[in] rocblas_int. m >= 0. Number of rows of matrix C.

  • n[in] rocblas_int. n >= 0. Number of columns of matrix C.

  • k[in] rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right. The number of Householder reflectors that form Q.

  • A[in] pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right. The Householder vectors as returned by GELQF in the first k rows of its argument A.

  • lda[in] rocblas_int. lda >= k. Leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GELQF.

  • C[inout] pointer to type. Array on the GPU of size ldc*n. On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • ldc[in] rocblas_int. ldc >= m. Leading dimension of C.

rocsolver_<type>ormlq()#

rocblas_status rocsolver_dormlq(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)#
rocblas_status rocsolver_sormlq(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)#

ORMLQ multiplies a matrix Q with orthonormal rows by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H(k)H(k-1)\cdots H(1) \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.

Parameters:
  • handle[in] rocblas_handle.

  • side[in] rocblas_side. Specifies from which side to apply Q.

  • trans[in] rocblas_operation. Specifies whether the matrix Q or its transpose is to be applied.

  • m[in] rocblas_int. m >= 0. Number of rows of matrix C.

  • n[in] rocblas_int. n >= 0. Number of columns of matrix C.

  • k[in] rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right. The number of Householder reflectors that form Q.

  • A[in] pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right. The Householder vectors as returned by GELQF in the first k rows of its argument A.

  • lda[in] rocblas_int. lda >= k. Leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GELQF.

  • C[inout] pointer to type. Array on the GPU of size ldc*n. On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • ldc[in] rocblas_int. ldc >= m. Leading dimension of C.

rocsolver_<type>orm2l()#

rocblas_status rocsolver_dorm2l(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)#
rocblas_status rocsolver_sorm2l(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)#

ORM2L multiplies a matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H(k)H(k-1)\cdots H(1) \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QL factorization GEQLF.

Parameters:
  • handle[in] rocblas_handle.

  • side[in] rocblas_side. Specifies from which side to apply Q.

  • trans[in] rocblas_operation. Specifies whether the matrix Q or its transpose is to be applied.

  • m[in] rocblas_int. m >= 0. Number of rows of matrix C.

  • n[in] rocblas_int. n >= 0. Number of columns of matrix C.

  • k[in] rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right. The number of Householder reflectors that form Q.

  • A[in] pointer to type. Array on the GPU of size lda*k. The Householder vectors as returned by GEQLF in the last k columns of its argument A.

  • lda[in] rocblas_int. lda >= m if side is left, lda >= n if side is right. Leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GEQLF.

  • C[inout] pointer to type. Array on the GPU of size ldc*n. On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • ldc[in] rocblas_int. ldc >= m. Leading dimension of C.

rocsolver_<type>ormql()#

rocblas_status rocsolver_dormql(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)#
rocblas_status rocsolver_sormql(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)#

ORMQL multiplies a matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H(k)H(k-1)\cdots H(1) \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QL factorization GEQLF.

Parameters:
  • handle[in] rocblas_handle.

  • side[in] rocblas_side. Specifies from which side to apply Q.

  • trans[in] rocblas_operation. Specifies whether the matrix Q or its transpose is to be applied.

  • m[in] rocblas_int. m >= 0. Number of rows of matrix C.

  • n[in] rocblas_int. n >= 0. Number of columns of matrix C.

  • k[in] rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right. The number of Householder reflectors that form Q.

  • A[in] pointer to type. Array on the GPU of size lda*k. The Householder vectors as returned by GEQLF in the last k columns of its argument A.

  • lda[in] rocblas_int. lda >= m if side is left, lda >= n if side is right. Leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GEQLF.

  • C[inout] pointer to type. Array on the GPU of size ldc*n. On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • ldc[in] rocblas_int. ldc >= m. Leading dimension of C.

rocsolver_<type>ormbr()#

rocblas_status rocsolver_dormbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)#
rocblas_status rocsolver_sormbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)#

ORMBR multiplies a matrix Q with orthonormal rows or columns by a general m-by-n matrix C.

If storev is column-wise, then the matrix Q has orthonormal columns. If storev is row-wise, then the matrix Q has orthonormal rows. The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}\]

The order q of the orthogonal matrix Q is q = m if applying from the left, or q = n if applying from the right.

When storev is column-wise, if q >= k, then Q is defined as the product of k Householder reflectors

\[ Q = H(1)H(2)\cdots H(k), \]

and if q < k, then Q is defined as the product

\[ Q = H(1)H(2)\cdots H(q-1). \]

When storev is row-wise, if q > k, then Q is defined as the product of k Householder reflectors

\[ Q = H(1)H(2)\cdots H(k), \]

and if q <= k, Q is defined as the product

\[ Q = H(1)H(2)\cdots H(q-1). \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors and scalars as returned by GEBRD in its arguments A and tauq or taup.

Parameters:
  • handle[in] rocblas_handle.

  • storev[in] rocblas_storev. Specifies whether to work column-wise or row-wise.

  • side[in] rocblas_side. Specifies from which side to apply Q.

  • trans[in] rocblas_operation. Specifies whether the matrix Q or its transpose is to be applied.

  • m[in] rocblas_int. m >= 0. Number of rows of matrix C.

  • n[in] rocblas_int. n >= 0. Number of columns of matrix C.

  • k[in] rocblas_int. k >= 0. The number of columns (if storev is column-wise) or rows (if row-wise) of the original matrix reduced by GEBRD.

  • A[in] pointer to type. Array on the GPU of size lda*min(q,k) if column-wise, or lda*q if row-wise. The Householder vectors as returned by GEBRD.

  • lda[in] rocblas_int. lda >= q if column-wise, or lda >= min(q,k) if row-wise. Leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least min(q,k). The Householder scalars as returned by GEBRD.

  • C[inout] pointer to type. Array on the GPU of size ldc*n. On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • ldc[in] rocblas_int. ldc >= m. Leading dimension of C.

rocsolver_<type>ormtr()#

rocblas_status rocsolver_dormtr(rocblas_handle handle, const rocblas_side side, const rocblas_fill uplo, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)#
rocblas_status rocsolver_sormtr(rocblas_handle handle, const rocblas_side side, const rocblas_fill uplo, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)#

ORMTR multiplies an orthogonal matrix Q by a general m-by-n matrix C.

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}\]

The order q of the orthogonal matrix Q is q = m if applying from the left, or q = n if applying from the right.

Q is defined as a product of q-1 Householder reflectors. If uplo indicates upper, then Q has the form

\[ Q = H(q-1)H(q-2)\cdots H(1). \]

On the other hand, if uplo indicates lower, then Q has the form

\[ Q = H(1)H(2)\cdots H(q-1) \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors and scalars as returned by SYTRD in its arguments A and tau.

Parameters:
  • handle[in] rocblas_handle.

  • side[in] rocblas_side. Specifies from which side to apply Q.

  • uplo[in] rocblas_fill. Specifies whether the SYTRD factorization was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • trans[in] rocblas_operation. Specifies whether the matrix Q or its transpose is to be applied.

  • m[in] rocblas_int. m >= 0. Number of rows of matrix C.

  • n[in] rocblas_int. n >= 0. Number of columns of matrix C.

  • A[in] pointer to type. Array on the GPU of size lda*q. On entry, the Householder vectors as returned by SYTRD.

  • lda[in] rocblas_int. lda >= q. Leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least q-1. The Householder scalars as returned by SYTRD.

  • C[inout] pointer to type. Array on the GPU of size ldc*n. On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • ldc[in] rocblas_int. ldc >= m. Leading dimension of C.

Unitary matrices#

rocsolver_<type>ung2r()#

rocblas_status rocsolver_zung2r(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cung2r(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#

UNG2R generates an m-by-n complex Matrix Q with orthonormal columns.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m

\[ Q = H(1)H(2)\cdots H(k) \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEQRF.

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix Q.

  • n[in] rocblas_int. 0 <= n <= m. The number of columns of the matrix Q.

  • k[in] rocblas_int. 0 <= k <= n. The number of Householder reflectors.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A as returned by GEQRF, with the Householder vectors in the first k columns. On exit, the computed matrix Q.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GEQRF.

rocsolver_<type>ungqr()#

rocblas_status rocsolver_zungqr(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cungqr(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#

UNGQR generates an m-by-n complex Matrix Q with orthonormal columns.

(This is the blocked version of the algorithm).

The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m

\[ Q = H(1)H(2)\cdots H(k) \]

Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEQRF.

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix Q.

  • n[in] rocblas_int. 0 <= n <= m. The number of columns of the matrix Q.

  • k[in] rocblas_int. 0 <= k <= n. The number of Householder reflectors.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A as returned by GEQRF, with the Householder vectors in the first k columns. On exit, the computed matrix Q.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GEQRF.

rocsolver_<type>ungl2()#

rocblas_status rocsolver_zungl2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cungl2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#

UNGL2 generates an m-by-n complex Matrix Q with orthonormal rows.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n

\[ Q = H(k)^HH(k-1)^H\cdots H(1)^H \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GELQF.

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. 0 <= m <= n. The number of rows of the matrix Q.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix Q.

  • k[in] rocblas_int. 0 <= k <= m. The number of Householder reflectors.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A as returned by GELQF, with the Householder vectors in the first k rows. On exit, the computed matrix Q.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GELQF.

rocsolver_<type>unglq()#

rocblas_status rocsolver_zunglq(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cunglq(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#

UNGLQ generates an m-by-n complex Matrix Q with orthonormal rows.

(This is the blocked version of the algorithm).

The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n

\[ Q = H(k)^HH(k-1)^H\cdots H(1)^H \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GELQF.

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. 0 <= m <= n. The number of rows of the matrix Q.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix Q.

  • k[in] rocblas_int. 0 <= k <= m. The number of Householder reflectors.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A as returned by GELQF, with the Householder vectors in the first k rows. On exit, the computed matrix Q.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GELQF.

rocsolver_<type>ung2l()#

rocblas_status rocsolver_zung2l(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cung2l(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#

UNG2L generates an m-by-n complex Matrix Q with orthonormal columns.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the last n columns of the product of k Householder reflectors of order m

\[ Q = H(k)H(k-1)\cdots H(1) \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEQLF.

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix Q.

  • n[in] rocblas_int. 0 <= n <= m. The number of columns of the matrix Q.

  • k[in] rocblas_int. 0 <= k <= n. The number of Householder reflectors.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A as returned by GEQLF, with the Householder vectors in the last k columns. On exit, the computed matrix Q.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GEQLF.

rocsolver_<type>ungql()#

rocblas_status rocsolver_zungql(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cungql(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#

UNGQL generates an m-by-n complex Matrix Q with orthonormal columns.

(This is the blocked version of the algorithm).

The matrix Q is defined as the last n columns of the product of k Householder reflectors of order m

\[ Q = H(k)H(k-1)\cdots H(1) \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEQLF.

Parameters:
  • handle[in] rocblas_handle.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix Q.

  • n[in] rocblas_int. 0 <= n <= m. The number of columns of the matrix Q.

  • k[in] rocblas_int. 0 <= k <= n. The number of Householder reflectors.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the matrix A as returned by GEQLF, with the Householder vectors in the last k columns. On exit, the computed matrix Q.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GEQLF.

rocsolver_<type>ungbr()#

rocblas_status rocsolver_zungbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cungbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#

UNGBR generates an m-by-n complex Matrix Q with orthonormal rows or columns.

If storev is column-wise, then the matrix Q has orthonormal columns. If m >= k, Q is defined as the first n columns of the product of k Householder reflectors of order m

\[ Q = H(1)H(2)\cdots H(k) \]

If m < k, Q is defined as the product of Householder reflectors of order m

\[ Q = H(1)H(2)\cdots H(m-1) \]

On the other hand, if storev is row-wise, then the matrix Q has orthonormal rows. If n > k, Q is defined as the first m rows of the product of k Householder reflectors of order n

\[ Q = H(k)H(k-1)\cdots H(1) \]

If n <= k, Q is defined as the product of Householder reflectors of order n

\[ Q = H(n-1)H(n-2)\cdots H(1) \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEBRD in its arguments A and tauq or taup.

Parameters:
  • handle[in] rocblas_handle.

  • storev[in] rocblas_storev. Specifies whether to work column-wise or row-wise.

  • m[in] rocblas_int. m >= 0. The number of rows of the matrix Q. If row-wise, then min(n,k) <= m <= n.

  • n[in] rocblas_int. n >= 0. The number of columns of the matrix Q. If column-wise, then min(m,k) <= n <= m.

  • k[in] rocblas_int. k >= 0. The number of columns (if storev is column-wise) or rows (if row-wise) of the original matrix reduced by GEBRD.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the Householder vectors as returned by GEBRD. On exit, the computed matrix Q.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension min(m,k) if column-wise, or min(n,k) if row-wise. The Householder scalars as returned by GEBRD.

rocsolver_<type>ungtr()#

rocblas_status rocsolver_zungtr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)#
rocblas_status rocsolver_cungtr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)#

UNGTR generates an n-by-n unitary Matrix Q.

Q is defined as the product of n-1 Householder reflectors of order n. If uplo indicates upper, then Q has the form

\[ Q = H(n-1)H(n-2)\cdots H(1) \]

On the other hand, if uplo indicates lower, then Q has the form

\[ Q = H(1)H(2)\cdots H(n-1) \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by HETRD in its arguments A and tau.

Parameters:
  • handle[in] rocblas_handle.

  • uplo[in] rocblas_fill. Specifies whether the HETRD factorization was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • n[in] rocblas_int. n >= 0. The number of rows and columns of the matrix Q.

  • A[inout] pointer to type. Array on the GPU of dimension lda*n. On entry, the Householder vectors as returned by HETRD. On exit, the computed matrix Q.

  • lda[in] rocblas_int. lda >= m. Specifies the leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension n-1. The Householder scalars as returned by HETRD.

rocsolver_<type>unm2r()#

rocblas_status rocsolver_zunm2r(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)#
rocblas_status rocsolver_cunm2r(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)#

UNM2R multiplies a complex matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H(1)H(2)\cdots H(k) \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.

Parameters:
  • handle[in] rocblas_handle.

  • side[in] rocblas_side. Specifies from which side to apply Q.

  • trans[in] rocblas_operation. Specifies whether the matrix Q or its conjugate transpose is to be applied.

  • m[in] rocblas_int. m >= 0. Number of rows of matrix C.

  • n[in] rocblas_int. n >= 0. Number of columns of matrix C.

  • k[in] rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right. The number of Householder reflectors that form Q.

  • A[in] pointer to type. Array on the GPU of size lda*k. The Householder vectors as returned by GEQRF in the first k columns of its argument A.

  • lda[in] rocblas_int. lda >= m if side is left, or lda >= n if side is right. Leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GEQRF.

  • C[inout] pointer to type. Array on the GPU of size ldc*n. On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • ldc[in] rocblas_int. ldc >= m. Leading dimension of C.

rocsolver_<type>unmqr()#

rocblas_status rocsolver_zunmqr(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)#
rocblas_status rocsolver_cunmqr(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)#

UNMQR multiplies a complex matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H(1)H(2)\cdots H(k) \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.

Parameters:
  • handle[in] rocblas_handle.

  • side[in] rocblas_side. Specifies from which side to apply Q.

  • trans[in] rocblas_operation. Specifies whether the matrix Q or its conjugate transpose is to be applied.

  • m[in] rocblas_int. m >= 0. Number of rows of matrix C.

  • n[in] rocblas_int. n >= 0. Number of columns of matrix C.

  • k[in] rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right. The number of Householder reflectors that form Q.

  • A[in] pointer to type. Array on the GPU of size lda*k. The Householder vectors as returned by GEQRF in the first k columns of its argument A.

  • lda[in] rocblas_int. lda >= m if side is left, or lda >= n if side is right. Leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GEQRF.

  • C[inout] pointer to type. Array on the GPU of size ldc*n. On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • ldc[in] rocblas_int. ldc >= m. Leading dimension of C.

rocsolver_<type>unml2()#

rocblas_status rocsolver_zunml2(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)#
rocblas_status rocsolver_cunml2(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)#

UNML2 multiplies a complex matrix Q with orthonormal rows by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H(k)^HH(k-1)^H\cdots H(1)^H \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.

Parameters:
  • handle[in] rocblas_handle.

  • side[in] rocblas_side. Specifies from which side to apply Q.

  • trans[in] rocblas_operation. Specifies whether the matrix Q or its conjugate transpose is to be applied.

  • m[in] rocblas_int. m >= 0. Number of rows of matrix C.

  • n[in] rocblas_int. n >= 0. Number of columns of matrix C.

  • k[in] rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right. The number of Householder reflectors that form Q.

  • A[in] pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right. The Householder vectors as returned by GELQF in the first k rows of its argument A.

  • lda[in] rocblas_int. lda >= k. Leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GELQF.

  • C[inout] pointer to type. Array on the GPU of size ldc*n. On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • ldc[in] rocblas_int. ldc >= m. Leading dimension of C.

rocsolver_<type>unmlq()#

rocblas_status rocsolver_zunmlq(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)#
rocblas_status rocsolver_cunmlq(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)#

UNMLQ multiplies a complex matrix Q with orthonormal rows by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H(k)^HH(k-1)^H\cdots H(1)^H \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.

Parameters:
  • handle[in] rocblas_handle.

  • side[in] rocblas_side. Specifies from which side to apply Q.

  • trans[in] rocblas_operation. Specifies whether the matrix Q or its conjugate transpose is to be applied.

  • m[in] rocblas_int. m >= 0. Number of rows of matrix C.

  • n[in] rocblas_int. n >= 0. Number of columns of matrix C.

  • k[in] rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right. The number of Householder reflectors that form Q.

  • A[in] pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right. The Householder vectors as returned by GELQF in the first k rows of its argument A.

  • lda[in] rocblas_int. lda >= k. Leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GELQF.

  • C[inout] pointer to type. Array on the GPU of size ldc*n. On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • ldc[in] rocblas_int. ldc >= m. Leading dimension of C.

rocsolver_<type>unm2l()#

rocblas_status rocsolver_zunm2l(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)#
rocblas_status rocsolver_cunm2l(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)#

UNM2L multiplies a complex matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H(k)H(k-1)\cdots H(1) \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QL factorization GEQLF.

Parameters:
  • handle[in] rocblas_handle.

  • side[in] rocblas_side. Specifies from which side to apply Q.

  • trans[in] rocblas_operation. Specifies whether the matrix Q or its conjugate transpose is to be applied.

  • m[in] rocblas_int. m >= 0. Number of rows of matrix C.

  • n[in] rocblas_int. n >= 0. Number of columns of matrix C.

  • k[in] rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right. The number of Householder reflectors that form Q.

  • A[in] pointer to type. Array on the GPU of size lda*k. The Householder vectors as returned by GEQLF in the last k columns of its argument A.

  • lda[in] rocblas_int. lda >= m if side is left, lda >= n if side is right. Leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GEQLF.

  • C[inout] pointer to type. Array on the GPU of size ldc*n. On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • ldc[in] rocblas_int. ldc >= m. Leading dimension of C.

rocsolver_<type>unmql()#

rocblas_status rocsolver_zunmql(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)#
rocblas_status rocsolver_cunmql(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)#

UNMQL multiplies a complex matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H(k)H(k-1)\cdots H(1) \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QL factorization GEQLF.

Parameters:
  • handle[in] rocblas_handle.

  • side[in] rocblas_side. Specifies from which side to apply Q.

  • trans[in] rocblas_operation. Specifies whether the matrix Q or its conjugate transpose is to be applied.

  • m[in] rocblas_int. m >= 0. Number of rows of matrix C.

  • n[in] rocblas_int. n >= 0. Number of columns of matrix C.

  • k[in] rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right. The number of Householder reflectors that form Q.

  • A[in] pointer to type. Array on the GPU of size lda*k. The Householder vectors as returned by GEQLF in the last k columns of its argument A.

  • lda[in] rocblas_int. lda >= m if side is left, lda >= n if side is right. Leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least k. The Householder scalars as returned by GEQLF.

  • C[inout] pointer to type. Array on the GPU of size ldc*n. On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • ldc[in] rocblas_int. ldc >= m. Leading dimension of C.

rocsolver_<type>unmbr()#

rocblas_status rocsolver_zunmbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)#
rocblas_status rocsolver_cunmbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)#

UNMBR multiplies a complex matrix Q with orthonormal rows or columns by a general m-by-n matrix C.

If storev is column-wise, then the matrix Q has orthonormal columns. If storev is row-wise, then the matrix Q has orthonormal rows. The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}\]

The order q of the unitary matrix Q is q = m if applying from the left, or q = n if applying from the right.

When storev is column-wise, if q >= k, then Q is defined as the product of k Householder reflectors

\[ Q = H(1)H(2)\cdots H(k), \]

and if q < k, then Q is defined as the product

\[ Q = H(1)H(2)\cdots H(q-1). \]

When storev is row-wise, if q > k, then Q is defined as the product of k Householder reflectors

\[ Q = H(1)H(2)\cdots H(k), \]

and if q <= k, Q is defined as the product

\[ Q = H(1)H(2)\cdots H(q-1). \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors and scalars as returned by GEBRD in its arguments A and tauq or taup.

Parameters:
  • handle[in] rocblas_handle.

  • storev[in] rocblas_storev. Specifies whether to work column-wise or row-wise.

  • side[in] rocblas_side. Specifies from which side to apply Q.

  • trans[in] rocblas_operation. Specifies whether the matrix Q or its conjugate transpose is to be applied.

  • m[in] rocblas_int. m >= 0. Number of rows of matrix C.

  • n[in] rocblas_int. n >= 0. Number of columns of matrix C.

  • k[in] rocblas_int. k >= 0. The number of columns (if storev is column-wise) or rows (if row-wise) of the original matrix reduced by GEBRD.

  • A[in] pointer to type. Array on the GPU of size lda*min(q,k) if column-wise, or lda*q if row-wise. The Householder vectors as returned by GEBRD.

  • lda[in] rocblas_int. lda >= q if column-wise, or lda >= min(q,k) if row-wise. Leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least min(q,k). The Householder scalars as returned by GEBRD.

  • C[inout] pointer to type. Array on the GPU of size ldc*n. On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • ldc[in] rocblas_int. ldc >= m. Leading dimension of C.

rocsolver_<type>unmtr()#

rocblas_status rocsolver_zunmtr(rocblas_handle handle, const rocblas_side side, const rocblas_fill uplo, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)#
rocblas_status rocsolver_cunmtr(rocblas_handle handle, const rocblas_side side, const rocblas_fill uplo, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)#

UNMTR multiplies a unitary matrix Q by a general m-by-n matrix C.

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}\]

The order q of the unitary matrix Q is q = m if applying from the left, or q = n if applying from the right.

Q is defined as a product of q-1 Householder reflectors. If uplo indicates upper, then Q has the form

\[ Q = H(q-1)H(q-2)\cdots H(1). \]

On the other hand, if uplo indicates lower, then Q has the form

\[ Q = H(1)H(2)\cdots H(q-1) \]

The Householder matrices \(H(i)\) are never stored, they are computed from its corresponding Householder vectors and scalars as returned by HETRD in its arguments A and tau.

Parameters:
  • handle[in] rocblas_handle.

  • side[in] rocblas_side. Specifies from which side to apply Q.

  • uplo[in] rocblas_fill. Specifies whether the HETRD factorization was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • trans[in] rocblas_operation. Specifies whether the matrix Q or its conjugate transpose is to be applied.

  • m[in] rocblas_int. m >= 0. Number of rows of matrix C.

  • n[in] rocblas_int. n >= 0. Number of columns of matrix C.

  • A[in] pointer to type. Array on the GPU of size lda*q. On entry, the Householder vectors as returned by HETRD.

  • lda[in] rocblas_int. lda >= q. Leading dimension of A.

  • ipiv[in] pointer to type. Array on the GPU of dimension at least q-1. The Householder scalars as returned by HETRD.

  • C[inout] pointer to type. Array on the GPU of size ldc*n. On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • ldc[in] rocblas_int. ldc >= m. Leading dimension of C.