Sparse level 3 functions#

This module contains all sparse level 3 routines.

The sparse level 3 routines describe operations between a matrix in sparse format and multiple vectors in dense format that can also be seen as a dense matrix.

hipsparseXbsrmm()#

hipsparseStatus_t hipsparseSbsrmm(hipsparseHandle_t handle, hipsparseDirection_t dirA, hipsparseOperation_t transA, hipsparseOperation_t transB, int mb, int n, int kb, int nnzb, const float *alpha, const hipsparseMatDescr_t descrA, const float *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, const float *B, int ldb, const float *beta, float *C, int ldc)#
hipsparseStatus_t hipsparseDbsrmm(hipsparseHandle_t handle, hipsparseDirection_t dirA, hipsparseOperation_t transA, hipsparseOperation_t transB, int mb, int n, int kb, int nnzb, const double *alpha, const hipsparseMatDescr_t descrA, const double *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, const double *B, int ldb, const double *beta, double *C, int ldc)#
hipsparseStatus_t hipsparseCbsrmm(hipsparseHandle_t handle, hipsparseDirection_t dirA, hipsparseOperation_t transA, hipsparseOperation_t transB, int mb, int n, int kb, int nnzb, const hipComplex *alpha, const hipsparseMatDescr_t descrA, const hipComplex *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, const hipComplex *B, int ldb, const hipComplex *beta, hipComplex *C, int ldc)#
hipsparseStatus_t hipsparseZbsrmm(hipsparseHandle_t handle, hipsparseDirection_t dirA, hipsparseOperation_t transA, hipsparseOperation_t transB, int mb, int n, int kb, int nnzb, const hipDoubleComplex *alpha, const hipsparseMatDescr_t descrA, const hipDoubleComplex *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, const hipDoubleComplex *B, int ldb, const hipDoubleComplex *beta, hipDoubleComplex *C, int ldc)#

Sparse matrix dense matrix multiplication using BSR storage format.

hipsparseXbsrmm multiplies the scalar \(\alpha\) with a sparse \(m \times k\) matrix \(A\), defined in BSR storage format, and the column-oriented dense \(k \times n\) matrix \(B\) and adds the result to the column-oriented dense \(m \times n\) matrix \(C\) that is multiplied by the scalar \(\beta\), such that

\[ C := \alpha \cdot op(A) \cdot op(B) + \beta \cdot C, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if transA == HIPSPARSE_OPERATION_NON_TRANSPOSE} \\ \end{array} \right. \end{split}\]
and
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if transB == HIPSPARSE_OPERATION_NON_TRANSPOSE} \\ B^T, & \text{if transB == HIPSPARSE_OPERATION_TRANSPOSE} \\ \end{array} \right. \end{split}\]
and where \(k = blockDim \times kb\) and \(m = blockDim \times mb\).

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only transA == HIPSPARSE_OPERATION_NON_TRANSPOSE is supported.

Parameters:
  • handle[in] handle to the hipsparse library context queue.

  • dirA[in] the storage format of the blocks. Can be HIPSPARSE_DIRECTION_ROW or HIPSPARSE_DIRECTION_COLUMN.

  • transA[in] matrix \(A\) operation type. Currently, only HIPSPARSE_OPERATION_NON_TRANSPOSE is supported.

  • transB[in] matrix \(B\) operation type. Currently, only HIPSPARSE_OPERATION_NON_TRANSPOSE and HIPSPARSE_OPERATION_TRANSPOSE are supported.

  • mb[in] number of block rows of the sparse BSR matrix \(A\). Must be non-negative.

  • n[in] number of columns of the dense matrix \(op(B)\) and \(C\). Must be non-negative.

  • kb[in] number of block columns of the sparse BSR matrix \(A\). Must be non-negative.

  • nnzb[in] number of non-zero blocks of the sparse BSR matrix \(A\). Must be non-negative.

  • alpha[in] scalar \(\alpha\).

  • descrA[in] descriptor of the sparse BSR matrix \(A\). Currently, only HIPSPARSE_MATRIX_TYPE_GENERAL is supported.

  • bsrValA[in] array of nnzb*blockDim*blockDim elements of the sparse BSR matrix \(A\).

  • bsrRowPtrA[in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix \(A\).

  • bsrColIndA[in] array of nnzb elements containing the block column indices of the sparse BSR matrix \(A\).

  • blockDim[in] size of the blocks in the sparse BSR matrix. Must be positive.

  • B[in] array of dimension ldb*n ( \(op(B) == B\)), ldb*k otherwise.

  • ldb[in] leading dimension of \(B\), must be at least \(\max{(1, k)}\) ( \( op(B) == B\)) where k=blockDim*kb, \(\max{(1, n)}\) otherwise.

  • beta[in] scalar \(\beta\).

  • C[inout] array of dimension ldc*n.

  • ldc[in] leading dimension of \(C\), must be at least \(\max{(1, m)}\) ( \( op(A) == A\)) where m=blockDim*mb, \(\max{(1, k)}\) where k=blockDim*kb otherwise.

Return values:
  • HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.

  • HIPSPARSE_STATUS_NOT_INITIALIZEDhandle is not initialized.

  • HIPSPARSE_STATUS_INVALID_VALUEhandle, descrA, alpha or beta is nullptr, mb, n, kb or nnzb is negative, ldb or ldc is invalid, blockDim is less than or equal to zero, or bsrValA, bsrRowPtrA, bsrColIndA, B or C is nullptr.

  • HIPSPARSE_STATUS_ARCH_MISMATCH – the device is not supported.

  • HIPSPARSE_STATUS_NOT_SUPPORTEDtransA is not HIPSPARSE_OPERATION_NON_TRANSPOSE, transB is HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE, or hipsparseMatrixType_t is not HIPSPARSE_MATRIX_TYPE_GENERAL.

  1int main(int argc, char* argv[])
  2{
  3    // hipSPARSE handle
  4    hipsparseHandle_t handle;
  5    HIPSPARSE_CHECK(hipsparseCreate(&handle));
  6
  7    //     1 2 0 3 0 0
  8    // A = 0 4 5 0 0 0
  9    //     0 0 0 7 8 0
 10    //     0 0 1 2 4 1
 11
 12    const int                  blockDim = 2;
 13    const int                  mb       = 2;
 14    const int                  kb       = 3;
 15    const int                  nnzb     = 4;
 16    const hipsparseDirection_t dir      = HIPSPARSE_DIRECTION_ROW;
 17
 18    std::vector<int>   hbsrRowPtr = {0, 2, 4};
 19    std::vector<int>   hbsrColInd = {0, 1, 1, 2};
 20    std::vector<float> hbsrVal    = {1, 2, 0, 4, 0, 3, 5, 0, 0, 7, 1, 2, 8, 0, 4, 1};
 21
 22    // Set dimension n of B
 23    const int n = 3;
 24    const int m = mb * blockDim;
 25    const int k = kb * blockDim;
 26
 27    // Allocate and generate dense matrix B (k x n)
 28    std::vector<float> hB = {1.0f,
 29                             2.0f,
 30                             3.0f,
 31                             4.0f,
 32                             5.0f,
 33                             6.0f,
 34                             7.0f,
 35                             8.0f,
 36                             9.0f,
 37                             10.0f,
 38                             11.0f,
 39                             12.0f,
 40                             13.0f,
 41                             14.0f,
 42                             15.0f,
 43                             16.0f,
 44                             17.0f,
 45                             18.0f};
 46
 47    int*   dbsrRowPtr = NULL;
 48    int*   dbsrColInd = NULL;
 49    float* dbsrVal    = NULL;
 50    HIP_CHECK(hipMalloc((void**)&dbsrRowPtr, sizeof(int) * (mb + 1)));
 51    HIP_CHECK(hipMalloc((void**)&dbsrColInd, sizeof(int) * nnzb));
 52    HIP_CHECK(hipMalloc((void**)&dbsrVal, sizeof(float) * nnzb * blockDim * blockDim));
 53    HIP_CHECK(
 54        hipMemcpy(dbsrRowPtr, hbsrRowPtr.data(), sizeof(int) * (mb + 1), hipMemcpyHostToDevice));
 55    HIP_CHECK(hipMemcpy(dbsrColInd, hbsrColInd.data(), sizeof(int) * nnzb, hipMemcpyHostToDevice));
 56    HIP_CHECK(hipMemcpy(dbsrVal,
 57                        hbsrVal.data(),
 58                        sizeof(float) * nnzb * blockDim * blockDim,
 59                        hipMemcpyHostToDevice));
 60
 61    // Copy B to the device
 62    float* dB;
 63    HIP_CHECK(hipMalloc((void**)&dB, sizeof(float) * k * n));
 64    HIP_CHECK(hipMemcpy(dB, hB.data(), sizeof(float) * k * n, hipMemcpyHostToDevice));
 65
 66    // alpha and beta
 67    float alpha = 1.0f;
 68    float beta  = 0.0f;
 69
 70    // Allocate memory for the resulting matrix C
 71    float* dC;
 72    HIP_CHECK(hipMalloc((void**)&dC, sizeof(float) * m * n));
 73
 74    // Matrix descriptor
 75    hipsparseMatDescr_t descr;
 76    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));
 77
 78    // Perform the matrix multiplication
 79    HIPSPARSE_CHECK(hipsparseSbsrmm(handle,
 80                                    dir,
 81                                    HIPSPARSE_OPERATION_NON_TRANSPOSE,
 82                                    HIPSPARSE_OPERATION_NON_TRANSPOSE,
 83                                    mb,
 84                                    n,
 85                                    kb,
 86                                    nnzb,
 87                                    &alpha,
 88                                    descr,
 89                                    dbsrVal,
 90                                    dbsrRowPtr,
 91                                    dbsrColInd,
 92                                    blockDim,
 93                                    dB,
 94                                    k,
 95                                    &beta,
 96                                    dC,
 97                                    m));
 98
 99    // Copy results to host
100    std::vector<float> hC(m * n);
101    HIP_CHECK(hipMemcpy(hC.data(), dC, sizeof(float) * m * n, hipMemcpyDeviceToHost));
102
103    std::cout << "hC" << std::endl;
104    for(int i = 0; i < m * n; i++)
105    {
106        std::cout << hC[i] << " ";
107    }
108    std::cout << std::endl;
109
110    HIP_CHECK(hipFree(dbsrRowPtr));
111    HIP_CHECK(hipFree(dbsrColInd));
112    HIP_CHECK(hipFree(dbsrVal));
113    HIP_CHECK(hipFree(dB));
114    HIP_CHECK(hipFree(dC));
115
116    return 0;
117}

hipsparseXcsrmm()#

hipsparseStatus_t hipsparseScsrmm(hipsparseHandle_t handle, hipsparseOperation_t transA, int m, int n, int k, int nnz, const float *alpha, const hipsparseMatDescr_t descrA, const float *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, const float *B, int ldb, const float *beta, float *C, int ldc)#
hipsparseStatus_t hipsparseDcsrmm(hipsparseHandle_t handle, hipsparseOperation_t transA, int m, int n, int k, int nnz, const double *alpha, const hipsparseMatDescr_t descrA, const double *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, const double *B, int ldb, const double *beta, double *C, int ldc)#
hipsparseStatus_t hipsparseCcsrmm(hipsparseHandle_t handle, hipsparseOperation_t transA, int m, int n, int k, int nnz, const hipComplex *alpha, const hipsparseMatDescr_t descrA, const hipComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, const hipComplex *B, int ldb, const hipComplex *beta, hipComplex *C, int ldc)#
hipsparseStatus_t hipsparseZcsrmm(hipsparseHandle_t handle, hipsparseOperation_t transA, int m, int n, int k, int nnz, const hipDoubleComplex *alpha, const hipsparseMatDescr_t descrA, const hipDoubleComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, const hipDoubleComplex *B, int ldb, const hipDoubleComplex *beta, hipDoubleComplex *C, int ldc)#

Sparse matrix dense matrix multiplication using CSR storage format.

hipsparseXcsrmm multiplies the scalar \(\alpha\) with a sparse \(m \times k\) matrix \(A\), defined in CSR storage format, and the column-oriented dense \(k \times n\) matrix \(B\) and adds the result to the column-oriented dense \(m \times n\) matrix \(C\) that is multiplied by the scalar \(\beta\), such that

\[ C := \alpha \cdot op(A) \cdot B + \beta \cdot C, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if transA == HIPSPARSE_OPERATION_NON_TRANSPOSE} \\ A^T, & \text{if transA == HIPSPARSE_OPERATION_TRANSPOSE} \\ A^H, & \text{if transA == HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE} \end{array} \right. \end{split}\]

for(i = 0; i < ldc; ++i)
{
    for(j = 0; j < n; ++j)
    {
        C[i][j] = beta * C[i][j];

        for(k = csrRowPtr[i]; k < csrRowPtr[i + 1]; ++k)
        {
            C[i][j] += alpha * csrVal[k] * B[csrColInd[k]][j];
        }
    }
}

Deprecated:

This function is deprecated when using the CUDA backend (CUDA 10.0+) and will be removed in CUDA 11.0. This deprecation does not apply to the ROCm backend.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:
  • handle[in] handle to the hipsparse library context queue.

  • transA[in] matrix \(A\) operation type.

  • m[in] number of rows of the sparse CSR matrix \(A\). Must be non-negative.

  • n[in] number of columns of the dense matrix \(op(B)\) and \(C\). Must be non-negative.

  • k[in] number of columns of the sparse CSR matrix \(A\). Must be non-negative.

  • nnz[in] number of non-zero entries of the sparse CSR matrix \(A\). Must be non-negative.

  • alpha[in] scalar \(\alpha\).

  • descrA[in] descriptor of the sparse CSR matrix \(A\). Currently, only HIPSPARSE_MATRIX_TYPE_GENERAL is supported.

  • csrSortedValA[in] array of nnz elements of the sparse CSR matrix \(A\).

  • csrSortedRowPtrA[in] array of m+1 elements that point to the start of every row of the sparse CSR matrix \(A\).

  • csrSortedColIndA[in] array of nnz elements containing the column indices of the sparse CSR matrix \(A\).

  • B[in] array of dimension ldb*n ( \(op(B) == B\)), ldb*k otherwise.

  • ldb[in] leading dimension of \(B\), must be at least \(\max{(1, k)}\) ( \(op(B) == B\)), \(\max{(1, n)}\) otherwise.

  • beta[in] scalar \(\beta\).

  • C[inout] array of dimension ldc*n.

  • ldc[in] leading dimension of \(C\), must be at least \(\max{(1, m)}\) ( \(op(A) == A\)), \(\max{(1, k)}\) otherwise.

Return values:
  • HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.

  • HIPSPARSE_STATUS_NOT_INITIALIZEDhandle is not initialized.

  • HIPSPARSE_STATUS_INVALID_VALUEhandle, descrA, alpha or beta is nullptr, m, n, k or nnz is negative, ldb or ldc is invalid, or csrSortedValA, csrSortedRowPtrA, csrSortedColIndA, B or C is nullptr.

  • HIPSPARSE_STATUS_ARCH_MISMATCH – the device is not supported.

  • HIPSPARSE_STATUS_NOT_SUPPORTEDhipsparseMatrixType_t is not HIPSPARSE_MATRIX_TYPE_GENERAL.

  1int main(int argc, char* argv[])
  2{
  3    // hipSPARSE handle
  4    hipsparseHandle_t handle;
  5    HIPSPARSE_CHECK(hipsparseCreate(&handle));
  6
  7    //     1 2 0 3 0 0
  8    // A = 0 4 5 0 0 0
  9    //     0 0 0 7 8 0
 10    //     0 0 1 2 4 1
 11
 12    const int                  m   = 4;
 13    const int                  k   = 6;
 14    const int                  nnz = 11;
 15    const hipsparseDirection_t dir = HIPSPARSE_DIRECTION_ROW;
 16
 17    std::vector<int>   hcsrRowPtr = {0, 3, 5, 7, 11};
 18    std::vector<int>   hcsrColInd = {0, 1, 3, 1, 2, 3, 4, 2, 3, 4, 5};
 19    std::vector<float> hcsrVal    = {1, 2, 3, 4, 5, 7, 8, 1, 2, 4, 1};
 20
 21    // Set dimension n of B
 22    const int n = 3;
 23
 24    // Allocate and generate dense matrix B (k x n)
 25    std::vector<float> hB = {1.0f,
 26                             2.0f,
 27                             3.0f,
 28                             4.0f,
 29                             5.0f,
 30                             6.0f,
 31                             7.0f,
 32                             8.0f,
 33                             9.0f,
 34                             10.0f,
 35                             11.0f,
 36                             12.0f,
 37                             13.0f,
 38                             14.0f,
 39                             15.0f,
 40                             16.0f,
 41                             17.0f,
 42                             18.0f};
 43
 44    int*   dcsrRowPtr = NULL;
 45    int*   dcsrColInd = NULL;
 46    float* dcsrVal    = NULL;
 47    HIP_CHECK(hipMalloc((void**)&dcsrRowPtr, sizeof(int) * (m + 1)));
 48    HIP_CHECK(hipMalloc((void**)&dcsrColInd, sizeof(int) * nnz));
 49    HIP_CHECK(hipMalloc((void**)&dcsrVal, sizeof(float) * nnz));
 50    HIP_CHECK(
 51        hipMemcpy(dcsrRowPtr, hcsrRowPtr.data(), sizeof(int) * (m + 1), hipMemcpyHostToDevice));
 52    HIP_CHECK(hipMemcpy(dcsrColInd, hcsrColInd.data(), sizeof(int) * nnz, hipMemcpyHostToDevice));
 53    HIP_CHECK(hipMemcpy(dcsrVal, hcsrVal.data(), sizeof(float) * nnz, hipMemcpyHostToDevice));
 54
 55    // Copy B to the device
 56    float* dB;
 57    HIP_CHECK(hipMalloc((void**)&dB, sizeof(float) * k * n));
 58    HIP_CHECK(hipMemcpy(dB, hB.data(), sizeof(float) * k * n, hipMemcpyHostToDevice));
 59
 60    // alpha and beta
 61    float alpha = 1.0f;
 62    float beta  = 0.0f;
 63
 64    // Allocate memory for the resulting matrix C
 65    float* dC;
 66    HIP_CHECK(hipMalloc((void**)&dC, sizeof(float) * m * n));
 67
 68    // Matrix descriptor
 69    hipsparseMatDescr_t descr;
 70    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));
 71
 72    // Perform the matrix multiplication
 73    HIPSPARSE_CHECK(hipsparseScsrmm(handle,
 74                                    HIPSPARSE_OPERATION_NON_TRANSPOSE,
 75                                    m,
 76                                    n,
 77                                    k,
 78                                    nnz,
 79                                    &alpha,
 80                                    descr,
 81                                    dcsrVal,
 82                                    dcsrRowPtr,
 83                                    dcsrColInd,
 84                                    dB,
 85                                    k,
 86                                    &beta,
 87                                    dC,
 88                                    m));
 89
 90    // Copy results to host
 91    std::vector<float> hC(6 * 3);
 92    HIP_CHECK(hipMemcpy(hC.data(), dC, sizeof(float) * m * n, hipMemcpyDeviceToHost));
 93
 94    std::cout << "hC" << std::endl;
 95    for(int i = 0; i < m * n; i++)
 96    {
 97        std::cout << hC[i] << " ";
 98    }
 99    std::cout << std::endl;
100
101    HIP_CHECK(hipFree(dcsrRowPtr));
102    HIP_CHECK(hipFree(dcsrColInd));
103    HIP_CHECK(hipFree(dcsrVal));
104    HIP_CHECK(hipFree(dB));
105    HIP_CHECK(hipFree(dC));
106
107    HIPSPARSE_CHECK(hipsparseDestroy(handle));
108
109    return 0;
110}

hipsparseXcsrmm2()#

hipsparseStatus_t hipsparseScsrmm2(hipsparseHandle_t handle, hipsparseOperation_t transA, hipsparseOperation_t transB, int m, int n, int k, int nnz, const float *alpha, const hipsparseMatDescr_t descrA, const float *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, const float *B, int ldb, const float *beta, float *C, int ldc)#
hipsparseStatus_t hipsparseDcsrmm2(hipsparseHandle_t handle, hipsparseOperation_t transA, hipsparseOperation_t transB, int m, int n, int k, int nnz, const double *alpha, const hipsparseMatDescr_t descrA, const double *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, const double *B, int ldb, const double *beta, double *C, int ldc)#
hipsparseStatus_t hipsparseCcsrmm2(hipsparseHandle_t handle, hipsparseOperation_t transA, hipsparseOperation_t transB, int m, int n, int k, int nnz, const hipComplex *alpha, const hipsparseMatDescr_t descrA, const hipComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, const hipComplex *B, int ldb, const hipComplex *beta, hipComplex *C, int ldc)#
hipsparseStatus_t hipsparseZcsrmm2(hipsparseHandle_t handle, hipsparseOperation_t transA, hipsparseOperation_t transB, int m, int n, int k, int nnz, const hipDoubleComplex *alpha, const hipsparseMatDescr_t descrA, const hipDoubleComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, const hipDoubleComplex *B, int ldb, const hipDoubleComplex *beta, hipDoubleComplex *C, int ldc)#

Sparse matrix dense matrix multiplication using CSR storage format.

hipsparseXcsrmm2 multiplies the scalar \(\alpha\) with a sparse \(m \times k\) matrix \(A\), defined in CSR storage format, and the column-oriented dense \(k \times n\) matrix \(B\) and adds the result to the column-oriented dense \(m \times n\) matrix \(C\) that is multiplied by the scalar \(\beta\), such that

\[ C := \alpha \cdot op(A) \cdot op(B) + \beta \cdot C, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if transA == HIPSPARSE_OPERATION_NON_TRANSPOSE} \\ A^T, & \text{if transA == HIPSPARSE_OPERATION_TRANSPOSE} \\ A^H, & \text{if transA == HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE} \end{array} \right. \end{split}\]
and
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if transB == HIPSPARSE_OPERATION_NON_TRANSPOSE} \\ B^T, & \text{if transB == HIPSPARSE_OPERATION_TRANSPOSE} \\ B^H, & \text{if transB == HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE} \end{array} \right. \end{split}\]

for(i = 0; i < ldc; ++i)
{
    for(j = 0; j < n; ++j)
    {
        C[i][j] = beta * C[i][j];

        for(k = csrRowPtr[i]; k < csrRowPtr[i + 1]; ++k)
        {
            C[i][j] += alpha * csrVal[k] * B[csrColInd[k]][j];
        }
    }
}

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:
  • handle[in] handle to the hipsparse library context queue.

  • transA[in] matrix \(A\) operation type.

  • transB[in] matrix \(B\) operation type.

  • m[in] number of rows of the sparse CSR matrix \(A\).

  • n[in] number of columns of the dense matrix \(op(B)\) and \(C\).

  • k[in] number of columns of the sparse CSR matrix \(A\).

  • nnz[in] number of non-zero entries of the sparse CSR matrix \(A\).

  • alpha[in] scalar \(\alpha\).

  • descrA[in] descriptor of the sparse CSR matrix \(A\). Currently, only HIPSPARSE_MATRIX_TYPE_GENERAL is supported.

  • csrSortedValA[in] array of nnz elements of the sparse CSR matrix \(A\).

  • csrSortedRowPtrA[in] array of m+1 elements that point to the start of every row of the sparse CSR matrix \(A\).

  • csrSortedColIndA[in] array of nnz elements containing the column indices of the sparse CSR matrix \(A\).

  • B[in] array of dimension ldb*n ( \(op(B) == B\)), ldb*k otherwise.

  • ldb[in] leading dimension of \(B\), must be at least \(\max{(1, k)}\) ( \(op(B) == B\)), \(\max{(1, n)}\) otherwise.

  • beta[in] scalar \(\beta\).

  • C[inout] array of dimension ldc*n.

  • ldc[in] leading dimension of \(C\), must be at least \(\max{(1, m)}\) ( \(op(A) == A\)), \(\max{(1, k)}\) otherwise.

Return values:
  • HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.

  • HIPSPARSE_STATUS_INVALID_VALUEhandle, m, n, k, nnz, ldb, ldc descrA, alpha, csrSortedValA, csrSortedRowPtrA, csrSortedColIndA, B, beta or C is invalid.

  • HIPSPARSE_STATUS_ARCH_MISMATCH – the device is not supported.

  • HIPSPARSE_STATUS_NOT_SUPPORTEDhipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.

hipsparseXbsrsm2_zeroPivot()#

hipsparseStatus_t hipsparseXbsrsm2_zeroPivot(hipsparseHandle_t handle, bsrsm2Info_t info, int *position)#

hipsparseXbsrsm2_zeroPivot returns HIPSPARSE_STATUS_ZERO_PIVOT, if either a structural or numerical zero has been found during hipsparseXbsrsm2_analysis() or hipsparseXbsrsm2_solve() computation. The first zero pivot \(j\) at \(A_{j,j}\) is stored in position, using same index base as the BSR matrix.

position can be in host or device memory. If no zero pivot has been found, position is set to -1 and HIPSPARSE_STATUS_SUCCESS is returned instead.

Deprecated:

This function is deprecated when using the CUDA backend (CUDA 12.0+) and will be removed in CUDA 13.0. This deprecation does not apply to the ROCm backend.

Note

hipsparseXbsrsm2_zeroPivot is a blocking function. It might influence performance negatively.

Parameters:
  • handle[in] handle to the hipsparse library context queue.

  • info[in] structure that holds the information collected during the analysis step.

  • position[inout] pointer to zero pivot \(j\), can be in host or device memory.

Return values:
  • HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.

  • HIPSPARSE_STATUS_NOT_INITIALIZEDhandle is not initialized.

  • HIPSPARSE_STATUS_INVALID_VALUEhandle, info or position is nullptr.

  • HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

  • HIPSPARSE_STATUS_ZERO_PIVOT – zero pivot has been found.

hipsparseXbsrsm2_bufferSize()#

hipsparseStatus_t hipsparseSbsrsm2_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, hipsparseOperation_t transA, hipsparseOperation_t transX, int mb, int nrhs, int nnzb, const hipsparseMatDescr_t descrA, float *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsm2Info_t info, int *pBufferSizeInBytes)#
hipsparseStatus_t hipsparseDbsrsm2_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, hipsparseOperation_t transA, hipsparseOperation_t transX, int mb, int nrhs, int nnzb, const hipsparseMatDescr_t descrA, double *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsm2Info_t info, int *pBufferSizeInBytes)#
hipsparseStatus_t hipsparseCbsrsm2_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, hipsparseOperation_t transA, hipsparseOperation_t transX, int mb, int nrhs, int nnzb, const hipsparseMatDescr_t descrA, hipComplex *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsm2Info_t info, int *pBufferSizeInBytes)#
hipsparseStatus_t hipsparseZbsrsm2_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, hipsparseOperation_t transA, hipsparseOperation_t transX, int mb, int nrhs, int nnzb, const hipsparseMatDescr_t descrA, hipDoubleComplex *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsm2Info_t info, int *pBufferSizeInBytes)#

hipsparseXbsrsm2_buffer_size returns the size of the temporary storage buffer in bytes that is required by hipsparseXbsrsm2_analysis() and hipsparseXbsrsm2_solve(). The temporary storage buffer must be allocated by the user.

Parameters:
  • handle[in] handle to the hipsparse library context queue.

  • dirA[in] matrix storage of BSR blocks.

  • transA[in] matrix \(A\) operation type.

  • transX[in] matrix \(X\) operation type.

  • mb[in] number of block rows of the sparse BSR matrix \(A\).

  • nrhs[in] number of columns of the dense matrix \(op(X)\).

  • nnzb[in] number of non-zero blocks of the sparse BSR matrix \(A\).

  • descrA[in] descriptor of the sparse BSR matrix \(A\).

  • bsrSortedValA[in] array of nnzb blocks of the sparse BSR matrix.

  • bsrSortedRowPtrA[in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.

  • bsrSortedColIndA[in] array of nnzb containing the block column indices of the sparse BSR matrix.

  • blockDim[in] block dimension of the sparse BSR matrix.

  • info[in] structure that holds the information collected during the analysis step.

  • pBufferSizeInBytes[out] number of bytes of the temporary storage buffer required by hipsparseXbsrsm2_analysis() and hipsparseXbsrsm2_solve().

Return values:

hipsparseXbsrsm2_analysis()#

hipsparseStatus_t hipsparseSbsrsm2_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, hipsparseOperation_t transA, hipsparseOperation_t transX, int mb, int nrhs, int nnzb, const hipsparseMatDescr_t descrA, const float *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsm2Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
hipsparseStatus_t hipsparseDbsrsm2_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, hipsparseOperation_t transA, hipsparseOperation_t transX, int mb, int nrhs, int nnzb, const hipsparseMatDescr_t descrA, const double *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsm2Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
hipsparseStatus_t hipsparseCbsrsm2_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, hipsparseOperation_t transA, hipsparseOperation_t transX, int mb, int nrhs, int nnzb, const hipsparseMatDescr_t descrA, const hipComplex *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsm2Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
hipsparseStatus_t hipsparseZbsrsm2_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, hipsparseOperation_t transA, hipsparseOperation_t transX, int mb, int nrhs, int nnzb, const hipsparseMatDescr_t descrA, const hipDoubleComplex *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsm2Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

Sparse triangular system solve using BSR storage format.

hipsparseXbsrsm2_analysis performs the analysis step for hipsparseXbsrsm2_solve(). It is expected that this function will be executed only once for a given matrix and particular operation type.

Note

If the matrix sparsity pattern changes, the gathered information will become invalid.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:
  • handle[in] handle to the hipsparse library context queue.

  • dirA[in] matrix storage of BSR blocks.

  • transA[in] matrix \(A\) operation type.

  • transX[in] matrix \(X\) operation type.

  • mb[in] number of block rows of the sparse BSR matrix \(A\).

  • nrhs[in] number of columns of the dense matrix \(op(X)\).

  • nnzb[in] number of non-zero blocks of the sparse BSR matrix \(A\).

  • descrA[in] descriptor of the sparse BSR matrix \(A\).

  • bsrSortedValA[in] array of nnzb blocks of the sparse BSR matrix \(A\).

  • bsrSortedRowPtrA[in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix \(A\).

  • bsrSortedColIndA[in] array of nnzb containing the block column indices of the sparse BSR matrix \(A\).

  • blockDim[in] block dimension of the sparse BSR matrix \(A\).

  • info[out] structure that holds the information collected during the analysis step.

  • policy[in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.

  • pBuffer[in] temporary storage buffer allocated by the user.

Return values:

hipsparseXbsrsm2_solve()#

hipsparseStatus_t hipsparseSbsrsm2_solve(hipsparseHandle_t handle, hipsparseDirection_t dirA, hipsparseOperation_t transA, hipsparseOperation_t transX, int mb, int nrhs, int nnzb, const float *alpha, const hipsparseMatDescr_t descrA, const float *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsm2Info_t info, const float *B, int ldb, float *X, int ldx, hipsparseSolvePolicy_t policy, void *pBuffer)#
hipsparseStatus_t hipsparseDbsrsm2_solve(hipsparseHandle_t handle, hipsparseDirection_t dirA, hipsparseOperation_t transA, hipsparseOperation_t transX, int mb, int nrhs, int nnzb, const double *alpha, const hipsparseMatDescr_t descrA, const double *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsm2Info_t info, const double *B, int ldb, double *X, int ldx, hipsparseSolvePolicy_t policy, void *pBuffer)#
hipsparseStatus_t hipsparseCbsrsm2_solve(hipsparseHandle_t handle, hipsparseDirection_t dirA, hipsparseOperation_t transA, hipsparseOperation_t transX, int mb, int nrhs, int nnzb, const hipComplex *alpha, const hipsparseMatDescr_t descrA, const hipComplex *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsm2Info_t info, const hipComplex *B, int ldb, hipComplex *X, int ldx, hipsparseSolvePolicy_t policy, void *pBuffer)#
hipsparseStatus_t hipsparseZbsrsm2_solve(hipsparseHandle_t handle, hipsparseDirection_t dirA, hipsparseOperation_t transA, hipsparseOperation_t transX, int mb, int nrhs, int nnzb, const hipDoubleComplex *alpha, const hipsparseMatDescr_t descrA, const hipDoubleComplex *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrsm2Info_t info, const hipDoubleComplex *B, int ldb, hipDoubleComplex *X, int ldx, hipsparseSolvePolicy_t policy, void *pBuffer)#

Sparse triangular system solve using BSR storage format.

hipsparseXbsrsm2_solve solves a sparse triangular linear system of a sparse \(m \times m\) matrix, defined in BSR storage format, a column-oriented dense solution matrix \(X\) and the column-oriented dense right-hand side matrix \(B\) that is multiplied by \(\alpha\), such that

\[ op(A) \cdot op(X) = \alpha \cdot op(B), \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if transA == HIPSPARSE_OPERATION_NON_TRANSPOSE} \\ A^T, & \text{if transA == HIPSPARSE_OPERATION_TRANSPOSE} \\ A^H, & \text{if transA == HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE} \end{array} \right. \end{split}\]
,
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if transX == HIPSPARSE_OPERATION_NON_TRANSPOSE} \\ B^T, & \text{if transX == HIPSPARSE_OPERATION_TRANSPOSE} \\ B^H, & \text{if transX == HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE} \end{array} \right. \end{split}\]
and
\[\begin{split} op(X) = \left\{ \begin{array}{ll} X, & \text{if transX == HIPSPARSE_OPERATION_NON_TRANSPOSE} \\ X^T, & \text{if transX == HIPSPARSE_OPERATION_TRANSPOSE} \\ X^H, & \text{if transX == HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE} \end{array} \right. \end{split}\]
and where \(m = blockDim \times mb\).

Note that as indicated above, the operation type of both \(op(B)\) and \(op(X)\) is specified by the transX parameter and that the operation type of \(B\) and \(X\) must match. For example, if \(op(B)=B\) then \(op(X)=X\). Likewise, if \(op(B)=B^T\) then \(op(X)=X^T\).

Given that the sparse matrix \(A\) is a square matrix, its size is \(m \times m\) regardless of whether \(A\) is transposed or not. The size of the column-oriented dense matrices \(B\) and \(X\) have size that depends on the value of transX:

\[\begin{split} op(B) = \left\{ \begin{array}{ll} ldb \times nrhs, \text{ } ldb \ge m, & \text{if transX == HIPSPARSE_OPERATION_NON_TRANSPOSE} \\ ldb \times m, \text{ } ldb \ge nrhs, & \text{if transX == HIPSPARSE_OPERATION_TRANSPOSE} \\ ldb \times m, \text{ } ldb \ge nrhs, & \text{if transX == HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE} \end{array} \right. \end{split}\]
and
\[\begin{split} op(X) = \left\{ \begin{array}{ll} ldb \times nrhs, \text{ } ldb \ge m, & \text{if transX == HIPSPARSE_OPERATION_NON_TRANSPOSE} \\ ldb \times m, \text{ } ldb \ge nrhs, & \text{if transX == HIPSPARSE_OPERATION_TRANSPOSE} \\ ldb \times m, \text{ } ldb \ge nrhs, & \text{if transX == HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE} \end{array} \right. \end{split}\]

hipsparseXbsrsm2_solve requires a user allocated temporary buffer. Its size is returned by hipsparseXbsrsm2_bufferSize(). The size of the required buffer is larger when transA equals HIPSPARSE_OPERATION_TRANSPOSE or HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE and when transX is HIPSPARSE_OPERATION_NON_TRANSPOSE. The subsequent solve will also be faster when \(A\) is non-transposed and \(B\) is transposed (or conjugate transposed). For example, instead of solving:

\[\begin{split} \left[ \begin{array}{c | c} \begin{array}{c c} a_{00} & a_{01} \\ a_{10} & a_{11} \end{array} & \begin{array}{c c} 0 & 0 \\ 0 & 0 \end{array} \\ \hline \begin{array}{c c} a_{20} & a_{21} \\ a_{30} & a_{31} \end{array} & \begin{array}{c c} a_{22} & a_{23} \\ a_{32} & a_{33} \end{array} \\ \end{array} \right] \cdot \begin{bmatrix} x_{00} & x_{01} \\ x_{10} & x_{11} \\ x_{20} & x_{21} \\ x_{30} & x_{31} \\ \end{bmatrix} = \begin{bmatrix} b_{00} & b_{01} \\ b_{10} & b_{11} \\ b_{20} & b_{21} \\ b_{30} & b_{31} \\ \end{bmatrix} \end{split}\]

Consider solving:

\[\begin{split} \left[ \begin{array}{c | c} \begin{array}{c c} a_{00} & a_{01} \\ a_{10} & a_{11} \end{array} & \begin{array}{c c} 0 & 0 \\ 0 & 0 \end{array} \\ \hline \begin{array}{c c} a_{20} & a_{21} \\ a_{30} & a_{31} \end{array} & \begin{array}{c c} a_{22} & a_{23} \\ a_{32} & a_{33} \end{array} \\ \end{array} \right] \cdot \begin{bmatrix} x_{00} & x_{10} & x_{20} & x_{30} \\ x_{01} & x_{11} & x_{21} & x_{31} \end{bmatrix}^{T} = \begin{bmatrix} b_{00} & b_{10} & b_{20} & b_{30} \\ b_{01} & b_{11} & b_{21} & b_{31} \end{bmatrix}^{T} \end{split}\]

Once the temporary storage buffer has been allocated, analysis meta data is required. It can be obtained by hipsparseSbsrsm2_analysis “hipsparseXbsrsm2_analysis()”. The triangular solve is completed by calling hipsparseXbsrsm2_solve and once all solves are performed, the temporary storage buffer allocated by the user can be freed.

Solving a triangular system involves inverting the diagonal blocks. This means that if the sparse matrix is missing the diagonal block (referred to as a structural zero) or the diagonal block is not invertible (referred to as a numerical zero) then a solution is not possible. hipsparseXbsrsm2_solve tracks the location of the first zero pivot (either numerical or structural zero). The zero pivot status can be checked calling hipsparseXbsrsm2_zeroPivot(). If hipsparseXbsrsm2_zeroPivot() returns HIPSPARSE_STATUS_SUCCESS, then no zero pivot was found and therefore the matrix does not have a structural or numerical zero.

The user can specify that the sparse matrix should be interpreted as having identity blocks on the diagonal by setting the diagonal type on the descriptor descrA to HIPSPARSE_DIAG_TYPE_UNIT using hipsparseSetMatDiagType. If hipsparseDiagType_t == HIPSPARSE_DIAG_TYPE_UNIT, no zero pivot will be reported, even if the diagonal block \(A_{j,j}\) for some \(j\) is not invertible.

The sparse CSR matrix passed to hipsparseXbsrsm2_solve does not actually have to be a triangular matrix. Instead the triangular upper or lower part of the sparse matrix is solved based on hipsparseFillMode_t set on the descriptor descrA. If the fill mode is set to HIPSPARSE_FILL_MODE_LOWER, then the lower triangular matrix is solved. If the fill mode is set to HIPSPARSE_FILL_MODE_UPPER then the upper triangular matrix is solved.

Note

The sparse BSR matrix has to be sorted.

Note

Operation type of B and X must match, if \(op(B)=B, op(X)=X\).

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only transA != HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE and transX != HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE is supported.

Parameters:
  • handle[in] handle to the hipsparse library context queue.

  • dirA[in] matrix storage of BSR blocks.

  • transA[in] matrix \(A\) operation type.

  • transX[in] matrix \(X\) operation type.

  • mb[in] number of block rows of the sparse BSR matrix \(A\).

  • nrhs[in] number of columns of the dense matrix \(op(X)\).

  • nnzb[in] number of non-zero blocks of the sparse BSR matrix \(A\).

  • alpha[in] scalar \(\alpha\).

  • descrA[in] descriptor of the sparse BSR matrix \(A\).

  • bsrSortedValA[in] array of nnzb blocks of the sparse BSR matrix.

  • bsrSortedRowPtrA[in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.

  • bsrSortedColIndA[in] array of nnzb containing the block column indices of the sparse BSR matrix.

  • blockDim[in] block dimension of the sparse BSR matrix.

  • info[in] structure that holds the information collected during the analysis step.

  • B[in] rhs matrix B with leading dimension ldb.

  • ldb[in] leading dimension of rhs matrix \(B\).

  • X[out] solution matrix X with leading dimension ldx.

  • ldx[in] leading dimension of solution matrix \(X\).

  • policy[in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.

  • pBuffer[in] temporary storage buffer allocated by the user.

Return values:
  1int main(int argc, char* argv[])
  2{
  3    // hipSPARSE handle
  4    hipsparseHandle_t handle;
  5    HIPSPARSE_CHECK(hipsparseCreate(&handle));
  6
  7    // A = ( 1.0  0.0  0.0  0.0 )
  8    //     ( 2.0  3.0  0.0  0.0 )
  9    //     ( 4.0  5.0  6.0  0.0 )
 10    //     ( 7.0  0.0  8.0  9.0 )
 11    //
 12    // with bsr_dim = 2
 13    //
 14    //      -------------------
 15    //   = | 1.0 0.0 | 0.0 0.0 |
 16    //     | 2.0 3.0 | 0.0 0.0 |
 17    //      -------------------
 18    //     | 4.0 5.0 | 6.0 0.0 |
 19    //     | 7.0 0.0 | 8.0 9.0 |
 20    //      -------------------
 21
 22    // Number of rows and columns
 23    const int m = 4;
 24
 25    // Number of block rows and block columns
 26    const int mb = 2;
 27    const int nb = 2;
 28
 29    // BSR block dimension
 30    const int bsr_dim = 2;
 31
 32    // Number of right-hand-sides
 33    const int nrhs = 4;
 34
 35    // Number of non-zero blocks
 36    const int nnzb = 3;
 37
 38    // BSR row pointers
 39    std::vector<int> hbsrRowPtr = {0, 1, 3};
 40
 41    // BSR column indices
 42    std::vector<int> hbsrColInd = {0, 0, 1};
 43
 44    // BSR values
 45    std::vector<double> hbsrVal = {1.0, 2.0, 0.0, 3.0, 4.0, 7.0, 5.0, 0.0, 6.0, 8.0, 0.0, 9.0};
 46
 47    // Storage scheme of the BSR blocks
 48    hipsparseDirection_t dir = HIPSPARSE_DIRECTION_COLUMN;
 49
 50    // Transposition of the matrix and rhs matrix
 51    hipsparseOperation_t transA = HIPSPARSE_OPERATION_NON_TRANSPOSE;
 52    hipsparseOperation_t transX = HIPSPARSE_OPERATION_NON_TRANSPOSE;
 53
 54    // Solve policy
 55    hipsparseSolvePolicy_t solve_policy = HIPSPARSE_SOLVE_POLICY_NO_LEVEL;
 56
 57    // Scalar alpha and beta
 58    double alpha = 1.0;
 59
 60    // rhs and solution matrix
 61    const int ldb = nb * bsr_dim;
 62    const int ldx = mb * bsr_dim;
 63
 64    std::vector<double> hB = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
 65    std::vector<double> hX(ldx * nrhs);
 66
 67    // Offload data to device
 68    int*    dbsrRowPtr;
 69    int*    dbsrColInd;
 70    double* dbsrVal;
 71    double* dB;
 72    double* dX;
 73
 74    HIP_CHECK(hipMalloc((void**)&dbsrRowPtr, sizeof(int) * (mb + 1)));
 75    HIP_CHECK(hipMalloc((void**)&dbsrColInd, sizeof(int) * nnzb));
 76    HIP_CHECK(hipMalloc((void**)&dbsrVal, sizeof(double) * nnzb * bsr_dim * bsr_dim));
 77    HIP_CHECK(hipMalloc((void**)&dB, sizeof(double) * nb * bsr_dim * nrhs));
 78    HIP_CHECK(hipMalloc((void**)&dX, sizeof(double) * mb * bsr_dim * nrhs));
 79
 80    HIP_CHECK(
 81        hipMemcpy(dbsrRowPtr, hbsrRowPtr.data(), sizeof(int) * (mb + 1), hipMemcpyHostToDevice));
 82    HIP_CHECK(hipMemcpy(dbsrColInd, hbsrColInd.data(), sizeof(int) * nnzb, hipMemcpyHostToDevice));
 83    HIP_CHECK(hipMemcpy(
 84        dbsrVal, hbsrVal.data(), sizeof(double) * nnzb * bsr_dim * bsr_dim, hipMemcpyHostToDevice));
 85    HIP_CHECK(
 86        hipMemcpy(dB, hB.data(), sizeof(double) * nb * bsr_dim * nrhs, hipMemcpyHostToDevice));
 87
 88    // Matrix descriptor
 89    hipsparseMatDescr_t descr;
 90    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));
 91
 92    // Matrix fill mode
 93    HIPSPARSE_CHECK(hipsparseSetMatFillMode(descr, HIPSPARSE_FILL_MODE_LOWER));
 94
 95    // Matrix diagonal type
 96    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_NON_UNIT));
 97
 98    // Matrix info structure
 99    bsrsm2Info_t info;
100    HIPSPARSE_CHECK(hipsparseCreateBsrsm2Info(&info));
101
102    // Obtain required buffer size
103    int buffer_size;
104    HIPSPARSE_CHECK(hipsparseDbsrsm2_bufferSize(handle,
105                                                dir,
106                                                transA,
107                                                transX,
108                                                mb,
109                                                nrhs,
110                                                nnzb,
111                                                descr,
112                                                dbsrVal,
113                                                dbsrRowPtr,
114                                                dbsrColInd,
115                                                bsr_dim,
116                                                info,
117                                                &buffer_size));
118
119    // Allocate temporary buffer
120    void* dbuffer;
121    HIP_CHECK(hipMalloc(&dbuffer, buffer_size));
122
123    // Perform analysis step
124    HIPSPARSE_CHECK(hipsparseDbsrsm2_analysis(handle,
125                                              dir,
126                                              transA,
127                                              transX,
128                                              mb,
129                                              nrhs,
130                                              nnzb,
131                                              descr,
132                                              dbsrVal,
133                                              dbsrRowPtr,
134                                              dbsrColInd,
135                                              bsr_dim,
136                                              info,
137                                              solve_policy,
138                                              dbuffer));
139
140    // Call dbsrsm to perform lower triangular solve LX = B
141    HIPSPARSE_CHECK(hipsparseDbsrsm2_solve(handle,
142                                           dir,
143                                           transA,
144                                           transX,
145                                           mb,
146                                           nrhs,
147                                           nnzb,
148                                           &alpha,
149                                           descr,
150                                           dbsrVal,
151                                           dbsrRowPtr,
152                                           dbsrColInd,
153                                           bsr_dim,
154                                           info,
155                                           dB,
156                                           ldb,
157                                           dX,
158                                           ldx,
159                                           solve_policy,
160                                           dbuffer));
161
162    // Check for zero pivots
163    int               pivot;
164    hipsparseStatus_t status = hipsparseXbsrsm2_zeroPivot(handle, info, &pivot);
165
166    if(status == HIPSPARSE_STATUS_ZERO_PIVOT)
167    {
168        std::cout << "Found zero pivot in matrix row " << pivot << std::endl;
169    }
170
171    // Copy result back to host
172    HIP_CHECK(
173        hipMemcpy(hX.data(), dX, sizeof(double) * mb * bsr_dim * nrhs, hipMemcpyDeviceToHost));
174
175    std::cout << "hX" << std::endl;
176    for(int i = 0; i < ldx * nrhs; i++)
177    {
178        std::cout << hX[i] << " ";
179    }
180    std::cout << std::endl;
181
182    // Clear hipSPARSE
183    HIPSPARSE_CHECK(hipsparseDestroyBsrsm2Info(info));
184    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
185    HIPSPARSE_CHECK(hipsparseDestroy(handle));
186
187    // Clear device memory
188    HIP_CHECK(hipFree(dbsrRowPtr));
189    HIP_CHECK(hipFree(dbsrColInd));
190    HIP_CHECK(hipFree(dbsrVal));
191    HIP_CHECK(hipFree(dB));
192    HIP_CHECK(hipFree(dX));
193    HIP_CHECK(hipFree(dbuffer));
194
195    return 0;
196}

hipsparseXcsrsm2_zeroPivot()#

hipsparseStatus_t hipsparseXcsrsm2_zeroPivot(hipsparseHandle_t handle, csrsm2Info_t info, int *position)#

hipsparseXcsrsm2_zeroPivot returns HIPSPARSE_STATUS_ZERO_PIVOT, if either a structural or numerical zero has been found during hipsparseXcsrsm2_analysis() or hipsparseXcsrsm2_solve() computation. The first zero pivot \(j\) at \(A_{j,j}\) is stored in position, using same index base as the CSR matrix.

position can be in host or device memory. If no zero pivot has been found, position is set to -1 and HIPSPARSE_STATUS_SUCCESS is returned instead.

Deprecated:

This function is deprecated when using the CUDA backend (CUDA 11.0+) and will be removed in CUDA 12.0. This deprecation does not apply to the ROCm backend.

Note

hipsparseXcsrsm2_zeroPivot is a blocking function. It might influence performance negatively.

Parameters:
  • handle[in] handle to the hipsparse library context queue.

  • info[in] structure that holds the information collected during the analysis step.

  • position[inout] pointer to zero pivot \(j\), can be in host or device memory.

Return values:
  • HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.

  • HIPSPARSE_STATUS_NOT_INITIALIZEDhandle is not initialized.

  • HIPSPARSE_STATUS_INVALID_VALUEhandle, info or position is nullptr.

  • HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

  • HIPSPARSE_STATUS_ZERO_PIVOT – zero pivot has been found.

hipsparseXcsrsm2_bufferSizeExt()#

hipsparseStatus_t hipsparseScsrsm2_bufferSizeExt(hipsparseHandle_t handle, int algo, hipsparseOperation_t transA, hipsparseOperation_t transB, int m, int nrhs, int nnz, const float *alpha, const hipsparseMatDescr_t descrA, const float *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, const float *B, int ldb, csrsm2Info_t info, hipsparseSolvePolicy_t policy, size_t *pBufferSizeInBytes)#
hipsparseStatus_t hipsparseDcsrsm2_bufferSizeExt(hipsparseHandle_t handle, int algo, hipsparseOperation_t transA, hipsparseOperation_t transB, int m, int nrhs, int nnz, const double *alpha, const hipsparseMatDescr_t descrA, const double *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, const double *B, int ldb, csrsm2Info_t info, hipsparseSolvePolicy_t policy, size_t *pBufferSizeInBytes)#
hipsparseStatus_t hipsparseCcsrsm2_bufferSizeExt(hipsparseHandle_t handle, int algo, hipsparseOperation_t transA, hipsparseOperation_t transB, int m, int nrhs, int nnz, const hipComplex *alpha, const hipsparseMatDescr_t descrA, const hipComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, const hipComplex *B, int ldb, csrsm2Info_t info, hipsparseSolvePolicy_t policy, size_t *pBufferSizeInBytes)#
hipsparseStatus_t hipsparseZcsrsm2_bufferSizeExt(hipsparseHandle_t handle, int algo, hipsparseOperation_t transA, hipsparseOperation_t transB, int m, int nrhs, int nnz, const hipDoubleComplex *alpha, const hipsparseMatDescr_t descrA, const hipDoubleComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, const hipDoubleComplex *B, int ldb, csrsm2Info_t info, hipsparseSolvePolicy_t policy, size_t *pBufferSizeInBytes)#

hipsparseXcsrsm2_bufferSizeExt returns the size of the temporary storage buffer in bytes that is required by hipsparseXcsrsm2_analysis() and hipsparseXcsrsm2_solve(). The temporary storage buffer must be allocated by the user.

Parameters:
  • handle[in] handle to the hipsparse library context queue.

  • algo[in] algorithm to use.

  • transA[in] matrix \(A\) operation type.

  • transB[in] matrix \(B\) operation type.

  • m[in] number of rows of the sparse CSR matrix \(A\).

  • nrhs[in] number of columns of the dense matrix \(op(B)\).

  • nnz[in] number of non-zero entries of the sparse CSR matrix \(A\).

  • alpha[in] scalar \(\alpha\).

  • descrA[in] descriptor of the sparse CSR matrix \(A\).

  • csrSortedValA[in] array of nnz elements of the sparse CSR matrix \(A\).

  • csrSortedRowPtrA[in] array of m+1 elements that point to the start of every row of the sparse CSR matrix \(A\).

  • csrSortedColIndA[in] array of nnz elements containing the column indices of the sparse CSR matrix \(A\).

  • B[in] array of m \(\times\) nrhs elements of the rhs matrix \(B\).

  • ldb[in] leading dimension of rhs matrix \(B\).

  • info[in] structure that holds the information collected during the analysis step.

  • policy[in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.

  • pBufferSizeInBytes[out] number of bytes of the temporary storage buffer required by hipsparseXcsrsm2_analysis() and hipsparseXcsrsm2_solve().

Return values:

hipsparseXcsrsm2_analysis()#

hipsparseStatus_t hipsparseScsrsm2_analysis(hipsparseHandle_t handle, int algo, hipsparseOperation_t transA, hipsparseOperation_t transB, int m, int nrhs, int nnz, const float *alpha, const hipsparseMatDescr_t descrA, const float *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, const float *B, int ldb, csrsm2Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
hipsparseStatus_t hipsparseDcsrsm2_analysis(hipsparseHandle_t handle, int algo, hipsparseOperation_t transA, hipsparseOperation_t transB, int m, int nrhs, int nnz, const double *alpha, const hipsparseMatDescr_t descrA, const double *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, const double *B, int ldb, csrsm2Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
hipsparseStatus_t hipsparseCcsrsm2_analysis(hipsparseHandle_t handle, int algo, hipsparseOperation_t transA, hipsparseOperation_t transB, int m, int nrhs, int nnz, const hipComplex *alpha, const hipsparseMatDescr_t descrA, const hipComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, const hipComplex *B, int ldb, csrsm2Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
hipsparseStatus_t hipsparseZcsrsm2_analysis(hipsparseHandle_t handle, int algo, hipsparseOperation_t transA, hipsparseOperation_t transB, int m, int nrhs, int nnz, const hipDoubleComplex *alpha, const hipsparseMatDescr_t descrA, const hipDoubleComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, const hipDoubleComplex *B, int ldb, csrsm2Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseXcsrsm2_analysis performs the analysis step for hipsparseXcsrsm2_solve(). It is expected that this function will be executed only once for a given matrix and particular operation type.

Note

If the matrix sparsity pattern changes, the gathered information will become invalid.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:
  • handle[in] handle to the hipsparse library context queue.

  • algo[in] algorithm to use.

  • transA[in] matrix \(A\) operation type.

  • transB[in] matrix \(B\) operation type.

  • m[in] number of rows of the sparse CSR matrix \(A\).

  • nrhs[in] number of columns of the dense matrix \(op(B)\).

  • nnz[in] number of non-zero entries of the sparse CSR matrix \(A\).

  • alpha[in] scalar \(\alpha\).

  • descrA[in] descriptor of the sparse CSR matrix \(A\).

  • csrSortedValA[in] array of nnz elements of the sparse CSR matrix \(A\).

  • csrSortedRowPtrA[in] array of m+1 elements that point to the start of every row of the sparse CSR matrix \(A\).

  • csrSortedColIndA[in] array of nnz elements containing the column indices of the sparse CSR matrix \(A\).

  • B[in] array of m \(\times\) nrhs elements of the rhs matrix \(B\).

  • ldb[in] leading dimension of rhs matrix \(B\).

  • info[out] structure that holds the information collected during the analysis step.

  • policy[in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.

  • pBuffer[in] temporary storage buffer allocated by the user.

Return values:

hipsparseXcsrsm2_solve()#

hipsparseStatus_t hipsparseScsrsm2_solve(hipsparseHandle_t handle, int algo, hipsparseOperation_t transA, hipsparseOperation_t transB, int m, int nrhs, int nnz, const float *alpha, const hipsparseMatDescr_t descrA, const float *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, float *B, int ldb, csrsm2Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
hipsparseStatus_t hipsparseDcsrsm2_solve(hipsparseHandle_t handle, int algo, hipsparseOperation_t transA, hipsparseOperation_t transB, int m, int nrhs, int nnz, const double *alpha, const hipsparseMatDescr_t descrA, const double *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, double *B, int ldb, csrsm2Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
hipsparseStatus_t hipsparseCcsrsm2_solve(hipsparseHandle_t handle, int algo, hipsparseOperation_t transA, hipsparseOperation_t transB, int m, int nrhs, int nnz, const hipComplex *alpha, const hipsparseMatDescr_t descrA, const hipComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, hipComplex *B, int ldb, csrsm2Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
hipsparseStatus_t hipsparseZcsrsm2_solve(hipsparseHandle_t handle, int algo, hipsparseOperation_t transA, hipsparseOperation_t transB, int m, int nrhs, int nnz, const hipDoubleComplex *alpha, const hipsparseMatDescr_t descrA, const hipDoubleComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, hipDoubleComplex *B, int ldb, csrsm2Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

Sparse triangular system solve using CSR storage format.

hipsparseXcsrsm2_solve solves a sparse triangular linear system of a sparse \(m \times m\) matrix, defined in CSR storage format, a column-oriented dense solution matrix \(X\) and the column-oriented dense right-hand side matrix \(B\) that is multiplied by \(\alpha\), such that

\[ op(A) \cdot op(X) = \alpha \cdot op(B), \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if transA == HIPSPARSE_OPERATION_NON_TRANSPOSE} \\ A^T, & \text{if transA == HIPSPARSE_OPERATION_TRANSPOSE} \\ A^H, & \text{if transA == HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE} \end{array} \right. \end{split}\]
,
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if transB == HIPSPARSE_OPERATION_NON_TRANSPOSE} \\ B^T, & \text{if transB == HIPSPARSE_OPERATION_TRANSPOSE} \\ B^H, & \text{if transB == HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE} \end{array} \right. \end{split}\]
and
\[\begin{split} op(X) = \left\{ \begin{array}{ll} X, & \text{if transB == HIPSPARSE_OPERATION_NON_TRANSPOSE} \\ X^T, & \text{if transB == HIPSPARSE_OPERATION_TRANSPOSE} \\ X^H, & \text{if transB == HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE} \end{array} \right. \end{split}\]

The solution is performed inplace meaning that the matrix \(B\) is overwritten with the solution \(X\) after calling hipsparseXcsrsm2_solve. Given that the sparse matrix \(A\) is a square matrix, its size is \(m \times m\) regardless of whether \(A\) is transposed or not. The size of the column-oriented dense matrices \(B\) and \(X\) have size that depends on the value of transB:

\[\begin{split} op(B)/op(X) = \left\{ \begin{array}{ll} ldb \times nrhs, \text{ } ldb \ge m, & \text{if transB == HIPSPARSE_OPERATION_NON_TRANSPOSE} \\ ldb \times m, \text{ } ldb \ge nrhs, & \text{if transB == HIPSPARSE_OPERATION_TRANSPOSE} \\ ldb \times m, \text{ } ldb \ge nrhs, & \text{if transB == HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE} \end{array} \right. \end{split}\]

hipsparseXcsrsm2_solve requires a user allocated temporary buffer. Its size is returned by hipsparseXcsrsm2_bufferSizeExt(). The size of the required buffer is larger when transA equals HIPSPARSE_OPERATION_TRANSPOSE or HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE and when transB is HIPSPARSE_OPERATION_NON_TRANSPOSE. The subsequent solve will also be faster when \(A\) is non-transposed and \(B\) is transposed (or conjugate transposed). For example, instead of solving:

\[\begin{split} \begin{bmatrix} a_{00} & 0 & 0 \\ a_{10} & a_{11} & 0 \\ a_{20} & a_{21} & a_{22} \\ \end{bmatrix} \cdot \begin{bmatrix} x_{00} & x_{01} \\ x_{10} & x_{11} \\ x_{20} & x_{21} \\ \end{bmatrix} = \begin{bmatrix} b_{00} & b_{01} \\ b_{10} & b_{11} \\ b_{20} & b_{21} \\ \end{bmatrix} \end{split}\]

Consider solving:

\[\begin{split} \begin{bmatrix} a_{00} & 0 & 0 \\ a_{10} & a_{11} & 0 \\ a_{20} & a_{21} & a_{22} \end{bmatrix} \cdot \begin{bmatrix} x_{00} & x_{10} & x_{20} \\ x_{01} & x_{11} & x_{21} \end{bmatrix}^{T} = \begin{bmatrix} b_{00} & b_{10} & b_{20} \\ b_{01} & b_{11} & b_{21} \end{bmatrix}^{T} \end{split}\]

Once the temporary storage buffer has been allocated, analysis meta data is required. It can be obtained by hipsparseXcsrsm2_analysis(). The triangular solve is completed by calling hipsparseXcsrsm2_solve and once all solves are performed, the temporary storage buffer allocated by the user can be freed.

Solving a triangular system involves division by the diagonal elements. This means that if the sparse matrix is missing the diagonal entry (referred to as a structural zero) or the diagonal entry is zero (referred to as a numerical zero) then a division by zero would occur. hipsparseXcsrsm2_solve tracks the location of the first zero pivot (either numerical or structural zero). The zero pivot status can be checked calling hipsparseXcsrsm2_zeroPivot(). If hipsparseXcsrsm2_zeroPivot() returns HIPSPARSE_STATUS_SUCCESS, then no zero pivot was found and therefore the matrix does not have a structural or numerical zero.

The user can specify that the sparse matrix should be interpreted as having ones on the diagonal by setting the diagonal type on the descriptor descrA to HIPSPARSE_DIAG_TYPE_UNIT using hipsparseSetMatDiagType. If hipsparseDiagType_t == HIPSPARSE_DIAG_TYPE_UNIT, no zero pivot will be reported, even if \(A_{j,j} = 0\) for some \(j\).

The sparse CSR matrix passed to hipsparseXcsrsm2_solve does not actually have to be a triangular matrix. Instead the triangular upper or lower part of the sparse matrix is solved based on hipsparseFillMode_t set on the descriptor descrA. If the fill mode is set to HIPSPARSE_FILL_MODE_LOWER, then the lower triangular matrix is solved. If the fill mode is set to HIPSPARSE_FILL_MODE_UPPER then the upper triangular matrix is solved.

Note

The sparse CSR matrix has to be sorted. This can be achieved by calling hipsparseXcsrsort().

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only transA != HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE and transB != HIPSPARSE_OPERATION_CONJUGATE_TRANSPOSE is supported.

Parameters:
  • handle[in] handle to the hipsparse library context queue.

  • algo[in] algorithm to use.

  • transA[in] matrix \(A\) operation type.

  • transB[in] matrix \(B\) operation type.

  • m[in] number of rows of the sparse CSR matrix \(A\).

  • nrhs[in] number of columns of the dense matrix \(op(B)\).

  • nnz[in] number of non-zero entries of the sparse CSR matrix \(A\).

  • alpha[in] scalar \(\alpha\).

  • descrA[in] descriptor of the sparse CSR matrix \(A\).

  • csrSortedValA[in] array of nnz elements of the sparse CSR matrix \(A\).

  • csrSortedRowPtrA[in] array of m+1 elements that point to the start of every row of the sparse CSR matrix \(A\).

  • csrSortedColIndA[in] array of nnz elements containing the column indices of the sparse CSR matrix \(A\).

  • B[inout] array of m \(\times\) nrhs elements of the rhs matrix \(B\).

  • ldb[in] leading dimension of rhs matrix \(B\).

  • info[in] structure that holds the information collected during the analysis step.

  • policy[in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.

  • pBuffer[in] temporary storage buffer allocated by the user.

Return values:
  1int main(int argc, char* argv[])
  2{
  3    // hipSPARSE handle
  4    hipsparseHandle_t handle;
  5    HIPSPARSE_CHECK(hipsparseCreate(&handle));
  6
  7    // A = ( 1.0  0.0  0.0  0.0 )
  8    //     ( 2.0  3.0  0.0  0.0 )
  9    //     ( 4.0  5.0  6.0  0.0 )
 10    //     ( 7.0  0.0  8.0  9.0 )
 11
 12    // Number of rows and columns
 13    int m = 4;
 14    int n = 4;
 15
 16    // Number of right-hand-sides
 17    int nrhs = 4;
 18
 19    // Number of non-zeros
 20    int nnz = 9;
 21
 22    // CSR row pointers
 23    std::vector<int> hcsrRowPtr = {0, 1, 3, 6, 9};
 24
 25    // CSR column indices
 26    std::vector<int> hcsrColInd = {0, 0, 1, 0, 1, 2, 0, 2, 3};
 27
 28    // CSR values
 29    std::vector<double> hcsrVal = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0};
 30
 31    // Transposition of the matrix and rhs matrix
 32    hipsparseOperation_t transA = HIPSPARSE_OPERATION_NON_TRANSPOSE;
 33    hipsparseOperation_t transB = HIPSPARSE_OPERATION_NON_TRANSPOSE;
 34
 35    // Solve policy
 36    hipsparseSolvePolicy_t solve_policy = HIPSPARSE_SOLVE_POLICY_NO_LEVEL;
 37
 38    // Scalar alpha and beta
 39    double alpha = 1.0;
 40
 41    // rhs and solution matrix
 42    int ldb = n;
 43
 44    std::vector<double> hB = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
 45
 46    // Offload data to device
 47    int*    dcsrRowPtr;
 48    int*    dcsrColInd;
 49    double* dcsrVal;
 50    double* dB;
 51
 52    HIP_CHECK(hipMalloc((void**)&dcsrRowPtr, sizeof(int) * (m + 1)));
 53    HIP_CHECK(hipMalloc((void**)&dcsrColInd, sizeof(int) * nnz));
 54    HIP_CHECK(hipMalloc((void**)&dcsrVal, sizeof(double) * nnz));
 55    HIP_CHECK(hipMalloc((void**)&dB, sizeof(double) * n * nrhs));
 56
 57    HIP_CHECK(
 58        hipMemcpy(dcsrRowPtr, hcsrRowPtr.data(), sizeof(int) * (m + 1), hipMemcpyHostToDevice));
 59    HIP_CHECK(hipMemcpy(dcsrColInd, hcsrColInd.data(), sizeof(int) * nnz, hipMemcpyHostToDevice));
 60    HIP_CHECK(hipMemcpy(dcsrVal, hcsrVal.data(), sizeof(double) * nnz, hipMemcpyHostToDevice));
 61    HIP_CHECK(hipMemcpy(dB, hB.data(), sizeof(double) * n * nrhs, hipMemcpyHostToDevice));
 62
 63    // Matrix descriptor
 64    hipsparseMatDescr_t descr;
 65    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));
 66
 67    // Matrix fill mode
 68    HIPSPARSE_CHECK(hipsparseSetMatFillMode(descr, HIPSPARSE_FILL_MODE_LOWER));
 69
 70    // Matrix diagonal type
 71    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_NON_UNIT));
 72
 73    // Matrix info structure
 74    csrsm2Info_t info;
 75    HIPSPARSE_CHECK(hipsparseCreateCsrsm2Info(&info));
 76
 77    // Obtain required buffer size
 78    size_t buffer_size;
 79    HIPSPARSE_CHECK(hipsparseDcsrsm2_bufferSizeExt(handle,
 80                                                   0,
 81                                                   transA,
 82                                                   transB,
 83                                                   m,
 84                                                   nrhs,
 85                                                   nnz,
 86                                                   &alpha,
 87                                                   descr,
 88                                                   dcsrVal,
 89                                                   dcsrRowPtr,
 90                                                   dcsrColInd,
 91                                                   dB,
 92                                                   ldb,
 93                                                   info,
 94                                                   solve_policy,
 95                                                   &buffer_size));
 96
 97    // Allocate temporary buffer
 98    void* dbuffer;
 99    HIP_CHECK(hipMalloc(&dbuffer, buffer_size));
100
101    // Perform analysis step
102    HIPSPARSE_CHECK(hipsparseDcsrsm2_analysis(handle,
103                                              0,
104                                              transA,
105                                              transB,
106                                              m,
107                                              nrhs,
108                                              nnz,
109                                              &alpha,
110                                              descr,
111                                              dcsrVal,
112                                              dcsrRowPtr,
113                                              dcsrColInd,
114                                              dB,
115                                              ldb,
116                                              info,
117                                              solve_policy,
118                                              dbuffer));
119
120    // Call dcsrsm to perform lower triangular solve LB = B
121    HIPSPARSE_CHECK(hipsparseDcsrsm2_solve(handle,
122                                           0,
123                                           transA,
124                                           transB,
125                                           m,
126                                           nrhs,
127                                           nnz,
128                                           &alpha,
129                                           descr,
130                                           dcsrVal,
131                                           dcsrRowPtr,
132                                           dcsrColInd,
133                                           dB,
134                                           ldb,
135                                           info,
136                                           solve_policy,
137                                           dbuffer));
138
139    // Check for zero pivots
140    int               pivot;
141    hipsparseStatus_t status = hipsparseXcsrsm2_zeroPivot(handle, info, &pivot);
142
143    if(status == HIPSPARSE_STATUS_ZERO_PIVOT)
144    {
145        std::cout << "Found zero pivot in matrix row " << pivot << std::endl;
146    }
147
148    // Copy result back to host
149    HIP_CHECK(hipMemcpy(hB.data(), dB, sizeof(double) * m * nrhs, hipMemcpyDeviceToHost));
150
151    std::cout << "hB" << std::endl;
152    for(size_t i = 0; i < hB.size(); i++)
153    {
154        std::cout << hB[i] << " ";
155    }
156    std::cout << "" << std::endl;
157
158    // Clear hipSPARSE
159    HIPSPARSE_CHECK(hipsparseDestroyCsrsm2Info(info));
160    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
161    HIPSPARSE_CHECK(hipsparseDestroy(handle));
162
163    // Clear device memory
164    HIP_CHECK(hipFree(dcsrRowPtr));
165    HIP_CHECK(hipFree(dcsrColInd));
166    HIP_CHECK(hipFree(dcsrVal));
167    HIP_CHECK(hipFree(dB));
168    HIP_CHECK(hipFree(dbuffer));
169
170    return 0;
171}

hipsparseXgemmi()#

hipsparseStatus_t hipsparseSgemmi(hipsparseHandle_t handle, int m, int n, int k, int nnz, const float *alpha, const float *A, int lda, const float *cscValB, const int *cscColPtrB, const int *cscRowIndB, const float *beta, float *C, int ldc)#
hipsparseStatus_t hipsparseDgemmi(hipsparseHandle_t handle, int m, int n, int k, int nnz, const double *alpha, const double *A, int lda, const double *cscValB, const int *cscColPtrB, const int *cscRowIndB, const double *beta, double *C, int ldc)#
hipsparseStatus_t hipsparseCgemmi(hipsparseHandle_t handle, int m, int n, int k, int nnz, const hipComplex *alpha, const hipComplex *A, int lda, const hipComplex *cscValB, const int *cscColPtrB, const int *cscRowIndB, const hipComplex *beta, hipComplex *C, int ldc)#
hipsparseStatus_t hipsparseZgemmi(hipsparseHandle_t handle, int m, int n, int k, int nnz, const hipDoubleComplex *alpha, const hipDoubleComplex *A, int lda, const hipDoubleComplex *cscValB, const int *cscColPtrB, const int *cscRowIndB, const hipDoubleComplex *beta, hipDoubleComplex *C, int ldc)#

Dense matrix sparse matrix multiplication using CSC storage format.

hipsparseXgemmi multiplies the scalar \(\alpha\) with a dense column-oriented \(m \times k\) matrix \(A\) and the sparse \(k \times n\) matrix \(B\), defined in CSC storage format and adds the result to the dense column-oriented \(m \times n\) matrix \(C\) that is multiplied by the scalar \(\beta\), such that

\[ C := \alpha \cdot A \cdot B + \beta \cdot C \]

Deprecated:

This function is deprecated when using the CUDA backend (CUDA 11.0+) and will be removed in CUDA 12.0. This deprecation does not apply to the ROCm backend.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:
  • handle[in] handle to the hipsparse library context queue.

  • m[in] number of rows of the dense matrix \(A\). Must be non-negative.

  • n[in] number of columns of the sparse CSC matrix \(op(B)\) and \(C\). Must be non-negative.

  • k[in] number of columns of the dense matrix \(A\). Must be non-negative.

  • nnz[in] number of non-zero entries of the sparse CSC matrix \(B\). Must be non-negative.

  • alpha[in] scalar \(\alpha\).

  • A[in] array of dimension \(lda \times k\) ( \(op(A) == A\)) or \(lda \times m\) ( \(op(A) == A^T\) or \(op(A) == A^H\)).

  • lda[in] leading dimension of \(A\), must be at least \(m\) ( \(op(A) == A\)) or \(k\) ( \(op(A) == A^T\) or \(op(A) == A^H\)).

  • cscValB[in] array of nnz elements of the sparse CSC matrix \(B\).

  • cscColPtrB[in] array of n+1 elements that point to the start of every column of the sparse CSC matrix \(B\).

  • cscRowIndB[in] array of nnz elements containing the column indices of the sparse CSC matrix \(B\).

  • beta[in] scalar \(\beta\).

  • C[inout] array of dimension \(ldc \times n\) that holds the values of \(C\).

  • ldc[in] leading dimension of \(C\), must be at least \(m\).

Return values:
  • HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.

  • HIPSPARSE_STATUS_NOT_INITIALIZEDhandle is not initialized.

  • HIPSPARSE_STATUS_INVALID_VALUEhandle, alpha or beta is nullptr, m, n, k or nnz is negative, lda or ldc is invalid, or A, cscValB, cscColPtrB, cscRowIndB or C is nullptr.

 1int main(int argc, char* argv[])
 2{
 3    // A, B, and C are m×k, k×n, and m×n
 4    const int m = 3, n = 5, k = 4;
 5    const int lda = m, ldc = m;
 6    const int nnz_A = m * k, nnz_B = 10, nnz_C = m * n;
 7
 8    // alpha and beta
 9    float alpha = 0.5f;
10    float beta  = 0.25f;
11
12    std::vector<int>   hcscColPtr = {0, 2, 5, 7, 8, 10};
13    std::vector<int>   hcscRowInd = {0, 2, 0, 1, 3, 1, 3, 2, 0, 2};
14    std::vector<float> hcsc_val   = {1, 6, 2, 4, 9, 5, 2, 7, 3, 8};
15
16    std::vector<float> hA(nnz_A, 1.0f);
17    std::vector<float> hC(nnz_C, 1.0f);
18
19    int*   dcscColPtr;
20    int*   dcscRowInd;
21    float* dcsc_val;
22    HIP_CHECK(hipMalloc((void**)&dcscColPtr, sizeof(int) * (n + 1)));
23    HIP_CHECK(hipMalloc((void**)&dcscRowInd, sizeof(int) * nnz_B));
24    HIP_CHECK(hipMalloc((void**)&dcsc_val, sizeof(float) * nnz_B));
25
26    HIP_CHECK(
27        hipMemcpy(dcscColPtr, hcscColPtr.data(), sizeof(int) * (n + 1), hipMemcpyHostToDevice));
28    HIP_CHECK(hipMemcpy(dcscRowInd, hcscRowInd.data(), sizeof(int) * nnz_B, hipMemcpyHostToDevice));
29    HIP_CHECK(hipMemcpy(dcsc_val, hcsc_val.data(), sizeof(float) * nnz_B, hipMemcpyHostToDevice));
30
31    hipsparseHandle_t handle;
32    HIPSPARSE_CHECK(hipsparseCreate(&handle));
33
34    // Allocate memory for the matrix A
35    float* dA;
36    HIP_CHECK(hipMalloc((void**)&dA, sizeof(float) * nnz_A));
37    HIP_CHECK(hipMemcpy(dA, hA.data(), sizeof(float) * nnz_A, hipMemcpyHostToDevice));
38
39    // Allocate memory for the resulting matrix C
40    float* dC;
41    HIP_CHECK(hipMalloc((void**)&dC, sizeof(float) * nnz_C));
42    HIP_CHECK(hipMemcpy(dC, hC.data(), sizeof(float) * nnz_C, hipMemcpyHostToDevice));
43
44    // Perform operation
45    HIPSPARSE_CHECK(hipsparseSgemmi(
46        handle, m, n, k, nnz_B, &alpha, dA, lda, dcsc_val, dcscColPtr, dcscRowInd, &beta, dC, ldc));
47
48    // Copy device to host
49    HIP_CHECK(hipMemcpy(hC.data(), dC, sizeof(float) * nnz_C, hipMemcpyDeviceToHost));
50
51    std::cout << "hC" << std::endl;
52    for(int i = 0; i < nnz_C; i++)
53    {
54        std::cout << hC[i] << " ";
55    }
56    std::cout << std::endl;
57
58    // Destroy matrix descriptors and handles
59    HIPSPARSE_CHECK(hipsparseDestroy(handle));
60
61    HIP_CHECK(hipFree(dcscColPtr));
62    HIP_CHECK(hipFree(dcscRowInd));
63    HIP_CHECK(hipFree(dcsc_val));
64    HIP_CHECK(hipFree(dA));
65    HIP_CHECK(hipFree(dC));
66
67    return 0;
68}