Sparse Level 3 Functions#

This module holds all sparse level 3 routines.

The sparse level 3 routines describe operations between a matrix in sparse format and multiple vectors in dense format that can also be seen as a dense matrix.

rocsparse_bsrmm()#

rocsparse_status rocsparse_sbsrmm(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int mb, rocsparse_int n, rocsparse_int kb, rocsparse_int nnzb, const float *alpha, const rocsparse_mat_descr descr, const float *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, const float *B, rocsparse_int ldb, const float *beta, float *C, rocsparse_int ldc)#
rocsparse_status rocsparse_dbsrmm(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int mb, rocsparse_int n, rocsparse_int kb, rocsparse_int nnzb, const double *alpha, const rocsparse_mat_descr descr, const double *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, const double *B, rocsparse_int ldb, const double *beta, double *C, rocsparse_int ldc)#
rocsparse_status rocsparse_cbsrmm(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int mb, rocsparse_int n, rocsparse_int kb, rocsparse_int nnzb, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, const rocsparse_float_complex *B, rocsparse_int ldb, const rocsparse_float_complex *beta, rocsparse_float_complex *C, rocsparse_int ldc)#
rocsparse_status rocsparse_zbsrmm(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int mb, rocsparse_int n, rocsparse_int kb, rocsparse_int nnzb, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, const rocsparse_double_complex *B, rocsparse_int ldb, const rocsparse_double_complex *beta, rocsparse_double_complex *C, rocsparse_int ldc)#

Sparse matrix dense matrix multiplication using BSR storage format.

rocsparse_bsrmm multiplies the scalar \(\alpha\) with a sparse \(mb \times kb\) matrix \(A\), defined in BSR storage format, and the column-oriented dense \(k \times n\) matrix \(B\) (where \(k = block\_dim \times kb\)) and adds the result to the column-oriented dense \(m \times n\) matrix \(C\) (where \(m = block\_dim \times mb\)) that is multiplied by the scalar \(\beta\), such that

\[ C := \alpha \cdot op(A) \cdot op(B) + \beta \cdot C, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans_A == rocsparse_operation_none} \\ \end{array} \right. \end{split}\]
and
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if trans_B == rocsparse_operation_none} \\ B^T, & \text{if trans_B == rocsparse_operation_transpose} \\ \end{array} \right. \end{split}\]

Example

This example multiplies a BSR matrix with a column-oriented dense matrix.

//     1 2 0 3 0 0
// A = 0 4 5 0 0 0
//     0 0 0 7 8 0
//     0 0 1 2 4 1

rocsparse_int block_dim = 2;
rocsparse_int mb   = 2;
rocsparse_int kb   = 3;
rocsparse_int nnzb = 4;
rocsparse_direction dir = rocsparse_direction_row;

bsr_row_ptr[mb+1]                 = {0, 2, 4};                                        // device memory
bsr_col_ind[nnzb]                 = {0, 1, 1, 2};                                     // device memory
bsr_val[nnzb*block_dim*block_dim] = {1, 2, 0, 4, 0, 3, 5, 0, 0, 7, 1, 2, 8, 0, 4, 1}; // device memory

// Set dimension n of B
rocsparse_int n = 64;
rocsparse_int m = mb * block_dim;
rocsparse_int k = kb * block_dim;

// Allocate and generate column-oriented dense matrix B
std::vector<float> hB(k * n);
for(rocsparse_int i = 0; i < k * n; ++i)
{
    hB[i] = static_cast<float>(rand()) / RAND_MAX;
}

// Copy B to the device
float* B;
hipMalloc((void**)&B, sizeof(float) * k * n);
hipMemcpy(B, hB.data(), sizeof(float) * k * n, hipMemcpyHostToDevice);

// alpha and beta
float alpha = 1.0f;
float beta  = 0.0f;

// Allocate memory for the resulting matrix C
float* C;
hipMalloc((void**)&C, sizeof(float) * m * n);

// Perform the matrix multiplication
rocsparse_sbsrmm(handle,
                 dir,
                 rocsparse_operation_none,
                 rocsparse_operation_none,
                 mb,
                 n,
                 kb,
                 nnzb,
                 &alpha,
                 descr,
                 bsr_val,
                 bsr_row_ptr,
                 bsr_col_ind,
                 block_dim,
                 B,
                 k,
                 &beta,
                 C,
                 m);

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only trans_A == rocsparse_operation_none is supported.

Note

This routine supports execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • dir[in] the storage format of the blocks. Can be rocsparse_direction_row or rocsparse_direction_column.

  • trans_A[in] matrix \(A\) operation type. Currently, only rocsparse_operation_none is supported.

  • trans_B[in] matrix \(B\) operation type. Currently, only rocsparse_operation_none and rocsparse_operation_transpose are supported.

  • mb[in] number of block rows of the sparse BSR matrix \(A\).

  • n[in] number of columns of the column-oriented dense matrix \(op(B)\) and \(C\).

  • kb[in] number of block columns of the sparse BSR matrix \(A\).

  • nnzb[in] number of non-zero blocks of the sparse BSR matrix \(A\).

  • alpha[in] scalar \(\alpha\).

  • descr[in] descriptor of the sparse BSR matrix \(A\). Currently, only rocsparse_matrix_type_general is supported.

  • bsr_val[in] array of nnzb*block_dim*block_dim elements of the sparse BSR matrix \(A\).

  • bsr_row_ptr[in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix \(A\).

  • bsr_col_ind[in] array of nnzb elements containing the block column indices of the sparse BSR matrix \(A\).

  • block_dim[in] size of the blocks in the sparse BSR matrix.

  • B[in] column-oriented dense matrix of dimension \(ldb \times n\) ( \(op(B) == B\)), \(ldb \times k\) otherwise.

  • ldb[in] leading dimension of \(B\), must be at least \(\max{(1, k)}\) ( \( op(B) == B\)) where \(k = block\_dim \times kb\), \(\max{(1, n)}\) otherwise.

  • beta[in] scalar \(\beta\).

  • C[inout] column-oriented dense matrix of dimension \(ldc \times n\).

  • ldc[in] leading dimension of \(C\), must be at least \(\max{(1, m)}\) ( \( op(A) == A\)) where \(m = block\_dim \times mb\), \(\max{(1, k)}\) where \(k = block\_dim \times kb\) otherwise.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_sizemb, n, kb, nnzb, ldb or ldc is invalid.

  • rocsparse_status_invalid_pointerdescr, alpha, bsr_val, bsr_row_ptr, bsr_col_ind, B, beta or C pointer is invalid.

  • rocsparse_status_arch_mismatch – the device is not supported.

  • rocsparse_status_not_implementedtrans_A != rocsparse_operation_none or trans_B == rocsparse_operation_conjugate_transpose or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_gebsrmm()#

rocsparse_status rocsparse_sgebsrmm(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int mb, rocsparse_int n, rocsparse_int kb, rocsparse_int nnzb, const float *alpha, const rocsparse_mat_descr descr, const float *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int row_block_dim, rocsparse_int col_block_dim, const float *B, rocsparse_int ldb, const float *beta, float *C, rocsparse_int ldc)#
rocsparse_status rocsparse_dgebsrmm(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int mb, rocsparse_int n, rocsparse_int kb, rocsparse_int nnzb, const double *alpha, const rocsparse_mat_descr descr, const double *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int row_block_dim, rocsparse_int col_block_dim, const double *B, rocsparse_int ldb, const double *beta, double *C, rocsparse_int ldc)#
rocsparse_status rocsparse_cgebsrmm(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int mb, rocsparse_int n, rocsparse_int kb, rocsparse_int nnzb, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int row_block_dim, rocsparse_int col_block_dim, const rocsparse_float_complex *B, rocsparse_int ldb, const rocsparse_float_complex *beta, rocsparse_float_complex *C, rocsparse_int ldc)#
rocsparse_status rocsparse_zgebsrmm(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int mb, rocsparse_int n, rocsparse_int kb, rocsparse_int nnzb, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int row_block_dim, rocsparse_int col_block_dim, const rocsparse_double_complex *B, rocsparse_int ldb, const rocsparse_double_complex *beta, rocsparse_double_complex *C, rocsparse_int ldc)#

Sparse matrix dense matrix multiplication using GEneral BSR storage format.

rocsparse_gebsrmm multiplies the scalar \(\alpha\) with a sparse \(mb \times kb\) matrix \(A\), defined in GEneral BSR storage format, and the column-oriented dense \(k \times n\) matrix \(B\) (where \(k = col\_block\_dim \times kb\)) and adds the result to the column-oriented dense \(m \times n\) matrix \(C\) (where \(m = row\_block\_dim \times mb\)) that is multiplied by the scalar \(\beta\), such that

\[ C := \alpha \cdot op(A) \cdot op(B) + \beta \cdot C, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans_A == rocsparse_operation_none} \\ \end{array} \right. \end{split}\]
and
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if trans_B == rocsparse_operation_none} \\ B^T, & \text{if trans_B == rocsparse_operation_transpose} \\ \end{array} \right. \end{split}\]

Example

This example multiplies a GEneral BSR matrix with a column-oriented dense matrix.

//     1 2 0 3 0 0
// A = 0 4 5 0 0 0
//     0 0 0 7 8 0
//     0 0 1 2 4 1

rocsparse_int row_block_dim = 2;
rocsparse_int col_block_dim = 3;
rocsparse_int mb   = 2;
rocsparse_int kb   = 2;
rocsparse_int nnzb = 4;
rocsparse_direction dir = rocsparse_direction_row;

bsr_row_ptr[mb+1]                 = {0, 2, 4};                                        // device memory
bsr_col_ind[nnzb]                 = {0, 1, 0, 1};                                     // device memory
bsr_val[nnzb*row_block_dim*col_block_dim] = {1, 2, 0, 0, 4, 5, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 7, 8, 0, 2, 4, 1}; // device memory

// Set dimension n of B
rocsparse_int n = 64;
rocsparse_int m = mb * row_block_dim;
rocsparse_int k = kb * col_block_dim;

// Allocate and generate column-oriented dense matrix B
std::vector<float> hB(k * n);
for(rocsparse_int i = 0; i < k * n; ++i)
{
    hB[i] = static_cast<float>(rand()) / RAND_MAX;
}

// Copy B to the device
float* B;
hipMalloc((void**)&B, sizeof(float) * k * n);
hipMemcpy(B, hB.data(), sizeof(float) * k * n, hipMemcpyHostToDevice);

// alpha and beta
float alpha = 1.0f;
float beta  = 0.0f;

// Allocate memory for the resulting matrix C
float* C;
hipMalloc((void**)&C, sizeof(float) * m * n);

// Perform the matrix multiplication
rocsparse_sgebsrmm(handle,
                   dir,
                   rocsparse_operation_none,
                   rocsparse_operation_none,
                   mb,
                   n,
                   kb,
                   nnzb,
                   &alpha,
                   descr,
                   bsr_val,
                   bsr_row_ptr,
                   bsr_col_ind,
                   row_block_dim,
                   col_block_dim,
                   B,
                   k,
                   &beta,
                   C,
                   m);

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only trans_A == rocsparse_operation_none is supported.

Note

This routine supports execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • dir[in] the storage format of the blocks. Can be rocsparse_direction_row or rocsparse_direction_column.

  • trans_A[in] matrix \(A\) operation type. Currently, only rocsparse_operation_none is supported.

  • trans_B[in] matrix \(B\) operation type. Currently, only rocsparse_operation_none and rocsparse_operation_transpose are supported.

  • mb[in] number of block rows of the sparse GEneral BSR matrix \(A\).

  • n[in] number of columns of the column-oriented dense matrix \(op(B)\) and \(C\).

  • kb[in] number of block columns of the sparse GEneral BSR matrix \(A\).

  • nnzb[in] number of non-zero blocks of the sparse GEneral BSR matrix \(A\).

  • alpha[in] scalar \(\alpha\).

  • descr[in] descriptor of the sparse GEneral BSR matrix \(A\). Currently, only rocsparse_matrix_type_general is supported.

  • bsr_val[in] array of nnzb*row_block_dim*col_block_dim elements of the sparse GEneral BSR matrix \(A\).

  • bsr_row_ptr[in] array of mb+1 elements that point to the start of every block row of the sparse GEneral BSR matrix \(A\).

  • bsr_col_ind[in] array of nnzb elements containing the block column indices of the sparse GEneral BSR matrix \(A\).

  • row_block_dim[in] row size of the blocks in the sparse GEneral BSR matrix.

  • col_block_dim[in] column size of the blocks in the sparse GEneral BSR matrix.

  • B[in] column-oriented dense matrix of dimension \(ldb \times n\) ( \(op(B) == B\)), \(ldb \times k\) otherwise.

  • ldb[in] leading dimension of \(B\), must be at least \(\max{(1, k)}\) ( \( op(B) == B\)) where \(k = col\_block\_dim \times kb\), \(\max{(1, n)}\) otherwise.

  • beta[in] scalar \(\beta\).

  • C[inout] column-oriented dense matrix of dimension \(ldc \times n\).

  • ldc[in] leading dimension of \(C\), must be at least \(\max{(1, m)}\) ( \( op(A) == A\)) where \(m = row\_block\_dim \times mb\), \(\max{(1, k)}\) where \(k = col\_block\_dim \times kb\) otherwise.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_sizemb, n, kb, nnzb, ldb, ldc, row_block_dim or col_block_dim is invalid.

  • rocsparse_status_invalid_pointerdescr, alpha, bsr_val, bsr_row_ptr, bsr_col_ind, B, beta or C pointer is invalid.

  • rocsparse_status_arch_mismatch – the device is not supported.

  • rocsparse_status_not_implementedtrans_A != rocsparse_operation_none or trans_B == rocsparse_operation_conjugate_transpose or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_csrmm()#

rocsparse_status rocsparse_scsrmm(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, rocsparse_int nnz, const float *alpha, const rocsparse_mat_descr descr, const float *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const float *B, rocsparse_int ldb, const float *beta, float *C, rocsparse_int ldc)#
rocsparse_status rocsparse_dcsrmm(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, rocsparse_int nnz, const double *alpha, const rocsparse_mat_descr descr, const double *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const double *B, rocsparse_int ldb, const double *beta, double *C, rocsparse_int ldc)#
rocsparse_status rocsparse_ccsrmm(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, rocsparse_int nnz, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const rocsparse_float_complex *B, rocsparse_int ldb, const rocsparse_float_complex *beta, rocsparse_float_complex *C, rocsparse_int ldc)#
rocsparse_status rocsparse_zcsrmm(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, rocsparse_int nnz, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const rocsparse_double_complex *B, rocsparse_int ldb, const rocsparse_double_complex *beta, rocsparse_double_complex *C, rocsparse_int ldc)#

Sparse matrix dense matrix multiplication using CSR storage format.

rocsparse_csrmm multiplies the scalar \(\alpha\) with a sparse \(m \times k\) matrix \(A\), defined in CSR storage format, and the column-oriented dense \(k \times n\) matrix \(B\) and adds the result to the column-oriented dense \(m \times n\) matrix \(C\) that is multiplied by the scalar \(\beta\), such that

\[ C := \alpha \cdot op(A) \cdot op(B) + \beta \cdot C, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans_A == rocsparse_operation_none} \\ A^T, & \text{if trans_A == rocsparse_operation_transpose} \\ A^H, & \text{if trans_A == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
and
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if trans_B == rocsparse_operation_none} \\ B^T, & \text{if trans_B == rocsparse_operation_transpose} \\ B^H, & \text{if trans_B == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

for(i = 0; i < ldc; ++i)
{
    for(j = 0; j < n; ++j)
    {
        C[i][j] = beta * C[i][j];

        for(k = csr_row_ptr[i]; k < csr_row_ptr[i + 1]; ++k)
        {
            C[i][j] += alpha * csr_val[k] * B[csr_col_ind[k]][j];
        }
    }
}

Example

This example multiplies a CSR matrix with a column-oriented dense matrix.

//     1 2 0 3 0
// A = 0 4 5 0 0
//     6 0 0 7 8

rocsparse_int m   = 3;
rocsparse_int k   = 5;
rocsparse_int nnz = 8;

csr_row_ptr[m+1] = {0, 3, 5, 8};             // device memory
csr_col_ind[nnz] = {0, 1, 3, 1, 2, 0, 3, 4}; // device memory
csr_val[nnz]     = {1, 2, 3, 4, 5, 6, 7, 8}; // device memory

// Set dimension n of B
rocsparse_int n = 64;

// Allocate and generate column-oriented dense matrix B
std::vector<float> hB(k * n);
for(rocsparse_int i = 0; i < k * n; ++i)
{
    hB[i] = static_cast<float>(rand()) / RAND_MAX;
}

// Copy B to the device
float* B;
hipMalloc((void**)&B, sizeof(float) * k * n);
hipMemcpy(B, hB.data(), sizeof(float) * k * n, hipMemcpyHostToDevice);

// alpha and beta
float alpha = 1.0f;
float beta  = 0.0f;

// Allocate memory for the resulting matrix C
float* C;
hipMalloc((void**)&C, sizeof(float) * m * n);

// Perform the matrix multiplication
rocsparse_scsrmm(handle,
                 rocsparse_operation_none,
                 rocsparse_operation_none,
                 m,
                 n,
                 k,
                 nnz,
                 &alpha,
                 descr,
                 csr_val,
                 csr_row_ptr,
                 csr_col_ind,
                 B,
                 k,
                 &beta,
                 C,
                 m);

Note

This function does not produce deterministic results when A is transposed.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

This routine supports execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • trans_A[in] matrix \(A\) operation type.

  • trans_B[in] matrix \(B\) operation type.

  • m[in] number of rows of the sparse CSR matrix \(A\).

  • n[in] number of columns of the column-oriented dense matrix \(op(B)\) and \(C\).

  • k[in] number of columns of the sparse CSR matrix \(A\).

  • nnz[in] number of non-zero entries of the sparse CSR matrix \(A\).

  • alpha[in] scalar \(\alpha\).

  • descr[in] descriptor of the sparse CSR matrix \(A\). Currently, only rocsparse_matrix_type_general is supported.

  • csr_val[in] array of nnz elements of the sparse CSR matrix \(A\).

  • csr_row_ptr[in] array of m+1 elements that point to the start of every row of the sparse CSR matrix \(A\).

  • csr_col_ind[in] array of nnz elements containing the column indices of the sparse CSR matrix \(A\).

  • B[in] column-oriented dense matrix of dimension \(ldb \times n\) ( \(op(B) == B\)), \(ldb \times k\) otherwise.

  • ldb[in] leading dimension of \(B\), must be at least \(\max{(1, k)}\) ( \(op(B) == B\)), \(\max{(1, n)}\) otherwise.

  • beta[in] scalar \(\beta\).

  • C[inout] column-oriented dense matrix of dimension \(ldc \times n\).

  • ldc[in] leading dimension of \(C\), must be at least \(\max{(1, m)}\) ( \(op(A) == A\)), \(\max{(1, k)}\) otherwise.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_sizem, n, k, nnz, ldb or ldc is invalid.

  • rocsparse_status_invalid_pointerdescr, alpha, csr_val, csr_row_ptr, csr_col_ind, B, beta or C pointer is invalid.

  • rocsparse_status_arch_mismatch – the device is not supported.

  • rocsparse_status_not_implementedrocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_csrsm_zero_pivot()#

rocsparse_status rocsparse_csrsm_zero_pivot(rocsparse_handle handle, rocsparse_mat_info info, rocsparse_int *position)#

Sparse triangular system solve using CSR storage format.

rocsparse_csrsm_zero_pivot returns rocsparse_status_zero_pivot, if either a structural or numerical zero has been found during rocsparse_scsrsm_solve(), rocsparse_dcsrsm_solve(), rocsparse_ccsrsm_solve() or rocsparse_zcsrsm_solve() computation. The first zero pivot \(j\) at \(A_{j,j}\) is stored in position, using same index base as the CSR matrix.

position can be in host or device memory. If no zero pivot has been found, position is set to -1 and rocsparse_status_success is returned instead.

Note

rocsparse_csrsm_zero_pivot is a blocking function. It might influence performance negatively.

Note

This routine does not support execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • info[in] structure that holds the information collected during the analysis step.

  • position[inout] pointer to zero pivot \(j\), can be in host or device memory.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointerinfo or position pointer is invalid.

  • rocsparse_status_internal_error – an internal error occurred.

  • rocsparse_status_zero_pivot – zero pivot has been found.

rocsparse_csrsm_buffer_size()#

rocsparse_status rocsparse_scsrsm_buffer_size(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const float *alpha, const rocsparse_mat_descr descr, const float *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const float *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_solve_policy policy, size_t *buffer_size)#
rocsparse_status rocsparse_dcsrsm_buffer_size(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const double *alpha, const rocsparse_mat_descr descr, const double *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const double *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_solve_policy policy, size_t *buffer_size)#
rocsparse_status rocsparse_ccsrsm_buffer_size(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const rocsparse_float_complex *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_solve_policy policy, size_t *buffer_size)#
rocsparse_status rocsparse_zcsrsm_buffer_size(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const rocsparse_double_complex *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_solve_policy policy, size_t *buffer_size)#

Sparse triangular system solve using CSR storage format.

rocsparse_csrsm_buffer_size returns the size of the temporary storage buffer that is required by rocsparse_scsrsm_analysis(), rocsparse_dcsrsm_analysis(), rocsparse_ccsrsm_analysis(), rocsparse_zcsrsm_analysis(), rocsparse_scsrsm_solve(), rocsparse_dcsrsm_solve(), rocsparse_ccsrsm_solve() and rocsparse_zcsrsm_solve(). The temporary storage buffer must be allocated by the user.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

This routine supports execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • trans_A[in] matrix A operation type.

  • trans_B[in] matrix B operation type.

  • m[in] number of rows of the sparse CSR matrix A.

  • nrhs[in] number of columns of the column-oriented dense matrix op(B).

  • nnz[in] number of non-zero entries of the sparse CSR matrix A.

  • alpha[in] scalar \(\alpha\).

  • descr[in] descriptor of the sparse CSR matrix A.

  • csr_val[in] array of nnz elements of the sparse CSR matrix A.

  • csr_row_ptr[in] array of m+1 elements that point to the start of every row of the sparse CSR matrix A.

  • csr_col_ind[in] array of nnz elements containing the column indices of the sparse CSR matrix A.

  • B[in] column-oriented dense matrix of dimension m \(\times\) nrhs elements of the rhs matrix B.

  • ldb[in] leading dimension of rhs matrix B.

  • info[in] structure that holds the information collected during the analysis step.

  • policy[in] rocsparse_solve_policy_auto.

  • buffer_size[out] number of bytes of the temporary storage buffer required by rocsparse_scsrsm_analysis(), rocsparse_dcsrsm_analysis(), rocsparse_ccsrsm_analysis(), rocsparse_zcsrsm_analysis(), rocsparse_scsrsm_solve(), rocsparse_dcsrsm_solve(), rocsparse_ccsrsm_solve() and rocsparse_zcsrsm_solve().

Return values:

rocsparse_csrsm_analysis()#

rocsparse_status rocsparse_scsrsm_analysis(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const float *alpha, const rocsparse_mat_descr descr, const float *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const float *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)#
rocsparse_status rocsparse_dcsrsm_analysis(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const double *alpha, const rocsparse_mat_descr descr, const double *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const double *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)#
rocsparse_status rocsparse_ccsrsm_analysis(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const rocsparse_float_complex *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)#
rocsparse_status rocsparse_zcsrsm_analysis(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const rocsparse_double_complex *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)#

Sparse triangular system solve using CSR storage format.

rocsparse_csrsm_analysis performs the analysis step for rocsparse_scsrsm_solve(), rocsparse_dcsrsm_solve(), rocsparse_ccsrsm_solve() and rocsparse_zcsrsm_solve(). It is expected that this function will be executed only once for a given matrix and particular operation type. The analysis meta data can be cleared by rocsparse_csrsm_clear().

rocsparse_csrsm_analysis can share its meta data with rocsparse_scsrilu0_analysis(), rocsparse_dcsrilu0_analysis(), rocsparse_ccsrilu0_analysis(), rocsparse_zcsrilu0_analysis(), rocsparse_scsric0_analysis(), rocsparse_dcsric0_analysis(), rocsparse_ccsric0_analysis(), rocsparse_zcsric0_analysis(), rocsparse_scsrsv_analysis(), rocsparse_dcsrsv_analysis(), rocsparse_ccsrsv_analysis() and rocsparse_zcsrsv_analysis(). Selecting rocsparse_analysis_policy_reuse policy can greatly improve computation performance of meta data. However, the user need to make sure that the sparsity pattern remains unchanged. If this cannot be assured, rocsparse_analysis_policy_force has to be used.

Note

If the matrix sparsity pattern changes, the gathered information will become invalid.

Note

This function is blocking with respect to the host.

Note

This routine does not support execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • trans_A[in] matrix A operation type.

  • trans_B[in] matrix B operation type.

  • m[in] number of rows of the sparse CSR matrix A.

  • nrhs[in] number of columns of the column-oriented dense matrix op(B).

  • nnz[in] number of non-zero entries of the sparse CSR matrix A.

  • alpha[in] scalar \(\alpha\).

  • descr[in] descriptor of the sparse CSR matrix A.

  • csr_val[in] array of nnz elements of the sparse CSR matrix A.

  • csr_row_ptr[in] array of m+1 elements that point to the start of every row of the sparse CSR matrix A.

  • csr_col_ind[in] array of nnz elements containing the column indices of the sparse CSR matrix A.

  • B[in] column-oriented dense matrix of dimension m \(\times\) nrhs elements of the rhs matrix B.

  • ldb[in] leading dimension of rhs matrix B.

  • info[out] structure that holds the information collected during the analysis step.

  • analysis[in] rocsparse_analysis_policy_reuse or rocsparse_analysis_policy_force.

  • solve[in] rocsparse_solve_policy_auto.

  • temp_buffer[in] temporary storage buffer allocated by the user.

Return values:

rocsparse_csrsm_solve()#

rocsparse_status rocsparse_scsrsm_solve(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const float *alpha, const rocsparse_mat_descr descr, const float *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, float *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_solve_policy policy, void *temp_buffer)#
rocsparse_status rocsparse_dcsrsm_solve(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const double *alpha, const rocsparse_mat_descr descr, const double *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, double *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_solve_policy policy, void *temp_buffer)#
rocsparse_status rocsparse_ccsrsm_solve(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_float_complex *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_solve_policy policy, void *temp_buffer)#
rocsparse_status rocsparse_zcsrsm_solve(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_double_complex *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_solve_policy policy, void *temp_buffer)#

Sparse triangular system solve using CSR storage format.

rocsparse_csrsm_solve solves a sparse triangular linear system of a sparse \(m \times m\) matrix, defined in CSR storage format, a column-oriented dense solution matrix \(X\) and the column-oriented dense right-hand side matrix \(B\) that is multiplied by \(\alpha\), such that

\[ op(A) \cdot op(X) = \alpha \cdot op(B), \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans_A == rocsparse_operation_none} \\ A^T, & \text{if trans_A == rocsparse_operation_transpose} \\ A^H, & \text{if trans_A == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
,
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if trans_B == rocsparse_operation_none} \\ B^T, & \text{if trans_B == rocsparse_operation_transpose} \\ B^H, & \text{if trans_B == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
and
\[\begin{split} op(X) = \left\{ \begin{array}{ll} X, & \text{if trans_B == rocsparse_operation_none} \\ X^T, & \text{if trans_B == rocsparse_operation_transpose} \\ X^H, & \text{if trans_B == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

The solution is performed inplace meaning that the matrix B is overwritten with the solution X after calling rocsparse_csrsm_solve. Given that the sparse matrix A is a square matrix, its size is \(m \times m\) regardless of whether A is transposed or not. The size of the column-oriented dense matrices B and X have size that depends on the value of trans_B:

\[\begin{split} op(B)/op(X) = \left\{ \begin{array}{ll} ldb \times nrhs, \text{ } ldb \ge m, & \text{if trans_B == rocsparse_operation_none} \\ ldb \times m, \text{ } ldb \ge nrhs, & \text{if trans_B == rocsparse_operation_transpose} \\ ldb \times m, \text{ } ldb \ge nrhs, & \text{if trans_B == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

rocsparse_csrsm_solve requires a user allocated temporary buffer. Its size is returned by rocsparse_scsrsm_buffer_size(), rocsparse_dcsrsm_buffer_size(), rocsparse_ccsrsm_buffer_size() or rocsparse_zcsrsm_buffer_size(). The size of the required buffer is larger when trans_A equals rocsparse_operation_transpose or rocsparse_operation_conjugate_transpose and when trans_B is rocsparse_operation_none. The subsequent solve will also be faster when \(A\) is non-transposed and \(B\) is transposed (or conjugate transposed). For example, instead of solving:

\[\begin{split} \begin{bmatrix} a_{00} & 0 & 0 \\ a_{10} & a_{11} & 0 \\ a_{20} & a_{21} & a_{22} \\ \end{bmatrix} \cdot \begin{bmatrix} x_{00} & x_{01} \\ x_{10} & x_{11} \\ x_{20} & x_{21} \\ \end{bmatrix} = \begin{bmatrix} b_{00} & b_{01} \\ b_{10} & b_{11} \\ b_{20} & b_{21} \\ \end{bmatrix} \end{split}\]

Consider solving:

\[\begin{split} \begin{bmatrix} a_{00} & 0 & 0 \\ a_{10} & a_{11} & 0 \\ a_{20} & a_{21} & a_{22} \end{bmatrix} \cdot \begin{bmatrix} x_{00} & x_{10} & x_{20} \\ x_{01} & x_{11} & x_{21} \end{bmatrix}^{T} = \begin{bmatrix} b_{00} & b_{10} & b_{20} \\ b_{01} & b_{11} & b_{21} \end{bmatrix}^{T} \end{split}\]

Once the temporary storage buffer has been allocated, analysis meta data is required. It can be obtained by rocsparse_scsrsm_analysis(), rocsparse_dcsrsm_analysis(), rocsparse_ccsrsm_analysis() or rocsparse_zcsrsm_analysis().

rocsparse_csrsm_solve reports the first zero pivot (either numerical or structural zero). The zero pivot status can be checked calling rocsparse_csrsm_zero_pivot(). If rocsparse_diag_type == rocsparse_diag_type_unit, no zero pivot will be reported, even if \(A_{j,j} = 0\) for some \(j\).

Example

Consider the lower triangular \(m \times m\) matrix \(L\), stored in CSR storage format with unit diagonal. The following example solves \(L \cdot X = B\).

// Create rocSPARSE handle
rocsparse_handle handle;
rocsparse_create_handle(&handle);

// Create matrix descriptor
rocsparse_mat_descr descr;
rocsparse_create_mat_descr(&descr);
rocsparse_set_mat_fill_mode(descr, rocsparse_fill_mode_lower);
rocsparse_set_mat_diag_type(descr, rocsparse_diag_type_unit);

// Create matrix info structure
rocsparse_mat_info info;
rocsparse_create_mat_info(&info);

// Obtain required buffer size
size_t buffer_size;
rocsparse_dcsrsm_buffer_size(handle,
                             rocsparse_operation_none,
                             rocsparse_operation_none,
                             m,
                             nrhs,
                             nnz,
                             &alpha,
                             descr,
                             csr_val,
                             csr_row_ptr,
                             csr_col_ind,
                             B,
                             ldb,
                             info,
                             rocsparse_solve_policy_auto,
                             &buffer_size);

// Allocate temporary buffer
void* temp_buffer;
hipMalloc(&temp_buffer, buffer_size);

// Perform analysis step
rocsparse_dcsrsm_analysis(handle,
                          rocsparse_operation_none,
                          rocsparse_operation_none,
                          m,
                          nrhs,
                          nnz,
                          &alpha,
                          descr,
                          csr_val,
                          csr_row_ptr,
                          csr_col_ind,
                          B,
                          ldb,
                          info,
                          rocsparse_analysis_policy_reuse,
                          rocsparse_solve_policy_auto,
                          temp_buffer);

// Solve LX = B
rocsparse_dcsrsm_solve(handle,
                       rocsparse_operation_none,
                       rocsparse_operation_none,
                       m,
                       nrhs,
                       nnz,
                       &alpha,
                       descr,
                       csr_val,
                       csr_row_ptr,
                       csr_col_ind,
                       B,
                       ldb,
                       info,
                       rocsparse_solve_policy_auto,
                       temp_buffer);

// No zero pivot should be found, with L having unit diagonal

// Clean up
hipFree(temp_buffer);
rocsparse_destroy_mat_info(info);
rocsparse_destroy_mat_descr(descr);
rocsparse_destroy_handle(handle);

Note

The sparse CSR matrix has to be sorted. This can be achieved by calling rocsparse_csrsort().

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only trans_A != rocsparse_operation_conjugate_transpose and trans_B != rocsparse_operation_conjugate_transpose is supported.

Note

This routine supports execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • trans_A[in] matrix A operation type.

  • trans_B[in] matrix B operation type.

  • m[in] number of rows of the sparse CSR matrix A.

  • nrhs[in] number of columns of the column-oriented dense matrix op(B).

  • nnz[in] number of non-zero entries of the sparse CSR matrix A.

  • alpha[in] scalar \(\alpha\).

  • descr[in] descriptor of the sparse CSR matrix A.

  • csr_val[in] array of nnz elements of the sparse CSR matrix A.

  • csr_row_ptr[in] array of m+1 elements that point to the start of every row of the sparse CSR matrix A.

  • csr_col_ind[in] array of nnz elements containing the column indices of the sparse CSR matrix A.

  • B[inout] column-oriented dense matrix of dimension m \(\times\) nrhs elements of the rhs matrix B.

  • ldb[in] leading dimension of rhs matrix B.

  • info[in] structure that holds the information collected during the analysis step.

  • policy[in] rocsparse_solve_policy_auto.

  • temp_buffer[in] temporary storage buffer allocated by the user.

Return values:

rocsparse_csrsm_clear()#

rocsparse_status rocsparse_csrsm_clear(rocsparse_handle handle, rocsparse_mat_info info)#

Sparse triangular system solve using CSR storage format.

rocsparse_csrsm_clear deallocates all memory that was allocated by rocsparse_scsrsm_analysis(), rocsparse_dcsrsm_analysis(), rocsparse_ccsrsm_analysis() or rocsparse_zcsrsm_analysis(). This is especially useful, if memory is an issue and the analysis data is not required for further computation, e.g. when switching to another sparse matrix format. Calling rocsparse_csrsm_clear is optional. All allocated resources will be cleared, when the opaque rocsparse_mat_info struct is destroyed using rocsparse_destroy_mat_info().

Note

This routine does not support execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • info[inout] structure that holds the information collected during the analysis step.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointerinfo pointer is invalid.

  • rocsparse_status_memory_error – the buffer holding the meta data could not be deallocated.

  • rocsparse_status_internal_error – an internal error occurred.

rocsparse_bsrsm_zero_pivot()#

rocsparse_status rocsparse_bsrsm_zero_pivot(rocsparse_handle handle, rocsparse_mat_info info, rocsparse_int *position)#

Sparse triangular system solve using BSR storage format.

rocsparse_bsrsm_zero_pivot returns rocsparse_status_zero_pivot, if either a structural or numerical zero has been found during rocsparse_sbsrsm_solve(), rocsparse_dbsrsm_solve(), rocsparse_cbsrsm_solve() or rocsparse_zbsrsm_solve() computation. The first zero pivot \(j\) at \(A_{j,j}\) is stored in position, using same index base as the BSR matrix.

position can be in host or device memory. If no zero pivot has been found, position is set to -1 and rocsparse_status_success is returned instead.

Note

rocsparse_bsrsm_zero_pivot is a blocking function. It might influence performance negatively.

Note

This routine does not support execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • info[in] structure that holds the information collected during the analysis step.

  • position[inout] pointer to zero pivot \(j\), can be in host or device memory.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointerinfo or position pointer is invalid.

  • rocsparse_status_internal_error – an internal error occurred.

  • rocsparse_status_zero_pivot – zero pivot has been found.

rocsparse_bsrsm_buffer_size()#

rocsparse_status rocsparse_sbsrsm_buffer_size(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_X, rocsparse_int mb, rocsparse_int nrhs, rocsparse_int nnzb, const rocsparse_mat_descr descr, const float *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, size_t *buffer_size)#
rocsparse_status rocsparse_dbsrsm_buffer_size(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_X, rocsparse_int mb, rocsparse_int nrhs, rocsparse_int nnzb, const rocsparse_mat_descr descr, const double *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, size_t *buffer_size)#
rocsparse_status rocsparse_cbsrsm_buffer_size(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_X, rocsparse_int mb, rocsparse_int nrhs, rocsparse_int nnzb, const rocsparse_mat_descr descr, const rocsparse_float_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, size_t *buffer_size)#
rocsparse_status rocsparse_zbsrsm_buffer_size(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_X, rocsparse_int mb, rocsparse_int nrhs, rocsparse_int nnzb, const rocsparse_mat_descr descr, const rocsparse_double_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, size_t *buffer_size)#

Sparse triangular system solve using BSR storage format.

rocsparse_bsrsm_buffer_size returns the size of the temporary storage buffer that is required by rocsparse_sbsrsm_analysis(), rocsparse_dbsrsm_analysis(), rocsparse_cbsrsm_analysis(), rocsparse_zbsrsm_analysis(), rocsparse_sbsrsm_solve(), rocsparse_dbsrsm_solve(), rocsparse_cbsrsm_solve() and rocsparse_zbsrsm_solve(). The temporary storage buffer must be allocated by the user.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

This routine supports execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • dir[in] matrix storage of BSR blocks.

  • trans_A[in] matrix A operation type.

  • trans_X[in] matrix X operation type.

  • mb[in] number of block rows of the sparse BSR matrix A.

  • nrhs[in] number of columns of the column-oriented dense matrix op(X).

  • nnzb[in] number of non-zero blocks of the sparse BSR matrix A.

  • descr[in] descriptor of the sparse BSR matrix A.

  • bsr_val[in] array of nnzb blocks of the sparse BSR matrix.

  • bsr_row_ptr[in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.

  • bsr_col_ind[in] array of nnzb containing the block column indices of the sparse BSR matrix.

  • block_dim[in] block dimension of the sparse BSR matrix.

  • info[in] structure that holds the information collected during the analysis step.

  • buffer_size[out] number of bytes of the temporary storage buffer required by rocsparse_sbsrsm_analysis(), rocsparse_dbsrsm_analysis(), rocsparse_cbsrsm_analysis(), rocsparse_zbsrsm_analysis(), rocsparse_sbsrsm_solve(), rocsparse_dbsrsm_solve(), rocsparse_cbsrsm_solve() and rocsparse_zbsrsm_solve().

Return values:

rocsparse_bsrsm_analysis()#

rocsparse_status rocsparse_sbsrsm_analysis(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_X, rocsparse_int mb, rocsparse_int nrhs, rocsparse_int nnzb, const rocsparse_mat_descr descr, const float *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)#
rocsparse_status rocsparse_dbsrsm_analysis(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_X, rocsparse_int mb, rocsparse_int nrhs, rocsparse_int nnzb, const rocsparse_mat_descr descr, const double *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)#
rocsparse_status rocsparse_cbsrsm_analysis(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_X, rocsparse_int mb, rocsparse_int nrhs, rocsparse_int nnzb, const rocsparse_mat_descr descr, const rocsparse_float_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)#
rocsparse_status rocsparse_zbsrsm_analysis(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_X, rocsparse_int mb, rocsparse_int nrhs, rocsparse_int nnzb, const rocsparse_mat_descr descr, const rocsparse_double_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)#

Sparse triangular system solve using BSR storage format.

rocsparse_bsrsm_analysis performs the analysis step for rocsparse_sbsrsm_solve(), rocsparse_dbsrsm_solve(), rocsparse_cbsrsm_solve() and rocsparse_zbsrsm_solve(). It is expected that this function will be executed only once for a given matrix and particular operation type. The analysis meta data can be cleared by rocsparse_bsrsm_clear().

rocsparse_bsrsm_analysis can share its meta data with rocsparse_sbsrilu0_analysis(), rocsparse_dbsrilu0_analysis(), rocsparse_cbsrilu0_analysis(), rocsparse_zbsrilu0_analysis(), rocsparse_sbsric0_analysis(), rocsparse_dbsric0_analysis(), rocsparse_cbsric0_analysis(), rocsparse_zbsric0_analysis(), rocsparse_sbsrsv_analysis(), rocsparse_dbsrsv_analysis(), rocsparse_cbsrsv_analysis() and rocsparse_zbsrsv_analysis(). Selecting rocsparse_analysis_policy_reuse policy can greatly improve computation performance of meta data. However, the user need to make sure that the sparsity pattern remains unchanged. If this cannot be assured, rocsparse_analysis_policy_force has to be used.

Note

If the matrix sparsity pattern changes, the gathered information will become invalid.

Note

This function is blocking with respect to the host.

Note

This routine does not support execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • dir[in] matrix storage of BSR blocks.

  • trans_A[in] matrix A operation type.

  • trans_X[in] matrix X operation type.

  • mb[in] number of block rows of the sparse BSR matrix A.

  • nrhs[in] number of columns of the column-oriented dense matrix op(X).

  • nnzb[in] number of non-zero blocks of the sparse BSR matrix A.

  • descr[in] descriptor of the sparse BSR matrix A.

  • bsr_val[in] array of nnzb blocks of the sparse BSR matrix A.

  • bsr_row_ptr[in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix A.

  • bsr_col_ind[in] array of nnzb containing the block column indices of the sparse BSR matrix A.

  • block_dim[in] block dimension of the sparse BSR matrix A.

  • info[out] structure that holds the information collected during the analysis step.

  • analysis[in] rocsparse_analysis_policy_reuse or rocsparse_analysis_policy_force.

  • solve[in] rocsparse_solve_policy_auto.

  • temp_buffer[in] temporary storage buffer allocated by the user.

Return values:

rocsparse_bsrsm_solve()#

rocsparse_status rocsparse_sbsrsm_solve(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_X, rocsparse_int mb, rocsparse_int nrhs, rocsparse_int nnzb, const float *alpha, const rocsparse_mat_descr descr, const float *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, const float *B, rocsparse_int ldb, float *X, rocsparse_int ldx, rocsparse_solve_policy policy, void *temp_buffer)#
rocsparse_status rocsparse_dbsrsm_solve(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_X, rocsparse_int mb, rocsparse_int nrhs, rocsparse_int nnzb, const double *alpha, const rocsparse_mat_descr descr, const double *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, const double *B, rocsparse_int ldb, double *X, rocsparse_int ldx, rocsparse_solve_policy policy, void *temp_buffer)#
rocsparse_status rocsparse_cbsrsm_solve(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_X, rocsparse_int mb, rocsparse_int nrhs, rocsparse_int nnzb, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, const rocsparse_float_complex *B, rocsparse_int ldb, rocsparse_float_complex *X, rocsparse_int ldx, rocsparse_solve_policy policy, void *temp_buffer)#
rocsparse_status rocsparse_zbsrsm_solve(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_X, rocsparse_int mb, rocsparse_int nrhs, rocsparse_int nnzb, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, const rocsparse_double_complex *B, rocsparse_int ldb, rocsparse_double_complex *X, rocsparse_int ldx, rocsparse_solve_policy policy, void *temp_buffer)#

Sparse triangular system solve using BSR storage format.

rocsparse_bsrsm_solve solves a sparse triangular linear system of a sparse \(m \times m\) matrix (where \(m = mb \times block\_dim\)), defined in BSR storage format, a column-oriented dense solution matrix \(X\) and the column-oriented dense right-hand side matrix \(B\) that is multiplied by \(\alpha\), such that

\[ op(A) \cdot op(X) = \alpha \cdot op(B), \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans_A == rocsparse_operation_none} \\ A^T, & \text{if trans_A == rocsparse_operation_transpose} \\ A^H, & \text{if trans_A == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
,
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if trans_X == rocsparse_operation_none} \\ B^T, & \text{if trans_X == rocsparse_operation_transpose} \\ B^H, & \text{if trans_X == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
and
\[\begin{split} op(X) = \left\{ \begin{array}{ll} X, & \text{if trans_X == rocsparse_operation_none} \\ X^T, & \text{if trans_X == rocsparse_operation_transpose} \\ X^H, & \text{if trans_X == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

Note that as indicated above, the operation type of both \(op(B)\) and \(op(X)\) is specified by the trans_X parameter and that the operation type of B and X must match. For example, if \(op(B)=B\) then \(op(X)=X\). Likewise, if \(op(B)=B^T\) then \(op(X)=X^T\).

Given that the sparse matrix A is a square matrix, its size is \(m \times m\) regardless of whether A is transposed or not. The size of the column-oriented dense matrices B and X have size that depends on the value of trans_X:

\[\begin{split} op(B) = \left\{ \begin{array}{ll} ldb \times nrhs, \text{ } ldb \ge m, & \text{if trans_X == rocsparse_operation_none} \\ ldb \times m, \text{ } ldb \ge nrhs, & \text{if trans_X == rocsparse_operation_transpose} \\ ldb \times m, \text{ } ldb \ge nrhs, & \text{if trans_X == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
and
\[\begin{split} op(X) = \left\{ \begin{array}{ll} ldb \times nrhs, \text{ } ldb \ge m, & \text{if trans_X == rocsparse_operation_none} \\ ldb \times m, \text{ } ldb \ge nrhs, & \text{if trans_X == rocsparse_operation_transpose} \\ ldb \times m, \text{ } ldb \ge nrhs, & \text{if trans_X == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

rocsparse_bsrsm_solve requires a user allocated temporary buffer. Its size is returned by rocsparse_sbsrsm_buffer_size(), rocsparse_dbsrsm_buffer_size(), rocsparse_cbsrsm_buffer_size() or rocsparse_zbsrsm_buffer_size(). The size of the required buffer is larger when trans_A equals rocsparse_operation_transpose or rocsparse_operation_conjugate_transpose and when trans_X is rocsparse_operation_none. The subsequent solve will also be faster when \(A\) is non-transposed and \(B\) is transposed (or conjugate transposed). For example, instead of solving:

\[\begin{split} \left[ \begin{array}{c | c} \begin{array}{c c} a_{00} & a_{01} \\ a_{10} & a_{11} \end{array} & \begin{array}{c c} 0 & 0 \\ 0 & 0 \end{array} \\ \hline \begin{array}{c c} a_{20} & a_{21} \\ a_{30} & a_{31} \end{array} & \begin{array}{c c} a_{22} & a_{23} \\ a_{32} & a_{33} \end{array} \\ \end{array} \right] \cdot \begin{bmatrix} x_{00} & x_{01} \\ x_{10} & x_{11} \\ x_{20} & x_{21} \\ x_{30} & x_{31} \\ \end{bmatrix} = \begin{bmatrix} b_{00} & b_{01} \\ b_{10} & b_{11} \\ b_{20} & b_{21} \\ b_{30} & b_{31} \\ \end{bmatrix} \end{split}\]

Consider solving:

\[\begin{split} \left[ \begin{array}{c | c} \begin{array}{c c} a_{00} & a_{01} \\ a_{10} & a_{11} \end{array} & \begin{array}{c c} 0 & 0 \\ 0 & 0 \end{array} \\ \hline \begin{array}{c c} a_{20} & a_{21} \\ a_{30} & a_{31} \end{array} & \begin{array}{c c} a_{22} & a_{23} \\ a_{32} & a_{33} \end{array} \\ \end{array} \right] \cdot \begin{bmatrix} x_{00} & x_{10} & x_{20} & x_{30} \\ x_{01} & x_{11} & x_{21} & x_{31} \end{bmatrix}^{T} = \begin{bmatrix} b_{00} & b_{10} & b_{20} & b_{30} \\ b_{01} & b_{11} & b_{21} & b_{31} \end{bmatrix}^{T} \end{split}\]

Once the temporary storage buffer has been allocated, analysis meta data is required. It can be obtained by rocsparse_sbsrsm_analysis(), rocsparse_dbsrsm_analysis(), rocsparse_cbsrsm_analysis() or rocsparse_zbsrsm_analysis().

rocsparse_bsrsm_solve reports the first zero pivot (either numerical or structural zero). The zero pivot status can be checked calling rocsparse_bsrsm_zero_pivot(). If rocsparse_diag_type == rocsparse_diag_type_unit, no zero pivot will be reported, even if \(A_{j,j} = 0\) for some \(j\).

Example

Consider the lower triangular \(m \times m\) matrix \(L\), stored in BSR storage format with non-unit diagonal. The following example solves \(L \cdot X = B\).

// rocSPARSE handle
rocsparse_handle handle;
rocsparse_create_handle(&handle);

// A = ( 1.0  0.0  0.0  0.0 )
//     ( 2.0  3.0  0.0  0.0 )
//     ( 4.0  5.0  6.0  0.0 )
//     ( 7.0  0.0  8.0  9.0 )
//
// with bsr_dim = 2
//
//      -------------------
//   = | 1.0 0.0 | 0.0 0.0 |
//     | 2.0 3.0 | 0.0 0.0 |
//      -------------------
//     | 4.0 5.0 | 6.0 0.0 |
//     | 7.0 0.0 | 8.0 9.0 |
//      -------------------

// Number of rows and columns
rocsparse_int m = 4;

// Number of block rows and block columns
rocsparse_int mb = 2;
rocsparse_int nb = 2;

// BSR block dimension
rocsparse_int bsr_dim = 2;

// Number of right-hand-sides
rocsparse_int nrhs = 4;

// Number of non-zero blocks
rocsparse_int nnzb = 3;

// BSR row pointers
rocsparse_int hbsr_row_ptr[3] = {0, 1, 3};

// BSR column indices
rocsparse_int hbsr_col_ind[3] = {0, 0, 1};

// BSR values
double hbsr_val[12] = {1.0, 2.0, 0.0, 3.0, 4.0, 7.0, 5.0, 0.0, 6.0, 8.0, 0.0, 9.0};

// Storage scheme of the BSR blocks
rocsparse_direction dir = rocsparse_direction_column;

// Transposition of the matrix and rhs matrix
rocsparse_operation transA = rocsparse_operation_none;
rocsparse_operation transX = rocsparse_operation_none;

// Analysis policy
rocsparse_analysis_policy analysis_policy = rocsparse_analysis_policy_reuse;

// Solve policy
rocsparse_solve_policy solve_policy = rocsparse_solve_policy_auto;

// Scalar alpha and beta
double alpha = 3.7;

// rhs and solution matrix
rocsparse_int ldb = nb * bsr_dim;
rocsparse_int ldx = mb * bsr_dim;

double hB[16] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
double hX[16];

// Offload data to device
rocsparse_int* dbsr_row_ptr;
rocsparse_int* dbsr_col_ind;
double*        dbsr_val;
double*        dB;
double*        dX;

hipMalloc((void**)&dbsr_row_ptr, sizeof(rocsparse_int) * (mb + 1));
hipMalloc((void**)&dbsr_col_ind, sizeof(rocsparse_int) * nnzb);
hipMalloc((void**)&dbsr_val, sizeof(double) * nnzb * bsr_dim * bsr_dim);
hipMalloc((void**)&dB, sizeof(double) * nb * bsr_dim * nrhs);
hipMalloc((void**)&dX, sizeof(double) * mb * bsr_dim * nrhs);

hipMemcpy(dbsr_row_ptr, hbsr_row_ptr, sizeof(rocsparse_int) * (mb + 1), hipMemcpyHostToDevice);
hipMemcpy(dbsr_col_ind, hbsr_col_ind, sizeof(rocsparse_int) * nnzb, hipMemcpyHostToDevice);
hipMemcpy(dbsr_val, hbsr_val, sizeof(double) * nnzb * bsr_dim * bsr_dim, hipMemcpyHostToDevice);
hipMemcpy(dB, hB, sizeof(double) * nb * bsr_dim * nrhs, hipMemcpyHostToDevice);

// Matrix descriptor
rocsparse_mat_descr descr;
rocsparse_create_mat_descr(&descr);

// Matrix fill mode
rocsparse_set_mat_fill_mode(descr, rocsparse_fill_mode_lower);

// Matrix diagonal type
rocsparse_set_mat_diag_type(descr, rocsparse_diag_type_non_unit);

// Matrix info structure
rocsparse_mat_info info;
rocsparse_create_mat_info(&info);

// Obtain required buffer size
size_t buffer_size;
rocsparse_dbsrsm_buffer_size(handle,
                          dir,
                          transA,
                          transX,
                          mb,
                          nrhs,
                          nnzb,
                          descr,
                          dbsr_val,
                          dbsr_row_ptr,
                          dbsr_col_ind,
                          bsr_dim,
                          info,
                          &buffer_size);

// Allocate temporary buffer
void* temp_buffer;
hipMalloc(&temp_buffer, buffer_size);

// Perform analysis step
rocsparse_dbsrsm_analysis(handle,
                      dir,
                      transA,
                      transX,
                      mb,
                      nrhs,
                      nnzb,
                      descr,
                      dbsr_val,
                      dbsr_row_ptr,
                      dbsr_col_ind,
                      bsr_dim,
                      info,
                      analysis_policy,
                      solve_policy,
                      temp_buffer);

// Call dbsrsm to perform lower triangular solve LX = B
rocsparse_dbsrsm_solve(handle,
                      dir,
                      transA,
                      transX,
                      mb,
                      nrhs,
                      nnzb,
                      &alpha,
                      descr,
                      dbsr_val,
                      dbsr_row_ptr,
                      dbsr_col_ind,
                      bsr_dim,
                      info,
                      dB,
                      ldb,
                      dX,
                      ldx,
                      solve_policy,
                      temp_buffer);

// Check for zero pivots
rocsparse_int    pivot;
rocsparse_status status = rocsparse_bsrsm_zero_pivot(handle, info, &pivot);

if(status == rocsparse_status_zero_pivot)
{
    std::cout << "Found zero pivot in matrix row " << pivot << std::endl;
}

// Copy result back to host
hipMemcpy(hX, dX, sizeof(double) * mb * bsr_dim * nrhs, hipMemcpyDeviceToHost);

// Clear rocSPARSE
rocsparse_destroy_mat_info(info);
rocsparse_destroy_mat_descr(descr);
rocsparse_destroy_handle(handle);

// Clear device memory
hipFree(dbsr_row_ptr);
hipFree(dbsr_col_ind);
hipFree(dbsr_val);
hipFree(dB);
hipFree(dX);
hipFree(temp_buffer);

Note

The sparse BSR matrix has to be sorted.

Note

Operation type of B and X must match, for example if \(op(B)=B\) then \(op(X)=X\).

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only trans_A != rocsparse_operation_conjugate_transpose and trans_X != rocsparse_operation_conjugate_transpose is supported.

Note

This routine supports execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • dir[in] matrix storage of BSR blocks.

  • trans_A[in] matrix A operation type.

  • trans_X[in] matrix X operation type.

  • mb[in] number of block rows of the sparse BSR matrix A.

  • nrhs[in] number of columns of the column-oriented dense matrix op(X).

  • nnzb[in] number of non-zero blocks of the sparse BSR matrix A.

  • alpha[in] scalar \(\alpha\).

  • descr[in] descriptor of the sparse BSR matrix A.

  • bsr_val[in] array of nnzb blocks of the sparse BSR matrix.

  • bsr_row_ptr[in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.

  • bsr_col_ind[in] array of nnzb containing the block column indices of the sparse BSR matrix.

  • block_dim[in] block dimension of the sparse BSR matrix.

  • info[in] structure that holds the information collected during the analysis step.

  • B[in] column-oriented dense matrix B with leading dimension ldb.

  • ldb[in] leading dimension of rhs matrix B.

  • X[out] column-oriented dense solution matrix X with leading dimension ldx.

  • ldx[in] leading dimension of solution matrix X.

  • policy[in] rocsparse_solve_policy_auto.

  • temp_buffer[in] temporary storage buffer allocated by the user.

Return values:

rocsparse_bsrsm_clear()#

rocsparse_status rocsparse_bsrsm_clear(rocsparse_handle handle, rocsparse_mat_info info)#

Sparse triangular system solve using BSR storage format.

rocsparse_bsrsm_clear deallocates all memory that was allocated by rocsparse_sbsrsm_analysis(), rocsparse_dbsrsm_analysis(), rocsparse_cbsrsm_analysis() or rocsparse_zbsrsm_analysis(). This is especially useful, if memory is an issue and the analysis data is not required for further computation, e.g. when switching to another sparse matrix format. Calling rocsparse_bsrsm_clear is optional. All allocated resources will be cleared, when the opaque rocsparse_mat_info struct is destroyed using rocsparse_destroy_mat_info().

Note

This routine does not support execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • info[inout] structure that holds the information collected during the analysis step.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointerinfo pointer is invalid.

  • rocsparse_status_memory_error – the buffer holding the meta data could not be deallocated.

  • rocsparse_status_internal_error – an internal error occurred.

rocsparse_gemmi()#

rocsparse_status rocsparse_sgemmi(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, rocsparse_int nnz, const float *alpha, const float *A, rocsparse_int lda, const rocsparse_mat_descr descr, const float *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const float *beta, float *C, rocsparse_int ldc)#
rocsparse_status rocsparse_dgemmi(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, rocsparse_int nnz, const double *alpha, const double *A, rocsparse_int lda, const rocsparse_mat_descr descr, const double *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const double *beta, double *C, rocsparse_int ldc)#
rocsparse_status rocsparse_cgemmi(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, rocsparse_int nnz, const rocsparse_float_complex *alpha, const rocsparse_float_complex *A, rocsparse_int lda, const rocsparse_mat_descr descr, const rocsparse_float_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const rocsparse_float_complex *beta, rocsparse_float_complex *C, rocsparse_int ldc)#
rocsparse_status rocsparse_zgemmi(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, rocsparse_int nnz, const rocsparse_double_complex *alpha, const rocsparse_double_complex *A, rocsparse_int lda, const rocsparse_mat_descr descr, const rocsparse_double_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const rocsparse_double_complex *beta, rocsparse_double_complex *C, rocsparse_int ldc)#

Dense matrix sparse matrix multiplication using CSR storage format.

rocsparse_gemmi multiplies the scalar \(\alpha\) with a column-oriented dense \(m \times k\) matrix \(op(A)\) and the sparse \(k \times n\) matrix \(op(B)\), defined in CSR storage format and adds the result to the column-oriented dense \(m \times n\) matrix \(C\) that is multiplied by the scalar \(\beta\), such that

\[ C := \alpha \cdot op(A) \cdot op(B) + \beta \cdot C \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans_A == rocsparse_operation_none} \\ A^T, & \text{if trans_A == rocsparse_operation_transpose} \\ A^H, & \text{if trans_A == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
and
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if trans_B == rocsparse_operation_none} \\ B^T, & \text{if trans_B == rocsparse_operation_transpose} \\ B^H, & \text{if trans_B == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

Example

This example multiplies a column-oriented dense matrix with a CSC matrix.

rocsparse_int m   = 2;
rocsparse_int n   = 5;
rocsparse_int k   = 3;
rocsparse_int nnz = 8;
rocsparse_int lda = m;
rocsparse_int ldc = m;

// Matrix A (m x k)
// (  9.0  10.0  11.0 )
// ( 12.0  13.0  14.0 )

// Matrix B (k x n)
// ( 1.0  2.0  0.0  3.0  0.0 )
// ( 0.0  4.0  5.0  0.0  0.0 )
// ( 6.0  0.0  0.0  7.0  8.0 )

// Matrix C (m x n)
// ( 15.0  16.0  17.0  18.0  19.0 )
// ( 20.0  21.0  22.0  23.0  24.0 )

A[lda * k]           = {9.0, 12.0, 10.0, 13.0, 11.0, 14.0};      // device memory
csc_col_ptr_B[n + 1] = {0, 2, 4, 5, 7, 8};                       // device memory
csc_row_ind_B[nnz]   = {0, 0, 1, 1, 2, 3, 3, 4};                 // device memory
csc_val_B[nnz]       = {1.0, 6.0, 2.0, 4.0, 5.0, 3.0, 7.0, 8.0}; // device memory
C[ldc * n]           = {15.0, 20.0, 16.0, 21.0, 17.0, 22.0,      // device memory
                        18.0, 23.0, 19.0, 24.0};

// alpha and beta
float alpha = 1.0f;
float beta  = 0.0f;

// Perform the matrix multiplication
rocsparse_sgemmi(handle,
                 rocsparse_operation_none,
                 rocsparse_operation_transpose,
                 m,
                 n,
                 k,
                 nnz,
                 &alpha,
                 A,
                 lda,
                 descr_B,
                 csc_val_B,
                 csc_col_ptr_B,
                 csc_row_ind_B,
                 &beta,
                 C,
                 ldc);

Note

Currently, only trans_A == rocsparse_operation_none is supported.

Note

Currently, only trans_B == rocsparse_operation_transpose is supported.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

This routine supports execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • trans_A[in] matrix \(A\) operation type.

  • trans_B[in] matrix \(B\) operation type.

  • m[in] number of rows of the column-oriented dense matrix \(A\).

  • n[in] number of columns of the sparse CSR matrix \(op(B)\) and \(C\).

  • k[in] number of columns of the column-oriented dense matrix \(A\).

  • nnz[in] number of non-zero entries of the sparse CSR matrix \(B\).

  • alpha[in] scalar \(\alpha\).

  • A[in] array of dimension \(lda \times k\) ( \(op(A) == A\)) or \(lda \times m\) ( \(op(A) == A^T\) or \(op(A) == A^H\)).

  • lda[in] leading dimension of \(A\), must be at least \(m\) ( \(op(A) == A\)) or \(k\) ( \(op(A) == A^T\) or \(op(A) == A^H\)).

  • descr[in] descriptor of the sparse CSR matrix \(B\). Currently, only rocsparse_matrix_type_general is supported.

  • csr_val[in] array of nnz elements of the sparse CSR matrix \(B\).

  • csr_row_ptr[in] array of m+1 elements that point to the start of every row of the sparse CSR matrix \(B\).

  • csr_col_ind[in] array of nnz elements containing the column indices of the sparse CSR matrix \(B\).

  • beta[in] scalar \(\beta\).

  • C[inout] column-oriented dense matrix of dimension \(ldc \times n\) that holds the values of \(C\).

  • ldc[in] leading dimension of \(C\), must be at least \(m\).

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_sizem, n, k, nnz, lda or ldc is invalid.

  • rocsparse_status_invalid_pointeralpha, A, csr_val, csr_row_ptr, csr_col_ind, beta or C pointer is invalid.