Sparse Generic Functions#

This module holds all sparse generic routines.

The sparse generic routines describe operations that manipulate sparse vectors and matrices.

rocsparse_axpby()#

rocsparse_status rocsparse_axpby(rocsparse_handle handle, const void *alpha, const rocsparse_spvec_descr x, const void *beta, rocsparse_dnvec_descr y)#

Scale a sparse vector and add it to a scaled dense vector.

rocsparse_axpby multiplies the sparse vector \(x\) with scalar \(\alpha\) and adds the result to the dense vector \(y\) that is multiplied with scalar \(\beta\), such that

\[ y := \alpha \cdot x + \beta \cdot y \]

for(i = 0; i < nnz; ++i)
{
    y[x_ind[i]] = alpha * x_val[i] + beta * y[x_ind[i]];
}
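
Example

The following is a minimal usage sketch added for illustration (it is not part of the original reference). The device arrays dx_ind, dx_val (nnz entries) and dy (size entries), as well as size and nnz, are assumed to have been set up beforehand, as in the rocsparse_spvv() example further below:

// Wrap the existing device arrays in descriptors
rocsparse_handle handle;
rocsparse_create_handle(&handle);

rocsparse_spvec_descr vecX;
rocsparse_dnvec_descr vecY;
rocsparse_create_spvec_descr(&vecX, size, nnz, dx_ind, dx_val,
                             rocsparse_indextype_i32, rocsparse_index_base_zero, rocsparse_datatype_f32_r);
rocsparse_create_dnvec_descr(&vecY, size, dy, rocsparse_datatype_f32_r);

// y := alpha * x + beta * y
float alpha = 2.0f;
float beta  = 1.0f;
rocsparse_axpby(handle, &alpha, vecX, &beta, vecY);

// The routine is asynchronous; synchronize before reading y on the host
hipDeviceSynchronize();

rocsparse_destroy_spvec_descr(vecX);
rocsparse_destroy_dnvec_descr(vecY);
rocsparse_destroy_handle(handle);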

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • alpha[in] scalar \(\alpha\).

  • x[in] sparse vector descriptor.

  • beta[in] scalar \(\beta\).

  • y[inout] dense vector descriptor.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointer – alpha, x, beta or y pointer is invalid.

rocsparse_gather()#

rocsparse_status rocsparse_gather(rocsparse_handle handle, const rocsparse_dnvec_descr y, rocsparse_spvec_descr x)#

Gather elements from a dense vector and store them into a sparse vector.

rocsparse_gather gathers the elements from the dense vector \(y\) and stores them in the sparse vector \(x\).

for(i = 0; i < nnz; ++i)
{
    x_val[i] = y[x_ind[i]];
}
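
Example

A minimal usage sketch added for illustration; the handle and the descriptors vecX (sparse) and vecY (dense) are assumed to have been created as in the rocsparse_axpby() sketch above:

// Pull y[x_ind[i]] into x_val[i] for every index stored in x
rocsparse_gather(handle, vecY, vecX);

// The routine is asynchronous; synchronize before using the gathered values on the host
hipDeviceSynchronize();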

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • y[in] dense vector \(y\).

  • x[out] sparse vector \(x\).

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointer – x or y pointer is invalid.

rocsparse_scatter()#

rocsparse_status rocsparse_scatter(rocsparse_handle handle, const rocsparse_spvec_descr x, rocsparse_dnvec_descr y)#

Scatter elements from a sparse vector into a dense vector.

rocsparse_scatter scatters the elements from the sparse vector \(x\) in the dense vector \(y\).

for(i = 0; i < nnz; ++i)
{
    y[x_ind[i]] = x_val[i];
}
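
Example

A minimal usage sketch added for illustration; the handle and the descriptors vecX (sparse) and vecY (dense) are assumed to have been created as in the rocsparse_axpby() sketch above:

// Write x_val[i] into y[x_ind[i]] for every index stored in x
rocsparse_scatter(handle, vecX, vecY);

// The routine is asynchronous; synchronize before using y on the host
hipDeviceSynchronize();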

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • x[in] sparse vector \(x\).

  • y[out] dense vector \(y\).

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointer – x or y pointer is invalid.

rocsparse_rot()#

rocsparse_status rocsparse_rot(rocsparse_handle handle, const void *c, const void *s, rocsparse_spvec_descr x, rocsparse_dnvec_descr y)#

Apply Givens rotation to a dense and a sparse vector.

rocsparse_rot applies the Givens rotation matrix \(G\) to the sparse vector \(x\) and the dense vector \(y\), where

\[\begin{split} G = \begin{pmatrix} c & s \\ -s & c \end{pmatrix} \end{split}\]

for(i = 0; i < nnz; ++i)
{
    x_tmp = x_val[i];
    y_tmp = y[x_ind[i]];

    x_val[i]    = c * x_tmp + s * y_tmp;
    y[x_ind[i]] = c * y_tmp - s * x_tmp;
}
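
Example

A minimal usage sketch added for illustration; the handle and the descriptors vecX (sparse) and vecY (dense) are assumed to have been created as in the rocsparse_axpby() sketch above:

// Rotation coefficients c = cos(theta), s = sin(theta); the values here are arbitrary
float c = 0.8f;
float s = 0.6f;
rocsparse_rot(handle, &c, &s, vecX, vecY);

// The routine is asynchronous; synchronize before reading x or y on the host
hipDeviceSynchronize();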

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • c[in] pointer to the cosine element of \(G\), can be on host or device.

  • s[in] pointer to the sine element of \(G\), can be on host or device.

  • x[inout] sparse vector \(x\).

  • y[inout] dense vector \(y\).

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointer – c, s, x or y pointer is invalid.

rocsparse_spvv()#

rocsparse_status rocsparse_spvv(rocsparse_handle handle, rocsparse_operation trans, const rocsparse_spvec_descr x, const rocsparse_dnvec_descr y, void *result, rocsparse_datatype compute_type, size_t *buffer_size, void *temp_buffer)#

Sparse vector inner dot product.

rocsparse_spvv computes the inner dot product of the sparse vector \(x\) with the dense vector \(y\), such that

\[ \text{result} := op(x)^T \cdot y, \]
with
\[\begin{split} op(x) = \left\{ \begin{array}{ll} x, & \text{if trans == rocsparse_operation_none} \\ \bar{x}, & \text{if trans == rocsparse_operation_conjugate_transpose} \\ \end{array} \right. \end{split}\]

result = 0;
for(i = 0; i < nnz; ++i)
{
    result += x_val[i] * y[x_ind[i]];
}

Example
// Number of non-zeros of the sparse vector
int nnz = 3;

// Size of sparse and dense vector
int size = 9;

// Sparse index vector
std::vector<int> hx_ind = {0, 3, 5};

// Sparse value vector
std::vector<float> hx_val = {1.0f, 2.0f, 3.0f};

// Dense vector
std::vector<float> hy = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f, 9.0f};

// Offload data to device
int* dx_ind;
float* dx_val;
float* dy;
hipMalloc((void**)&dx_ind, sizeof(int) * nnz);
hipMalloc((void**)&dx_val, sizeof(float) * nnz);
hipMalloc((void**)&dy, sizeof(float) * size);

hipMemcpy(dx_ind, hx_ind.data(), sizeof(int) * nnz, hipMemcpyHostToDevice);
hipMemcpy(dx_val, hx_val.data(), sizeof(float) * nnz, hipMemcpyHostToDevice);
hipMemcpy(dy, hy.data(), sizeof(float) * size, hipMemcpyHostToDevice);

rocsparse_handle     handle;
rocsparse_spvec_descr vecX;
rocsparse_dnvec_descr vecY;

rocsparse_indextype idx_type = rocsparse_indextype_i32;
rocsparse_datatype  data_type = rocsparse_datatype_f32_r;
rocsparse_datatype  compute_type = rocsparse_datatype_f32_r;
rocsparse_operation trans = rocsparse_operation_none;
rocsparse_index_base idx_base = rocsparse_index_base_zero;

rocsparse_create_handle(&handle);

// Create sparse vector X
rocsparse_create_spvec_descr(&vecX,
                             size,
                             nnz,
                             dx_ind,
                             dx_val,
                             idx_type,
                             idx_base,
                             data_type);

// Create dense vector Y
rocsparse_create_dnvec_descr(&vecY,
                             size,
                             dy,
                             data_type);

// Obtain buffer size
float hresult = 0.0f;
size_t buffer_size;
rocsparse_spvv(handle,
               trans,
               vecX,
               vecY,
               &hresult,
               compute_type,
               &buffer_size,
               nullptr);

void* temp_buffer;
hipMalloc(&temp_buffer, buffer_size);

// SpVV
rocsparse_spvv(handle,
               trans,
               vecX,
               vecY,
               &hresult,
               compute_type,
               &buffer_size,
               temp_buffer);

hipDeviceSynchronize();

std::cout << "hresult: " << hresult << std::endl;

// Clear rocSPARSE
rocsparse_destroy_spvec_descr(vecX);
rocsparse_destroy_dnvec_descr(vecY);
rocsparse_destroy_handle(handle);

// Clear device memory
hipFree(dx_ind);
hipFree(dx_val);
hipFree(dy);
hipFree(temp_buffer);

Note

This function writes the required allocation size (in bytes) to buffer_size and returns without performing the SpVV operation, when a nullptr is passed for temp_buffer.

Note

This function is blocking with respect to the host.

Note

This routine does not support execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • trans[in] sparse vector operation type.

  • x[in] sparse vector descriptor.

  • y[in] dense vector descriptor.

  • result[out] pointer to the result, can be host or device memory

  • compute_type[in] floating point precision for the SpVV computation.

  • buffer_size[out] number of bytes of the temporary storage buffer. buffer_size is set when temp_buffer is nullptr.

  • temp_buffer[in] temporary storage buffer allocated by the user. When a nullptr is passed, the required allocation size (in bytes) is written to buffer_size and function returns without performing the SpVV operation.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointer – x, y, result or buffer_size pointer is invalid.

  • rocsparse_status_not_implemented – compute_type is currently not supported.

rocsparse_spmv()#

rocsparse_status rocsparse_spmv(rocsparse_handle handle, rocsparse_operation trans, const void *alpha, const rocsparse_spmat_descr mat, const rocsparse_dnvec_descr x, const void *beta, const rocsparse_dnvec_descr y, rocsparse_datatype compute_type, rocsparse_spmv_alg alg, size_t *buffer_size, void *temp_buffer)#

Sparse matrix vector multiplication.

rocsparse_spmv multiplies the scalar \(\alpha\) with a sparse \(m \times n\) matrix and the dense vector \(x\) and adds the result to the dense vector \(y\) that is multiplied by the scalar \(\beta\), such that

\[ y := \alpha \cdot op(A) \cdot x + \beta \cdot y, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans == rocsparse_operation_none} \\ A^T, & \text{if trans == rocsparse_operation_transpose} \\ A^H, & \text{if trans == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

rocsparse_spmv supports multiple combinations of data types and compute types. The tables below list the currently supported data types for the sparse matrix A, the dense vectors X and Y, and the compute type for \(\alpha\) and \(\beta\). Using lower-precision data types saves memory bandwidth and storage where the application allows it, while the actual computation is still performed in a higher precision.

Uniform Precisions:

|----------------------------|
|  A / X / Y / compute_type  |
|----------------------------|
|  rocsparse_datatype_f32_r  |
|  rocsparse_datatype_f64_r  |
|  rocsparse_datatype_f32_c  |
|  rocsparse_datatype_f64_c  |
|----------------------------|

Mixed precisions:

|--------------------------|---------------------------|---------------------------|
|          A / X           |             Y             |       compute_type        |
|--------------------------|---------------------------|---------------------------|
| rocsparse_datatype_i8_r  | rocsparse_datatype_i32_r  | rocsparse_datatype_i32_r  |
| rocsparse_datatype_i8_r  | rocsparse_datatype_f32_r  | rocsparse_datatype_f32_r  |
|--------------------------|---------------------------|---------------------------|

Mixed-regular real precisions:

|----------------------------|----------------------------|
|             A              |   X / Y / compute_type     |
|----------------------------|----------------------------|
|  rocsparse_datatype_f32_r  |  rocsparse_datatype_f64_r  |
|  rocsparse_datatype_f32_c  |  rocsparse_datatype_f64_c  |
|----------------------------|----------------------------|

Mixed-regular complex precisions:

|----------------------------|----------------------------|
|             A              |   X / Y / compute_type     |
|----------------------------|----------------------------|
|  rocsparse_datatype_f32_r  |  rocsparse_datatype_f32_c  |
|  rocsparse_datatype_f64_r  |  rocsparse_datatype_f64_c  |
|----------------------------|----------------------------|
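
Example

A minimal CSR SpMV sketch added for illustration. It assumes the device arrays dcsr_row_ptr, dcsr_col_ind and dcsr_val describe an m x n CSR matrix with nnz non-zeros, and that dx (n entries) and dy (m entries) already hold the dense vectors:

rocsparse_handle handle;
rocsparse_create_handle(&handle);

rocsparse_spmat_descr matA;
rocsparse_dnvec_descr vecX;
rocsparse_dnvec_descr vecY;
rocsparse_create_csr_descr(&matA, m, n, nnz, dcsr_row_ptr, dcsr_col_ind, dcsr_val,
                           rocsparse_indextype_i32, rocsparse_indextype_i32,
                           rocsparse_index_base_zero, rocsparse_datatype_f32_r);
rocsparse_create_dnvec_descr(&vecX, n, dx, rocsparse_datatype_f32_r);
rocsparse_create_dnvec_descr(&vecY, m, dy, rocsparse_datatype_f32_r);

float alpha = 1.0f;
float beta  = 0.0f;

// First call with temp_buffer == nullptr only queries the buffer size
size_t buffer_size;
rocsparse_spmv(handle, rocsparse_operation_none, &alpha, matA, vecX, &beta, vecY,
               rocsparse_datatype_f32_r, rocsparse_spmv_alg_default, &buffer_size, nullptr);

void* temp_buffer;
hipMalloc(&temp_buffer, buffer_size);

// Second call performs y := alpha * A * x + beta * y
rocsparse_spmv(handle, rocsparse_operation_none, &alpha, matA, vecX, &beta, vecY,
               rocsparse_datatype_f32_r, rocsparse_spmv_alg_default, &buffer_size, temp_buffer);

hipDeviceSynchronize();

rocsparse_destroy_spmat_descr(matA);
rocsparse_destroy_dnvec_descr(vecX);
rocsparse_destroy_dnvec_descr(vecY);
rocsparse_destroy_handle(handle);
hipFree(temp_buffer);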

Note

This function writes the required allocation size (in bytes) to buffer_size and returns without performing the SpMV operation, when a nullptr is passed for temp_buffer.

Note

Only the rocsparse_spmv_stage_buffer_size stage and the rocsparse_spmv_stage_compute stage are non blocking and executed asynchronously with respect to the host. They may return before the actual computation has finished. The rocsparse_spmv_stage_preprocess stage is blocking with respect to the host.

Note

Only the rocsparse_spmv_stage_buffer_size stage and the rocsparse_spmv_stage_compute stage support execution in a hipGraph context. The rocsparse_spmv_stage_preprocess stage does not support hipGraph.

Note

The sparse matrix formats currently supported are: rocsparse_format_bsr, rocsparse_format_coo, rocsparse_format_coo_aos, rocsparse_format_csr, rocsparse_format_csc and rocsparse_format_ell.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • trans[in] matrix operation type.

  • alpha[in] scalar \(\alpha\).

  • mat[in] matrix descriptor.

  • x[in] vector descriptor.

  • beta[in] scalar \(\beta\).

  • y[inout] vector descriptor.

  • compute_type[in] floating point precision for the SpMV computation.

  • alg[in] SpMV algorithm for the SpMV computation.

  • buffer_size[out] number of bytes of the temporary storage buffer. buffer_size is set when temp_buffer is nullptr.

  • temp_buffer[in] temporary storage buffer allocated by the user. When a nullptr is passed, the required allocation size (in bytes) is written to buffer_size and function returns without performing the SpMV operation.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointer – alpha, mat, x, beta, y or buffer_size pointer is invalid.

  • rocsparse_status_invalid_value – the value of trans, compute_type or alg is incorrect.

  • rocsparse_status_not_implemented – compute_type or alg is currently not supported.

rocsparse_spmv_ex()#

rocsparse_status rocsparse_spmv_ex(rocsparse_handle handle, rocsparse_operation trans, const void *alpha, const rocsparse_spmat_descr mat, const rocsparse_dnvec_descr x, const void *beta, const rocsparse_dnvec_descr y, rocsparse_datatype compute_type, rocsparse_spmv_alg alg, rocsparse_spmv_stage stage, size_t *buffer_size, void *temp_buffer)#

Sparse matrix vector multiplication.

rocsparse_spmv_ex multiplies the scalar \(\alpha\) with a sparse \(m \times n\) matrix and the dense vector \(x\) and adds the result to the dense vector \(y\) that is multiplied by the scalar \(\beta\), such that

\[ y := \alpha \cdot op(A) \cdot x + \beta \cdot y, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans == rocsparse_operation_none} \\ A^T, & \text{if trans == rocsparse_operation_transpose} \\ A^H, & \text{if trans == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

rocsparse_spmv_ex supports multiple combinations of data types and compute types. The tables below list the currently supported data types for the sparse matrix A, the dense vectors X and Y, and the compute type for \(\alpha\) and \(\beta\). Using lower-precision data types saves memory bandwidth and storage where the application allows it, while the actual computation is still performed in a higher precision.

Uniform Precisions:

|----------------------------|
|  A / X / Y / compute_type  |
|----------------------------|
|  rocsparse_datatype_f32_r  |
|  rocsparse_datatype_f64_r  |
|  rocsparse_datatype_f32_c  |
|  rocsparse_datatype_f64_c  |
|----------------------------|

Mixed-regular real precisions:

|----------------------------|----------------------------|
|             A              |   X / Y / compute_type     |
|----------------------------|----------------------------|
|  rocsparse_datatype_f32_r  |  rocsparse_datatype_f64_r  |
|  rocsparse_datatype_f32_c  |  rocsparse_datatype_f64_c  |
|----------------------------|----------------------------|

Mixed precisions:

|--------------------------|---------------------------|---------------------------|
|          A / X           |             Y             |       compute_type        |
|--------------------------|---------------------------|---------------------------|
| rocsparse_datatype_i8_r  | rocsparse_datatype_i32_r  | rocsparse_datatype_i32_r  |
| rocsparse_datatype_i8_r  | rocsparse_datatype_f32_r  | rocsparse_datatype_f32_r  |
|--------------------------|---------------------------|---------------------------|

Mixed-regular complex precisions:

|----------------------------|----------------------------|
|             A              |   X / Y / compute_type     |
|----------------------------|----------------------------|
|  rocsparse_datatype_f32_r  |  rocsparse_datatype_f32_c  |
|  rocsparse_datatype_f64_r  |  rocsparse_datatype_f64_c  |
|----------------------------|----------------------------|
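
Example

A sketch of the three-stage call sequence added for illustration; matA, vecX, vecY, alpha and beta are assumed to be set up as in the rocsparse_spmv() sketch above:

// Stage 1: query the required temporary buffer size
size_t buffer_size;
rocsparse_spmv_ex(handle, rocsparse_operation_none, &alpha, matA, vecX, &beta, vecY,
                  rocsparse_datatype_f32_r, rocsparse_spmv_alg_default,
                  rocsparse_spmv_stage_buffer_size, &buffer_size, nullptr);

void* temp_buffer;
hipMalloc(&temp_buffer, buffer_size);

// Stage 2: preprocess (blocking); analysis data is stored in temp_buffer
rocsparse_spmv_ex(handle, rocsparse_operation_none, &alpha, matA, vecX, &beta, vecY,
                  rocsparse_datatype_f32_r, rocsparse_spmv_alg_default,
                  rocsparse_spmv_stage_preprocess, &buffer_size, temp_buffer);

// Stage 3: compute y := alpha * A * x + beta * y (asynchronous)
rocsparse_spmv_ex(handle, rocsparse_operation_none, &alpha, matA, vecX, &beta, vecY,
                  rocsparse_datatype_f32_r, rocsparse_spmv_alg_default,
                  rocsparse_spmv_stage_compute, &buffer_size, temp_buffer);

hipDeviceSynchronize();
hipFree(temp_buffer);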

Note

This function writes the required allocation size (in bytes) to buffer_size and returns without performing the SpMV operation, when a nullptr is passed for temp_buffer.

Note

The sparse matrix formats currently supported are: rocsparse_format_bsr, rocsparse_format_coo, rocsparse_format_coo_aos, rocsparse_format_csr, rocsparse_format_csc and rocsparse_format_ell.

Note

SpMV_ex requires three stages to complete. The first stage rocsparse_spmv_stage_buffer_size will return the size of the temporary storage buffer that is required for subsequent calls to rocsparse_spmv_ex. The second stage rocsparse_spmv_stage_preprocess will preprocess data that would be saved in the temporary storage buffer. In the final stage rocsparse_spmv_stage_compute, the actual computation is performed.

Note

If rocsparse_spmv_stage_auto is selected, rocSPARSE will automatically detect which stage is required based on the following indicators: If temp_buffer is equal to nullptr, the required buffer size will be returned. Else, the SpMV_ex preprocess and the SpMV algorithm will be executed.

Note

Only the rocsparse_spmv_stage_buffer_size stage and the rocsparse_spmv_stage_compute stage are non blocking and executed asynchronously with respect to the host. They may return before the actual computation has finished. The rocsparse_spmv_stage_preprocess stage is blocking with respect to the host.

Note

Only the rocsparse_spmv_stage_buffer_size stage and the rocsparse_spmv_stage_compute stage support execution in a hipGraph context. The rocsparse_spmv_stage_preprocess stage does not support hipGraph.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • trans[in] matrix operation type.

  • alpha[in] scalar \(\alpha\).

  • mat[in] matrix descriptor.

  • x[in] vector descriptor.

  • beta[in] scalar \(\beta\).

  • y[inout] vector descriptor.

  • compute_type[in] floating point precision for the SpMV computation.

  • alg[in] SpMV algorithm for the SpMV computation.

  • stage[in] SpMV stage for the SpMV computation.

  • buffer_size[out] number of bytes of the temporary storage buffer. buffer_size is set when temp_buffer is nullptr.

  • temp_buffer[in] temporary storage buffer allocated by the user. When a nullptr is passed, the required allocation size (in bytes) is written to buffer_size and function returns without performing the SpMV operation.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointer – alpha, mat, x, beta, y or buffer_size pointer is invalid.

  • rocsparse_status_invalid_value – the value of trans, compute_type, alg or stage is incorrect.

  • rocsparse_status_not_implemented – compute_type or alg is currently not supported.

rocsparse_spsv()#

rocsparse_status rocsparse_spsv(rocsparse_handle handle, rocsparse_operation trans, const void *alpha, const rocsparse_spmat_descr mat, const rocsparse_dnvec_descr x, const rocsparse_dnvec_descr y, rocsparse_datatype compute_type, rocsparse_spsv_alg alg, rocsparse_spsv_stage stage, size_t *buffer_size, void *temp_buffer)#

Sparse triangular solve.

rocsparse_spsv solves a sparse triangular linear system of a sparse \(m \times m\) matrix, defined in CSR or COO storage format, a dense solution vector \(y\) and the right-hand side \(x\) that is multiplied by \(\alpha\), such that

\[ op(A) \cdot y = \alpha \cdot x, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans == rocsparse_operation_none} \\ A^T, & \text{if trans == rocsparse_operation_transpose} \\ A^H, & \text{if trans == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
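
Example

A sketch of the three-stage SpSV call sequence added for illustration. matA is assumed to be a triangular m x m CSR descriptor whose fill mode and diagonal type have already been set through rocsparse_spmat_set_attribute, and vecX, vecY are existing dense vector descriptors of length m:

float alpha = 1.0f;

// Stage 1: query the required temporary buffer size
size_t buffer_size;
rocsparse_spsv(handle, rocsparse_operation_none, &alpha, matA, vecX, vecY,
               rocsparse_datatype_f32_r, rocsparse_spsv_alg_default,
               rocsparse_spsv_stage_buffer_size, &buffer_size, nullptr);

void* temp_buffer;
hipMalloc(&temp_buffer, buffer_size);

// Stage 2: preprocess (blocking); analysis data is stored in temp_buffer
rocsparse_spsv(handle, rocsparse_operation_none, &alpha, matA, vecX, vecY,
               rocsparse_datatype_f32_r, rocsparse_spsv_alg_default,
               rocsparse_spsv_stage_preprocess, &buffer_size, temp_buffer);

// Stage 3: solve op(A) * y = alpha * x (asynchronous)
rocsparse_spsv(handle, rocsparse_operation_none, &alpha, matA, vecX, vecY,
               rocsparse_datatype_f32_r, rocsparse_spsv_alg_default,
               rocsparse_spsv_stage_compute, &buffer_size, temp_buffer);

hipDeviceSynchronize();
hipFree(temp_buffer);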

Note

SpSV requires three stages to complete. The first stage rocsparse_spsv_stage_buffer_size will return the size of the temporary storage buffer that is required for subsequent calls. The second stage rocsparse_spsv_stage_preprocess will preprocess data that would be saved in the temporary storage buffer. In the final stage rocsparse_spsv_stage_compute, the actual computation is performed.

Note

If rocsparse_spsv_stage_auto is selected, rocSPARSE will automatically detect which stage is required based on the following indicators: If temp_buffer is equal to nullptr, the required buffer size will be returned. If buffer_size is equal to nullptr, analysis will be performed. Otherwise, the SpSV preprocess and the SpSV algorithm will be executed.

Note

Only the rocsparse_spsv_stage_buffer_size stage and the rocsparse_spsv_stage_compute stage are non blocking and executed asynchronously with respect to the host. They may return before the actual computation has finished. The rocsparse_spsv_stage_preprocess stage is blocking with respect to the host.

Note

Currently, only trans == rocsparse_operation_none and trans == rocsparse_operation_transpose are supported.

Note

Only the rocsparse_spsv_stage_buffer_size stage and the rocsparse_spsv_stage_compute stage support execution in a hipGraph context. The rocsparse_spsv_stage_preprocess stage does not support hipGraph.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • trans[in] matrix operation type.

  • alpha[in] scalar \(\alpha\).

  • mat[in] matrix descriptor.

  • x[in] vector descriptor.

  • y[inout] vector descriptor.

  • compute_type[in] floating point precision for the SpSV computation.

  • alg[in] SpSV algorithm for the SpSV computation.

  • stage[in] SpSV stage for the SpSV computation.

  • buffer_size[out] number of bytes of the temporary storage buffer.

  • temp_buffer[in] temporary storage buffer allocated by the user. When a nullptr is passed, the required allocation size (in bytes) is written to buffer_size and function returns without performing the SpSV operation.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointer – alpha, mat, x, y or buffer_size pointer is invalid.

  • rocsparse_status_not_implemented – trans, compute_type, stage or alg is currently not supported.

rocsparse_spsm()#

rocsparse_status rocsparse_spsm(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, const void *alpha, const rocsparse_spmat_descr matA, const rocsparse_dnmat_descr matB, const rocsparse_dnmat_descr matC, rocsparse_datatype compute_type, rocsparse_spsm_alg alg, rocsparse_spsm_stage stage, size_t *buffer_size, void *temp_buffer)#

Sparse triangular system solve.

rocsparse_spsm solves a sparse triangular linear system of a sparse \(m \times m\) matrix, defined in CSR or COO storage format, a dense solution matrix \(C\) and the right-hand side \(B\) that is multiplied by \(\alpha\), such that

\[ op(A) \cdot C = \alpha \cdot op(B), \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans_A == rocsparse_operation_none} \\ A^T, & \text{if trans_A == rocsparse_operation_transpose} \\ A^H, & \text{if trans_A == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
and
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if trans_B == rocsparse_operation_none} \\ B^T, & \text{if trans_B == rocsparse_operation_transpose} \\ B^H, & \text{if trans_B == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
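
Example

A sketch of the three-stage SpSM call sequence added for illustration. matA is assumed to be a triangular m x m CSR descriptor with its fill mode and diagonal type attributes already set, and matB, matC are existing m x n dense matrix descriptors:

float alpha = 1.0f;

// Stage 1: query the required temporary buffer size
size_t buffer_size;
rocsparse_spsm(handle, rocsparse_operation_none, rocsparse_operation_none, &alpha,
               matA, matB, matC, rocsparse_datatype_f32_r, rocsparse_spsm_alg_default,
               rocsparse_spsm_stage_buffer_size, &buffer_size, nullptr);

void* temp_buffer;
hipMalloc(&temp_buffer, buffer_size);

// Stage 2: preprocess (blocking); analysis data is stored in temp_buffer
rocsparse_spsm(handle, rocsparse_operation_none, rocsparse_operation_none, &alpha,
               matA, matB, matC, rocsparse_datatype_f32_r, rocsparse_spsm_alg_default,
               rocsparse_spsm_stage_preprocess, &buffer_size, temp_buffer);

// Stage 3: solve op(A) * C = alpha * op(B) (asynchronous)
rocsparse_spsm(handle, rocsparse_operation_none, rocsparse_operation_none, &alpha,
               matA, matB, matC, rocsparse_datatype_f32_r, rocsparse_spsm_alg_default,
               rocsparse_spsm_stage_compute, &buffer_size, temp_buffer);

hipDeviceSynchronize();
hipFree(temp_buffer);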

Note

SpSM requires three stages to complete. The first stage rocsparse_spsm_stage_buffer_size will return the size of the temporary storage buffer that is required for subsequent calls. The second stage rocsparse_spsm_stage_preprocess will preprocess data that would be saved in the temporary storage buffer. In the final stage rocsparse_spsm_stage_compute, the actual computation is performed.

Note

If rocsparse_spsm_stage_auto is selected, rocSPARSE will automatically detect which stage is required based on the following indicators: If temp_buffer is equal to nullptr, the required buffer size will be returned. If buffer_size is equal to nullptr, analysis will be performed. Otherwise, the SpSM preprocess and the SpSM algorithm will be executed.

Note

Only the rocsparse_spsm_stage_buffer_size stage and the rocsparse_spsm_stage_compute stage are non blocking and executed asynchronously with respect to the host. They may return before the actual computation has finished. The rocsparse_spsm_stage_preprocess stage is blocking with respect to the host.

Note

Currently, only trans_A == rocsparse_operation_none and trans_A == rocsparse_operation_transpose are supported. Likewise, only trans_B == rocsparse_operation_none and trans_B == rocsparse_operation_transpose are supported.

Note

Only the rocsparse_spsm_stage_buffer_size stage and the rocsparse_spsm_stage_compute stage support execution in a hipGraph context. The rocsparse_spsm_stage_preprocess stage does not support hipGraph.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • trans_A[in] matrix operation type for the sparse matrix A.

  • trans_B[in] matrix operation type for the dense matrix B.

  • alpha[in] scalar \(\alpha\).

  • matA[in] sparse matrix descriptor.

  • matB[in] dense matrix descriptor.

  • matC[inout] dense matrix descriptor.

  • compute_type[in] floating point precision for the SpSM computation.

  • alg[in] SpSM algorithm for the SpSM computation.

  • stage[in] SpSM stage for the SpSM computation.

  • buffer_size[out] number of bytes of the temporary storage buffer.

  • temp_buffer[in] temporary storage buffer allocated by the user. When a nullptr is passed, the required allocation size (in bytes) is written to buffer_size and function returns without performing the SpSM operation.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointer – alpha, matA, matB, matC or buffer_size pointer is invalid.

  • rocsparse_status_not_implemented – trans_A, trans_B, compute_type, stage or alg is currently not supported.

rocsparse_spmm()#

rocsparse_status rocsparse_spmm(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, const void *alpha, const rocsparse_spmat_descr mat_A, const rocsparse_dnmat_descr mat_B, const void *beta, const rocsparse_dnmat_descr mat_C, rocsparse_datatype compute_type, rocsparse_spmm_alg alg, rocsparse_spmm_stage stage, size_t *buffer_size, void *temp_buffer)#

Sparse matrix dense matrix multiplication, extension routine.

rocsparse_spmm (or rocsparse_spmm_ex) multiplies the scalar \(\alpha\) with a sparse \(m \times k\) matrix \(A\), defined in CSR, COO or Blocked ELL storage format, and the dense \(k \times n\) matrix \(B\) and adds the result to the dense \(m \times n\) matrix \(C\) that is multiplied by the scalar \(\beta\), such that

\[ C := \alpha \cdot op(A) \cdot op(B) + \beta \cdot C, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans_A == rocsparse_operation_none} \\ A^T, & \text{if trans_A == rocsparse_operation_transpose} \\ A^H, & \text{if trans_A == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
and
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if trans_B == rocsparse_operation_none} \\ B^T, & \text{if trans_B == rocsparse_operation_transpose} \\ B^H, & \text{if trans_B == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

Example

This example performs sparse matrix-dense matrix multiplication, C = alpha * A * B + beta * C

//     1 4 0 0 0 0
// A = 0 2 3 0 0 0
//     5 0 0 7 8 0
//     0 0 9 0 6 0

//     1 4 2
//     1 2 3
// B = 5 4 0
//     3 1 9
//     1 2 2
//     0 3 0

//     1 1 5
// C = 1 2 1
//     1 3 1
//     6 2 4

rocsparse_int m   = 4;
rocsparse_int k   = 6;
rocsparse_int n   = 3;

csr_row_ptr[m + 1] = {0, 2, 4, 7, 9};                          // device memory
csr_col_ind[nnz]   = {0, 1, 1, 2, 0, 3, 4, 2, 4};              // device memory
csr_val[nnz]       = {1, 4, 2, 3, 5, 7, 8, 9, 6};              // device memory

B[k * n]       = {1, 1, 5, 3, 1, 0, 4, 2, 4, 1, 2, 3, 2, 3, 0, 9, 2, 0};     // device memory
C[m * n]       = {1, 1, 1, 6, 1, 2, 3, 2, 5, 1, 1, 4};                       // device memory

rocsparse_int nnz = csr_row_ptr[m] - csr_row_ptr[0];

float alpha = 1.0f;
float beta  = 0.0f;

// Create CSR arrays on device
rocsparse_int* csr_row_ptr;
rocsparse_int* csr_col_ind;
float* csr_val;
float* B;
float* C;
hipMalloc((void**)&csr_row_ptr, sizeof(rocsparse_int) * (m + 1));
hipMalloc((void**)&csr_col_ind, sizeof(rocsparse_int) * nnz);
hipMalloc((void**)&csr_val, sizeof(float) * nnz);
hipMalloc((void**)&B, sizeof(float) * k * n);
hipMalloc((void**)&C, sizeof(float) * m * n);

// Create rocsparse handle
rocsparse_local_handle handle;

// Types
rocsparse_indextype itype = rocsparse_indextype_i32;
rocsparse_indextype jtype = rocsparse_indextype_i32;
rocsparse_datatype  ttype = rocsparse_datatype_f32_r;

// Create descriptors
rocsparse_spmat_descr mat_A;
rocsparse_dnmat_descr mat_B;
rocsparse_dnmat_descr mat_C;

rocsparse_create_csr_descr(&mat_A, m, k, nnz, csr_row_ptr, csr_col_ind, csr_val, itype, jtype, rocsparse_index_base_zero, ttype);
rocsparse_create_dnmat_descr(&mat_B, k, n, k, B, ttype, rocsparse_order_column);
rocsparse_create_dnmat_descr(&mat_C, m, n, m, C, ttype, rocsparse_order_column);

// Query SpMM buffer
size_t buffer_size;
rocsparse_spmm(handle,
               rocsparse_operation_none,
               rocsparse_operation_none,
               &alpha,
               mat_A,
               mat_B,
               &beta,
               mat_C,
               ttype,
               rocsparse_spmm_alg_default,
               rocsparse_spmm_stage_buffer_size,
               &buffer_size,
               nullptr);

// Allocate buffer
void* buffer;
hipMalloc(&buffer, buffer_size);

rocsparse_spmm(handle,
               rocsparse_operation_none,
               rocsparse_operation_none,
               &alpha,
               mat_A,
               mat_B,
               &beta,
               mat_C,
               ttype,
               rocsparse_spmm_alg_default,
               rocsparse_spmm_stage_preprocess,
               &buffer_size,
               buffer);

// Pointer mode host
rocsparse_spmm(handle,
               rocsparse_operation_none,
               rocsparse_operation_none,
               &alpha,
               mat_A,
               mat_B,
               &beta,
               mat_C,
               ttype,
               rocsparse_spmm_alg_default,
               rocsparse_spmm_stage_compute,
               &buffer_size,
               buffer);

// Clear up on device
hipFree(csr_row_ptr);
hipFree(csr_col_ind);
hipFree(csr_val);
hipFree(B);
hipFree(C);
hipFree(buffer);

rocsparse_destroy_spmat_descr(mat_A);
rocsparse_destroy_dnmat_descr(mat_B);
rocsparse_destroy_dnmat_descr(mat_C);

Example

SpMM also supports batched computation for CSR and COO matrices. There are three supported batch modes:

    C_i = A * B_i
    C_i = A_i * B
    C_i = A_i * B_i

The batch mode is determined by the batch count and stride passed for each matrix. For example, to use the first batch mode (C_i = A * B_i) with 100 batches for non-transposed A, B, and C, one passes:

    batch_count_A = 1
    batch_count_B = 100
    batch_count_C = 100
    offsets_batch_stride_A        = 0
    columns_values_batch_stride_A = 0
    batch_stride_B                = k * n
    batch_stride_C                = m * n

To use the second batch mode (C_i = A_i * B), one could use:

    batch_count_A = 100
    batch_count_B = 1
    batch_count_C = 100
    offsets_batch_stride_A        = m + 1
    columns_values_batch_stride_A = nnz
    batch_stride_B                = 0
    batch_stride_C                = m * n

And to use the third batch mode (C_i = A_i * B_i), one could use:

    batch_count_A = 100
    batch_count_B = 100
    batch_count_C = 100
    offsets_batch_stride_A        = m + 1
    columns_values_batch_stride_A = nnz
    batch_stride_B                = k * n
    batch_stride_C                = m * n

An example of the first batch mode (C_i = A * B_i) is provided below.

//     1 4 0 0 0 0
// A = 0 2 3 0 0 0
//     5 0 0 7 8 0
//     0 0 9 0 6 0

rocsparse_int m   = 4;
rocsparse_int k   = 6;
rocsparse_int n   = 3;

csr_row_ptr[m + 1] = {0, 2, 4, 7, 9};                          // device memory
csr_col_ind[nnz]   = {0, 1, 1, 2, 0, 3, 4, 2, 4};              // device memory
csr_val[nnz]       = {1, 4, 2, 3, 5, 7, 8, 9, 6};              // device memory

B[batch_count_B * k * n]       = {...};     // device memory
C[batch_count_C * m * n]       = {...};     // device memory

rocsparse_int nnz = csr_row_ptr[m] - csr_row_ptr[0];

rocsparse_int batch_count_A = 1;
rocsparse_int batch_count_B = 100;
rocsparse_int batch_count_C = 100;

rocsparse_int offsets_batch_stride_A        = 0;
rocsparse_int columns_values_batch_stride_A = 0;
rocsparse_int batch_stride_B                = k * n;
rocsparse_int batch_stride_C                = m * n;

float alpha = 1.0f;
float beta  = 0.0f;

// Create CSR arrays on device
rocsparse_int* csr_row_ptr;
rocsparse_int* csr_col_ind;
float* csr_val;
float* B;
float* C;
hipMalloc((void**)&csr_row_ptr, sizeof(rocsparse_int) * (m + 1));
hipMalloc((void**)&csr_col_ind, sizeof(rocsparse_int) * nnz);
hipMalloc((void**)&csr_val, sizeof(float) * nnz);
hipMalloc((void**)&B, sizeof(float) * batch_count_B * k * n);
hipMalloc((void**)&C, sizeof(float) * batch_count_C * m * n);

// Create rocsparse handle
rocsparse_local_handle handle;

// Types
rocsparse_indextype itype = rocsparse_indextype_i32;
rocsparse_indextype jtype = rocsparse_indextype_i32;
rocsparse_datatype  ttype = rocsparse_datatype_f32_r;

// Create descriptors
rocsparse_spmat_descr mat_A;
rocsparse_dnmat_descr mat_B;
rocsparse_dnmat_descr mat_C;

rocsparse_create_csr_descr(&mat_A, m, k, nnz, csr_row_ptr, csr_col_ind, csr_val, itype, jtype, rocsparse_index_base_zero, ttype);
rocsparse_create_dnmat_descr(&mat_B, k, n, k, B, ttype, rocsparse_order_column);
rocsparse_create_dnmat_descr(&mat_C, m, n, m, C, ttype, rocsparse_order_column);

rocsparse_csr_set_strided_batch(mat_A, batch_count_A, offsets_batch_stride_A, columns_values_batch_stride_A);
rocsparse_dnmat_set_strided_batch(mat_B, batch_count_B, batch_stride_B);
rocsparse_dnmat_set_strided_batch(mat_C, batch_count_C, batch_stride_C);

// Query SpMM buffer
size_t buffer_size;
rocsparse_spmm(handle,
               rocsparse_operation_none,
               rocsparse_operation_none,
               &alpha,
               mat_A,
               mat_B,
               &beta,
               mat_C,
               ttype,
               rocsparse_spmm_alg_default,
               rocsparse_spmm_stage_buffer_size,
               &buffer_size,
               nullptr);

// Allocate buffer
void* buffer;
hipMalloc(&buffer, buffer_size);

rocsparse_spmm(handle,
               rocsparse_operation_none,
               rocsparse_operation_none,
               &alpha,
               mat_A,
               mat_B,
               &beta,
               mat_C,
               ttype,
               rocsparse_spmm_alg_default,
               rocsparse_spmm_stage_preprocess,
               &buffer_size,
               buffer);

// Pointer mode host
rocsparse_spmm(handle,
               rocsparse_operation_none,
               rocsparse_operation_none,
               &alpha,
               mat_A,
               mat_B,
               &beta,
               mat_C,
               ttype,
               rocsparse_spmm_alg_default,
               rocsparse_spmm_stage_compute,
               &buffer_size,
               buffer);

// Clear up on device
hipFree(csr_row_ptr);
hipFree(csr_col_ind);
hipFree(csr_val);
hipFree(B);
hipFree(C);
hipFree(buffer);

rocsparse_destroy_spmat_descr(mat_A);
rocsparse_destroy_dnmat_descr(mat_B);
rocsparse_destroy_dnmat_descr(mat_C);

Note

Only the rocsparse_spmm_stage_buffer_size stage and the rocsparse_spmm_stage_compute stage are non blocking and executed asynchronously with respect to the host. They may return before the actual computation has finished. The rocsparse_spmm_stage_preprocess stage is blocking with respect to the host.

Note

Currently, only trans_A == rocsparse_operation_none is supported for COO and Blocked ELL formats.

Note

Only the rocsparse_spmm_stage_buffer_size stage and the rocsparse_spmm_stage_compute stage support execution in a hipGraph context. The rocsparse_spmm_stage_preprocess stage does not support hipGraph.

Note

Currently, only CSR, COO and Blocked ELL sparse formats are supported.

Note

Different algorithms are available which can provide better performance for different matrices. Currently, the available algorithms are rocsparse_spmm_alg_csr, rocsparse_spmm_alg_csr_row_split or rocsparse_spmm_alg_csr_merge for CSR matrices, rocsparse_spmm_alg_bell for Blocked ELL matrices and rocsparse_spmm_alg_coo_segmented or rocsparse_spmm_alg_coo_atomic for COO matrices. Additionally, one can specify the algorithm to be rocsparse_spmm_alg_default. In the case of CSR matrices this will set the algorithm to be rocsparse_spmm_alg_csr, in the case of Blocked ELL matrices this will set the algorithm to be rocsparse_spmm_alg_bell and for COO matrices it will set the algorithm to be rocsparse_spmm_alg_coo_atomic. When A is transposed, rocsparse_spmm will revert to using rocsparse_spmm_alg_csr for CSR format and rocsparse_spmm_alg_coo_atomic for COO format regardless of algorithm selected.

Note

This function writes the required allocation size (in bytes) to buffer_size and returns without performing the SpMM operation, when a nullptr is passed for temp_buffer.

Note

SpMM requires three stages to complete. The first stage rocsparse_spmm_stage_buffer_size will return the size of the temporary storage buffer that is required for subsequent calls to rocsparse_spmm (or rocsparse_spmm_ex). The second stage rocsparse_spmm_stage_preprocess will preprocess data that would be saved in the temporary storage buffer. In the final stage rocsparse_spmm_stage_compute, the actual computation is performed.

Note

If rocsparse_spmm_stage_auto is selected, rocSPARSE will automatically detect which stage is required based on the following indicators: If temp_buffer is equal to nullptr, the required buffer size will be returned. Else, the SpMM preprocess and the SpMM algorithm will be executed.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • trans_A[in] matrix operation type.

  • trans_B[in] matrix operation type.

  • alpha[in] scalar \(\alpha\).

  • mat_A[in] matrix descriptor.

  • mat_B[in] matrix descriptor.

  • beta[in] scalar \(\beta\).

  • mat_C[inout] matrix descriptor.

  • compute_type[in] floating point precision for the SpMM computation.

  • alg[in] SpMM algorithm for the SpMM computation.

  • stage[in] SpMM stage for the SpMM computation.

  • buffer_size[out] number of bytes of the temporary storage buffer. buffer_size is set when temp_buffer is nullptr.

  • temp_buffer[in] temporary storage buffer allocated by the user. When a nullptr is passed, the required allocation size (in bytes) is written to buffer_size and function returns without performing the SpMM operation.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointer – alpha, mat_A, mat_B, mat_C, beta, or buffer_size pointer is invalid.

  • rocsparse_status_not_implemented – trans_A, trans_B, compute_type or alg is currently not supported.

rocsparse_spgemm()#

rocsparse_status rocsparse_spgemm(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, const void *alpha, const rocsparse_spmat_descr A, const rocsparse_spmat_descr B, const void *beta, const rocsparse_spmat_descr D, rocsparse_spmat_descr C, rocsparse_datatype compute_type, rocsparse_spgemm_alg alg, rocsparse_spgemm_stage stage, size_t *buffer_size, void *temp_buffer)#

Sparse matrix sparse matrix multiplication.

rocsparse_spgemm multiplies the scalar \(\alpha\) with the sparse \(m \times k\) matrix \(A\) and the sparse \(k \times n\) matrix \(B\) and adds the result to the sparse \(m \times n\) matrix \(D\) that is multiplied by \(\beta\). The final result is stored in the sparse \(m \times n\) matrix \(C\), such that

\[ C := \alpha \cdot op(A) \cdot op(B) + \beta \cdot D, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans_A == rocsparse_operation_none} \\ A^T, & \text{if trans_A == rocsparse_operation_transpose} \\ A^H, & \text{if trans_A == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
and
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if trans_B == rocsparse_operation_none} \\ B^T, & \text{if trans_B == rocsparse_operation_transpose} \\ B^H, & \text{if trans_B == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
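
Example

A sketch of the CSR SpGEMM staging sequence added for illustration. matA, matB and matD are assumed to be existing m x k, k x n and m x n CSR descriptors, and dcsr_row_ptr_C an already allocated device array of m + 1 entries:

float alpha = 1.0f;
float beta  = 0.0f;

// Create C with its row pointer array; column indices and values are attached later
rocsparse_spmat_descr matC;
rocsparse_create_csr_descr(&matC, m, n, 0, dcsr_row_ptr_C, nullptr, nullptr,
                           rocsparse_indextype_i32, rocsparse_indextype_i32,
                           rocsparse_index_base_zero, rocsparse_datatype_f32_r);

// Stage 1: query the required temporary buffer size
size_t buffer_size;
rocsparse_spgemm(handle, rocsparse_operation_none, rocsparse_operation_none,
                 &alpha, matA, matB, &beta, matD, matC,
                 rocsparse_datatype_f32_r, rocsparse_spgemm_alg_default,
                 rocsparse_spgemm_stage_buffer_size, &buffer_size, nullptr);

void* temp_buffer;
hipMalloc(&temp_buffer, buffer_size);

// Stage 2: determine the number of non-zeros of C
rocsparse_spgemm(handle, rocsparse_operation_none, rocsparse_operation_none,
                 &alpha, matA, matB, &beta, matD, matC,
                 rocsparse_datatype_f32_r, rocsparse_spgemm_alg_default,
                 rocsparse_spgemm_stage_nnz, &buffer_size, temp_buffer);

// Allocate the column index and value arrays of C and attach them to the descriptor
int64_t rows_C, cols_C, nnz_C;
rocsparse_spmat_get_size(matC, &rows_C, &cols_C, &nnz_C);

int*   dcsr_col_ind_C;
float* dcsr_val_C;
hipMalloc((void**)&dcsr_col_ind_C, sizeof(int) * nnz_C);
hipMalloc((void**)&dcsr_val_C, sizeof(float) * nnz_C);
rocsparse_csr_set_pointers(matC, dcsr_row_ptr_C, dcsr_col_ind_C, dcsr_val_C);

// Stage 3: compute C := alpha * A * B + beta * D
rocsparse_spgemm(handle, rocsparse_operation_none, rocsparse_operation_none,
                 &alpha, matA, matB, &beta, matD, matC,
                 rocsparse_datatype_f32_r, rocsparse_spgemm_alg_default,
                 rocsparse_spgemm_stage_compute, &buffer_size, temp_buffer);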

Note

SpGEMM requires three stages to complete. The first stage rocsparse_spgemm_stage_buffer_size will return the size of the temporary storage buffer that is required for subsequent calls to rocsparse_spgemm. The second stage rocsparse_spgemm_stage_nnz will determine the number of non-zero elements of the resulting \(C\) matrix. If the sparsity pattern of \(C\) is already known, this stage can be skipped. In the final stage rocsparse_spgemm_stage_compute, the actual computation is performed.

Note

If rocsparse_spgemm_stage_auto is selected, rocSPARSE will automatically detect which stage is required based on the following indicators: If temp_buffer is equal to nullptr, the required buffer size will be returned. Else, if the number of non-zeros of \(C\) is zero, the number of non-zero entries will be computed. Else, the SpGEMM algorithm will be executed.

Note

If \(\alpha == 0\), then \(C = \beta \cdot D\) will be computed.

Note

If \(\beta == 0\), then \(C = \alpha \cdot op(A) \cdot op(B)\) will be computed.

Note

Currently only CSR and BSR formats are supported.

Note

If rocsparse_spgemm_stage_symbolic is selected then the symbolic computation is performed only.

Note

If rocsparse_spgemm_stage_numeric is selected then the numeric computation is performed only.

Note

For the rocsparse_spgemm_stage_symbolic and rocsparse_spgemm_stage_numeric stages, only CSR matrix format is currently supported.

Note

\(\alpha == \beta == 0\) is invalid.

Note

It is allowed to pass the same sparse matrix for \(C\) and \(D\), if both matrices have the same sparsity pattern.

Note

Currently, only trans_A == rocsparse_operation_none is supported.

Note

Currently, only trans_B == rocsparse_operation_none is supported.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Note that for the rare case of matrix products with more than 4096 non-zero entries per row, an additional temporary storage buffer is allocated by the algorithm.

Note

This routine does not support execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • trans_A[in] sparse matrix \(A\) operation type.

  • trans_B[in] sparse matrix \(B\) operation type.

  • alpha[in] scalar \(\alpha\).

  • A[in] sparse matrix \(A\) descriptor.

  • B[in] sparse matrix \(B\) descriptor.

  • beta[in] scalar \(\beta\).

  • D[in] sparse matrix \(D\) descriptor.

  • C[out] sparse matrix \(C\) descriptor.

  • compute_type[in] floating point precision for the SpGEMM computation.

  • alg[in] SpGEMM algorithm for the SpGEMM computation.

  • stage[in] SpGEMM stage for the SpGEMM computation.

  • buffer_size[out] number of bytes of the temporary storage buffer. buffer_size is set when temp_buffer is nullptr.

  • temp_buffer[in] temporary storage buffer allocated by the user. When a nullptr is passed, the required allocation size (in bytes) is written to buffer_size and function returns without performing the SpGEMM operation.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointer – alpha and beta are invalid, or A, B, D, C or buffer_size pointer is invalid.

  • rocsparse_status_memory_error – additional buffer for long rows could not be allocated.

  • rocsparse_status_not_implemented – trans_A != rocsparse_operation_none or trans_B != rocsparse_operation_none.

rocsparse_sddmm_buffer_size()#

rocsparse_status rocsparse_sddmm_buffer_size(rocsparse_handle handle, rocsparse_operation opA, rocsparse_operation opB, const void *alpha, const rocsparse_dnmat_descr A, const rocsparse_dnmat_descr B, const void *beta, rocsparse_spmat_descr C, rocsparse_datatype compute_type, rocsparse_sddmm_alg alg, size_t *buffer_size)#

Calculate the size in bytes of the required buffer for the use of rocsparse_sddmm and rocsparse_sddmm_preprocess.

rocsparse_sddmm_buffer_size returns the size of the required buffer to execute the SDDMM operation from a given configuration.

Note

This routine does not support execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • opA[in] dense matrix \(A\) operation type.

  • opB[in] dense matrix \(B\) operation type.

  • alpha[in] scalar \(\alpha\).

  • A[in] dense matrix \(A\) descriptor.

  • B[in] dense matrix \(B\) descriptor.

  • beta[in] scalar \(\beta\).

  • C[inout] sparse matrix \(C\) descriptor.

  • compute_type[in] floating point precision for the SDDMM computation.

  • alg[in] specification of the algorithm to use.

  • buffer_size[out] number of bytes of the temporary storage buffer.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_value – the value of opA or opB is incorrect.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointer – alpha and beta are invalid, or A, B, C or buffer_size pointer is invalid.

  • rocsparse_status_not_implemented – opA == rocsparse_operation_conjugate_transpose or opB == rocsparse_operation_conjugate_transpose.

rocsparse_sddmm_preprocess()#

rocsparse_status rocsparse_sddmm_preprocess(rocsparse_handle handle, rocsparse_operation opA, rocsparse_operation opB, const void *alpha, const rocsparse_dnmat_descr A, const rocsparse_dnmat_descr B, const void *beta, rocsparse_spmat_descr C, rocsparse_datatype compute_type, rocsparse_sddmm_alg alg, void *temp_buffer)#

Preprocess data before the use of rocsparse_sddmm.

rocsparse_sddmm_preprocess executes the part of the algorithm that can be computed once and reused across multiple calls to rocsparse_sddmm with the same sparsity pattern.

Note

This routine does not support execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • opA[in] dense matrix \(A\) operation type.

  • opB[in] dense matrix \(B\) operation type.

  • alpha[in] scalar \(\alpha\).

  • A[in] dense matrix \(A\) descriptor.

  • B[in] dense matrix \(B\) descriptor.

  • beta[in] scalar \(\beta\).

  • C[inout] sparse matrix \(C\) descriptor.

  • compute_type[in] floating point precision for the SDDMM computation.

  • alg[in] specification of the algorithm to use.

  • temp_buffer[in] temporary storage buffer allocated by the user. The size must be greater than or equal to the size obtained with rocsparse_sddmm_buffer_size.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_value – the value of opA or opB is incorrect.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointer – alpha and beta are invalid, or A, B, C or temp_buffer pointer is invalid.

  • rocsparse_status_not_implemented – opA == rocsparse_operation_conjugate_transpose or opB == rocsparse_operation_conjugate_transpose.

rocsparse_sddmm()#

rocsparse_status rocsparse_sddmm(rocsparse_handle handle, rocsparse_operation opA, rocsparse_operation opB, const void *alpha, const rocsparse_dnmat_descr A, const rocsparse_dnmat_descr B, const void *beta, rocsparse_spmat_descr C, rocsparse_datatype compute_type, rocsparse_sddmm_alg alg, void *temp_buffer)#

Sampled Dense-Dense Matrix Multiplication.

rocsparse_sddmm multiplies the scalar \(\alpha\) with the dense \(m \times k\) matrix \(A\), the dense \(k \times n\) matrix \(B\), filtered by the sparsity pattern of the \(m \times n\) sparse matrix \(C\) and adds the result to \(C\) scaled by \(\beta\). The final result is stored in the sparse \(m \times n\) matrix \(C\), such that

\[ C := \alpha ( opA(A) \cdot opB(B) ) \cdot spy(C) + \beta C, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if opA == rocsparse_operation_none} \\ A^T, & \text{if opA == rocsparse_operation_transpose} \\ \end{array} \right. \end{split}\]
,
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if opB == rocsparse_operation_none} \\ B^T, & \text{if opB == rocsparse_operation_transpose} \\ \end{array} \right. \end{split}\]
and
\[\begin{split} spy(C)_{ij} = \left\{ \begin{array}{ll} 1, & \text{if } C_{ij} \text{ is in the sparsity pattern of } C \\ 0, & \text{otherwise} \\ \end{array} \right. \end{split}\]
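
Example

A sketch of the buffer size / preprocess / compute sequence added for illustration. matA and matB are assumed to be existing m x k and k x n dense matrix descriptors and matC an existing m x n CSR descriptor whose sparsity pattern selects the entries to be computed:

float alpha = 1.0f;
float beta  = 0.0f;

// Query the required temporary buffer size
size_t buffer_size;
rocsparse_sddmm_buffer_size(handle, rocsparse_operation_none, rocsparse_operation_none,
                            &alpha, matA, matB, &beta, matC,
                            rocsparse_datatype_f32_r, rocsparse_sddmm_alg_default, &buffer_size);

void* temp_buffer;
hipMalloc(&temp_buffer, buffer_size);

// Optional: preprocess once and reuse for repeated products with the same sparsity pattern
rocsparse_sddmm_preprocess(handle, rocsparse_operation_none, rocsparse_operation_none,
                           &alpha, matA, matB, &beta, matC,
                           rocsparse_datatype_f32_r, rocsparse_sddmm_alg_default, temp_buffer);

// C := alpha * (A * B) restricted to the sparsity pattern of C, plus beta * C
rocsparse_sddmm(handle, rocsparse_operation_none, rocsparse_operation_none,
                &alpha, matA, matB, &beta, matC,
                rocsparse_datatype_f32_r, rocsparse_sddmm_alg_default, temp_buffer);

hipDeviceSynchronize();
hipFree(temp_buffer);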

Note

opA == rocsparse_operation_conjugate_transpose is not supported.

Note

opB == rocsparse_operation_conjugate_transpose is not supported.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • opA[in] dense matrix \(A\) operation type.

  • opB[in] dense matrix \(B\) operation type.

  • alpha[in] scalar \(\alpha\).

  • A[in] dense matrix \(A\) descriptor.

  • B[in] dense matrix \(B\) descriptor.

  • beta[in] scalar \(\beta\).

  • C[inout] sparse matrix \(C\) descriptor.

  • compute_type[in] floating point precision for the SDDMM computation.

  • alg[in] specification of the algorithm to use.

  • temp_buffer[in] temporary storage buffer allocated by the user. The size must be greater than or equal to the size obtained with rocsparse_sddmm_buffer_size.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_value – the value of opA, opB, compute_type or alg is incorrect.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointer – alpha and beta are invalid, or A, B, C or temp_buffer pointer is invalid.

  • rocsparse_status_not_implemented – opA == rocsparse_operation_conjugate_transpose or opB == rocsparse_operation_conjugate_transpose.

rocsparse_dense_to_sparse()#

rocsparse_status rocsparse_dense_to_sparse(rocsparse_handle handle, const rocsparse_dnmat_descr mat_A, rocsparse_spmat_descr mat_B, rocsparse_dense_to_sparse_alg alg, size_t *buffer_size, void *temp_buffer)#

Dense matrix to sparse matrix conversion.

rocsparse_dense_to_sparse performs the conversion of a dense matrix to a sparse matrix in CSR, CSC, or COO format.

Note

This function writes the required allocation size (in bytes) to buffer_size and returns without performing the dense to sparse operation, when a nullptr is passed for temp_buffer.

Note

This function is blocking with respect to the host.

Note

This routine does not support execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • mat_A[in] dense matrix descriptor.

  • mat_B[in] sparse matrix descriptor.

  • alg[in] algorithm for the dense to sparse computation.

  • buffer_size[out] number of bytes of the temporary storage buffer. buffer_size is set when temp_buffer is nullptr.

  • temp_buffer[in] temporary storage buffer allocated by the user. When a nullptr is passed, the required allocation size (in bytes) is written to buffer_size and function returns without performing the dense to sparse operation.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointer – mat_A, mat_B, or buffer_size pointer is invalid.

rocsparse_sparse_to_dense()#

rocsparse_status rocsparse_sparse_to_dense(rocsparse_handle handle, const rocsparse_spmat_descr mat_A, rocsparse_dnmat_descr mat_B, rocsparse_sparse_to_dense_alg alg, size_t *buffer_size, void *temp_buffer)#

Sparse matrix to dense matrix conversion.

rocsparse_sparse_to_dense performs the conversion of a sparse matrix in CSR, CSC, or COO format to a dense matrix.
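
Example

A minimal usage sketch added for illustration. matA is assumed to be an existing CSR descriptor and matB a dense matrix descriptor whose value array has already been allocated on the device:

// First call with temp_buffer == nullptr only queries the buffer size
size_t buffer_size;
rocsparse_sparse_to_dense(handle, matA, matB, rocsparse_sparse_to_dense_alg_default,
                          &buffer_size, nullptr);

void* temp_buffer;
hipMalloc(&temp_buffer, buffer_size);

// Second call performs the conversion (blocking with respect to the host)
rocsparse_sparse_to_dense(handle, matA, matB, rocsparse_sparse_to_dense_alg_default,
                          &buffer_size, temp_buffer);

hipFree(temp_buffer);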

Note

This function writes the required allocation size (in bytes) to buffer_size and returns without performing the sparse to dense operation, when a nullptr is passed for temp_buffer.

Note

This function is blocking with respect to the host.

Note

This routine does not support execution in a hipGraph context.

Parameters:
  • handle[in] handle to the rocsparse library context queue.

  • mat_A[in] sparse matrix descriptor.

  • mat_B[in] dense matrix descriptor.

  • alg[in] algorithm for the sparse to dense computation.

  • buffer_size[out] number of bytes of the temporary storage buffer. buffer_size is set when temp_buffer is nullptr.

  • temp_buffer[in] temporary storage buffer allocated by the user. When a nullptr is passed, the required allocation size (in bytes) is written to buffer_size and function returns without performing the sparse to dense operation.

Return values:
  • rocsparse_status_success – the operation completed successfully.

  • rocsparse_status_invalid_handle – the library context was not initialized.

  • rocsparse_status_invalid_pointer – mat_A, mat_B, or buffer_size pointer is invalid.