rocblas_ztrsm_strided_batched Interface Reference
BLAS Level 3 API.
Public Member Functions
integer(kind(rocblas_status_success)) function | rocblas_ztrsm_strided_batched_ (handle, side, uplo, transA, diag, m, n, alpha, A, lda, stride_a, B, ldb, stride_b, batch_count) |
integer(kind(rocblas_status_success)) function | rocblas_ztrsm_strided_batched_full_rank (handle, side, uplo, transA, diag, m, n, alpha, A, lda, stride_a, B, ldb, stride_b, batch_count) |
integer(kind(rocblas_status_success)) function | rocblas_ztrsm_strided_batched_rank_0 (handle, side, uplo, transA, diag, m, n, alpha, A, lda, stride_a, B, ldb, stride_b, batch_count) |
integer(kind(rocblas_status_success)) function | rocblas_ztrsm_strided_batched_rank_1 (handle, side, uplo, transA, diag, m, n, alpha, A, lda, stride_a, B, ldb, stride_b, batch_count) |
Detailed Description
BLAS Level 3 API.
trsm_strided_batched performs the following strided batched operation:
op(A_i)*X_i = alpha*B_i or X_i*op(A_i) = alpha*B_i, for i = 1, ..., batch_count.
where alpha is a scalar, X and B are strided batched m by n matrices, A is a triangular strided batched matrix, and op(A) is one of
op( A ) = A or op( A ) = A^T or op( A ) = A^H.
Each matrix X_i is overwritten on B_i for i = 1, ..., batch_count.
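The operation above can be sketched on the host as a plain reference loop. This is only an illustration of the semantics, not the rocBLAS implementation, and it does not call rocBLAS or hipfort. It assumes column-major storage in flat Python lists, 0-based batch indices, and the single-letter flags "L"/"R", "U"/"L", "N"/"T"/"C" and "U"/"N" as hypothetical stand-ins for the rocblas_side, rocblas_fill, rocblas_operation and rocblas_diagonal values. The function names are hypothetical as well.

```python
def op_elem(A, lda, base, r, c, uplo, trans, diag):
    """Element (r, c) of op(A_i), reading only the stored triangle of A_i."""
    if trans != "N":
        r, c = c, r                      # op(A)[r][c] = A[c][r] for "T" and "C"
    if r == c and diag == "U":
        return 1 + 0j                    # unit diagonal: stored value is ignored
    stored = (r <= c) if uplo == "U" else (r >= c)
    if not stored:
        return 0j                        # outside the referenced triangle
    v = complex(A[base + r + c * lda])   # column-major element of A_i
    return v.conjugate() if trans == "C" else v


def solve_tri(T, rhs, upper):
    """Solve T x = rhs for a dense triangular T given as a list of rows."""
    k = len(rhs)
    x = [0j] * k
    rows = range(k - 1, -1, -1) if upper else range(k)
    for r in rows:
        cols = range(r + 1, k) if upper else range(r)
        s = sum(T[r][c] * x[c] for c in cols)
        x[r] = (rhs[r] - s) / T[r][r]
    return x


def trsm_strided_batched_ref(side, uplo, trans, diag, m, n, alpha,
                             A, lda, stride_a, B, ldb, stride_b, batch_count):
    """Overwrite each B_i with the X_i solving op(A_i) X_i = alpha B_i (side "L")
    or X_i op(A_i) = alpha B_i (side "R")."""
    k = m if side == "L" else n
    upper = (uplo == "U") == (trans == "N")   # is op(A_i) upper triangular?
    for i in range(batch_count):
        a0, b0 = i * stride_a, i * stride_b   # start of A_i and B_i
        T = [[op_elem(A, lda, a0, r, c, uplo, trans, diag) for c in range(k)]
             for r in range(k)]
        if side == "L":
            for j in range(n):                # one solve per column of B_i
                rhs = [alpha * B[b0 + r + j * ldb] for r in range(m)]
                x = solve_tri(T, rhs, upper)
                for r in range(m):
                    B[b0 + r + j * ldb] = x[r]
        else:
            # X T = alpha B is solved row by row as T^T x^T = alpha b^T.
            Tt = [[T[c][r] for c in range(k)] for r in range(k)]
            for r in range(m):
                rhs = [alpha * B[b0 + r + j * ldb] for j in range(n)]
                x = solve_tri(Tt, rhs, not upper)
                for j in range(n):
                    B[b0 + r + j * ldb] = x[j]


# Two 2x2 lower triangular systems; the 99 entries sit in the unreferenced triangle.
A = [2, 1, 99, 4,   1, 3, 99, 1j]
B = [1, 4.5,        0.5, 1.5 + 1j]
trsm_strided_batched_ref("L", "L", "N", "N", 2, 1, 2.0, A, 2, 4, B, 2, 2, 2)
print(B)  # B now holds X_0 followed by X_1
```

The sketch builds op(A_i) densely for readability; a real implementation works on the stored triangle in place.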
Note about memory allocation: Let k be m when side is rocblas_side_left and n when side is rocblas_side_right. When trsm is launched with k evenly divisible by the internal block size of 128 and no larger than 10 of these blocks (k <= 1280), the API uses memory pre-allocated in the handle to improve performance. This memory can be managed with the environment variable WORKBUF_TRSM_B_CHNK. When this variable is not set, the device memory used for temporary storage defaults to 1 MB, which may result in chunking and, in turn, reduced performance. In that case, it is recommended to set WORKBUF_TRSM_B_CHNK to the desired chunk of right-hand sides to be used at a time.
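The size condition in this note can be checked before launching a solve. The following is a minimal host-side sketch, assuming the block size of 128 and the limit of 10 blocks quoted above. The helper names are hypothetical, the strings "left" and "right" stand in for rocblas_side_left and rocblas_side_right, and the extra k > 0 requirement is an assumption of this sketch rather than something the note states.

```python
import os

TRSM_BLOCK = 128   # internal block size quoted in the note
MAX_BLOCKS = 10    # largest number of blocks quoted in the note


def trsm_k(side, m, n):
    """k is m for a left-side solve and n for a right-side solve."""
    return m if side == "left" else n


def fits_preallocated_workspace(side, m, n):
    """True when k is a positive multiple of 128 and at most 10 * 128.

    Requiring k > 0 is an assumption of this sketch; the note does not
    discuss empty problems.
    """
    k = trsm_k(side, m, n)
    return k > 0 and k % TRSM_BLOCK == 0 and k <= MAX_BLOCKS * TRSM_BLOCK


def workbuf_chunk_setting():
    """Raw value of WORKBUF_TRSM_B_CHNK, or None when it is not set."""
    return os.environ.get("WORKBUF_TRSM_B_CHNK")


print(fits_preallocated_workspace("left", 256, 7))    # k = 256
print(fits_preallocated_workspace("right", 256, 7))   # k = 7
print(workbuf_chunk_setting())
```

A check like this only predicts which case of the note applies; how the library actually sizes and chunks its workspace is internal to rocBLAS.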
Parameters
- [in] handle [rocblas_handle] handle to the rocblas library context queue.
- [in] side [rocblas_side] rocblas_side_left: op(A)*X = alpha*B. rocblas_side_right: X*op(A) = alpha*B.
- [in] uplo [rocblas_fill] rocblas_fill_upper: each A_i is an upper triangular matrix. rocblas_fill_lower: each A_i is a lower triangular matrix.
- [in] transA [rocblas_operation] rocblas_operation_none: op(A) = A. rocblas_operation_transpose: op(A) = A^T. rocblas_operation_conjugate_transpose: op(A) = A^H.
- [in] diag [rocblas_diagonal] rocblas_diagonal_unit: each A_i is assumed to be unit triangular. rocblas_diagonal_non_unit: each A_i is not assumed to be unit triangular.
- [in] m [rocblas_int] m specifies the number of rows of each B_i. m >= 0.
- [in] n [rocblas_int] n specifies the number of columns of each B_i. n >= 0.
- [in] alpha device pointer or host pointer specifying the scalar alpha. When alpha is zero, A is not referenced and B need not be set before entry.
- [in] A device pointer to the first matrix A_1. Each A_i is of dimension ( lda, k ), where k is m when side is rocblas_side_left and n when side is rocblas_side_right. Only the upper/lower triangular part is accessed.
- [in] lda [rocblas_int] lda specifies the first dimension of each A_i. If side = rocblas_side_left, lda >= max( 1, m ); if side = rocblas_side_right, lda >= max( 1, n ).
- [in] stride_a [rocblas_stride] stride from the start of one A_i matrix to the next A_(i + 1).
- [in,out] B device pointer to the first matrix B_1.
- [in] ldb [rocblas_int] ldb specifies the first dimension of each B_i. ldb >= max( 1, m ).
- [in] stride_b [rocblas_stride] stride from the start of one B_i matrix to the next B_(i + 1).
- [in] batch_count [rocblas_int] number of trsm operations in the batch.
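The size constraints listed for m, n, lda and ldb can be checked on the host before a call. The sketch below is an illustration under stated assumptions, not part of rocBLAS or hipfort: the function names are hypothetical, "left" and "right" stand in for rocblas_side_left and rocblas_side_right, and min_elements derives a buffer size from column-major storage and the stride definition rather than from anything the parameter list states directly.

```python
def check_trsm_strided_batched_args(side, m, n, lda, ldb):
    """Return a list of violated constraints; an empty list means none found."""
    problems = []
    if m < 0:
        problems.append("m must be >= 0")
    if n < 0:
        problems.append("n must be >= 0")
    k = m if side == "left" else n       # order of each triangular A_i
    if lda < max(1, k):
        problems.append(f"lda must be >= max(1, {k})")
    if ldb < max(1, m):
        problems.append(f"ldb must be >= max(1, {m})")
    return problems


def min_elements(ld, cols, stride, batch_count):
    """Smallest element count covering every matrix in a strided batch.

    Assumption of this sketch: matrix i starts at element i * stride and
    occupies ld * cols elements in column-major order.
    """
    if batch_count <= 0:
        return 0
    return (batch_count - 1) * stride + ld * cols


# Left-side solve with m = 4, n = 3 and a batch of 2.
print(check_trsm_strided_batched_args("left", 4, 3, lda=4, ldb=4))
print(min_elements(ld=4, cols=4, stride=16, batch_count=2))   # elements needed for A
print(min_elements(ld=4, cols=3, stride=12, batch_count=2))   # elements needed for B
```

Collecting all violations in a list, instead of stopping at the first one, makes it easier to report every bad argument at once.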
Member Function/Subroutine Documentation
◆ rocblas_ztrsm_strided_batched_()
integer(kind(rocblas_status_success)) function hipfort_rocblas::rocblas_ztrsm_strided_batched::rocblas_ztrsm_strided_batched_ | ( | type(c_ptr), value | handle, |
integer(kind(rocblas_side_left)), value | side, | ||
integer(kind(rocblas_fill_upper)), value | uplo, | ||
integer(kind(rocblas_operation_none)), value | transA, | ||
integer(kind(rocblas_diagonal_non_unit)), value | diag, | ||
integer(c_int), value | m, | ||
integer(c_int), value | n, | ||
complex(c_double_complex) | alpha, | ||
type(c_ptr), value | A, | ||
integer(c_int), value | lda, | ||
integer(c_int64_t), value | stride_a, | ||
type(c_ptr), value | B, | ||
integer(c_int), value | ldb, | ||
integer(c_int64_t), value | stride_b, | ||
integer(c_int), value | batch_count | ||
) |
◆ rocblas_ztrsm_strided_batched_full_rank()
integer(kind(rocblas_status_success)) function hipfort_rocblas::rocblas_ztrsm_strided_batched::rocblas_ztrsm_strided_batched_full_rank | ( | type(c_ptr) | handle, |
integer(kind(rocblas_side_left)) | side, | ||
integer(kind(rocblas_fill_upper)) | uplo, | ||
integer(kind(rocblas_operation_none)) | transA, | ||
integer(kind(rocblas_diagonal_non_unit)) | diag, | ||
integer(c_int) | m, | ||
integer(c_int) | n, | ||
complex(c_double_complex) | alpha, | ||
complex(c_double_complex), dimension(:,:), target | A, | ||
integer(c_int) | lda, | ||
integer(c_int64_t) | stride_a, | ||
complex(c_double_complex), dimension(:,:), target | B, | ||
integer(c_int) | ldb, | ||
integer(c_int64_t) | stride_b, | ||
integer(c_int) | batch_count | ||
) |
◆ rocblas_ztrsm_strided_batched_rank_0()
integer(kind(rocblas_status_success)) function hipfort_rocblas::rocblas_ztrsm_strided_batched::rocblas_ztrsm_strided_batched_rank_0 | ( | type(c_ptr) | handle, |
integer(kind(rocblas_side_left)) | side, | ||
integer(kind(rocblas_fill_upper)) | uplo, | ||
integer(kind(rocblas_operation_none)) | transA, | ||
integer(kind(rocblas_diagonal_non_unit)) | diag, | ||
integer(c_int) | m, | ||
integer(c_int) | n, | ||
complex(c_double_complex) | alpha, | ||
complex(c_double_complex), target | A, | ||
integer(c_int) | lda, | ||
integer(c_int64_t) | stride_a, | ||
complex(c_double_complex), target | B, | ||
integer(c_int) | ldb, | ||
integer(c_int64_t) | stride_b, | ||
integer(c_int) | batch_count | ||
) |
◆ rocblas_ztrsm_strided_batched_rank_1()
integer(kind(rocblas_status_success)) function hipfort_rocblas::rocblas_ztrsm_strided_batched::rocblas_ztrsm_strided_batched_rank_1 | ( | type(c_ptr) | handle, |
integer(kind(rocblas_side_left)) | side, | ||
integer(kind(rocblas_fill_upper)) | uplo, | ||
integer(kind(rocblas_operation_none)) | transA, | ||
integer(kind(rocblas_diagonal_non_unit)) | diag, | ||
integer(c_int) | m, | ||
integer(c_int) | n, | ||
complex(c_double_complex) | alpha, | ||
complex(c_double_complex), dimension(:), target | A, | ||
integer(c_int) | lda, | ||
integer(c_int64_t) | stride_a, | ||
complex(c_double_complex), dimension(:), target | B, | ||
integer(c_int) | ldb, | ||
integer(c_int64_t) | stride_b, | ||
integer(c_int) | batch_count | ||
) |
The documentation for this interface was generated from the following file: