API#
This section provides a detailed list of the library API.
Host Utility Functions#
-
template<typename DataType>
void rocalution::allocate_host(int size, DataType **ptr)#
Allocate buffer on the host.
allocate_host allocates a buffer on the host.
- Parameters:
size – [in] number of elements the buffer needs to be allocated for
ptr – [out] pointer to the position in memory where the buffer should be allocated; it is expected that *ptr == NULL
- Template Parameters:
DataType – can be char, int, unsigned int, float, double, std::complex<float> or std::complex<double>.
-
template<typename DataType>
void rocalution::free_host(DataType **ptr)#
Free buffer on the host.
free_host deallocates a buffer on the host. *ptr will be set to NULL after successful deallocation.
- Parameters:
ptr – [inout] pointer to the position in memory where the buffer should be deallocated; it is expected that *ptr != NULL
- Template Parameters:
DataType – can be char, int, unsigned int, float, double, std::complex<float> or std::complex<double>.
-
template<typename DataType>
void rocalution::set_to_zero_host(int size, DataType *ptr)#
Set a host buffer to zero.
set_to_zero_host sets a host buffer to zero.
- Parameters:
size – [in] number of elements
ptr – [inout] pointer to the host buffer
- Template Parameters:
DataType – can be char, int, unsigned int, float, double, std::complex<float> or std::complex<double>.
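The pointer contract of the three host utility functions above can be illustrated with a minimal sketch. The `sketch_*` functions below are hypothetical stand-ins written only for this illustration; they are not the rocALUTION implementation and only mirror the documented expectations (`*ptr == NULL` before allocation, `*ptr` reset to NULL after deallocation).

```cpp
#include <cassert>

// Hypothetical stand-in: allocates 'size' elements and stores the address in *ptr.
// The documented expectation is that *ptr is NULL on entry.
template <typename DataType>
void sketch_allocate_host(int size, DataType** ptr)
{
    assert(ptr != nullptr && *ptr == nullptr);
    if(size > 0)
    {
        *ptr = new DataType[size];
    }
}

// Hypothetical stand-in: sets 'size' elements of the host buffer to zero.
template <typename DataType>
void sketch_set_to_zero_host(int size, DataType* ptr)
{
    for(int i = 0; i < size; ++i)
    {
        ptr[i] = DataType();
    }
}

// Hypothetical stand-in: frees the buffer and resets *ptr to NULL.
// The documented expectation is that *ptr is not NULL on entry.
template <typename DataType>
void sketch_free_host(DataType** ptr)
{
    assert(ptr != nullptr && *ptr != nullptr);
    delete[] *ptr;
    *ptr = nullptr;
}
```

Passing the address of the pointer (rather than the pointer itself) is what allows the free function to reset the caller's pointer, so a stale address cannot be reused by accident.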
-
double rocalution::rocalution_time(void)#
Return current time in microseconds.
Backend Manager#
-
int rocalution::init_rocalution(int rank = -1, int dev_per_node = 1)#
Initialize rocALUTION platform.
init_rocalution defines a backend descriptor with information about the hardware and its specifications. All objects created after that contain a copy of this descriptor. If the specifications of the global descriptor are changed (e.g. a different number of threads is set) and new objects are created, only the new objects will use the new configuration.
For control, the library provides the following functions:
set_device_rocalution() is a unified function to select a specific device. If the library has been compiled with a backend that offers several devices, this function selects a particular one. It has to be called before init_rocalution().
set_omp_threads_rocalution() sets the number of OpenMP threads. It has to be called after init_rocalution().
- Example
    #include <rocalution.hpp>

    using namespace rocalution;

    int main(int argc, char* argv[])
    {
        init_rocalution();

        // ...

        stop_rocalution();

        return 0;
    }
- Parameters:
rank – [in] MPI rank of the calling process, when running in a multi-node environment
dev_per_node – [in] number of accelerator devices per node, when running in a multi-GPU environment
-
int rocalution::stop_rocalution(void)#
Shutdown rocALUTION platform.
stop_rocalution shuts down the rocALUTION platform.
-
void rocalution::set_device_rocalution(int dev)#
Set the accelerator device.
set_device_rocalution lets the user select the accelerator device that is supposed to be used for the computation.
- Parameters:
dev – [in] accelerator device ID for computation
-
void rocalution::set_omp_threads_rocalution(int nthreads)#
Set number of OpenMP threads.
The number of threads that rocALUTION will use can be set with set_omp_threads_rocalution or by the global OpenMP environment variable (for Unix-like OS this is OMP_NUM_THREADS). During the initialization phase, the library provides affinity thread-core mapping:
If the number of cores (including SMT cores) is greater than or equal to two times the number of threads, then all the threads can occupy every second core ID (e.g. 0, 2, 4, \(\ldots\)). This is to avoid having two threads working on the same physical core when SMT is enabled.
If the number of threads is less than or equal to the number of cores (including SMT), and the previous clause is false, then the threads can occupy every core ID (e.g. 0, 1, 2, 3, \(\ldots\)).
If none of the above criteria is matched, then the default thread-core mapping is used (typically set by the OS).
Note
The thread-core mapping is available only for Unix-like OS.
Note
The user can disable the thread affinity by calling set_omp_affinity_rocalution(), before initializing the library (i.e. before init_rocalution()).
- Parameters:
nthreads – [in] number of OpenMP threads
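The three thread-core mapping rules described for set_omp_threads_rocalution can be written down as a small decision function. This is only a sketch of the rules as stated in the text; `sketch_thread_core_map` is a hypothetical helper, and how the library actually pins threads is not shown here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns the core ID for each thread according to the
// three documented rules. An empty result means "use the default mapping".
std::vector<int> sketch_thread_core_map(int num_cores, int num_threads)
{
    std::vector<int> core_ids;
    int              stride = 0;

    if(num_threads > 0 && num_cores >= 2 * num_threads)
    {
        stride = 2; // rule 1: every second core ID (0, 2, 4, ...)
    }
    else if(num_threads > 0 && num_threads <= num_cores)
    {
        stride = 1; // rule 2: every core ID (0, 1, 2, 3, ...)
    }

    if(stride == 0)
    {
        return core_ids; // rule 3: leave the mapping to the OS
    }

    for(int t = 0; t < num_threads; ++t)
    {
        core_ids.push_back(t * stride);
    }
    return core_ids;
}
```

For example, with 8 logical cores, 4 threads fall under the first rule and 6 threads under the second, while 8 threads on 4 logical cores fall through to the default mapping.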
-
void rocalution::set_omp_affinity_rocalution(bool affinity)#
Enable/disable OpenMP host affinity.
set_omp_affinity_rocalution enables / disables OpenMP host affinity.
- Parameters:
affinity – [in] boolean to turn on/off OpenMP host affinity
-
void rocalution::set_omp_threshold_rocalution(int threshold)#
Set OpenMP threshold size.
When working on a small problem, the OpenMP host backend may be (slightly) slower than running without OpenMP. This is mainly attributed to the small amount of work that every thread performs and the large overhead of forking/joining threads. This can be avoided with the OpenMP threshold size parameter in rocALUTION. The default threshold is set to 10000, which means that all matrices of this size or smaller will use only one thread (disregarding the number of OpenMP threads set in the system). The threshold can be modified with set_omp_threshold_rocalution.
- Parameters:
threshold – [in] OpenMP threshold size
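The effect of the threshold can be modelled as a one-line rule. The function below is a hypothetical sketch of the behaviour described above (a single thread at or below the threshold, the configured thread count otherwise); what exactly the library measures as the "size" is not specified in this section, so `size` is left as an abstract problem size.

```cpp
#include <cassert>

// Hypothetical model of the documented threshold rule: problems of size
// 'threshold' or smaller run on one thread, larger ones on 'omp_threads'.
int sketch_effective_threads(long long size, long long threshold, int omp_threads)
{
    return (size <= threshold) ? 1 : omp_threads;
}
```

With the default threshold of 10000, a problem of size 10000 would therefore run single-threaded, while a problem of size 10001 would use all configured threads.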
-
void rocalution::info_rocalution(void)#
Print info about rocALUTION.
info_rocalution prints information about the rocALUTION platform.
-
void rocalution::info_rocalution(const struct Rocalution_Backend_Descriptor backend_descriptor)#
Print info about specific rocALUTION backend descriptor.
info_rocalution prints information about the rocALUTION platform of the specific backend descriptor.
- Parameters:
backend_descriptor – [in] rocALUTION backend descriptor
-
void rocalution::disable_accelerator_rocalution(bool onoff = true)#
Disable/Enable the accelerator.
If you want to disable the accelerator (without re-compiling the code), you need to call disable_accelerator_rocalution before init_rocalution().
- Parameters:
onoff – [in] boolean to turn on/off the accelerator
-
void rocalution::_rocalution_sync(void)#
Sync rocALUTION.
_rocalution_sync blocks the host until all active asynchronous transfers are completed.
Base Rocalution#
-
template<typename ValueType>
class BaseRocalution : public rocalution::RocalutionObj#
Base class for all operators and vectors.
- Template Parameters:
ValueType – - can be int, float, double, std::complex<float> and std::complex<double>
Subclassed by rocalution::Operator< ValueType >, rocalution::Vector< ValueType >
Public Functions
-
virtual void MoveToAccelerator(void) = 0#
Move the object to the accelerator backend.
-
virtual void MoveToHost(void) = 0#
Move the object to the host backend.
-
virtual void MoveToAcceleratorAsync(void)#
Move the object to the accelerator backend with async move.
-
virtual void MoveToHostAsync(void)#
Move the object to the host backend with async move.
-
virtual void Sync(void)#
Synchronize, i.e. wait until a pending asynchronous move has completed.
-
virtual void CloneBackend(const BaseRocalution<ValueType> &src)#
Clone the Backend descriptor from another object.
With CloneBackend, the backend can be cloned without copying any data. This is especially useful if several objects should reside on the same backend but keep their original data.
- Example
    LocalVector<ValueType> vec;
    LocalMatrix<ValueType> mat;

    // Allocate and initialize vec and mat
    // ...

    LocalVector<ValueType> tmp;

    // By cloning backend, tmp and vec will have the same backend as mat
    tmp.CloneBackend(mat);
    vec.CloneBackend(mat);

    // The following matrix vector multiplication will be performed on the backend
    // selected in mat
    mat.Apply(vec, &tmp);
- Parameters:
src – [in] Object, where the backend should be cloned from.
-
virtual void Info(void) const = 0#
Print object information.
Info can print object information about any rocALUTION object. This information consists of object properties and backend data.
- Example
    mat.Info();
    vec.Info();
-
virtual void Clear(void) = 0#
Clear (free all data) the object.
Operator#
-
template<typename ValueType>
class Operator : public rocalution::BaseRocalution<ValueType>#
Operator class.
The Operator class defines the generic interface for applying an operator (e.g. matrix or stencil) from/to global and local vectors.
- Template Parameters:
ValueType – - can be int, float, double, std::complex<float> and std::complex<double>
Subclassed by rocalution::GlobalMatrix< ValueType >, rocalution::LocalMatrix< ValueType >, rocalution::LocalStencil< ValueType >
Public Functions
-
virtual IndexType2 GetM(void) const = 0#
Return the number of rows in the matrix/stencil.
-
virtual IndexType2 GetN(void) const = 0#
Return the number of columns in the matrix/stencil.
-
virtual IndexType2 GetNnz(void) const = 0#
Return the number of non-zeros in the matrix/stencil.
-
virtual int GetLocalM(void) const#
Return the number of rows in the local matrix/stencil.
-
virtual int GetLocalN(void) const#
Return the number of columns in the local matrix/stencil.
-
virtual int GetLocalNnz(void) const#
Return the number of non-zeros in the local matrix/stencil.
-
virtual int GetGhostM(void) const#
Return the number of rows in the ghost matrix/stencil.
-
virtual int GetGhostN(void) const#
Return the number of columns in the ghost matrix/stencil.
-
virtual int GetGhostNnz(void) const#
Return the number of non-zeros in the ghost matrix/stencil.
-
virtual void Apply(const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#
Apply the operator, out = Operator(in), where in and out are local vectors.
-
virtual void ApplyAdd(const LocalVector<ValueType> &in, ValueType scalar, LocalVector<ValueType> *out) const#
Apply and add the operator, out += scalar * Operator(in), where in and out are local vectors.
-
virtual void Apply(const GlobalVector<ValueType> &in, GlobalVector<ValueType> *out) const#
Apply the operator, out = Operator(in), where in and out are global vectors.
-
virtual void ApplyAdd(const GlobalVector<ValueType> &in, ValueType scalar, GlobalVector<ValueType> *out) const#
Apply and add the operator, out += scalar * Operator(in), where in and out are global vectors.
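To make the semantics of Apply (out = Operator(in)) and ApplyAdd (out += scalar * Operator(in)) concrete, the following sketch applies a small matrix stored in CSR form to a plain `std::vector`. `SketchCsr`, `sketch_apply` and `sketch_apply_add` are hypothetical names used only for this illustration and are not rocALUTION types or functions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal CSR container (not a rocALUTION type).
struct SketchCsr
{
    int                 nrow;
    std::vector<int>    row_offset; // size nrow + 1
    std::vector<int>    col;        // column index of each non-zero
    std::vector<double> val;        // value of each non-zero
};

// out = A * in
void sketch_apply(const SketchCsr& A, const std::vector<double>& in, std::vector<double>* out)
{
    out->assign(A.nrow, 0.0);
    for(int i = 0; i < A.nrow; ++i)
    {
        for(int k = A.row_offset[i]; k < A.row_offset[i + 1]; ++k)
        {
            (*out)[i] += A.val[k] * in[A.col[k]];
        }
    }
}

// out += scalar * A * in (out must already have A.nrow entries)
void sketch_apply_add(const SketchCsr&           A,
                      const std::vector<double>& in,
                      double                     scalar,
                      std::vector<double>*       out)
{
    for(int i = 0; i < A.nrow; ++i)
    {
        for(int k = A.row_offset[i]; k < A.row_offset[i + 1]; ++k)
        {
            (*out)[i] += scalar * A.val[k] * in[A.col[k]];
        }
    }
}
```

The difference between the two is only whether the previous content of `out` is overwritten or accumulated into.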
Vector#
-
template<typename ValueType>
class Vector : public rocalution::BaseRocalution<ValueType>#
Vector class.
The Vector class defines the generic interface for local and global vectors.
- Template Parameters:
ValueType – - can be int, float, double, std::complex<float> and std::complex<double>
Subclassed by rocalution::LocalVector< int >, rocalution::GlobalVector< ValueType >, rocalution::LocalVector< ValueType >
-
virtual void CopyFrom(const LocalVector<ValueType> &src)#
Copy vector from another vector.
CopyFrom copies values from another vector.
- Example
    LocalVector<ValueType> vec1, vec2;

    // Allocate and initialize vec1 and vec2
    // ...

    // Move vec1 to accelerator
    // vec1.MoveToAccelerator();

    // Now, vec1 is on the accelerator (if available)
    // and vec2 is on the host

    // Copy vec1 to vec2 (or vice versa) will move data between host and
    // accelerator backend
    vec1.CopyFrom(vec2);
Note
This function allows cross platform copying. One of the objects could be allocated on the accelerator backend.
- Parameters:
src – [in] Vector, where values should be copied from.
-
virtual void CopyFrom(const GlobalVector<ValueType> &src)#
Copy vector from another vector.
- Parameters:
src – [in] Vector, where values should be copied from.
-
virtual void CloneFrom(const LocalVector<ValueType> &src)#
Clone the vector.
CloneFrom clones the entire vector, with data and backend descriptor, from another Vector.
- Example
    LocalVector<ValueType> vec;

    // Allocate and initialize vec (host or accelerator)
    // ...

    LocalVector<ValueType> tmp;

    // By cloning vec, tmp will have identical values and will be on the same
    // backend as vec
    tmp.CloneFrom(vec);
- Parameters:
src – [in] Vector to clone from.
-
virtual void CloneFrom(const GlobalVector<ValueType> &src)#
Clone the vector.
- Parameters:
src – [in] Vector to clone from.
Public Functions
-
virtual IndexType2 GetSize(void) const = 0#
Return the size of the vector.
-
virtual int GetLocalSize(void) const#
Return the size of the local vector.
-
virtual int GetGhostSize(void) const#
Return the size of the ghost vector.
-
virtual bool Check(void) const = 0#
Perform a sanity check of the vector.
Checks whether the vector contains valid data, i.e. that the values are neither infinity nor NaN (not a number).
- Return values:
true – if the vector is ok (empty vector is also ok).
false – if there is something wrong with the values.
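The kind of sanity check described for Check can be sketched for a plain `std::vector<double>`. `sketch_check` is a hypothetical helper that only implements the two documented criteria (no infinity, no NaN); the library may perform additional checks that are not listed in this section.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical helper: true if every value is finite (neither infinity nor NaN).
// An empty vector is considered valid, as documented for Check.
bool sketch_check(const std::vector<double>& values)
{
    for(double v : values)
    {
        if(!std::isfinite(v))
        {
            return false;
        }
    }
    return true;
}
```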
-
virtual void Clear(void) = 0#
Clear (free all data) the object.
-
virtual void Zeros(void) = 0#
Set all values of the vector to 0.
-
virtual void Ones(void) = 0#
Set all values of the vector to 1.
-
virtual void SetRandomUniform(unsigned long long seed, ValueType a = static_cast<ValueType>(-1), ValueType b = static_cast<ValueType>(1)) = 0#
Fill the vector with random values from interval [a,b].
-
virtual void SetRandomNormal(unsigned long long seed, ValueType mean = static_cast<ValueType>(0), ValueType var = static_cast<ValueType>(1)) = 0#
Fill the vector with random values from normal distribution.
-
virtual void ReadFileASCII(const std::string filename) = 0#
Read vector from ASCII file.
Read a vector from ASCII file.
- Example
    LocalVector<ValueType> vec;
    vec.ReadFileASCII("my_vector.dat");
- Parameters:
filename – [in] name of the file containing the ASCII data.
-
virtual void WriteFileASCII(const std::string filename) const = 0#
Write vector to ASCII file.
Write a vector to ASCII file.
- Example
    LocalVector<ValueType> vec;

    // Allocate and fill vec
    // ...

    vec.WriteFileASCII("my_vector.dat");
- Parameters:
filename – [in] name of the file to write the ASCII data to.
-
virtual void ReadFileBinary(const std::string filename) = 0#
Read vector from binary file.
Read a vector from binary file. For details on the format, see WriteFileBinary().
- Example
    LocalVector<ValueType> vec;
    vec.ReadFileBinary("my_vector.bin");
- Parameters:
filename – [in] name of the file containing the data.
-
virtual void WriteFileBinary(const std::string filename) const = 0#
Write vector to binary file.
Write a vector to binary file.
The binary format contains a header, the rocALUTION version and the vector data as follows:
    // Header
    out << "#rocALUTION binary vector file" << std::endl;

    // rocALUTION version
    out.write((char*)&version, sizeof(int));

    // Vector data
    out.write((char*)&size, sizeof(int));
    out.write((char*)vec_val, size * sizeof(double));
- Example
    LocalVector<ValueType> vec;

    // Allocate and fill vec
    // ...

    vec.WriteFileBinary("my_vector.bin");
Note
The vector values array is always stored in double precision (e.g. double or std::complex<double>).
- Parameters:
filename – [in] name of the file to write the data to.
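The layout described for WriteFileBinary (header line, version as `int`, size as `int`, values as `double`) can be exercised without the library by writing to and reading from an in-memory stream. The two functions below are hypothetical helpers that follow only the layout stated above; the version number is whatever the caller passes in, and real files written by other library versions may differ.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical writer for the documented layout: header, version, size, values.
void sketch_write_vector(std::ostream& out, int version, const std::vector<double>& values)
{
    const int size = static_cast<int>(values.size());

    out << "#rocALUTION binary vector file" << std::endl;
    out.write(reinterpret_cast<const char*>(&version), sizeof(int));
    out.write(reinterpret_cast<const char*>(&size), sizeof(int));
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(size * sizeof(double)));
}

// Hypothetical reader for the same layout. Returns false on a malformed stream.
bool sketch_read_vector(std::istream& in, int* version, std::vector<double>* values)
{
    std::string header;
    std::getline(in, header);
    if(header != "#rocALUTION binary vector file")
    {
        return false;
    }

    int size = 0;
    in.read(reinterpret_cast<char*>(version), sizeof(int));
    in.read(reinterpret_cast<char*>(&size), sizeof(int));
    if(!in || size < 0)
    {
        return false;
    }

    values->resize(size);
    in.read(reinterpret_cast<char*>(values->data()),
            static_cast<std::streamsize>(size * sizeof(double)));
    return static_cast<bool>(in);
}
```

Because the values are written as raw `double` data, a file produced this way is only portable between machines with the same endianness and `int`/`double` sizes.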
-
virtual void CopyFromAsync(const LocalVector<ValueType> &src)#
Async copy from another local vector.
-
virtual void CopyFromFloat(const LocalVector<float> &src)#
Copy values from another local float vector.
-
virtual void CopyFromDouble(const LocalVector<double> &src)#
Copy values from another local double vector.
-
virtual void CopyFrom(const LocalVector<ValueType> &src, int src_offset, int dst_offset, int size)#
Copy vector from another vector with offsets and size.
CopyFrom copies values with specific source and destination offsets and sizes from another vector.
Note
This function allows cross platform copying. One of the objects could be allocated on the accelerator backend.
- Parameters:
src – [in] Vector, where values should be copied from.
src_offset – [in] source offset.
dst_offset – [in] destination offset.
size – [in] number of entries to be copied.
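The role of the three integer parameters can be illustrated on plain `std::vector` objects. `sketch_copy_from` is a hypothetical helper that copies `size` entries starting at `src_offset` in the source to `dst_offset` in the destination; the bounds checks are an assumption of this sketch and not a statement about how the library validates its arguments.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: dst[dst_offset + i] = src[src_offset + i] for i in [0, size).
void sketch_copy_from(std::vector<double>&       dst,
                      const std::vector<double>& src,
                      int                        src_offset,
                      int                        dst_offset,
                      int                        size)
{
    assert(src_offset >= 0 && dst_offset >= 0 && size >= 0);
    assert(static_cast<std::size_t>(src_offset) + size <= src.size());
    assert(static_cast<std::size_t>(dst_offset) + size <= dst.size());

    std::copy(src.begin() + src_offset, src.begin() + src_offset + size, dst.begin() + dst_offset);
}
```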
-
virtual void AddScale(const LocalVector<ValueType> &x, ValueType alpha)#
Perform vector update of type this = this + alpha * x.
-
virtual void AddScale(const GlobalVector<ValueType> &x, ValueType alpha)#
Perform vector update of type this = this + alpha * x.
-
virtual void ScaleAdd(ValueType alpha, const LocalVector<ValueType> &x)#
Perform vector update of type this = alpha * this + x.
-
virtual void ScaleAdd(ValueType alpha, const GlobalVector<ValueType> &x)#
Perform vector update of type this = alpha * this + x.
-
virtual void ScaleAddScale(ValueType alpha, const LocalVector<ValueType> &x, ValueType beta)#
Perform vector update of type this = alpha * this + x * beta.
-
virtual void ScaleAddScale(ValueType alpha, const GlobalVector<ValueType> &x, ValueType beta)#
Perform vector update of type this = alpha * this + x * beta.
-
virtual void ScaleAddScale(ValueType alpha, const LocalVector<ValueType> &x, ValueType beta, int src_offset, int dst_offset, int size)#
Perform vector update of type this = alpha * this + x * beta with offsets.
-
virtual void ScaleAddScale(ValueType alpha, const GlobalVector<ValueType> &x, ValueType beta, int src_offset, int dst_offset, int size)#
Perform vector update of type this = alpha * this + x * beta with offsets.
-
virtual void ScaleAdd2(ValueType alpha, const LocalVector<ValueType> &x, ValueType beta, const LocalVector<ValueType> &y, ValueType gamma)#
Perform vector update of type this = alpha * this + x * beta + y * gamma.
-
virtual void ScaleAdd2(ValueType alpha, const GlobalVector<ValueType> &x, ValueType beta, const GlobalVector<ValueType> &y, ValueType gamma)#
Perform vector update of type this = alpha * this + x * beta + y * gamma.
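The update formulas of AddScale, ScaleAdd, ScaleAddScale and ScaleAdd2 differ only in which operand is scaled. The sketch below spells them out element-wise on `std::vector<double>`; the `sketch_*` functions are hypothetical and take the updated vector (`self`, standing in for `this`) as their first argument.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// self = self + alpha * x
void sketch_add_scale(Vec& self, const Vec& x, double alpha)
{
    for(std::size_t i = 0; i < self.size(); ++i)
        self[i] = self[i] + alpha * x[i];
}

// self = alpha * self + x
void sketch_scale_add(Vec& self, double alpha, const Vec& x)
{
    for(std::size_t i = 0; i < self.size(); ++i)
        self[i] = alpha * self[i] + x[i];
}

// self = alpha * self + x * beta
void sketch_scale_add_scale(Vec& self, double alpha, const Vec& x, double beta)
{
    for(std::size_t i = 0; i < self.size(); ++i)
        self[i] = alpha * self[i] + x[i] * beta;
}

// self = alpha * self + x * beta + y * gamma
void sketch_scale_add2(Vec& self, double alpha, const Vec& x, double beta, const Vec& y, double gamma)
{
    for(std::size_t i = 0; i < self.size(); ++i)
        self[i] = alpha * self[i] + x[i] * beta + y[i] * gamma;
}
```

All four assume that the vectors have the same length; the sketch does not check this.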
-
virtual ValueType Dot(const LocalVector<ValueType> &x) const#
Compute dot (scalar) product, return this^T x.
-
virtual ValueType Dot(const GlobalVector<ValueType> &x) const#
Compute dot (scalar) product, return this^T x.
-
virtual ValueType DotNonConj(const LocalVector<ValueType> &x) const#
Compute non-conjugate dot (scalar) product, return this^T x.
-
virtual ValueType DotNonConj(const GlobalVector<ValueType> &x) const#
Compute non-conjugate dot (scalar) product, return this^T x.
-
virtual ValueType Norm(void) const = 0#
Compute \(L_2\) norm of the vector, return = sqrt(this^T this)
-
virtual ValueType Asum(void) const = 0#
Compute the sum of absolute values of the vector, return = sum(|this|)
-
virtual int Amax(ValueType &value) const = 0#
Compute the absolute max of the vector, return = index(max(|this|))
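The reductions Dot, Norm, Asum and Amax can be sketched for real-valued data as follows. The `sketch_*` functions are hypothetical; in particular, storing the absolute value of the largest entry in `value` is an assumption of this sketch, since the text above only specifies the returned index.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Dot product of two real vectors of equal length.
double sketch_dot(const Vec& a, const Vec& x)
{
    double sum = 0.0;
    for(std::size_t i = 0; i < a.size(); ++i)
        sum += a[i] * x[i];
    return sum;
}

// L2 norm: sqrt(a^T a).
double sketch_norm(const Vec& a)
{
    return std::sqrt(sketch_dot(a, a));
}

// Sum of absolute values.
double sketch_asum(const Vec& a)
{
    double sum = 0.0;
    for(double v : a)
        sum += std::fabs(v);
    return sum;
}

// Index of the entry with the largest absolute value; 'value' receives that
// absolute value (assumption of this sketch).
int sketch_amax(const Vec& a, double& value)
{
    int index = 0;
    value     = 0.0;
    for(std::size_t i = 0; i < a.size(); ++i)
    {
        const double abs_i = std::fabs(a[i]);
        if(abs_i > value)
        {
            value = abs_i;
            index = static_cast<int>(i);
        }
    }
    return index;
}
```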
-
virtual void PointWiseMult(const LocalVector<ValueType> &x)#
Perform point-wise multiplication (element-wise) of this = this * x.
-
virtual void PointWiseMult(const GlobalVector<ValueType> &x)#
Perform point-wise multiplication (element-wise) of this = this * x.
-
virtual void PointWiseMult(const LocalVector<ValueType> &x, const LocalVector<ValueType> &y)#
Perform point-wise multiplication (element-wise) of this = x * y.
-
virtual void PointWiseMult(const GlobalVector<ValueType> &x, const GlobalVector<ValueType> &y)#
Perform point-wise multiplication (element-wise) of this = x * y.
-
virtual void Power(double power) = 0#
Raise each element of the vector to the given power.
Local Matrix#
-
template<typename ValueType>
class LocalMatrix : public rocalution::Operator<ValueType>#
LocalMatrix class.
A LocalMatrix is called local, because it will always stay on a single system. The system can contain several CPUs via UMA or NUMA memory system or it can contain an accelerator.
A number of matrix formats are supported. These are CSR, BCSR, MCSR, COO, DIA, ELL, HYB, and DENSE.
Note
For CSR type matrices, the column indices must be sorted in increasing order. For COO matrices, the row indices must be sorted in increasing order. The function Check can be used to check whether a matrix contains valid data. For CSR and COO matrices, the function Sort can be used to sort the column or row indices, respectively.
- Template Parameters:
ValueType – - can be int, float, double, std::complex<float> and std::complex<double>
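The sorting requirement for CSR matrices in the note above can be checked and enforced on raw CSR arrays as sketched below. Both functions are hypothetical helpers for illustration; they are not the library's Check or Sort, and the check only tests the ordering of column indices within each row.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: true if the column indices of every row are in increasing order.
bool sketch_csr_columns_sorted(const std::vector<int>& row_offset, const std::vector<int>& col)
{
    for(std::size_t i = 0; i + 1 < row_offset.size(); ++i)
    {
        for(int k = row_offset[i] + 1; k < row_offset[i + 1]; ++k)
        {
            if(col[k - 1] > col[k])
            {
                return false;
            }
        }
    }
    return true;
}

// Hypothetical helper: sorts the entries of every row by column index,
// keeping each value attached to its column.
void sketch_csr_sort(const std::vector<int>& row_offset,
                     std::vector<int>&       col,
                     std::vector<double>&    val)
{
    for(std::size_t i = 0; i + 1 < row_offset.size(); ++i)
    {
        const int begin = row_offset[i];
        const int end   = row_offset[i + 1];

        std::vector<std::pair<int, double>> row;
        for(int k = begin; k < end; ++k)
        {
            row.emplace_back(col[k], val[k]);
        }

        std::sort(row.begin(), row.end());

        for(int k = begin; k < end; ++k)
        {
            col[k] = row[k - begin].first;
            val[k] = row[k - begin].second;
        }
    }
}
```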
-
void AllocateCSR(const std::string name, int nnz, int nrow, int ncol)#
Allocate a local matrix with name and sizes.
The local matrix allocation functions require a name of the object (this is only for information purposes) and corresponding number of non-zero elements, number of rows and number of columns. Furthermore, depending on the matrix format, additional parameters are required.
- Example
    LocalMatrix<ValueType> mat;

    mat.AllocateCSR("my CSR matrix", 456, 100, 100);
    mat.Clear();

    mat.AllocateCOO("my COO matrix", 200, 100, 100);
    mat.Clear();
-
void AllocateBCSR(const std::string name, int nnzb, int nrowb, int ncolb, int blockdim)#
Allocate a local matrix with name and sizes.
-
void AllocateMCSR(const std::string name, int nnz, int nrow, int ncol)#
Allocate a local matrix with name and sizes.
-
void AllocateCOO(const std::string name, int nnz, int nrow, int ncol)#
Allocate a local matrix with name and sizes.
-
void AllocateDIA(const std::string name, int nnz, int nrow, int ncol, int ndiag)#
Allocate a local matrix with name and sizes.
-
void AllocateELL(const std::string name, int nnz, int nrow, int ncol, int max_row)#
Allocate a local matrix with name and sizes.
-
void AllocateHYB(const std::string name, int ell_nnz, int coo_nnz, int ell_max_row, int nrow, int ncol)#
Allocate a local matrix with name and sizes.
-
void AllocateDENSE(const std::string name, int nrow, int ncol)#
Allocate a local matrix with name and sizes.
-
void SetDataPtrCOO(int **row, int **col, ValueType **val, std::string name, int nnz, int nrow, int ncol)#
Initialize a LocalMatrix on the host with externally allocated data.
SetDataPtr functions have direct access to the raw data via pointers. Already allocated data can be set by passing their pointers.
- Example
    // Allocate a CSR matrix
    int* csr_row_ptr = new int[100 + 1];
    int* csr_col_ind = new int[345];
    ValueType* csr_val = new ValueType[345];

    // Fill the CSR matrix
    // ...

    // rocALUTION local matrix object
    LocalMatrix<ValueType> mat;

    // Set the CSR matrix data, csr_row_ptr, csr_col_ind and csr_val pointers become
    // invalid
    mat.SetDataPtrCSR(&csr_row_ptr, &csr_col_ind, &csr_val, "my_matrix", 345, 100, 100);
Note
Setting data pointers will leave the original pointers empty (set to NULL).
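The hand-over of ownership described in the note (the caller's pointers are set to NULL once the matrix has taken the data) can be modelled with a small owner type. `SketchOwner` is a hypothetical class written for this illustration; it holds a single value array and also shows the reverse direction used by the LeaveDataPtr functions, where the object gives its data back and is left empty.

```cpp
#include <cassert>

// Hypothetical owner type that mimics the documented pointer hand-over.
struct SketchOwner
{
    double* val  = nullptr;
    int     size = 0;

    SketchOwner()                              = default;
    SketchOwner(const SketchOwner&)            = delete;
    SketchOwner& operator=(const SketchOwner&) = delete;
    ~SketchOwner() { delete[] val; }

    // Take ownership of *ptr; the caller's pointer is set to NULL afterwards.
    void SetDataPtr(double** ptr, int n)
    {
        assert(ptr != nullptr && *ptr != nullptr);
        delete[] val;
        val  = *ptr;
        size = n;
        *ptr = nullptr;
    }

    // Hand the data back to the caller; the object is left empty.
    void LeaveDataPtr(double** ptr)
    {
        assert(ptr != nullptr && *ptr == nullptr);
        *ptr = val;
        val  = nullptr;
        size = 0;
    }
};
```

Taking a pointer to the pointer is what makes it possible to reset the caller's variable, so that exactly one side is responsible for freeing the memory at any time.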
-
void SetDataPtrCSR(int **row_offset, int **col, ValueType **val, std::string name, int nnz, int nrow, int ncol)#
Initialize a LocalMatrix on the host with externally allocated data.
-
void SetDataPtrBCSR(int **row_offset, int **col, ValueType **val, std::string name, int nnzb, int nrowb, int ncolb, int blockdim)#
Initialize a LocalMatrix on the host with externally allocated data.
-
void SetDataPtrMCSR(int **row_offset, int **col, ValueType **val, std::string name, int nnz, int nrow, int ncol)#
Initialize a LocalMatrix on the host with externally allocated data.
-
void SetDataPtrELL(int **col, ValueType **val, std::string name, int nnz, int nrow, int ncol, int max_row)#
Initialize a LocalMatrix on the host with externally allocated data.
-
void SetDataPtrDIA(int **offset, ValueType **val, std::string name, int nnz, int nrow, int ncol, int num_diag)#
Initialize a LocalMatrix on the host with externally allocated data.
-
void SetDataPtrDENSE(ValueType **val, std::string name, int nrow, int ncol)#
Initialize a LocalMatrix on the host with externally allocated data.
-
void LeaveDataPtrCOO(int **row, int **col, ValueType **val)#
Leave a LocalMatrix to host pointers.
LeaveDataPtr functions have direct access to the raw data via pointers. A LocalMatrix object can leave its raw data to host pointers. This will leave the LocalMatrix empty.
- Example
    // rocALUTION CSR matrix object
    LocalMatrix<ValueType> mat;

    // Allocate the CSR matrix
    mat.AllocateCSR("my_matrix", 345, 100, 100);

    // Fill CSR matrix
    // ...

    int* csr_row_ptr = NULL;
    int* csr_col_ind = NULL;
    ValueType* csr_val = NULL;

    // Get (steal) the data from the matrix, this will leave the local matrix
    // object empty
    mat.LeaveDataPtrCSR(&csr_row_ptr, &csr_col_ind, &csr_val);
-
void LeaveDataPtrCSR(int **row_offset, int **col, ValueType **val)#
Leave a LocalMatrix to host pointers.
-
void LeaveDataPtrBCSR(int **row_offset, int **col, ValueType **val, int &blockdim)#
Leave a LocalMatrix to host pointers.
LeaveDataPtr
functions have direct access to the raw data via pointers. A LocalMatrix object can leave its raw data to host pointers. This will leave the LocalMatrix empty.
- Example
// rocALUTION CSR matrix object
LocalMatrix<ValueType> mat;

// Allocate the CSR matrix
mat.AllocateCSR("my_matrix", 345, 100, 100);

// Fill CSR matrix
// ...

int* csr_row_ptr = NULL;
int* csr_col_ind = NULL;
ValueType* csr_val = NULL;

// Get (steal) the data from the matrix, this will leave the local matrix
// object empty
mat.LeaveDataPtrCSR(&csr_row_ptr, &csr_col_ind, &csr_val);
-
void LeaveDataPtrMCSR(int **row_offset, int **col, ValueType **val)#
Leave a LocalMatrix to host pointers.
LeaveDataPtr
functions have direct access to the raw data via pointers. A LocalMatrix object can leave its raw data to host pointers. This will leave the LocalMatrix empty.
- Example
// rocALUTION CSR matrix object
LocalMatrix<ValueType> mat;

// Allocate the CSR matrix
mat.AllocateCSR("my_matrix", 345, 100, 100);

// Fill CSR matrix
// ...

int* csr_row_ptr = NULL;
int* csr_col_ind = NULL;
ValueType* csr_val = NULL;

// Get (steal) the data from the matrix, this will leave the local matrix
// object empty
mat.LeaveDataPtrCSR(&csr_row_ptr, &csr_col_ind, &csr_val);
-
void LeaveDataPtrELL(int **col, ValueType **val, int &max_row)#
Leave a LocalMatrix to host pointers.
LeaveDataPtr
functions have direct access to the raw data via pointers. A LocalMatrix object can leave its raw data to host pointers. This will leave the LocalMatrix empty.
- Example
// rocALUTION CSR matrix object
LocalMatrix<ValueType> mat;

// Allocate the CSR matrix
mat.AllocateCSR("my_matrix", 345, 100, 100);

// Fill CSR matrix
// ...

int* csr_row_ptr = NULL;
int* csr_col_ind = NULL;
ValueType* csr_val = NULL;

// Get (steal) the data from the matrix, this will leave the local matrix
// object empty
mat.LeaveDataPtrCSR(&csr_row_ptr, &csr_col_ind, &csr_val);
-
void LeaveDataPtrDIA(int **offset, ValueType **val, int &num_diag)#
Leave a LocalMatrix to host pointers.
LeaveDataPtr
functions have direct access to the raw data via pointers. A LocalMatrix object can leave its raw data to host pointers. This will leave the LocalMatrix empty.
- Example
// rocALUTION CSR matrix object
LocalMatrix<ValueType> mat;

// Allocate the CSR matrix
mat.AllocateCSR("my_matrix", 345, 100, 100);

// Fill CSR matrix
// ...

int* csr_row_ptr = NULL;
int* csr_col_ind = NULL;
ValueType* csr_val = NULL;

// Get (steal) the data from the matrix, this will leave the local matrix
// object empty
mat.LeaveDataPtrCSR(&csr_row_ptr, &csr_col_ind, &csr_val);
-
void LeaveDataPtrDENSE(ValueType **val)#
Leave a LocalMatrix to host pointers.
LeaveDataPtr
functions have direct access to the raw data via pointers. A LocalMatrix object can leave its raw data to host pointers. This will leave the LocalMatrix empty.
- Example
// rocALUTION CSR matrix object
LocalMatrix<ValueType> mat;

// Allocate the CSR matrix
mat.AllocateCSR("my_matrix", 345, 100, 100);

// Fill CSR matrix
// ...

int* csr_row_ptr = NULL;
int* csr_col_ind = NULL;
ValueType* csr_val = NULL;

// Get (steal) the data from the matrix, this will leave the local matrix
// object empty
mat.LeaveDataPtrCSR(&csr_row_ptr, &csr_col_ind, &csr_val);
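The hand-over semantics of the SetDataPtr and LeaveDataPtr functions can be illustrated with a small stand-alone sketch. The type CsrHolder below is hypothetical and is not the rocALUTION LocalMatrix; it only mimics the documented behavior that setting data leaves the caller's pointers NULL and that leaving data empties the object. It assumes the arrays were allocated with new[], which may differ from what the library expects.

```cpp
#include <cassert>

// Hypothetical stand-in for a matrix object (NOT the rocALUTION LocalMatrix).
struct CsrHolder
{
    int*    row_ptr = nullptr;
    int*    col_ind = nullptr;
    double* val     = nullptr;
    int     nnz = 0, nrow = 0, ncol = 0;

    CsrHolder() = default;
    CsrHolder(const CsrHolder&) = delete;            // single owner only
    CsrHolder& operator=(const CsrHolder&) = delete;
    ~CsrHolder() { Clear(); }

    // Take ownership of caller-allocated arrays; the caller's pointers become NULL.
    void SetDataPtrCSR(int** row, int** col, double** v, int nnz_, int nrow_, int ncol_)
    {
        assert(*row != nullptr && *col != nullptr && *v != nullptr);
        Clear();
        row_ptr = *row; col_ind = *col; val = *v;
        nnz = nnz_; nrow = nrow_; ncol = ncol_;
        *row = nullptr; *col = nullptr; *v = nullptr;
    }

    // Hand the arrays back to the caller; the holder is left empty.
    void LeaveDataPtrCSR(int** row, int** col, double** v)
    {
        assert(*row == nullptr && *col == nullptr && *v == nullptr);
        *row = row_ptr; *col = col_ind; *v = val;
        row_ptr = nullptr; col_ind = nullptr; val = nullptr;
        nnz = nrow = ncol = 0;
    }

    // Free all owned data (assumes new[] allocation).
    void Clear()
    {
        delete[] row_ptr; delete[] col_ind; delete[] val;
        row_ptr = nullptr; col_ind = nullptr; val = nullptr;
        nnz = nrow = ncol = 0;
    }
};
```

After LeaveDataPtrCSR the caller owns the arrays again and is responsible for freeing them.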
Public Functions
-
virtual void Info(void) const#
Print object information.
Info
can print object information about any rocALUTION object. This information consists of object properties and backend data.
- Example
mat.Info();
vec.Info();
-
unsigned int GetFormat(void) const#
Return the matrix format id (see matrix_formats.hpp)
-
virtual IndexType2 GetM(void) const#
Return the number of rows in the matrix/stencil.
-
virtual IndexType2 GetN(void) const#
Return the number of columns in the matrix/stencil.
-
virtual IndexType2 GetNnz(void) const#
Return the number of non-zeros in the matrix/stencil.
-
bool Check(void) const#
Perform a sanity check of the matrix.
Checks whether the matrix contains valid data, i.e. that the values are neither infinity nor NaN (not a number) and that the structure of the matrix is correct (e.g. indices cannot be negative, CSR and COO matrices have to be sorted, etc.).
- Return values:
true – if the matrix is ok (empty matrix is also ok).
false – if there is something wrong with the structure or values.
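The kind of checks described above can be sketched for a host CSR matrix as follows. The function check_csr is a hypothetical helper, not part of the library, and the exact set of conditions tested by Check may differ; here "sorted" is taken to mean non-decreasing column indices within each row.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns true if the CSR arrays are structurally consistent and all values are finite.
bool check_csr(int nrow, int ncol,
               const std::vector<int>& row_ptr,
               const std::vector<int>& col_ind,
               const std::vector<double>& val)
{
    // An empty matrix is considered valid
    if(nrow == 0 && col_ind.empty() && val.empty()) return true;
    if(nrow < 0 || ncol < 0) return false;
    if(row_ptr.size() != static_cast<std::size_t>(nrow) + 1) return false;
    if(col_ind.size() != val.size()) return false;

    const int nnz = static_cast<int>(col_ind.size());
    if(row_ptr[0] != 0 || row_ptr[nrow] != nnz) return false;

    for(int i = 0; i < nrow; ++i)
    {
        // Row offsets must be non-decreasing and stay inside the arrays
        if(row_ptr[i] > row_ptr[i + 1] || row_ptr[i + 1] > nnz) return false;

        for(int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
        {
            // Column indices must be in range and sorted within the row
            if(col_ind[k] < 0 || col_ind[k] >= ncol) return false;
            if(k > row_ptr[i] && col_ind[k] < col_ind[k - 1]) return false;
            // Values must be neither infinity nor NaN
            if(!std::isfinite(val[k])) return false;
        }
    }
    return true;
}
```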
-
virtual void Clear(void)#
Clear (free all data) the object.
-
void Zeros(void)#
Set all matrix values to zero.
-
void ScaleDiagonal(ValueType alpha)#
Scale the diagonal entries of the matrix with alpha, all diagonal elements must exist.
-
void ScaleOffDiagonal(ValueType alpha)#
Scale the off-diagonal entries of the matrix with alpha, all diagonal elements must exist.
-
void AddScalarDiagonal(ValueType alpha)#
Add alpha to the diagonal entries of the matrix, all diagonal elements must exist.
-
void AddScalarOffDiagonal(ValueType alpha)#
Add alpha to the off-diagonal entries of the matrix, all diagonal elements must exist.
-
void ExtractSubMatrix(int row_offset, int col_offset, int row_size, int col_size, LocalMatrix<ValueType> *mat) const#
Extract a sub-matrix with row/col_offset and row/col_size.
-
void ExtractSubMatrices(int row_num_blocks, int col_num_blocks, const int *row_offset, const int *col_offset, LocalMatrix<ValueType> ***mat) const#
Extract array of non-overlapping sub-matrices (row/col_num_blocks define the blocks for rows/columns; row/col_offset have sizes row/col_num_blocks+1, where [i+1]-[i] defines the size of the i-th sub-matrix)
-
void ExtractDiagonal(LocalVector<ValueType> *vec_diag) const#
Extract the diagonal values of the matrix into a LocalVector.
-
void ExtractInverseDiagonal(LocalVector<ValueType> *vec_inv_diag) const#
Extract the inverse (reciprocal) diagonal values of the matrix into a LocalVector.
-
void ExtractU(LocalMatrix<ValueType> *U, bool diag) const#
Extract the upper triangular matrix.
-
void ExtractL(LocalMatrix<ValueType> *L, bool diag) const#
Extract the lower triangular matrix.
-
void Permute(const LocalVector<int> &permutation)#
Perform (forward) permutation of the matrix.
-
void PermuteBackward(const LocalVector<int> &permutation)#
Perform (backward) permutation of the matrix.
-
void CMK(LocalVector<int> *permutation) const#
Create permutation vector for CMK reordering of the matrix.
The Cuthill-McKee ordering reduces the bandwidth of a given sparse matrix.
- Example
LocalVector<int> cmk;

mat.CMK(&cmk);
mat.Permute(cmk);
- Parameters:
permutation – [out] permutation vector for CMK reordering
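To make the idea behind the reordering concrete, here is a minimal textbook sketch of Cuthill-McKee on an undirected graph stored as adjacency lists. The functions cuthill_mckee and bandwidth are illustrative helpers and do not reflect the library's implementation; in particular, whether the library's permutation vector maps old to new indices or the reverse is not stated here, so the returned convention (position to original index) is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <queue>
#include <vector>

// Cuthill-McKee: breadth-first traversal that visits low-degree nodes first.
// Returns 'order', where order[k] is the original index of the node at position k.
std::vector<int> cuthill_mckee(const std::vector<std::vector<int>>& adj)
{
    const int n = static_cast<int>(adj.size());
    std::vector<int>  order;
    std::vector<bool> visited(n, false);
    order.reserve(n);

    // Compare nodes by degree, ties broken by index
    auto degree_less = [&](int a, int b) {
        return adj[a].size() != adj[b].size() ? adj[a].size() < adj[b].size() : a < b;
    };

    std::vector<int> nodes(n);
    for(int i = 0; i < n; ++i) nodes[i] = i;
    std::sort(nodes.begin(), nodes.end(), degree_less);

    // One BFS per connected component, started at its lowest-degree node
    for(int start : nodes)
    {
        if(visited[start]) continue;
        std::queue<int> q;
        q.push(start);
        visited[start] = true;
        while(!q.empty())
        {
            int u = q.front();
            q.pop();
            order.push_back(u);
            std::vector<int> next;
            for(int v : adj[u])
                if(!visited[v]) { visited[v] = true; next.push_back(v); }
            std::sort(next.begin(), next.end(), degree_less);
            for(int v : next) q.push(v);
        }
    }
    return order;
}

// Bandwidth max |pos(u) - pos(v)| over all edges for a given node placement.
int bandwidth(const std::vector<std::vector<int>>& adj, const std::vector<int>& order)
{
    assert(order.size() == adj.size());
    std::vector<int> pos(order.size());
    for(std::size_t k = 0; k < order.size(); ++k) pos[order[k]] = static_cast<int>(k);
    int bw = 0;
    for(std::size_t u = 0; u < adj.size(); ++u)
        for(int v : adj[u]) bw = std::max(bw, std::abs(pos[u] - pos[v]));
    return bw;
}
```

Reversing the returned order gives the reverse Cuthill-McKee variant.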
-
void RCMK(LocalVector<int> *permutation) const#
Create permutation vector for reverse CMK reordering of the matrix.
The Reverse Cuthill-McKee ordering reduces the bandwidth of a given sparse matrix.
- Example
LocalVector<int> rcmk;

mat.RCMK(&rcmk);
mat.Permute(rcmk);
- Parameters:
permutation – [out] permutation vector for reverse CMK reordering
-
void ConnectivityOrder(LocalVector<int> *permutation) const#
Create permutation vector for connectivity reordering of the matrix.
Connectivity ordering returns a permutation that sorts the matrix by the number of non-zero entries per row.
- Example
LocalVector<int> conn;

mat.ConnectivityOrder(&conn);
mat.Permute(conn);
- Parameters:
permutation – [out] permutation vector for connectivity reordering
-
void MultiColoring(int &num_colors, int **size_colors, LocalVector<int> *permutation) const#
Perform multi-coloring decomposition of the matrix.
The Multi-Coloring algorithm builds a permutation (coloring of the matrix) in a way such that no two adjacent nodes in the sparse matrix have the same color.
- Example
LocalVector<int> mc;
int num_colors;
int* block_colors = NULL;

mat.MultiColoring(num_colors, &block_colors, &mc);
mat.Permute(mc);
- Parameters:
num_colors – [out] number of colors
size_colors – [out] pointer to array that holds the number of nodes for each color
permutation – [out] permutation vector for multi-coloring reordering
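As an illustration of what such a decomposition produces, the following sketch colors a graph greedily and then groups the nodes by color. Both functions are hypothetical helpers; the library's algorithm and its permutation convention may differ, and the greedy strategy shown here does not guarantee a minimal number of colors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Greedy coloring: every node receives the smallest color not used by its neighbors.
// Returns the color of each node and writes the number of colors to num_colors.
std::vector<int> greedy_coloring(const std::vector<std::vector<int>>& adj, int& num_colors)
{
    const int n = static_cast<int>(adj.size());
    std::vector<int> color(n, -1);
    num_colors = 0;
    for(int u = 0; u < n; ++u)
    {
        // One extra slot so that a free color always exists
        std::vector<bool> used(num_colors + 1, false);
        for(int v : adj[u])
            if(color[v] >= 0) used[color[v]] = true;
        int c = 0;
        while(used[c]) ++c;
        color[u] = c;
        if(c + 1 > num_colors) num_colors = c + 1;
    }
    return color;
}

// Group nodes by color. Returns order[k] = original node at position k and
// fills size_colors with the number of nodes of each color.
std::vector<int> color_permutation(const std::vector<int>& color, int num_colors,
                                   std::vector<int>& size_colors)
{
    size_colors.assign(num_colors, 0);
    for(int c : color)
    {
        assert(c >= 0 && c < num_colors);
        ++size_colors[c];
    }
    std::vector<int> offset(num_colors, 0);
    for(int c = 1; c < num_colors; ++c) offset[c] = offset[c - 1] + size_colors[c - 1];

    std::vector<int> order(color.size());
    for(std::size_t u = 0; u < color.size(); ++u)
        order[offset[color[u]]++] = static_cast<int>(u);
    return order;
}
```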
-
void MaximalIndependentSet(int &size, LocalVector<int> *permutation) const#
Perform maximal independent set decomposition of the matrix.
The Maximal Independent Set algorithm finds a set of maximal size that contains elements which do not depend on other elements in this set.
- Example
LocalVector<int> mis;
int size;

mat.MaximalIndependentSet(size, &mis);
mat.Permute(mis);
- Parameters:
size – [out] number of independent sets
permutation – [out] permutation vector for maximal independent set reordering
-
void ZeroBlockPermutation(int &size, LocalVector<int> *permutation) const#
Return a permutation for saddle-point problems (zero diagonal entries)
For saddle-point problems (i.e. matrices with zero diagonal entries), the Zero Block Permutation maps all zero-diagonal elements to the last block of the matrix.
- Example
LocalVector<int> zbp;
int size;

mat.ZeroBlockPermutation(size, &zbp);
mat.Permute(zbp);
- Parameters:
size – [out]
permutation – [out] permutation vector for zero block permutation
-
void ILU0Factorize(void)#
Perform ILU(0) factorization.
-
void ILUTFactorize(double t, int maxrow)#
Perform ILU(t,m) factorization based on threshold and maximum number of elements per row.
-
void ILUpFactorize(int p, bool level = true)#
Perform ILU(p) factorization based on power.
-
void LUAnalyse(void)#
Analyse the structure (level-scheduling)
-
void LUAnalyseClear(void)#
Delete the analysed data (see LUAnalyse)
-
void LUSolve(const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#
Solve LU out = in; if level-scheduling algorithm is provided then the graph traversing is performed in parallel.
-
void ICFactorize(LocalVector<ValueType> *inv_diag)#
Perform IC(0) factorization.
-
void LLAnalyse(void)#
Analyse the structure (level-scheduling)
-
void LLAnalyseClear(void)#
Delete the analysed data (see LLAnalyse)
-
void LLSolve(const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#
Solve LL^T out = in; if level-scheduling algorithm is provided then the graph traversing is performed in parallel.
-
void LLSolve(const LocalVector<ValueType> &in, const LocalVector<ValueType> &inv_diag, LocalVector<ValueType> *out) const#
Solve LL^T out = in; if level-scheduling algorithm is provided then the graph traversing is performed in parallel.
-
void LAnalyse(bool diag_unit = false)#
Analyse the structure (level-scheduling) L-part.
diag_unit == true the diag is 1;
diag_unit == false the diag is 0;
-
void LAnalyseClear(void)#
Delete the analysed data (see LAnalyse) L-part.
-
void LSolve(const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#
Solve L out = in; if level-scheduling algorithm is provided then the graph traversing is performed in parallel.
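The role of level scheduling in a triangular solve can be sketched as follows: rows that do not depend on each other share a level and could be processed concurrently. The struct CsrLower and both functions are illustrative only; the interpretation of diag_unit (treat the diagonal as 1 and ignore stored diagonal values) is an assumption, and the solve itself is written sequentially.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Lower-triangular matrix in CSR format (host data only).
struct CsrLower
{
    int                 n;
    std::vector<int>    row_ptr, col_ind;
    std::vector<double> val;
};

// Level of row i = 1 + largest level among the rows it depends on.
// Rows with the same level are mutually independent.
std::vector<int> compute_levels(const CsrLower& L)
{
    std::vector<int> level(L.n, 0);
    for(int i = 0; i < L.n; ++i)
        for(int k = L.row_ptr[i]; k < L.row_ptr[i + 1]; ++k)
        {
            int j = L.col_ind[k];
            if(j < i) level[i] = std::max(level[i], level[j] + 1);
        }
    return level;
}

// Forward substitution for L * out = in.
std::vector<double> lower_solve(const CsrLower& L, const std::vector<double>& in, bool diag_unit)
{
    assert(static_cast<int>(in.size()) == L.n);
    std::vector<double> out(L.n);
    for(int i = 0; i < L.n; ++i)
    {
        double sum  = in[i];
        double diag = 1.0;
        for(int k = L.row_ptr[i]; k < L.row_ptr[i + 1]; ++k)
        {
            int j = L.col_ind[k];
            if(j < i)       sum -= L.val[k] * out[j]; // already computed entries
            else if(j == i) diag = L.val[k];          // stored diagonal
        }
        out[i] = diag_unit ? sum : sum / diag;
    }
    return out;
}
```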
-
void UAnalyse(bool diag_unit = false)#
Analyse the structure (level-scheduling) U-part.
diag_unit == true the diag is 1;
diag_unit == false the diag is 0;
-
void UAnalyseClear(void)#
Delete the analysed data (see UAnalyse) U-part.
-
void USolve(const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#
Solve U out = in; if level-scheduling algorithm is provided then the graph traversing is performed in parallel.
-
void Householder(int idx, ValueType &beta, LocalVector<ValueType> *vec) const#
Compute Householder vector.
-
void QRSolve(const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#
Solve QR out = in.
-
void ReadFileMTX(const std::string filename)#
Read matrix from MTX (Matrix Market Format) file.
Read a matrix from Matrix Market Format file.
- Example
LocalMatrix<ValueType> mat;
mat.ReadFileMTX("my_matrix.mtx");
- Parameters:
filename – [in] name of the file containing the MTX data.
-
void WriteFileMTX(const std::string filename) const#
Write matrix to MTX (Matrix Market Format) file.
Write a matrix to Matrix Market Format file.
- Example
LocalMatrix<ValueType> mat;

// Allocate and fill mat
// ...

mat.WriteFileMTX("my_matrix.mtx");
- Parameters:
filename – [in] name of the file to write the MTX data to.
-
void ReadFileCSR(const std::string filename)#
Read matrix from CSR (rocALUTION binary format) file.
Read a CSR matrix from binary file. For details on the format, see WriteFileCSR().
- Example
LocalMatrix<ValueType> mat;
mat.ReadFileCSR("my_matrix.csr");
- Parameters:
filename – [in] name of the file containing the data.
-
void WriteFileCSR(const std::string filename) const#
Write CSR matrix to binary file.
Write a CSR matrix to binary file.
The binary format contains a header, the rocALUTION version and the matrix data as follows
// Header
out << "#rocALUTION binary csr file" << std::endl;

// rocALUTION version
out.write((char*)&version, sizeof(int));

// CSR matrix data
out.write((char*)&m, sizeof(int));
out.write((char*)&n, sizeof(int));
out.write((char*)&nnz, sizeof(int));
out.write((char*)csr_row_ptr, (m + 1) * sizeof(int));
out.write((char*)csr_col_ind, nnz * sizeof(int));
out.write((char*)csr_val, nnz * sizeof(double));
- Example
LocalMatrix<ValueType> mat;

// Allocate and fill mat
// ...

mat.WriteFileCSR("my_matrix.csr");
Note
The values array is always stored in double precision (e.g. double or std::complex<double>).
- Parameters:
filename – [in] name of the file to write the data to.
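A round trip through the layout listed above can be sketched with standard streams. The struct CsrData and the functions write_csr and read_csr are hypothetical helpers for real-valued data; they follow the field order shown in the format description but are not the library's reader or writer, and the version value is whatever the caller supplies.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

struct CsrData
{
    int                 m = 0, n = 0, nnz = 0;
    std::vector<int>    row_ptr, col_ind;
    std::vector<double> val;
};

// Write header line, version, sizes and the three CSR arrays.
void write_csr(std::ostream& out, int version, const CsrData& A)
{
    assert(A.row_ptr.size() == static_cast<std::size_t>(A.m) + 1);
    assert(A.col_ind.size() == static_cast<std::size_t>(A.nnz));
    assert(A.val.size() == A.col_ind.size());

    out << "#rocALUTION binary csr file" << std::endl;
    out.write(reinterpret_cast<const char*>(&version), sizeof(int));
    out.write(reinterpret_cast<const char*>(&A.m), sizeof(int));
    out.write(reinterpret_cast<const char*>(&A.n), sizeof(int));
    out.write(reinterpret_cast<const char*>(&A.nnz), sizeof(int));
    out.write(reinterpret_cast<const char*>(A.row_ptr.data()), (A.m + 1) * sizeof(int));
    out.write(reinterpret_cast<const char*>(A.col_ind.data()), A.nnz * sizeof(int));
    out.write(reinterpret_cast<const char*>(A.val.data()), A.nnz * sizeof(double));
}

// Read the same layout back; returns false on a bad header or short read.
bool read_csr(std::istream& in, int& version, CsrData& A)
{
    std::string header;
    std::getline(in, header);
    if(header != "#rocALUTION binary csr file") return false;

    in.read(reinterpret_cast<char*>(&version), sizeof(int));
    in.read(reinterpret_cast<char*>(&A.m), sizeof(int));
    in.read(reinterpret_cast<char*>(&A.n), sizeof(int));
    in.read(reinterpret_cast<char*>(&A.nnz), sizeof(int));
    if(!in || A.m < 0 || A.n < 0 || A.nnz < 0) return false;

    A.row_ptr.resize(static_cast<std::size_t>(A.m) + 1);
    A.col_ind.resize(A.nnz);
    A.val.resize(A.nnz);
    in.read(reinterpret_cast<char*>(A.row_ptr.data()), (A.m + 1) * sizeof(int));
    in.read(reinterpret_cast<char*>(A.col_ind.data()), A.nnz * sizeof(int));
    in.read(reinterpret_cast<char*>(A.val.data()), A.nnz * sizeof(double));
    return static_cast<bool>(in);
}
```

Because the raw bytes are written as-is, a file produced this way is only portable between systems with the same integer size and byte order.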
-
virtual void MoveToAccelerator(void)#
Move the object to the accelerator backend.
-
virtual void MoveToAcceleratorAsync(void)#
Move the object to the accelerator backend with async move.
-
virtual void MoveToHost(void)#
Move the object to the host backend.
-
virtual void MoveToHostAsync(void)#
Move the object to the host backend with async move.
-
virtual void Sync(void)#
Sync (the async move)
-
void CopyFrom(const LocalMatrix<ValueType> &src)#
Copy matrix from another LocalMatrix.
CopyFrom
copies values and structure from another local matrix. Source and destination matrix should be in the same format.
- Example
LocalMatrix<ValueType> mat1, mat2;

// Allocate and initialize mat1 and mat2
// ...

// Move mat1 to accelerator
mat1.MoveToAccelerator();

// Now, mat1 is on the accelerator (if available)
// and mat2 is on the host

// Copy mat1 to mat2 (or vice versa) will move data between host and
// accelerator backend
mat1.CopyFrom(mat2);
Note
This function allows cross platform copying. One of the objects could be allocated on the accelerator backend.
- Parameters:
src – [in] Local matrix where values and structure should be copied from.
-
void CopyFromAsync(const LocalMatrix<ValueType> &src)#
Async copy matrix (values and structure) from another LocalMatrix.
-
void CloneFrom(const LocalMatrix<ValueType> &src)#
Clone the matrix.
CloneFrom
clones the entire matrix, including values, structure and backend descriptor from another LocalMatrix.
- Example
LocalMatrix<ValueType> mat;

// Allocate and initialize mat (host or accelerator)
// ...

LocalMatrix<ValueType> tmp;

// By cloning mat, tmp will have identical values and structure and will be on
// the same backend as mat
tmp.CloneFrom(mat);
- Parameters:
src – [in] LocalMatrix to clone from.
-
void UpdateValuesCSR(ValueType *val)#
Update CSR matrix entries only, structure will remain the same.
-
void CopyFromCSR(const int *row_offsets, const int *col, const ValueType *val)#
Copy (import) CSR matrix described in three arrays (offsets, columns, values). The object data has to be allocated (call AllocateCSR first)
-
void CopyToCSR(int *row_offsets, int *col, ValueType *val) const#
Copy (export) CSR matrix described in three arrays (offsets, columns, values). The output arrays have to be allocated.
-
void CopyFromCOO(const int *row, const int *col, const ValueType *val)#
Copy (import) COO matrix described in three arrays (rows, columns, values). The object data has to be allocated (call AllocateCOO first)
-
void CopyToCOO(int *row, int *col, ValueType *val) const#
Copy (export) COO matrix described in three arrays (rows, columns, values). The output arrays have to be allocated.
-
void CopyFromHostCSR(const int *row_offset, const int *col, const ValueType *val, const std::string name, int nnz, int nrow, int ncol)#
Allocates and copies (imports) a host CSR matrix.
If the CSR matrix data pointers are only accessible as constant, the user can create a LocalMatrix object and pass const CSR host pointers. The LocalMatrix will then be allocated and the data will be copied to the backend on which the original object was located.
- Parameters:
row_offset – [in] CSR matrix row offset pointers.
col – [in] CSR matrix column indices.
val – [in] CSR matrix values array.
name – [in] Matrix object name.
nnz – [in] Number of non-zero elements.
nrow – [in] Number of rows.
ncol – [in] Number of columns.
-
void CreateFromMap(const LocalVector<int> &map, int n, int m)#
Create a restriction matrix operator based on an int vector map.
-
void CreateFromMap(const LocalVector<int> &map, int n, int m, LocalMatrix<ValueType> *pro)#
Create a restriction and prolongation matrix operator based on an int vector map.
-
void ConvertToCSR(void)#
Convert the matrix to CSR structure.
-
void ConvertToMCSR(void)#
Convert the matrix to MCSR structure.
-
void ConvertToBCSR(int blockdim)#
Convert the matrix to BCSR structure.
-
void ConvertToCOO(void)#
Convert the matrix to COO structure.
-
void ConvertToELL(void)#
Convert the matrix to ELL structure.
-
void ConvertToDIA(void)#
Convert the matrix to DIA structure.
-
void ConvertToHYB(void)#
Convert the matrix to HYB structure.
-
void ConvertToDENSE(void)#
Convert the matrix to DENSE structure.
-
void ConvertTo(unsigned int matrix_format, int blockdim = 1)#
Convert the matrix to specified matrix ID format.
-
virtual void Apply(const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#
Apply the operator, out = Operator(in), where in and out are local vectors.
-
virtual void ApplyAdd(const LocalVector<ValueType> &in, ValueType scalar, LocalVector<ValueType> *out) const#
Apply and add the operator, out += scalar * Operator(in), where in and out are local vectors.
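For a matrix in CSR format, the operation out += scalar * Operator(in) amounts to a sparse matrix-vector product accumulated into out. The function csr_apply_add below is a host-only sketch with plain std::vector arguments, not the library routine.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// out += scalar * A * in, with A given by its CSR arrays.
void csr_apply_add(const std::vector<int>&    row_ptr,
                   const std::vector<int>&    col_ind,
                   const std::vector<double>& val,
                   const std::vector<double>& in,
                   double                     scalar,
                   std::vector<double>&       out)
{
    assert(row_ptr.size() == out.size() + 1);
    for(std::size_t i = 0; i < out.size(); ++i)
    {
        // Dot product of row i with the input vector
        double sum = 0.0;
        for(int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += val[k] * in[col_ind[k]];
        out[i] += scalar * sum;
    }
}
```

Setting out to zero beforehand and using scalar = 1 reproduces the plain Apply operation.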
-
void SymbolicPower(int p)#
Perform symbolic computation (structure only) of \(|this|^p\).
-
void MatrixAdd(const LocalMatrix<ValueType> &mat, ValueType alpha = static_cast<ValueType>(1), ValueType beta = static_cast<ValueType>(1), bool structure = false)#
Perform matrix addition, this = alpha*this + beta*mat.
if structure==false the sparsity pattern of the matrix is not changed;
if structure==true a new sparsity pattern is computed
-
void MatrixMult(const LocalMatrix<ValueType> &A, const LocalMatrix<ValueType> &B)#
Multiply two matrices, this = A * B.
-
void DiagonalMatrixMult(const LocalVector<ValueType> &diag)#
Multiply the matrix with diagonal matrix (stored in LocalVector), as DiagonalMatrixMultR()
-
void DiagonalMatrixMultL(const LocalVector<ValueType> &diag)#
Multiply the matrix with diagonal matrix (stored in LocalVector), this=diag*this.
-
void DiagonalMatrixMultR(const LocalVector<ValueType> &diag)#
Multiply the matrix with diagonal matrix (stored in LocalVector), this=this*diag.
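The difference between the left and right variants is easiest to see on CSR data: multiplying from the left scales rows, multiplying from the right scales columns. The two functions below are illustrative host-side helpers with hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A = diag * A: every entry of row i is scaled by diag[i].
void diag_mult_left(const std::vector<int>& row_ptr, std::vector<double>& val,
                    const std::vector<double>& diag)
{
    assert(row_ptr.size() == diag.size() + 1);
    for(std::size_t i = 0; i < diag.size(); ++i)
        for(int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            val[k] *= diag[i];
}

// A = A * diag: every entry in column j is scaled by diag[j].
void diag_mult_right(const std::vector<int>& col_ind, std::vector<double>& val,
                     const std::vector<double>& diag)
{
    assert(col_ind.size() == val.size());
    for(std::size_t k = 0; k < val.size(); ++k)
        val[k] *= diag[col_ind[k]];
}
```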
-
void Gershgorin(ValueType &lambda_min, ValueType &lambda_max) const#
Compute the spectrum approximation with Gershgorin circles theorem.
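Gershgorin's circle theorem states that every eigenvalue lies in at least one disc centered at a diagonal entry a_ii with radius equal to the sum of |a_ij| over j != i. For a real matrix this gives simple lower and upper bounds, sketched below for CSR data. The function gershgorin is a hypothetical helper and may not match how the library computes or reports the bounds.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Bounds on the spectrum of a real CSR matrix from its Gershgorin discs.
void gershgorin(const std::vector<int>&    row_ptr,
                const std::vector<int>&    col_ind,
                const std::vector<double>& val,
                double&                    lambda_min,
                double&                    lambda_max)
{
    const int nrow = static_cast<int>(row_ptr.size()) - 1;
    assert(nrow > 0);

    lambda_min = std::numeric_limits<double>::max();
    lambda_max = std::numeric_limits<double>::lowest();

    for(int i = 0; i < nrow; ++i)
    {
        double center = 0.0; // diagonal entry a_ii
        double radius = 0.0; // sum of |a_ij| for j != i
        for(int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
        {
            if(col_ind[k] == i) center += val[k];
            else                radius += std::fabs(val[k]);
        }
        lambda_min = std::min(lambda_min, center - radius);
        lambda_max = std::max(lambda_max, center + radius);
    }
}
```

For the 2x2 matrix with rows (2, -1) and (-1, 2) the bounds are 1 and 3, which happen to coincide with its eigenvalues.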
-
void Compress(double drop_off)#
Delete all entries of the matrix for which abs(a_ij) <= drop_off; the diagonal elements are never deleted.
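The drop rule can be sketched on CSR arrays as follows: an entry survives if it lies on the diagonal or if its magnitude exceeds the threshold. The function csr_compress is an illustrative host-side helper, not the library routine.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Remove all off-diagonal entries with |a_ij| <= drop_off; diagonal entries are kept.
void csr_compress(std::vector<int>&    row_ptr,
                  std::vector<int>&    col_ind,
                  std::vector<double>& val,
                  double               drop_off)
{
    assert(!row_ptr.empty() && col_ind.size() == val.size());
    const int nrow = static_cast<int>(row_ptr.size()) - 1;

    std::vector<int>    new_row_ptr(nrow + 1, 0);
    std::vector<int>    new_col;
    std::vector<double> new_val;

    for(int i = 0; i < nrow; ++i)
    {
        for(int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
        {
            if(col_ind[k] == i || std::fabs(val[k]) > drop_off)
            {
                new_col.push_back(col_ind[k]);
                new_val.push_back(val[k]);
            }
        }
        new_row_ptr[i + 1] = static_cast<int>(new_col.size());
    }

    row_ptr.swap(new_row_ptr);
    col_ind.swap(new_col);
    val.swap(new_val);
}
```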
-
void Transpose(void)#
Transpose the matrix.
-
void Sort(void)#
Sort the matrix indices.
Sorts the matrix by indices.
For CSR matrices, column indices are sorted.
For COO matrices, row indices are sorted.
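For the CSR case, sorting means reordering the column indices within each row while keeping every value attached to its index. A minimal host-side sketch, with the hypothetical helper name csr_sort_columns:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sort the column indices of every CSR row in increasing order,
// moving the values along with their indices.
void csr_sort_columns(const std::vector<int>& row_ptr,
                      std::vector<int>&       col_ind,
                      std::vector<double>&    val)
{
    assert(col_ind.size() == val.size());
    std::vector<std::pair<int, double>> row;

    for(std::size_t i = 0; i + 1 < row_ptr.size(); ++i)
    {
        // Gather (column, value) pairs of row i
        row.clear();
        for(int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            row.emplace_back(col_ind[k], val[k]);

        std::stable_sort(row.begin(), row.end(),
                         [](const std::pair<int, double>& a, const std::pair<int, double>& b) {
                             return a.first < b.first;
                         });

        // Write the sorted pairs back in place
        for(std::size_t t = 0; t < row.size(); ++t)
        {
            col_ind[row_ptr[i] + t] = row[t].first;
            val[row_ptr[i] + t]     = row[t].second;
        }
    }
}
```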
-
void Key(long int &row_key, long int &col_key, long int &val_key) const#
Compute a unique hash key for the matrix arrays.
Typically, it is hard to check whether two matrices have the same structure (and values). To do so, rocALUTION provides a keying function that generates three keys: one each for the row index, column index and values arrays.
- Parameters:
row_key – [out] row index array key
col_key – [out] column index array key
val_key – [out] values array key
-
void ReplaceColumnVector(int idx, const LocalVector<ValueType> &vec)#
Replace a column vector of a matrix.
-
void ReplaceRowVector(int idx, const LocalVector<ValueType> &vec)#
Replace a row vector of a matrix.
-
void ExtractColumnVector(int idx, LocalVector<ValueType> *vec) const#
Extract values from a column of a matrix to a vector.
-
void ExtractRowVector(int idx, LocalVector<ValueType> *vec) const#
Extract values from a row of a matrix to a vector.
-
void AMGConnect(ValueType eps, LocalVector<int> *connections) const#
Strong couplings for aggregation-based AMG.
-
void AMGAggregate(const LocalVector<int> &connections, LocalVector<int> *aggregates) const#
Plain aggregation - Modification of a greedy aggregation scheme from Vanek (1996)
-
void AMGSmoothedAggregation(ValueType relax, const LocalVector<int> &aggregates, const LocalVector<int> &connections, LocalMatrix<ValueType> *prolong, LocalMatrix<ValueType> *restrict) const#
Interpolation scheme based on smoothed aggregation from Vanek (1996)
-
void AMGAggregation(const LocalVector<int> &aggregates, LocalMatrix<ValueType> *prolong, LocalMatrix<ValueType> *restrict) const#
Aggregation-based interpolation scheme.
-
void RugeStueben(ValueType eps, LocalMatrix<ValueType> *prolong, LocalMatrix<ValueType> *restrict) const#
Ruge Stueben coarsening.
-
void FSAI(int power, const LocalMatrix<ValueType> *pattern)#
Factorized Sparse Approximate Inverse assembly for given system matrix power pattern or external sparsity pattern.
-
void SPAI(void)#
SParse Approximate Inverse assembly for given system matrix pattern.
-
void InitialPairwiseAggregation(ValueType beta, int &nc, LocalVector<int> *G, int &Gsize, int **rG, int &rGsize, int ordering) const#
Initial Pairwise Aggregation scheme.
-
void InitialPairwiseAggregation(const LocalMatrix<ValueType> &mat, ValueType beta, int &nc, LocalVector<int> *G, int &Gsize, int **rG, int &rGsize, int ordering) const#
Initial Pairwise Aggregation scheme for split matrices.
-
void FurtherPairwiseAggregation(ValueType beta, int &nc, LocalVector<int> *G, int &Gsize, int **rG, int &rGsize, int ordering) const#
Further Pairwise Aggregation scheme.
-
void FurtherPairwiseAggregation(const LocalMatrix<ValueType> &mat, ValueType beta, int &nc, LocalVector<int> *G, int &Gsize, int **rG, int &rGsize, int ordering) const#
Further Pairwise Aggregation scheme for split matrices.
-
void CoarsenOperator(LocalMatrix<ValueType> *Ac, ParallelManager *pm, int nrow, int ncol, const LocalVector<int> &G, int Gsize, const int *rG, int rGsize) const#
Build coarse operator for pairwise aggregation scheme.
Local Stencil#
-
template<typename ValueType>
class LocalStencil : public rocalution::Operator<ValueType># LocalStencil class.
A LocalStencil is called local, because it will always stay on a single system. The system can contain several CPUs via UMA or NUMA memory system or it can contain an accelerator.
- Template Parameters:
ValueType – can be int, float, double, std::complex<float> and std::complex<double>
Public Functions
-
LocalStencil(unsigned int type)#
Initialize a local stencil with a type.
-
virtual void Info() const#
Print object information.
Info
can print object information about any rocALUTION object. This information consists of object properties and backend data.
- Example
mat.Info();
vec.Info();
-
int GetNDim(void) const#
Return the dimension of the stencil.
-
virtual IndexType2 GetM(void) const#
Return the number of rows in the matrix/stencil.
-
virtual IndexType2 GetN(void) const#
Return the number of columns in the matrix/stencil.
-
virtual IndexType2 GetNnz(void) const#
Return the number of non-zeros in the matrix/stencil.
-
void SetGrid(int size)#
Set the stencil grid size.
-
virtual void Clear()#
Clear (free all data) the object.
-
virtual void Apply(const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#
Apply the operator, out = Operator(in), where in and out are local vectors.
-
virtual void ApplyAdd(const LocalVector<ValueType> &in, ValueType scalar, LocalVector<ValueType> *out) const#
Apply and add the operator, out += scalar * Operator(in), where in and out are local vectors.
-
virtual void MoveToAccelerator(void)#
Move the object to the accelerator backend.
-
virtual void MoveToHost(void)#
Move the object to the host backend.
Global Matrix#
-
template<typename ValueType>
class GlobalMatrix : public rocalution::Operator<ValueType># GlobalMatrix class.
A GlobalMatrix is called global, because it can stay on a single or on multiple nodes in a network. For this type of communication, MPI is used.
A number of matrix formats are supported. These are CSR, BCSR, MCSR, COO, DIA, ELL, HYB, and DENSE.
Note
For CSR type matrices, the column indices must be sorted in increasing order. For COO matrices, the row indices must be sorted in increasing order. The function
Check
can be used to check whether a matrix contains valid data. For CSR and COO matrices, the function
Sort
can be used to sort the column or row indices, respectively.
- Template Parameters:
ValueType – can be int, float, double, std::complex<float> and std::complex<double>
Public Functions
-
GlobalMatrix(const ParallelManager &pm)#
Initialize a global matrix with a parallel manager.
-
virtual IndexType2 GetM(void) const#
Return the number of rows in the matrix/stencil.
-
virtual IndexType2 GetN(void) const#
Return the number of columns in the matrix/stencil.
-
virtual IndexType2 GetNnz(void) const#
Return the number of non-zeros in the matrix/stencil.
-
virtual int GetLocalM(void) const#
Return the number of rows in the local matrix/stencil.
-
virtual int GetLocalN(void) const#
Return the number of columns in the local matrix/stencil.
-
virtual int GetLocalNnz(void) const#
Return the number of non-zeros in the local matrix/stencil.
-
virtual int GetGhostM(void) const#
Return the number of rows in the ghost matrix/stencil.
-
virtual int GetGhostN(void) const#
Return the number of columns in the ghost matrix/stencil.
-
virtual int GetGhostNnz(void) const#
Return the number of non-zeros in the ghost matrix/stencil.
-
virtual void MoveToAccelerator(void)#
Move the object to the accelerator backend.
-
virtual void MoveToHost(void)#
Move the object to the host backend.
-
virtual void Info(void) const#
Print object information.
Info
can print object information about any rocALUTION object. This information consists of object properties and backend data.
- Example
mat.Info();
vec.Info();
-
virtual bool Check(void) const#
Perform a sanity check of the matrix.
Checks whether the matrix contains valid data, i.e. that the values are neither infinity nor NaN (not a number) and that the structure of the matrix is correct (e.g. indices cannot be negative, CSR and COO matrices have to be sorted, etc.).
- Return values:
true – if the matrix is ok (empty matrix is also ok).
false – if there is something wrong with the structure or values.
-
void AllocateCSR(std::string name, int local_nnz, int ghost_nnz)#
Allocate CSR Matrix.
-
void AllocateCOO(std::string name, int local_nnz, int ghost_nnz)#
Allocate COO Matrix.
-
virtual void Clear(void)#
Clear (free all data) the object.
-
void SetParallelManager(const ParallelManager &pm)#
Set the parallel manager of a global matrix.
-
void SetDataPtrCSR(int **local_row_offset, int **local_col, ValueType **local_val, int **ghost_row_offset, int **ghost_col, ValueType **ghost_val, std::string name, int local_nnz, int ghost_nnz)#
Initialize a CSR matrix on the host with externally allocated data.
-
void SetDataPtrCOO(int **local_row, int **local_col, ValueType **local_val, int **ghost_row, int **ghost_col, ValueType **ghost_val, std::string name, int local_nnz, int ghost_nnz)#
Initialize a COO matrix on the host with externally allocated data.
-
void SetLocalDataPtrCSR(int **row_offset, int **col, ValueType **val, std::string name, int nnz)#
Initialize a CSR matrix on the host with externally allocated local data.
-
void SetLocalDataPtrCOO(int **row, int **col, ValueType **val, std::string name, int nnz)#
Initialize a COO matrix on the host with externally allocated local data.
-
void SetGhostDataPtrCSR(int **row_offset, int **col, ValueType **val, std::string name, int nnz)#
Initialize a CSR matrix on the host with externally allocated ghost data.
-
void SetGhostDataPtrCOO(int **row, int **col, ValueType **val, std::string name, int nnz)#
Initialize a COO matrix on the host with externally allocated ghost data.
-
void LeaveDataPtrCSR(int **local_row_offset, int **local_col, ValueType **local_val, int **ghost_row_offset, int **ghost_col, ValueType **ghost_val)#
Leave a CSR matrix to host pointers.
-
void LeaveDataPtrCOO(int **local_row, int **local_col, ValueType **local_val, int **ghost_row, int **ghost_col, ValueType **ghost_val)#
Leave a COO matrix to host pointers.
-
void LeaveLocalDataPtrCSR(int **row_offset, int **col, ValueType **val)#
Leave a local CSR matrix to host pointers.
-
void LeaveLocalDataPtrCOO(int **row, int **col, ValueType **val)#
Leave a local COO matrix to host pointers.
-
void LeaveGhostDataPtrCSR(int **row_offset, int **col, ValueType **val)#
Leave a CSR ghost matrix to host pointers.
-
void LeaveGhostDataPtrCOO(int **row, int **col, ValueType **val)#
Leave a COO ghost matrix to host pointers.
-
void CloneFrom(const GlobalMatrix<ValueType> &src)#
Clone the entire matrix (values, structure and backend descriptor) from another GlobalMatrix.
-
void CopyFrom(const GlobalMatrix<ValueType> &src)#
Copy matrix (values and structure) from another GlobalMatrix.
-
void ConvertToCSR(void)#
Convert the matrix to CSR structure.
-
void ConvertToMCSR(void)#
Convert the matrix to MCSR structure.
-
void ConvertToBCSR(int blockdim)#
Convert the matrix to BCSR structure.
-
void ConvertToCOO(void)#
Convert the matrix to COO structure.
-
void ConvertToELL(void)#
Convert the matrix to ELL structure.
-
void ConvertToDIA(void)#
Convert the matrix to DIA structure.
-
void ConvertToHYB(void)#
Convert the matrix to HYB structure.
-
void ConvertToDENSE(void)#
Convert the matrix to DENSE structure.
-
void ConvertTo(unsigned int matrix_format, int blockdim = 1)#
Convert the matrix to specified matrix ID format.
-
virtual void Apply(const GlobalVector<ValueType> &in, GlobalVector<ValueType> *out) const#
Apply the operator, out = Operator(in), where in and out are global vectors.
-
virtual void ApplyAdd(const GlobalVector<ValueType> &in, ValueType scalar, GlobalVector<ValueType> *out) const#
Apply and add the operator, out += scalar * Operator(in), where in and out are global vectors.
-
void ReadFileMTX(const std::string filename)#
Read matrix from MTX (Matrix Market Format) file.
-
void WriteFileMTX(const std::string filename) const#
Write matrix to MTX (Matrix Market Format) file.
-
void ReadFileCSR(const std::string filename)#
Read matrix from CSR (rocALUTION binary format) file.
-
void WriteFileCSR(const std::string filename) const#
Write matrix to CSR (rocALUTION binary format) file.
-
void Sort(void)#
Sort the matrix indices.
Sorts the matrix by indices.
For CSR matrices, column indices are sorted.
For COO matrices, row indices are sorted.
-
void ExtractInverseDiagonal(GlobalVector<ValueType> *vec_inv_diag) const#
Extract the inverse (reciprocal) diagonal values of the matrix into a GlobalVector.
-
void InitialPairwiseAggregation(ValueType beta, int &nc, LocalVector<int> *G, int &Gsize, int **rG, int &rGsize, int ordering) const#
Initial Pairwise Aggregation scheme.
-
void FurtherPairwiseAggregation(ValueType beta, int &nc, LocalVector<int> *G, int &Gsize, int **rG, int &rGsize, int ordering) const#
Further Pairwise Aggregation scheme.
-
void CoarsenOperator(GlobalMatrix<ValueType> *Ac, ParallelManager *pm, int nrow, int ncol, const LocalVector<int> &G, int Gsize, const int *rG, int rGsize) const#
Build coarse operator for pairwise aggregation scheme.
Local Vector#
-
template<typename ValueType>
class LocalVector : public rocalution::Vector<ValueType># LocalVector class.
A LocalVector is called local, because it will always stay on a single system. The system can contain several CPUs via UMA or NUMA memory system or it can contain an accelerator.
- Template Parameters:
ValueType – can be int, float, double, std::complex<float> and std::complex<double>
Unnamed Group
-
ValueType &operator[](int i)#
Access operator (only for host data)
The elements in the vector can be accessed via [] operators, when the vector is allocated on the host.
- Example
// rocALUTION local vector object
LocalVector<ValueType> vec;

// Allocate vector
vec.Allocate("my_vector", 100);

// Initialize vector with 1
vec.Ones();

// Set even elements to -1
for(int i = 0; i < vec.GetSize(); i += 2)
{
    vec[i] = -1;
}
- Parameters:
i – [in] access data at index i
- Returns:
value at index i
-
const ValueType &operator[](int i) const#
Access operator (only for host data)
The elements in the vector can be accessed via [] operators, when the vector is allocated on the host.
- Example
// rocALUTION local vector object
LocalVector<ValueType> vec;

// Allocate vector
vec.Allocate("my_vector", 100);

// Initialize vector with 1
vec.Ones();

// Set even elements to -1
for(int i = 0; i < vec.GetSize(); i += 2)
{
    vec[i] = -1;
}
- Parameters:
i – [in] access data at index i
- Returns:
value at index i
Public Functions
-
virtual void MoveToAccelerator(void)#
Move the object to the accelerator backend.
-
virtual void MoveToAcceleratorAsync(void)#
Move the object to the accelerator backend with async move.
-
virtual void MoveToHost(void)#
Move the object to the host backend.
-
virtual void MoveToHostAsync(void)#
Move the object to the host backend with async move.
-
virtual void Sync(void)#
Synchronize the object after an asynchronous move.
-
virtual void Info(void) const#
Print object information.
Info can print object information about any rocALUTION object. This information consists of object properties and backend data.
- Example
mat.Info();
vec.Info();
-
virtual IndexType2 GetSize(void) const#
Return the size of the vector.
-
virtual bool Check(void) const#
Perform a sanity check of the vector.
Checks whether the vector contains valid data, i.e. that no value is infinite or NaN (not a number).
- Return values:
true – if the vector is ok (empty vector is also ok).
false – if there is something wrong with the values.
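The kind of test described above can be sketched for a host array as follows. This is only an illustration of the return values; the function name check_values is hypothetical and the library's own implementation may differ.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Returns true if no value is infinite or NaN; an empty vector is valid.
bool check_values(const std::vector<double>& vec)
{
    for(double v : vec)
    {
        // std::isfinite is false for +inf, -inf and NaN
        if(!std::isfinite(v))
        {
            return false;
        }
    }
    return true;
}
```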
-
void Allocate(std::string name, IndexType2 size)#
Allocate a local vector with name and size.
The allocation function requires an object name (used for information purposes only) and the number of elements in the vector.
- Example
LocalVector<ValueType> vec;
vec.Allocate("my vector", 100);
vec.Clear();
- Parameters:
name – [in] object name
size – [in] number of elements in the vector
-
void SetDataPtr(ValueType **ptr, std::string name, int size)#
Initialize a LocalVector on the host with externally allocated data.
SetDataPtr has direct access to the raw data via pointers. Already allocated data can be set by passing the pointer.
- Example
// Allocate vector
ValueType* ptr_vec = new ValueType[200];

// Fill vector
// ...

// rocALUTION local vector object
LocalVector<ValueType> vec;

// Set the vector data, ptr_vec will become invalid
vec.SetDataPtr(&ptr_vec, "my_vector", 200);
Note
Setting the data pointer leaves the original pointer empty (set to NULL).
-
void LeaveDataPtr(ValueType **ptr)#
Hand over the data of a LocalVector to a host pointer.
LeaveDataPtr has direct access to the raw data via pointers. A LocalVector object can leave its raw data to a host pointer. This will leave the LocalVector empty.
- Example
// rocALUTION local vector object
LocalVector<ValueType> vec;

// Allocate the vector
vec.Allocate("my_vector", 100);

// Fill vector
// ...

ValueType* ptr_vec = NULL;

// Get (steal) the data from the vector, this will leave the local vector object empty
vec.LeaveDataPtr(&ptr_vec);
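The pointer hand-over performed by SetDataPtr and LeaveDataPtr can be pictured with the following self-contained sketch. OwningBuffer is a hypothetical class written for this illustration only; it mimics the documented behaviour (the caller's pointer is reset on SetDataPtr, the object is empty after LeaveDataPtr) and makes no claim about how rocALUTION implements it. The precondition checks are assumptions of the sketch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal owner of a raw buffer; it only mimics the pointer
// hand-over described above and is not the rocALUTION implementation.
class OwningBuffer
{
public:
    OwningBuffer()                               = default;
    OwningBuffer(const OwningBuffer&)            = delete;
    OwningBuffer& operator=(const OwningBuffer&) = delete;
    ~OwningBuffer() { delete[] data_; }

    // Take ownership of *ptr and reset the caller's pointer to nullptr.
    void SetDataPtr(double** ptr, std::size_t size)
    {
        assert(ptr != nullptr && *ptr != nullptr);
        delete[] data_;
        data_ = *ptr;
        size_ = size;
        *ptr  = nullptr;
    }

    // Hand the buffer back to the caller and leave this object empty.
    void LeaveDataPtr(double** ptr)
    {
        assert(ptr != nullptr && *ptr == nullptr);
        *ptr  = data_;
        data_ = nullptr;
        size_ = 0;
    }

    std::size_t GetSize() const { return size_; }

private:
    double*     data_ = nullptr;
    std::size_t size_ = 0;
};
```

After LeaveDataPtr, the caller is responsible for releasing the buffer again.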
-
virtual void Clear()#
Clear (free all data) the object.
-
virtual void Zeros()#
Set all values of the vector to 0.
-
virtual void Ones()#
Set all values of the vector to 1.
-
virtual void SetRandomUniform(unsigned long long seed, ValueType a = static_cast<ValueType>(-1), ValueType b = static_cast<ValueType>(1))#
Fill the vector with random values from interval [a,b].
-
virtual void SetRandomNormal(unsigned long long seed, ValueType mean = static_cast<ValueType>(0), ValueType var = static_cast<ValueType>(1))#
Fill the vector with random values from normal distribution.
-
virtual void ReadFileASCII(const std::string filename)#
Read vector from ASCII file.
Read a vector from ASCII file.
- Example
LocalVector<ValueType> vec;
vec.ReadFileASCII("my_vector.dat");
- Parameters:
filename – [in] name of the file containing the ASCII data.
-
virtual void WriteFileASCII(const std::string filename) const#
Write vector to ASCII file.
Write a vector to ASCII file.
- Example
LocalVector<ValueType> vec;

// Allocate and fill vec
// ...

vec.WriteFileASCII("my_vector.dat");
- Parameters:
filename – [in] name of the file to write the ASCII data to.
-
virtual void ReadFileBinary(const std::string filename)#
Read vector from binary file.
Read a vector from binary file. For details on the format, see WriteFileBinary().
- Example
LocalVector<ValueType> vec;
vec.ReadFileBinary("my_vector.bin");
- Parameters:
filename – [in] name of the file containing the data.
-
virtual void WriteFileBinary(const std::string filename) const#
Write vector to binary file.
Write a vector to binary file.
The binary format contains a header, the rocALUTION version and the vector data as follows:
// Header
out << "#rocALUTION binary vector file" << std::endl;

// rocALUTION version
out.write((char*)&version, sizeof(int));

// Vector data
out.write((char*)&size, sizeof(int));
out.write((char*)vec_val, size * sizeof(double));
- Example
LocalVector<ValueType> vec;

// Allocate and fill vec
// ...

vec.WriteFileBinary("my_vector.bin");
Note
The vector values array is always stored in double precision (e.g. double or std::complex<double>).
- Parameters:
filename – [in] name of the file to write the data to.
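For readers who want to inspect such a file outside the library, the layout listed above can be written and read back with standard streams, as in the following sketch. It covers real-valued data only; write_vector_binary and read_vector_binary are hypothetical helper names, and how the version number is encoded is not specified here, so it is treated as an opaque int.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Write header, version, size and values in the layout listed above.
void write_vector_binary(std::ostream& out, int version, const std::vector<double>& vec)
{
    const int size = static_cast<int>(vec.size());

    out << "#rocALUTION binary vector file" << std::endl;
    out.write(reinterpret_cast<const char*>(&version), sizeof(int));
    out.write(reinterpret_cast<const char*>(&size), sizeof(int));
    out.write(reinterpret_cast<const char*>(vec.data()), size * sizeof(double));
}

// Read the same layout back; returns false if the header or data is malformed.
bool read_vector_binary(std::istream& in, int* version, std::vector<double>* vec)
{
    std::string header;
    if(!std::getline(in, header) || header != "#rocALUTION binary vector file")
    {
        return false;
    }

    int size = 0;
    in.read(reinterpret_cast<char*>(version), sizeof(int));
    in.read(reinterpret_cast<char*>(&size), sizeof(int));
    if(!in || size < 0)
    {
        return false;
    }

    vec->resize(size);
    in.read(reinterpret_cast<char*>(vec->data()), size * sizeof(double));
    return static_cast<bool>(in);
}
```

With files, both streams should be opened in binary mode (std::ios::binary) so that the raw bytes are not translated.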
-
virtual void CopyFrom(const LocalVector<ValueType> &src)#
Copy vector from another vector.
CopyFrom copies values from another vector.
- Example
LocalVector<ValueType> vec1, vec2;

// Allocate and initialize vec1 and vec2
// ...

// Move vec1 to accelerator
// vec1.MoveToAccelerator();

// Now, vec1 is on the accelerator (if available)
// and vec2 is on the host

// Copy vec1 to vec2 (or vice versa) will move data between host and
// accelerator backend
vec1.CopyFrom(vec2);
Note
This function allows cross platform copying. One of the objects could be allocated on the accelerator backend.
- Parameters:
src – [in] Vector, where values should be copied from.
-
virtual void CopyFromAsync(const LocalVector<ValueType> &src)#
Async copy from another local vector.
-
virtual void CopyFromFloat(const LocalVector<float> &src)#
Copy values from another local float vector.
-
virtual void CopyFromDouble(const LocalVector<double> &src)#
Copy values from another local double vector.
-
virtual void CopyFrom(const LocalVector<ValueType> &src, int src_offset, int dst_offset, int size)#
Copy vector from another vector with offsets and size.
CopyFrom copies values with specific source and destination offsets and sizes from another vector.
Note
This function allows cross platform copying. One of the objects could be allocated on the accelerator backend.
- Parameters:
src – [in] Vector, where values should be copied from.
src_offset – [in] source offset.
dst_offset – [in] destination offset.
size – [in] number of entries to be copied.
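The role of the three integer parameters can be seen in this host-only sketch, which copies a block of entries between two std::vector objects. The function copy_with_offsets is a hypothetical name, and the bounds checks are assumptions of the sketch rather than documented library behaviour.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Copy `size` entries, starting at src[src_offset], to dst[dst_offset] onwards.
void copy_with_offsets(const std::vector<double>& src,
                       int                        src_offset,
                       int                        dst_offset,
                       int                        size,
                       std::vector<double>*       dst)
{
    assert(dst != nullptr);
    assert(src_offset >= 0 && dst_offset >= 0 && size >= 0);
    assert(static_cast<std::size_t>(src_offset) + size <= src.size());
    assert(static_cast<std::size_t>(dst_offset) + size <= dst->size());

    std::copy(src.begin() + src_offset,
              src.begin() + src_offset + size,
              dst->begin() + dst_offset);
}
```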
-
void CopyFromPermute(const LocalVector<ValueType> &src, const LocalVector<int> &permutation)#
Copy a vector under permutation (forward permutation)
-
void CopyFromPermuteBackward(const LocalVector<ValueType> &src, const LocalVector<int> &permutation)#
Copy a vector under permutation (backward permutation)
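The difference between the forward and the backward variant is easiest to see in code. The sketch below assumes the common convention that the forward permutation scatters (dst[perm[i]] = src[i]) and the backward permutation gathers (dst[i] = src[perm[i]]); this convention is an assumption of the example and should be checked against the library before relying on it. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Forward permutation (assumed convention): dst[perm[i]] = src[i]
std::vector<double> permute_forward(const std::vector<double>& src,
                                    const std::vector<int>&    perm)
{
    assert(src.size() == perm.size());
    std::vector<double> dst(src.size());
    for(std::size_t i = 0; i < src.size(); ++i)
    {
        dst[perm[i]] = src[i];
    }
    return dst;
}

// Backward permutation (assumed convention): dst[i] = src[perm[i]]
std::vector<double> permute_backward(const std::vector<double>& src,
                                     const std::vector<int>&    perm)
{
    assert(src.size() == perm.size());
    std::vector<double> dst(src.size());
    for(std::size_t i = 0; i < src.size(); ++i)
    {
        dst[i] = src[perm[i]];
    }
    return dst;
}
```

Under this convention, applying the backward permutation to the result of the forward permutation restores the original vector.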
-
virtual void CloneFrom(const LocalVector<ValueType> &src)#
Clone the vector.
CloneFrom clones the entire vector, with data and backend descriptor, from another Vector.
- Example
LocalVector<ValueType> vec;

// Allocate and initialize vec (host or accelerator)
// ...

LocalVector<ValueType> tmp;

// By cloning vec, tmp will have identical values and will be on the same
// backend as vec
tmp.CloneFrom(vec);
- Parameters:
src – [in] Vector to clone from.
-
void CopyFromData(const ValueType *data)#
Copy (import) vector.
Copy (import) vector data that is described in one array (values). The object data has to be allocated with Allocate(), using the corresponding size of the data, first.
- Parameters:
data – [in] data to be imported.
-
void CopyToData(ValueType *data) const#
Copy (export) vector.
Copy (export) vector data that is described in one array (values). The output array has to be allocated first, using the corresponding size of the data. The size can be obtained with GetSize().
- Parameters:
data – [out] exported data.
-
void Permute(const LocalVector<int> &permutation)#
Perform in-place permutation (forward) of the vector.
-
void PermuteBackward(const LocalVector<int> &permutation)#
Perform in-place permutation (backward) of the vector.
-
void Restriction(const LocalVector<ValueType> &vec_fine, const LocalVector<int> &map)#
Restriction operator based on restriction mapping vector.
-
void Prolongation(const LocalVector<ValueType> &vec_coarse, const LocalVector<int> &map)#
Prolongation operator based on restriction mapping vector.
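A mapping-based restriction and prolongation can be sketched as follows. The example assumes that map[i] holds the coarse index of fine entry i, that a negative entry marks a fine entry without a coarse partner, that restriction sums the fine values and that prolongation injects the coarse value; all of these are assumptions of the sketch, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Restriction (assumed semantics): every fine entry is added to the coarse
// entry it is mapped to; a negative map entry means "not aggregated".
std::vector<double> restrict_to_coarse(const std::vector<double>& fine,
                                       const std::vector<int>&    map,
                                       int                        ncoarse)
{
    assert(fine.size() == map.size());
    std::vector<double> coarse(ncoarse, 0.0);
    for(std::size_t i = 0; i < fine.size(); ++i)
    {
        if(map[i] >= 0)
        {
            coarse[map[i]] += fine[i];
        }
    }
    return coarse;
}

// Prolongation (assumed semantics): every fine entry receives the value of
// the coarse entry it is mapped to, or zero if it is not aggregated.
std::vector<double> prolong_to_fine(const std::vector<double>& coarse,
                                    const std::vector<int>&    map)
{
    std::vector<double> fine(map.size(), 0.0);
    for(std::size_t i = 0; i < map.size(); ++i)
    {
        if(map[i] >= 0)
        {
            fine[i] = coarse[map[i]];
        }
    }
    return fine;
}
```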
-
virtual void AddScale(const LocalVector<ValueType> &x, ValueType alpha)#
Perform vector update of type this = this + alpha * x.
-
virtual void ScaleAdd(ValueType alpha, const LocalVector<ValueType> &x)#
Perform vector update of type this = alpha * this + x.
-
virtual void ScaleAddScale(ValueType alpha, const LocalVector<ValueType> &x, ValueType beta)#
Perform vector update of type this = alpha * this + x * beta.
-
virtual void ScaleAddScale(ValueType alpha, const LocalVector<ValueType> &x, ValueType beta, int src_offset, int dst_offset, int size)#
Perform vector update of type this = alpha * this + x * beta with offsets.
-
virtual void ScaleAdd2(ValueType alpha, const LocalVector<ValueType> &x, ValueType beta, const LocalVector<ValueType> &y, ValueType gamma)#
Perform vector update of type this = alpha * this + x * beta + y * gamma.
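Because the update functions differ only in where the scalars are applied, a side-by-side sketch may help. The free functions below reproduce the formulas stated above on std::vector<double>, with the first argument playing the role of this; they are illustrative helpers with hypothetical names, not library code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// self = self + alpha * x
void add_scale(std::vector<double>* self, const std::vector<double>& x, double alpha)
{
    assert(self->size() == x.size());
    for(std::size_t i = 0; i < x.size(); ++i)
        (*self)[i] += alpha * x[i];
}

// self = alpha * self + x
void scale_add(std::vector<double>* self, double alpha, const std::vector<double>& x)
{
    assert(self->size() == x.size());
    for(std::size_t i = 0; i < x.size(); ++i)
        (*self)[i] = alpha * (*self)[i] + x[i];
}

// self = alpha * self + x * beta
void scale_add_scale(std::vector<double>*       self,
                     double                     alpha,
                     const std::vector<double>& x,
                     double                     beta)
{
    assert(self->size() == x.size());
    for(std::size_t i = 0; i < x.size(); ++i)
        (*self)[i] = alpha * (*self)[i] + x[i] * beta;
}

// self = alpha * self + x * beta + y * gamma
void scale_add2(std::vector<double>*       self,
                double                     alpha,
                const std::vector<double>& x,
                double                     beta,
                const std::vector<double>& y,
                double                     gamma)
{
    assert(self->size() == x.size() && self->size() == y.size());
    for(std::size_t i = 0; i < x.size(); ++i)
        (*self)[i] = alpha * (*self)[i] + x[i] * beta + y[i] * gamma;
}
```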
-
virtual ValueType Dot(const LocalVector<ValueType> &x) const#
Compute dot (scalar) product, return this^T x.
-
virtual ValueType DotNonConj(const LocalVector<ValueType> &x) const#
Compute non-conjugate dot (scalar) product, return this^T x.
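For real value types the two dot products coincide; they differ only for complex data. The following sketch shows the difference with std::complex<double>. Which operand the conjugate variant conjugates is an assumption here (the first one, i.e. the vector the function is called on); the function names are hypothetical.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Conjugate dot product; conjugating the first operand is an assumption.
cplx dot_conj(const std::vector<cplx>& a, const std::vector<cplx>& x)
{
    assert(a.size() == x.size());
    cplx sum(0.0, 0.0);
    for(std::size_t i = 0; i < a.size(); ++i)
    {
        sum += std::conj(a[i]) * x[i];
    }
    return sum;
}

// Non-conjugate dot product: plain sum of element-wise products.
cplx dot_non_conj(const std::vector<cplx>& a, const std::vector<cplx>& x)
{
    assert(a.size() == x.size());
    cplx sum(0.0, 0.0);
    for(std::size_t i = 0; i < a.size(); ++i)
    {
        sum += a[i] * x[i];
    }
    return sum;
}
```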
-
virtual ValueType Asum(void) const#
Compute the sum of absolute values of the vector, return = sum(|this|)
-
virtual int Amax(ValueType &value) const#
Compute the absolute max of the vector, return = index(max(|this|))
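The two reductions can be written down for a host array as in the sketch below. The amax helper mirrors the signature above by returning the index and passing the value back through a reference. Returning the first index on ties and -1 for an empty vector are assumptions of this example; both function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sum of absolute values: sum(|v|)
double asum(const std::vector<double>& v)
{
    double sum = 0.0;
    for(double e : v)
    {
        sum += std::abs(e);
    }
    return sum;
}

// Index of the entry with the largest absolute value; the absolute value
// itself is written to `value`. Returns -1 for an empty vector (assumption).
int amax(const std::vector<double>& v, double& value)
{
    int index = -1;
    value     = 0.0;
    for(std::size_t i = 0; i < v.size(); ++i)
    {
        const double a = std::abs(v[i]);
        if(index < 0 || a > value)
        {
            value = a;
            index = static_cast<int>(i);
        }
    }
    return index;
}
```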
-
virtual void PointWiseMult(const LocalVector<ValueType> &x)#
Perform point-wise multiplication (element-wise) of this = this * x.
-
virtual void PointWiseMult(const LocalVector<ValueType> &x, const LocalVector<ValueType> &y)#
Perform point-wise multiplication (element-wise) of this = x * y.
-
virtual void Power(double power)#
Raise each element of the vector to the given power.
-
void SetIndexArray(int size, const int *index)#
Set index array.
-
void GetContinuousValues(int start, int end, ValueType *values) const#
Get continuous indexed values.
-
void SetContinuousValues(int start, int end, const ValueType *values)#
Set continuous indexed values.
-
void ExtractCoarseMapping(int start, int end, const int *index, int nc, int *size, int *map) const#
Extract coarse boundary mapping.
-
void ExtractCoarseBoundary(int start, int end, const int *index, int nc, int *size, int *boundary) const#
Extract coarse boundary index.
Global Vector#
-
template<typename ValueType>
class GlobalVector : public rocalution::Vector<ValueType># GlobalVector class.
A GlobalVector is called global because it can reside on a single node or be distributed across multiple nodes in a network. MPI is used for the communication between nodes.
- Template Parameters:
ValueType – can be int, float, double, std::complex<float> or std::complex<double>
Public Functions
-
GlobalVector(const ParallelManager &pm)#
Initialize a global vector with a parallel manager.
-
virtual void MoveToAccelerator(void)#
Move the object to the accelerator backend.
-
virtual void MoveToHost(void)#
Move the object to the host backend.
-
virtual void Info(void) const#
Print object information.
Info can print object information about any rocALUTION object. This information consists of object properties and backend data.
- Example
mat.Info();
vec.Info();
-
virtual bool Check(void) const#
Perform a sanity check of the vector.
Checks whether the vector contains valid data, i.e. that no value is infinite or NaN (not a number).
- Return values:
true – if the vector is ok (empty vector is also ok).
false – if there is something wrong with the values.
-
virtual IndexType2 GetSize(void) const#
Return the size of the vector.
-
virtual int GetLocalSize(void) const#
Return the size of the local vector.
-
virtual int GetGhostSize(void) const#
Return the size of the ghost vector.
-
virtual void Allocate(std::string name, IndexType2 size)#
Allocate a global vector with name and size.
-
virtual void Clear(void)#
Clear (free all data) the object.
-
void SetParallelManager(const ParallelManager &pm)#
Set the parallel manager of a global vector.
-
virtual void Zeros(void)#
Set all values of the vector to 0.
-
virtual void Ones(void)#
Set all values of the vector to 1.
-
virtual void SetRandomUniform(unsigned long long seed, ValueType a = static_cast<ValueType>(-1), ValueType b = static_cast<ValueType>(1))#
Fill the vector with random values from interval [a,b].
-
virtual void SetRandomNormal(unsigned long long seed, ValueType mean = static_cast<ValueType>(0), ValueType var = static_cast<ValueType>(1))#
Fill the vector with random values from normal distribution.
-
virtual void CloneFrom(const GlobalVector<ValueType> &src)#
Clone the vector.
CloneFrom clones the entire vector, with data and backend descriptor, from another Vector.
- Example
LocalVector<ValueType> vec;

// Allocate and initialize vec (host or accelerator)
// ...

LocalVector<ValueType> tmp;

// By cloning vec, tmp will have identical values and will be on the same
// backend as vec
tmp.CloneFrom(vec);
- Parameters:
src – [in] Vector to clone from.
-
void SetDataPtr(ValueType **ptr, std::string name, IndexType2 size)#
Initialize the local part of a global vector with externally allocated data.
-
void LeaveDataPtr(ValueType **ptr)#
Get a pointer to the data from the local part of a global vector and free the global vector object.
-
virtual void CopyFrom(const GlobalVector<ValueType> &src)#
Copy vector from another vector.
CopyFrom copies values from another vector.
- Example
LocalVector<ValueType> vec1, vec2;

// Allocate and initialize vec1 and vec2
// ...

// Move vec1 to accelerator
// vec1.MoveToAccelerator();

// Now, vec1 is on the accelerator (if available)
// and vec2 is on the host

// Copy vec1 to vec2 (or vice versa) will move data between host and
// accelerator backend
vec1.CopyFrom(vec2);
Note
This function allows cross platform copying. One of the objects could be allocated on the accelerator backend.
- Parameters:
src – [in] Vector, where values should be copied from.
-
virtual void ReadFileASCII(const std::string filename)#
Read vector from ASCII file.
Read a vector from ASCII file.
- Example
LocalVector<ValueType> vec;
vec.ReadFileASCII("my_vector.dat");
- Parameters:
filename – [in] name of the file containing the ASCII data.
-
virtual void WriteFileASCII(const std::string filename) const#
Write vector to ASCII file.
Write a vector to ASCII file.
- Example
LocalVector<ValueType> vec;

// Allocate and fill vec
// ...

vec.WriteFileASCII("my_vector.dat");
- Parameters:
filename – [in] name of the file to write the ASCII data to.
-
virtual void ReadFileBinary(const std::string filename)#
Read vector from binary file.
Read a vector from binary file. For details on the format, see WriteFileBinary().
- Example
LocalVector<ValueType> vec;
vec.ReadFileBinary("my_vector.bin");
- Parameters:
filename – [in] name of the file containing the data.
-
virtual void WriteFileBinary(const std::string filename) const#
Write vector to binary file.
Write a vector to binary file.
The binary format contains a header, the rocALUTION version and the vector data as follows:
// Header
out << "#rocALUTION binary vector file" << std::endl;

// rocALUTION version
out.write((char*)&version, sizeof(int));

// Vector data
out.write((char*)&size, sizeof(int));
out.write((char*)vec_val, size * sizeof(double));
- Example
LocalVector<ValueType> vec;

// Allocate and fill vec
// ...

vec.WriteFileBinary("my_vector.bin");
Note
The vector values array is always stored in double precision (e.g. double or std::complex<double>).
- Parameters:
filename – [in] name of the file to write the data to.
-
virtual void AddScale(const GlobalVector<ValueType> &x, ValueType alpha)#
Perform vector update of type this = this + alpha * x.
-
virtual void ScaleAdd(ValueType alpha, const GlobalVector<ValueType> &x)#
Perform vector update of type this = alpha * this + x.
-
virtual void ScaleAdd2(ValueType alpha, const GlobalVector<ValueType> &x, ValueType beta, const GlobalVector<ValueType> &y, ValueType gamma)#
Perform vector update of type this = alpha * this + x * beta + y * gamma.
-
virtual void ScaleAddScale(ValueType alpha, const GlobalVector<ValueType> &x, ValueType beta)#
Perform vector update of type this = alpha * this + x * beta.
-
virtual ValueType Dot(const GlobalVector<ValueType> &x) const#
Compute dot (scalar) product, return this^T x.
-
virtual ValueType DotNonConj(const GlobalVector<ValueType> &x) const#
Compute non-conjugate dot (scalar) product, return this^T x.
-
virtual ValueType Asum(void) const#
Compute the sum of absolute values of the vector, return = sum(|this|)
-
virtual int Amax(ValueType &value) const#
Compute the absolute max of the vector, return = index(max(|this|))
-
virtual void PointWiseMult(const GlobalVector<ValueType> &x)#
Perform point-wise multiplication (element-wise) of this = this * x.
-
virtual void PointWiseMult(const GlobalVector<ValueType> &x, const GlobalVector<ValueType> &y)#
Perform point-wise multiplication (element-wise) of this = x * y.
-
virtual void Power(double power)#
Raise each element of the vector to the given power.
-
void Restriction(const GlobalVector<ValueType> &vec_fine, const LocalVector<int> &map)#
Restriction operator based on restriction mapping vector.
-
void Prolongation(const GlobalVector<ValueType> &vec_coarse, const LocalVector<int> &map)#
Prolongation operator based on restriction mapping vector.
Base Classes#
-
template<typename ValueType>
class BaseMatrix#
-
template<typename ValueType>
class BaseStencil#
-
template<typename ValueType>
class BaseVector#
-
template<typename ValueType>
class HostMatrix#
-
template<typename ValueType>
class HostStencil#
-
template<typename ValueType>
class HostVector#
-
template<typename ValueType>
class AcceleratorMatrix#
-
template<typename ValueType>
class AcceleratorStencil#
-
template<typename ValueType>
class AcceleratorVector#
Parallel Manager#
-
class ParallelManager : public rocalution::RocalutionObj#
Parallel Manager class.
The parallel manager class handles the communication and the mapping of the global operators. Each global operator and vector needs to be initialized with a valid parallel manager in order to perform any operation. For many distributed simulations, the underlying operator is already distributed. This information needs to be passed to the parallel manager.
Public Functions
-
void SetMPICommunicator(const void *comm)#
Set the MPI communicator.
-
void Clear(void)#
Clear all allocated resources.
-
inline int GetRank(void) const#
Return rank.
-
IndexType2 GetGlobalSize(void) const#
Return the global size.
-
int GetLocalSize(void) const#
Return the local size.
-
int GetNumReceivers(void) const#
Return the number of receivers.
-
int GetNumSenders(void) const#
Return the number of senders.
-
int GetNumProcs(void) const#
Return the number of involved processes.
-
void SetGlobalSize(IndexType2 size)#
Initialize the global size.
-
void SetLocalSize(int size)#
Initialize the local size.
-
void SetBoundaryIndex(int size, const int *index)#
Set all boundary indices of this rank's process.
-
void SetReceivers(int nrecv, const int *recvs, const int *recv_offset)#
Set the receivers: the number of processes the current process receives data from (nrecv), the array of these processes (recvs) and the offsets where the boundary for each process ‘receiver’ starts (recv_offset).
-
void SetSenders(int nsend, const int *sends, const int *send_offset)#
Set the senders: the number of processes the current process sends data to (nsend), the array of these processes (sends) and the offsets where the ghost part for each process ‘sender’ starts (send_offset).
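The offset arrays passed to SetReceivers and SetSenders are typically built as a running sum of the number of entries exchanged with each neighbouring process. The sketch below assumes that such an array has one more entry than there are neighbours, so that the entries for neighbour i span offsets[i] up to (but not including) offsets[i + 1]; this layout is an assumption of the example, and build_offsets is a hypothetical helper.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Build an offset array of length counts.size() + 1 from the number of
// entries exchanged with each neighbouring process (assumed layout).
std::vector<int> build_offsets(const std::vector<int>& counts)
{
    for(int c : counts)
    {
        assert(c >= 0);
    }

    std::vector<int> offsets(counts.size() + 1, 0);

    // offsets[i + 1] = counts[0] + ... + counts[i]
    std::partial_sum(counts.begin(), counts.end(), offsets.begin() + 1);
    return offsets;
}
```

The last entry of the array then equals the total number of exchanged entries.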
-
void LocalToGlobal(int proc, int local, int &global)#
Mapping local to global.
-
void GlobalToLocal(int global, int &proc, int &local)#
Mapping global to local.
-
bool Status(void) const#
Check sanity status of parallel manager.
-
void ReadFileASCII(const std::string filename)#
Read file that contains all relevant parallel manager data.
-
void WriteFileASCII(const std::string filename) const#
Write file that contains all relevant parallel manager data.
Solvers#
-
template<class OperatorType, class VectorType, typename ValueType>
class Solver : public rocalution::RocalutionObj# Base class for all solvers and preconditioners.
Most solvers can be applied to the linear operators LocalMatrix, LocalStencil and GlobalMatrix, i.e. they can run locally (on a shared memory system) or in a distributed manner (on a cluster) via MPI. The only exception is the AMG (Algebraic Multigrid) solver, which has two versions (one for the LocalMatrix and one for the GlobalMatrix class). The only purely local solvers (which do not support global/MPI operations) are the mixed-precision defect-correction solver and all direct solvers.
All solvers need three template parameters: the operator type, the vector type and the scalar type.
The Solver class is purely virtual and provides an interface for:
SetOperator() to set the operator \(A\), i.e. the user can pass the matrix here.
Build() to build the solver (including preconditioners, sub-solvers, etc.). The user needs to specify the operator before calling Build().
Solve() to solve the system \(Ax = b\). The user needs to pass a right-hand side \(b\) and a vector \(x\), in which the solution will be stored.
Print() to show solver information.
ReBuildNumeric() to only re-build the solver numerically (if possible).
MoveToHost() and MoveToAccelerator() to offload the solver (including preconditioners and sub-solvers) to the host/accelerator.
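The call order implied by this interface (set the operator, build, then solve) can be illustrated with a deliberately simplified mock. SolverSketch and DiagonalSolver below are hypothetical classes written for this example; they only borrow the method names from the list above and do not reproduce the rocALUTION class hierarchy, its template parameters or its backends.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified interface that mirrors the call order described
// above; it is not the rocALUTION Solver class.
template <class OperatorType, class VectorType>
class SolverSketch
{
public:
    virtual ~SolverSketch() = default;

    void SetOperator(const OperatorType& op) { op_ = &op; }

    virtual void Build()
    {
        assert(op_ != nullptr); // the operator has to be set before Build()
        built_ = true;
    }

    virtual void Solve(const VectorType& rhs, VectorType* x) = 0;

protected:
    const OperatorType* op_    = nullptr;
    bool                built_ = false;
};

// Toy solver for a diagonal operator stored as a vector of diagonal entries.
class DiagonalSolver : public SolverSketch<std::vector<double>, std::vector<double>>
{
public:
    void Build() override
    {
        SolverSketch::Build();

        // "Numerical" build phase: precompute the inverse diagonal
        inv_diag_.resize(op_->size());
        for(std::size_t i = 0; i < op_->size(); ++i)
        {
            inv_diag_[i] = 1.0 / (*op_)[i];
        }
    }

    void Solve(const std::vector<double>& rhs, std::vector<double>* x) override
    {
        assert(built_ && x != nullptr && rhs.size() == inv_diag_.size());
        x->resize(rhs.size());
        for(std::size_t i = 0; i < rhs.size(); ++i)
        {
            (*x)[i] = inv_diag_[i] * rhs[i];
        }
    }

private:
    std::vector<double> inv_diag_;
};
```

Splitting the work between Build() and Solve() lets the setup cost be paid once while Solve() is called repeatedly with different right-hand sides.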
- Template Parameters:
OperatorType – can be LocalMatrix, GlobalMatrix or LocalStencil
VectorType – can be LocalVector or GlobalVector
ValueType – can be float, double, std::complex<float> or std::complex<double>
Subclassed by rocalution::IterativeLinearSolver< OperatorTypeH, VectorTypeH, ValueTypeH >, rocalution::DirectLinearSolver< OperatorType, VectorType, ValueType >, rocalution::IterativeLinearSolver< OperatorType, VectorType, ValueType >, rocalution::Preconditioner< OperatorType, VectorType, ValueType >
Public Functions
-
void SetOperator(const OperatorType &op)#
Set the Operator of the solver.
-
virtual void ResetOperator(const OperatorType &op)#
Reset the operator; see ReBuildNumeric()
-
virtual void Print(void) const = 0#
Print information about the solver.
-
virtual void Solve(const VectorType &rhs, VectorType *x) = 0#
Solve Operator x = rhs.
-
virtual void SolveZeroSol(const VectorType &rhs, VectorType *x)#
Solve Operator x = rhs, setting initial x = 0.
-
virtual void Clear(void)#
Clear (free all local data) the solver.
-
virtual void Build(void)#
Build the solver (data allocation, structure and numerical computation)
-
virtual void BuildMoveToAcceleratorAsync(void)#
Build the solver and move it to the accelerator asynchronously.
-
virtual void Sync(void)#
Synchronize the solver.
-
virtual void ReBuildNumeric(void)#
Rebuild the solver only with numerical computation (no allocation or data structure computation)
-
virtual void MoveToHost(void)#
Move all data (i.e. move the solver) to the host.
-
virtual void MoveToAccelerator(void)#
Move all data (i.e. move the solver) to the accelerator.
-
virtual void Verbose(int verb = 1)#
Provide verbose output of the solver.
verb = 0 -> no output
verb = 1 -> print info about the solver (start, end)
verb = 2 -> print (iter, residual) via iteration control
Iterative Linear Solvers#
-
template<class OperatorType, class VectorType, typename ValueType>
class IterativeLinearSolver : public rocalution::Solver<OperatorType, VectorType, ValueType># Base class for all linear iterative solvers.
The iterative solvers are controlled by an iteration control object, which monitors the convergence properties of the solver, i.e. the maximum number of iterations, relative tolerance, absolute tolerance and divergence tolerance. The iteration control can also record the residual history and store it in an ASCII file.
Init(), InitMinIter(), InitMaxIter() and InitTol() initialize the solver and set the stopping criteria.
RecordResidualHistory() and RecordHistory() start the recording of the residual and write it into a file.
Verbose() sets the level of verbose output of the solver (0 - no output, 2 - detailed output, including residual and iteration information).
SetPreconditioner() sets the preconditioning.
All iterative solvers are controlled based on
Absolute stopping criterion, when \(|r_{k}|_{L_{p}} < \epsilon_{abs}\)
Relative stopping criterion, when \(|r_{k}|_{L_{p}} / |r_{1}|_{L_{p}} \leq \epsilon_{rel}\)
Divergence stopping criterion, when \(|r_{k}|_{L_{p}} / |r_{1}|_{L_{p}} \geq \epsilon_{div}\)
Maximum number of iterations \(N\), when \(k = N\)
where \(k\) is the current iteration, \(r_{k}\) is the residual for the current iteration \(k\) (i.e. \(r_{k} = b - Ax_{k}\)) and \(r_{1}\) is the starting residual (i.e. \(r_{1} = b - Ax_{init}\)). In addition, the minimum number of iterations \(M\) can be specified. In this case, the solver will not stop iterating before \(k \geq M\).
The \(L_{p}\) norm is used for the computation, where \(p\) can be 1, 2 or \(\infty\). The norm computation can be set with SetResidualNorm() with 1 for \(L_{1}\), 2 for \(L_{2}\) and 3 for \(L_{\infty}\). For the computation with \(L_{\infty}\), the index of the maximum value can be obtained with GetAmaxResidualIndex(). If this function is called and \(L_{\infty}\) was not selected, it returns -1.
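The three norm choices can be sketched on a plain residual array as shown below. The selector follows the numbering given for SetResidualNorm(), and the index output imitates the behaviour described for GetAmaxResidualIndex() (-1 unless the infinity norm is selected). The function residual_norm itself is a hypothetical helper, not a library call.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// resnorm follows the numbering above: 1 -> L1, 2 -> L2, 3 -> L-infinity.
// amax_index receives the index of the largest absolute value for
// L-infinity and -1 otherwise.
double residual_norm(const std::vector<double>& r, int resnorm, int* amax_index)
{
    assert(resnorm >= 1 && resnorm <= 3 && amax_index != nullptr);

    *amax_index = -1;
    double norm = 0.0;

    for(std::size_t i = 0; i < r.size(); ++i)
    {
        const double a = std::abs(r[i]);
        if(resnorm == 1)
        {
            norm += a; // sum of absolute values
        }
        else if(resnorm == 2)
        {
            norm += a * a; // sum of squares, square root taken below
        }
        else if(*amax_index < 0 || a > norm)
        {
            norm        = a; // running maximum of absolute values
            *amax_index = static_cast<int>(i);
        }
    }

    return (resnorm == 2) ? std::sqrt(norm) : norm;
}
```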
The reached criteria can be obtained with GetSolverStatus(), returning
0, if no criterion has been reached yet
1, if absolute tolerance has been reached
2, if relative tolerance has been reached
3, if divergence tolerance has been reached
4, if maximum number of iterations has been reached
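How the stopping criteria map to these status codes can be sketched as a small decision function. The order in which the criteria are tested and the treatment of the minimum iteration count (no criterion is evaluated before it is reached) are assumptions of this example; IterationControlSketch and solver_status are hypothetical names, not the library's iteration control.

```cpp
#include <cassert>

// Hypothetical container for the stopping parameters described above.
struct IterationControlSketch
{
    double abs_tol;
    double rel_tol;
    double div_tol;
    int    min_iter;
    int    max_iter;
};

// Returns the status code listed above for residual norm `res` at iteration
// `iter`, given the norm of the starting residual `res_init`.
int solver_status(const IterationControlSketch& ctrl, double res, double res_init, int iter)
{
    assert(res >= 0.0 && res_init > 0.0);

    // Assumption: nothing is checked before the minimum iteration count
    if(iter < ctrl.min_iter)
    {
        return 0;
    }
    if(res < ctrl.abs_tol)
    {
        return 1; // absolute tolerance reached
    }
    if(res / res_init <= ctrl.rel_tol)
    {
        return 2; // relative tolerance reached
    }
    if(res / res_init >= ctrl.div_tol)
    {
        return 3; // divergence tolerance reached
    }
    if(iter >= ctrl.max_iter)
    {
        return 4; // maximum number of iterations reached
    }
    return 0; // keep iterating
}
```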
- Template Parameters:
OperatorType – can be LocalMatrix, GlobalMatrix or LocalStencil
VectorType – can be LocalVector or GlobalVector
ValueType – can be float, double, std::complex<float> or std::complex<double>
Subclassed by rocalution::BaseMultiGrid< OperatorType, VectorType, ValueType >, rocalution::BiCGStab< OperatorType, VectorType, ValueType >, rocalution::BiCGStabl< OperatorType, VectorType, ValueType >, rocalution::CG< OperatorType, VectorType, ValueType >, rocalution::CR< OperatorType, VectorType, ValueType >, rocalution::Chebyshev< OperatorType, VectorType, ValueType >, rocalution::FCG< OperatorType, VectorType, ValueType >, rocalution::FGMRES< OperatorType, VectorType, ValueType >, rocalution::FixedPoint< OperatorType, VectorType, ValueType >, rocalution::GMRES< OperatorType, VectorType, ValueType >, rocalution::IDR< OperatorType, VectorType, ValueType >, rocalution::QMRCGStab< OperatorType, VectorType, ValueType >
Public Functions
-
void Init(double abs_tol, double rel_tol, double div_tol, int max_iter)#
Initialize the solver with absolute/relative/divergence tolerance and maximum number of iterations.
-
void Init(double abs_tol, double rel_tol, double div_tol, int min_iter, int max_iter)#
Initialize the solver with absolute/relative/divergence tolerance and minimum/maximum number of iterations.
-
void InitMinIter(int min_iter)#
Set the minimum number of iterations.
-
void InitMaxIter(int max_iter)#
Set the maximum number of iterations.
-
void InitTol(double abs, double rel, double div)#
Set the absolute/relative/divergence tolerance.
-
void SetResidualNorm(int resnorm)#
Set the residual norm to \(L_1\), \(L_2\) or \(L_\infty\) norm.
resnorm = 1 ->