Functions

cudss

CUDSS.cudss — Function

cudss(phase::String, solver::CudssSolver{T}, x::CuVector{T}, b::CuVector{T})
cudss(phase::String, solver::CudssSolver{T}, X::CuMatrix{T}, B::CuMatrix{T})
cudss(phase::String, solver::CudssSolver{T}, X::CudssMatrix{T}, B::CudssMatrix{T})
cudss(phase::String, solver::CudssBatchedSolver{T}, x::Vector{CuVector{T}}, b::Vector{CuVector{T}})
cudss(phase::String, solver::CudssBatchedSolver{T}, X::Vector{CuMatrix{T}}, B::Vector{CuMatrix{T}})
cudss(phase::String, solver::CudssBatchedSolver{T}, X::CudssBatchedMatrix{T}, B::CudssBatchedMatrix{T})

The type T can be Float32, Float64, ComplexF32 or ComplexF64.

The available phases are "reordering", "symbolic_factorization", "analysis", "factorization", "refactorization" and "solve". The phases "solve_fwd", "solve_diag" and "solve_bwd" are available but not yet functional.
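
As a rough illustration of how the phases are typically chained, the sketch below solves a small sparse system and then reuses the analysis after the matrix values change. It assumes a CUDA-capable GPU. It also relies on the `CudssSolver(A, structure, view)` constructor (here `"G"` for a general matrix and `'F'` for the full view), which is not documented in this section. The matrix, its size and the variable names are made up for the example.

```julia
using CUDA, CUDA.CUSPARSE
using CUDSS
using SparseArrays, LinearAlgebra

T = Float64
n = 100

# Illustrative test matrix: the shifted diagonal makes it strictly diagonally dominant.
A_cpu = sprand(T, n, n, 0.05) + T(n) * I
b_cpu = rand(T, n)

A_gpu = CuSparseMatrixCSR(A_cpu)
x_gpu = CUDA.zeros(T, n)
b_gpu = CuVector(b_cpu)

# Assumed constructor: general matrix ("G"), full view ('F').
solver = CudssSolver(A_gpu, "G", 'F')

# The three phases are run in order; x_gpu is overwritten with the solution.
cudss("analysis", solver, x_gpu, b_gpu)
cudss("factorization", solver, x_gpu, b_gpu)
cudss("solve", solver, x_gpu, b_gpu)

relres = norm(b_gpu - A_gpu * x_gpu) / norm(b_gpu)

# New values on the same sparsity pattern: update the matrix and refactorize
# instead of repeating the analysis.
A2_cpu = copy(A_cpu)
A2_cpu.nzval .*= 2
A2_gpu = CuSparseMatrixCSR(A2_cpu)

cudss_set(solver, A2_gpu)
cudss("refactorization", solver, x_gpu, b_gpu)
cudss("solve", solver, x_gpu, b_gpu)

relres2 = norm(b_gpu - A2_gpu * x_gpu) / norm(b_gpu)
```

The "refactorization" phase is only meaningful here because the updated matrix keeps the sparsity pattern that was analyzed in the first pass.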


cudss_set

CUDSS.cudss_set — Function

cudss_set(solver::CudssSolver, parameter::String, value)
cudss_set(solver::CudssSolver{T}, A::CuSparseMatrixCSR{T,Cint})
cudss_set(solver::CudssBatchedSolver, parameter::String, value)
cudss_set(solver::CudssBatchedSolver{T}, A::Vector{CuSparseMatrixCSR{T,Cint}})
cudss_set(config::CudssConfig, parameter::String, value)
cudss_set(data::CudssData, parameter::String, value)
cudss_set(matrix::CudssMatrix{T}, b::CuVector{T})
cudss_set(matrix::CudssMatrix{T}, B::CuMatrix{T})
cudss_set(matrix::CudssMatrix{T}, A::CuSparseMatrixCSR{T,Cint})
cudss_set(matrix::CudssBatchedMatrix{T}, b::Vector{CuVector{T}})
cudss_set(matrix::CudssBatchedMatrix{T}, B::Vector{CuMatrix{T}})
cudss_set(matrix::CudssBatchedMatrix{T}, A::Vector{CuSparseMatrixCSR{T,Cint}})

The type T can be Float32, Float64, ComplexF32 or ComplexF64.

The available configuration parameters are:

"reordering_alg": Algorithm for the reordering phase ("default", "algo1", "algo2", "algo3", "algo4", or "algo5");
"factorization_alg": Algorithm for the factorization phase ("default", "algo1", "algo2", "algo3", "algo4", or "algo5");
"solve_alg": Algorithm for the solving phase ("default", "algo1", "algo2", "algo3", "algo4", or "algo5");
"use_matching": A flag to enable (1) or disable (0) the matching;
"matching_alg": Algorithm for the matching;
"solve_mode": Potential modificator on the system matrix (transpose or adjoint);
"ir_n_steps": Number of steps during the iterative refinement;
"ir_tol": Iterative refinement tolerance;
"pivot_type": Type of pivoting ('C', 'R' or 'N');
"pivot_threshold": Pivoting threshold which is used to determine if digonal element is subject to pivoting;
"pivot_epsilon": Pivoting epsilon, absolute value to replace singular diagonal elements;
"max_lu_nnz": Upper limit on the number of nonzero entries in LU factors for non-symmetric matrices;
"hybrid_mode": Hybrid memory mode – 0 (default = device-only) or 1 (hybrid = host/device);
"hybrid_device_memory_limit": User-defined device memory limit (number of bytes) for the hybrid memory mode;
"use_cuda_register_memory": A flag to enable (1) or disable (0) usage of cudaHostRegister() by the hybrid memory mode;
"host_nthreads": Number of threads to be used by cuDSS in multi-threaded mode;
"hybrid_execute_mode": Hybrid execute mode – 0 (default = device-only) or 1 (hybrid = host/device);
"pivot_epsilon_alg": Algorithm for the pivot epsilon calculation;
"nd_nlevels": Minimum number of levels for the nested dissection reordering;
"ubatch_size": The number of matrices in a uniform batch of systems to be processed by cuDSS;
"ubatch_index": Specify cuDSS to process all matrices in the uniform batch at once.

The available data parameters are:

"info": Device-side error information;
"user_perm": User permutation to be used instead of running the reordering algorithms;
"comm": Communicator for Multi-GPU multi-node mode.

The data parameter "info" must be restored to 0 if a Cholesky factorization fails due to indefiniteness and refactorization is performed on an updated matrix.
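
A minimal sketch of how configuration parameters might be set on a solver before running the phases. It assumes a CUDA-capable GPU and the `CudssSolver(A, structure, view)` constructor, which is not documented in this section. The chosen values (`1` refinement step, column pivoting `'C'`) are only illustrative, not recommendations.

```julia
using CUDA, CUDA.CUSPARSE
using CUDSS
using SparseArrays, LinearAlgebra

T = Float64
n = 100

# Illustrative, strictly diagonally dominant test matrix.
A_cpu = sprand(T, n, n, 0.05) + T(n) * I
A_gpu = CuSparseMatrixCSR(A_cpu)
x_gpu = CUDA.zeros(T, n)
b_gpu = CuVector(rand(T, n))

# Assumed constructor: general matrix ("G"), full view ('F').
solver = CudssSolver(A_gpu, "G", 'F')

# Set configuration parameters by name before the analysis phase.
cudss_set(solver, "ir_n_steps", 1)    # one step of iterative refinement
cudss_set(solver, "pivot_type", 'C')  # one of 'C', 'R' or 'N'

cudss("analysis", solver, x_gpu, b_gpu)
cudss("factorization", solver, x_gpu, b_gpu)
cudss("solve", solver, x_gpu, b_gpu)

# Read the value back to check that it was stored.
ir_steps = cudss_get(solver, "ir_n_steps")
relres = norm(b_gpu - A_gpu * x_gpu) / norm(b_gpu)
```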


cudss_get

CUDSS.cudss_get — Function

value = cudss_get(solver::CudssSolver, parameter::String)
value = cudss_get(solver::CudssBatchedSolver, parameter::String)
value = cudss_get(config::CudssConfig, parameter::String)
value = cudss_get(data::CudssData, parameter::String)

The available configuration parameters are:

"reordering_alg": Algorithm for the reordering phase;
"factorization_alg": Algorithm for the factorization phase;
"solve_alg": Algorithm for the solving phase;
"use_matching": A flag to enable (1) or disable (0) the matching;
"matching_alg": Algorithm for the matching;
"solve_mode": Potential modificator on the system matrix (transpose or adjoint);
"ir_n_steps": Number of steps during the iterative refinement;
"ir_tol": Iterative refinement tolerance;
"pivot_type": Type of pivoting;
"pivot_threshold": Pivoting threshold which is used to determine if digonal element is subject to pivoting;
"pivot_epsilon": Pivoting epsilon, absolute value to replace singular diagonal elements;
"max_lu_nnz": Upper limit on the number of nonzero entries in LU factors for non-symmetric matrices;
"hybrid_mode": Hybrid memory mode – 0 (default = device-only) or 1 (hybrid = host/device);
"hybrid_device_memory_limit": User-defined device memory limit (number of bytes) for the hybrid memory mode;
"use_cuda_register_memory": A flag to enable (1) or disable (0) usage of cudaHostRegister() by the hybrid memory mode;
"host_nthreads": Number of threads to be used by cuDSS in multi-threaded mode;
"hybrid_execute_mode": Hybrid execute mode – 0 (default = device-only) or 1 (hybrid = host/device);
"pivot_epsilon_alg": Algorithm for the pivot epsilon calculation;
"nd_nlevels": Minimum number of levels for the nested dissection reordering;
"ubatch_size": The number of matrices in a uniform batch of systems to be processed by cuDSS;
"ubatch_index": Specify cuDSS to process all matrices in the uniform batch at once.

The available data parameters are:

"info": Device-side error information;
"lu_nnz": Number of non-zero entries in LU factors;
"npivots": Number of pivots encountered during factorization;
"inertia": Tuple of positive and negative indices of inertia for symmetric and hermitian non positive-definite matrix types;
"perm_reorder_row": Reordering permutation for the rows;
"perm_reorder_col": Reordering permutation for the columns;
"perm_row": Final row permutation (which includes effects of both reordering and pivoting);
"perm_col": Final column permutation (which includes effects of both reordering and pivoting);
"diag": Diagonal of the factorized matrix;
"hybrid_device_memory_min": Minimal amount of device memory (number of bytes) required in the hybrid memory mode;
"memory_estimates": Memory estimates (in bytes) for host and device memory required for the chosen memory mode.

The data parameters "info", "lu_nnz", "perm_reorder_row", "perm_reorder_col", "hybrid_device_memory_min" and "memory_estimates" require that the "analysis" phase has been performed by cudss. The data parameters "npivots", "inertia" and "diag" require that both the "analysis" and "factorization" phases have been performed by cudss. The data parameters "perm_row" and "perm_col" are available but not yet functional.
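
The phase requirements above can be sketched as follows: "lu_nnz" is queried once the analysis has run, while "npivots" and "inertia" are queried only after the factorization. The example assumes a CUDA-capable GPU and the `CudssSolver(A, structure, view)` constructor (here `"S"` with the lower triangle `'L'`), which is not documented in this section. The symmetric indefinite test matrix is made up for the illustration.

```julia
using CUDA, CUDA.CUSPARSE
using CUDSS
using SparseArrays, LinearAlgebra

T = Float64
n = 100

# Illustrative symmetric indefinite matrix: a symmetric sparse part plus a
# dominant diagonal whose first three entries are negative.
S = sprand(T, n, n, 0.05)
d = fill(T(2n), n)
d[1:3] .= -T(2n)
A_cpu = S + S' + spdiagm(0 => d)

# Only the lower triangle is passed, matching the view 'L'.
A_gpu = CuSparseMatrixCSR(tril(A_cpu))
x_gpu = CUDA.zeros(T, n)
b_gpu = CuVector(rand(T, n))

# Assumed constructor: symmetric matrix ("S"), lower-triangular view ('L').
solver = CudssSolver(A_gpu, "S", 'L')

cudss("analysis", solver, x_gpu, b_gpu)
lu_nnz = cudss_get(solver, "lu_nnz")     # needs "analysis"

cudss("factorization", solver, x_gpu, b_gpu)
npivots = cudss_get(solver, "npivots")   # needs "analysis" and "factorization"
inertia = cudss_get(solver, "inertia")   # needs "analysis" and "factorization"

println("lu_nnz = ", lu_nnz, ", npivots = ", npivots, ", inertia = ", inertia)
```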
