Functions

cudss

CUDSS.cudss - Function
cudss(phase::String, solver::CudssSolver{T}, x::CuVector{T}, b::CuVector{T})
cudss(phase::String, solver::CudssSolver{T}, X::CuMatrix{T}, B::CuMatrix{T})
cudss(phase::String, solver::CudssSolver{T}, X::CudssMatrix{T}, B::CudssMatrix{T})
cudss(phase::String, solver::CudssBatchedSolver{T}, x::Vector{CuVector{T}}, b::Vector{CuVector{T}})
cudss(phase::String, solver::CudssBatchedSolver{T}, X::Vector{CuMatrix{T}}, B::Vector{CuMatrix{T}})
cudss(phase::String, solver::CudssBatchedSolver{T}, X::CudssBatchedMatrix{T}, B::CudssBatchedMatrix{T})

The type T can be Float32, Float64, ComplexF32 or ComplexF64.

The available phases are "analysis", "factorization", "refactorization" and "solve". The phases "solve_fwd", "solve_diag" and "solve_bwd" are available but not yet functional.
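The functional phases can be chained as in the minimal sketch below, which solves a single sparse system on the GPU. It assumes a CUDA-capable device and the `CudssSolver(A, "G", 'F')` constructor from the CUDSS.jl package (a general matrix stored in full); the constructor is not described in this section, so treat its arguments as an assumption. The test matrix is made strictly diagonally dominant only so that the example is well posed.

```julia
using CUDA, CUDA.CUSPARSE
using CUDSS
using SparseArrays, LinearAlgebra

T = Float64
n = 100

# Host data: adding n*I makes the matrix strictly diagonally dominant, hence nonsingular.
A_cpu = sprand(T, n, n, 0.05) + n * I
b_cpu = rand(T, n)

# Device data: cudss expects a CSR matrix and CuVector / CuMatrix right-hand sides.
A_gpu = CuSparseMatrixCSR(A_cpu)
x_gpu = CUDA.zeros(T, n)
b_gpu = CuVector(b_cpu)

# Assumption: "G" marks a general matrix and 'F' indicates that the full matrix is stored.
solver = CudssSolver(A_gpu, "G", 'F')

# Run the three phases in order; the solution is written into x_gpu.
cudss("analysis", solver, x_gpu, b_gpu)
cudss("factorization", solver, x_gpu, b_gpu)
cudss("solve", solver, x_gpu, b_gpu)

# Check the residual of the computed solution.
r_gpu = b_gpu - A_gpu * x_gpu
println("relative residual: ", norm(r_gpu) / norm(b_gpu))
```

The same call pattern applies to the `CuMatrix` methods when several right-hand sides are solved at once.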


cudss_set

CUDSS.cudss_set - Function
cudss_set(solver::CudssSolver, parameter::String, value)
cudss_set(solver::CudssSolver{T}, A::CuSparseMatrixCSR{T,Cint})
cudss_set(solver::CudssBatchedSolver, parameter::String, value)
cudss_set(solver::CudssBatchedSolver{T}, A::Vector{CuSparseMatrixCSR{T,Cint}})
cudss_set(config::CudssConfig, parameter::String, value)
cudss_set(data::CudssData, parameter::String, value)
cudss_set(matrix::CudssMatrix{T}, b::CuVector{T})
cudss_set(matrix::CudssMatrix{T}, B::CuMatrix{T})
cudss_set(matrix::CudssMatrix{T}, A::CuSparseMatrixCSR{T,Cint})
cudss_set(matrix::CudssBatchedMatrix{T}, b::Vector{CuVector{T}})
cudss_set(matrix::CudssBatchedMatrix{T}, B::Vector{CuMatrix{T}})
cudss_set(matrix::CudssBatchedMatrix{T}, A::Vector{CuSparseMatrixCSR{T,Cint}})

The type T can be Float32, Float64, ComplexF32 or ComplexF64.

The available configuration parameters are:

  • "reordering_alg": Algorithm for the reordering phase ("default", "algo1", "algo2" or "algo3");
  • "factorization_alg": Algorithm for the factorization phase ("default", "algo1", "algo2" or "algo3");
  • "solve_alg": Algorithm for the solving phase ("default", "algo1", "algo2" or "algo3");
  • "matching_type": Type of matching;
  • "solve_mode": Potential modifier applied to the system matrix (transpose or adjoint);
  • "ir_n_steps": Number of steps during the iterative refinement;
  • "ir_tol": Iterative refinement tolerance;
  • "pivot_type": Type of pivoting ('C', 'R' or 'N');
  • "pivot_threshold": Pivoting threshold used to determine whether a diagonal element is subject to pivoting;
  • "pivot_epsilon": Pivoting epsilon, absolute value to replace singular diagonal elements;
  • "max_lu_nnz": Upper limit on the number of nonzero entries in LU factors for non-symmetric matrices;
  • "hybrid_mode": Memory mode – 0 (default = device-only) or 1 (hybrid = host/device);
  • "hybrid_device_memory_limit": User-defined device memory limit (number of bytes) for the hybrid memory mode;
  • "use_cuda_register_memory": A flag to enable (1) or disable (0) usage of cudaHostRegister() by the hybrid memory mode.

The available data parameters are:

  • "info": Device-side error information;
  • "user_perm": User permutation to be used instead of running the reordering algorithms;
  • "comm": Communicator for the multi-GPU multi-node mode.

The data parameter "info" must be restored to 0 if a Cholesky factorization fails due to indefiniteness and refactorization is performed on an updated matrix.
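As a hedged illustration of two `cudss_set` methods, the sketch below sets a configuration parameter by name and then replaces the system matrix before a "refactorization". It assumes a CUDA-capable device and the `CudssSolver(A, "G", 'F')` constructor from the CUDSS.jl package, which is not described in this section. It also assumes that the updated matrix keeps the sparsity pattern of the original, which is why only already-stored diagonal entries are modified here.

```julia
using CUDA, CUDA.CUSPARSE
using CUDSS
using SparseArrays, LinearAlgebra

T = Float64
n = 100

# Strictly diagonally dominant test matrix; every diagonal entry is stored.
A_cpu = sprand(T, n, n, 0.05) + n * I
A_gpu = CuSparseMatrixCSR(A_cpu)
x_gpu = CUDA.zeros(T, n)
b_gpu = CuVector(rand(T, n))

# Assumption: "G" marks a general matrix and 'F' indicates that the full matrix is stored.
solver = CudssSolver(A_gpu, "G", 'F')

# Configuration parameter set by name: one step of iterative refinement.
cudss_set(solver, "ir_n_steps", 1)

cudss("analysis", solver, x_gpu, b_gpu)
cudss("factorization", solver, x_gpu, b_gpu)
cudss("solve", solver, x_gpu, b_gpu)

# New numerical values with the same sparsity pattern (only the diagonal changes).
A_new_cpu = A_cpu + Diagonal(rand(T, n))
A_new_gpu = CuSparseMatrixCSR(A_new_cpu)

# Hand the updated matrix to the solver, then reuse the existing analysis.
cudss_set(solver, A_new_gpu)
cudss("refactorization", solver, x_gpu, b_gpu)
cudss("solve", solver, x_gpu, b_gpu)

r_gpu = b_gpu - A_new_gpu * x_gpu
println("relative residual after refactorization: ", norm(r_gpu) / norm(b_gpu))
```

Reusing the solver this way skips a second "analysis" phase, which is the point of updating the matrix in place rather than building a new solver.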


cudss_get

CUDSS.cudss_get - Function
value = cudss_get(solver::CudssSolver, parameter::String)
value = cudss_get(solver::CudssBatchedSolver, parameter::String)
value = cudss_get(config::CudssConfig, parameter::String)
value = cudss_get(data::CudssData, parameter::String)

The available configuration parameters are:

  • "reordering_alg": Algorithm for the reordering phase;
  • "factorization_alg": Algorithm for the factorization phase;
  • "solve_alg": Algorithm for the solving phase;
  • "matching_type": Type of matching;
  • "solve_mode": Potential modifier applied to the system matrix (transpose or adjoint);
  • "ir_n_steps": Number of steps during the iterative refinement;
  • "ir_tol": Iterative refinement tolerance;
  • "pivot_type": Type of pivoting;
  • "pivot_threshold": Pivoting threshold used to determine whether a diagonal element is subject to pivoting;
  • "pivot_epsilon": Pivoting epsilon, absolute value to replace singular diagonal elements;
  • "max_lu_nnz": Upper limit on the number of nonzero entries in LU factors for non-symmetric matrices;
  • "hybrid_mode": Memory mode – 0 (default = device-only) or 1 (hybrid = host/device);
  • "hybrid_device_memory_limit": User-defined device memory limit (number of bytes) for the hybrid memory mode;
  • "use_cuda_register_memory": A flag to enable (1) or disable (0) usage of cudaHostRegister() by the hybrid memory mode.

The available data parameters are:

  • "info": Device-side error information;
  • "lu_nnz": Number of non-zero entries in LU factors;
  • "npivots": Number of pivots encountered during factorization;
  • "inertia": Tuple of positive and negative indices of inertia for symmetric and Hermitian matrix types that are not positive definite;
  • "perm_reorder_row": Reordering permutation for the rows;
  • "perm_reorder_col": Reordering permutation for the columns;
  • "perm_row": Final row permutation (which includes effects of both reordering and pivoting);
  • "perm_col": Final column permutation (which includes effects of both reordering and pivoting);
  • "diag": Diagonal of the factorized matrix;
  • "hybrid_device_memory_min": Minimal amount of device memory (number of bytes) required in the hybrid memory mode;
  • "memory_estimates": Memory estimates (in bytes) for host and device memory required for the chosen memory mode.

The data parameters "info", "lu_nnz", "perm_reorder_row", "perm_reorder_col", "hybrid_device_memory_min" and "memory_estimates" require that the phase "analysis" has been performed by cudss. The data parameters "npivots", "inertia" and "diag" require that the phases "analysis" and "factorization" have both been performed by cudss. The data parameters "perm_row" and "perm_col" are available but not yet functional.
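The phase requirements above can be sketched as follows: each `cudss_get` call is placed after the phase that makes the parameter available. The example assumes a CUDA-capable device and the `CudssSolver(A, "G", 'F')` constructor from the CUDSS.jl package, which is not described in this section.

```julia
using CUDA, CUDA.CUSPARSE
using CUDSS
using SparseArrays, LinearAlgebra

T = Float64
n = 100

# Strictly diagonally dominant test matrix, so the factorization is well defined.
A_cpu = sprand(T, n, n, 0.05) + n * I
A_gpu = CuSparseMatrixCSR(A_cpu)
x_gpu = CUDA.zeros(T, n)
b_gpu = CuVector(rand(T, n))

# Assumption: "G" marks a general matrix and 'F' indicates that the full matrix is stored.
solver = CudssSolver(A_gpu, "G", 'F')

# Configuration parameters can be queried by name.
ir_steps = cudss_get(solver, "ir_n_steps")

# "lu_nnz" requires the "analysis" phase.
cudss("analysis", solver, x_gpu, b_gpu)
lu_nnz = cudss_get(solver, "lu_nnz")

# "npivots" requires both "analysis" and "factorization".
cudss("factorization", solver, x_gpu, b_gpu)
npivots = cudss_get(solver, "npivots")

# "info" reports device-side error information.
info = cudss_get(solver, "info")

println("ir_n_steps = ", ir_steps)
println("lu_nnz     = ", lu_nnz)
println("npivots    = ", npivots)
println("info       = ", info)
```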
