Direct solvers for power flow

ExaPF implements a power flow solver in the function run_pf. Under the hood, run_pf calls the function nlsolve!, which uses a Newton-Raphson algorithm to iteratively solve the system of nonlinear equations

\[g(x, p) = 0\]

where $g: \mathbb{R}^{n_x} \times \mathbb{R}^{n_p} \to \mathbb{R}^{n_x}$ is a nonlinear function encoding the power flow equations.
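
For context, a typical component of $g$ is the active power balance at a bus $i$, which in polar coordinates takes the standard AC power flow form (shown here for illustration only)

\[P_i^{\text{inj}} - \sum_{j} |v_i| |v_j| \left( G_{ij} \cos(\theta_i - \theta_j) + B_{ij} \sin(\theta_i - \theta_j) \right) = 0\]

where $|v_i|$ and $\theta_i$ are the voltage magnitude and angle at bus $i$, $P_i^{\text{inj}}$ is the active power injected at the bus, and $G_{ij}$, $B_{ij}$ are the real and imaginary parts of the bus admittance matrix.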

At a fixed $p$, solving the power flow amounts to finding a state $x$ such that $g(x, p) = 0$. At iteration $k$, the Newton-Raphson algorithm finds the next iterate by solving the linear system

\[(\nabla_x g_k) \Delta x_k = - g_k\]

and setting $x_{k+1} = x_{k} + \Delta x_k$. The Jacobian $\nabla_x g_k = \nabla_x g(x_k, p)$ is computed automatically in sparse format using AutoDiff.
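
For illustration, a basic Newton-Raphson loop follows the pattern below. This is a schematic sketch with generic g and jacobian callbacks, not ExaPF's actual implementation:

using LinearAlgebra
function newton_raphson(g, jacobian, x; tol=1e-10, maxiter=20)
    for k in 1:maxiter
        gk = g(x)                  # residual g(xₖ, p)
        norm(gk) <= tol && break   # stop when the residual is small enough
        J = jacobian(x)            # Jacobian ∇ₓg(xₖ, p)
        x = x + J \ (-gk)          # solve (∇ₓgₖ) Δxₖ = -gₖ and update
    end
    return x
end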

Hence, solving the power flow equations amounts to solving a sequence of sparse linear systems. When a direct solver is employed, the system is solved in two steps. First, an LU factorization of the matrix $\nabla_x g$ is computed: we find a lower triangular matrix $L$, an upper triangular matrix $U$, and two permutation matrices $P$ and $Q$ such that

\[P (\nabla_x g) Q = LU\]

Once the matrix is factorized, solving the linear system reduces to performing two triangular solves, one with $L$ and one with $U$.

This method is usually efficient, as the power flow Jacobian is very sparse (less than 1% of its entries are nonzero) and its sparsity pattern is fixed, so the symbolic factorization has to be computed only once.
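
As a small self-contained illustration with Julia's built-in sparse LU, including the in-place refactorization that exploits the fixed sparsity pattern (the matrix A is a random stand-in for the Jacobian):

using LinearAlgebra, SparseArrays
A = sprand(100, 100, 0.05) + 10.0I  # random sparse matrix with a fixed pattern
b = rand(100)
F = lu(A)   # symbolic + numeric factorization (UMFPACK for sparse matrices)
x = F \ b   # two triangular solves with L and U
lu!(F, A)   # refactorize in place, reusing the symbolic analysis
            # (valid as long as the sparsity pattern is unchanged)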

UMFPACK (CPU, default)

ExaPF employs the linear solver UMFPACK by default, as UMFPACK is shipped with Julia itself.

In the LinearSolvers submodule, this is how the wrapper is implemented:

struct DirectSolver{Fac} <: AbstractLinearSolver
    factorization::Fac   # any factorization implementing lu! and ldiv!
end
DirectSolver(J::AbstractMatrix) = DirectSolver(lu(J))

By default, the constructor takes as input the initial Jacobian J and factorizes it by calling lu(J), which in Julia dispatches to UMFPACK for sparse matrices. Then, inside the function nlsolve!, we refactorize the matrix at each iteration by calling the function LinearSolvers.update!:

function update!(s::DirectSolver, J::AbstractMatrix)
    LinearAlgebra.lu!(s.factorization, J) # update the factorization in place
end

This function uses LinearAlgebra.lu! to update the factorization in place. The backsolve is computed by calling the LinearAlgebra.ldiv! function:

function ldiv!(s::DirectSolver, y::AbstractVector, J::AbstractMatrix, x::AbstractVector)
    LinearAlgebra.ldiv!(y, s.factorization, x) # Forward-backward solve
    return 0
end

Note that the code has been designed to support any factorization type implementing the two routines LinearAlgebra.lu! and LinearAlgebra.ldiv!.

Before comparing with other linear solvers, we solve a large-scale power flow instance with UMFPACK to give us a reference.

polar = ExaPF.load_polar("case9241pegase.m")
stack = ExaPF.NetworkStack(polar)
pf_solver = NewtonRaphson(tol=1e-10, verbose=2)  # power flow solver
func = ExaPF.PowerFlowBalance(polar) ∘ ExaPF.PolarBasis(polar) # power flow func
jx = ExaPF.Jacobian(polar, func, State()) # init AD
ExaPF.nlsolve!(pf_solver, jx, stack)
Power flow has converged: true
  * #iterations: 7
  * Time Jacobian (s) ........: 1.1370
  * Time linear solver (s) ...: 0.0341
     * update (s) ............: 0.0297
     * ldiv (s) ..............: 0.0044
  * Time total (s) ...........: 1.1770

KLU (CPU)

KLU is an efficient sparse linear solver, initially designed for circuit simulation problems. It is often considered one of the state-of-the-art linear solvers for power flow problems. Conveniently, KLU is wrapped in Julia by the package KLU.jl, which implements a proper interface to KLU. We just have to define a forwarding method for LinearAlgebra.lu!:

using KLU
LinearAlgebra.lu!(K::KLU.KLUFactorization, J) = KLU.klu!(K, J)

Then, we are ready to solve a power flow with KLU using our current abstraction. One just has to create a new instance of LS.DirectSolver:

klu_factorization = KLU.klu(jx.J)
klu_solver = LS.DirectSolver(klu_factorization)
ExaPF.LinearSolvers.DirectSolver{KLU.KLUFactorization{Float64, Int64}}(KLU.KLUFactorization{Float64, Int64}(KLU.LibKLU.klu_l_common_struct(0.001, 1.2, 1.2, 10.0, 0.0, 1, 0, 2, Ptr{Nothing} @0x0000000000000000, Ptr{Nothing} @0x0000000000000000, 1, 0, 0, 17036, 17036, 17036, 0, -1.0, -1.0, -1.0, -1.0, 0.0, 0x00000000005a0030, 0x0000000000760150), Ptr{Nothing} @0x000000001344bb70, Ptr{Nothing} @0x00000000150f2a20, 17036, [0, 4, 12, 19, 24, 27, 37, 40, 44, 54  …  129350, 129359, 129363, 129369, 129380, 129384, 129390, 129402, 129408, 129412], [0, 383, 5548, 13344, 1, 458, 4027, 5306, 6117, 11823  …  6009, 6175, 9238, 13805, 13971, 17034, 8879, 9239, 16675, 17035], [1050.4551624894714, -629.5373899630948, -421.251258755022, 88.40062041224643, 995.2128749049238, -479.4136240982223, -138.29482949968133, -145.95963146384614, -230.91975549270308, 22.972728768572996  …  -480.47843944268664, -39.71137052069115, 520.1642960016878, -3360.900457400674, -208.11090120000475, 3568.982258670069, -39.40403545216686, 39.40522422897141, -242.8021538639664, 242.80196093598863]))

and pass it to nlsolve!:

ExaPF.init!(polar, stack) # reinit stack
ExaPF.nlsolve!(pf_solver, jx, stack; linear_solver=klu_solver)
Power flow has converged: true
  * #iterations: 7
  * Time Jacobian (s) ........: 0.2786
  * Time linear solver (s) ...: 0.0338
     * update (s) ............: 0.0295
     * ldiv (s) ..............: 0.0043
  * Time total (s) ...........: 0.3183

In this run, the time spent in the linear solver is on par with UMFPACK; the large reduction in total time comes from the Jacobian evaluation, which likely includes compilation overhead in the first run.

cusolverRF (CUDA)

cusolverRF is an efficient LU refactorization routine implemented in CUDA. It is wrapped in Julia inside the package CUSOLVERRF.jl:

using CUSOLVERRF

The principle is the following: the initial symbolic factorization is computed on the CPU with a routine chosen by the user. Then, each time a matrix with the same sparsity pattern has to be refactorized, the numerical factorization is recomputed entirely on the GPU. In practice, this solver is efficient at refactorizing a given matrix, provided the matrix is sufficiently sparse.
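
For illustration, the refactorization pattern looks as follows. This is a sketch relying on the LinearAlgebra-style lu! and ldiv! methods implemented by CUSOLVERRF.jl; the matrix A is a random stand-in for any square sparse matrix living on the GPU:

using CUDA, CUDA.CUSPARSE, CUSOLVERRF
using LinearAlgebra, SparseArrays
A  = CuSparseMatrixCSR(sprand(100, 100, 0.05) + 10.0I)  # sparse matrix on the GPU
b  = CUDA.rand(Float64, 100)
x  = CUDA.zeros(Float64, 100)
rf = CUSOLVERRF.RFLU(A)  # initial factorization (symbolic analysis on the CPU)
lu!(rf, A)               # numeric refactorization, entirely on the GPU
ldiv!(x, rf, b)          # triangular solves on the GPU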

This is of direct relevance for us, as (i) the sparsity pattern of the power flow Jacobian doesn't change along the Newton iterations and (ii) the Jacobian is very sparse. In ExaPF, cusolverRF is the linear solver of choice when it comes to solving the power flow entirely on the GPU.

CUSOLVERRF.jl follows LinearAlgebra's interface, so we can use it directly in ExaPF. We first have to instantiate everything on the GPU:

using CUDA
polar_gpu = ExaPF.load_polar("case9241pegase.m", CUDABackend())
stack_gpu = ExaPF.NetworkStack(polar_gpu)
func_gpu = ExaPF.PowerFlowBalance(polar_gpu) ∘ ExaPF.PolarBasis(polar_gpu)
jx_gpu = ExaPF.Jacobian(polar_gpu, func_gpu, State()) # init AD
A AutoDiff Jacobian for ExaPF.ComposedExpressions{ExaPF.PolarBasis{CUDA.CuArray{Int64, 1}, CUDA.CUSPARSE.CuSparseMatrixCSR{Float64, Int32}}, ExaPF.PowerFlowBalance{CUDA.CuArray{Float64, 1}, CUDA.CUSPARSE.CuSparseMatrixCSR{Float64, Int32}}}(PolarBasis (AbstractExpression), PowerFlowBalance (AbstractExpression))
Number of Jacobian colors: 85

We can instantiate a new cusolverRF instance as

rf_fac = CUSOLVERRF.RFLU(jx_gpu.J)
rf_solver = LS.DirectSolver(rf_fac)
ExaPF.LinearSolvers.DirectSolver{CUSOLVERRF.RFLU{Float64, CUSOLVERRF.CuSparseSVDeprecated, CUSOLVERRF.CuSparseSMDeprecated}}(CUSOLVERRF.RFLU{Float64, CUSOLVERRF.CuSparseSVDeprecated, CUSOLVERRF.CuSparseSMDeprecated}
)

Then, we are able to solve the power flow entirely on the GPU, simply as

ExaPF.nlsolve!(pf_solver, jx_gpu, stack_gpu; linear_solver=rf_solver)
Power flow has converged: true
  * #iterations: 7
  * Time Jacobian (s) ........: 1.6442
  * Time linear solver (s) ...: 0.0996
     * update (s) ............: 0.0979
     * ldiv (s) ..............: 0.0017
  * Time total (s) ...........: 1.7467