Abstract
The paper describes a sparse direct solver for the linear systems that arise from the discretization of an elliptic PDE on a two-dimensional domain. The scheme decomposes the domain into thin subdomains, or “slabs,” and uses a two-level approach that is designed with parallelization in mind. The scheme takes advantage of the \(\mathcal{H}^2\)-matrix structure that emerges during factorization and uses randomized algorithms to recover this structure efficiently. As opposed to multi-level nested dissection schemes that use \(\mathcal{H}\) or \(\mathcal{H}^2\) matrices for a hierarchy of front sizes, SlabLU is a two-level scheme that uses \(\mathcal{H}^2\)-matrix algebra only for fronts of roughly the same size. This simplicity allows the scheme to be easily tuned for performance on modern architectures and GPUs. The solver described is compatible with a range of different local discretizations, and numerical experiments demonstrate its performance for regular discretizations of rectangular and curved geometries. The technique becomes particularly efficient when combined with very high-order accurate multidomain spectral collocation schemes. With this discretization, a Helmholtz problem on a domain of size \(1000\lambda \times 1000\lambda\) (for which \(N = 100\,\text{M}\)) is solved in 15 minutes to 6 correct digits on a high-powered desktop with GPU acceleration.
Acknowledgements
Anna would like to thank her dad, Andriy, for gifting her the RTX-3090 GPU.
Funding
The work reported was supported by the Office of Naval Research (N00014-18-1-2354), by the National Science Foundation (DMS-2313434 and DMS-1952735), and by the Department of Energy ASCR (DE-SC0022251).
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Communicated by: Zydrunas Gimbutas
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix 1. Rank property of thin slabs
In this appendix, we prove Proposition 1, which makes a claim on the rank structure of \(\varvec{\textsf{T}}_{11}\), defined in (17).
Proposition 1
(Rank Property) Let \(J_B\) be a contiguous set of points on the slab interface \(I_1\), and let \(J_{F} = I_1 \setminus J_{B}\) be the remaining points. The sub-matrices \((\varvec{\textsf{T}}_{11})_{BF}\) and \((\varvec{\textsf{T}}_{11})_{FB}\) of the matrix \(\varvec{\textsf{T}}_{11}\) have exact rank at most \(2b\).
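The rank bound can be probed numerically on a toy model. The sketch below is an illustration under stated assumptions, not the paper's discretization or code: it uses a 5-point Laplacian on an \(n \times (b+1)\) grid, takes column 0 as a one-sided interface \(I_1\) and columns \(1,\dots,b\) as the slab interior \(I_2\), and measures the numerical rank of the Schur-complement update \(\varvec{\textsf{A}}_{F2} \varvec{\textsf{A}}_{22}^{-1} \varvec{\textsf{A}}_{2B}\) for a contiguous block of interface rows. The function names and the tolerance are hypothetical choices made for this example, and the direct coupling \(\varvec{\textsf{A}}_{FB}\) is deliberately left out of the check.

```python
import numpy as np


def second_difference(m):
    """m x m tridiagonal matrix with 2 on the diagonal and -1 next to it."""
    return 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)


def slab_laplacian(n, b):
    """5-point Laplacian on an n x (b+1) grid (toy model, zero Dirichlet outside).

    Node (i, j) has index i*(b+1) + j; i runs along the slab, j across it.
    """
    w = b + 1
    return (np.kron(np.eye(n), second_difference(w))
            + np.kron(second_difference(n), np.eye(w)))


def schur_update_rank(n, b, i0, i1, tol=1e-10):
    """Numerical rank of A_F2 A_22^{-1} A_2B for J_B = interface rows i0..i1-1."""
    w = b + 1
    A = slab_laplacian(n, b)
    idx = np.arange(n * w)
    I1 = idx[idx % w == 0]          # interface nodes: column 0 (assumption)
    I2 = idx[idx % w != 0]          # slab interior: columns 1..b
    rows = np.arange(n)
    in_B = (rows >= i0) & (rows < i1)
    JB, JF = I1[in_B], I1[~in_B]    # contiguous block and its complement
    A22 = A[np.ix_(I2, I2)]
    AF2 = A[np.ix_(JF, I2)]
    A2B = A[np.ix_(I2, JB)]
    update = AF2 @ np.linalg.solve(A22, A2B)
    # Count singular values above an absolute threshold.
    return int(np.linalg.matrix_rank(update, tol=tol))


if __name__ == "__main__":
    n, b = 40, 4
    print("rank of update:", schur_update_rank(n, b, i0=10, i1=26), "bound 2b =", 2 * b)
```

In this toy setting the update block has size \(24 \times 16\) for the parameters shown, so a rank of at most \(2b = 8\) is a genuine restriction rather than a consequence of the block dimensions.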
Recall that \(\varvec{\textsf{T}}_{11} = \varvec{\textsf{A}}_{11} - \varvec{\textsf{A}}_{12} \varvec{\textsf{A}}_{22}^{-1} \varvec{\textsf{A}}_{21}\). The proof relies on the sparsity structure of the matrices in the Schur complement. As stated in the proposition, the slab interface \(I_1\) is partitioned into indices \(J_B\) and \(J_F\). The proof relies on partitioning \(I_2\) as well, into the indices \(J_{\alpha },J_{\beta }, J_{\gamma }\) shown in Fig. 12, where \(|J_{\gamma }| = 2b\).
The matrix \(\varvec{\textsf{A}}_{22}\) is sparse and can be factorized as
The formula for \({\left( \varvec{\textsf{T}}_{11} \right) }_{FB}\) can be re-written as
The factors \(\varvec{\textsf{X}}_{F2}\) and \(\varvec{\textsf{Y}}_{2B}\) have sparse structure, due to the sparsity in the factorization (A1) and the sparsity of \(\varvec{\textsf{A}}_{F2}\) and \(\varvec{\textsf{A}}_{2B}\).
The factors \(\varvec{\textsf{X}}_{F2}\) and \(\varvec{\textsf{Y}}_{2B}\) have the same sparsity pattern as \(\varvec{\textsf{A}}_{F2}\) and \(\varvec{\textsf{A}}_{2B}\), respectively. As a result,
Similar reasoning can be used to show the result for \(\left( \varvec{\textsf{T}}_{11} \right) _{BF}\).
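The block algebra behind this argument can be sketched as follows. This is a reconstruction under assumptions, not the authors' displayed formulas: it assumes that \(I_2\) is ordered as \((J_{\alpha}, J_{\beta}, J_{\gamma})\) with no direct coupling between \(J_{\alpha}\) and \(J_{\beta}\), that \(\varvec{\textsf{A}}_{22}\) admits a block LU factorization without pivoting (the names \(\mathsf{L}_{22}\), \(\mathsf{U}_{22}\) are introduced here), and that \(J_F\) does not couple to \(J_{\alpha}\) while \(J_B\) does not couple to \(J_{\beta}\). Which of \(J_{\alpha}\), \(J_{\beta}\) sits next to \(J_B\) is itself an assumption.

```latex
% Assumed block structure of A_22 in the ordering (alpha, beta, gamma);
% the zero blocks are preserved by block LU without pivoting.
\[
\mathsf{A}_{22} =
\begin{bmatrix}
\mathsf{A}_{\alpha\alpha} & 0 & \mathsf{A}_{\alpha\gamma} \\
0 & \mathsf{A}_{\beta\beta} & \mathsf{A}_{\beta\gamma} \\
\mathsf{A}_{\gamma\alpha} & \mathsf{A}_{\gamma\beta} & \mathsf{A}_{\gamma\gamma}
\end{bmatrix}
=
\underbrace{\begin{bmatrix}
\mathsf{L}_{\alpha\alpha} & 0 & 0 \\
0 & \mathsf{L}_{\beta\beta} & 0 \\
\mathsf{L}_{\gamma\alpha} & \mathsf{L}_{\gamma\beta} & \mathsf{L}_{\gamma\gamma}
\end{bmatrix}}_{\mathsf{L}_{22}}
\underbrace{\begin{bmatrix}
\mathsf{U}_{\alpha\alpha} & 0 & \mathsf{U}_{\alpha\gamma} \\
0 & \mathsf{U}_{\beta\beta} & \mathsf{U}_{\beta\gamma} \\
0 & 0 & \mathsf{U}_{\gamma\gamma}
\end{bmatrix}}_{\mathsf{U}_{22}}
\]

% Restrict T_11 = A_11 - A_12 A_22^{-1} A_21 to rows J_F and columns J_B.
\[
(\mathsf{T}_{11})_{FB}
= \mathsf{A}_{FB} - \mathsf{A}_{F2}\,\mathsf{A}_{22}^{-1}\,\mathsf{A}_{2B}
= \mathsf{A}_{FB} - \mathsf{X}_{F2}\,\mathsf{Y}_{2B},
\qquad
\mathsf{X}_{F2} = \mathsf{A}_{F2}\,\mathsf{U}_{22}^{-1},
\quad
\mathsf{Y}_{2B} = \mathsf{L}_{22}^{-1}\,\mathsf{A}_{2B}.
\]

% With A_{F alpha} = 0 and A_{beta B} = 0, the triangular solves keep the zero blocks.
\[
\mathsf{X}_{F2} =
\begin{bmatrix} 0 & \mathsf{X}_{F\beta} & \mathsf{X}_{F\gamma} \end{bmatrix},
\qquad
\mathsf{Y}_{2B} =
\begin{bmatrix} \mathsf{Y}_{\alpha B} \\ 0 \\ \mathsf{Y}_{\gamma B} \end{bmatrix}.
\]

% Only the gamma blocks survive in the product.
\[
\mathsf{X}_{F2}\,\mathsf{Y}_{2B} = \mathsf{X}_{F\gamma}\,\mathsf{Y}_{\gamma B},
\qquad
\operatorname{rank}\bigl(\mathsf{X}_{F\gamma}\,\mathsf{Y}_{\gamma B}\bigr)
\le |J_{\gamma}| = 2b.
\]
```

Under these assumptions the Schur-complement update factors through the \(2b\) indices in \(J_{\gamma}\), which is where the bound in Proposition 1 comes from.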
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yesypenko, A., Martinsson, PG. SlabLU: a two-level sparse direct solver for elliptic PDEs. Adv Comput Math 50, 90 (2024). https://doi.org/10.1007/s10444-024-10176-x
DOI: https://doi.org/10.1007/s10444-024-10176-x
Keywords
- Direct solver
- Sparse direct solver
- Randomized linear algebra
- Multifrontal solver
- High-order discretization
- GPU
- Helmholtz equation