1 Introduction

Solving large problems on a computer often requires huge computational resources; therefore, computing systems with distributed memory are used, together with special numerical methods for the approximation of dense matrices. In this paper, using the example of the problem of diffraction of electromagnetic waves, we present the features of applying low-rank approximation methods on supercomputers.

Solving the problem of diffraction of electromagnetic waves is equivalent to solving a hypersingular integral equation (as in [2,3,4]). A low-rank approximation method has been developed for supercomputers in order to solve a system with a large dense matrix. The system with the compressed matrix is solved by the GMRES method. In order to reduce the number of iterations of this iterative algorithm, we propose a preconditioner that uses the mosaic structure of the original matrix.

2 Electrodynamics Problem

Let us consider the diffraction problem on a perfectly conducting surface \(\varSigma \), which can be either closed or open.

A monochromatic wave with frequency \(\omega \) satisfies the Maxwell equations,

$$\begin{aligned} \nabla \times \varvec{E} = i \mu \mu _0\omega \varvec{H},\qquad \nabla \times \varvec{H} = -i \varepsilon \varepsilon _0\omega \varvec{E}. \end{aligned}$$

On a perfectly conducting surface the following boundary condition holds,

$$\begin{aligned} \varvec{n}\times (\varvec{E}_0+\varvec{E}) = 0, \end{aligned}$$

where \(\varvec{E}_0\) is a given function defined by the incident wave (we assume the incident wave is a plane wave), and \(\varvec{n}\) is the normal vector to the surface.

To find a unique solution it is necessary to impose the additional conditions

$$\begin{aligned} \varvec{E}\in L_2^{\mathrm{loc}}(\varOmega ) \end{aligned}$$

and

$$\begin{aligned} \frac{d}{d\tau }\left( \begin{array}{c} \varvec{E}\\ \varvec{H} \end{array}\right) - ik \left( \begin{array}{c}\varvec{E}\\ \varvec{H} \end{array}\right) = o\left( \frac{1}{|\varvec{x}|}\right) ,\ \tau = |\varvec{x}|,\ \tau \rightarrow \infty . \end{aligned}$$

In accordance with [2], the problem can be reduced to the electric field integral equation for the unknown \(\varvec{j}(y)\):

$$\begin{aligned} \varvec{n}\times \iint \limits _{\varSigma } \varvec{j}(y) \Bigl ( \mathrm{grad}\,\mathrm{div}\,F(x-y)+k^2 F(x-y)\Bigr )\,d\sigma _y = - \varvec{n}\times \varvec{E}_0(x),\quad x\in \varSigma , \end{aligned}$$
(1)

where \(k=\omega \sqrt{\varepsilon \varepsilon _0\mu \mu _0}\) is the wave number, and

$$\begin{aligned} F(R)=\frac{\exp {(ikR)}}{R},\ R=|x-y|. \end{aligned}$$

In Eq. (1) the integral can be understood in the sense of the Hadamard finite part.

For the numerical solution of Eq. (1) we use the numerical scheme presented in [5]. In this scheme the surface is uniformly divided into cells \(\sigma _i,\) \(i=1,\dots ,n\), and for each cell an orthonormal basis \(\varvec{e}_{i1}\), \(\varvec{e}_{i2}\) is introduced. For each cell \(\sigma _i\) it is assumed that \(\varvec{j}_i=\varvec{j}(x_i)\), where \(x_i\) is the center of mass of the cell. Each cell is considered planar. Discretization of the integral operator produces a matrix consisting of \(2\times 2\) blocks:

$$\begin{aligned}
&A_{ij} = \begin{pmatrix} \varvec{E}_{1j}(x_i)\cdot \varvec{e}_{1i} & \varvec{E}_{2j}(x_i)\cdot \varvec{e}_{1i}\\ \varvec{E}_{1j}(x_i)\cdot \varvec{e}_{2i} & \varvec{E}_{2j}(x_i)\cdot \varvec{e}_{2i} \end{pmatrix},\\
&\varvec{E}_{1j}(x_i)=\int \limits _{\partial \sigma _j} \varvec{Q}(x_i)\,de_2 + k^2\,\varvec{e}_{1j} \int \limits _{\sigma _j}\frac{\exp {(ikR)}}{R}\,d\sigma ,\\
&\varvec{E}_{2j}(x_i)=-\int \limits _{\partial \sigma _j} \varvec{Q}(x_i)\,de_1 + k^2\,\varvec{e}_{2j} \int \limits _{\sigma _j}\frac{\exp {(ikR)}}{R}\,d\sigma ,\qquad \varvec{Q}(x) = \nabla _y\frac{\exp {(ik|x-y|)}}{|x-y|}.
\end{aligned}$$
(2)

In (2) the contour and surface integrals are calculated numerically.
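
As an illustration, the sketch below evaluates the smooth surface integral \(\int _{\sigma _j}\exp {(ikR)}/R\,d\sigma \) from (2) over a planar triangular cell by a standard three-point quadrature rule. This is not the scheme of [5]: the function names and the triangular cell shape are assumptions for illustration, the rule is valid only when \(x_i\) is well separated from the cell, and the contour integrals and the (nearly) singular case require dedicated treatment.

```python
import numpy as np

# 3-point symmetric quadrature rule on a triangle (exact for degree 2):
# barycentric coordinates of the nodes and weights summing to one.
BARY = np.array([[2/3, 1/6, 1/6],
                 [1/6, 2/3, 1/6],
                 [1/6, 1/6, 2/3]])
W = np.array([1/3, 1/3, 1/3])

def surface_integral(x_i, tri, k):
    """Approximate the integral of exp(ikR)/R, R = |x_i - y|, over a
    planar triangular cell with vertices tri = (a, b, c) in 3D."""
    a, b, c = tri
    area = 0.5 * np.linalg.norm(np.cross(b - a, c - a))
    pts = BARY @ np.vstack([a, b, c])          # quadrature nodes in 3D
    R = np.linalg.norm(pts - x_i, axis=1)
    return area * np.sum(W * np.exp(1j * k * R) / R)
```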

3 Mosaic Skeleton Approximations

The problem reduces to the solution of the linear system of algebraic equations

$$\begin{aligned} A\, z = b \end{aligned}$$
(3)

with a dense matrix A. To approximate the matrix we use the mosaic-skeleton method [6,7,8]. It partitions the matrix hierarchically into blocks, and the low-rank matrix blocks can be calculated independently using the incomplete cross approximation algorithm.
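
The following is a minimal sketch of a cross approximation with partial pivoting, in the spirit of the incomplete cross approximation algorithm used here; the exact pivoting and stopping rules of [6,7,8] may differ, and the function names are assumptions. The block is accessed only through single rows and columns and is never assembled in full.

```python
import numpy as np

def aca(get_row, get_col, m, n, eps=1e-3, max_rank=None):
    """Cross approximation of an m-by-n block A ~ U @ V, where
    get_row(i) and get_col(j) return one row/column of the block."""
    if max_rank is None:
        max_rank = min(m, n)
    U = np.zeros((m, 0), dtype=complex)
    V = np.zeros((0, n), dtype=complex)
    used = np.zeros(m, dtype=bool)
    i = 0
    approx_norm2 = 0.0
    for _ in range(max_rank):
        used[i] = True
        row = get_row(i) - U[i, :] @ V             # residual of row i
        j = int(np.argmax(np.abs(row)))
        if abs(row[j]) < 1e-14:                    # block exhausted
            break
        col = (get_col(j) - U @ V[:, j]) / row[j]  # residual column j
        U = np.hstack([U, col[:, None]])
        V = np.vstack([V, row[None, :]])
        term = np.linalg.norm(col) * np.linalg.norm(row)
        approx_norm2 += term**2                    # rough norm estimate
        if term <= eps * np.sqrt(approx_norm2):    # relative accuracy met
            break
        # next pivot row: largest residual-column entry among unused rows
        col_abs = np.abs(col)
        col_abs[used] = -1.0
        i = int(np.argmax(col_abs))
    return U, V
```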

Let us investigate the approximation algorithm. In all examples below the surface \(\varSigma \) in Eq. (1) is a circular cylinder with a diameter of 15 cm and a height of 25 cm.

In Fig. 1 we present the inverse Radar Cross Section (RCS) for the frequency of 16 GHz. The value \(\sigma \) for different directions \(\tau \) of the wave vector of the incident wave is calculated as

$$\begin{aligned} \sigma (\tau ) = \frac{4\pi }{|\varvec{E}_0|^2}\left| \sum \limits _{i=1}^n \left( \varvec{j}_i-\tau (\tau \cdot \varvec{j}_i)\right) k^2\exp {(-ik\tau \cdot x_i)}\,\sigma _i \right| ^2. \end{aligned}$$
(4)

The black points show the results of the experiment; the grey line shows the results of the numerical simulation.
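
A direct transcription of formula (4) might look as follows (a sketch; the argument names are illustrative):

```python
import numpy as np

def rcs(tau, j, x, areas, k, E0_norm=1.0):
    """Inverse RCS from formula (4).

    tau    : (3,)  unit direction of the incident wave vector
    j      : (n,3) surface currents j_i at the cell centers
    x      : (n,3) cell centers x_i
    areas  : (n,)  cell areas sigma_i
    k      : wave number; E0_norm = |E_0|
    """
    proj = j - np.outer(j @ tau, tau)            # j_i - tau (tau . j_i)
    weights = k**2 * np.exp(-1j * k * (x @ tau)) * areas
    s = np.sum(proj * weights[:, None], axis=0)  # the vector sum in (4)
    return 4.0 * np.pi / E0_norm**2 * np.linalg.norm(s)**2
```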

In all calculations the number of cells is 192 156, the number of right-hand sides is 2048, the accuracy of the matrix approximation is \(10^{-3}\), and the accuracy of solving the system is \(5\cdot 10^{-3}\).

The accuracy of the matrix approximation and the accuracy of the solution of the system are determined by comparing the RCS obtained from experiments and from approximate calculations for a large number of problems.

Fig. 1. RCS, 16 GHz, vertical polarization, \(n=192\,156\)

In Table 1 one can see the number of iterations required to solve the system with 2048 right-hand sides to an accuracy of \(5\cdot 10^{-3}\) for different frequencies and numbers of cells n.

Table 1. Number of iterations for various parameters of the electrodynamics problem.

It can be seen from Table 1 that the number of iterations grows significantly with frequency, which demands a lot of memory and computational time.

4 Preconditioner

Let us reduce the number of iterations of the GMRES method by using a preconditioner for solving the system (3). The technology for building and using the preconditioner is described in the monograph [9].

The matrix is divided into blocks according to the mosaic-skeleton approximation method. The blocks corresponding to the interaction of distant domains are assumed to have low rank. All elements of the remaining blocks are calculated; such non-low-rank blocks will be called dense. For low-rank blocks, an approximation is constructed with a given accuracy [6, 10]. Suppose that for a block of size \(m\times n\) an approximation of rank r is constructed. If the condition \(m\cdot n < r(m+n)\) is satisfied, then storing the block in full is cheaper than storing its rank-r factors, and such a block is also treated as dense. Dense blocks are shown in red in Fig. 2.
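
The storage rule can be written as a one-line predicate (a sketch of the criterion stated above):

```python
def is_dense(m, n, r):
    """Storage rule from the text: a rank-r factorization of an
    m x n block costs r*(m+n) entries versus m*n for the full block;
    if the full block is cheaper, it is kept dense."""
    return m * n < r * (m + n)

# Example: a 64 x 64 block of approximation rank 40 is kept dense,
# since 64*64 = 4096 < 40*(64+64) = 5120.
```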

Fig. 2. Mosaic matrix structure (Color figure online)

Fig. 3. Nonzero blocks of the preconditioner (Color figure online)

A matrix from the mosaic-skeleton representation is selected as the preconditioner M of the system (3), in which only the dense blocks are kept, while the low-rank blocks are set to zero (see Fig. 3). The matrix M is sparse; therefore, its LU decomposition is built quickly. The LU factors are also sparse, so only a small amount of RAM is needed to store the L and U matrices, and the system with the preconditioner is solved quickly:

$$\begin{aligned} M\, y = c. \end{aligned}$$
(5)
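
A serial sketch of the assembly and factorization of M might look as follows; here SciPy's splu stands in for the distributed MUMPS factorization used in the paper, and the block-list format is an assumption for illustration.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

def build_preconditioner(dense_blocks, N):
    """Assemble M from the dense blocks of the mosaic partition
    (low-rank blocks dropped) and factorize it.  dense_blocks is a
    list of (row_offset, col_offset, block) triples."""
    rows, cols, vals = [], [], []
    for i0, j0, B in dense_blocks:
        m, n = B.shape
        ii, jj = np.meshgrid(np.arange(m), np.arange(n), indexing="ij")
        rows.append((i0 + ii).ravel())
        cols.append((j0 + jj).ravel())
        vals.append(B.ravel())
    M = sp.csc_matrix((np.concatenate(vals),
                       (np.concatenate(rows), np.concatenate(cols))),
                      shape=(N, N))
    return splu(M)   # sparse LU; M y = c is solved via .solve(c)
```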

The construction of the LU decomposition of sparse systems on multiprocessor computing systems is implemented in many packages. In this paper, the author used the MUMPS package [11].

Let us apply MUMPS to the solution of the system (3):

  1. the original matrix A is represented in the low-rank format, and the structure of the matrix and the locations of its dense blocks are determined;

  2. information about the dense blocks found is gathered on a single processor by the MUMPS package, and the distribution of the computed elements of the matrix M over the processors is determined;

  3. the preconditioner M is calculated;

  4. the LU decomposition of the sparse matrix is calculated in parallel by the MUMPS package;

  5. when solving the system (3) by the GMRES method, the system (5) is solved at each iteration using the MUMPS package and the LU decomposition of the matrix M.
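
A serial sketch of steps 4-5 above, with SciPy standing in for MUMPS; the operator and argument names are assumptions (in recent SciPy the GMRES tolerance argument is rtol, in older versions tol):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def solve_system(apply_A, lu_M, b, tol=5e-3):
    """GMRES on A z = b with the preconditioner M applied through its
    sparse LU factors: at each iteration the system M y = c (5) is
    solved by forward/backward substitution (lu_M.solve).

    apply_A : function computing the matrix-vector product A @ v
              (in the paper, through the mosaic-skeleton format)
    lu_M    : factorization object returned by splu
    """
    N = b.shape[0]
    A = LinearOperator((N, N), matvec=apply_A, dtype=complex)
    M = LinearOperator((N, N), matvec=lu_M.solve, dtype=complex)
    z, info = gmres(A, b, M=M, rtol=tol)
    return z, info   # info == 0 on convergence
```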

The MUMPS package supports several methods for constructing the LU decomposition of sparse systems. In this paper, the author used the method that produces the least fill-in in the L and U factors [11].

5 Numerical Results

Table 2 shows the effectiveness of the preconditioner for solving systems of different sizes N. In the table, \(p_1\) is the number of iterations without a preconditioner and \(p_2\) is the number of iterations with a preconditioner. The frequency in this example is 8 GHz.

Table 2. Number of iterations for solving the system with and without a preconditioner.
Fig. 4. RCS, 32 GHz, horizontal polarization, \(N = 1\,507\,200\) (Color figure online)

Fig. 5. RCS, 64 GHz, horizontal polarization, \(N = 2\,006\,928\) (Color figure online)

Without a preconditioner, the RCS of the cylinder can be computed on supercomputers only for the frequency of 16 GHz (see Fig. 1). The preconditioner makes it possible to obtain the solution of the integral equation and \(\sigma (\tau )\) (4) for 32 GHz and 64 GHz.

In Figs. 4 and 5, the RCS obtained from the solution of the integral equation (IE) is shown in black, and the RCS obtained by the physical optics method (FO) [12] is shown in red.

In this section we used the "Zhores" supercomputer installed at the Skolkovo Institute of Science and Technology [1]. The work was supported by the Russian Science Foundation, grant 19-11-00338.