1 Introduction

Contemporary antenna design strongly relies on electromagnetic (EM) simulations [1]. For accurate evaluation of responses, antenna models should account for environmental features such as the installation fixture, connectors, and housing. Contemporary computational techniques—implemented in commercial simulation packages—are capable of obtaining quite accurate reflection and radiation antenna responses. However, full-wave simulations of realistic models are computationally expensive, and simulation even for a single combination of design parameters may take up to several hours. This computational cost poses a significant problem for antenna design.

The task of automatically adjusting antenna parameters can be formulated as an optimization problem with the objective function supplied by an EM solver [2]. However, most conventional optimization techniques, both gradient-based [3] (e.g., conjugate-gradient, quasi-Newton, sequential quadratic programming) and derivative-free [4] (e.g., Nelder–Mead and pattern search), require large numbers of design simulations, each of which is already computationally expensive. As a consequence, the direct use of the EM solver to evaluate the high-fidelity antenna model in the optimization loop is often impractical due to the unacceptably high computational cost. Other obstacles to the successful application of conventional optimization techniques to antenna design originate from the poor analytical properties of simulation-based objective functions (e.g., discontinuity, numerical noise). As a result, the practice of simulation-driven antenna design relies on repetitive parameter sweeps. While this approach can be more reliable than brute-force antenna optimization, it is very laborious and time-consuming, and it does not guarantee optimal results. Also, only antenna designs with a limited number of parameters can be handled this way.

Adjoint sensitivities can substantially speed up microwave design optimization while using gradient-based algorithms [5] and [6]; however, adjoint sensitivities are not yet widespread in commercial EM solvers. Only CST Microwave Studio [7] and HFSS [8] have recently implemented this feature. Also, the use of adjoint sensitivities is limited by the numerical noise of the response.

Population-based techniques (metaheuristics) have recently become popular in solving certain antenna-design-related tasks [9, 10]. Methods such as genetic algorithms [11], particle swarm optimizers [12], or ant colony optimization [13] can alleviate certain problems (e.g., getting stuck in the local optimum). However, these methods are mainly applicable if objective function evaluation is very fast, for example, for synthesis of antenna array patterns [14]. The use of such techniques for simulation-based antenna design is questionable due to the large number of model evaluations required by metaheuristics.

In recent years, there has been a growing interest in surrogate-based optimization (SBO) methods [15–17], where direct optimization of the CPU-intensive full-wave EM model is replaced by iterative updating and reoptimization of a cheap and yet reasonably accurate representation of the antenna structure under consideration, called the surrogate model. There are many techniques exploiting both approximation surrogates, e.g., neural networks [18, 19], support vector regression [20, 21], radial basis functions [22], kriging [23, 24], fuzzy systems [25], and rational approximations [26], as well as physics-based surrogates (space mapping [15, 27–29], simulation-based tuning [30–32], manifold mapping [33], and shape-preserving response prediction [34]). Approximation models are fast and universal; however, they are associated with a high initial cost (due to sampling of the design space and acquiring EM simulation data) and they are typically not suitable for ad hoc optimization. Techniques exploiting physics-based surrogates are particularly attractive because they are capable of yielding a satisfactory design using a very limited number of expensive high-fidelity EM simulations [15].

One of the most important assumptions to ensure efficiency of the SBO techniques exploiting physics-based surrogates is that the underlying low-fidelity model is computationally cheap. The most prominent technique of this kind is space mapping [34]. It originated in the area of microwave filter design, where this assumption is naturally satisfied with circuit equivalents [15] serving as low-fidelity models for filters. In the case of antennas, physics-based surrogates can be obtained from coarse-discretization EM simulations, as this is the only versatile way to create lower-fidelity antenna models. Unfortunately, these models may be relatively expensive. As a result, their evaluation cost cannot be neglected and may contribute considerably to the overall design expenses.

Therefore, the proper choice of the surrogate model fidelity (controlled, among other things, by the mesh density) is of great significance. On one hand, using a coarser low-fidelity model allows us to reduce its evaluation time; on the other hand, the coarser models are less accurate. As a result, a large number of iterations of the SBO algorithm may be necessary to yield a satisfactory design so that the total cost may be about the same or even higher than the total cost of an optimization algorithm employing only the finer model. Also, the SBO process may simply fail if the underlying low-fidelity model is not sufficiently accurate. For finer models, the individual evaluation time may be higher, but this is not directly translated into a higher total design cost because a smaller number of iterations may be sufficient to find a good design. In general, finding a good trade-off between the low-fidelity model speed and accuracy is not obvious.

In this chapter, we will review antenna design using physics-based surrogates originating from the coarse-mesh models. We also study the importance of a proper selection of the antenna model fidelity and its influence on performance of the surrogate-based design process in terms of the computational cost and design quality. Furthermore, we investigate the potential benefits of using several models of different fidelity in the same optimization run.

2 Surrogate-Based Design Optimization of Antennas

In this section, we consider a number of antenna design examples. In every example we describe the antenna structure under design, formulate the design problem, and outline the SBO technique that seems to be the most suitable to handle that particular antenna of interest. Results as well as design computational costs are provided.

2.1 Optimization of a Microstrip Composite Antenna Using the Multi-Fidelity Optimization Technique

Consider the composite microstrip antenna [35] shown in Fig. 1. The design variables are \(\boldsymbol{x} = [l_{1}\ l_{2}\ l_{3}\ l_{4}\ w_{2}\ w_{3}\ d_{1}\ s]^{T}\). The multilayer substrate is \(l_{s} \times l_{s}\) (\(l_{s} = 30\) mm). The antenna stack comprises a metal ground, RO4003 dielectric, signal trace, RO3006 dielectric with a through via connecting the trace to the driven patch, the driven patch, RO4003 dielectric, and four extra patches. The signal trace is terminated with an open-end stub. Feeding is with a 50 ohm SMA connector. The stack is fixed with four through bolts at the corners.

Fig. 1 Microstrip antenna [35]: top and side views, substrates shown transparent

The final design is required to satisfy \(|S_{11}| \le -10\) dB for 3.1–4.8 GHz. The IEEE gain is required to be not less than 5 dB for the zero zenith angle over the whole frequency band of interest.

In this example, the antenna under design is of relatively complex composition; therefore, the choice of the mesh density for the coarse discretization model as well as other settings of the EM solver, here the CST MWS transient solver, strongly affect the total design optimization time. On the other hand, the computational cost of the model and its accuracy can be easily controlled by changing the discretization density. This feature has been exploited in the multi-fidelity optimization algorithm introduced in [36].

The multi-fidelity optimization is based on a family of coarse-discretization models \(\{\boldsymbol{R}_{c.j}\}\), \(j = 1, \ldots, K\), all evaluated by the same EM solver. Discretization of the model \(\boldsymbol{R}_{c.j+1}\) is finer than that of the model \(\boldsymbol{R}_{c.j}\), which results in better accuracy but also a longer evaluation time. In practice, the number of coarse-discretization models, \(K\), is two or three.

When we have the optimized design \(\boldsymbol{x}^{(K)}\) of the finest coarse-discretization model \(\boldsymbol{R}_{c.K}\), the model is evaluated at all perturbed designs around \(\boldsymbol{x}^{(K)}\), i.e., at \(\boldsymbol{x}_{k}^{(K)}= [x_{1}^{(K)} \cdots x _{k}^{(K)} + \mathrm{sign}(k)\cdot d_{k} \cdots x_{n}^{(K)}]^{T}\), \(k = -n, -n+1, \ldots, n-1, n\). These data can be used to refine the final design without directly optimizing \(\boldsymbol{R}_{f}\). Instead, an approximation model is set up and optimized in the neighborhood of \(\boldsymbol{x}^{(K)}\) defined as \([\boldsymbol{x}^{(K)} - \boldsymbol{d}, \boldsymbol{x}^{(K)} + \boldsymbol{d}]\), where \(\boldsymbol{d} = [d_{1}\ d_{2}\ \ldots\ d_{n}]^{T}\), and \(\boldsymbol{R}^{(k)}\) stands for \(\boldsymbol{R}_{ c. K}(\boldsymbol{x}_{k}^{(K)})\). The size of the neighborhood can be selected based on a sensitivity analysis of \(\boldsymbol{R}_{c.1}\) (the cheapest of the coarse-discretization models); usually \(\boldsymbol{d}\) equals 2 to 5 percent of \(\boldsymbol{x}^{(K)}\). A reduced quadratic model \(\boldsymbol{q}(\boldsymbol{x}) = [q_{1}\ q_{2}\ \ldots\ q_{m}]^{T}\) is used for approximation, where

$$\begin{aligned} q_{j}(\boldsymbol{x}) =& q_{j}\bigl([x_{1}\ldots x_{n}]^{T}\bigr) \\=& \lambda_{j.0} + \lambda _{j.1}x_{1} +\cdots+ \lambda_{j.n}x_{n} + \lambda_{j.n + 1}x_{1}^{2} +\cdots+ \lambda _{j.2n}x_{n}^{2}. \end{aligned}$$
(1)

Coefficients λ j.r , j=1,…,m, r=0,1,…,2n, can be uniquely obtained by solving the linear regression problem.
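As an illustration of the regression step, the following is a minimal sketch, assuming NumPy and assuming the 2n+1 training designs are the center design plus one positive and one negative perturbation per variable. The function names (`star_designs`, `fit_reduced_quadratic`, `eval_reduced_quadratic`) are hypothetical and not part of the original method description.

```python
import numpy as np

def star_designs(x_center, d):
    """Center design plus +/- d_k perturbations along each variable (2n+1 points)."""
    x_center = np.asarray(x_center, dtype=float)
    d = np.asarray(d, dtype=float)
    points = [x_center.copy()]
    for k in range(x_center.size):
        for sign in (+1.0, -1.0):
            p = x_center.copy()
            p[k] += sign * d[k]
            points.append(p)
    return np.array(points)

def fit_reduced_quadratic(X, R):
    """Fit q_j(x) = l_0 + sum_i l_i x_i + sum_i l_{n+i} x_i^2 for every response j.

    X: (p, n) designs, R: (p, m) responses. Returns an (m, 2n+1) coefficient array.
    """
    X = np.asarray(X, dtype=float)
    R = np.asarray(R, dtype=float)
    # Regression matrix with columns [1, x_1..x_n, x_1^2..x_n^2]; no mixed terms.
    A = np.hstack([np.ones((X.shape[0], 1)), X, X ** 2])
    coef, *_ = np.linalg.lstsq(A, R, rcond=None)
    return coef.T

def eval_reduced_quadratic(lam, x):
    """Evaluate the fitted model q(x) (all m components) at a single design x."""
    x = np.asarray(x, dtype=float)
    basis = np.concatenate([[1.0], x, x ** 2])
    return lam @ basis
```

With exactly 2n+1 star-distributed designs the regression matrix is square and nonsingular, which is why the coefficients are unique; with more samples the same call returns the least-squares fit.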

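In order to account for unavoidable misalignment between \(\boldsymbol{R}_{c.K}\) and \(\boldsymbol{R}_{f}\), it is recommended to optimize a corrected model \(\boldsymbol{q}(\boldsymbol{x}) + [\boldsymbol{R}_{f}(\boldsymbol{x}^{(K)}) - \boldsymbol{R}_{c.K}(\boldsymbol{x}^{(K)})]\) that ensures a zero-order consistency [37] between \(\boldsymbol{R}_{c.K}\) and \(\boldsymbol{R}_{f}\). The refined design can then be found as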

$$ \boldsymbol{x}^{*} = \arg \min_{\boldsymbol{x}^{(K)} - \boldsymbol{d} \le \boldsymbol{x} \le \boldsymbol{x}^{(K)} + \boldsymbol{d}}U\bigl( \boldsymbol{q}(\boldsymbol{x}) + \bigl[\boldsymbol{R}_{f}\bigl( \boldsymbol{x}^{(K)}\bigr) - \boldsymbol{R}_{c.K}\bigl( \boldsymbol{x}^{(K)}\bigr)\bigr]\bigr). $$
(2)

This kind of correction is also known as output space mapping [15]. If necessary, step (2) can be performed a few times starting from a refined design where each iteration requires only one evaluation of R f .

The multi-fidelity optimization procedure can be summarized as follows (the input arguments are the initial design \(\boldsymbol{x}^{(0)}\) and the number of coarse-discretization models \(K\)):

  1. Set \(j = 1\);

  2. Optimize the coarse-discretization model \(\boldsymbol{R}_{c.j}\) to obtain a new design \(\boldsymbol{x}^{(j)}\) using \(\boldsymbol{x}^{(j-1)}\) as a starting point;

  3. Set \(j = j + 1\); if \(j \le K\) go to 2;

  4. Obtain a refined design \(\boldsymbol{x}^{*}\) as in (2);

  5. END

Note that the original model \(\boldsymbol{R}_{f}\) is only evaluated at the final stage (step 4). The operation of the algorithm is illustrated in Fig. 2. Coarse-discretization models can be optimized using any available algorithm.
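The procedure summarized above can be sketched as a short loop. This is a minimal illustration under stated assumptions: the coarse models are represented by hypothetical scalar objective callables (standing in for the objective evaluated on each coarse-discretization model), the optimizer is a simple coordinate pattern search chosen only for self-containment, and `refine` is a placeholder for the refinement step (2).

```python
def pattern_search(objective, x0, step=1.0, tol=1e-3, max_evals=500):
    """Derivative-free coordinate search (a stand-in for 'any available algorithm')."""
    x = list(x0)
    fx = objective(x)
    evals = 1
    while step > tol and evals < max_evals:
        improved = False
        for i in range(len(x)):
            for s in (+step, -step):
                trial = list(x)
                trial[i] += s
                ft = objective(trial)
                evals += 1
                if ft < fx:
                    x, fx, improved = trial, ft, True
                    break
        if not improved:
            step *= 0.5  # shrink the search step when no coordinate move helps
    return x, evals

def multi_fidelity_optimize(x0, coarse_objectives, refine=lambda x: x):
    """Optimize models j = 1..K in turn, each starting from the previous optimum."""
    x = list(x0)
    evals_per_model = []
    for objective in coarse_objectives:      # coarsest model first, finest last
        x, n = pattern_search(objective, x)
        evals_per_model.append(n)
    return refine(x), evals_per_model        # refine() stands for step 4

# Toy usage: two quadratic objectives whose optima differ slightly,
# mimicking two coarse models of increasing fidelity (hypothetical data).
coarse = [
    lambda x: (x[0] - 1.0) ** 2 + (x[1] + 0.5) ** 2,
    lambda x: (x[0] - 1.2) ** 2 + (x[1] + 0.7) ** 2,
]
x_final, evals = multi_fidelity_optimize([0.0, 0.0], coarse)
```

Because each finer model starts from the optimum of the coarser one, most of the search effort is spent on the cheapest model, which is the intent of the procedure.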

Fig. 2 Operation of the multi-fidelity design optimization procedure for three coarse-discretization models (\(K = 3\)). The design \(\boldsymbol{x}^{(j)}\) is the optimal solution of the model \(\boldsymbol{R}_{c.j}\), \(j = 1, 2, 3\). A reduced second-order model \(\boldsymbol{q}\) is set up in the neighborhood of \(\boldsymbol{x}^{(K)}\) (gray area). The final design \(\boldsymbol{x}^{*}\) is obtained by optimizing the model \(\boldsymbol{q}\) as in (2)

Application of the multi-fidelity optimization methodology to this example can be outlined as follows. The initial design is set to \(\boldsymbol{x}^{(0)} = [15\ 15\ 15\ 20\ {-4}\ 2\ 2]^{T}\) mm. Two coarse-discretization models are used: \(\boldsymbol{R}_{c.1}\) (122,713 mesh cells at \(\boldsymbol{x}^{(0)}\)) and \(\boldsymbol{R}_{c.2}\) (777,888 mesh cells). The evaluation times for \(\boldsymbol{R}_{c.1}\), \(\boldsymbol{R}_{c.2}\), and \(\boldsymbol{R}_{f}\) (2,334,312 mesh cells) are 3 min, 18 min, and 160 min at \(\boldsymbol{x}^{(0)}\), respectively. \(|S_{11}|\) is the objective function with the goal of \(|S_{11}| \le -10\) dB for 3.1–4.8 GHz. An IEEE gain of not less than 5 dB for the zero zenith angle over the band is implemented as an optimization constraint.

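Figure 3(a) shows the responses of \(\boldsymbol{R}_{c.1}\) at \(\boldsymbol{x}^{(0)}\) and at its optimal design \(\boldsymbol{x}^{(1)}\). Figure 3(b) shows the responses of \(\boldsymbol{R}_{c.2}\) at \(\boldsymbol{x}^{(1)}\) and at its optimized design \(\boldsymbol{x}^{(2)}\). Figure 3(c) shows the responses of \(\boldsymbol{R}_{f}\) at \(\boldsymbol{x}^{(0)}\), at \(\boldsymbol{x}^{(2)}\), and at the refined design \(\boldsymbol{x}^{*} = [14.87\ 13.95\ 15.4\ 13.13\ 20.87\ {-5.90}\ 2.88\ 0.68]^{T}\) mm (\(|S_{11}| \le -11.5\) dB for 3.1 GHz to 4.8 GHz) obtained in two iterations of the refinement step (2).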

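Fig. 3 Microstrip antenna: (a) model \(\boldsymbol{R}_{c.1}\) at the initial design \(\boldsymbol{x}^{(0)}\) (- - -) and at the optimized design \(\boldsymbol{x}^{(1)}\) (—); (b) model \(\boldsymbol{R}_{c.2}\) at \(\boldsymbol{x}^{(1)}\) (- - -) and at its optimized design \(\boldsymbol{x}^{(2)}\) (—); (c) model \(\boldsymbol{R}_{f}\) at \(\boldsymbol{x}^{(0)}\) (⋅⋅⋅⋅), at \(\boldsymbol{x}^{(2)}\) (- - -), and at the refined final design \(\boldsymbol{x}^{*}\) (—) [38]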

The design cost, shown in Table 1, corresponds to about 12 runs of the high-fidelity model R f . The antenna gain at the final design is shown in Fig. 4.

Fig. 4 Microstrip antenna gain [dBi] of the final design at 3.5 GHz (⋅ − ⋅), 4.0 GHz (– –), and 4.5 GHz (—): (a) co-pol. in the E-plane (XOZ), with connector at 90° on the right; (b) x-pol., primary (thick lines) and co-pol. (thin lines) in the H-plane [39]

Table 1 Microstrip antenna: optimization cost [38]

2.2 Optimization of a Broadband Dielectric Resonator Antenna Using the Adaptively Adjusted Design Specifications Technique

Consider the rotationally symmetric dielectric resonator antenna (DRA) [40] shown in Fig. 5. It comprises two annular ring dielectric resonators (DRs) with a relative permittivity, ε r1, of 36; two supporting Teflon rings; a probe; and a cylindrical Teflon filling. The inner radius of the filling is the radius of the probe, 1.27 mm. The probe is an extension (h 0 above the ground) of the inner conductor of the input 50 ohm coaxial cable. The radius of each supporting ring equals that of the DR above it. All metal parts are modeled as perfect electric conductors (PECs). The coax is also filled by Teflon. The ground is of infinite extent.

Fig. 5 Annular ring dielectric resonator antenna [40]: side view

The design variables are the inner and outer radii of the DRs, heights of the DRs and the supporting rings, and the probe length, namely, \(\boldsymbol{x} = [a_{1}\ a_{2}\ b_{1}\ b_{2}\ h_{1}\ h_{2}\ g_{1}\ g_{2}\ h_{0}]^{T}\). The design objective is \(|S_{11}| \le -20\) dB for 4 GHz to 6 GHz. A broadside gain of not less than 5 dBi is an optimization constraint.

Here, the overall shape of the low- and high-fidelity model responses is quite similar; therefore we use the adaptively adjusted design specifications (AADS) technique [41] that allows us to account for the misalignment between the models without actually adjusting the low-fidelity one. AADS consists of the following two steps that can be iterated if necessary:

  1. Modify the original design specifications to account for the difference between the responses of the high-fidelity model \(\boldsymbol{R}_{f}\) and the coarse-discretization model \(\boldsymbol{R}_{cd}\) at their characteristic points.

  2. Obtain a new design by optimizing the low-fidelity model \(\boldsymbol{R}_{cd}\) with respect to the modified specifications.

As R cd is much faster than R f , the design process can be performed at low cost compared to direct optimization of R f . Figure 6 explains the idea of AADS using a bandstop filter example [41]. The design specifications are adjusted using characteristic points that should correspond to the design specification levels. They should also include local maxima/minima of the responses at which the specifications may not be satisfied.

Fig. 6 AADS concept (responses of \(\boldsymbol{R}_{f}\) (—) and \(\boldsymbol{R}_{cd}\) (- - -)) [41]: (a) responses at the initial design and the original design specifications, (b) characteristic points of the responses corresponding to the specification levels (here, −3 dB and −30 dB) and to the local maxima, (c) responses at the initial design as well as the modified design specifications. The modification accounts for the discrepancy between the models so that optimizing \(\boldsymbol{R}_{cd}\) with respect to the modified specifications approximately corresponds to optimizing \(\boldsymbol{R}_{f}\) with respect to the original specifications

It should be emphasized again that for the AADS technique there is no surrogate model configured from R cd—the discrepancy between R cd and R f is “absorbed” by the modified design specifications.

Figure 6(b) shows the characteristic points of R f and R cd. The design specifications are modified (mapped) so that the level of satisfying/violating the modified specifications by the R cd response corresponds to the satisfaction/violation levels of the original specifications by the R f response (Figs. 6(b) and (c)). R cd is subsequently optimized with respect to the modified specifications, and the new design obtained this way is treated as an approximated solution to the original design problem. Typically, a substantial design improvement is observed after the first iteration. Additional iterations can bring further improvement.
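The specification mapping can be illustrated with a simplified sketch. It assumes a single upper-bound specification on one frequency band, and it assumes that the characteristic points are already extracted as the two frequencies at which each response crosses the specification level plus the level of the in-band local maximum. The `Spec` class, the dictionary keys, and the numbers are hypothetical; the full technique in [41] handles more general sets of characteristic points.

```python
from dataclasses import dataclass

@dataclass
class Spec:
    f_lo: float    # lower band edge in GHz
    f_hi: float    # upper band edge in GHz
    level: float   # required upper bound on |S11| in dB

def adjust_spec(spec, cp_f, cp_cd):
    """Map the original specification onto the low-fidelity model.

    cp_f and cp_cd hold characteristic points of R_f and R_cd at the current design:
    'f_lo' and 'f_hi' are the frequencies where the response crosses spec.level,
    'peak' is the level of the in-band local maximum (all values are assumptions).
    """
    return Spec(
        f_lo=spec.f_lo + (cp_cd["f_lo"] - cp_f["f_lo"]),     # frequency misalignment, lower edge
        f_hi=spec.f_hi + (cp_cd["f_hi"] - cp_f["f_hi"]),     # frequency misalignment, upper edge
        level=spec.level + (cp_cd["peak"] - cp_f["peak"]),   # level misalignment
    )

# Toy usage with made-up characteristic points.
original = Spec(f_lo=4.0, f_hi=6.0, level=-20.0)
cp_fine = {"f_lo": 4.2, "f_hi": 5.9, "peak": -18.0}
cp_coarse = {"f_lo": 4.3, "f_hi": 6.1, "peak": -19.5}
modified = adjust_spec(original, cp_fine, cp_coarse)
```

In this toy case the coarse response is shifted up in frequency and sits 1.5 dB lower at its local maximum, so the modified specification asks the coarse model for a correspondingly shifted band and a stricter level.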

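The initial design is \(\boldsymbol{x}^{\mathrm{init}} = [6.9\ 6.9\ 1.05\ 1.05\ 6.2\ 6.2\ 2.0\ 2.0\ 6.80]^{T}\). The high- and low-fidelity models are evaluated using CST Microwave Studio (\(\boldsymbol{R}_{f}\): 829,000 mesh cells at \(\boldsymbol{x}^{\mathrm{init}}\), evaluation time 58 min; \(\boldsymbol{R}_{cd}\): 53,000 mesh cells at \(\boldsymbol{x}^{\mathrm{init}}\), evaluation time 2 min).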

The optimized design is found to be \(\boldsymbol{x}^{*} = [5.9\ 5.9\ 1.05\ 1.55\ 7.075\ 7.2\ 4.5\ 1.0\ 8.05]^{T}\). It is obtained with three iterations of the AADS procedure. Significant improvement of the DRA’s bandwidth is observed; the 48% fractional bandwidth at −20 dB is shown in Fig. 7. The far-field response of the optimized DRA, shown in Fig. 8 at selected frequencies, retains the behavior of the TM01δ DRA mode over the 60% bandwidth (at the −10 dB level). The total design cost is equivalent to about 11 evaluations of the high-fidelity DRA model. The design cost budget is listed in Table 2.

Fig. 7 DRA: fine model response at the initial (- - -) and the optimized design (—)

Fig. 8 DRA at the optimal design: gain in the elevation plane at 3.5 GHz (thick —), 4 GHz (thick – –), 4.5 GHz (thick ⋅−⋅), 5 GHz (thin —), 5.5 GHz (thin – –), and 6 GHz (thin ⋅−⋅)

Table 2 DRA: design optimization costs

2.3 Design of UWB Antenna Using the Shape-Preserving Response Prediction Technique

Consider the planar antenna shown in Fig. 9. It consists of a planar dipole as the main radiator element and two additional strips. The design variables are \(\boldsymbol{x} = [l_{0}\ w_{0}\ a_{0}\ l_{p}\ w_{p}\ s_{0}]^{T}\). Other dimensions are fixed as \(a_{1} = 0.5\) mm, \(w_{1} = 0.5\) mm, \(l_{s} = 50\) mm, \(w_{s} = 40\) mm, and \(h = 1.58\) mm. The substrate material is Rogers RT5880.

Fig. 9 UWB dipole antenna geometry: top and side views. The dash-dotted lines show the electric (YOZ) and the magnetic (XOY) symmetry walls. The 50 ohm source impedance is not shown

The high-fidelity model \(\boldsymbol{R}_{f}\) of the antenna structure (10,250,412 mesh cells at the initial design, evaluation time of 44 min) is simulated using the CST MWS transient solver. The design objective is to obtain \(|S_{11}| \le -12\) dB for 3.1 GHz to 10.6 GHz. The initial design is \(\boldsymbol{x}^{\mathrm{init}} = [20\ 10\ 1\ 10\ 8\ 2]^{T}\) mm. The low-fidelity model \(\boldsymbol{R}_{cd}\) is also evaluated in CST but with coarser discretization (108,732 cells at \(\boldsymbol{x}^{\mathrm{init}}\), evaluated in 43 s).

For this example, the shapes of the low- and high-fidelity model response are similar (cf. Fig. 11(a)), which allows us to use the shape-preserving response prediction (SPRP) technique [34] as the optimization engine. SPRP, unlike some other SBO techniques including space mapping, does not use any extractable parameters. As a result SPRP is typically very efficient: in many cases only two or three iterations are sufficient to yield a satisfactory design. SPRP assumes that the change of the high-fidelity model response due to the adjustment of the design variables can be predicted using the actual changes of the low-fidelity model response. Here, this property is ensured by the low-fidelity model being the coarse-mesh simulation of the same structure that represents the high-fidelity model.

The change of the low-fidelity model response can be described by the translation vectors corresponding to what are called the characteristic points of the model’s response. These translation vectors are subsequently used to predict the change of the high-fidelity model response with the actual response of R f at the current iteration point, R f (x (i)), treated as a reference.

Figure 10(a) shows an example low-fidelity model response, |S 11| versus frequency, at the design x (i), as well as the coarse model response at some other design x. Circles denote characteristic points of R c (x (i)), selected here to represent |S 11|=−10 dB, |S 11|=−15 dB, and the local |S 11| minimum. Squares denote corresponding characteristic points for R c (x), while line segments represent the translation vectors (“shift”) of the characteristic points of R c when changing the design variables from x (i) to x.

Fig. 10 SPRP concept. (a) Low-fidelity model response at the design \(\boldsymbol{x}^{(i)}\), \(\boldsymbol{R}_{c}(\boldsymbol{x}^{(i)})\) (solid line), the low-fidelity model response at \(\boldsymbol{x}\), \(\boldsymbol{R}_{c}(\boldsymbol{x})\) (dotted line), characteristic points of \(\boldsymbol{R}_{c}(\boldsymbol{x}^{(i)})\) (circles) and \(\boldsymbol{R}_{c}(\boldsymbol{x})\) (squares), and the translation vectors (short lines). (b) High-fidelity model response at \(\boldsymbol{x}^{(i)}\), \(\boldsymbol{R}_{f}(\boldsymbol{x}^{(i)})\) (solid line) and the predicted high-fidelity model response at \(\boldsymbol{x}\) (dotted line) obtained using SPRP based on characteristic points of (a); characteristic points of \(\boldsymbol{R}_{f}(\boldsymbol{x}^{(i)})\) (circles) and the translation vectors (short lines) were used to find the characteristic points (squares) of the predicted high-fidelity model response. (c) Low-fidelity model responses \(\boldsymbol{R}_{c}(\boldsymbol{x}^{(i)})\) and \(\boldsymbol{R}_{c}(\boldsymbol{x})\) are plotted using thin solid and dotted lines, respectively

The high-fidelity model response at x can be predicted using the same translation vectors applied to the corresponding characteristic points of the high-fidelity model response at x (i), R f (x (i)). This is illustrated in Fig. 10(b). Figure 10(c) shows the predicted high-fidelity model response and the actual high-fidelity model response at x. A rigorous and more detailed formulation of the SPRP technique can be found in [42].
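The translation-vector idea can be illustrated with a short sketch. It covers only the characteristic-point step, assuming that the points have already been extracted as (frequency in GHz, level in dB) pairs listed in matching order; reconstructing the full predicted response between the points is part of the rigorous formulation in [42] and is not attempted here. The function names and numbers are hypothetical.

```python
def translation_vectors(points_ref, points_new):
    """Shifts (df, dlevel) of the low-fidelity characteristic points from x(i) to x."""
    return [(f1 - f0, l1 - l0) for (f0, l0), (f1, l1) in zip(points_ref, points_new)]

def predict_fine_points(fine_points_ref, vectors):
    """Apply the low-fidelity shifts to the high-fidelity characteristic points at x(i)."""
    return [(f + df, lvl + dl) for (f, lvl), (df, dl) in zip(fine_points_ref, vectors)]

# Toy usage: points at |S11| = -10 dB, |S11| = -15 dB, and a local minimum (made-up values).
coarse_at_xi = [(3.2, -10.0), (3.6, -15.0), (4.0, -22.0)]
coarse_at_x = [(3.0, -10.0), (3.5, -15.0), (3.9, -25.0)]
fine_at_xi = [(3.3, -10.0), (3.7, -15.0), (4.1, -20.0)]

shifts = translation_vectors(coarse_at_xi, coarse_at_x)
fine_predicted = predict_fine_points(fine_at_xi, shifts)
```

Note that the prediction needs only one low-fidelity evaluation at the new design; the high-fidelity model is evaluated once per iteration, at the reference design.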

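For this example, the approximate optimum of \(\boldsymbol{R}_{cd}\), \(\boldsymbol{x}^{(0)} = [18.66\ 12.98\ 0.526\ 13.717\ 8.00\ 1.094]^{T}\) mm, is found as the first design step. The computational cost is 127 evaluations of \(\boldsymbol{R}_{cd}\), which corresponds to about two evaluations of \(\boldsymbol{R}_{f}\). Figure 11(a) shows the reflection responses of \(\boldsymbol{R}_{f}\) at both \(\boldsymbol{x}^{\mathrm{init}}\) and \(\boldsymbol{x}^{(0)}\), as well as the response of \(\boldsymbol{R}_{cd}\) at \(\boldsymbol{x}^{(0)}\).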

Fig. 11 UWB dipole antenna reflection response: (a) high-fidelity model response (dashed line) at the initial design \(\boldsymbol{x}^{\mathrm{init}}\), and high- (solid line) and low-fidelity (dotted line) model responses at the approximate low-fidelity model optimum \(\boldsymbol{x}^{(0)}\); (b) high-fidelity model \(|S_{11}|\) at the final design

The final design x (2)=[19.06 12.98 0.426 13.52 6.80 1.094]T mm (|S 11|≤−13.5 dB for 3.1 GHz to 10.6 GHz) is obtained after two iterations of the SPRP-based optimization with the total cost corresponding to about seven evaluations of the high-fidelity model (see Table 3). Figure 11(b) shows the reflection response and Fig. 12 shows the gain response of the final design x (2).

Fig. 12 UWB dipole at the final design: IEEE gain pattern (×-pol.) in the XOY plane at 4 GHz (thick solid line), 6 GHz (dash-dotted line), 8 GHz (dashed line), and 10 GHz (solid line)

Table 3 UWB dipole antenna: optimization cost

2.4 Design of a Planar Antenna Array Using a Combination of Analytical and Coarse-Discretization Electromagnetic Models

The design of two-dimensional antenna arrays requires full-wave simulations, each of which is time-consuming due to the complexity and size of the antenna array under design as well as the electromagnetic (EM) interaction within the structure. Typically many EM simulations are necessary in the design process of a realistic array. Moreover, array design normally involves a large number of design variables, such as dimensions of the array elements, element spacing, location of feeds, excitation amplitudes and/or phases, and dimensions of the substrate and ground [1].

The array model based on the single element radiation response combined with the analytical array factor [43] cannot account for interelement coupling. In addition, this model produces inaccurate radiation responses in the directions off the main beam. Therefore, discrete EM simulations of the entire array are required; however, these simulations are computationally expensive when accurate. Consequently, using numerical optimization techniques to conduct the design process may be prohibitively expensive in terms of the CPU time. The use of coarse discretization for the whole array model can substantially relieve the computational load. However, the responses of such coarse-mesh models are typically noisy and often discontinuous, so that the optimization algorithm needs more objective function calls to find an improvement or it can even fail.

In order to reduce the computational cost of the array optimization process and make it robust, we apply surrogate-based optimization (SBO) [44] where we use an analytical model of the planar array embedding the simulated radiation response of a single array element, a coarse-discretization model of the entire array, and a fine model of the entire array. The design optimization example presented below describes and illustrates this approach.

Consider a planar microstrip array (Fig. 13) comprising 25 identical microstrip patches. The array is to operate at 10 GHz and have a linear polarization. Each patch is fed by a probe in the 50 ohm environment. The design tasks are as follows: to keep the side lobe level below −20 dB for zenith angles off the main beam with a null-to-null width of 34°, i.e., off the sector of [−17°, 17°]; to maintain the peak directivity of the array at about 20 dBi; to have the direction of the maximum radiation perpendicular to the plane of the array; to have returning signals lower than −10 dB, all at 10 GHz. The initial dimensions of the elements, the microstrip patches, are 11 mm by 9 mm; a grounded layer of 1.58 mm thick RT/duroid 5880 is the substrate; the lateral extension of the substrate/metal ground is set to a half of the patch size in a particular direction. The locations of the feeds at the initial design are at the center of the patch in the horizontal direction and 2.9 mm up off the center in the vertical direction, referring to Fig. 13(a). The symmetry of the array EM models is imposed as shown in Fig. 13(a).

Fig. 13 Microstrip antenna array. (a) Front view of the EM models \(\boldsymbol{R}_{f}\) and \(\boldsymbol{R}_{cd}\). The symmetry (magnetic) plane is shown with the vertical dashed line. (b) Analytical model of the planar array embedding the simulated radiation response of a single array element

The use of discrete EM models of the entire array is unavoidable here for several reasons, including the effect of element coupling on the reflection response and the requirement of minor lobe suppression. At the same time, the evaluation time of the high-fidelity model of the array, \(\boldsymbol{R}_{f}\), is around 20 minutes using the CST MWS transient solver, which makes its direct optimization impractical.

Even though we impose a symmetry on the array model and, therefore, restrict ourselves to adjusting distances between array components (\(x_{t1}, x_{t2}, y_{t1}, y_{t2}\)), patch dimensions (\(x_{1}, y_{1}\)), and the amplitudes (\(a_{1}, \ldots, a_{15}\)) and/or phases (\(b_{1}, \ldots, b_{15}\)) of the incident excitation signals, the number of design variables is still large for simulation-based design optimization. Therefore, we consider two design optimization cases: a design with nonuniform amplitude (and uniform phase) excitation where the design variables are \(\boldsymbol{x} = [x_{t1}\ x_{t2}\ y_{t1}\ y_{t2}\ x_{1}\ y_{1}\ a_{1} \ldots a_{15}]^{T}\) and a design with nonuniform phase (uniform amplitude) excitation with \(\boldsymbol{x} = [x_{t1}\ x_{t2}\ y_{t1}\ y_{t2}\ x_{1}\ y_{1}\ b_{1} \ldots b_{15}]^{T}\).

To evaluate the response of the array under design we adopt the following three models for it: a high-fidelity discrete EM model of the entire array, \(\boldsymbol{R}_{f}\); a coarse-discretization EM model of the entire array, \(\boldsymbol{R}_{cd}\), which is essentially a coarse-mesh version of \(\boldsymbol{R}_{f}\) (the evaluation time of \(\boldsymbol{R}_{cd}\) is about 1 min); and an analytical model of the array radiation pattern, \(\boldsymbol{R}_{a}\), outlined in Fig. 13(b), which embeds the simulated radiation response of the single microstrip patch antenna. The use of these models in the SBO procedure is described below.
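To make the analytical ingredient concrete, the following minimal sketch evaluates the textbook array factor of a planar array with arbitrary element positions, amplitudes, and phases, assuming NumPy. It is not the model \(\boldsymbol{R}_{a}\) itself: the simulated single-element pattern that \(\boldsymbol{R}_{a}\) embeds is not reproduced here and would multiply the array factor. The function name, argument layout, and the 5 by 5 uniform example are assumptions.

```python
import numpy as np

def array_factor(theta, phi, positions_mm, amplitudes, phases_deg, freq_ghz=10.0):
    """Complex array factor of a planar array lying in the xy-plane.

    theta is the zenith angle and phi the azimuth angle, both in radians.
    positions_mm is a sequence of (x, y) element coordinates in millimeters.
    """
    wavelength_mm = 299.792458 / freq_ghz          # c = 299.792458 mm*GHz
    k = 2.0 * np.pi / wavelength_mm                # free-space wavenumber in rad/mm
    pos = np.asarray(positions_mm, dtype=float)
    u = np.sin(theta) * np.cos(phi)
    v = np.sin(theta) * np.sin(phi)
    phase = k * (pos[:, 0] * u + pos[:, 1] * v) + np.deg2rad(np.asarray(phases_deg, dtype=float))
    return np.sum(np.asarray(amplitudes, dtype=float) * np.exp(1j * phase))

# Toy usage: uniform 5 x 5 array with 20 mm spacing, unit amplitudes, zero phases.
positions = [(i * 20.0, j * 20.0) for i in range(5) for j in range(5)]
af_broadside = array_factor(0.0, 0.0, positions, np.ones(25), np.zeros(25))
af_off_axis = array_factor(0.3, 0.0, positions, np.ones(25), np.zeros(25))
```

At broadside all contributions add in phase, so the magnitude equals the sum of the amplitudes; such a closed-form evaluation is what makes optimizing the analytical model essentially free compared to an EM simulation.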

Due to the high computational cost of evaluating the array, the design process exploits the SBO approach [45], where direct optimization of the array pattern is replaced by iterative correction and adjustment of the auxiliary models \(\boldsymbol{R}_{a}\) and \(\boldsymbol{R}_{cd}\) described above.

The design procedure consists of the following two major stages:

Stage 1 (pattern optimization): In this stage, the design variables \(\boldsymbol{x}\) are optimized in order to reduce the side lobe level according to the specifications. Starting from the initial design \(\boldsymbol{x}^{(0)}\), the first approximation \(\boldsymbol{x}^{(1)}\) of the optimum design is obtained by optimizing the analytical model \(\boldsymbol{R}_{a}\). Further approximations \(\boldsymbol{x}^{(i)}\), \(i = 2, 3, \ldots\), are obtained as \(\boldsymbol{x}^{(i)} = \arg\min\{\boldsymbol{x}: \boldsymbol{R}_{a}(\boldsymbol{x}) + [\boldsymbol{R}_{f}(\boldsymbol{x}^{(i-1)}) - \boldsymbol{R}_{a}(\boldsymbol{x}^{(i-1)})]\}\), i.e., by optimizing the analytical model \(\boldsymbol{R}_{a}\) corrected using output space mapping [46] so that it matches the high-fidelity model exactly at the previous design \(\boldsymbol{x}^{(i-1)}\). In practice, only two iterations are usually necessary to yield a satisfactory design. Note that each iteration of the above procedure requires only one evaluation of the high-fidelity model \(\boldsymbol{R}_{f}\).

Stage 2 (reflection adjustment): In this stage, the coarse-discretization model R cd is used to correct the reflection of the array. Although we use the term “reflection response” and |S k | referring to returning signals at the feed points (ports), these signals include the effect of coupling due to simultaneous excitation of the elements.

In practice, after optimizing the pattern, the reflection responses are slightly shifted in frequency so that the minima of |S k | are not exactly at the required frequency (here, 10 GHz). The reflection responses can be shifted in frequency by adjusting the size of the patches, y 1 here. In order to find the appropriate change of y 1 we use the coarse-discretization model R cd. Because both R f and R cd are evaluated using the same EM solver, we assume that the frequency shift of reflection responses is similar for both models under the same change of the variable y 1, even though the responses themselves are not identical for R f and R cd (in particular, they are shifted in frequency and the minimum levels of |S k | are typically different). By performing perturbation of y 1 using R cd, one can estimate the change of y 1 in R f , necessary to obtain the required frequency shift of its reflection responses. This change would normally be very small so that it would not affect the array pattern in a substantial way. The computational cost of reflection adjustment using the method described here is only one evaluation of the high-fidelity model and one evaluation of the coarse-discretization model R cd.

In the case of severe mismatch, the feed offsets d n can also be used to adjust reflection; however, it was not necessary in the design cases considered in this work.
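The two stages described above can be sketched on a toy problem. Everything in this block is an assumption made for illustration: the two-component responses `R_a` and `R_f` are simple analytic stand-ins for the analytical and high-fidelity models, the objective `U` is a minimax function, the surrogate is optimized by a plain grid search over one variable, and the frequencies passed to `required_size_change` are made-up numbers.

```python
def U(response):
    """Minimax objective: the worst (largest) response component."""
    return max(response)

def R_a(x):
    """Cheap analytical model (toy stand-in)."""
    return [(x - 1.0) ** 2, (x + 1.0) ** 2]

def R_f(x):
    """'Expensive' high-fidelity model (toy stand-in, slightly misaligned with R_a)."""
    return [(x - 1.1) ** 2 + 0.3, (x + 1.0) ** 2 - 0.2]

def argmin_on_grid(objective, lo=-2.0, hi=2.0, n=4001):
    """Brute-force 1-D minimizer; adequate because the surrogate is cheap."""
    best_x, best_val = lo, objective(lo)
    for i in range(1, n):
        x = lo + (hi - lo) * i / (n - 1)
        val = objective(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x

def stage1_pattern_optimization(n_iter=3):
    """Stage 1: optimize R_a, then R_a corrected by output space mapping."""
    x = argmin_on_grid(lambda t: U(R_a(t)))          # x(1): uncorrected R_a
    fine_evals = 0
    for _ in range(n_iter):
        rf, ra = R_f(x), R_a(x)                       # one R_f evaluation per iteration
        fine_evals += 1
        delta = [f - a for f, a in zip(rf, ra)]       # R_f(x(i-1)) - R_a(x(i-1))
        x = argmin_on_grid(lambda t: U([a + d for a, d in zip(R_a(t), delta)]))
    return x, fine_evals

def required_size_change(f_cd_nominal, f_cd_perturbed, dy_perturb, f_f_current, f_target):
    """Stage 2: size change estimated from the frequency sensitivity of R_cd."""
    sensitivity = (f_cd_perturbed - f_cd_nominal) / dy_perturb   # GHz per mm
    return (f_target - f_f_current) / sensitivity

x_opt, n_fine = stage1_pattern_optimization(3)
dy1 = required_size_change(10.30, 10.20, 0.1, 10.08, 10.0)
```

The additive correction changes the optimum only because the objective is a minimax over several response components; for a single scalar response a constant offset would leave the minimizer unchanged. In the Stage 2 helper, the sensitivity is taken from the coarse model alone, reflecting the assumption stated above that both models shift similarly in frequency for the same change of the patch size.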

A starting point for the optimization procedure is chosen to be a uniform array, and the spacings \(x_{t1}\), \(x_{t2}\), \(y_{t1}\), and \(y_{t2}\) are easily found using model \(\boldsymbol{R}_{a}\) assuming \(x_{t1} = x_{t2} = y_{t1} = y_{t2}\). The radiation response of the array at this design \(\boldsymbol{x}^{(0)}\) is shown in Fig. 14. Here, \(\boldsymbol{x}^{(0)} = [x_{t1}\ x_{t2}\ y_{t1}\ y_{t2}\ x_{1}\ y_{1}\ a_{1} \ldots a_{15}\ b_{1} \ldots b_{15}]^{T} = [20\ 20\ 20\ 20\ 11\ 9\ 1 \ldots 1\ 0 \ldots 0]^{T}\), where the dimensional parameters are in millimeters, the excitation amplitudes are normalized, and the phase shifts are in degrees. The side lobe level of this design \(\boldsymbol{x}^{(0)}\) is about −13 dB and the peak directivity of \(\boldsymbol{x}^{(0)}\) is 21.4 dBi. The feed offset, \(d_{n}\), shown in Fig. 13(a), is 2.9 mm for all patches.

Fig. 14 Microstrip antenna array of Fig. 13 at the initial (uniform) design \(\boldsymbol{x}^{(0)}\), directivity pattern cuts at 10 GHz: (a) H-plane; (b) E-plane. EM model \(\boldsymbol{R}_{f}\) (solid lines) and model \(\boldsymbol{R}_{a}\) (dash-dotted lines)

Design optimization with nonuniform amplitude excitation. Following the two-stage procedure described above, design optimization has been carried out with the incident excitation amplitudes as design variables. The cost of stage 1, directivity pattern optimization, is only three evaluations of R f (the cost of optimizing the analytical R a can be neglected). At stage 2, matching, we change the y-size of the patches (the global parameter y 1) to 9.05 mm in order to shift the reflection responses down in frequency. The cost of this step is 1×R cd+1×R f .

The final design is found at x =[23.56 24.56 23.65 24.42 11.00 9.05 0.952 0.476 0.0982 0.982 0.946 0.525 1.000 0.973 0.932 0.994 0.936 0.529 0.858 0.594 0.0275]T. All excitation amplitudes are normalized to the maximum, which is the amplitude of the seventh element located at the array center. The radiation response (directivity pattern cuts) and reflection response of the final design are shown in Fig. 15. The side lobe level of this design x is under −20 dB and its peak directivity is 21.8 dBi. The total cost of optimization is 1×R cd+4×R f , that is, about 4×R f .

Fig. 15

Microstrip antenna array of Fig. 13 at the final design with nonuniform amplitude excitation: (a) directivity pattern cuts in the E- and H-planes at 10 GHz; (b) reflection responses of the array at the patch feeds

Design optimization with nonuniform phase excitation. Another optimization case has been considered with the excitation phase shifts as design variables. The cost of stage 1, directivity pattern optimization, is again 3×R f , and the cost of stage 2 is 1×R cd+1×R f . The final design is found at x =[23.85 25.00 23.72 24.56 11.00 9.01 0.0 −21.96 123.01 7.09 −13.15 79.58 41.53 37.33 0.51 24.75 65.94 −15.19 59.16 69.38 −67.62]T, where the phase shifts are in degrees and given relative to the first excitation element, which is shown in Fig. 13 and corresponds to the 0.0 entry in the vector x . The radiation response (directivity pattern cuts) and reflection response of the final design are shown in Fig. 16. The side lobe level of this design x is about −19 dB and its peak directivity is 19.2 dBi. The total cost of optimization for this case is the same as in the previous example, i.e., around four evaluations of the high-fidelity model.

Fig. 16

Microstrip antenna array of Fig. 13 at the final design with nonuniform phase excitation: (a) directivity pattern cuts in the E- and H-planes at 10 GHz; (b) reflection responses of the array at the patch feeds

2.5 SBO Techniques for Antenna Design: Discussions and Recommendations

The SBO techniques presented in this section have proven to be computationally efficient for the design of different types of antennas. The typical computational cost of the design process expressed in terms of the number of equivalent high-fidelity model evaluations is comparable to the number of design variables, as demonstrated through examples. Here, we attempt to qualitatively compare these methods and give some recommendations for the readers interested in using them in their research and design work.

The multi-fidelity approach applied in Sect. 2.1 is one of the most robust techniques, yet it is simple to implement. The only drawback is that it requires at least two low-fidelity models of different discretization density and some initial study of the model accuracy versus computational complexity. While the multi-fidelity technique will work with practically any setup, careful selection of the mesh density can reduce the computational cost of the optimization process considerably. More implementation details and application examples of this technique can be found in [38] and [47].

Among the considered methods, the AADS technique of Sect. 2.2 is the simplest to implement, as it does not require any explicit correction of the low-fidelity model. Therefore, AADS can even be executed within any EM solver by modifying the design requirements and using its built-in optimization capabilities. On the other hand, AADS only works with minimax-like design specifications. Also, AADS requires the low-fidelity model to be relatively accurate so that the possible discrepancies between the low- and high-fidelity models can be accounted for by design specification adjustment. More implementation details and application examples of this technique to antenna design can be found in [48] and [49].

The SPRP technique of Sect. 2.3 does not use any extractable parameters. It assumes that the change of the high-fidelity model response due to the adjustment of the design variables can be predicted using the actual changes of the low-fidelity model response. SPRP is typically very efficient: in many cases only two or three iterations are sufficient to yield a satisfactory design [42].

Space mapping, discussed in Sect. 2.1 (at the last step of the variable-fidelity technique) and in Sect. 2.3, is a very generic method used to correct the low-fidelity model. In particular, it is able to work even if the low-fidelity model is rather inaccurate. On the other hand, space mapping requires some experience in selecting the proper type of surrogate model. More implementation details and application examples of this technique to antenna design can be found in [50] and [51] as well as in chapter Space Mapping for Electromagnetic-Simulation-Driven Design Optimization of this book.

As already mentioned, the low-fidelity model accuracy may be critical for the performance of the SBO algorithms. Using finer, i.e., more expensive but also more accurate, models generally reduces the number of SBO iterations necessary to find a satisfactory design; however, each SBO iteration turns out to be more time-consuming. For coarser models, the cost of an SBO iteration is lower but the number of iterations may be larger, and for models that are too coarse, the SBO process may simply fail. The proper selection of the low-fidelity model “coarseness” may not be obvious beforehand. In most cases, it is recommended to use finer models rather than coarser ones to ensure good algorithm performance, even at the cost of some extra computational overhead.

The problem discussed in the previous paragraph can be considered in the wider context of model management, thus it may be beneficial to change the low-fidelity model coarseness during the SBO algorithm run. Typically, one starts from the coarser model in order to find an approximate location of the optimum design and switches to the finer model to increase the accuracy of the local search process without compromising the computational efficiency, e.g., as with the multi-fidelity technique of Sect. 2.1. Proper management of the model fidelity may result in further reduction of the design cost. The next section addresses this problem.

3 Model Fidelity Management for Cost-Efficient Surrogate-Based Design Optimization of Antennas

A proper choice of the surrogate model fidelity is a key factor that influences both the performance of the design optimization process and its computational cost. Here, we focus on a problem of proper surrogate model management. More specifically, we present a numerical study that aims for a trade-off between the design cost and reliability of the SBO algorithms. Our considerations are illustrated using several antenna design cases. Furthermore, we demonstrate that the use of multiple models of different fidelity may be beneficial for reducing the design cost while maintaining the robustness of the optimization process. Recommendations regarding the selection of the surrogate model coarseness are also given.

3.1 Coarse-Discretization Electromagnetic Simulations as Low-Fidelity Antenna Models

The only universal way of creating physics-based low-fidelity antenna models is through coarse-discretization EM simulations. This is particularly the case for wideband and ultra-wideband (UWB) antennas [52], as well as dielectric resonator antennas (DRAs) [53], to name just a few. Here, we assume that the low-fidelity model R c is evaluated with the same EM solver as the high-fidelity model. The low-fidelity model can be created by reducing the mesh density compared to the high-fidelity one, as illustrated in Fig. 17. Other options for simplifying the low-fidelity model may include:

  • Using a smaller computational domain with the finite-volume methods;

  • Using low-order basis functions, e.g., with the moment method;

  • Applying simple absorbing boundaries;

  • Applying discrete sources rather than full-wave ports;

  • Modeling metals with a perfect electric conductor;

  • Neglecting the metallization thickness of traces, strips, and patches;

  • Ignoring dielectric losses and dispersion.

Fig. 17

A microstrip antenna [35]: (a) a high-fidelity EM model with a fine tetrahedral mesh, and (b) a low-fidelity EM model with a coarse tetrahedral mesh

Because of the possible simplifications, the low-fidelity model R c is typically 10 to 50 times faster than R f but not as accurate. Therefore, it cannot substitute for the high-fidelity model in design optimization. Obviously, making the low-fidelity model mesh coarser (and, perhaps, introducing other simplifications) results in a loss of accuracy but also in a shorter computational time. Figure 18 shows the plots illustrating the high- and low-fidelity model responses at a specific design for the antenna structure in Fig. 17, as well as the relationship between the mesh coarseness and the simulation time.

Fig. 18

An antenna of Fig. 17 evaluated with the CST MWS transient solver [7] at a selected design: (a) the reflection response with different discretization densities, 19,866 cells (■ ■ ■), 40,068 cells (⋅ — ⋅), 266,396 cells (– –), 413,946 cells (  ), 740,740 cells (—), and 1,588,608 cells (); and (b) the antenna evaluation time versus the number of mesh cells

In Fig. 18, one can observe that the two “finest” coarse-discretization models (with ∼400,000 and ∼740,000 cells) properly represent the high-fidelity model response (shown as a thick solid line). The model with ∼270,000 cells can be considered as borderline. The two remaining models can be considered as too coarse, particularly the one with ∼20,000 cells; its response substantially deviates from that of the high-fidelity model.

We stress that, at the present stage of research, visual inspection of the model responses and the relationship between the high- and low-fidelity models is an important step in the model selection process. In particular, it is essential that the low-fidelity model capture all important features present in the high-fidelity one.

3.2 Selecting Model Fidelity: Design of Microstrip Antenna Using Frequency Scaling

We consider an antenna design case with the optimized designs found using an SBO algorithm of the following type. A generic SBO algorithm produces a series of approximate solutions to (1), x (i), i=0,1,…, as follows (x (0) is the initial design) [15]:

$$ \boldsymbol{x}^{(i + 1)} = \arg \min_{\boldsymbol{x}}U\bigl( \boldsymbol{R}_{s}^{(i)}(\boldsymbol{x})\bigr), $$
(3)

where \(\boldsymbol{R}_{s}^{(i)}\) is the surrogate model at iteration i. Typically, the surrogate model is updated after each iteration using the high-fidelity model data accumulated during the optimization process. Normally, the high-fidelity model is referred to rarely, in many cases only once per iteration, at a newly found design vector x (i+1). This, in conjunction with the assumption that the surrogate model is fast, allows us to significantly reduce the computational cost of the design process when compared with direct solving of the original optimization problem.
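The iteration (3) can be illustrated with a deliberately simple toy problem. In the sketch below, the scalar "models", the quadratic objective, and the grid search used for the inner minimization are all hypothetical stand-ins chosen so that the behavior is easy to follow; the surrogate is the low-fidelity model with an additive (output space mapping type) correction, which is only one of several possible choices.

```python
import numpy as np


def fine_model(x):
    """Stand-in for the expensive high-fidelity model R_f (hypothetical toy)."""
    return 1.1 * x + 0.3


def coarse_model(x):
    """Stand-in for the cheap low-fidelity model R_c (hypothetical toy)."""
    return 1.0 * x


def objective(response, target=2.0):
    """Objective U: squared deviation of the response from a target value."""
    return (response - target) ** 2


def sbo(x0, n_iter=6):
    grid = np.linspace(0.0, 4.0, 40001)  # candidate designs for the inner search
    x = x0
    n_fine = 0
    for _ in range(n_iter):
        # One high-fidelity evaluation per iteration, at the current design
        offset = fine_model(x) - coarse_model(x)
        n_fine += 1
        # Surrogate R_s^(i)(x) = R_c(x) + offset, minimized over the grid
        surrogate_cost = objective(coarse_model(grid) + offset)
        x = float(grid[np.argmin(surrogate_cost)])
    return x, n_fine


x_opt, n_fine = sbo(x0=0.0)
print(f"design after {n_fine} fine-model evaluations: {x_opt:.4f}")
```

In this toy case the iterates approach the minimizer of U(R f (x)), which is x = 1.7/1.1, while the high-fidelity model is evaluated only once per iteration; all other evaluations are made on the cheap surrogate.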

Here we use three low-fidelity EM models of different mesh densities. We investigate the performance of the SBO algorithm working with these models in terms of the computational cost and the quality of the final design.

Consider the coax-fed microstrip antenna shown in Fig. 19 [54]. The design variables are x=[a b c d e l 0 a 0 b 0]T. The antenna is on 3.81 mm thick Rogers TMM4 substrate (ε 1=4.5 at 10 GHz); l x =l y =6.75 mm. The ground plane is of infinite extent. The feed probe diameter is 0.8 mm. The connector’s inner conductor is 1.27 mm in diameter. The design specifications are |S 11|≤−10 dB for 5 GHz to 6 GHz. The high-fidelity model R f is evaluated with CST MWS transient solver [7] (704,165 mesh cells, evaluation time 60 min). We consider three coarse models: R c1 (41,496 mesh cells, 1 min), R c2 (96,096 mesh cells, 3 min), and R c3 (180,480 mesh cells, 6 min).
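A design specification of this kind is commonly handled through a minimax objective, i.e., the worst-case |S 11| within the band of interest. The sketch below shows one possible formulation; the function names and the sampled response are hypothetical and do not come from the simulated antenna.

```python
import numpy as np


def minimax_objective(freq_ghz, s11_db, f_lo=5.0, f_hi=6.0):
    """Return the worst-case |S11| (dB) within the band [f_lo, f_hi]."""
    freq_ghz = np.asarray(freq_ghz, dtype=float)
    s11_db = np.asarray(s11_db, dtype=float)
    in_band = (freq_ghz >= f_lo) & (freq_ghz <= f_hi)
    if not in_band.any():
        raise ValueError("no frequency samples inside the band")
    return float(s11_db[in_band].max())


def meets_spec(freq_ghz, s11_db, level_db=-10.0, f_lo=5.0, f_hi=6.0):
    """True if |S11| stays at or below level_db over the whole band."""
    return minimax_objective(freq_ghz, s11_db, f_lo, f_hi) <= level_db


# Hypothetical sampled reflection response (GHz, dB)
freq = np.array([4.5, 5.0, 5.5, 6.0, 6.5])
s11 = np.array([-4.0, -11.2, -18.5, -10.4, -3.1])
print(minimax_objective(freq, s11), meets_spec(freq, s11))
```

Minimizing this objective pushes down the highest in-band reflection level, and a value at or below −10 dB indicates that the specification is satisfied.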

Fig. 19

Coax-fed microstrip antenna [54]: (a) 3D view, (b) top view

The initial design is x (0)=[6 12 15 1 1 1 1 −4]T mm. Figure 20(a) shows the responses of all the models at the approximate optimum of R c1. The major misalignment between the responses is due to the frequency shift, so the surrogate is created here using frequency scaling as well as output space mapping [15] and [16]. The results, summarized in Table 4, indicate that the model R c1 is too inaccurate and the SBO design process using it fails to find a satisfactory design. The designs found with models R c2 and R c3 satisfy the specifications, and the cost of the SBO process using R c2 is slightly lower than that using R c3.

Fig. 20

Coax-fed microstrip antenna: (a) model responses at the initial design, R c1 (⋅⋅⋅), R c2 (⋅−⋅−), R c3 (- - -), and R f (—); (b) high-fidelity model response at the final design found using the low-fidelity model R c3

Table 4 Coax-fed microstrip antenna: design results

3.3 Coarse Model Management: Design of a Hybrid DRA

In this section, we again consider the use of low-fidelity models of various mesh densities for surrogate-based design optimization of the dielectric resonator antenna. We also investigate the potential benefits of using two models of different fidelity within a single optimization run.

Consider the hybrid DRA shown in Fig. 21. The DRA is fed by a 50 ohm microstrip terminated with an open-ended section. The microstrip substrate is 0.787 mm thick Rogers RT5880. The design variables are x=[h 0 r 1 h 1 u l 1 r 2]T. Other dimensions are fixed: r 0=0.635, h 2=2, d=1, r 3=6, all in millimeters. The permittivity of the DRA core is 36, and the loss tangent is 10−4, both at 10 GHz. The DRA support material is Teflon (ε 2=2.1), and the radome is of polycarbonate (ε 3=2.7 and tanδ=0.01). The radius of the ground plane opening, shown in Fig. 21(b), is 2 mm.

Fig. 21

Hybrid DRA: (a) 3D-cut view and (b) side view

The high-fidelity antenna model R f (x) is evaluated using the time-domain solver of CST Microwave Studio [7] (∼1,400,000 mesh cells, evaluation time 60 min). The goal is to adjust the geometry parameters so that the following specifications are met: |S 11|≤−12 dB for 5.15 GHz to 5.8 GHz. The initial design is x (0)=[7.0 7.0 5.0 2.0 2.0 2.0]T mm. We consider two auxiliary models of different fidelity, R c1 (∼45,000 mesh cells, evaluation time 1 min), and R c2 (∼300,000 mesh cells, evaluation time 3 min). We investigate the algorithm (2) using either one of these models or both (R c1 at the initial stage and R c2 in the later stages). The surrogate model is constructed using both output space mapping and frequency scaling [15] and [16]. Figure 22(a) justifies the use of frequency scaling, which, due to the shape similarity of the high- and low-fidelity model responses, allows substantial reduction of the misalignment between them.
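The idea of frequency scaling can be sketched as follows, assuming an affine scaling in which the low-fidelity response is read at the frequencies f 0+f 1·ω and the two coefficients are found by a least-squares fit to the high-fidelity response. The Lorentzian-shaped toy responses, the parameter grids, and all names below are assumptions made for illustration; the output space mapping term is omitted.

```python
import numpy as np


def dip(freq, f_res, width=0.15):
    """Toy reflection-like response (linear scale) with one resonance at f_res."""
    return 1.0 - 0.9 / (1.0 + ((freq - f_res) / width) ** 2)


freq = np.linspace(4.5, 6.5, 201)      # frequencies of interest (GHz)
freq_c = np.linspace(4.0, 7.0, 301)    # wider sweep of the coarse model
r_fine = dip(freq, 5.5)                # hypothetical high-fidelity response
r_coarse = dip(freq_c, 5.3)            # hypothetical low-fidelity response


def scaled_coarse(f0, f1):
    """Coarse response evaluated at the scaled frequencies f0 + f1 * freq."""
    return np.interp(f0 + f1 * freq, freq_c, r_coarse)


# Exhaustive search for the scaling coefficients that minimize the misalignment
best = None
for f0 in np.linspace(-0.5, 0.5, 101):
    for f1 in np.linspace(0.9, 1.1, 21):
        err = float(np.linalg.norm(r_fine - scaled_coarse(f0, f1)))
        if best is None or err < best[0]:
            best = (err, float(f0), float(f1))

print(f"misalignment {best[0]:.2e} at f0 = {best[1]:.2f} GHz, f1 = {best[2]:.2f}")
```

Because the two toy responses differ only by a 0.2 GHz shift, the search recovers f 0 ≈ −0.2 GHz and f 1 ≈ 1 and removes the misalignment almost completely; for simulated responses a residual error would remain and would be handled by the output space mapping correction.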

Fig. 22

Hybrid DRA: (a) high- (—) and low-fidelity model R c2 response at certain design before (⋅⋅⋅ ⋅) and after (- - -) applying the frequency scaling, (b) high-fidelity model response at the initial design (- - -) and at the final design obtained using the SBO algorithm with the low-fidelity model R c2 (—)

The DRA design optimization has been performed three times: (i) with the surrogate constructed using R c1, which is cheaper but less accurate (case 1), (ii) with the surrogate constructed using R c2, which is more expensive but also more accurate (case 2), and (iii) with the surrogate constructed with R c1 at the first iteration and with R c2 for subsequent iterations (case 3). The last option allows us to more quickly locate the approximate high-fidelity model optimum and then refine it using the more accurate model. The number of surrogate model evaluations was limited to 100 in the first iteration (which involves the largest design change) and to 50 in the subsequent iterations (which require smaller design modifications).

Table 5 shows the optimization results for all three cases. Figure 22(b) shows the high-fidelity model response at the final design obtained using the SBO algorithm working with low-fidelity model R c2. The quality of the final designs found in all cases is the same. However, the SBO algorithm using the low-fidelity model R c1 (case 1) requires more iterations than the algorithm using the model R c2 (case 2), because the latter is more accurate. In this particular case, the overall computational cost of the design process is still lower for R c1 than for R c2. On the other hand, the cheapest approach is case 3, in which the model R c1 is utilized in the first iteration, which requires the largest number of EM analyses, and the algorithm switches to R c2 in the second iteration. This reduces both the number of iterations and the number of evaluations of R c2, so the total design cost is the lowest overall.
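The cost comparison can be expressed in equivalent high-fidelity evaluations by weighting each model evaluation with its simulation time. The sketch below uses the evaluation times quoted above (1, 3, and 60 min) and the evaluation limits of 100 and 50 per iteration; the iteration counts and the numbers of R f evaluations are hypothetical and are not the entries of Table 5.

```python
def equivalent_cost(n_c1, n_c2, n_fine, t_c1=1.0, t_c2=3.0, t_fine=60.0):
    """Total design cost expressed in equivalent R_f evaluations.

    Default times (minutes) are the evaluation times quoted for the hybrid DRA.
    """
    return (n_c1 * t_c1 + n_c2 * t_c2) / t_fine + n_fine


# HYPOTHETICAL evaluation counts (not the entries of Table 5): each run uses
# 100 surrogate evaluations in the first iteration and 50 in every later one.
costs = {
    "R_c1 only": equivalent_cost(n_c1=100 + 3 * 50, n_c2=0, n_fine=4),
    "R_c2 only": equivalent_cost(n_c1=0, n_c2=100 + 50, n_fine=2),
    "R_c1 then R_c2": equivalent_cost(n_c1=100, n_c2=50, n_fine=2),
}
for name, cost in costs.items():
    print(f"{name}: {cost:.2f} x R_f")
cheapest = min(costs, key=costs.get)
print("cheapest:", cheapest)
```

With these assumed counts, the mixed strategy is the cheapest because the 100 evaluations of the first iteration are made on the fastest model, while the more accurate model keeps the number of iterations low.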

Table 5 Hybrid DRA design results

3.4 Discussion and Recommendations

The considerations and numerical results presented above allow us to draw some conclusions regarding the selection of model fidelity for surrogate-based antenna optimization. Using the cheaper (and less accurate) model may translate into a lower design cost; however, it also increases the risk of failure. Using the higher-fidelity model may increase the cost, but it improves the robustness of the SBO design process and reduces the number of iterations necessary to find a satisfactory design. Visual inspection of the low- and high-fidelity model responses remains, so far, the most important way of assessing the model quality, and it may also give a hint as to which type of model correction should be applied while creating the surrogate.

We can formulate the following rules of thumb and “heuristic” model selection procedure:

  (i) An initial parametric study of the low-fidelity model should be performed at the initial design in order to find the “coarsest” model that still adequately represents all the important features of the high-fidelity model response. The assessment should be done by visual inspection of the model responses, keeping in mind that the critical factor is not the absolute model discrepancy but the similarity of the response shape (e.g., even a relatively large frequency shift can be easily reduced by a proper frequency scaling).

  (ii) When in doubt, it is safer to use a slightly finer low-fidelity model rather than a coarser one so that the potential cost reduction is not lost due to a possible algorithm failure to find a satisfactory design.

  (iii) The type of misalignment between the low- and high-fidelity models should be observed in order to properly select the type of low-fidelity model correction while constructing the surrogate. The two methods considered here (additive response correction and frequency scaling) can be viewed as safe choices for most situations.
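The rules above rely on visual inspection. As a rough numerical aid, not a replacement for that inspection and not a procedure used in this chapter, one could score the shape similarity of the responses after allowing for a frequency shift and pick the coarsest model that passes a threshold. In the sketch below, the correlation measure, the 0.95 threshold, the toy responses, and all names are assumptions.

```python
import numpy as np


def dip(freq, f_res, width=0.15):
    """Toy reflection-like response (linear scale) with one resonance at f_res."""
    return 1.0 - 0.9 / (1.0 + ((freq - f_res) / width) ** 2)


def shape_similarity(r_ref, r_test, max_shift):
    """Largest Pearson correlation over integer sample shifts of r_test."""
    best = -1.0
    for s in range(-max_shift, max_shift + 1):
        if s >= 0:
            a, b = r_ref[s:], r_test[:len(r_test) - s]
        else:
            a, b = r_ref[:s], r_test[-s:]
        best = max(best, float(np.corrcoef(a, b)[0, 1]))
    return best


def coarsest_adequate(r_ref, candidates, threshold=0.95, max_shift=40):
    """Return the first (coarsest) candidate whose shape matches the reference."""
    for name, response in candidates:
        if shape_similarity(r_ref, response, max_shift) >= threshold:
            return name
    return None


freq = np.linspace(4.5, 6.5, 201)
r_fine = dip(freq, 5.5)  # hypothetical high-fidelity response
candidates = [           # ordered from coarsest to finest, all hypothetical
    ("R_c1", 0.9 - 0.05 * (freq - 4.5)),  # resonance not resolved at all
    ("R_c2", dip(freq, 5.3)),             # correct shape, shifted by 0.2 GHz
    ("R_c3", dip(freq, 5.45)),            # correct shape, small shift
]
chosen = coarsest_adequate(r_fine, candidates)
print("coarsest adequate model:", chosen)
```

In this toy setting the model that misses the resonance is rejected, whereas the shifted but correctly shaped response is accepted, in line with rule (i): a frequency shift alone can be removed by frequency scaling, while a missing feature cannot.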

We emphasize that, for some antennas, such as some narrowband antennas or wideband traveling wave antennas, it is possible to obtain quite a good ratio between the simulation times of the high- and low-fidelity models (e.g., up to 50), because even for a relatively coarse mesh, the low-fidelity model may still be a good representation of the high-fidelity one. For some structures (e.g., multi-resonant antennas), only much lower ratios (e.g., 5 to 10) may be possible, which would translate into lower design cost savings while using the SBO techniques.

4 Conclusion

Surrogate-based techniques for simulation-driven antenna design have been discussed, and it was demonstrated that optimized designs can be found at a low computational cost corresponding to a few high-fidelity EM simulations of the antenna structure. We also discussed an important trade-off between the computational complexity and accuracy of the low-fidelity EM antenna models and their effects on the performance of the surrogate-based optimization process. Recommendations regarding low-fidelity model selection were also formulated. We have demonstrated that by proper management of the models involved in the design process one can lower the overall optimization cost without compromising the final design quality. Further progress of the considered SBO techniques can be expected with their full automation, combination, and hybridization with adjoint sensitivities, as well as with metaheuristic algorithms.