1 Introduction

In this contribution, we discuss a novel search strategy to find certain solutions of a nonlinear propagation equation for which an analytic expression may not exist. We specifically address the propagation of light pulses in optical fiber, which is governed by the nonlinear Schrödinger equation (NLSE). The same equation also describes surface waves in deep water [1] and (under the name of Gross–Pitaevskii equation) motion in a Bose–Einstein condensate [2].

The NLSE is integrable and does have analytically known solutions: Solitons [3] are pulse-like (localized) solutions which persist without change of shape [4–6]. After the suggestion [7] that they are useful for signalling pulses in telecommunication via optical fiber, and after a first experimental demonstration [8], they were eventually implemented (with some modification) in commercial systems around the turn of the century. Superpositions of several solitons are known as higher-order solitons [9] and find uses for pulse compression [10]. Higher-dimensional generalizations of the nonlinear Schrödinger equation are used to describe spatial solitons (beams of light with self-stabilized cross section) [11] or light bullets (solutions localized in all three spatial dimensions) [12].

However, the NLSE provides only an approximate modeling; correction terms for higher-order dispersion, the Raman effect, linear loss, etc. must often be applied. Broadly speaking, these additions to the nonlinear Schrödinger equation can be categorized into Hamiltonian and non-Hamiltonian perturbations; for the latter, the equation is no longer integrable [13]. The added realism of the modeling comes at the expense that only approximate solutions can be known; exact solutions may not even exist.

Moreover, fibers deployed today are of a special type which is called dispersion-managed [5]. These fibers consist of periodically alternating fiber segments with different dispersion coefficients and were introduced for better control over dispersive effects. An NLSE with periodically modulated parameters is mathematically challenging, and no exact solutions are known. While it is not obvious from the outset that solitons would even exist in such fibers, it was shown [17, 18] that suitable light pulses stabilize their shape due to the nonlinearity; by this token, they are commonly called dispersion-managed (DM) solitons. They are pulses with a shape that breathes within each dispersion period but returns to its initial form periodically; this is referred to as ‘stable in the stroboscopic sense’. For want of analytic tools, one generally has to resort to approximations and numerical techniques to determine their shapes [19–21].

It has been demonstrated that beyond DM solitons, bound states of DM solitons exist, which are known as soliton molecules. They can consist of two [22] or, as demonstrated very recently, three [23, 24] pulses; see also theoretical results in [25–29]. The binding mechanism [30] favors a certain equilibrium separation (mutual distance) between the pulses; this is reminiscent of the bond between atoms in molecules, hence the name.

It has been argued that use of soliton molecules may make increased data rates in optical telecommunication possible. Before a concept enters the world of technical realities, however, it must be characterized and tested very carefully. In the absence of analytic solutions, an important first step is to determine the shape of soliton molecules with reasonable precision. This is what our strategy accomplishes: The idea is to compare shapes at fiber input and output, then to adjust the input shape by an optimization scheme until both shapes agree with each other. Here, we discuss the performance of two optimization schemes.

Our proposed technique is designed to be executed as an actual laboratory experiment. Therefore, even in this numerical study, we intentionally refrain from using information for the optimization which is inaccessible in the experiment. As it is available in the computation, we do use it for independent verification of results in this report, however.

2 Stating the problem, and methods

Nijhof’s much-used iteration procedure [31] is designed to find the shape of DM solitons. One makes an initial guess of the pulse shape. A pulse thus prepared is launched and propagates. The procedure requires access to the full propagation information which is not available in experiments. Typically, the pulse shape will oscillate during propagation in all its parameters. The physical background is that in addition to the DM soliton, radiative background is generated and produces a beating with the soliton [32, 33]. As the radiation disperses away, the beat amplitude diminishes. In this method, one identifies subsequent minima and maxima of the oscillation of a single pulse parameter, e.g., the peak power. An interpolation between them is used to derive an improved guess, with which the iteration is started over. In this particular context, the method works well: it speeds up convergence to the pure DM soliton considerably.

For our purpose, however, Nijhof’s method is not applicable because we need to work with information obtained at fiber input and output alone, but not at intermediate positions. Moreover, we deal with a situation in which the pulse shape is defined by many parameters; the shape of soliton molecules is defined by an a priori unknown but potentially very large number of degrees of freedom. It is not immediately clear whether the interpolation procedure can be suitably generalized so that convergence is assured.

The most versatile way of shaping pulses is by using a spatial light modulator [34]. Pulse shapers employing such a device can control amplitude and phase in the Fourier domain and provide flexibility only limited by the dynamic range of the individual pixels and their total number. The pixel count also equals the number of controllable degrees of freedom and sets an upper limit to the number of controllable pulse shape parameters.

In our method, we initially guess the desired shape—in an experiment one would prepare it with a spatial light modulator—and let it propagate over a certain fiber length. Then, we compare the pulse shape at fiber input and output; any deviation between the two will be expressed as an error signal. Obviously, we seek situations in which pulse shapes at fiber input and output are the same, i.e., the error signal vanishes. Solutions to the propagation equation that have a stationary shape (in the stroboscopic sense) will fulfill this condition. We will discuss in Sect. 3 below how to quantify the error, and how to translate that into a suitable modification of the modulator pixel settings so that the error is reduced.

There are several challenges to this approach which we now illustrate. Consider the simplest case, the standard soliton of the nonlinear Schrödinger equation (NLSE)

$$i \frac{\partial A}{\partial z} + \frac{\beta_2}{2} \frac{\partial^2 A}{\partial t^2} + \gamma |A|^2 A = 0.$$
(1)

Here, z and t are position and time, and \(\beta_2\) and \(\gamma\) are the coefficients of group velocity dispersion and Kerr nonlinearity, respectively. The fundamental soliton takes the form (except for irrelevant constants)

$$A(z,t) = \sqrt{P_0} \hbox{ sech}\left( \frac{t}{T_0} \right) \exp\left(\frac{i}{2}\gamma P_0 z\right)$$
(2)

where \(P_0\) is the peak power and \(T_0\) the width. The soliton is subject to the constraint

$$P_0 T_0^2 = \frac{|\beta_2|}{\gamma}$$
(3)

indicating that any combination of peak power and pulse duration can constitute a soliton as long as it yields the constant set by the fiber parameters. Therefore, the solution is not a single point in the (\(P_0\), \(T_0\)) parameter space but a curve: the error landscape locally resembles a canyon-like valley. In this situation, the optimum is not unique.
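
The valley defined by Eq. (3) can be traced with a few lines of code. The following minimal Python sketch assumes that the reference point quoted in the caption of Fig. 1 (302 fs, 30.0 W) lies on the valley; the function and constant names are illustrative and not part of the original work.

```python
# Reference point on the soliton valley, taken from the caption of Fig. 1.
T0_REF_FS = 302.0
P0_REF_W = 30.0

# Eq. (3): P0 * T0**2 is a constant set by the fiber (here in W fs^2).
VALLEY_CONSTANT = P0_REF_W * T0_REF_FS ** 2


def valley_peak_power(t0_fs):
    """Peak power (W) that satisfies Eq. (3) for a given width (fs)."""
    return VALLEY_CONSTANT / t0_fs ** 2


def on_valley(p0_w, t0_fs, rel_tol=1e-6):
    """True if the pair (P0, T0) fulfills Eq. (3) within a relative tolerance."""
    return abs(p0_w * t0_fs ** 2 - VALLEY_CONSTANT) <= rel_tol * VALLEY_CONSTANT


for t0 in (150.0, 200.0, 302.0):
    print(f"T0 = {t0:.0f} fs  ->  P0 = {valley_peak_power(t0):.3f} W")
```

Every width in the loop yields a different peak power, yet all pairs are equally valid solitons, which is why the optimum is a curve rather than a point.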

The experiment poses the constraint that signals are available only at the fiber’s input and output, but not in between. Eq. (1) is also solved by certain combinations of solitons known as higher-order solitons or N-solitons [9]. They exhibit a periodically evolving pulse shape which returns to the original shape after a distance \(z_0 = \pi T_0^2/(2\beta_2)\). Whenever the fiber length L happens to coincide with an integer multiple of \(z_0\), the error vanishes, and one has a false positive. This reminds us that equality of pulse shape at fiber input and output can potentially occur due to a variety of circumstances. Zero error does not, therefore, identify a solution: it only indicates a candidate for a useful solution. Among the candidates, we have to distinguish the desired solutions from the undesired ones. The undesired solutions are perfectly valid in a mathematical sense, but do not meet the investigator’s intention. The trivial solution A ≡ 0 of Eq. (1) is another example. The obvious countermeasure is to define, along with the initial guess of the pulse shape, a bounded volume in parameter space inside which parameter values are considered acceptable. It must be taken such that a desired solution is expected inside. It will turn out that one of the iteration methods described below can leave that parameter box, though.
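
The false-positive condition for higher-order solitons can be checked directly. The sketch below is only an illustration: it uses the magnitude of the dispersion coefficient so that the period comes out positive, and the tolerance and function names are assumptions.

```python
import math


def soliton_period(t0, beta2):
    """Soliton period z0 = pi * T0**2 / (2 * |beta2|), in units consistent with the inputs."""
    return math.pi * t0 ** 2 / (2.0 * abs(beta2))


def is_false_positive_length(length, z0, rel_tol=1e-3):
    """True if the fiber length is (nearly) an integer multiple of z0."""
    ratio = length / z0
    nearest = round(ratio)
    return nearest >= 1 and abs(ratio - nearest) <= rel_tol * nearest


# Normalized units (T0 = 1, |beta2| = 1) for illustration only.
z0 = soliton_period(1.0, 1.0)
for multiple in (1.0, 2.5, 3.0):
    print(multiple, is_false_positive_length(multiple * z0, z0))
```

A fiber length of three soliton periods, as used for Fig. 1, is flagged by this check, whereas a length of 2.5 periods is not.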

A further complication is that we need to deal with dispersion-managed fibers in which dispersion and nonlinearity are stepwise functions of z with a so-called dispersion period of \(L_{\rm map}\). With the substitution \(\beta_2 = \beta_2(z)\) and \(\gamma = \gamma(z)\) in Eq. (1), one obtains a dispersion-managed version of the nonlinear Schrödinger equation (DM-NLSE). As it is not integrable, no exact solutions are known for validation. Often a Gaussian approximation in the form \(A(0,t) = \sqrt{P_0} \exp{[-t^2/(2T_0^2)]}\) is used to approximate the DM soliton's shape [35] at the end of a dispersion period.

Before we start optimizing the pulse shape, we must decide how many degrees of freedom we want to use. In principle, there are myriad ways to describe a pulse shape, but here an upper bound is the number of modulator pixels. For practical purposes, that is still too large a number. We therefore make simplifying assumptions, typically by assuming that the functional form of the individual peak in a several-peak compound is an unchirped Gaussian. Still, for a structure of m pulses, there are m peak powers, m widths, m − 1 relative phases, and m − 1 separations at the very least. If pulse chirps are also considered, there are m chirp parameters for even linear chirp alone. On the other hand, one may assume symmetries to reduce the number of parameters (e.g. equal peak powers), but this will reduce the accuracy of the end result. To keep a reasonable accuracy, when we proceed to compounds of several pulses below, we will increase the number of variables.

3 Assessing the error

In a DM fiber, pulse shapes evolve all the time and breathe across one dispersion period \(L_{\rm map}\); stationary pulse shapes can only exist in the stroboscopic sense. Therefore, we use a DM fiber which begins and ends with a half-segment of the anomalously dispersive fiber. The task is to assess the difference between shapes at fiber input and output (the ‘error’) in a way that can be easily implemented technically. In experiments, one can use established pulse characterization methods like FROG [36], SPIDER [37], etc. to find the complex pulse shapes \(A_{\rm in}(t)\) and \(A_{\rm out}(t)\), respectively.

An obvious approach to define the error f is to use

$$f = \frac{1}{E_{\rm in}}\int\limits_{-\infty}^{\infty}\left(\left|A_{\rm in}(t)\right|-\left|A_{\rm out}(t)\right|\right)^2 \hbox{d}t,$$
(4)

where the input energy \(E_{\rm in} = \int |A_{\rm in}|^2\,\hbox{d}t\) is used for normalization; this helps to avoid the undesired trivial solution. The amplitudes can be obtained from measured powers by taking the square root.
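
On sampled data, Eq. (4) reduces to two sums. The following sketch assumes a uniform time grid with spacing `dt` and a simple rectangle rule for the integrals; the function name and the test pulses are illustrative.

```python
import math


def shape_error(a_in, a_out, dt):
    """Discrete version of Eq. (4) for amplitudes sampled on a uniform grid."""
    if len(a_in) != len(a_out):
        raise ValueError("input and output must be sampled on the same grid")
    energy_in = sum(abs(a) ** 2 for a in a_in) * dt
    if energy_in == 0.0:
        # The normalization excludes the trivial solution A = 0.
        raise ValueError("input energy is zero")
    diff = sum((abs(x) - abs(y)) ** 2 for x, y in zip(a_in, a_out)) * dt
    return diff / energy_in


# Illustration with Gaussian test pulses (arbitrary units).
dt = 0.05
grid = [(i - 200) * dt for i in range(401)]
pulse = [math.exp(-t ** 2 / 2.0) for t in grid]
broadened = [math.exp(-t ** 2 / (2.0 * 1.5 ** 2)) / math.sqrt(1.5) for t in grid]

print(shape_error(pulse, pulse, dt))      # identical shapes
print(shape_error(pulse, broadened, dt))  # same energy, different shape
```

Because only the moduli enter, a pure phase factor acquired during propagation does not contribute to the error.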

There are other possible ways to define the error. A technically particularly simple way is to use the power spectrum and apply an expression in analogy to Eq. (4). That, however, would fail in a purely linear system. Another variant is to make use of a correlation measurement between \(A_{\rm in}\) and \(A_{\rm out}\) (called a blind FROG spectrogram [36]) and to assess its temporal asymmetry. A plot of f in parameter space constitutes the error landscape.

Examples are shown in Fig. 1. Whichever way the deviation between input and output shape is gauged, the error landscape is similar in its major features and identical in the position of candidates for solutions—even when results may differ in detail. This is shown for a fundamental soliton in Fig. 1a–c, based on a numerical simulation by the standard split-step Fourier method [34] and using fiber parameters as given in Table 1. The ‘soliton valley’, highlighted by a dashed line, is easily recognized; secondary error minima are due to higher-order solitons. These features do not depend on the details of the assessment method; at their positions, the error approaches zero.

Fig. 1

The error landscape for the fundamental soliton (a–c) and a Gaussian DM soliton (d) in the (\(P_0\), \(T_0\)) plane. Three different error definitions are used: a according to Eq. (4); b same for power spectrum; c from the asymmetry of a blind FROG trace; d again according to Eq. (4). Error values are indicated by color code. Soliton parameters for a–c: \(T_0 = 302\) fs, \(P_0 = 30.0\) W; the graph is normalized to these values. Simulations refer to a fiber length of \(z = 3z_0\). Soliton parameters for d: Gaussian with \(T_0 = 178\) fs, \(P_0 = 31.7\) W. The initial parameter box is shown in a and d; see Sect. 5. For symbols inserted in a and d refer to Fig. 4 below

Table 1 Parameters of the NLSE and DM-NLSE system

Figure 1d shows a similar error landscape for a DM soliton; DM fiber parameters correspond to those in the experiment of Refs. [23, 24]. A localized ‘soliton valley’ is still obvious, even when the same scaling as in Eq. (3) cannot be applied directly. However, due to the use of a Gaussian approximation, a perfect match of input and output signals is never expected as the true shape is likely to be somewhat different. Therefore, all error values in the minima are orders of magnitude larger than above. Depending on how close the DM soliton shape gets to a Gaussian, the error at the valley bottom may vary between different (\(P_0\), \(T_0\)) parameter pairs.

4 Optimization methods

In this paper, two optimization strategies are used to numerically find single and multi-soliton solutions.

4.1 Nelder–Mead simplex method

The method as described by Nelder and Mead [38] (called NM below) uses the concept of a simplex. A simplex is a geometrical object consisting of D + 1 points (when the dimension of the parameter space is D ≥ 2) known as vertices and their connecting straight lines, planes, or hyperplanes. The initial choice can be made at random within the initial parameter box. The error is evaluated at each vertex; depending on the result, one of several geometric operations (reflection, contraction, expansion) can be performed to replace the ‘worst’ vertex with an updated choice. After assessing the error for the new point, the iteration is continued. It terminates once the error is reduced below some preset threshold, or after a preset number of steps—whichever occurs first. This method differs from a straightforward gradient search in that it keeps track of the neighborhood of selected points and is therefore less likely to be trapped in small local minima. In the field of optics, this algorithm has been used for various purposes ranging from wavefront correction for optimum image sharpness [39] to optimized representation of the modal profile of a fiber [40].
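
To make the sequence of operations concrete, the following is a minimal sketch of a simplex search in pure Python. It follows the commonly used textbook variant (reflection, expansion, outside and inside contraction, plus a shrink step) with conventional coefficients; these choices, the toy error function and all names are assumptions and not necessarily identical to the implementation used for the results reported here.

```python
import random


def nelder_mead(func, box, f_tol=1e-12, max_evals=2000, seed=0):
    """Minimize func; `box` only seeds the initial simplex and is not enforced later."""
    rng = random.Random(seed)
    dim = len(box)
    # D + 1 random vertices inside the initial parameter box
    simplex = [[rng.uniform(lo, hi) for lo, hi in box] for _ in range(dim + 1)]
    values = [func(v) for v in simplex]
    evals = dim + 1

    def blend(a, b, w):
        """Return the point a + w * (b - a)."""
        return [ai + w * (bi - ai) for ai, bi in zip(a, b)]

    while evals < max_evals:
        order = sorted(range(dim + 1), key=values.__getitem__)
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        if values[0] < f_tol:
            break
        centroid = [sum(v[k] for v in simplex[:-1]) / dim for k in range(dim)]
        worst = simplex[-1]
        reflected = blend(worst, centroid, 2.0)       # mirror worst vertex
        f_r = func(reflected)
        evals += 1
        if f_r < values[0]:
            expanded = blend(worst, centroid, 3.0)    # try to go even further
            f_e = func(expanded)
            evals += 1
            if f_e < f_r:
                simplex[-1], values[-1] = expanded, f_e
            else:
                simplex[-1], values[-1] = reflected, f_r
        elif f_r < values[-2]:
            simplex[-1], values[-1] = reflected, f_r
        else:
            # contract from the better of reflected (outside) and worst (inside)
            base, f_base = (reflected, f_r) if f_r < values[-1] else (worst, values[-1])
            contracted = blend(base, centroid, 0.5)
            f_c = func(contracted)
            evals += 1
            if f_c < f_base:
                simplex[-1], values[-1] = contracted, f_c
            else:
                # shrink every vertex towards the best one
                for i in range(1, dim + 1):
                    simplex[i] = blend(simplex[0], simplex[i], 0.5)
                    values[i] = func(simplex[i])
                    evals += 1
    best = min(range(dim + 1), key=values.__getitem__)
    return simplex[best], values[best], evals


def valley_error(x):
    """Toy landscape with a curved valley p * t**2 = 1, mimicking Eq. (3)."""
    p, t = x
    return (p * t * t - 1.0) ** 2


point, err, n_evals = nelder_mead(valley_error, [(0.5, 2.0), (0.5, 2.0)])
print(point, err, n_evals)
```

Note that the box is used only to place the initial vertices; nothing in the update rules keeps later vertices inside it.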

4.2 Genetic algorithm

We use here a binary version of a genetic algorithm [41], called GA here. Such an algorithm has already been used in the context of optics by several researchers: In [42], light pulses were shaped to optimize the yield of a light-induced chemical reaction and in [43] to achieve the shortest possible pulse at the end of a fiber [2]. In [44], fiber design was optimized in order to achieve optical supercontinuum with the widest spectrum. The algorithm is designed to find a maximum of some fitness value \({\fancyscript{F}};\) this is easily accommodated here by defining it as \({\fancyscript{F}}=1/f.\) The parameters of the initial candidates to be evaluated are coded in so-called chromosomes which are strings of binary values called genes. Here, reflected binary (Gray) code is used for the parameter representation. A starting point inside the initial parameter box is described by a string of binary values: each of the k pulse parameters to be optimized is divided into \(2^n\) steps between minimum and maximum allowed by the box, with n typically between 8 and 10. Thus, the chromosomes have a length of \(N_{\rm bit} = nk\), and the GA can evaluate \(2^{nk}\) possible parameter sets. These represent grid points in k-dimensional parameter space; at best, the algorithm can find the grid point closest to the solution.
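
The parameter coding can be illustrated as follows. This is a sketch under stated assumptions: each parameter is mapped onto a grid of 2^n levels that includes both edges of the box, and all function names are hypothetical.

```python
def to_gray(index):
    """Reflected binary (Gray) code of a non-negative integer."""
    return index ^ (index >> 1)


def from_gray(gray):
    """Inverse of to_gray."""
    index = 0
    while gray:
        index ^= gray
        gray >>= 1
    return index


def encode_parameters(values, box, n_bits):
    """Concatenate one n_bits-long Gray-coded gene string per parameter."""
    levels = 2 ** n_bits
    genes = []
    for value, (lo, hi) in zip(values, box):
        step = round((value - lo) / (hi - lo) * (levels - 1))
        step = min(max(step, 0), levels - 1)     # clip to the box
        genes.append(format(to_gray(step), f"0{n_bits}b"))
    return "".join(genes)


def decode_parameters(chromosome, box, n_bits):
    """Recover the grid-point parameter values from a chromosome."""
    levels = 2 ** n_bits
    values = []
    for k, (lo, hi) in enumerate(box):
        bits = chromosome[k * n_bits:(k + 1) * n_bits]
        step = from_gray(int(bits, 2))
        values.append(lo + (hi - lo) * step / (levels - 1))
    return values


# Two parameters (peak power in W, width in fs), 8 bits each.
box = [(60.0, 90.0), (100.0, 200.0)]
chromosome = encode_parameters([75.0, 150.0], box, 8)
print(chromosome, decode_parameters(chromosome, box, 8))
```

With Gray code, neighboring grid points differ in a single bit, so that a single bit flip can move a parameter to an adjacent grid value.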

An initial population is obtained by \(n_{\rm pop}\) randomly chosen chromosomes (we use \(n_{\rm pop} = 50\)). All of these chromosomes are evaluated (the respective initial pulse is numerically propagated) to obtain the respective 50 fitness values \({\fancyscript{F}}.\) A selection of parents by a roulette wheel weighted with the fitness values according to \(P_i={\fancyscript{F}}_i/(\sum_{m=1}^{n_{\rm pop}}{\fancyscript{F}}_{m})\) is then performed. From a set of two selected parents, two children are created by uniform crossover, where genes between both chromosomes are exchanged. This is repeated until \(n_{\rm pop}\) children are found.
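
Selection and crossover can be sketched in a few lines. The exchange probability of 1/2 per gene is the usual convention for uniform crossover and is assumed here; function names and the toy population are illustrative.

```python
import random


def roulette_select(population, fitness, rng):
    """Pick one chromosome with probability proportional to its fitness."""
    total = sum(fitness)
    pick = rng.uniform(0.0, total)
    running = 0.0
    for chromosome, fit in zip(population, fitness):
        running += fit
        if pick < running:
            return chromosome
    return population[-1]   # guard against round-off at the upper end


def uniform_crossover(parent_a, parent_b, rng):
    """Exchange each gene between the two parents with probability 1/2."""
    child_a, child_b = [], []
    for gene_a, gene_b in zip(parent_a, parent_b):
        if rng.random() < 0.5:
            gene_a, gene_b = gene_b, gene_a
        child_a.append(gene_a)
        child_b.append(gene_b)
    return "".join(child_a), "".join(child_b)


rng = random.Random(42)
population = ["00000000", "11111111", "01010101"]
errors = [0.5, 0.01, 0.1]                 # hypothetical error values f
fitness = [1.0 / f for f in errors]       # fitness F = 1/f
mother = roulette_select(population, fitness, rng)
father = roulette_select(population, fitness, rng)
print(uniform_crossover(mother, father, rng))
```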

The last step is a flip-bit mutation of the children where the probability of a gene to flip from zero to one or vice versa is \(P_{\rm mut} = N_{\rm bit}^{-1}\) and thus depends on the length of the chromosome. Note that the mutation step, required for convergence, is also important to prevent the GA from being confined in local minima of the error landscape.
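
With this mutation probability, one gene per chromosome is flipped on average. A minimal sketch (function name illustrative):

```python
import random


def flip_bit_mutation(chromosome, rng):
    """Flip each gene independently with probability 1 / len(chromosome)."""
    p_mut = 1.0 / len(chromosome)
    mutated = []
    for gene in chromosome:
        if rng.random() < p_mut:
            gene = "1" if gene == "0" else "0"
        mutated.append(gene)
    return "".join(mutated)


rng = random.Random(3)
print(flip_bit_mutation("0000000000000000", rng))
```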

Then, the next generation (a new population) is created and can be evaluated again. After evaluating many generations (we use 100 generations for a total of 5,000 evaluations), the best chromosome with the highest fitness or lowest error is then selected to represent the best result found by GA.
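
The complete generational loop may be sketched as follows. This is an illustration only: the error function is a toy stand-in for the numerical propagation, plain binary decoding replaces the Gray code for brevity, and the standard-library routine `random.choices` plays the role of the roulette wheel.

```python
import random


def run_ga(error, n_bits, n_pop=50, n_gen=100, seed=1):
    """Return (best chromosome, best error) after n_pop * n_gen evaluations."""
    rng = random.Random(seed)
    population = ["".join(rng.choice("01") for _ in range(n_bits))
                  for _ in range(n_pop)]
    best, best_err = None, float("inf")
    for _ in range(n_gen):
        errors = [error(c) for c in population]          # evaluate generation
        for chromosome, err in zip(population, errors):
            if err < best_err:
                best, best_err = chromosome, err
        # fitness F = 1/f; the tiny offset only guards against f = 0
        fitness = [1.0 / (err + 1e-300) for err in errors]
        children = []
        while len(children) < n_pop:
            pa, pb = rng.choices(population, weights=fitness, k=2)
            mask = [rng.random() < 0.5 for _ in range(n_bits)]
            children.append("".join(b if m else a for a, b, m in zip(pa, pb, mask)))
            children.append("".join(a if m else b for a, b, m in zip(pa, pb, mask)))
        p_mut = 1.0 / n_bits
        population = [
            "".join(("1" if g == "0" else "0") if rng.random() < p_mut else g
                    for g in child)
            for child in children[:n_pop]
        ]
    return best, best_err


def toy_error(chromosome):
    """Squared distance of the decoded value (0..1) from a hypothetical target."""
    value = int(chromosome, 2) / (2 ** len(chromosome) - 1)
    return (value - 0.3) ** 2


print(run_ga(toy_error, n_bits=10))
```

With the default settings, the loop performs 100 generations of 50 evaluations each, i.e., 5,000 evaluations in total.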

5 NM-GA comparison: fundamental and DM solitons

A comparison of two methods can only be made when the criteria are defined. Methods may differ in speed of convergence or in the smallness of the remaining error (the accuracy) or in the scatter of the latter (which affects the reliability). With a view to experimental realization, it turns out that computational speed is not relevant: The required time to read out data, or to set the pulse shaper, is much larger than computation time. The latter being swamped in the total evaluation time anyway, it is of no use to minimize it. A fair comparison between both methods just requires that the same number of pulse shape measurements in the experiment and thus the same number of individual propagation simulations (evaluations) is used.

To validate the methods, we start with the simple case of a fundamental soliton of the NLSE (Eq. 2) in a constant dispersion fiber with the same parameters as above. This is a situation for which analytic results are well known. We define the initial parameter box as

$$\begin{aligned} &60 \hbox{ W} <P_0< 90\hbox{ W}\\ &100 \hbox{ fs} <T_0< 200 \hbox{ fs} \end{aligned}$$

and highlight it in Fig. 1. This choice assures that part of the ‘soliton valley’ is included. When we let both procedures run, we watch the error f as a function of the evaluation number. To obtain reasonable statistics, the lowest error \(f_{\min}\) from each of 30 independent runs is monitored for 5,000 evaluations; Fig. 2 shows results on a logarithmic scale in a box-and-whisker representation [45]. Fifty percent of the error values are inside the box, and the median is marked by a horizontal bar inside the box. The size of the whiskers is given by the remotest outliers of the distribution.
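
The five numbers behind each box-and-whisker mark can be obtained as sketched below. The quartile convention (`method="inclusive"` of the Python standard library) is an assumption, as the text does not specify one, and the error values are hypothetical.

```python
import statistics


def box_whisker(errors):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    data = sorted(errors)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


# Hypothetical best-error values from independent runs; the wide spread
# over several decades is why a logarithmic axis is used for plotting.
f_min = [3e-9, 8e-9, 2e-8, 5e-8, 1e-7, 4e-7, 2e-5, 6e-3]
print(box_whisker(f_min))
```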

Fig. 2

Evolution of the best error value distribution for the standard soliton with increasing number of evaluations. The distribution from 30 individual runs is given in a box-and-whisker representation: 50 % of the values are located inside the boxes; the median is indicated by a horizontal bar. The whiskers indicate the most outlying error values. Data are for NM (gray) and GA (orange). NM reaches lower median errors but retains a huge spread of error values

Note that the GA has a population size of 50; therefore, the GA box-and-whisker representation after 50 evaluations in Fig. 2 corresponds to the first, randomly distributed population. Both the NM method and the GA produce an error which decreases with increased evaluation numbers and reaches small values. We find that usually error values \(f_{\min}\) are not normally distributed. This is due to the fact that some runs stagnate in local error minima and contribute a huge scatter.

Errors for the GA do not drop as far as for the NM, but they are much more localized. This is due to the global nature of the GA. While NM can get stuck in irrelevant local minima (therefore the large whiskers), GA has superior capability to find the global minimum. Once NM reaches the immediate vicinity of the global minimum, however, it homes in more accurately because in contrast to GA, it is not limited by a parameter range discretization. Therefore, GA does not reach great accuracy, but seems more reliable.

In a very similar way, Fig. 3 shows corresponding results for the DM soliton of the DM-NLSE. The smallest error values here are roughly four orders of magnitude larger than in the NLSE case. As pointed out in Sect. 3 above, the reason is that a Gaussian approximation never comes very close to the ‘true’ DM soliton shape. Basically, we find the same behavior as before: NM gets closer to zero error but suffers from outliers; GA is more reliable but never as accurate.

Fig. 3

Evolution of the best error value distribution as in Fig. 2 but for the DM soliton. Convergence is comparable to the case of the standard soliton, but neither NM nor GA reaches similarly low values. Again GA has higher remaining error but with much smaller spread

For further illustration, we compare the end result after 5,000 evaluations for both fundamental and DM soliton in Fig. 4. Either method provides a very good match between input and output shape, as the left parts of all panels show. The right parts show the propagation evolution; in the DM case, the dispersion map structure is indicated. Parameters and error values are specified above the plots. The loci of solutions for NM in a) and c) are identified in Fig. 1 by corresponding symbols (dotted and open circle) for convenient identification. Parameters for these cases are outside the preset parameter range: NM can trespass beyond the confines of the parameter box. In contrast, GA is automatically restricted to remain inside; square and filled square symbols identify the location in Fig. 1. If the true optimum happens to be outside the box, GA cannot access it; this has happened in panel d) where the peak power is found right on the edge of the parameter box. It will be important to note below that the starting box is also the roaming box for GA, but NM is not restricted to it.

Fig. 4

Comparison of optimization results for fundamental soliton (upper row) and DM soliton (lower row). Nelder–Mead results are shown in the left column, and genetic algorithm results in the right column. In each panel, input (dash-dotted black curve) and output (solid curve, gray for NM and orange for GA) field amplitudes are drawn together on the left; propagation is shown to the right. Inserted numbers are the best results obtained in the respective case

6 NM-GA comparison: two-soliton molecule

As we progress to increasingly more complex situations, we now turn to a discussion of a two-soliton molecule (which only exists in DM fibers). It will turn out that certain ambiguities or false results can occur when we keep restricting the information to that at input and output of the fiber. With the same DM fiber parameters as above, we approximate the two-soliton molecule by a double unchirped Gaussian with equal power and width:

$$\begin{aligned} A_{\rm in} &= \sqrt{P_0} \exp \! \left[-\frac{(t+\sigma/2)^2}{2T_0^2}\right]\\ &\quad +\sqrt{P_0} \exp \!\left[-\frac{(t-\sigma/2)^2}{2T_0^2}+i\varphi\right]. \end{aligned}$$
(5)

Four parameters are required to characterize this ansatz: the peak power \(P_0\), the pulse width \(T_0\), the separation \(\sigma\), and the relative phase \(\varphi\). The initial parameter range is given by

$$\begin{aligned} & 60 \hbox{ W} < P_0 < 90\hbox{ W} \\ & 100 \hbox{ fs} < T_0 < 200\hbox{ fs} \\ & 500 \hbox{ fs} < \sigma < 1000 \hbox{ fs} \\ & 0.9\pi \hbox{ rad} < \varphi < 1.1\pi \hbox{ rad} \end{aligned}$$
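
For reference, the ansatz of Eq. (5) can be evaluated as follows; the function name is illustrative, and the parameter values are simply taken from the middle of the range above.

```python
import cmath
import math


def two_pulse_ansatz(t, p0, t0, sigma, phi):
    """Complex input amplitude of Eq. (5) at time t (same time unit as t0, sigma)."""
    left = math.sqrt(p0) * math.exp(-((t + sigma / 2.0) ** 2) / (2.0 * t0 ** 2))
    right = math.sqrt(p0) * cmath.exp(
        -((t - sigma / 2.0) ** 2) / (2.0 * t0 ** 2) + 1j * phi
    )
    return left + right


# Mid-box parameters: 75 W, 150 fs, 750 fs separation, opposite phase.
for t in (-375.0, 0.0, 375.0):
    print(t, abs(two_pulse_ansatz(t, 75.0, 150.0, 750.0, math.pi)))
```

For a relative phase of π, the two contributions cancel at the midpoint between the pulses, so that the amplitude has a node there.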

Both NM and GA successfully locate candidates for two-soliton molecules in a way comparable to that shown in Fig. 3 (not shown). We concentrate on a few representative cases: Fig. 5 shows two examples for each algorithm, all with low remaining error. To check whether these candidates are acceptable solutions, we resorted to numerical propagation simulation and found that panels a) and c) are good solutions, whereas b) and d) are false.

Fig. 5

Comparison of results from NM and GA. a Soliton molecule by NM, valid result; b same but false result (see text). c Soliton molecule by GA, valid result; d same but false result

In the NM case, the solution in b) has a huge separation which is far outside the parameter box. There is nothing in the NM algorithm that prevents the simplex from trespassing beyond the borders of the box; this creates a proneness to such false solutions. The small error is explained by realizing that the solitons can be interpreted here as single free solitons (which have near-Gaussian shape), not a compound state.
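
One simple safeguard is to test each candidate against the initial parameter box after the optimization has finished. The sketch below uses the box of this section; the candidate values are hypothetical and serve only to illustrate the check.

```python
import math

# Initial parameter box of the two-soliton molecule (W, fs, fs, rad).
BOX = {
    "P0": (60.0, 90.0),
    "T0": (100.0, 200.0),
    "sigma": (500.0, 1000.0),
    "phi": (0.9 * math.pi, 1.1 * math.pi),
}


def inside_box(candidate, box):
    """True if every parameter lies strictly inside its allowed interval."""
    return all(lo < candidate[name] < hi for name, (lo, hi) in box.items())


# Hypothetical candidates: the second has drifted to a huge separation.
bound_state = {"P0": 72.0, "T0": 160.0, "sigma": 640.0, "phi": math.pi}
free_pulses = {"P0": 72.0, "T0": 160.0, "sigma": 3000.0, "phi": math.pi}
print(inside_box(bound_state, BOX), inside_box(free_pulses, BOX))
```

Such a check removes candidates outside the box but cannot detect false solutions that arise from dynamic evolution inside it.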

For the GA, panel d) shows a case in which only the propagation simulation reveals a nonstationary separation: Over the fiber length, the two pulses first approach each other, then move outwards again. At the end of the given fiber length, they happen to have the same separation as at launch. In spite of an error value even smaller than in c), this is a false solution.

The problem of undesired solutions outside the parameter box occurs for NM alone; false solutions due to dynamic evolution are shared by both methods. Only judicious checking can safeguard against these problems. For example, the large separations as in Fig. 5b represent an indifferent equilibrium. In an indifferent equilibrium, any position is as good as any other, so that each repetition probably will return a different separation. This would provide a clue that the solution is false. The main concern is structures which dynamically evolve during propagation and just happen to return to the initial shape at the fiber end. One can then repeat the experiment with slightly modified parameters, such as to map out the local neighborhood in parameter space. Valid solutions should form some continuous manifold in parameter space. The oscillatory cases are subject to the additional constraint that an integer multiple of their evolution length must coincide with the fiber length; their locus in parameter space must then more resemble a dotted line than a continuous one. If that approach fails, slight modifications of the setup might help: One might append another period to the dispersion map by way of a connector, or one might tap off a tiny fraction of the signal power at a midway point along the fiber; the power will only suffice to check on the optical spectrum, but that alone will provide additional information about stationarity.

7 Soliton molecules in a realistic fiber

Finally, we proceeded to a realistic fiber model including higher-order corrections to Eq. (1) leading to the following DM-NLSE:

$$\begin{aligned} \frac{\partial}{\partial z}A &= i\sum\limits_{k=2}^{5} \left\{\frac{i^{k}}{k!}\beta_{k}(z)\frac{\partial^k}{\partial t^k}A \right\} +i\gamma(z)\left|A\right|^2 A \\ & \quad - \frac{\alpha(z)}{2}A - i\gamma(z) T_{\rm R}\frac{\partial}{\partial t}\left|A\right|^2 A \end{aligned}$$
(6)

where \(\beta_3(z)\)–\(\beta_5(z)\) are higher-order dispersion terms, \(T_{\rm R}\) is the Raman response time, and \(\alpha(z)\) describes linear losses. We use

$$A_{\rm in} = \sum\limits_{l=1}^{m}\sqrt{P_l}\exp\!{\left[-\frac{(t+(2-l)\sigma_{l})^2}{2T_0^2}+i \varphi_{l}\right]}$$
(7)

to model the two-soliton (m = 2) and three-soliton (m = 3) molecules. Table 2 updates the parameters used; those not mentioned again are as above. The values are chosen for close correspondence to the experimental fiber used in [23, 24]. The same applies to the initial pulse width \(T_0 = 178\) fs, and the phase \(\varphi_2 = 0\). Note that in the experiment, the correct parameters needed to be guessed; here, we present a procedure to find their optimal values.

Table 2 Parameters of higher-order dispersion, Raman response time, and overall splice losses

The initial parameter range, again inspired by the experiment, is given by

$$\begin{aligned} & 25 \hbox{ W} <P_l < 35 \hbox{ W}\\ & 600 \hbox{ fs} <\sigma_l < 900 \hbox{ fs}\\ & 0.9\pi \hbox{ rad} <\varphi_l < 1.1\pi \hbox{ rad}. \end{aligned}$$

This range is too narrow to find quite different solutions; it is designed to optimize the known one.

In this approximation, the number of parameters carried for the optimization is 4 for the two-soliton molecule and 7 for the three-soliton molecule. We point out that now we have to make subtle adjustments to the error definition in Eq. (4): In the presence of loss, the pulse shape at the fiber end must be rescaled in power to match the input energy. Also, in the presence of the Raman effect and higher-order dispersion, there is an overall temporal shift of the output pulse which is removed before inserting into Eq. (4).
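
These two adjustments can be sketched as follows. The text does not state how the temporal shift is determined; the sketch assumes that the power centroids of input and output are aligned by a whole number of samples with periodic wrap-around, which is only one possible choice. All names and test data are illustrative.

```python
import math


def energy(a, dt):
    """Pulse energy on a uniform grid (rectangle rule)."""
    return sum(abs(x) ** 2 for x in a) * dt


def centroid_index(a):
    """Power-weighted mean sample index."""
    weights = [abs(x) ** 2 for x in a]
    return sum(i * w for i, w in enumerate(weights)) / sum(weights)


def align_output(a_in, a_out, dt):
    """Rescale a_out to the input energy and undo its overall temporal shift."""
    scale = math.sqrt(energy(a_in, dt) / energy(a_out, dt))
    shift = round(centroid_index(a_out) - centroid_index(a_in))
    n = len(a_out)
    return [scale * a_out[(i + shift) % n] for i in range(n)]


# Test data: the output is an attenuated copy delayed by 7 samples.
n, dt = 256, 1.0
a_in = [math.exp(-((i - 100) ** 2) / (2.0 * 10.0 ** 2)) for i in range(n)]
a_out = [0.5 * a_in[(i - 7) % n] for i in range(n)]
aligned = align_output(a_in, a_out, dt)
print(max(abs(x - y) for x, y in zip(a_in, aligned)))
```

After this preprocessing, the aligned output can be inserted into Eq. (4) in place of the raw output.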

In Fig. 6, the distribution of the best error values as a function of the number of evaluations is shown from 20 individual runs. Obviously, NM (gray) has a much broader distribution than GA (orange). This is again attributed to the possibility of NM to escape from the initial parameter range. While NM successfully finds the lowest possible error values, these solutions are false: When the separation comes out as nearly zero, what has been located is really a single soliton, not a molecule. To illustrate this, four positions are marked with symbols in Fig. 6 which identify the cases shown in the four panels of Fig. 7. Panels a) and b) show valid solutions. The relative delay is due to the Raman effect. The case in c) has the lowest error overall; however, it is really a single DM soliton because the initial separation has evolved to nearly zero. The low-error value is misleading, of course, as this is a mathematically valid, but not a desired solution which was obtained because the simplex strayed beyond the borders of the parameter box. If such events are discarded, one also loses the low-error cases, and the remaining error increases; this is indicated in Fig. 6 as the additional box-and-whisker mark with black color at 2,000 evaluations. In Fig. 7d, the error stagnates at the largest value; here, the procedure got stuck in a secondary optimum. In contrast to these complications, GA did not produce false results.

Fig. 6

Two-soliton molecule in a realistic fiber: box-and-whisker representation of the error value distribution from NM and GA as a function of evaluation number. Both methods converge quite well. GA produces a higher median but with much less scatter than NM. Symbols at the far right indicate locus of candidates shown in detail in Fig. 7a–d; the additional box-and-whisker in black is explained in the text

Fig. 7

Amplitude profile of two-soliton molecules. Input (dash-dotted curve) and output (solid curve, gray for NM and orange for GA) signal and the corresponding simulated propagation down the fiber. Panels are identified by symbols which match with those in Fig. 6. a Best result from GA. b Same for NM. c, d False results from NM, see text

Figures 8 and 9 show corresponding results for three-soliton molecules. They are comparable to the two-soliton molecule case in that there is a slow convergence, and that NM produces much more scatter but a somewhat better median. However, the disadvantage of GA’s higher median is here reduced to no more than ≈ 25 %. Again, symbols in Fig. 8 mark examples shown in Fig. 9 where panels (a) and (b) are the best candidates from GA and NM, respectively. Once again the lowest error is found by NM for the false solution of a single DM soliton; removal of these cases again increases the remaining error (black box-and-whisker).

Fig. 8

Three-soliton molecule in a realistic fiber: box-and-whisker representation of the error value distribution from NM and GA as a function of evaluation number. Both methods converge reasonably well. GA produces a slightly higher median but with much less scatter than NM. Symbols at the far right indicate locus of candidates shown in detail in Fig. 9a–d. The additional box-and-whisker in black is explained in the text

Fig. 9

Amplitude profile of three-soliton molecules. Input (dash-dotted curve) and output (solid curve, gray for NM and orange for GA) signal and the corresponding simulated propagation down the fiber. Panels are identified by symbols which match with those in Fig. 8. a Best result from GA. b Same for NM. c, d False results from NM, see text

8 Conclusion

Our scheme to find solutions of a nonlinear wave equation leads to a multi-dimensional nonlinear optimization problem in an intensely and irregularly corrugated error landscape. In order to keep computational expense down and the procedure manageable, we restricted the number of free parameters, bounded by the number of pulse shaper pixels, to a much smaller number by making assumptions about the expected pulse shape. Such reduction defines a lower-dimensional subspace of the full parameter space; the best approximation to the solution in this projection subspace is considered as the best approximation to the ‘true’ shape that is obtainable in the circumstances. It is always possible to carry more free parameters in order to meet any required precision.

We compared two optimization algorithms; their ability to home in on the desired optimum is different: NM gets closer to the ‘true’ solution as it is not restricted by the discretization grid of GA, but GA arrives near the correct solution with better reliability. That is because NM is prone to stagnate at false solutions or get trapped in undesired cases, a behavior traced to the simplex walking out of the prescribed parameter box. Ironically, undesired solutions tend to contribute the smallest errors so that, when these events are eliminated, the net error rises above that of GA.

We conclude that for less complex cases, NM converges very well and is superior to GA, but it is more affected by increasing complexity. For the more interesting cases (where the solution is hitherto unknown), GA is the more reliable tool. Ultimately, for the optimization of soliton molecules in the experiment, the GA is preferable.

We have now set up an actual experiment to test the numerical results presented here in a realistic situation. For details about the light source, the spatial light modulator, the fiber, and data acquisition, we refer the reader to Refs. [23, 24]. Let us point out that initially, we plan to use optical cross-correlation between input and output pulses to simplify their comparison; later a full characterization of either may become necessary. It is appealing that by using an actual fiber, all corrections to its parameters are fully taken into account automatically even when they are not explicitly known. In this sense, the experiment can be thought of as an analog computer in which the fiber itself is utilized to find solutions of its propagation equation.