1 Introduction

Extremely high laser intensities have been pursued in many laboratories around the world [1,2,3,4,5,6,7] for cutting-edge research such as inertial confinement fusion (ICF), particle acceleration, laboratory astrophysics and quantum electrodynamics (QED) [8,9,10,11]. Based on chirped pulse amplification (CPA) [12] or optical parametric chirped pulse amplification (OPCPA) [13] configurations, both the single-pulse energy and the peak power of the lasers have grown steadily over the past few years, partly thanks to the increasing size of the laser gain media and the optical gratings [14, 15]. However, these techniques have been pushed to their limits, while the laser intensities of interest to many scientists still lie far beyond what can be achieved with current technologies. Several approaches have been proposed to achieve higher peak power, such as using tiled or synthesized gratings to increase the effective size of the grating [16,17,18,19,20,21,22,23] or coherently combining multiple channels of intense laser beams [24,25,26,27].

The technique of coherent beam combining (CBC) has been studied and applied with fiber lasers for many years [28, 29] with great success. State-of-the-art technologies can deliver average power above 10 kW [30], with temporal properties ranging from continuous wave to sub-picosecond pulses. Recently, coherently combining multiple beams from solid-state lasers has been studied with the aim of applying CBC to petawatt (PW) level facilities [31,32,33,34,35]. For example, the Shanghai Extreme Light (SEL) facility aims at an output of 100 PW peak power with 1500 J single-pulse energy and 15 fs pulse duration. SEL is based on OPCPA. After the last amplification stage, the nanosecond 500 mm-by-500 mm beam will carry 2500 J, and compressing it directly would damage the grating. One way to avoid this is to split the amplified beam into four beams of equal energy. The four beams are then compressed separately. Finally, they are tiled into a 2-by-2 beam matrix before being combined and focused in a vacuum chamber.

Managing a single-beam PW laser is already a challenging task. Managing a PW CBC system is more than simply adding up the work for the separate sub-beams, because it also involves the combining itself. For a fiber-laser CBC system, the output of each channel has high beam quality and the pointing of the channels can be well calibrated and fixed; therefore, the only factor affecting CBC efficiency is the piston phase among the channels. The James Webb Space Telescope (JWST) project led by NASA involves combining several beams for imaging [36]. It needs to adjust the pointing direction as well as the piston of each mirror segment to achieve a high-quality image of the stars. The PW CBC system faces extra challenges. For example, the wavefront of the meter-sized beam will be deformed and even influenced by the beam pointing direction, and the measurement data of one beam may be contaminated by other beams. These effects all need to be corrected or avoided. Considering the fluctuations of the environment around the system, this work needs to be done on a daily basis.

The motivation of this work is to provide a systematic method to cope with the complex situations that arise in aligning a PW CBC system. To keep the system simple, the method uses only the fluence distribution in the far field (measured by CCDs) as the feedback for controlling a deformable mirror (DM) and a reflective mirror in each optical path, with the goal of making the coherent combining efficiency as high as possible. The full workflow of the method consists of 8 steps, among which the nontrivial ones are demonstrated through numerical simulations.

It should be noted that a real system may suffer from various dynamical disturbances. The method in the current work only optimizes a system under quasi-static conditions, i.e., we assume the system state does not change during the optimization process. This assumption can be valid for a single-beam PW system. Whether it is valid for a PW CBC system is uncertain because no such full-scale system has been built yet. According to previous simulations [37,38,39,40,41,42], to achieve better than 80% focusing efficiency with SEL’s laser parameters, the pointing deviation of each beam should be within 0.5 \(\mu {\text{rad}}\) and the delay among the channels should be within 1 fs, both of which are very stringent requirements. On the other hand, even for a single-beam PW system, shot-to-shot fluctuations such as those of the pointing direction cannot be corrected in real time. It is better to improve the environmental conditions and reduce the fluctuations passively, instead of relying on real-time corrections.

The rest of the paper is organized as follows. Section 2 describes the numerical model used to simulate the PW CBC system. Section 3 describes the alignment method in detail. Section 4 presents one implementation of the method with traditional algorithms, demonstrates the workflow and assesses the performance of the optimized configurations. The influence of spatiotemporal coupling effects on combining efficiency is also analyzed. Section 5 discusses the potentials and improvements of the method. The paper ends with a conclusion in Sect. 6.

2 Numerical model

This work uses laser parameters according to the design of the SEL facility. The details of the numerical model for the CBC system are described in previous work [42]. Figure 1a, b shows the minimum configuration of the CBC system. The input beam from the last amplifier first propagates through a beam splitting module, resulting in 4 beams with the same size, balanced dispersion and one quarter of the input energy. The four beams are compressed separately with grating compressors. The compressed pulses are incident at small angles onto a DM and are then reflected by several mirrors before joining together on a large mirror. From then on, the tiled 2-by-2 beam matrix propagates as a single beam and is finally focused by a parabola in the chamber to perform experiments.

Fig. 1

a 3D model of the optical layout of the femtosecond petawatt coherent combining system and b its top view. Black dot surrounded by red-dotted circle indicates the nominal focus; part (1) surrounded by black-rectangle is beam splitting module; part (2) are pulse compressing modules; mirrors indicated by black-dashed circles are deformable mirrors; black-solid ellipses indicate reflective mirrors for controlling beam delay and pointing. c Ideal spatial-spectral distribution of the coherent-combined pulse at the nominal focus and d the corresponding spatiotemporal distribution. The unit of time axis is converted to length. e–h Ideal spatial-spectral distributions of each path. The 3D distribution is shown in two isosurfaces (20% and 50% of peak value)

In the ideal situation, the input beam and all the optical elements are free of deformations; the pulses are compressed to Fourier-transform-limited durations and are synchronized; the pointing directions of all beams are well calibrated. The beam parameters at the nominal focus are used as normalization values in later calculations. Figure 1c–h shows the 3D intensity distributions of the coherent-combined pulse and sub-pulses from each path at the nominal focus in the ideal situation.

To generate a configuration resembling real situations, the translational positions, pointing directions and surface shapes of all optical elements are made to deviate slightly and randomly from ideal conditions. The surface deformations are represented as a linear combination of 42 orthonormal functions (from the 4th polynomial onward in Table 15 in [43]; for a rectangular domain the coordinates are rescaled).

To optimize the configuration, the 2-dimensional fluence distribution at the nominal focus is used as the feedback. Note that in a real experiment the fluence distribution at the focus needs to be magnified by about 40 times to be well resolved by a CCD and relayed to other locations. This detail is not considered here. The control is through a DM and a reflective mirror in each of the four paths. The DM corrects wavefront curvature; the reflective mirror adjusts pointing direction and delay. The correction by the DM is also synthesized with the same 42 orthonormal functions. In this work, all the deviations are corrected based on the image of the fluence distribution at the focus. In a real experiment, various other kinds of measurement can be added to provide more information. Here we only focus on the “minimum configuration”.

For a numerical experiment, a CBC system is generated in which each optical element’s position, pointing and surface shape deviate from ideal values randomly. Then, the workflow of the method optimizes the focusing efficiency (feedback) by adjusting specified mirrors (control) in the system.

3 Alignment method

The full workflow for the method proposed in this work is shown in Fig. 2. It consists of 8 steps. The first step makes sure the four beams are captured by the CCD simultaneously. To check this for one path, record one image, change the pointing direction of the reflective mirror in that path, and see whether the image differs before and after the move. If there is a difference, the beam is captured; if not, the beam is missed, and the direction of the mirror is scanned over a larger range until the beam is captured or human intervention is needed. The check is then repeated for the other paths until all beams are detected.
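The capture check in the first step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `grab_frame` and `move_mirror` are hypothetical callables standing in for the CCD readout and the mirror actuator, and `noise_floor` is an assumed threshold below which a change in the image is attributed to noise.

```python
import numpy as np

def beam_is_captured(frame_before, frame_after, noise_floor):
    """True if moving the mirror changed the CCD frame by more than the noise floor."""
    diff = np.abs(np.asarray(frame_after, float) - np.asarray(frame_before, float))
    return bool(diff.max() > noise_floor)

def find_beam(grab_frame, move_mirror, scan_offsets, noise_floor):
    """Scan pointing offsets until the beam of one path shows up on the CCD.

    grab_frame   : hypothetical callable returning the current CCD frame
    move_mirror  : hypothetical callable setting the pointing offset of one mirror
    scan_offsets : offsets to try, ordered from small to large range
    Returns the first offset at which the image changed, or None if the
    scan is exhausted (the case where human intervention is needed).
    """
    for offset in scan_offsets:
        before = grab_frame()
        move_mirror(offset)
        after = grab_frame()
        if beam_is_captured(before, after, noise_floor):
            return offset
    return None
```

Returning `None` rather than raising keeps the decision to call for manual alignment with the caller.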

Fig. 2

Block diagram of the workflow of the method. The blue-dashed parts are briefly described, and the red-solid parts are simulated in this work. WFC wavefront correction

The fluence distributions (FDs) of the four beams may overlap each other. For some operations, such as optimizing the wavefront of one path, interference with other paths may corrupt the feedback information. The 2nd step makes sure the four beams are temporally well separated so that there is no interference. To achieve this, record one image, change the path length of one path, and see whether the image differs. If there is no difference, the combination is incoherent; if there is, scan the path length over a larger range until the beam is temporally apart from the others. Then, repeat for the other paths until all four paths are separated. This step prepares for the next step, coarse wavefront correction (WFC).

The coarse WFC step performs the wavefront correction for each path under the condition that the FDs of the four beams overlap each other. It is necessary because initially the FDs may look like a scattered pattern and spread over a large area on the CCD due to severe wavefront deformations. Completely separating the four distributions in space by turning the beams through large angles causes several problems: large deviations from the intended path may induce unwanted scattered light that contaminates the detection, and it is not known which path corresponds to which beam profile on the CCD. This step ensures that the FD of each path is well localized without first separating them. Building on this result, a mapping between path and FD can be established in the 4th step, beam identification.

After the 4th step, we can put each beam at specified locations on the CCD, and therefore fine WFC in the 5th step can be performed, in which only one beam is at the center of the CCD for correction while the other beams are moved away. After fine WFC, the mapping between path and FD should be reestablished in the 6th step for accurate locating. The 7th step, coarse synchronization, reverses the 2nd step by bringing all 4 beams into coherent ranges and moving all 4 FDs to the center of CCD. This will make the 8th step easier. In the 8th step, coherent combining is performed by adjusting the pointing directions and translation positions of the four reflective mirrors simultaneously to maximize the total fluence through a small virtual aperture at the center of CCD.

The 1st, 2nd and 7th steps are easy and can be implemented as described above. The other steps can be implemented in different ways depending on the algorithms. The following section demonstrates implementations using traditional algorithms.

It should be noted that the optimization of the grating compressors is not considered in the current work, mainly because of the large computational cost. To accelerate the simulation, a ray matrix at a single central wavelength (910 nm) is used. After optimization, the system performance is assessed by calculating the 3D intensity distributions of the pulses at the focus with the full spectrum. In real experiments, the CCD can detect either a narrowband beam through a filter or the full-spectrum pulse, and the method can be applied in most situations in both cases.

4 Implementations

4.1 Wavefront correction

During fine WFC, only the beam to be corrected is near the center of the CCD, with the other three moved away. Many established image-based wavefront correction algorithms [44,45,46,47,48] can be used. In this work, the stochastic parallel gradient descent (SPGD) algorithm [44], which is widely adopted in fiber-laser CBC systems, is used.

Instead of the voltages applied to each actuator of a DM, which are not yet available to us, the control signal is the 42-dimensional vector \(a = \{ a_{4} , \ldots ,a_{j} , \ldots ,a_{{45}} \}\) of coefficients of the orthonormal functions [43], and the surface shape of the DM is

$$d({\mathbf{r}}) = \sum\limits_{j = 4}^{45} {a_{j} S_{j} ({\mathbf{r}})} , \tag{1}$$

where \(S_{j} ({\mathbf{r}})\) is the jth orthonormal function and \({\mathbf{r}}\) denotes the 2D projective coordinates \((x,y)\) of the surface. The RMS of the surface deformation is \(\sqrt {\sum\limits_{j = 4}^{45} {a_{j}^{2} } }\). The signal to be optimized is the peak value of the fluence distribution. The algorithm has two parameters, the dithering amplitude \(\Delta a\) and the gain \(g\), which need to be chosen properly. They depend on the scales of the other variables and need to be determined through trial numerical experiments. The control signal is updated after each iteration of the algorithm. Three measurements of the fluence distribution are needed in one iteration.
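A minimal Python sketch of one SPGD iteration is given here for orientation. It assumes the common two-sided form of the algorithm, in which the three measurements are taken at the positively dithered state, the negatively dithered state and the updated state; the text does not spell out this split, so it is an assumption. The callable `measure`, the quadratic toy metric and the values of `dither_amp` and `gain` are hypothetical and are not the settings used in the simulations.

```python
import numpy as np

def spgd_step(a, measure, dither_amp, gain, rng):
    """One SPGD iteration using three metric measurements.

    a          : current control vector (e.g. the 42 DM coefficients)
    measure    : hypothetical callable returning the scalar metric for a control vector
    dither_amp : dithering amplitude (Delta a)
    gain       : gain g
    """
    # Random +/- dither applied to all coefficients in parallel.
    delta = dither_amp * rng.choice([-1.0, 1.0], size=a.shape)
    j_plus = measure(a + delta)    # measurement 1
    j_minus = measure(a - delta)   # measurement 2
    # Move along the dither direction, scaled by the observed metric change.
    a_new = a + gain * (j_plus - j_minus) * delta
    j_new = measure(a_new)         # measurement 3
    return a_new, j_new

# Toy usage: a quadratic stand-in for the peak fluence near its optimum.
rng = np.random.default_rng(0)
residual = rng.normal(scale=0.3, size=42)   # hypothetical uncorrected coefficients

def toy_peak(a):
    return 1.0 - np.sum((a + residual) ** 2)

a = np.zeros(42)
start = toy_peak(a)
for _ in range(300):
    a, peak = spgd_step(a, toy_peak, dither_amp=0.1, gain=0.5, rng=rng)
```

For this quadratic toy metric the update never lowers the metric as long as the gain stays below \(1/(2n\Delta a^{2})\) with \(n = 42\); a real fluence peak is not quadratic, so no such guarantee carries over.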

The SPGD algorithm has been proven to improve the target signal in a statistical sense [44]. However, the target signal is often found to decrease or fluctuate during the optimization process, even after the dithering amplitude and gain have been chosen carefully. To prevent such a situation from continuing too long, a fixed number of iterations is grouped into one episode. After each episode, the best result is selected as the starting point for the next episode. This does not prevent the target signal from occasionally decreasing, but it initiates a new search direction in the complex high-dimensional space between episodes.
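The episode strategy can be sketched as a thin wrapper around any iteration routine. In this illustration `step` is a hypothetical callable that performs one optimizer iteration and returns the new control vector with its metric, and the "best result" is taken to be the best state seen so far over all episodes, which is one possible reading of the description.

```python
import numpy as np

def run_episodes(a0, measure, step, n_episodes, n_iters):
    """Run a fixed number of iterations per episode, restarting from the best state.

    a0      : initial control vector
    measure : hypothetical callable returning the scalar metric of a control vector
    step    : hypothetical callable, step(a) -> (a_new, metric_new)
    """
    best_a = np.array(a0, dtype=float)
    best_j = measure(best_a)
    for _ in range(n_episodes):
        a = best_a.copy()              # each episode starts from the best state so far
        for _ in range(n_iters):
            a, j = step(a)             # the metric may drop inside an episode
            if j > best_j:
                best_a, best_j = a.copy(), j
    return best_a, best_j
```

With this bookkeeping the returned metric can never be worse than the initial one, even though individual iterations are allowed to decrease it.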

For coarse WFC, the signal to be optimized is changed to the peak value of the difference between two consecutive fluence distributions during one iteration.
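The reason a difference image helps when the four FDs overlap can be illustrated as follows: the contributions of the paths that are not being dithered are the same in both frames and cancel in the difference (for incoherent superposition). The sketch below takes the peak of the absolute difference; whether the signed or the absolute difference is meant is not stated above, so this choice is an assumption.

```python
import numpy as np

def difference_peak(frame_a, frame_b):
    """Peak of the absolute difference between two consecutive CCD frames."""
    diff = np.asarray(frame_a, float) - np.asarray(frame_b, float)
    return float(np.abs(diff).max())
```

Adding the same static background to both frames leaves the returned value unchanged up to rounding, which is the property the coarse step relies on.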

4.2 Beam identification

Beam identification (BI) is achieved by moving each fluence spot towards the center of the CCD. The combining of the four paths is assumed to be incoherent. The coordinates of the target location (here the center of the CCD) are \((x_{{\text{t}}} ,y_{{\text{t}}} )\); the pointing state of the control mirror in path m is \((\alpha_{m} ,\beta_{m} )\). The basic procedure for moving the fluence spot of path m towards the center is as follows:

1) Record the current CCD frame \(I_{0}\).
2) Change the pointing state to \((\alpha_{m} + \Delta \alpha ,\beta_{m} )\) and record CCD frame \(I_{\alpha }\).
3) Obtain the change of coordinates \((\Delta x,\Delta y)\) of the fluence peak from \(I_{0}\) and \(I_{\alpha }\).
4) Obtain the proportional factors \(p_{\alpha } = (\frac{\Delta x}{{\Delta \alpha }},\frac{\Delta y}{{\Delta \alpha }})^{{\text{T}}}\).
5) Change the pointing state to \((\alpha_{m} + \Delta \alpha ,\beta_{m} + \Delta \beta )\) and record CCD frame \(I_{\beta }\).
6) Obtain the change of coordinates \((\Delta x,\Delta y)\) of the fluence peak from \(I_{\beta }\) and \(I_{\alpha }\).
7) Obtain the proportional factors \(p_{\beta } = (\frac{\Delta x}{{\Delta \beta }},\frac{\Delta y}{{\Delta \beta }})^{{\text{T}}}\).
8) Construct the matrix \({\mathbf{P}} = [p_{\alpha } ,p_{\beta } ]\).
9) Obtain the change of coordinates \((\Delta x,\Delta y)\) from the current position to the target \((x_{{\text{t}}} ,y_{{\text{t}}} )\).
10) Obtain and execute the change of pointing state \(\begin{bmatrix} \Delta \alpha \\ \Delta \beta \end{bmatrix} = {\mathbf{P}}^{ - 1} \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}\).
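The ten steps above amount to a finite-difference estimate of a 2-by-2 response matrix followed by a linear solve. A minimal Python sketch is given below; `grab_frame` and `set_pointing` are hypothetical callables for the CCD and the control mirror, and the spot is located by the brightest pixel, which assumes that the spot being moved dominates the frame.

```python
import numpy as np

def peak_xy(frame):
    """(x, y) pixel coordinates of the brightest pixel of a CCD frame."""
    iy, ix = np.unravel_index(np.argmax(frame), np.shape(frame))
    return np.array([ix, iy], dtype=float)

def bi_step(grab_frame, set_pointing, pointing, target_xy, d_alpha, d_beta):
    """One pass of the basic procedure; the mirror is assumed to start at `pointing`."""
    alpha, beta = pointing
    xy0 = peak_xy(grab_frame())                       # step 1
    set_pointing(alpha + d_alpha, beta)               # step 2
    xy_a = peak_xy(grab_frame())
    p_alpha = (xy_a - xy0) / d_alpha                  # steps 3-4
    set_pointing(alpha + d_alpha, beta + d_beta)      # step 5
    xy_b = peak_xy(grab_frame())
    p_beta = (xy_b - xy_a) / d_beta                   # steps 6-7
    P = np.column_stack([p_alpha, p_beta])            # step 8
    shift = np.asarray(target_xy, float) - xy_b       # step 9
    # Step 10: solve P @ [d_alpha, d_beta] = shift (raises LinAlgError if P is singular).
    move = np.linalg.solve(P, shift)
    new_pointing = (alpha + d_alpha + move[0], beta + d_beta + move[1])
    set_pointing(*new_pointing)
    return new_pointing
```

Using `np.linalg.solve` instead of forming the inverse explicitly gives the same result for a well-conditioned matrix. A singular matrix, for instance when a dither does not move the spot by at least one pixel, raises an error that the caller has to handle.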

The basic procedure assumes a linear relation between pointing direction change and fluence spot shift. However, due to the ubiquitous surface deformation of the mirrors, this assumption is not valid, and even the shape of the fluence spot changes with the pointing direction change. Even worse, if the wavefront of a single beam is seriously deformed the fluence distribution is not localized and can be difficult to identify.

To solve this problem, before BI the wavefront deformation of a beam needs to be corrected to the extent that a main stable spot emerges, under the condition that the fluence distributions of all beams overlap onto each other on the CCD. This is what the 3rd step in Fig. 2 does. Once this preliminary correction is completed, the basic procedure for BI can be performed for each path. To ensure a robust identification, the above 10-step procedure can be executed several times until the location of the spot converges.

With reliable BI, the beam spots can be put at or away from the center of CCD in a reliable way.

4.3 Coherent beam combining

According to one of SEL’s future applications, the transverse location of the peak intensity should be within a circle of 1 \(\mu {\text{m}}\) radius centered at the nominal focus. Therefore, the signal to be maximized for coherent beam combining (CBC) is set to the total fluence through a virtual aperture of 1 \(\mu {\text{m}}\) radius at the center of the CCD. The SPGD algorithm is used in this step too. The control signal is a 12-dimensional vector \(c = \{ t_{1} ,\alpha _{1} ,\beta _{1} ;t_{2} ,\alpha _{2} ,\beta _{2} ;t_{3} ,\alpha _{3} ,\beta _{3} ;t_{4} ,\alpha _{4} ,\beta _{4} \}\) corresponding to three degrees of freedom (translation, tip and tilt) for each of the four control mirrors (indicated by black-solid ellipses in Fig. 1).
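The virtual-aperture metric and the layout of the control vector can be illustrated with the short Python sketch below. The pixel size in micrometers (referred back to the focal plane) is an assumed input, the aperture is centered on the geometric center of the frame by default, and the function names are hypothetical.

```python
import numpy as np

def aperture_fluence(frame, pixel_size_um, radius_um=1.0, center=None):
    """Total fluence inside a circular virtual aperture on the CCD frame."""
    frame = np.asarray(frame, float)
    ny, nx = frame.shape
    if center is None:
        center = ((nx - 1) / 2.0, (ny - 1) / 2.0)   # (x, y) of the frame center in pixels
    yy, xx = np.mgrid[0:ny, 0:nx]
    r_um = np.hypot(xx - center[0], yy - center[1]) * pixel_size_um
    return float(frame[r_um <= radius_um].sum())

def split_control(c):
    """Split the 12-vector c into translation, tip and tilt arrays, one entry per mirror."""
    per_mirror = np.asarray(c, float).reshape(4, 3)  # rows: (t_m, alpha_m, beta_m)
    return per_mirror[:, 0], per_mirror[:, 1], per_mirror[:, 2]
```

A function of this form can serve as the scalar metric passed to an SPGD-type optimizer acting on the 12 control values.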

4.4 Workflow demonstrations

A CBC configuration deformed from the ideal one is first generated, in which each optical element’s position, pointing direction and surface shape deviate randomly from their ideal values. There are nearly 50 meter-sized optical elements in total, including reflective mirrors, beam splitters, glass plates and gratings. The translation deviation of each optic is sampled randomly from [−50, 50] nm; the tip and tilt deviations are sampled from [−0.6, 0.6] \(\mu {\text{rad}}\); the RMS of the surface deformation is \(\lambda /10\) (= 91 nm), but the coefficients of the 42 polynomials are random. The deviation ranges may seem small. However, with so many optics involved the optimization is nontrivial. It should be noted that only the red-solid blocks in Fig. 2 are simulated and demonstrated in the following; the blue-dashed blocks are assumed to be done where needed.
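One way to draw such a random element is sketched below in Python. The coefficient vector is given a random direction and then rescaled so that \(\sqrt{\sum a_j^2}\) equals the prescribed RMS, which follows from the orthonormality of the basis functions. The use of a Gaussian draw for the direction and of uniform distributions for the rigid-body deviations are assumptions; the text only states that the values are random within the given ranges.

```python
import numpy as np

def random_surface_coeffs(rms_nm, n_modes=42, rng=None):
    """Random coefficients a_4..a_45 whose root-sum-square equals rms_nm."""
    rng = np.random.default_rng() if rng is None else rng
    a = rng.standard_normal(n_modes)              # random direction (assumed Gaussian)
    return a * (rms_nm / np.sqrt(np.sum(a ** 2)))

def random_rigid_deviation(rng, trans_nm=50.0, tilt_urad=0.6):
    """Translation (nm) and tip/tilt (microradian) deviations, assumed uniform."""
    translation = rng.uniform(-trans_nm, trans_nm)
    tip_tilt = rng.uniform(-tilt_urad, tilt_urad, size=2)
    return translation, tip_tilt

rng = np.random.default_rng(0)
coeffs = random_surface_coeffs(rms_nm=91.0, rng=rng)   # lambda/10 at 910 nm
translation, tip_tilt = random_rigid_deviation(rng)
```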

Figure 3a–d shows the FDs of the four paths separately before any optimizations. It should be noted that CCD detects the incoherent superposition of the four distributions and the optimization process uses the CCD frame data shown in Fig. 3e. The process of coarse WFC consists of 10 episodes, each including 60 iterations. In this step, the dithering amplitude decreases from 60 to 30 nm and the gain is fixed at 2.5. Figure 3f–i shows the results after coarse WFC, in which a single spot emerges for each path and can be identified robustly. After coarse WFC, the BI algorithm (\(\Delta \alpha = \Delta \beta = 0.1\mu {\text{rad}}\)) is repeated 10 times for each path to bring all the spots to the center of CCD. Then, the spots can be moved to the four corners away from center as shown in Fig. 3j.

Fig. 3

2D fluence distributions of the 910 nm beam. a–d Four paths before any optimizations and e their incoherent superposition on CCD; f–i after coarse wavefront corrections; j after beam identification; k–n after fine wavefront corrections; o after beam identification; p–s after coherent beam combining. t The final coherently combined beam. The units of axes are micrometer

During fine WFC, one round of optimization consists of 20 episodes, each including 60 iterations. The dithering amplitude decreases from 24 to 6 nm with the gain fixed at 2.5. After one round of optimization for each path, the peak values of the fluence distributions are all about 90% of the ideal values, as shown in Fig. 3k–n, and the differences between peak values are within 5% of the highest peak value. If the peak fluence of some path is lower than 95% of that of the current best path, a new round of optimization is started for that path. In this way, a kind of “competition” among the four paths is formed to ensure the best final result within limited time.
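The 95% rule that drives this competition can be written compactly. In the Python sketch below, `rerun_round` is a hypothetical callable that runs one more round of fine WFC for a path and returns its new peak fluence, and `max_rounds` is an assumed cap standing in for the limited time available.

```python
def paths_needing_new_round(peaks, threshold=0.95):
    """Indices of paths whose peak fluence is below threshold times the best peak."""
    best = max(peaks)
    return [i for i, p in enumerate(peaks) if p < threshold * best]

def compete(peaks, rerun_round, threshold=0.95, max_rounds=10):
    """Re-optimize lagging paths until all are within the threshold or rounds run out."""
    peaks = list(peaks)
    for _ in range(max_rounds):
        lagging = paths_needing_new_round(peaks, threshold)
        if not lagging:
            break
        for i in lagging:
            peaks[i] = rerun_round(i)   # hypothetical: one more round for path i
    return peaks
```

Because the reference is the current best path, an improvement in one path can push another path below the threshold, which is what makes the scheme competitive.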

After fine WFC, the BI algorithm is again repeated 10 times for each path to bring all the spots to the center of the CCD. This stage is necessary because WFC changes the location of the spot, so re-identification is needed. Figure 3o shows the result after BI, where the superposition is incoherent (the ideal peak value should be 4). Following BI is the stage of coherent combining. This stage consists of 12 episodes, each including 10 iterations. The dithering amplitude is fixed at 64 nm for translation and 0.1 \(\mu {\text{rad}}\) for pointing, and the gain is 2.5. After optimization, the focusing efficiency is about 85% for the 910 nm beam, as shown in Fig. 3t. The locations of the sub-beams can be seen in Fig. 3p–s. Continuing the optimization process may further improve the efficiency at the cost of time. A movie recording the results during the process can be watched in Visualization 1.

It should be noted that the choice of “hyper-parameters” such as the number of episodes in each round and the number of iterations in each episode is heuristic. They should be chosen based on practice, both in numerical simulations and in experiments. If each iteration takes a short time, a larger number of iterations can be chosen for each episode. In general, the total number of iterations (summed over all episodes) will decrease for a system with small aberrations and high stability.

4.5 Performance assessment

To assess the optimized configuration, the propagation of a pulse with the full spectrum (825–1025 nm sampled at 52 wavelengths) through the system is simulated. Figure 4 shows the 3D spatial-spectral and spatiotemporal intensity distributions of the coherently combined pulse and each sub-pulse at the focus. The pulse duration is 23 fs and the peak intensity is 28.1% of that in the ideal situation. This efficiency is too low for a CBC system. Although the optimization method yields a configuration with 85% combining efficiency at 910 nm, this high efficiency is not maintained at other wavelengths, as shown in Fig. 4. This is due to the spatiotemporal coupling (STC) effects induced by surface deformations of the optical gratings, which are already well known in single-beam PW lasers [49,50,51].

Fig. 4

3D intensity distributions of the single pulse in each path (p1–p4) and the coherent combined pulse at the focus after 12 episodes of coherent combining optimization. The unit of time axis is converted to length. The 3D distributions are shown in two isosurfaces, i.e., 20% and 50% of peak value

For the sub-pulses of the four paths, the focusing efficiencies are 56.6%, 72.9%, 75.2% and 61.5%, respectively. The paths differ because the surface deformations of the optics are generated randomly. The STC in path 1 is more difficult to compensate than in the other paths. When the pulses are coherently combined, the efficiency (28.1%) is barely half of the lowest single-pulse value (56.6%). This is due to another level of mismatch, i.e., the mismatch of the phase differences among the four beams at each wavelength. It is manifested as an obvious pulse-front tilt (PFT) in the \(\lambda - {\text{X}}\) plane compared with the single-pulse situations.

It is possible to improve the efficiency by balancing the performance at multiple wavelengths. This can be done simply by using the sum of the FDs at multiple wavelengths as the feedback information, which is natural in a real experiment because it is a pulse that propagates through the system and is detected by the CCD. This situation is demonstrated with simulations using 4 different wavelengths (i.e., 868 nm, 910 nm, 957 nm and 1008 nm) in ray tracing and optimization. The focusing efficiencies of the sub-beams and the combined beam change to 62.9%, 76.0%, 78.4%, 65.3% and 31.3%, respectively, an improvement of about 3 percentage points in the combined efficiency. However, even if the whole spectrum is taken into account, the combining efficiency may not surpass 40%. This low performance is inherent in the configuration, particularly due to the surface deformation of the optical gratings.

To justify this point, a new configuration is generated in which the RMS of the surface deformation of each non-grating optic increases from \(0.1\lambda\) to \(0.16\lambda\), but the surfaces of the 16 gratings are perfectly flat. The same workflow is applied to this configuration. After optimization, the focusing efficiencies of the sub-beams and the combined beam increase to 88.8%, 92.0%, 89.6%, 87.2% and 68.0%, respectively. The CBC efficiency is more than double the previous result, and it increases to 72% if the translation dithering amplitude is reduced from 64 to 32 nm (reducing it further does not increase the CBC efficiency). Figure 5 shows the 3D intensity distributions of each path and of the combined pulse. The energy concentration at each wavelength improves considerably compared with Fig. 4, indicating that the STC effect is small. Therefore, in order to achieve a decent CBC efficiency, the surface deformation of the gratings must be corrected (by active control or better manufacturing). According to the current and previous simulations, the RMS of the grating surface deformation should be less than \(0.06\lambda\) to obtain a decent combining efficiency (> 50%).

Fig. 5

3D intensity distributions of the single pulse in each path (p1–p4) and the coherent combined pulse at the focus after 12 episodes of coherent combining optimization. The unit of time axis is converted to length. The 3D distributions are shown in two isosurfaces, i.e., 20% and 50% of peak value

In general, the proposed method optimizes the CBC system to its best achievable performance in an automatic way. This may help manage a large-scale complex laser system such as a PW CBC system and ensure a robust alignment performed on a daily basis.

5 Discussions

Most large optical elements have high-frequency surface deformations. Although 42 orthonormal polynomials are used to synthesize the surface deformation, they cannot represent high-frequency modulations. Likewise, high-frequency wavefront modulation cannot be removed by DMs in real experiments. Fortunately, the above method is model-free: it makes the best use of the DMs to correct whatever deformation is present. Therefore, if future technologies allow DMs to correct high-frequency modulation, the proposed method will make good use of them.

A grating with a reduced aperture is also recommended to achieve a better surface figure and hence better CBC efficiency. This would also mean that a larger number of beams is needed to achieve the same level of power, which further complicates the system. On the other hand, this method can be applied to a larger number of beams, as long as the overall numerical aperture (N.A.) of the combined beam is limited to some value determined by engineering practice. The N.A. in this work is less than 0.3. For a system with larger N.A., it is necessary to divide the beams into groups so that each group has a smaller N.A. and can be optimized separately. A stitching procedure may be needed between adjacent groups.

Most of the time the laser is in a maintenance state in which a weak probe beam with a high repetition rate is available for alignment purposes. However, a full-energy laser shot will have a different wavefront from the probe beam, and this wavefront has to be measured and monitored at places off the main optical path and through different paths. As a result, it is also necessary to correct non-common-path aberrations (NCPA) in a real experiment [52]. The characterization of NCPA relies on high-quality optimization of the probe field at the focal plane. Only after this optimization can the reference wavefront at the monitoring places be determined accurately.

New implementations of the method are still needed to improve the correction speed and to cope with difficult situations, including dynamical disturbances, which are not considered in this work. One advantage of the current workflow is that it can help collect large quantities of data automatically for training a model based on machine learning (ML) techniques [48]. ML models are known for their fast response and accuracy. Once trained, such a model can be incorporated into the current method to form a hybrid method that is expected to be much more effective.

6 Conclusion

This work proposes a method for automatically aligning a femtosecond multi-petawatt coherent beam combining system under quasi-static conditions. The method uses only the fluence distribution in the far field as the feedback for the control of a deformable mirror and a reflective mirror in each optical path, which may simplify the already very complex system. The feasibility of the method is demonstrated through numerical examples. It is also found that spatiotemporal coupling effects induced by gratings have a large negative impact on coherent combining efficiency. This work provides a framework for managing large-scale coherent combining laser systems and may benefit relevant projects around the world.