1 Introduction

A radio-frequency quadrupole (RFQ)-based accelerator has been developed as a test stand facility for the 100-MeV proton linear accelerator at the Korea Multi-purpose Accelerator Complex (KOMAC) since the linac started user service [1, 2]. Low-energy beam transport (LEBT) is generally considered a major bottleneck to achieving higher beam current in a high-intensity linac because of strong space charge effects, which require mitigation schemes such as an octupole magnetic field [3] and the beam spinning effect [4,5,6]. Experimental studies have been conducted to measure and optimize Twiss parameters, including beam emittance, under various operating conditions of ion sources or the LEBT [7,8,9,10,11]. The design and measurement results of basic beam diagnostics have been presented for low-energy beams [12]. Electric-sweep scanners, such as the Allison-type scanner [13], and the pepper-pot method [14] are advanced techniques for measuring beam emittance in phase space. The solenoid scan method [15, 16] is another technique, based on a lens approximation, that measures beam sizes while varying the strength of a solenoid magnet located upstream. In a study using an Allison scanner, the space charge neutralization effect was observed to evolve over time within a beam pulse, depending on the input flow rate of a neutralizing gas [7]. However, these measurement methods require time-consuming scans, and because they are invasive and block the beam, they are difficult to apply simultaneously with the beam matching process. Non-invasive beam diagnostics methods instead measure ionization products generated by the interaction between the residual gas and the beam. The ionization profile monitor (IPM) measures the beam profile by applying an electric field to the gas–beam interaction region and collecting the ionization products with a position-sensitive detector. A beam-induced fluorescence monitor (BIFM) captures fluorescence photons with a light collection device such as a photodiode or CCD camera [17, 18].

Machine learning is a promising approach to diagnostic analysis and optimization in modern accelerator physics, in contrast to traditional optimization methods, which require specific rules or threshold values [19, 20]. In previous studies, a deep learning model predicted beam current measurements in a Faraday cup from ion source operating variables [21]. Other studies built machine learning models that predict RFQ transmission using LEBT control values and gas pressure measurements as input variables [22, 23]. However, as device characteristics change over time, pre-trained models gradually become inaccurate and may need to be updated periodically with recent data to maintain optimal performance. Adaptive machine learning was developed to control time-varying systems of particle accelerators and beams by comparing a model's prediction of beam sizes with measurements [24].

This study proposes non-invasively measured beam profiles as important features for representing the time-varying properties of the ion source and LEBT, using a beam-induced fluorescence monitor as the key low-energy beam diagnostic. If a beam profile measurement is included in the input layer, rather than device control variables alone, relatively accurate feedforward predictions can be made without a complex model to predict or feedback-control the time-varying system. For such models to be valid over large parameter spaces, non-invasive and real-time measurement techniques are important, and preliminary experimental results are presented in Section 2. Before establishing a robust machine learning model of the time-varying system based on long-term experimental beam data, this study develops and validates a machine learning model based on randomly varied initial beam properties in the beam dynamics simulation tool TraceWin. In Section 3, several machine learning regressors, including tree-based ensembles, boosting algorithms, and deep neural networks, are compared in terms of their error in predicting the beam transmission ratio through the RFQ accelerator at KOMAC. Section 4 compares the LEBT tuning map predicted by the deep neural network with the tuning map obtained by a parameter scan.

2 Non-invasive beam profile monitor

In the RFQ accelerator at the KOMAC test stand, three types of beam diagnostic devices (a beam current monitor, a scintillator-based beam profile monitor, and a beam-induced fluorescence monitor) have been operated to analyze low-energy beam transport since the helium beam was first commissioned. Pulsed beam currents are measured by a Bergoz in-flange AC current transformer, model ACCT-CF8″-96.0-40-UHV. The scintillator-based beam profile monitor is an invasive diagnostic that images the transverse beam profile with a quartz plate installed at an angle of 45° to the beam axis; the plate is connected to a linear stepper motor that moves it vertically.

A beam-induced fluorescence monitor (BIFM), consisting of a view port, lens, and camera attached in sequence, was installed on the nearest flange, about 265 mm from the beam extraction aperture, as illustrated in Fig. 1, to measure the characteristics of beam formation. Gas is injected into the 2.45 GHz microwave ion source for a continuous plasma discharge, and the neutral gas–beam interaction occurs with the residual gas flowing toward the vacuum pump located in the low-energy beam transport section. Transverse spatial beam profiles can be estimated by imaging the fluorescence photons induced by this interaction. Unlike the invasive method, the BIFM can measure the beam profile in real time without blocking the beam.
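As a minimal sketch of how a beam size could be extracted from such an image, the snippet below computes the centroid and RMS width of a one-dimensional projected intensity profile. The function name, the constant-background subtraction, and the synthetic Gaussian profile are assumptions made for illustration; they are not the analysis procedure used in this study.

```python
import math

def rms_beam_size(positions_mm, intensities, background=0.0):
    """Return (centroid, rms size) in mm of a 1-D projected profile."""
    # Subtract a constant background (an assumption) and clip negatives to zero.
    weights = [max(value - background, 0.0) for value in intensities]
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("profile has no signal above background")
    centroid = sum(x * w for x, w in zip(positions_mm, weights)) / total
    variance = sum(w * (x - centroid) ** 2
                   for x, w in zip(positions_mm, weights)) / total
    return centroid, math.sqrt(variance)

# Synthetic Gaussian profile (centre 1 mm, sigma 4 mm) standing in for
# a camera image summed along one axis; these are not measured data.
xs = [0.1 * k - 30.0 for k in range(601)]
profile = [math.exp(-0.5 * ((x - 1.0) / 4.0) ** 2) for x in xs]
centroid, sigma = rms_beam_size(xs, profile)
```

With a real image, the position axis would first have to be calibrated from pixels to millimetres, and the background level would need to be estimated from a beam-off frame or from the image edges.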

Fig. 1

Layout of low-energy beam diagnostics for ion beam injector in the radio-frequency quadrupole-based accelerator test stand at KOMAC

In the recent beam commissioning phase, the non-invasive measurement was performed successfully without additional aids such as a gas jet or an image intensifier, because the moderate base pressure of about 0.1 mTorr is sufficient to observe fluorescence light at the BIFM near the ion beam extraction system; since no additional gas is injected, no degradation of beam quality is expected. Helium beam size measurements are shown in Fig. 2 as a function of the forward RF magnetron power and the solenoid current fed into the 2.45 GHz microwave ion source. These two variables are important control variables for the plasma properties, the plasma meniscus, and beam extraction, since the solenoid current determines the static magnetic field and the RF magnetron power forms the electromagnetic field, which affects electron cyclotron resonance heating. When the solenoid strength of the ion source is fixed at 73.0 A, the optimal value for achieving high current and small beam size, the forward RF power has little effect on the beam size, within the measurement error of about 0.2 mm, and only a slight effect on the beam current. In contrast, at a constant forward RF power of 200 W, the solenoid strength affects both the beam size and the beam current beyond the measurement errors. These results imply that the beam diagnostics represent the features of the initial beam well over several operating conditions of the ion source. Nevertheless, because of unknown or unmeasurable variables that affect ion beam extraction, such as the aging of device elements including the ion source, the output beam characteristics drift with time even when the ion source control variables are fixed. Therefore, when creating a data-driven machine learning model for LEBT tuning, beam profile data, not ion source control variables, should be included in the input layer so that the model can predict the best beam matching condition with the RFQ accelerator even under time-varying initial beam conditions.

Fig. 2

Helium beam size measurement data (a) at the fixed solenoid strength of 73.0 A and (b) at the fixed forward RF power of 200 W fed into the microwave ion source

To develop and validate a machine learning-based low-energy beam transport tuning model, this study uses beam data generated by TraceWin, a beam dynamics simulation tool, simulating the beam operating conditions of the RFQ-based accelerator at KOMAC. The transverse momentum of a high-intensity beam increases nonlinearly even in a drift space with no external field because of nonlinear self-generated fields. A beam envelope calculation solves for the beam matrix by assuming a linear space charge force, whereas a multi-particle beam is normally solved with the particle-in-cell method, which accounts for the nonlinear space charge field. Figure 3 presents the calculated helium ion beam current passing through the extraction aperture and the beam sizes at the locations of the non-invasive beam profile monitor and the first solenoid magnet in the multi-particle beam dynamics simulation. The results reveal a nonlinear relationship between the beam parameters measured by the beam diagnostics and the beam parameters at the input of the first solenoid magnet, which affect RFQ matching.
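For orientation, the linear space charge assumption behind an envelope calculation can be illustrated with the standard textbook envelope equation for an axisymmetric beam of uniform density. The symbols below are introduced here for illustration and are not taken from this study: R is the edge radius of the equivalent uniform beam, k(s) the external focusing strength (zero in a drift), ε the unnormalized edge emittance (four times the RMS emittance), K the generalized perveance, I the beam current, q and m the ion charge and mass, and β and γ the relativistic factors.

```latex
R'' + k(s)\,R - \frac{K}{R} - \frac{\varepsilon^{2}}{R^{3}} = 0,
\qquad
K = \frac{q I}{2\pi \varepsilon_{0}\, m c^{3} \beta^{3} \gamma^{3}}
```

The space charge term K/R follows from a force that is linear in radius inside the beam. A multi-particle simulation drops this assumption, which is why the simulated beam sizes need not follow the envelope solution.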

Fig. 3

Multi-particle beam dynamics simulation data: helium ion beam current passing through the electrode aperture and beam size at the non-invasive beam profile monitor and the first solenoid magnet

3 Searching for machine learning algorithms

A robust machine learning model is needed to predict the beam matching condition with the RFQ accelerator even when the initial beam properties vary. For the machine learning models, the input layer consists of the input beam Twiss parameters, the input beam current, and the LEBT solenoid strength, and the output layer is the beam transmission ratio through the RFQ entrance. Datasets were prepared by a grid search over a range of input beam parameters and LEBT control parameters in the beam dynamics simulation code TraceWin [25]. The beam transmission ratio is the number of particles surviving the subsequent radio-frequency quadrupole (RFQ) divided by the number of particles injected into it. The goal of LEBT tuning, or beam matching, is to achieve a 100% beam transmission ratio from the LEBT to the RFQ, i.e., a 0% beam loss ratio, by adjusting ion source or LEBT control parameters. Machine learning-based regression models were created to predict the transmission ratio as a target feature ranging from 0 to 1. For this regression problem with four numerical input features, several predictive models were built and their performances were compared on the machine learning platforms scikit-learn, PyCaret, and TensorFlow-Keras.
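The dataset construction described above can be sketched as follows. This is an illustration under stated assumptions: the feature names, axis values, and helper functions are hypothetical placeholders rather than the study's actual inputs, and the transmission ratio is taken as surviving particles divided by injected particles.

```python
from itertools import product

def transmission_ratio(n_injected, n_surviving):
    """Fraction of injected particles that survive, between 0 and 1."""
    if n_injected <= 0:
        raise ValueError("n_injected must be positive")
    if not 0 <= n_surviving <= n_injected:
        raise ValueError("n_surviving must lie between 0 and n_injected")
    return n_surviving / n_injected

def build_grid(axes):
    """Cartesian product of named parameter axes, as a list of dicts."""
    names = list(axes)
    return [dict(zip(names, values))
            for values in product(*(axes[name] for name in names))]

# Hypothetical axes: names and ranges are placeholders, not the study's values.
axes = {
    "beam_current_mA": [10.0, 15.0, 20.0],
    "rms_size_mm": [4.0, 5.0, 6.0, 7.0],
    "solenoid1_A": [100.0, 110.0],
    "solenoid2_A": [120.0, 130.0],
}
grid = build_grid(axes)  # each entry is one simulation run to launch
```

Each grid point would be passed to the beam dynamics code, and the resulting particle counts converted to a target value with `transmission_ratio`.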

Figure 4 and Table 1 summarize the results of exploring machine learning-based regression algorithms with ten-fold cross-validation using PyCaret. The beam dynamics datasets were normalized and split into 70% training and 30% test datasets. Metrics such as MAE, MSE, RMSE, and the R2 score were averaged after training with 10 iterations and ten-fold cross-validation, with mean squared error as the loss function. Average inference times are less than 0.3 s for all models. The extra-trees model shows the best performance, with a root mean square error (RMSE) of 1.5e-2 on the training datasets. Unlike random forest, the extra-trees model chooses split thresholds at random for the candidate features and, by default, trains each tree on the full sample without bootstrap resampling [26]. However, both extra trees and random forest produced poor results on the test datasets because of over-fitting. Among the boosting-based methods, the light gradient boosting machine (LightGBM) performed best, with an RMSE of 2.7e-2 compared with the stochastic gradient boosting regressor (RMSE = 9.3e-2) and the AdaBoost regressor (RMSE = 1.5e-1), because LightGBM efficiently down-samples instances with small gradients [27]. The deep neural network (DNN) model was trained with four normalized inputs and three dense layers of 100/100/50 nodes at a fixed learning rate with the ReLU activation function and the Adam optimizer, resulting in an RMSE of 2.8e-2. The DNN model appears less prone to over-fitting, since the difference between its R2 scores on the training and test datasets is smaller than those of the tree-based ensemble methods.
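The metrics named above and the ten-fold split can be written out in a few lines. The sketch below is a plain-Python illustration, not PyCaret's implementation; the function names and the small example arrays are assumptions.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R2 for two equal-length sequences."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("R2 is undefined for a constant target")
    r2 = 1.0 - (mse * n) / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse), "R2": r2}

def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, val_idx) for k contiguous folds.

    The samples are assumed to have been shuffled beforehand.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size

# Tiny made-up example, only to exercise the functions.
y_true = [0.0, 0.5, 1.0, 0.5]
y_pred = [0.1, 0.4, 0.9, 0.6]
metrics = regression_metrics(y_true, y_pred)
folds = list(k_fold_indices(25, k=10))
```

Averaging `regression_metrics` over the validation part of each fold gives the cross-validated scores, while the held-out 30% test set is evaluated once at the end.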

Fig. 4

Learning curves, residuals, and test predictions of three machine learning-based regression models trained on beam dynamics simulation datasets with ten-fold cross-validation in PyCaret: (a) extra trees, (b) LightGBM, (c) deep neural networks (DNN)

Table 1 Comparison of several machine learning-based regression models trained for beam dynamics simulation datasets using PyCaret with ten-fold cross-validation

4 Prediction for low-energy beam tuning using deep neural network model

Deep neural networks are generally expected to generalize better than tree-based algorithms in regression problems with continuous target variables. This is because a tree partitions the data by thresholds on the attributes at each node and therefore outputs piecewise-constant predictions, whereas a DNN composes activation functions such as ReLU or tanh into a continuous mapping. In this context, to build a DNN model that outperforms those obtained with PyCaret and the scikit-learn library, we adopt Keras, the high-level API of TensorFlow, and additionally use AutoKeras, which tunes hyper-parameters automatically. The search covered 100 hyper-parameter combinations, each trained for 50 epochs, including the use of batch normalization, the dropout rate, the number of layers, the number of nodes, the optimizer, and the learning rate, to minimize the mean squared error (MSE). The best model found by AutoKeras is illustrated in Fig. 5 with its learning curve and prediction results. The Adam optimizer was adopted, and the learning rate decayed to the minimum of 1.0e-5 preset in the ReduceLROnPlateau learning rate scheduler. Training was terminated early at the 448th epoch by a callback function to prevent the model from over-fitting. The tuned deep neural network (DNN) model is more accurate and less over-fitted than the tree ensemble-based methods. It achieves an RMSE of 2.5e-3, which is six times lower than that of the extra-trees model.
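The interplay of learning-rate reduction on a plateau and early stopping can be illustrated with a simplified controller. Keras offers this behaviour through its ReduceLROnPlateau and EarlyStopping callbacks; the class below is a plain-Python sketch of the idea, not the Keras code. All parameter values except the 1.0e-5 learning-rate floor are hypothetical, and the loss sequence is synthetic.

```python
class PlateauController:
    """Reduce the learning rate on a validation-loss plateau and stop early."""

    def __init__(self, lr, factor=0.5, lr_patience=3, stop_patience=8,
                 min_lr=1.0e-5):
        self.lr = lr
        self.factor = factor
        self.lr_patience = lr_patience
        self.stop_patience = stop_patience
        self.min_lr = min_lr
        self.best = float("inf")
        self.since_best = 0    # epochs since the best loss (early stopping)
        self.since_reduce = 0  # stalled epochs since the last LR change

    def update(self, val_loss):
        """Record one epoch; return True if training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.since_best = 0
            self.since_reduce = 0
            return False
        self.since_best += 1
        self.since_reduce += 1
        if self.since_reduce >= self.lr_patience:
            # Shrink the learning rate but never below the preset floor.
            self.lr = max(self.lr * self.factor, self.min_lr)
            self.since_reduce = 0
        return self.since_best >= self.stop_patience

controller = PlateauController(lr=1.0e-3)
losses = [1.0, 0.5, 0.4] + [0.4] * 20  # synthetic: improves, then stalls
stopped_at = None
for epoch, loss in enumerate(losses, start=1):
    if controller.update(loss):
        stopped_at = epoch
        break
```

In this synthetic run the loss stops improving after the third epoch, the learning rate is halved twice, and training halts eight stalled epochs later.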

Fig. 5

Deep neural network (DNN) model tuned by AutoKeras and training results: (a) model layout, (b) training history, (c) test prediction

Figure 6 shows the LEBT tuning data as 2-D density plots of the beam transmission ratio from the LEBT to the RFQ as a function of the currents of the two solenoid magnets that make up the LEBT. The figures compare test data obtained by a parametric scan of the LEBT solenoid strengths with predictions inferred by the tuned DNN model. Two cases are shown: a high transmission case, in which the initial beam current from the ion source is 10 mA and the root mean square (RMS) beam size at the BIFM position is 4.25 mm, and a low transmission case, in which the beam current is 20 mA and the RMS beam size is 6.73 mm. The beam tuning maps are successfully reconstructed in both of these contrasting cases. In particular, whereas a parametric scan is a time-consuming data generation process in actual beam experiments, the data-driven DNN model is expected to infer data immediately, with finer intervals and better precision.
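How a tuning map and an optimal set point might be obtained from a trained regressor is sketched below. The predictor `toy_predict` is a made-up smooth function standing in for the DNN, and the solenoid ranges and function names are hypothetical; only the two beam conditions (10 mA with 4.25 mm, 20 mA with 6.73 mm) are taken from the text.

```python
import math

def toy_predict(beam_current_mA, rms_size_mm, sol1_A, sol2_A):
    """Stand-in for a trained regressor: a smooth bump, NOT the paper's model.

    rms_size_mm is accepted for interface parity but unused in this toy.
    """
    peak = min(1.0, 10.0 / beam_current_mA)
    distance_sq = (sol1_A - 110.0) ** 2 + (sol2_A - 125.0) ** 2
    return peak * math.exp(-distance_sq / 200.0)

def tuning_map(predict, beam_current_mA, rms_size_mm, sol1_values, sol2_values):
    """Predicted transmission on a solenoid grid; rows follow sol1_values."""
    return [[predict(beam_current_mA, rms_size_mm, s1, s2) for s2 in sol2_values]
            for s1 in sol1_values]

def best_set_point(grid, sol1_values, sol2_values):
    """Return (sol1, sol2, transmission) at the maximum of the map."""
    candidates = ((value, s1, s2)
                  for row, s1 in zip(grid, sol1_values)
                  for value, s2 in zip(row, sol2_values))
    value, s1, s2 = max(candidates, key=lambda item: item[0])
    return s1, s2, value

sol1_values = [100.0 + 5.0 * k for k in range(5)]  # hypothetical range, A
sol2_values = [115.0 + 5.0 * k for k in range(5)]  # hypothetical range, A
grid = tuning_map(toy_predict, 10.0, 4.25, sol1_values, sol2_values)
best = best_set_point(grid, sol1_values, sol2_values)
```

Because evaluating a trained model is cheap, the grid spacing can be made much finer than in a measured scan before the maximum is taken.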

Fig. 6

Comparison of low-energy beam tuning maps from (a) test data and (b) prediction data by the tuned DNN model in a high transmission case and a low transmission case

5 Conclusion

Machine learning techniques were applied to tune the low-energy beam under various operating conditions. Among the machine learning algorithms compared, the deep neural network model performed best in low-energy beam tuning. Given a data-driven deep neural network model, the total inference time needed to obtain beam tuning maps and optimal magnet set points is normally less than half a second, which corresponds to the period of operation at a 2 Hz pulse repetition rate. The methodology presented in this study can facilitate real-time, high-accuracy optimization in accelerator systems that are sensitive to variability. Further experiments will be performed with helium and deuterium beams in the RFQ-based accelerator test stand at KOMAC.