1 Introduction

The photon is an ideal probe for investigating nuclear structure, and it has many other applications as well. Although photon beams can be produced in various ways, bremsstrahlung production is the most commonly used method. When an accelerated electron beam hits a target, the so-called radiator, bremsstrahlung photons are produced as the kinetic energy of the projectile electrons is converted into electromagnetic energy.

The intensity of the produced bremsstrahlung photons depends on the radiator material and thickness. The production of bremsstrahlung photons can be investigated experimentally, or using various simulation techniques.

Producing bremsstrahlung experimentally in the laboratory is difficult because of the cost and the special conditions required. Simulation techniques therefore have an advantage for this purpose.

A large number of works have used the artificial neural network (ANN) method to obtain analytical data from the results of Monte Carlo simulations. For example, Doostmohammad et al. [1] used Monte Carlo and ANN methods to simulate qualitative prompt gamma neutron activation analysis, and Hussain [2] developed an ANN model from Monte Carlo computed bremsstrahlung spectra for different medical accelerators.

ANN concepts emerged from mathematical modeling of the learning process of the human brain. The modeled nerve cells imitate real neurons, and the cells form a network by being connected to each other. These networks are capable of learning, memorizing, and revealing the relations between data.

In this work, an ANN has been used to model bremsstrahlung photon flux values obtained with the FLUKA simulation code [3] for Ta targets of different thicknesses.

2 Material and method

2.1 Monte Carlo simulation code FLUKA

FLUKA is a tool for calculations of particle transport and interactions with matter, covering an extended range of applications spanning from proton and electron accelerator shielding to target design, calorimetry, activation, dosimetry, detector design, etc. FLUKA can simulate about 60 different particles with high accuracy, including photons, electrons, neutrons, heavy ions, and antiparticles. The program can also transport polarized photons (e.g., synchrotron radiation) and optical photons. The lowest transport limit for all particles is about 1 keV [4].

FLUKA uses an original transport algorithm for charged particles, including a complete treatment of multiple Coulomb scattering. It uses Bethe–Bloch theory for the energy loss mechanism, and it accounts for bremsstrahlung and electron pair production at high energy by heavy charged particles [5].

FLUKA can handle very complex geometries, using an improved version of the well-known Combinatorial Geometry (CG) package. It has a rich set of biasing options for electrons, photons, and neutrons. It scores fluence and current as a function of energy and angle via boundary-crossing, collision, and track-length estimators coincident with regions or region boundaries. It can also score track-length fluence in a binning structure (Cartesian or cylindrical) independent of the geometry.

Using FLUKA, bremsstrahlung photon spectra produced by a 15 MeV electron beam were obtained for Ta targets 2, 8, 12, and 18 μm thick. Each simulation was run with $10^{7}$ primary electrons, and the photon flux was then scored as a function of energy for each thickness. In the simulation, the Ta target is a simple disk of 8 mm radius (Fig. 1).

Fig. 1 The geometry of FLUKA for radiator

2.2 Artificial neural networks (ANNs)

ANNs are trained using the available data and are then tested with data not used during training. Because of their ability to learn and generalize, to tolerate errors, and to make use of faulty examples, ANNs have found a very wide range of applications in the modeling of both linear and nonlinear systems [6–8].

A neural cell consists of five main parts: the inputs, weights, total function, activation function, and output. Inputs are the information that enters a cell from other cells or from the external environment. The weights ($w_i$) indicate the importance of the information arriving at each input and its effect on the cell. The weights are shown in Fig. 2.

Fig. 2 The structure of artificial nerve cell

The total function calculates the net input arriving at the cell. Each input value is multiplied by its own weight, and the products are summed together with the threshold value. This function is expressed as follows:

$$ {\text{net}} = b + \sum\limits_{i = 1}^{n} {w_{i} } x_{i} $$
(1)

Here $x_i$ is the value of the $i$th input neuron, $w_i$ is the corresponding weight coefficient, $n$ is the total number of inputs to the cell, and Σ is the total (summation) function. The threshold value $b$, which is ±1, prevents the output of the network from being zero.
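As a minimal sketch of Eq. (1), the net input of a single cell can be computed as a dot product plus the threshold term. The function name `net_input` and the numerical values are illustrative assumptions, not taken from this study.

```python
import numpy as np

def net_input(x, w, b):
    """Eq. (1): threshold value plus the weighted sum of the inputs."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    if x.shape != w.shape:
        raise ValueError("x and w must have the same shape")
    return b + float(np.dot(w, x))

# Illustrative values only (assumed, not from the paper).
example_net = net_input(x=[0.5, 0.2], w=[0.4, -0.6], b=1.0)
print(example_net)
```

With these assumed values the two products are 0.20 and -0.12, so the net input is their sum plus the threshold of 1.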

Feed-forward ANNs comprise a system of neurons that are arranged in layers. Between the input and output layers, there may be one or more hidden layers. The neurons in each layer are connected to the neurons in the subsequent layer by weights $w$, which may be adjusted during training. A data pattern comprising the values $x_i$ presented at the input layer $i$ is propagated forward through the network toward the first hidden layer $j$. Each hidden neuron receives the weighted outputs $w_{ij} x_i$ from the neurons in the previous layer. These are summed to produce a net value, which is then transformed to an output value upon the application of an activation function [9]. A typical three-layer feed-forward ANN is shown in Fig. 3.

Fig. 3 A neural network structure showing three layers
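The forward propagation described above can be sketched layer by layer. In this sketch the layer sizes (3, 4, 2), the random weights, and the logistic activation passed to `forward` are assumptions chosen only for illustration; they are not the network of this study.

```python
import numpy as np

def logistic(net):
    """Logistic activation, used here as an assumed example."""
    return 1.0 / (1.0 + np.exp(-net))

def forward(x, layers, activation=logistic):
    """Propagate one input pattern through a list of (W, b) layers.

    W has shape (neurons in this layer, neurons in previous layer).
    """
    a = np.asarray(x, dtype=float)
    for W, b in layers:
        net = W @ a + b        # weighted sum arriving at each neuron
        a = activation(net)    # output passed on to the next layer
    return a

# Hypothetical three-layer network with random weights.
rng = np.random.default_rng(0)
sizes = (3, 4, 2)
layers = [
    (rng.normal(size=(n_out, n_in)), rng.normal(size=n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]
y = forward([0.1, 0.5, 0.9], layers)
print(y.shape)
```

Each pass through the loop corresponds to one set of connections in Fig. 3, so the same function covers networks with any number of hidden layers.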

The activation function determines the output of the cell by processing the net input obtained from the total function. In the multilayer perceptron model, the activation function f(net) is generally a sigmoid function. The output of the nerve cell is calculated with f(net) as follows:

$$ y_{i} = f({\text{net}}) = \frac{1}{{1 + e^{ - {\text{net}}} }}. $$
(2)
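A minimal sketch of Eq. (2) follows; the function name `sigmoid` is an assumption. It illustrates two properties that follow from the formula: a zero net input gives an output of 0.5, and the outputs for net and -net sum to 1.

```python
import numpy as np

def sigmoid(net):
    """Eq. (2): maps any real net input into the open interval (0, 1)."""
    net = np.asarray(net, dtype=float)
    return 1.0 / (1.0 + np.exp(-net))

outputs = sigmoid([-2.0, 0.0, 2.0])
print(outputs)
```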

The output is the value produced by the nerve cell, which is sent to the outside world or to another cell. An ANN consists of many connected nerve cells, and the cells are not combined at random. In general, an ANN consists of three layers, namely the input, hidden, and output layers, and the cells within each layer are arranged in parallel. Inputs are applied to the input layer, and outputs are obtained from the output layer. The layers between them, whose outcomes cannot be observed directly, are referred to as hidden layers, and there may be one or more of them [10, 11].

Theoretically, the number of hidden layers and hidden nodes in a neural network can be unlimited. However, as pointed out by Funahashi [12] and Hornik et al. [13], an ANN with a single hidden layer containing a sufficient number of nodes can approximate any functional relationship to any degree of accuracy. Therefore, ANNs designed with three layers, including only one hidden layer, are usually preferred in practical applications. Alternatively, experiments can be performed to determine the optimal ANN layout by starting with a single hidden layer and comparing validation errors as the number of hidden layers is gradually increased. A hidden layer is added and the network is retrained until the validation error is observed to bottom out and start increasing. The desirable number of hidden nodes is determined by the same procedure.
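The layout search described above can be sketched as a loop that stops once the validation error starts to rise. The callable `validation_error` stands in for training a network with a given layout and returning its validation error; it is a hypothetical, user-supplied function, and the error values below are made up purely to exercise the loop.

```python
def select_by_validation(candidates, validation_error):
    """Grow the network until the validation error stops improving.

    candidates: layouts ordered from smallest to largest.
    validation_error: callable that trains a network with the given
        layout and returns its validation error (hypothetical).
    """
    best_layout, best_error = None, float("inf")
    for layout in candidates:
        error = validation_error(layout)
        if error >= best_error:
            break  # error has bottomed out and started to increase
        best_layout, best_error = layout, error
    return best_layout, best_error

# Made-up error curve, not results from this study.
toy_errors = {1: 0.30, 2: 0.12, 3: 0.08, 4: 0.11, 5: 0.09}
best, err = select_by_validation([1, 2, 3, 4, 5], toy_errors.__getitem__)
print(best, err)  # 3 0.08
```

The same function applies to hidden nodes or hidden layers, depending on what each candidate layout encodes.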

Multilayer perceptron networks operate according to a supervised learning strategy, and they are trained according to the generalized delta rule. For the network to learn, a training set made up of samples is needed. For each sample in the training set, both the inputs and the outputs that the network should produce for those inputs are specified. The learning rule adjusts the weights so as to minimize the difference between the outputs the network produces during training and the outputs it should produce. During learning, the inputs are first presented to the network, and the corresponding outputs are produced. This process is called the forward calculation. Then the expected output is compared with the output produced, and the weights are changed by distributing the error between them backward through the network. This is called the backward calculation [14].
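The forward and backward calculations can be sketched for a network with one hidden layer and sigmoid cells. This is a simplified illustration, assuming plain batch gradient descent on the squared error without a momentum term. The hidden-layer size, learning rate, epoch count, and toy target curve are assumptions and do not reproduce the training performed in this study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, T, n_hidden=4, lr=0.5, epochs=2000, seed=0):
    """Train a one-hidden-layer sigmoid network by backpropagation."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.5, size=(n_hidden, T.shape[1]))
    b2 = np.zeros(T.shape[1])
    history = []
    for _ in range(epochs):
        # Forward calculation: present inputs, produce outputs.
        H = sigmoid(X @ W1 + b1)
        Y = sigmoid(H @ W2 + b2)
        err = T - Y
        history.append(float(np.mean(err ** 2)))
        # Backward calculation: distribute the error to each layer.
        delta_out = err * Y * (1.0 - Y)
        delta_hid = (delta_out @ W2.T) * H * (1.0 - H)
        # Weight updates, averaged over the training samples.
        W2 += lr * (H.T @ delta_out) / len(X)
        b2 += lr * delta_out.mean(axis=0)
        W1 += lr * (X.T @ delta_hid) / len(X)
        b1 += lr * delta_hid.mean(axis=0)
    return (W1, b1, W2, b2), history

# Toy training set (assumed): a smooth curve with targets inside (0, 1).
X = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
T = 0.2 + 0.6 * X ** 2
params, history = train(X, T)
print(history[0], history[-1])
```

The factors `Y * (1 - Y)` and `H * (1 - H)` are the derivatives of the sigmoid, which is why that activation pairs conveniently with this learning rule.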

In this study, an ANN model has been developed to model the photon flux obtained with the FLUKA code system. In this model, the target material, thickness, and electron energy are used as inputs, and the photon flux is the output. Before the model is run, the data are standardized according to the following expression so that all values fall between 0 and 1:

$$ F = \frac{{F_{i} - F_{\min } }}{{F_{\max } - F_{\min } }} $$
(3)

where $F$ is the standardized value of $F_i$, and $F_{\max}$ and $F_{\min}$ are the maximum and minimum values over all the observation sequences. The main reason for standardizing the data is that the variables are usually measured in different units. Standardizing the variables and recasting them into dimensionless units also removes the arbitrary effect of the measurement units on the similarity between objects [15].
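Eq. (3) can be sketched as follows, applied here to the four training thicknesses quoted in Sect. 2.1. The function names are assumptions, and the inverse mapping is included only because network outputs usually have to be converted back to physical units.

```python
import numpy as np

def min_max_scale(values):
    """Eq. (3): rescale values linearly so that they span [0, 1]."""
    values = np.asarray(values, dtype=float)
    v_min, v_max = values.min(), values.max()
    if v_max == v_min:
        raise ValueError("all values are equal; Eq. (3) is undefined")
    return (values - v_min) / (v_max - v_min)

def min_max_unscale(scaled, v_min, v_max):
    """Invert Eq. (3) to recover values in the original units."""
    return np.asarray(scaled, dtype=float) * (v_max - v_min) + v_min

thickness_um = [2, 8, 12, 18]  # training thicknesses from the text
scaled = min_max_scale(thickness_um)
print(scaled)  # [0.    0.375 0.625 1.   ]
```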

An ANN architecture is generally denoted (i, j, k), where i is the number of neurons in the input layer, j the number of neurons in the hidden layer, and k the number of neurons in the output layer. In this study, the input layer has i = 2 neurons, the two hidden layers have $j_1$ = 4 and $j_2$ = 5 neurons, respectively, and the output layer has k = 1 neuron.
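Reading the layer sizes above as a (2, 4, 5, 1) network, the weight-matrix shapes and the number of adjustable parameters can be tallied as in the sketch below. It assumes fully connected layers with one threshold value per neuron, which the text does not state explicitly.

```python
def layer_shapes(sizes):
    """Shape (neurons out, neurons in) of each weight matrix."""
    return [(n_out, n_in) for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def count_parameters(sizes):
    """Weights plus one threshold value per non-input neuron (assumed)."""
    return sum(n_out * n_in + n_out for n_out, n_in in layer_shapes(sizes))

architecture = (2, 4, 5, 1)
print(layer_shapes(architecture))      # [(4, 2), (5, 4), (1, 5)]
print(count_parameters(architecture))  # 43
```

Under these assumptions the count is 8 + 4 for the first hidden layer, 20 + 5 for the second, and 5 + 1 for the output neuron.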

3 Results and discussion

The photon flux from Ta targets of different thicknesses has been estimated using the developed ANN model. For training, energy values from 1 to 15 MeV were used as input, and the flux values for the 2-, 8-, 12-, and 18-μm-thick Ta targets were used as output. Fig. 4 compares the FLUKA results with the ANN estimates for the training data, and the fluxes obtained by the ANN and by FLUKA are in good agreement. The FLUKA and ANN results for the 5-, 10-, and 20-μm-thick Ta targets are displayed as a function of bremsstrahlung photon energy in Figs. 5, 6, 7.

Fig. 4 Correlation between simulated Flux (Part/GeV/cmq/pr) results by FLUKA and ANN model at the end of training

Fig. 5 The obtained flux results with FLUKA and ANN for 5 μm as a function of energy

Fig. 6 The obtained flux results with FLUKA and ANN for 10 μm as a function of energy

Fig. 7 The obtained flux results with FLUKA and ANN for 20 μm as a function of energy

As these figures show, the FLUKA results and the estimates of the developed ANN model are consistent. The correlation graphs in Figs. 8, 9, 10 show the agreement between the two sets of results, and in each case the coefficient of determination $R^2$ is over 95 %.

Fig. 8 Correlation of Flux (Part/GeV/cmq/pr) values between FLUKA and ANN model for 5 μm

Fig. 9 Correlation of Flux (Part/GeV/cmq/pr) values between FLUKA and ANN model for 10 μm

Fig. 10 Correlation of Flux (Part/GeV/cmq/pr) values between FLUKA and ANN model for 20 μm
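An $R^2$ value of the kind quoted for Figs. 8, 9, 10 can be computed from paired FLUKA and ANN flux values. The sketch below assumes that $R^2$ is the squared Pearson correlation of the two series, as is usual for a linear correlation plot; the text does not state which definition was used, and the input numbers are placeholders rather than results from this study.

```python
import numpy as np

def r_squared(reference, predicted):
    """Squared Pearson correlation between two equally long series."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    r = np.corrcoef(reference, predicted)[0, 1]
    return float(r ** 2)

# Placeholder series only, chosen so the result is easy to check by hand.
print(r_squared([1.0, 2.0, 3.0], [1.0, 3.0, 2.0]))  # 0.25 up to rounding
```

A value above 0.95 under this definition means that more than 95 % of the variance in one series is explained by a straight-line relation to the other.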

It can be concluded from this work that the bremsstrahlung photon flux can be estimated using the developed ANN model.