Introduction

Drug release and dissolution are critical for dosage forms such as tablets, capsules, creams, ointments, and implants1. Modified-release (MR) dosage forms are designed to release the drug gradually and steadily over the prescribed duration, ensuring that the medication remains effective over a prolonged period without causing adverse effects or sudden spikes in drug concentration in the body2,3. Several process and formulation variables are involved in the drug development process; the best combination of ingredients is found by using a multivariate optimization method4. The pharmaceutical industry is concentrating on developing new technologies for oral drugs at low cost and in a minimal amount of time5. Currently, pharmaceutical formulation development relies on trial-and-error techniques, and the discovery, development, and maintenance of this process require a lot of time, money, and labour6,7. However, with the right approach and resources, it is possible to streamline this process and make it more efficient. Reducing healthcare costs and increasing production of Active Pharmaceutical Ingredients (APIs) can be quite challenging for the pharmaceutical industry. To develop successful strategies for manufacturing a new drug product, the industry must determine the ideal formulations. Traditional methods such as the Response Surface Method (RSM) and Composite Experimental Design (CED) face challenges in modelling the interactions among these variables8. One of the challenges in modelling a drug formulation is understanding the relationships between process variables and unique pharmacological responses9. Through the formulation of drug compositions, pharmaceutical experts may develop crucial features of novel medications, such as higher absorption and controlled administration10. Drug release kinetics can be compared using a range of mathematical models, including the zero-order, first-order, Higuchi, Hixson-Crowell, quadratic, and Weibull models11,12,13.

Sutariya et al.14 studied the benefits of ANNs in pharmaceutical research. The ability of ANNs to predict complex nonlinear interactions and to combine experimental and evidence-based data makes them a valuable tool for solving complex problems15. ANNs make predictions, detect trends, and draw decisions based on information previously stored in the network. Once trained, the network can predict outcomes for untested data. This makes ANNs well suited to formulation optimization challenges in the development of drug products16. Several authors have used ANNs to optimise drug release characteristics2,4,17,18,19.

As a universal approximator, a Multilayer Perceptron (MLP) can approximate any continuous nonlinear function with arbitrary accuracy when sufficient processing elements are provided. Prediction of drug release profiles can therefore be treated as a function approximation problem that an MLP network can solve. The drug components that meet the desired drug release criteria can be determined using a feedforward neural network in the MATLAB software20.

This study also compares the drug release characteristics of the predicted formulation of Quetiapine Fumarate MR tablets with those of the commercially available drug using the similarity factor (f2). The present research highlights the importance of process and formulation variables in determining percentage drug release. This investigation extends earlier work based on a linear regression model, which also compared drug release profiles13. The paper is structured as follows. Section "Preliminaries" presents the preliminaries. Section "Methodology" explains the use of ANNs in drug release kinetics with graphical presentation. In Section "Results", the optimum excipient concentrations and the effect of formulation and process variables on the drug release profile are obtained using the MATLAB simulation network. The conclusion is given in Section "Conclusion".

Preliminaries

Feedforward networks or multilayer perceptron (MLP)

Feedforward networks have one input layer, zero or more hidden layers, and one output layer. In a feedforward network, each neuron in one layer is connected only to neurons in the following layer, so that information flows in one direction toward the output layer21. The multilayer network architecture is a special case of the feedforward neural network. It consists of three types of layers: an input layer with vector length k, one or more hidden layers with m hidden neurons, and an output layer with vector length n. For the development of the artificial neural networks and the assessment of their accuracy, the data were divided into three subsets: training, validation, and test data sets, with 70% of the data assigned to the training set, 15% to the testing set, and 15% to the validation set. The subsets were chosen at random by the MATLAB software.
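The random 70/15/15 partition described above can be sketched in Python as follows. This is only an illustration: the study used MATLAB, and the function name `random_split`, the seed, and the rounding rule for the subset sizes are assumptions, so the subset sizes need not match the partition reported by the authors.

```python
import numpy as np

def random_split(n_samples, train=0.70, val=0.15, seed=0):
    """Randomly partition sample indices into training, validation, and test subsets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)         # shuffled sample indices
    n_train = int(round(train * n_samples))    # assumed rounding rule
    n_val = int(round(val * n_samples))
    train_idx = order[:n_train]
    val_idx = order[n_train:n_train + n_val]
    test_idx = order[n_train + n_val:]         # remainder goes to the test subset
    return train_idx, val_idx, test_idx

# 17 samples, matching the number of formulations in this study.
train_idx, val_idx, test_idx = random_split(17)
print(len(train_idx), len(val_idx), len(test_idx))
```

Every index lands in exactly one subset, so the test samples remain unseen during training.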

The effectiveness of the created network was evaluated using the test data set. It is essential to note that the test data were not made accessible to the network during training; that is, they are unseen data for the neural network. Figure 1 shows the fully connected architecture of a multilayer feedforward neural network with a single hidden layer. The network in Fig. 1 is described as a k-m-n network for simplicity because it consists of k input neurons, a single hidden layer with m hidden neurons, and n output neurons.

Figure 1 Multilayer neural network architecture and mathematical calculation inside the artificial neuron.
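The calculation inside a k-m-n network can be illustrated with a short forward pass. This is a minimal sketch under stated assumptions: the hidden layer uses a hyperbolic tangent and the output layer is linear, the sizes k = 6, m = 10, n = 10 are chosen for illustration, and the weights are random placeholders rather than trained values from this study.

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Forward pass of a k-m-n network: tanh hidden layer, linear output layer."""
    hidden = np.tanh(W1 @ x + b1)   # m hidden activations (weighted sum plus bias, then tanh)
    return W2 @ hidden + b2          # n linear outputs

k, m, n = 6, 10, 10                  # illustrative layer sizes (assumption)
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(m, k)), rng.normal(size=m)   # placeholder hidden-layer parameters
W2, b2 = rng.normal(size=(n, m)), rng.normal(size=n)   # placeholder output-layer parameters

x = rng.uniform(-1.0, 1.0, size=k)   # one hypothetical, normalized input vector
y = forward(x, W1, b1, W2, b2)
print(y.shape)
```

Each neuron computes a weighted sum of its inputs plus a bias and passes it through its activation function; stacking these operations as matrix products gives the whole layer at once.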

Similarity factor (f2)

The similarity factor was adopted by the USFDA22,23 and was introduced by Moore and Flanner in 199624. It is a logarithmic reciprocal square root transformation of the sum of squared errors. It compares the percentage dissolution of two curves to determine the similarity of two drug release profiles13.

$$f_{2} = 50 \log_{10} \left\{ \left[ 1 + \frac{1}{n} \sum_{j=1}^{n} \left( R_{j} - T_{j} \right)^{2} \right]^{-0.5} \times 100 \right\}$$

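Here, \({R}_{j}\) and \({T}_{j}\) are the percentages of drug dissolved from the reference and test products at time point j, and n is the number of time points. The similarity factor (f2) takes the value 100 when the two profiles are identical, and the value decreases as the profiles become less similar. According to the FDA guideline, the accepted range of f2 is between 50 and 10013,22,23.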
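As a sketch of how the f2 expression above can be evaluated, the following Python function takes the percentage dissolved for a reference and a test profile at the same time points. The function name `similarity_factor` and the two example profiles are hypothetical and are not data from this study; a base-10 logarithm is assumed.

```python
import numpy as np

def similarity_factor(reference, test):
    """Similarity factor f2 between two dissolution profiles (percent dissolved)."""
    r = np.asarray(reference, dtype=float)
    t = np.asarray(test, dtype=float)
    if r.ndim != 1 or r.shape != t.shape or r.size == 0:
        raise ValueError("profiles must be non-empty 1-D arrays of equal length")
    mean_sq_diff = np.mean((r - t) ** 2)                 # (1/n) * sum of squared differences
    return 50.0 * np.log10(100.0 / np.sqrt(1.0 + mean_sq_diff))

# Hypothetical profiles at ten time points (illustrative values only).
reference = [12, 25, 38, 50, 61, 70, 78, 85, 91, 96]
test = [10, 24, 40, 52, 60, 72, 77, 86, 90, 95]
print(round(similarity_factor(reference, test), 1))
```

Identical profiles give f2 = 100, while a uniform difference of 10 percentage points at every time point gives a value of roughly 49.9, just below the acceptance limit of 50.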

Methodology

To assess the drug release profile, an ANN was employed as part of a simultaneous optimization method. The most effective drug release profile for the Quetiapine Fumarate MR tablet was identified by computing the similarity factor f2 against the drug release profile of the marketed product.

The amounts of Sodium Citrate (\({x}_{1}\)), Eudragit® L100 55 (\({x}_{2}\)), Eudragit® L30 D55 (\({x}_{3}\)), Lactose Monohydrate (\({x}_{4}\)), DCP (\({x}_{5}\)), and Glyceryl Behenate (\({x}_{6}\)) were used as formulation variables. The drug substance Quetiapine Fumarate, Triethyl Citrate, and Magnesium Stearate were kept constant at 230.27, 1.5, and 4 mg/tab, respectively, in the training of the network. All concentrations are expressed in mg/tab. The content of each formulation is presented in Table 1. The tablet weight was kept constant at 400 mg for all formulations. Combinations of the formulation variables within the experimental domain were generated by a fractional factorial design. For the six variables \({x}_{1}\) to \({x}_{6}\), this gives \({2}^{6-2}+1={2}^{4}+1=17\) formulations, that is, 16 fractional factorial runs plus one additional run. All the input variables are varied simultaneously in this approach (see Table 1). The flow chart of the manufacturing process is given in Fig. 2.
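The run count of 17 can be reproduced with a small sketch of a two-level fractional factorial design in coded units (-1 for the low level, +1 for the high level). The source does not state which design generators were used or what the additional run was; the generators E = ABC and F = BCD and the centre point (all variables at 0) used here are assumptions for illustration, as is the function name.

```python
import itertools
import numpy as np

def fractional_factorial_6_2():
    """Coded 2^(6-2) design plus one centre point: 17 runs for 6 variables."""
    # Full factorial in four base factors A, B, C, D: 2^4 = 16 runs.
    base = np.array(list(itertools.product([-1, 1], repeat=4)))
    a, b, c, d = base.T
    e = a * b * c              # assumed generator E = ABC
    f = b * c * d              # assumed generator F = BCD
    design = np.column_stack([a, b, c, d, e, f])
    centre = np.zeros((1, 6), dtype=int)   # assumed additional run: centre point
    return np.vstack([design, centre])

design = fractional_factorial_6_2()
print(design.shape)  # (17, 6)
```

Mapping each coded level to an actual amount in mg/tab would require the low and high levels from Table 1, which are not reproduced here.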

Table 1 Tablet formulations with different levels of excipient concentration.

Figure 2 Flow chart of the manufacturing process.

Results

A formulator can better understand how process and formulation variables affect a modified-release drug formulation by using conventional (statistical) methods, but these are time-consuming when creating and developing such formulations. This study found that the ANN model was the most suitable approach for checking drug release similarity while optimizing the formulation variables. The results of the dissolution tests for the formulations produced using the fractional factorial design are displayed in Fig. 3. These dissolution profiles were used for training, testing, and validating the neural network. The network was simulated using MATLAB code20. The formulation variables Sodium Citrate (\({x}_{1}\)), Eudragit® L100 55 (\({x}_{2}\)), Eudragit® L30 D55 (\({x}_{3}\)), Lactose Monohydrate (\({x}_{4}\)), DCP (\({x}_{5}\)), and Glyceryl Behenate (\({x}_{6}\)) were used as input variables. The drug release values at ten different time points were used as outputs. The dimensions of the input matrix and the target matrix were 6 by 17 (6 variables and 17 samples) and 10 by 17 (10 variables and 17 samples), respectively. The data set presented in supplementary Table S1 shows the partition of all seventeen formulations into three subsets. There were ten neurons in total in the hidden layer. The Levenberg–Marquardt algorithm was selected to train the network. In MATLAB, the function "trainlm" uses this algorithm for training feedforward neural networks. The algorithm updates the weight and bias values during training. The trained network was used for simulation (with different excipient concentrations) once the mean squared error reached its minimum; at this stage, the regression coefficient approached one for the training, testing, and validation data sets (see Fig. 4).
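For reference, the Levenberg–Marquardt weight update is commonly written in the form below. This is the textbook form of the algorithm rather than a formula taken from this study, and the symbols \(\mathbf{w}\), \(\mathbf{J}\), \(\mathbf{e}\), \(\mu\), and \(\mathbf{I}\) are introduced here only for illustration.

$$\mathbf{w}_{k+1} = \mathbf{w}_{k} - \left( \mathbf{J}^{\top}\mathbf{J} + \mu \mathbf{I} \right)^{-1} \mathbf{J}^{\top}\mathbf{e}$$

Here \(\mathbf{w}_{k}\) collects all weights and biases at iteration k, \(\mathbf{e}\) is the vector of network errors on the training set, \(\mathbf{J}\) is the Jacobian of \(\mathbf{e}\) with respect to \(\mathbf{w}\), \(\mathbf{I}\) is the identity matrix, and \(\mu > 0\) is a damping parameter. The damping parameter is decreased after a step that reduces the mean squared error and increased otherwise, so the update behaves like a Gauss-Newton step when \(\mu\) is small and like a short gradient descent step when \(\mu\) is large. For a fully connected network with 6 inputs, 10 hidden neurons, and 10 outputs, and assuming one bias per neuron, \(\mathbf{w}\) would have \(6 \cdot 10 + 10 + 10 \cdot 10 + 10 = 180\) entries.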

Figure 3 Drug release kinetics for tablet formulations with different levels of excipient concentration at ten different time points.

Figure 4 Regression analysis of training, testing, and validation data sets with respect to target data.

The trained network was used to predict the drug release profiles for different excipient concentrations, and the change in the drug release profile with excipient concentration was noted. The percentage drug release was then compared with that of the reference drug product. The similarity factor f2 was greater than 80 for the proposed formulation, as presented in Table 2. The calculation of f2 is given in supplementary Table S2 online.

Table 2 Optimized formulation with percentage drug release and similarity factor f2.

The steps involved in developing the ANN model are given below:

  • Step 1: Data Collection (Experimental data)

  • Step 2: Data preprocessing

    • Data cleaning (if any value is not available, set it to zero)

    • Data normalization

  • Step 3: Data Splitting

    • Training (70%)

    • Testing (15%)

    • Validation (15%)

  • Step 4: Design ANN model

    • Select ANN architecture

    • Define Input and Output Layers

    • Choose activation functions (TANSIG, PURELIN)

  • Step 5: Training the model

    • Forward propagation

    • Compute loss

    • Backward propagation

    • Update weights

    • Iterate until convergence (train the network until regression coefficient becomes close to 1)

  • Step 6: Model validation

    • Validate the model on validation set

    • Tune hyperparameters (number of hidden layers, number of nodes in the hidden layer)

  • Step 7: Model evaluation

    • Evaluate model on test set

    • Analyse performance

  • Step 8: Optimization

    • Optimize release profile parameters

    • Use trained ANN for prediction

    • Simulate and optimize drug release

  • Step 9: Model deployment
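Step 2 of the list above (data cleaning and normalization) can be sketched as follows. This is an illustration under stated assumptions: missing values are represented as NaN, each variable is scaled linearly to the range [-1, 1], samples are stored in rows, and the function name `clean_and_scale` and the example values are hypothetical. The scaling actually applied in the study is not stated in the text.

```python
import numpy as np

def clean_and_scale(data, low=-1.0, high=1.0):
    """Replace missing values with zero, then scale each column to [low, high]."""
    x = np.asarray(data, dtype=float)
    x = np.where(np.isnan(x), 0.0, x)          # data cleaning: missing value -> 0
    col_min = x.min(axis=0)
    col_max = x.max(axis=0)
    # Avoid division by zero: a constant column is mapped to `low`.
    span = np.where(col_max > col_min, col_max - col_min, 1.0)
    scaled = low + (high - low) * (x - col_min) / span
    return scaled, col_min, col_max

# Hypothetical data: three samples (rows) and two variables (columns).
raw = np.array([[10.0, 5.0],
                [20.0, np.nan],
                [30.0, 15.0]])
scaled, col_min, col_max = clean_and_scale(raw)
print(scaled)
```

The returned minima and maxima would be kept so that the same scaling can be applied to new formulations before prediction.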

Conclusion

We attempted to build an ANN model to predict the impact of formulation excipients on drug release profiles. Various mathematical models are available to check the drug release similarity of a dosage form. ANNs were used to find the combination of drug product excipients and to predict the percentage drug release. The predicted percentage drug release was compared with the reference drug release using the similarity factor (f2). Further, the optimal formulation predicted by the ANN demonstrated an effective way to check the influence of drug product excipients on drug release from the tablet. With recent developments, k-nearest neighbors (KNN), Bayesian algorithms, and neural ordinary differential equations may also be applicable in the future, as they require a relatively small amount of data and indicate which variables are the most important and the direction that future experiments should take.