Abstract
RNA Secondary Structure prediction is one of the most significant research areas in bioinformatics. Many research works have been developed in the area of RNA secondary structure prediction with optimization methods. But there are certain drawbacks still prevailing in the existing optimization methods. To avoid such drawbacks in the existing methods, we have proposed an Acceleration base Particle Swarm Optimization (APSO) algorithm for finding minimum free energy of RNA secondary structures. The experimental result shows that our proposed APSO algorithm efficiently locates the RNA structures. Furthermore, the performance of our proposed APSO algorithm is evaluated by invoking eight benchmark functions and also compared with Genetic Algorithm (GA) and standard Particle Swarm Optimization (PSO). The test result shows that the APSO is efficient both in test benchmark functions and prediction model and better than other algorithms.
Access provided by Autonomous University of Puebla. Download conference paper PDF
Similar content being viewed by others
Keywords
- RNA secondary Structure
- Free energy
- Genetic Algorithm
- Particle Swarm Optimization (PSO)
- Acceleration base Particle Swarm Optimization (APSO)
1 Introduction
A ribonucleic acid (RNA) molecule consists of a chain of ribonucleotides connected together by covalent chemical bonds. Each ribonucleotide consists of one of the four bases such as adenine (A), cytosine (C), guanine (G) or uracil (U), and the exact sequence of bases along the chain, the primary structure of the molecule, determines the kind of RNA it is. The secondary structure is the outcome of hydrogen bonds between nucleotides that are situated far away in the chain. Characteristically these hydrogen bonds exist only between G and C, or A and U, or G and U (or vice versa). The nucleotides so linked are called base pairs and are labeled as GC, CG, AU, UA, or GU, or UG, the first base being the one with the smaller index in the chain.
Though RNA secondary structure forecast is fundamentally based on Nuclear Magnetic Resonance and X -ray crystallography, these methods are very complex, time-consuming and highly costly. This has given birth to the development of various mathematical and computational models which have now become highly indispensable in bioinformatics for RNA secondary structure forecast. Several investigational efforts have been carried out in the area of RNA secondary structure prediction with optimization techniques. The cardinal issue here is to find out, from among all the probable secondary structures, the one which has the minimum energy, because it is the most likely one to have a robust stable secondary structure. But from the computer perspective, it emerges into an optimization issue spread over a search space and has an objective function [11]. No wonder, a number of diverse optimization techniques and strategies have been introduced as an aftermath. With a view to eliminate this vexed dilemma, the contribution of diverse meta heuristics, intimately linked with the GA in the iterative search procedure like simulated annealing (SA), particle swarm optimization (PSO), ant colony optimization (ACO), and tabu search (TS), have also become subject matter of hot debate and discussion [21, 13]. Particle Swarm Optimization algorithm is employed to optimize the structure of an RNA molecule, by means of a sophisticated thermodynamic model [9]. Herbert et al. [22] have skillfully explained the advantages of SARNA-Predict, an RNA secondary structure prediction algorithm based on Simulated Annealing (SA). Estimation for the execution of SARNA-Predict in terms of forecast precision is established with native structures. Hao wu et al. [23] cleverly put forward a fuzzy adaptive particle swarm optimization (FPSO) blending particle swarm optimization (PSO) and fuzzy logic control (FLC) to forecast RNA secondary structure with the least energy. This technique is targeted at forecasting pseudoknots in the huge search spaces. The arithmetical outcomes and statistical investigations have proved beyond doubt that the innovative approach is competent to locate an optimal feature subset from a bulky noisy data set.
The forecast of RNA secondary structure with minimum free energy is a very critical function. In the perfect RNA structure prediction the calculation of base pairs plays a very effective role. With an eye on forecasting highly perfect RNA secondary structure several optimization techniques have been introduced in the literature. But it is unfortunate that each and every technique suffers from certain lacuna in the area of precise structure prediction.The drawbacks in the existing methods are solved by constructing a new optimization technique named as Acceleration base Particle Swarm Optimization (APSO) algorithm.
The outline structure of the paper is organized as follows:- In section 2, a brief discussion on the standard PSO and the proposed APSO algorithm is presented. The experimental results on the multimodal benchmark functions and RNA structure prediction are given in Section 3 and conclusion of the paper is given in Section 4.
2 Proposed Acceleration based Particle Swarm Optimization (APSO)
2.1 Particle Swarm Optimization (PSO)
Particle swarm optimization (PSO) [26] is triggered by the community actions of organic creatures, like fishes and birds equipped with supreme qualities of clustering jointly to function hand in hand in recognizing advantageous locations in a particular zone, as evidenced by the fishes looking out for a food source.
PSO emulates the swarm behavior and individuals represent potential solutions in a D-dimensional search space. Particle i is often composed of four vectors:
X i = (x 1 i , x 2 i , ⋯, x D i ) with x d i being its position in the dth dimension, pbest i = (pbest 1 i , pbest 2 i , ⋯, pbest D i ) with pbest d i being the best position in the dth dimension that particle i has found by itself, V i = (v 1 i , v 2 i , ⋯, v D i ) with v d i being the velocity in the dth dimension, and gbest = (gbest 1, gbest 2, ⋯, gbest D) with gbest d being the global best position in the dth dimension that all particles have found. Particles in a swarm move through the search space by
Where c 1 and c 2 are two constants often with the value of 2.0, and r 1 and r 2 are two independent random numbers uniformly generated in the range [0, 1] at each updating iteration from d = 1 to D, respectively.
The standard PSO technique has been extensively applied in various investigational efforts to achieve an optimal key for the dilemma. With a view to achieve a highly accurate optimal outcome, the defects inherent in the PSO technique such as premature convergence and loss of diversity have to be properly tackled and surmounted by initiating effective alteration or augmentation in the procedure of PSO. The cardinal defect of the PSO technique is a random value choice during the new particles generation procedure where during the course of the velocity calculation procedure the acceleration coefficients are engendered arbitrarily. The arbitrary value choice in the velocity procedure makes it essential that the created particles are in random. The unfortunate fact is that the random populations fail miserably in generating further precise outcomes. Therefore, to attain a further precise outcome and to steer clear of the PSO defects, rather than fixing the value of acceleration coefficient we introduce an Acceleration base Particle swarm optimization (APSO) technique in which acceleration coefficient values are updated on the basis of evaluation function. Next section briefly explain the proposed APSO.
2.2 Acceleration based Particle Swarm Optimization (APSO)
In our proposed technique, an Acceleration based Particle Swarm Optimization (APSO) algorithm is developed to solve premature convergence problem of the standard PSO. In APSO, the acceleration coefficients are selected based on the fitness value which will increase the accuracy of the results. The selection of acceleration coefficient values in APSO algorithm is described as follows,
In Equ. (3) and (4), the λ value is computed based on the fitness values is calculated as,
Where,
χ - Alteration probability
ω, ϕ - are the coefficient factors
F max, F min, F avg - denotes the maximum, minimum and average fitness of the particles
By exploiting Equ. (3) and (4), the acceleration coefficients values, the velocity formula which is given in Equ. (1) is updated by,
In Equ. (7), the a 1 and a 2 values are computed by comparing the particles which have the minimum fitness values with the pbest particles values. If the both particles values are same the value of a1 and a2 is set as zero otherwise the values are 1.
In the APSO, the particles are generated by using the RNA structures base pairs. To generate the particles, different length RNA structures are stored in the database D. The process of APSO algorithm in RNA secondary structure prediction is described as follows,
Step 1: Initially, the particles are initiated by generate RNA sequence sets which contain the RNA base pairs (A, U, C, and G) in an encoded format. Then, the parameters of each particle, including its position and velocity are initiated.
Step 2: Every generated particles fitness value is calculated by the RNA fold algorithm. The particles which have the minimum free energy are selected as the best particles.
Step 3: Update pbest i of each particle and gbest i . Based on these values, the particles velocity and positions are updated by exploiting the Eqn. (7) and (2). The particles are updated by using the both Equ. (7) and thus we get the new particles. From this new set of particles best particles are selected by comparing the particles values and these best particles are given to the fitness computation.
Step 4: Stop if the current optimization solution is good enough or some stopping criterion is satisfied. Other-wise, go to Step 2.
3 Experimental Results
The proposed Acceleration based PSO (APSO) is implemented in the working platform of MATLAB and results are compared with GA and standard PSO. The parameters which are utilized in the APSO are listed in Table 1.
3.1 Results of APSO on Benchmark Functions
The proposed APSO method is first tested on eight benchmark functions which are described as follows,
To accomplish the performance analysis process, we have performed 30 independent runs of APSO, GA and PSO methods. Result of convergence in terms of average fitness value for number of iterations, of all the three algorithms, for all the above mentioned functions is shown in Figure 1.
As can be seen from Figure. 1, our proposed APSO method has obtained global optima in less number of iterations as compared to PSO and GA methods. Among three optimization methods, GA has given poorer performance when compared to PSO and APSO methods whereas PSO performance degrades when compared to APSO. Hence, our proposed APSO method has better convergence rate for all the eight standard functions than the PSO and GA methods.
3.2 Results of APSO on RNA Secondary Structure Prediction
The proposed APSO, PSO & GA methods’ performances are analyzed by conducting experiments on four different RNA sequences of varying sizes. The results of the 4 different experiments with 4 RNA sequences are given in Table 2.
Table 2 shows the performance of APSO, PSO & GA methods’ performance in the RNA pair’s prediction. In S.cerevisiae sequence 37 pairs are predicted in total, among which 31 pairs are correctly predicted with standard deviation 4.9 in APSO and in PSO & GA methods 29 & 28 pairs are correctly predicted respectively. In H.marismortui RNA sequence, APSO correctly predicts 201 base pairs among the 233 predicted base pairs and GA & PSO methods have correctly predicted 192 and 190 base pairs respectively. In other two RNA sequences X.laevis and D.virilis our proposed APSO correctly predicts 222 base pairs in 254 predicted base pairs and 30 base pairs in 38 predicted base pairs. When compared to the GA and PSO methods our proposed APSO method has given superior base pair prediction accuracy.
Performance of APSO is further evaluated on the basis of convergence characteristic on the 4 RNA sequences and is presented in Figure 2. APSO shows superior performance than GA and PSO in all the cases.
Prediction accuracy is shown in Figure 3. As can be seen from Fig. 4, our proposed APSO method has given a greater number of correctly predicted pairs than the GA and PSO methods in all 10 iterations. While comparing GA & PSO methods, GA has given superior prediction accuracy but this is lower than that of the APSO method. Hence, our proposed APSO method has given high performance in RNA structure prediction than the GA and PSO methods.
4 Conclusion
In this paper we have proposed a new Acceleration base Particle Swarm Optimization (APSO) algorithm for finding minimum energy RNA secondary structures. The proposed APSO is first tested on eight benchmark functions and then the algorithm is extended to solve RNA secondary structure prediction problems. The experimental result on benchmark functions shows that APSO acquires a better performance than the GA and standard PSO algorithms. Simulation to predict RNA secondary structure is performed on 4 different RNA sequences of varying lengths. Computational result shows that APSO algorithm not only converges well but also has high prediction accuracy as compared to GA and standard PAO algorithms. Hence, our proposed APSO algorithm is able to predict more corrected structures and also find better secondary structures in terms of free energy.
Reference
Hao Wu, Yan-feng Shi, Xing Jin, Gang Wang and Hao Dong, "A Fuzzy Adaptive Particle Swarm Optimization for RNA Secondary Structure Prediction", In Proceedings of International Conference on Information Science and Technology, Nanjing, pp. 1390-1393, 2011.
Arturo Díaz Perez Mario A. Garcia-Martinez, "FPGA Accelerator for RNA Secondary Structure Prediction", In Proceedings of 12th Euromicro Conference on Digital System Design / Architectures, Methods and Tools, Patras, pp. 667-671, 2009.
Eberhart R.C. and Kennedy J., “Particle swarm optimization,”IEEE International Conference on Neural Networks, Volume 4(27), pp 1942-1948, 1995.
Esquivel S., Coello C. and Coello A., “On the use of particle swarm optimization with multimodal functions”, IEEE Congress on Evolutionary Computation (CEC), pp1130–1136, 2003.
Herbert H. Tsang and Kay C. Wiese, "SARNA-Predict: Accuracy Improvement of RNA Secondary Structure Prediction Using Permutation Based Simulated Annealing", IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 7, No. 4, pp. 727-738, 2010.
Kangtai Wang and Ning Wang, “A novel RNA Genetic Algorithm for Parameter Estimation of Dynamic Systems”, Journal of Chemical Engineering Research and Design, Vol. 88, No. 11, pp. 1485-1493, 2010.
Marais Neethling and A.P. Engelbrecht, "Determining RNA Secondary Structure using Set based Particle Swarm Optimization", In Proceedings of IEEE Congress on Evolutionary Computation, Vancouver, pp. 1670-1677, 2006.
Poli R., “An Analysis of the publications on the applications of Particle Swarm Optimization”, Journal of Artificial Evolution & Applications, 2007.
S.S. Ray, M. Bachhar, and S.K. Pal, “RNA Secondary Structure Prediction in Soft Computing Framework: A Review,” Proc. IEEE Third Int’l Conf. Computer Science and Information Technology, vol. 5, pp. 430-435, 2010.
Shubhra Sankar Ray and Sankar K. Pal, "RNA Secondary Structure Prediction Using Soft Computing", IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 10, No. 1, pp. 2-17, 2013.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Agrawal, J., Agrawal, S. (2015). Acceleration based Particle Swarm Optimization (APSO) for RNA Secondary Structure Prediction. In: Selvaraj, H., Zydek, D., Chmaj, G. (eds) Progress in Systems Engineering. Advances in Intelligent Systems and Computing, vol 366. Springer, Cham. https://doi.org/10.1007/978-3-319-08422-0_106
Download citation
DOI: https://doi.org/10.1007/978-3-319-08422-0_106
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08421-3
Online ISBN: 978-3-319-08422-0
eBook Packages: EngineeringEngineering (R0)