
Parameter Tuning onto Recurrent Neural Network and Long Short-Term Memory (RNN-LSTM) Network for Feature Selection in Classification of High-Dimensional Bioinformatics Datasets

  • Chapter
Bio-inspired Algorithms for Data Streaming and Visualization, Big Data Management, and Fog Computing

Abstract

Feature selection identifies the relevant features within a large set of features and discards the remaining features that contribute little to the output feature set. Deep learning methods have been applied to select relevant features in classification problems; however, with current search strategies for learning a parameter, the learned values can either grow without bound or shrink (decaying exponentially in the number of layers) at each time step (iteration), with the subsequent effect of inaccurate classification of features. To address this challenge, we propose an approach to learning a parameter for the classification problem based on the hunting behavior of the kestrel bird. The proposed bio-inspired approach, the Kestrel-based Search Algorithm (KSA), is modeled as a search algorithm that is then integrated with a deep learning method (an RNN-LSTM network). This integration enables the learning of an optimum parameter for feature selection in a classification problem. Benchmark bioinformatics datasets from Arizona State University were chosen to test the proposed algorithm because of their high dimensionality and continuous data attributes. The proposed algorithm was evaluated against comparative bio-inspired algorithms, namely PSO, ACO, WSA-MP, and BAT. The findings indicate that KSA produces the minimum learning rate on five of the nine datasets and the highest classification accuracy on four of the nine datasets. A comparison of classification accuracy using the Wilcoxon signed-rank test finds no statistically significant difference between the comparative algorithms and the proposed algorithm, which indicates that KSA can serve as an alternative approach to feature selection for classification problems.
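The abstract does not spell out the KSA update rules, so the sketch below is only a hedged illustration of the two ideas it describes: a population-based search over a single learning-rate parameter (here a generic PSO-style loop stands in for KSA), followed by the paired Wilcoxon signed-rank comparison of per-dataset accuracies. The objective function, search bounds, and accuracy values are hypothetical stand-ins, not the chapter's actual experiments.

```python
# Illustrative sketch only: a generic particle-swarm-style search stands in
# for the chapter's KSA, and the objective is a toy surrogate for the
# RNN-LSTM validation loss. All numbers below are hypothetical.
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)

def validation_loss(learning_rate: float) -> float:
    """Stand-in objective: in the chapter this would be the RNN-LSTM
    classification loss on a held-out split of a bioinformatics dataset."""
    return (np.log10(learning_rate) + 3.0) ** 2 + 0.01 * rng.standard_normal()

# Swarm search over log10(learning rate) in [-6, -1].
n_particles, n_iters = 20, 50
pos = rng.uniform(-6.0, -1.0, n_particles)
vel = np.zeros(n_particles)
pbest = pos.copy()
pbest_f = np.array([validation_loss(10.0 ** p) for p in pos])
gbest = pbest[np.argmin(pbest_f)]

for _ in range(n_iters):
    r1, r2 = rng.random(n_particles), rng.random(n_particles)
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, -6.0, -1.0)
    f = np.array([validation_loss(10.0 ** p) for p in pos])
    improved = f < pbest_f
    pbest[improved], pbest_f[improved] = pos[improved], f[improved]
    gbest = pbest[np.argmin(pbest_f)]

print(f"tuned learning rate ~ {10.0 ** gbest:.2e}")

# Paired comparison of per-dataset accuracies across nine datasets,
# mirroring the chapter's Wilcoxon signed-rank test (values are made up).
acc_ksa = [0.91, 0.88, 0.93, 0.85, 0.90, 0.88, 0.92, 0.86, 0.89]
acc_pso = [0.90, 0.89, 0.91, 0.86, 0.88, 0.87, 0.91, 0.85, 0.90]
stat, p = wilcoxon(acc_ksa, acc_pso)
print(f"Wilcoxon signed-rank: W={stat:.1f}, p={p:.3f}")
```

With these made-up accuracies the test does not reject the null hypothesis of no difference, mirroring the pattern of the chapter's reported finding; the real KSA would replace the PSO-style velocity update with its kestrel-inspired search moves.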



Author information

Correspondence to Richard Millham.


Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter


Cite this chapter

Millham, R., Agbehadji, I.E., Yang, H. (2021). Parameter Tuning onto Recurrent Neural Network and Long Short-Term Memory (RNN-LSTM) Network for Feature Selection in Classification of High-Dimensional Bioinformatics Datasets. In: Fong, S., Millham, R. (eds) Bio-inspired Algorithms for Data Streaming and Visualization, Big Data Management, and Fog Computing. Springer Tracts in Nature-Inspired Computing. Springer, Singapore. https://doi.org/10.1007/978-981-15-6695-0_2
