Deep Learning for Predicting Gene Regulatory Networks: A Step-by-Step Protocol in R

  • Protocol
Reverse Engineering of Regulatory Networks

Part of the book series: Methods in Molecular Biology (MIMB, volume 2719)

Abstract

Deep learning has emerged as a powerful tool for solving complex problems in biology, including the reconstruction of gene regulatory networks. These networks consist of transcription factors and their associations with the genes they regulate. Despite the utility of deep learning methods in studying gene expression and regulation, they remain largely inaccessible to biologists, mainly because they require programming skills and a nuanced grasp of the underlying algorithms. This chapter presents a deep learning protocol that utilizes TensorFlow and the Keras API in R/RStudio, with the aim of making deep learning accessible to individuals without specialized expertise. The protocol focuses on the genome-wide prediction of regulatory interactions between transcription factors and genes, leveraging publicly available gene expression data in conjunction with well-established benchmarks. It covers the key phases of the workflow: data preprocessing, design of neural network architectures, iterative model training and validation, and prediction of novel regulatory associations. It also provides insights into parameter tuning for deep learning models. By following this protocol, researchers are expected to gain a comprehensive understanding of how to apply deep learning techniques to predict regulatory interactions. The protocol can be readily modified to address diverse research problems, thereby enabling scientists to harness deep learning effectively in their investigations.

Author information

Corresponding author

Correspondence to Vijaykumar Yogesh Muley.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Muley, V.Y. (2024). Deep Learning for Predicting Gene Regulatory Networks: A Step-by-Step Protocol in R. In: Mandal, S. (eds) Reverse Engineering of Regulatory Networks. Methods in Molecular Biology, vol 2719. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3461-5_15

  • DOI: https://doi.org/10.1007/978-1-0716-3461-5_15

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-3460-8

  • Online ISBN: 978-1-0716-3461-5

  • eBook Packages: Springer Protocols
