Capturing Hadoop Storage Big Data Layer Meta-Concepts

  • Conference paper
Advanced Intelligent Systems for Sustainable Development (AI2SD’2018) (AI2SD 2018)

Abstract

Producing streams of data is of little use if they cannot be stored. Applications, software, and objects generate huge volumes of data that need to be collected, stored, and made available for analysis. These data are also valuable and need to be preserved. This is why Big Data has attracted global interest from leading information technology players, and why new ways of storing information have emerged and flourished. In our analysis of this subject, we note that, in terms of Big Data architecture, the storage layer is essential for the proper functioning of any Big Data system. There are two types of storage at this layer: the Hadoop Distributed File System (HDFS) and NoSQL databases. We rely on our previous work, in which we identified key storage concepts through comparative studies of the main Big Data distributions. The storage layer sits directly above the Data Sources and Data Ingestion layers, for which we have already proposed a meta-model. In this paper, we therefore apply Model Driven Engineering (MDE) techniques to provide a universal meta-model for the storage layer of a Big Data system.

Author information

Corresponding author

Correspondence to Allae Erraissi.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Erraissi, A., Belangour, A. (2019). Capturing Hadoop Storage Big Data Layer Meta-Concepts. In: Ezziyyani, M. (eds) Advanced Intelligent Systems for Sustainable Development (AI2SD’2018). AI2SD 2018. Advances in Intelligent Systems and Computing, vol 915. Springer, Cham. https://doi.org/10.1007/978-3-030-11928-7_37
