Abstract
NoSQL databases have emerged as a backend for Big Data applications. They are characterized by horizontal scalability, schema-free data models, and easy cloud deployment. To avoid overprovisioning, it is essential to identify the correct number of nodes required for a specific system before deployment. This paper benchmarks and compares three of the most common NoSQL databases: Cassandra, MongoDB and HBase. We deploy them on the Amazon EC2 cloud platform using different types of virtual machines and cluster sizes to study the effect of different configurations. We then compare the behavior of these systems with high-level queueing network models. Our results show that the models capture the main performance characteristics of the studied databases and form the basis of a capacity planning tool for service providers and service users.
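To illustrate how a high-level queueing network model can support this kind of node-count estimate, the following is a minimal sketch and not the model used in the paper. It applies textbook exact Mean Value Analysis to a closed, single-class network and assumes that requests are spread evenly over identical nodes. The function names (`mva`, `nodes_needed`) and all numeric parameters are hypothetical and chosen only for illustration.

```python
def mva(demands, think_time, n_clients):
    """Exact Mean Value Analysis for a closed, single-class queueing network.

    demands: service demand (seconds per request) at each queueing station.
    think_time: client think time in seconds.
    Returns (throughput in requests/second, response time in seconds).
    """
    queue = [0.0] * len(demands)  # mean queue length per station
    throughput = 0.0
    response = 0.0
    for n in range(1, n_clients + 1):
        # An arriving request sees the queue left by a population of n - 1.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)
        # Little's law applied to each station.
        queue = [throughput * r for r in residence]
    return throughput, response


def nodes_needed(demand, think_time, n_clients, target, max_nodes=64):
    """Smallest node count whose predicted throughput reaches the target.

    Assumption: load is perfectly balanced, so each of the k nodes is a
    station with demand / k seconds of work per request on average.
    Returns None if no count up to max_nodes meets the target.
    """
    for k in range(1, max_nodes + 1):
        throughput, _ = mva([demand / k] * k, think_time, n_clients)
        if throughput >= target:
            return k
    return None


if __name__ == "__main__":
    # Hypothetical workload: 5 ms of node time per request, 50 clients,
    # no think time, target of 500 requests per second.
    print(nodes_needed(0.005, 0.0, 50, 500.0))
```

The balanced-load assumption ignores replication, coordination overhead, and differences between virtual machine types, so a sketch like this gives only a rough lower bound on the node count and would need calibration against measurements.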
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Gandini, A., Gribaudo, M., Knottenbelt, W.J., Osman, R., Piazzolla, P. (2014). Performance Evaluation of NoSQL Databases. In: Horváth, A., Wolter, K. (eds) Computer Performance Engineering. EPEW 2014. Lecture Notes in Computer Science, vol 8721. Springer, Cham. https://doi.org/10.1007/978-3-319-10885-8_2
DOI: https://doi.org/10.1007/978-3-319-10885-8_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10884-1
Online ISBN: 978-3-319-10885-8
eBook Packages: Computer Science (R0)