Abstract
The foremost requirement of any successful application today is to extract the desired piece of information from its Big Data at very high speed. When Big Data is managed with the traditional relational model, access speed is compromised. Moreover, the relational data model is not flexible enough to handle big data use cases that contain a mixture of structured, semi-structured, and unstructured data. Thus, data must be organized beyond the relational model in a manner that makes any type of data highly available. The current research is a step towards moving relational data storage (PostgreSQL) to a decentralized structured storage system (Cassandra), to meet users' demand for high availability of any type of data (structured and unstructured) together with fault tolerance. To reduce the migration cost, the research focuses on reducing the storage requirement by efficiently compressing the source database before moving it to Cassandra.
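The idea of shrinking the source data before migration can be illustrated with a minimal sketch. The abstract does not state which compression scheme the study uses, so the JSON serialization, the zlib compression, and the sample rows below are all assumptions made purely for illustration.

```python
import json
import zlib

def pack_rows(rows):
    """Serialize row dicts to compact JSON, then compress with zlib.

    Assumption: JSON + zlib stand in for whatever scheme the study uses.
    Returns both the uncompressed and the compressed bytes.
    """
    raw = json.dumps(rows, separators=(",", ":")).encode("utf-8")
    return raw, zlib.compress(raw, 9)  # 9 = highest compression level

def unpack_rows(blob):
    """Invert pack_rows: decompress and parse back into row dicts."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))

# Hypothetical rows standing in for an exported relational table.
rows = [{"id": i, "name": f"record_{i}", "city": "Delhi"} for i in range(5000)]

raw, packed = pack_rows(rows)
restored = unpack_rows(packed)

print(f"uncompressed: {len(raw)} bytes, compressed: {len(packed)} bytes")
```

The round trip is lossless, so the restored rows can be written to the target store unchanged; how much space is saved depends entirely on how repetitive the real data is.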
An experiment was conducted to explore the effectiveness of migration from a PostgreSQL database to Cassandra. Sample data sets ranging from 5,000 to 50,000 records were used to compare the time taken for selection, insertion, deletion, and searching of records in the relational database and in Cassandra. The study found Cassandra to be the better choice for select, insert, and delete operations. Queries involving the join operation in a relational database are time-consuming and costly. Cassandra proves more efficient for searching in such cases, as it stores the nodes together in alphabetical order and uses a split function.
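The contrast between a read-time join and a lookup over data stored together in sorted order can be sketched as follows. This is not the study's setup: an in-memory SQLite database stands in for the relational side, a plain dictionary of sorted lists stands in for Cassandra-style storage, and the tables, names, and rows are hypothetical.

```python
import sqlite3
from bisect import insort

# Relational stand-in: two tables that must be joined at read time.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE visit (patient_id INTEGER, visit_date TEXT, note TEXT);
""")
patients = [(1, "asha"), (2, "ravi")]
visits = [
    (1, "2014-03-02", "follow-up"),
    (2, "2014-01-15", "checkup"),
    (1, "2014-01-10", "checkup"),
]
conn.executemany("INSERT INTO patient VALUES (?, ?)", patients)
conn.executemany("INSERT INTO visit VALUES (?, ?, ?)", visits)

def visits_by_join(name):
    """Fetch a patient's visits by joining the two tables."""
    cur = conn.execute(
        "SELECT v.visit_date, v.note FROM patient p "
        "JOIN visit v ON v.patient_id = p.id "
        "WHERE p.name = ? ORDER BY v.visit_date",
        (name,),
    )
    return cur.fetchall()

# Denormalized stand-in: one entry per patient name, with that patient's
# visits kept sorted at write time (bisect.insort keeps the list ordered).
wide_rows = {}
names = dict(patients)
for pid, date, note in visits:
    insort(wide_rows.setdefault(names[pid], []), (date, note))

def visits_by_partition(name):
    """Fetch a patient's visits with a single key lookup, no join."""
    return list(wide_rows.get(name, []))

print(visits_by_join("asha"))
print(visits_by_partition("asha"))
```

Both functions return the same rows; the difference is that the denormalized layout pays the ordering cost once at write time, so the read needs no join.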
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Saxena, U., Sachdeva, S., Batra, S. (2015). Moving from Relational Data Storage to Decentralized Structured Storage System. In: Chu, W., Kikuchi, S., Bhalla, S. (eds) Databases in Networked Information Systems. DNIS 2015. Lecture Notes in Computer Science, vol 8999. Springer, Cham. https://doi.org/10.1007/978-3-319-16313-0_13
DOI: https://doi.org/10.1007/978-3-319-16313-0_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16312-3
Online ISBN: 978-3-319-16313-0
eBook Packages: Computer Science, Computer Science (R0)