Abstract
Due to the increasing number of cloud applications, the amount of data in the cloud shows signs of growing faster than ever before. The nature of cloud computing requires cloud data processing systems that can handle huge volumes of data and have high performance. However, most cloud storage systems currently adopt a hash-like approach to retrieving data that only supports simple keyword-based enquiries, but lacks various forms of information search. Therefore, a scalable and efficient indexing scheme is clearly required. In this paper, we present a skip list-based cloud index, called SLC-index, which is a novel, scalable skip list-based indexing for cloud data processing. The SLC-index offers a two-layered architecture for extending indexing scope and facilitating better throughput. Dynamic load-balancing for the SLC-index is achieved by online migration of index nodes between servers. Furthermore, it is a flexible system due to its dynamic addition and removal of servers. The SLC-index is efficient for both point and range queries. Experimental results show the efficiency of the SLC-index and its usefulness as an alternative approach for cloud-suitable data structures.
摘要
随着基于云平台的应用的增加, 云存储系统中的数据呈现出爆炸式增长的趋势, 要求云数据处理系统具备高效的海量数据处理能力, 然而, 现有的云存储系统大多采用哈希方法检索数据, 主要提供针对键值的查询, 范围查询效率较低。 因此, 有必要为云存储系统构建辅助数据索引。 提出了一种基于跳表的云数据索引结构, 简称 SLC 索引。 SLC 索引采用双层体系结构, 该索引结构契合云存储系统的分布式存储特性, 易于在多个服务器节点上灵活扩展。 局部索引节点基于查询耗费计算模型向全局索引节点发布索引信息, 保证 SLC 索引结构的整体高效性。 通过动态的索引节点分裂与合并机制, 降低数据倾斜带来的性能影响, 实现索引结构负载均衡。 实验结果表明, SLC 索引能够支持高效的单点查询和范围查询, 是一种适用于云计算系统的具有高可扩展性的辅助数据索引。
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
References
ARMBRUST M, FOX A, GRIFFITH R. A view of cloud computing [J]. Communications of the ACM, 2010, 53(4): 50–58. DOI: https://doi.org/10.1145/1721654.1721672.
BEN-YEHUDA O A, BEN-YEHUDA M, SCHUSTER A, TSAFRIR D. Deconstructing Amazon EC2 spot instance pricing [J]. ACM Transactions on Economics and Computation, 2013, 1(3): 16. DOI: https://doi.org/10.1145/2509413.2509416.
WANG Li-zhe, von LASXEWSKI G, YOUNGE A. Cloud computing: A perspective study [J]. New Generation Computing, 2010, 28(2): 137–146. DOI: https://doi.org/10.1007/s00354-008-0081-5.
CALDER B, WANG J, OGUS A W. Windows Azure storage: A highly available cloud storage service with strong consistency [C]// ACM Symposium on Operating Systems Principles. Cascais: ACM, 2011: 143–157. DOI: https://doi.org/10.1145/2043556.2043571.
LIU An-feng, LIU Xiao, LI He. MDMA: A multi-data and multi-ACK verified selective forwarding attack detection scheme in WSNs [J]. IEICE Transactions on Information and Systems, 2016, E99-D(8): 2010–2018. [2016-08-01] https://doi.org/search.ieice.org/bin/summary.php?id=e99-d_8_2010.
DING Y S, HAO K R. Multi-objective workflow scheduling in cloud system based on cooperative multi-swarm optimization algorithm [J]. Journal of Central South University, 2017, 24(5):1050–1062. DOI: https://doi.org/10.1007/s11771-017-3508-7.
CHANG F, DEAN J, GHEMAWAT S. Bigtable: A distributed storage system for structured data [J]. ACM Transactions on Computer Systems, 2008, 26(2): 1–26. DOI: https://doi.org/10.1145/1365815.1365816.
GHEMAWAT S, GOBIOFF H, LEUNG S T. The Google file system [C]// ACM Symposium on Operating Systems Principles. Bolton landing: ACM, 2003, 37(5): 29–43. DOI: https://doi.org/10.1145/1165389.945450.
VAVILAPALLI V K, MURTHY A C, DOUGLAS C. Apache hadoop yarn: Yet another resource negotiator [C]// ACM Annual Symposium on Cloud Computing. Santa clara: ACM, 2013(5): 1–16. DOI: https://doi.org/10.1145/2523616.2523633.
DECANDIA G, HASTORUN D, JAMPANI M. Dynamo: Amazon’s highly available key-value store [C]// ACM Symposium on Operating Systems Principles. Stevenson: ACM, 2007, 41(6): 205–220. DOI: https://doi.org/10.1145/1323293.1294281.
LAKSHMAN A, MALIK P. Cassandra: A decentralized structured storage system [J]. ACM SIGOPS Operating Systems Review, 2010, 44(2): 35–40. DOI: https://doi.org/10.1145/1773912.1773922.
DITTRICH J, QUIANERUIZ J, JINDAL A. Hadoop++: Making a yellow elephant run like a cheetah (without it even noticing) [J]. Proceedings of The VLDB Endowment, 2010, 3(1, 2): 515–529. DOI: https://doi.org/10.14778/1920841.1920908.
YANG H C, PARKER D S. Traverse: Simplified indexing on large map-reduce-merge clusters [C]// International Conference on Database Systems for Advanced Applications. Brisbane: Springer, 2009: 308–322. DOI: https://doi.org/10.1007/978-3-642-00887-0_27.
LIN J, RYABOY D, WEIL K. Full-text indexing for optimizing selection operations in large-scale data analytics [C]// The Second International Workshop on Map Reduce and Its Applications. San Jose: ACM, 2011: 59–66. DOI: https://doi.org/10.1145/1996092.1996105.
LU Peng, CHEN Guang, OOI B C. ScalaGiST: Scalable generalized search trees for mapreduce systems [innovative systems paper] [J]. Proceedings of the VLDB Endowment, 2014, 7(14): 1797–1808. DOI: https://doi.org/10.14778/2733085.2733087.
RICHTER S, QUIANERUIZ J, SCHUH S. Towards zero-overhead static and adaptive indexing in Hadoop [J]. Proceedings of the VLDB Endowment, 2014, 23(3): 469–494. DOI: https://doi.org/10.1007/s00778-013-0332-z.
CHANG B R, TSAI H F, HSU H T. Secondary ındex to big data NoSQL database–ıncorporating solr to HBase approach [J]. Journal of Information Hiding and Multimedia Signal Processing, 2016, 7(1): 80–89. https://doi.org/bit.kuas.edu.tw/~jihmsp/2016/vol7/JIH-MSP-2016-01-009.pdf.
AGUILERA M K, GOLAB W, SHAH M A. A practical scalable distributed B-Tree [J]. Proceedings of the VLDB Endowment, 2008, 1(1): 598–609. DOI: https://doi.org/10.14778/1453856.1453922.
HUANG Bin, PENG Yu-xing. An efficient two-level bitmap index for cloud data management [C]// IEEE International Conference on Communication Software and Networks. Xi’an: IEEE, 2011: 509–513. DOI: https://doi.org/10.1109/ICCSN.2011.6014776.
ZHOU Wei, LU Jin, LUAN Zhong-zhi. SNB-index: A SkipNet and B+ tree based auxiliary Cloud index [J]. Cluster Computing, 2014, 17(2): 453–462. DOI 10.1007/s10586-013-0246-y.
WANG Jin-bao, WU Sai, GAO Hong. Indexing multidimensional data in a cloud system [C]// ACM SIGMOD International Conference on Management of Data. Indianapolis: ACM, 2010: 591–602. DOI: https://doi.org/10.1145/1807167.1807232.
CHENG Chun-lin, SUN Chun-ju, XU Xiao-long. A multi-dimensional ındex structure based on ımproved VA-file and CAN in the cloud [J]. International Journal of Automation and Computing, 2014, 11(1): 109–117. DOI: https://doi.org/10.1007/s11633-014-0772-y.
LU Peng, WU Sai, SHOU Li-dan. An efficient and compact indexing scheme for large-scale data store [C]// IEEE International Conference on Data Engineering. Brisbane: IEEE, 2013: 326–337. DOI: https://doi.org/10.1109/ICDE.2013.6544836.
DINARI H, NADERI H. A method for improving graph queries processing using positional inverted index (PII) idea in search engines and parallelization techniques [J]. Journal of Central South University, 2016, 23(1): 150–159 DOI: https://doi.org/10.1007/s11771-016-3058-4.
HE Jing, WU Yue, DONG Yun-yun. Dynamic multidimensional index for large-scale cloud data [J]. Journal of Cloud Computing Advances Systems & Applications, 2016, 5(1): 1–12. DOI: https://doi.org/10.1186/s13677-016-0060-1.
PUGH W. Skip lists: A probabilistic alternative to balanced trees [J]. Communications of The ACM, 1990, 33(6): 668–676. DOI: https://doi.org/10.1145/78973.78977.
JESI G. P. Peersim HOWTO: Build a new protocol for the PeerSim 1.0 simulator [EB/OL]. [2011-08-04] https://doi.org/peersim.sourceforge.net/.
Author information
Authors and Affiliations
Corresponding author
Additional information
Foundation item: Projects(61363021, 61540061, 61663047) supported by the National Natural Science Foundation of China; Project(2017SE206) supported by the Open Foundation of Key Laboratory in Software Engineering of Yunnan Province, China
Rights and permissions
About this article
Cite this article
He, J., Yao, Sw., Cai, L. et al. SLC-index: A scalable skip list-based index for cloud data processing. J. Cent. South Univ. 25, 2438–2450 (2018). https://doi.org/10.1007/s11771-018-3927-0
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11771-018-3927-0