Abstract
Given a set of k-dimensional objects, the skyline-CL query returns all clusters over skyline objects according to their cardinalities. A naïve solution to this problem can be implemented in two phases: (1) use an existing skyline query algorithm to obtain all skyline objects, and (2) use the DBSCAN algorithm to cluster these skyline objects. However, this approach is extremely inefficient in real applications because both phases are CPU-intensive. Motivated by this, we present the Algorithm for Efficient Processing of the Skyline-CL Query (AEPSQ), an efficient, sound, and complete algorithm for returning all skyline clusters. While obtaining the skyline objects, AEPSQ organizes them in a novel k-ary tree, the SI(k)-Tree, which is first proposed in this paper, and exploits several properties of the SI(k)-Tree to produce skyline clusters quickly. Furthermore, we present detailed theoretical analyses and extensive experiments demonstrating that the algorithm is both efficient and effective.
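The naïve two-phase baseline described above can be sketched as follows. This is a minimal illustration only, not the AEPSQ algorithm and not the authors' implementation. It assumes that smaller values are preferred in every dimension, that distance is Euclidean, and that "according to their cardinalities" means ordering clusters from largest to smallest. The function names and the `eps` and `min_pts` parameters are hypothetical choices made for this sketch.

```python
from math import dist


def dominates(p, q):
    """p dominates q: no worse in every dimension, strictly better in one.
    Assumption: smaller values are better."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Phase 1: quadratic scan that keeps every point no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def dbscan(points, eps, min_pts):
    """Phase 2: textbook DBSCAN with a linear-scan neighborhood query.
    Returns clusters as lists of points; noise points are dropped."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Includes i itself, as in the usual DBSCAN definition.
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # former noise becomes a border point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
        cluster_id += 1

    clusters = [[] for _ in range(cluster_id)]
    for p, label in zip(points, labels):
        if label != NOISE:
            clusters[label].append(p)
    return clusters


def naive_skyline_cl(points, eps, min_pts):
    """Two-phase baseline: skyline first, then cluster, largest cluster first."""
    clusters = dbscan(skyline(points), eps, min_pts)
    return sorted(clusters, key=len, reverse=True)


if __name__ == "__main__":
    data = [(1, 9), (1.5, 8.5), (2, 8), (8, 2), (8.5, 1.5), (5, 5), (9, 9), (6, 6)]
    for cluster in naive_skyline_cl(data, eps=1.0, min_pts=2):
        print(len(cluster), cluster)
```

Both phases in this sketch compare every pair of points, which makes the cost of the baseline easy to see: the skyline scan and the neighborhood queries are each quadratic in the number of objects they process.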
Cite this article
Huang, Z., Zhang, J. & Tian, C. Efficient Processing of the Skyline-CL Query. Arab J Sci Eng 41, 2801–2811 (2016). https://doi.org/10.1007/s13369-015-2011-4
DOI: https://doi.org/10.1007/s13369-015-2011-4