Abstract
Academic and industrial research has recently recognized the relevance of expressive models and effective frameworks, such as MapReduce, for highly scalable data processing. This paper presents Quasit, a novel programming model and runtime framework for stream processing in datacenters. Its original capabilities are i) allowing developers to choose among a large set of quality policies and associate them with their processing tasks in a fine-grained way, and ii) effectively managing processing execution according to the associated quality indications. The paper describes the Quasit programming model and the primary design and implementation choices made in the Quasit runtime framework (available for download from the project Web site) to achieve maximum scalability, flexibility, and reusability. First experiences with our prototype and the reported experimental results show the feasibility of our approach and its good performance in terms of both limited overhead and horizontal scalability.
Copyright information
© 2013 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Bellavista, P., Corradi, A., Reale, A. (2013). The QUASIT Model and Framework for Scalable Data Stream Processing with Quality of Service. In: Borcea, C., Bellavista, P., Giannelli, C., Magedanz, T., Schreiner, F. (eds) Mobile Wireless Middleware, Operating Systems, and Applications. MOBILWARE 2012. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 65. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36660-4_7
DOI: https://doi.org/10.1007/978-3-642-36660-4_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36659-8
Online ISBN: 978-3-642-36660-4
eBook Packages: Computer Science (R0)