Abstract
Workflows were introduced to manage the complexity of tasks in scientific computing and business analytics, and they have since been applied in many other fields and domains. Handling big data raises further issues: growing computational complexity, increasing data size, resource provisioning, and the need for heterogeneous systems to work together. Traditional systems are therefore considered obsolete for this purpose. The cloud has emerged as a promising way to meet variable resource requirements. However, the execution and deployment of big data scientific workflows in the cloud still requires research attention before a unified model can be proposed. This paper identifies open research problems in this domain and offers insights into specific issues such as workflow scheduling and the execution and deployment of big data scientific workflows in a multi-site cloud environment.
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Khan, S., Ali, S.A., Hasan, N., Shakil, K.A., Alam, M. (2019). Big Data Scientific Workflows in the Cloud: Challenges and Future Prospects. In: Das, H., Barik, R., Dubey, H., Roy, D. (eds) Cloud Computing for Geospatial Big Data Analytics. Studies in Big Data, vol 49. Springer, Cham. https://doi.org/10.1007/978-3-030-03359-0_1
DOI: https://doi.org/10.1007/978-3-030-03359-0_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03358-3
Online ISBN: 978-3-030-03359-0
eBook Packages: Intelligent Technologies and Robotics (R0)