Abstract
Moldable scientific workflows represent a special class of scientific workflows where the tasks are written as distributed programs being able to exploit various amounts of computer resources. However, current cluster job schedulers require the user to specify the amount of resources per task manually. This often leads to suboptimal execution time and related cost of the whole workflow execution since many users have only limited experience and knowledge of the parallel efficiency and scaling. This paper proposes several mechanisms to automatically optimize the execution parameters of moldable workflows using genetic algorithms. The paper introduces a local optimization of workflow tasks, a global optimization of the workflow on systems with on-demand resource allocation, and a global optimization for systems with static resource allocation. Several objectives including the workflow makespan, computational cost and the percentage of idling nodes are investigated together with a trade-off parameter putting stress on one objective or another. The paper also discusses the structure and quality of several evolved workflow schedules and the possible reduction in makespan or cost. Finally, the computational requirements of evolutionary process together with the recommended genetic algorithm settings are investigated. The most complex workflows may be evolved in less than two minutes using the global optimization while in only 14 s using the local optimization.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
The shift from processor power consumption to performance variations: fundamental implications at scale. Comput. Sci. - Res. Dev. 31(4), 197–205 (2016)
Amdahl, G.M.: Validity of the single processor approach to achieving large scale computing capabilities. In: Proceedings of the April 1820 1967 Spring Joint Computer Conference, vol. 23, no. 4, pp. 483–485 (1967)
Bharathi, S., Chervenak, A., Deelman, E., Mehta, G., Su, M.-H., Vahi, K.: Characterization of scientific workflows. In: 2008 Third Workshop on Workflows in Support of Large-Scale Science, pp. 1–10. IEEE, November 2008
Bleuse, R., Hunold, S., Kedad-Sidhoum, S., Monna, F., Mounie, G., Trystram, D.: Scheduling independent moldable tasks on multi-cores with GPUs. IEEE Trans. Parallel Distrib. Syst. 28(9), 2689–2702 (2017)
Breukelaar, R., Demaine, E.D., Hohenberger, S., Hoogeboom, H.J., Kosters, W.A., Liben-Nowell, D.: Tetris is hard, even to approximate. Int. J. Comput. Geomet. Appl. 14, 41–68 (2004)
Chan, T.F., Mathew, T.P.: Domain decomposition algorithms. Acta Numer. 3, 61–143 (1994)
Chirkin, A.M., et al.: Execution time estimation for workflow scheduling. Future Gener. Comput. Syst. 75, 376–387 (2017)
Cotta, C., Fernández, A.J.: Evolutionary Scheduling. Studies in Computational Intelligence, vol. 49. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-48584-1
Dutot, P.-F., Netto, M.A.S., Goldman, A., Kon, F.: Scheduling moldable BSP tasks. In: Feitelson, D., Frachtenberg, E., Rudolph, L., Schwiegelshohn, U. (eds.) JSSPP 2005. LNCS (LNAI and LNB), vol. 3834, pp. 157–172. Springer, Heidelberg (2005). https://doi.org/10.1007/11605300_8
Feitelson, D.G., Rudolph, L.: Toward convergence in job schedulers for parallel supercomputers. In: Feitelson, D.G., Rudolph, L. (eds.) JSSPP 1996. LNCS (LNAI and LNB), vol. 1162, pp. 1–26. Springer, Heidelberg (1996). https://doi.org/10.1007/BFb0022284
Gad, A.F.: Geneticalgorithmpython: Building genetic algorithm in python (2021). https://github.com/ahmedfgad/GeneticAlgorithmPython/tree/05a069abf43146e7f8eb37f37c539523bf62ac9a
Hejazi, S.R., Saghafian, S.: Flowshop-scheduling problems with makespan criterion: a review. Int. J. Prod. Res. 43(14), 2895–2929 (2005)
Henderson, R.L.: Job scheduling under the portable batch system. In: Feitelson, D.G., Rudolph, L. (eds.) JSSPP 1995. LNCS (LNAI and LNB), vol. 949, pp. 279–294. Springer, Heidelberg (1995). https://doi.org/10.1007/3-540-60153-8_34
Hill, M.D.: What is scalability? ACM SIGARCH Comput. Archit. News 18(4), 18–21 (1990)
Hovestadt, M., Kao, O., Keller, A., Streit, A.: Scheduling in HPC resource management systems: queuing vs. planning. In: Feitelson, D., Rudolph, L., Schwiegelshohn, U. (eds.) JSSPP 2003. LNCS (LNAI and LNB), vol. 2862, pp. 1–20. Springer, Heidelberg (2003). https://doi.org/10.1007/10968987_1
Huang, K.-C., Huang, T.-C., Tsai, M.-J., Chang, H.-Y.: Moldable job scheduling for HPC as a service. In: Park, J., Stojmenovic, I., Choi, M., Xhafa, F. (eds.) Future Information Technology. Lecture Notes in Electrical Engineering, vol. 276, pp. 43–48. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-40861-8_7
Izadkhah, H.: Learning based genetic algorithm for task graph scheduling. Appl. Comput. Intell. Soft Comput. 2019, 15 (2019). Article ID 6543957. https://doi.org/10.1155/2019/6543957
Jansen, K., Land, F.: Scheduling monotone moldable jobs in linear time. In: 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 172–181. IEEE, May 2018
Jaros, J., Rendell, A.P., Treeby, B.E.: Full-wave nonlinear ultrasound simulation on distributed clusters with applications in high-intensity focused ultrasound. Int. J. High Perform. Comput. Appl. 30(2), 1094342015581024- (2015)
Jaros, M., Sasak, T., Treeby, B.E., Jaros, J.: Estimation of execution parameters for k-wave simulations. In: Kozubek, T., Arbenz, P., Jaroš, J., Říha, L., Šístek, J., Tichý, P. (eds.) HPCSE 2019. LNCS, vol. 12456, pp. 116–134. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-67077-1_7
Klusacek, D., Podolnikova, G., Toth, S.: Complex job scheduling simulations with Alea 4. In: Tan, G. (ed.) Proceedings of the 9th EAI International Conference on Simulation Tools and Techniques, Belgium, pp. 124–129. ICST (2016)
Lamprecht, A.-L., Turner, K.J.: Scientific workflows. Int. J. Softw. Tools Technol. Transf. 18(6), 575–580 (2016)
Mu’alem, A.W., Feitelson, D.G.: Utilization, predictability, workloads, and user runtime estimates in scheduling the IBM SP2 with backfilling. IEEE Trans. Parallel Distrib. Syst. 12(6), 529–543 (2001)
Omara, F.A., Arafa, M.M.: Genetic algorithms for task scheduling problem. J. Parallel Distrib. Comput. 70(1), 13–22 (2010)
Robert, Y., et al.: Task Graph Scheduling. Encyclopedia of Parallel Computing, pp. 2013–2025 (2011)
Srinivasan, S., Krishnamoorthy, S., Sadayappan, P.: A robust scheduling technology for moldable scheduling of parallel jobs. In: Proceedings IEEE International Conference on Cluster Computing CLUSTR-03, pp. 92–99. IEEE (2003)
Srinivasan, S., Kettimuthu, R., Subramani, V., Sadayappan, P.: Selective reservation strategies for backfill job scheduling. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds.) JSSPP 2002. LNCS (LNAI and LNB), vol. 2537, pp. 55–71. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-36180-4_4
Sudholt, D.: Parallel evolutionary algorithms. In: Kacprzyk, J., Pedrycz, W. (eds.) Springer Handbook of Computational Intelligence, pp. 929–959. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-43505-2_46
Ye, D., Chen, D.Z., Zhang, G.: Online scheduling of moldable parallel tasks. J. Sched. 21(6), 647–654 (2018)
Yoo, A.B., Jette, M.A., Grondona, M.: SLURM: simple linux utility for resource management. In: Feitelson, D., Rudolph, L., Schwiegelshohn, U. (eds.) JSSPP 2003. LNCS (LNAI and LNB), vol. 2862, pp. 44–60. Springer, Heidelberg (2003). https://doi.org/10.1007/10968987_3
Acknowledgments
This project has received funding from the European Union’s Horizon 2020 research and innovation programme H2020 ICT 2016–2017 under grant agreement No 732411 and is an initiative of the Photonics Public Private Partnership. This work was supported by the Ministry of Education, Youth and Sports of the Czech Republic through the e-INFRA CZ (ID: 90140). This work was supported by Brno University of Technology under project number FIT/FSI-J-21-7435.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Jaros, M., Jaros, J. (2021). Performance-Cost Optimization of Moldable Scientific Workflows. In: Klusáček, D., Cirne, W., Rodrigo, G.P. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2021. Lecture Notes in Computer Science(), vol 12985. Springer, Cham. https://doi.org/10.1007/978-3-030-88224-2_8
Download citation
DOI: https://doi.org/10.1007/978-3-030-88224-2_8
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88223-5
Online ISBN: 978-3-030-88224-2
eBook Packages: Computer ScienceComputer Science (R0)