Dynamic spawning of MPI processes applied to malleability
Metadata
Title
Dynamic spawning of MPI processes applied to malleability
Date
2023-05-29
Publisher
SAGE Publications
ISSN
1094-3420; 1741-2846
Bibliographic citation
Martín-Álvarez I, Aliaga JI, Castillo M, Iserte S, Mayo R. Dynamic spawning of MPI processes applied to malleability. The International Journal of High Performance Computing Applications. 2024;38(2):69-93. doi:10.1177/10943420231176527
Type
info:eu-repo/semantics/article
Publisher version
https://journals.sagepub.com/doi/10.1177/10943420231176527
Version
info:eu-repo/semantics/acceptedVersion
Abstract
Malleability allows computing facilities to adapt their workloads through resource management systems to maximize the throughput of the facility and the efficiency of the executed jobs. This technique is based on reconfiguring a job to use a different amount of resources during execution and then continuing with it. One of the stages of malleability is the dynamic spawning of processes at run time; decisions made in this stage affect how the next stage, data redistribution, is performed, which is the most time-consuming stage. This paper describes different methods and strategies, defining eight alternatives for spawning processes dynamically, and indicates which one should be used depending on whether the application is strong or weak scaling. In addition, for both types of applications, it describes which strategies most benefit application performance or system productivity. The results show that reducing the number of spawned processes by reusing the existing ones can reduce reconfiguration time compared to the classical method by up to 2.6 times for expanding and up to 36 times for shrinking. Furthermore, the asynchronous strategy requires analysing the impact of oversubscription on application performance.
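The abstract's point about reusing existing processes can be illustrated with a small counting sketch. It is not the authors' implementation: the function name `processes_to_spawn` and its parameters are hypothetical, and it assumes that the classical method spawns a complete new group of processes while the reuse approach spawns only the shortfall. It counts processes only and does not model reconfiguration time, so it says nothing about the measured speedups reported in the paper.

```python
def processes_to_spawn(current: int, target: int, reuse: bool) -> int:
    """Return how many new processes a reconfiguration would create.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if current <= 0 or target <= 0:
        raise ValueError("process counts must be positive")
    if not reuse:
        # Assumption: the classical method spawns a whole new group of
        # `target` processes and discards the old group.
        return target
    # Reuse: keep the existing processes and create only the shortfall.
    # When shrinking, nothing new has to be spawned.
    return max(0, target - current)


if __name__ == "__main__":
    # One expansion (4 -> 16) and one shrink (16 -> 4).
    for current, target in [(4, 16), (16, 4)]:
        classical = processes_to_spawn(current, target, reuse=False)
        reusing = processes_to_spawn(current, target, reuse=True)
        print(f"{current} -> {target}: classical={classical}, reuse={reusing}")
```

Under these assumptions, shrinking with reuse creates no new processes at all, which is consistent with the abstract reporting a larger gain for shrinking than for expanding.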
Funder Name
MCIN/AEI/10.13039/501100011033 | Universitat Jaume I | Valencian Region Government and European Social Funds
Project code
PID2020-113656RB-C21 | UJI-B2019-36 | APOSTD/2020/026 | ACIF/2021/260
This item appears in the following collection(s)
- ICC_Articles [427]