Fine-grained bit-flip protection for relaxation methods
This resource is restricted.
https://doi.org/10.1016/j.jocs.2016.11.013
Metadata
Title
Fine-grained bit-flip protection for relaxation methods
Publication date
2019-09
Publisher
Elsevier
Document type
info:eu-repo/semantics/article
Publisher's version
https://www.sciencedirect.com/science/article/pii/S1877750316303891
Version
info:eu-repo/semantics/publishedVersion
Abstract
Resilience is considered a challenging, under-addressed issue that the high-performance computing (HPC) community will have to face in order to produce reliable Exascale systems by the beginning of the next decade. As part of a push toward a resilient HPC ecosystem, in this paper we propose an error-resilient iterative solver for sparse linear systems based on stationary component-wise relaxation methods. Starting from a plain implementation of the Jacobi iteration, our approach introduces a low-cost component-wise technique that detects bit-flips, rejects some component updates, and turns the initially synchronized solver into an asynchronous iteration. Our experimental study, with sparse incomplete factorizations from a collection of real-world applications and a practical GPU implementation, exposes the convergence delay incurred by the fault-tolerant implementation and its practical performance.
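The idea summarized in the abstract can be illustrated with a minimal sketch. The record does not state the paper's actual detection criterion, so the acceptance test below (reject non-finite values and changes larger than `growth_limit` times the largest change of the last fault-free sweep) is an assumption made only for illustration. The names `flip_bit`, `protected_jacobi`, `growth_limit`, `abs_tol` and `fault` are hypothetical, and the sketch uses dense lists on the CPU rather than the sparse GPU setting of the paper.

```python
import math
import struct


def flip_bit(value, bit):
    """Flip one bit (0..63) of an IEEE 754 double to emulate a soft error."""
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", raw ^ (1 << bit)))
    return flipped


def protected_jacobi(A, b, sweeps=60, growth_limit=2.0, abs_tol=1e-12, fault=None):
    """Jacobi sweeps with an assumed per-component plausibility check.

    fault is an optional (sweep, component, bit) triple that injects one bit-flip.
    Returns the final iterate and the number of rejected component updates.
    """
    n = len(b)
    x = [0.0] * n
    prev_max_change = math.inf  # no history yet: the first sweep accepts finite values
    rejected = 0
    for k in range(sweeps):
        x_new = list(x)
        max_change = 0.0
        sweep_rejected = False
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            candidate = (b[i] - s) / A[i][i]
            if fault is not None and fault[0] == k and fault[1] == i:
                candidate = flip_bit(candidate, fault[2])  # injected soft error
            # Assumed criterion (not taken from the paper): reject non-finite
            # values and implausibly large jumps.
            limit = growth_limit * prev_max_change + abs_tol
            if not math.isfinite(candidate) or abs(candidate - x[i]) > limit:
                rejected += 1
                sweep_rejected = True
                continue  # keep the old value of this component
            x_new[i] = candidate
            max_change = max(max_change, abs(candidate - x[i]))
        x = x_new
        # Tighten the threshold only after a sweep without rejections, so that
        # a skipped component can catch up in the following sweep.
        if not sweep_rejected and max_change > 0.0:
            prev_max_change = max_change
    return x, rejected


if __name__ == "__main__":
    A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
    b = [2.0, 4.0, 10.0]  # exact solution is [1, 2, 3]
    print(protected_jacobi(A, b))
    print(protected_jacobi(A, b, fault=(5, 1, 62)))
```

A rejected component keeps its previous value while the other components move on, so different components may temporarily hold values from different sweeps. This is the sense in which rejecting updates turns the synchronized Jacobi solver into an asynchronous iteration, at the cost of some convergence delay.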
Research project
U.S. Department of Energy (Award Number DE-SC-0010042) and NVIDIA; MINECO and FEDER (project CICYT TIN2014-53495-R).
Access rights
© 2016 Elsevier B.V. All rights reserved.
http://rightsstatements.org/vocab/InC/1.0/
info:eu-repo/semantics/restrictedAccess
Appears in collections
- ICC_Articles [424]