Compressed basis GMRES on high-performance graphics processing units
Other documents by the authors: Aliaga Estellés, José Ignacio; Anzt, Hartwig; Tomás Domínguez, Andrés Enrique; Quintana-Ortí, Enrique S.; Grützmacher, Thomas
Metadata
Title
Compressed basis GMRES on high-performance graphics processing units
Authors
Aliaga Estellés, José Ignacio; Anzt, Hartwig; Tomás Domínguez, Andrés Enrique; Quintana-Ortí, Enrique S.; Grützmacher, Thomas
Publication date
2022-08-05
Publisher
Sage
Bibliographic citation
Aliaga JI, Anzt H, Grützmacher T, Quintana-Ortí ES, Tomás AE. Compressed basis GMRES on high-performance graphics processing units. The International Journal of High Performance Computing Applications. 2023;37(2):82-100. doi:10.1177/10943420221115140
Document type
info:eu-repo/semantics/article
Publisher's version
https://journals.sagepub.com/doi/full/10.1177/10943420221115140
Version
info:eu-repo/semantics/publishedVersion
Abstract
Krylov methods provide a fast and highly parallel numerical tool for the iterative solution of many large-scale sparse linear
systems. To a large extent, the performance of practical realizations of these methods is constrained by the communication
bandwidth in current computer architectures, motivating the investigation of sophisticated techniques to avoid, reduce,
and/or hide the message-passing costs (in distributed platforms) and the memory accesses (in all architectures). This article
leverages Ginkgo’s memory accessor in order to integrate a communication-reduction strategy into the (Krylov) GMRES
solver that decouples the storage format (i.e., the data representation in memory) of the orthogonal basis from the
arithmetic precision that is employed during the operations with that basis. Given that the execution time of the GMRES
solver is largely determined by the memory accesses, the cost of the datatype transforms can be mostly hidden, resulting in
the acceleration of the iterative step via a decrease in the volume of bits being retrieved from memory. Together with the
special properties of the orthonormal basis (whose elements are all bounded by 1), this paves the road toward the
aggressive customization of the storage format, which includes some floating-point as well as fixed-point formats with mild
impact on the convergence of the iterative process. We develop a high-performance implementation of the “compressed
basis GMRES” solver in the Ginkgo sparse linear algebra library using a large set of test problems from the SuiteSparse
Matrix Collection. We demonstrate robustness and performance advantages on a modern NVIDIA V100 graphics
processing unit (GPU) of up to 50% over the standard GMRES solver that stores all data in IEEE double-precision.
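
The decoupling of storage format from arithmetic precision described in the abstract can be illustrated with a minimal sketch. This is not Ginkgo's memory accessor or its API: the helper names (`compress`, `load`, `dot`) and the 16-bit fixed-point scale are hypothetical choices made only for illustration. The sketch stores a unit-norm vector (entries bounded by 1) as 16-bit integers and converts each entry to double precision at access time, so all arithmetic still runs in double.

```python
import array
import math

# Hypothetical scale: map [-1, 1] onto the signed 16-bit integer range.
SCALE = 2**15 - 1


def compress(vec):
    """Store a vector with entries in [-1, 1] as 16-bit fixed-point integers."""
    return array.array("h", (round(x * SCALE) for x in vec))


def load(stored, i):
    """Convert one stored entry back to a double-precision float on access."""
    return stored[i] / SCALE


def dot(stored, w):
    """Dot product that reads compressed storage but accumulates in double."""
    return math.fsum(load(stored, i) * w[i] for i in range(len(w)))


# Build a unit-norm vector, so every entry is bounded by 1 in magnitude.
raw = [3.0, -4.0, 12.0]
norm = math.sqrt(sum(x * x for x in raw))
v = [x / norm for x in raw]

v16 = compress(v)          # 2 bytes per entry instead of 8
w = [1.0, 2.0, 3.0]

exact = math.fsum(a * b for a, b in zip(v, w))
approx = dot(v16, w)

print(f"bytes per entry: {v16.itemsize}")
print(f"absolute error of the dot product: {abs(approx - exact):.2e}")
```

Under these assumptions each entry carries a rounding error of at most 0.5 / SCALE, while the volume of data read from memory drops by a factor of four relative to double-precision storage. That trade-off is what the abstract refers to as a mild impact on convergence in exchange for fewer bits retrieved from memory.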
Published in
The International Journal of High Performance Computing Applications 2023, 37 (2)
Funding entity
Ministerio de Ciencia, Innovación y Universidades (Spain) | Helmholtz Association | US Exascale Computing Project
Project or grant code
PID2020-113656RB-C21 | PID2020-113656RB-C22 | VH-NG-1241 | 17-SC-20-SC
Access rights
© The Author(s) 2022
info:eu-repo/semantics/openAccess
Appears in collections
- ICC_Articles [423]