A mixed-precision algorithm for the solution of Lyapunov equations on hybrid CPU–GPU platforms
Impacte
Scholar |
Altres documents de l'autoria: Benner, Peter; Ezzatti, Pablo; Kressner, Daniel; Quintana-Orti, Enrique S.; Remón Gómez, Alfredo
Metadades
Mostra el registre complet de l'elementcomunitat-uji-handle:10234/9
comunitat-uji-handle2:10234/7036
comunitat-uji-handle3:10234/8620
comunitat-uji-handle4:
INVESTIGACIONAquest recurs és restringit
http://dx.doi.org/10.1016/j.parco.2010.12.002 |
Metadades
Títol
A mixed-precision algorithm for the solution of Lyapunov equations on hybrid CPU–GPU platformsAutoria
Data de publicació
2011Editor
ElsevierISSN
0167-8191Cita bibliogràfica
Parallel Computing (Aug. 2011) vol. 37, no. 8, p. 439-450Tipus de document
info:eu-repo/semantics/articleVersió de l'editorial
http://www.sciencedirect.com/science/article/pii/S0167819110001560Paraules clau / Matèries
Resum
We describe a hybrid Lyapunov solver based on the matrix sign function, where the intensive parts of the computation are accelerated using a graphics processor (GPU) while executing the remaining operations on a ... [+]
We describe a hybrid Lyapunov solver based on the matrix sign function, where the intensive parts of the computation are accelerated using a graphics processor (GPU) while executing the remaining operations on a general-purpose multi-core processor (CPU). The initial stage of the iteration operates in single-precision arithmetic, returning a low-rank factor of an approximate solution. As the main computation in this stage consists of explicit matrix inversions, we propose a hybrid implementation of Gauß–Jordan elimination using look-ahead to overlap computations on GPU and CPU. To improve the approximate solution, we introduce an iterative refinement procedure that allows to cheaply recover full double-precision accuracy. In contrast to earlier approaches to iterative refinement for Lyapunov equations, this approach retains the low-rank factorization structure of the approximate solution. The combination of the two stages results in amixed-precision algorithm, that exploits the capabilities of both general-purpose CPUs and many-core GPUs and overlaps critical computations. Numerical experiments using real-world data and a platform equipped with two Intel Xeon QuadCore processors and an Nvidia Tesla C1060 show a significant efficiency gain of the hybrid method compared to a classical CPU implementation. [-]
Drets d'accés
© 2011 Elsevier Inc. All rights reserved
http://rightsstatements.org/vocab/InC/1.0/
info:eu-repo/semantics/restrictedAccess
http://rightsstatements.org/vocab/InC/1.0/
info:eu-repo/semantics/restrictedAccess
Apareix a les col.leccions
- ICC_Articles [430]