Unleashing GPU acceleration for symmetric band linear algebra kernels and model reduction
Authors: Benner, Peter; Dufrechou, Ernesto; Ezzatti, Pablo; Quintana-Orti, Enrique S.; Remón Gómez, Alfredo
Metadata
Title
Unleashing GPU acceleration for symmetric band linear algebra kernels and model reduction
Publication date
2015-12
Edition
Preprint, author's version
Publisher
© Springer International Publishing AG
ISSN
1573-7543
Bibliographic citation
BENNER, Peter, et al. Unleashing GPU acceleration for symmetric band linear algebra kernels and model reduction. Cluster Computing, 2015, 18.4: 1351-1362.
Document type
info:eu-repo/semantics/article
Publisher's version
http://rd.springer.com/article/10.1007/s10586-015-0489-x
Abstract
Linear algebra operations arise in a myriad of scientific and engineering applications and, therefore, their optimization is targeted by a significant number of high performance computing (HPC) research efforts. In particular, the matrix multiplication and the solution of linear systems are two key problems with efficient implementations (or kernels) for a variety of high performance parallel architectures. For these specific problems, leveraging the structure of the associated matrices often leads to remarkable time and memory savings, as is the case, e.g., for symmetric band problems. In this work, we exploit the ample hardware concurrency of many-core graphics processors (GPUs) to accelerate the solution of symmetric positive definite band linear systems, introducing highly tuned versions of the corresponding LAPACK routines. The experimental results with the new GPU kernels reveal important reductions of the execution time when compared with tuned implementations of the same operations provided in Intel’s MKL. In addition, we evaluate the performance of the GPU kernels when applied to the solution of model order reduction problems and the associated matrix equations.
Published in
Cluster Computing, 2015, 18.4: 1351-1362
Access rights
http://rightsstatements.org/vocab/CNE/1.0/
info:eu-repo/semantics/openAccess
Appears in collections
- ICC_Articles [427]