Unleashing GPU acceleration for symmetric band linear algebra kernels and model reduction
Other documents of the author: Benner, Peter; Dufrechou, Ernesto; Ezzatti, Pablo; Quintana-Orti, Enrique S.; Remón Gómez, Alfredo
Metadata
Title
Unleashing GPU acceleration for symmetric band linear algebra kernels and model reduction
Author(s)
Benner, Peter; Dufrechou, Ernesto; Ezzatti, Pablo; Quintana-Orti, Enrique S.; Remón Gómez, Alfredo
Date
2015-12
Edition
Preprint, author's version
Publisher
© Springer International Publishing AG
ISSN
1573-7543
Bibliographic citation
BENNER, Peter, et al. Unleashing GPU acceleration for symmetric band linear algebra kernels and model reduction. Cluster Computing, 2015, 18.4: 1351-1362.
Type
info:eu-repo/semantics/article
Publisher version
http://rd.springer.com/article/10.1007/s10586-015-0489-x
Abstract
Linear algebra operations arise in a myriad of scientific and engineering applications and, therefore, their optimization is targeted by a significant number of high performance computing (HPC) research efforts. In particular, the matrix multiplication and the solution of linear systems are two key problems with efficient implementations (or kernels) for a variety of high performance parallel architectures. For these specific problems, leveraging the structure of the associated matrices often leads to remarkable time and memory savings, as is the case, e.g., for symmetric band problems. In this work, we exploit the ample hardware concurrency of many-core graphics processors (GPUs) to accelerate the solution of symmetric positive definite band linear systems, introducing highly tuned versions of the corresponding LAPACK routines. The experimental results with the new GPU kernels reveal important reductions of the execution time when compared with tuned implementations of the same operations provided in Intel’s MKL. In addition, we evaluate the performance of the GPU kernels when applied to the solution of model order reduction problems and the associated matrix equations.
Is part of
Cluster Computing, 2015, 18.4: 1351-1362
Rights
http://rightsstatements.org/vocab/CNE/1.0/
info:eu-repo/semantics/openAccess
This item appears in the following collection(s)
- ICC_Articles [427]