Unleashing GPU acceleration for symmetric band linear algebra kernels and model reduction
Other documents by the authors: Benner, Peter; Dufrechou, Ernesto; Ezzatti, Pablo; Quintana-Orti, Enrique S.; Remón Gómez, Alfredo
Metadata
Title
Unleashing GPU acceleration for symmetric band linear algebra kernels and model reduction
Authors
Benner, Peter; Dufrechou, Ernesto; Ezzatti, Pablo; Quintana-Orti, Enrique S.; Remón Gómez, Alfredo
Publication date
2015-12
Version
Preprint, author's version
Publisher
© Springer International Publishing AG
ISSN
1573-7543
Bibliographic citation
BENNER, Peter, et al. Unleashing GPU acceleration for symmetric band linear algebra kernels and model reduction. Cluster Computing, 2015, 18.4: 1351-1362.
Document type
info:eu-repo/semantics/article
Publisher's version
http://rd.springer.com/article/10.1007/s10586-015-0489-x
Abstract
Linear algebra operations arise in a myriad of scientific and engineering applications and, therefore, their optimization is targeted by a significant number of high performance computing (HPC) research efforts. In particular, the matrix multiplication and the solution of linear systems are two key problems with efficient implementations (or kernels) for a variety of high performance parallel architectures. For these specific problems, leveraging the structure of the associated matrices often leads to remarkable time and memory savings, as is the case, e.g., for symmetric band problems. In this work, we exploit the ample hardware concurrency of many-core graphics processors (GPUs) to accelerate the solution of symmetric positive definite band linear systems, introducing highly tuned versions of the corresponding LAPACK routines. The experimental results with the new GPU kernels reveal important reductions of the execution time when compared with tuned implementations of the same operations provided in Intel’s MKL. In addition, we evaluate the performance of the GPU kernels when applied to the solution of model order reduction problems and the associated matrix equations.
Published in
Cluster Computing, 2015, 18.4: 1351-1362
Access rights
http://rightsstatements.org/vocab/CNE/1.0/
info:eu-repo/semantics/openAccess
Appears in collections
- ICC_Articles [427]