Systematic derivation of time and power models for linear algebra kernels on multicore architectures
Other documents by the authors: Malossi, A. Cristiano I.; Ineichen, Yves; Bekas, Costas; Curioni, Alessandro; Quintana-Orti, Enrique S.
Research
This resource is restricted
http://dx.doi.org/10.1016/j.suscom.2015.02.001
Metadata
Title
Systematic derivation of time and power models for linear algebra kernels on multicore architectures
Authors
Malossi, A. Cristiano I.; Ineichen, Yves; Bekas, Costas; Curioni, Alessandro; Quintana-Orti, Enrique S.
Publication date
2015-09
Publisher
Elsevier
Bibliographic citation
MALOSSI, A. Cristiano I., et al. Systematic derivation of time and power models for linear algebra kernels on multicore architectures. Sustainable Computing: Informatics and Systems, 2015, vol. 7, p. 24-40.
Document type
info:eu-repo/semantics/article
Publisher's version
http://www.sciencedirect.com/science/article/pii/S2210537915000037
Abstract
The power wall asks for a holistic effort from the high performance and scientific communities to develop power-aware tools and applications which ultimately drive the design of energy-efficient hardware. Toward this goal, we introduce a systematic methodology to derive reliable time and power models for algebraic kernels employing a bottom-up approach. This strategy helps to understand the contribution of the different kernels to the total energy consumption of applications, as well as to distinguish between the cost of fine-grain components such as arithmetic, memory access, and overheads introduced by, e.g., multithreading or reductions.
To study and validate our methodology, we initially focus on two key memory-bound BLAS-1 vector kernels: the dot product and the axpy operation. Subsequently, we show how these kernels can be composed to accurately predict the energy consumption of more heterogeneous algorithms, such as the Conjugate Gradient method, while tackling the elaborate memory hierarchy and the high degree of concurrency of today's processors. In particular, the evaluation of the models on the IBM® Blue Gene/Q supercomputer, as well as on the IBM® Power 755 server, reveals that average power consumption is captured with high accuracy, while the models and the methodology are general enough to be ported to any general-purpose multicore architecture.
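As a rough illustration of the bottom-up idea in the abstract, the sketch below writes the dot product and axpy kernels in plain Python, builds a textbook Conjugate Gradient solver from them, and counts kernel calls so that a per-kernel estimate can be summed. It is not the authors' code or model: the function names, the `counts` dictionary, and the simple "calls × time per call × average power" energy sum are assumptions made for illustration, and the per-call times and powers would have to come from measurements or fitted models that this record does not give.

```python
# Illustrative sketch only; NOT the models or code from the article.

def dot(x, y):
    """BLAS-1 dot product: sum_i x[i] * y[i]."""
    return sum(a * b for a, b in zip(x, y))

def axpy(alpha, x, y):
    """BLAS-1 axpy: returns alpha * x + y as a new list."""
    return [alpha * a + b for a, b in zip(x, y)]

def matvec(A, x):
    """Dense matrix-vector product; counted as its own kernel below."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b (A symmetric positive definite), counting kernel calls."""
    counts = {"dot": 0, "axpy": 0, "matvec": 0}
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the start x = 0
    p = list(r)                      # initial search direction
    rr = dot(r, r); counts["dot"] += 1
    for _ in range(max_iter):
        if rr ** 0.5 < tol:          # converged: residual norm below tol
            break
        Ap = matvec(A, p); counts["matvec"] += 1
        alpha = rr / dot(p, Ap); counts["dot"] += 1
        x = axpy(alpha, p, x); counts["axpy"] += 1     # x = x + alpha p
        r = axpy(-alpha, Ap, r); counts["axpy"] += 1   # r = r - alpha A p
        rr_new = dot(r, r); counts["dot"] += 1
        p = axpy(rr_new / rr, p, r); counts["axpy"] += 1  # p = r + beta p
        rr = rr_new
    return x, counts

def estimate_energy(counts, time_s, power_w):
    """Hypothetical composition: sum of calls * seconds per call * watts."""
    return sum(counts[k] * time_s[k] * power_w[k] for k in counts)
```

Each CG iteration in this sketch issues one matrix-vector product, two dot products, and three axpy operations, so the total estimate is driven by the iteration count and the per-kernel figures supplied to `estimate_energy`.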
Published in
Sustainable Computing: Informatics and Systems, Volume 7, September 2015
Access rights
Copyright © 2015 Elsevier B.V. All rights reserved.
http://rightsstatements.org/vocab/InC/1.0/
info:eu-repo/semantics/restrictedAccess
Appears in collections
- ICC_Articles [414]