Systematic derivation of time and power models for linear algebra kernels on multicore architectures
Other documents of the author: Malossi, A. Cristiano I.; Ineichen, Yves; Bekas, Costas; Curioni, Alessandro; Quintana-Orti, Enrique S.
Metadata
This resource is restricted
http://dx.doi.org/10.1016/j.suscom.2015.02.001
Title
Systematic derivation of time and power models for linear algebra kernels on multicore architectures
Author(s)
Malossi, A. Cristiano I.; Ineichen, Yves; Bekas, Costas; Curioni, Alessandro; Quintana-Orti, Enrique S.
Date
2015-09
Publisher
Elsevier
Bibliographic citation
MALOSSI, A. Cristiano I., et al. Systematic derivation of time and power models for linear algebra kernels on multicore architectures. Sustainable Computing: Informatics and Systems, 2015, vol. 7, p. 24-40.
Type
info:eu-repo/semantics/article
Publisher version
http://www.sciencedirect.com/science/article/pii/S2210537915000037
Abstract
The power wall asks for a holistic effort from the high performance and scientific communities to develop power-aware tools and applications which ultimately drive the design of energy-efficient hardware. Toward this goal, we introduce a systematic methodology to derive reliable time and power models for algebraic kernels employing a bottom-up approach. This strategy helps to understand the contribution of the different kernels to the total energy consumption of applications, as well as to distinguish between the cost of fine-grain components such as arithmetic, memory access, and overheads introduced by, e.g., multithreading or reductions.
To study and validate our methodology, we initially focus on two key memory-bound BLAS-1 vector kernels: the dot product and the axpy operation. Subsequently, we show how these kernels can be composed to accurately predict the energy consumption of more heterogeneous algorithms, such as the Conjugate Gradient method, while tackling the elaborate memory hierarchy and the high degree of concurrency of today's processors; in particular, the evaluation of the models on the IBM® Blue Gene/Q supercomputer, as well as on the IBM® Power 755 server, reveals that average power consumption is captured at high accuracy, yet the models and the methodology are universal to be portable to any general-purpose multicore architecture.
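As an illustration of the composition idea in the abstract, the following minimal sketch builds a Conjugate Gradient solver out of the two BLAS-1 kernels it names, the dot product and axpy, and counts how often each kernel runs. It is not the authors' model or code: the function names, the `calls` counter, and the small test system are assumptions made for this example, and the matrix-vector product is a separate kernel that is only counted here, not modeled. A bottom-up estimate in the spirit of the abstract would presumably weight such per-kernel counts by measured per-call time or energy costs.

```python
from collections import Counter

calls = Counter()  # hypothetical bookkeeping: how many times each kernel ran


def dot(x, y):
    """BLAS-1 dot product: sum over i of x[i] * y[i]."""
    calls["dot"] += 1
    return sum(a * b for a, b in zip(x, y))


def axpy(alpha, x, y):
    """BLAS-1 axpy: returns alpha * x + y as a new list."""
    calls["axpy"] += 1
    return [alpha * a + b for a, b in zip(x, y)]


def matvec(A, x):
    """Dense matrix-vector product (not a BLAS-1 kernel; counted separately)."""
    calls["matvec"] += 1
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def conjugate_gradient(A, b, tol=1e-12, max_iter=100):
    """Solve A x = b for a symmetric positive definite A using dot and axpy."""
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x for the initial guess x = 0
    p = list(r)              # first search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr <= tol:        # stop once the squared residual norm is small
            break
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)
        x = axpy(alpha, p, x)        # x = x + alpha * p
        r = axpy(-alpha, Ap, r)      # r = r - alpha * A p
        rr_new = dot(r, r)
        p = axpy(rr_new / rr, p, r)  # p = r + beta * p
        rr = rr_new
    return x


# Small symmetric positive definite system chosen for this example.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(A, b)
```

Each iteration of this sketch issues two dot products, three axpy operations, and one matrix-vector product, so `calls` records the kernel mix that a per-kernel cost model would need as input.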
Is part of
Sustainable Computing: Informatics and Systems, Volume 7, September 2015
Rights
Copyright © 2015 Elsevier B.V. All rights reserved.
http://rightsstatements.org/vocab/InC/1.0/
info:eu-repo/semantics/restrictedAccess
This item appears in the following collection(s)
- ICC_Articles [239]