Browsing ICC_Articles by author "511c94cb-8547-4534-af1b-edcfd848481f"
Showing items 1-20 of 37
A pipeline structure for the block QR update in digital signal processing
Dolz, Manuel F.; Alventosa, Fran J.; Alonso-Jordá, Pedro Springer (2019-03)There exist problems in the field of digital signal processing, such as filtering of acoustic signals that require processing a large amount of data in real time. The beamforming algorithm, for instance, is a process that ... -
A similarity study of I/O traces via string kernels
Torres, Raul; Kunkel, Julian; Dolz, Manuel F.; Ludwig, Thomas Springer (2018-07)Understanding I/O for data-intense applications is the foundation for the optimization of these applications. The classification of the applications according to the expressed I/O access pattern eases the analysis. An ... -
A simulator to assess energy saving strategies and policies in HPC workloads
Quintana-Orti, Enrique S.; Mayo, Rafael; Iserte, Sergio; Fernández Fernández, Juan Carlos; Dolz, Manuel F. Association for Computing Machinery (ACM) (2012-07)In recent years power consumption of high performance computing (HPC) clusters has become a growing problem due, e.g., to the economic cost of electricity, the emission of carbon dioxide (with negative impact on the ... -
Adapting concurrency throttling and voltage–frequency scaling for dense eigensolvers
Aliaga Estellés, José Ignacio; Barreda Vayá, Maria; Castaño Álvarez, María Asunción; Dolz, Manuel F.; Quintana-Orti, Enrique S. Springer Verlag (2015)We analyze power dissipation and energy consumption during the execution of high-performance dense linear algebra kernels on multi-core processors. On top of this analysis, we propose and evaluate several strategies to ... -
An adaptive offline implementation selector for heterogeneous parallel platforms
del Río Astorga, David; Dolz, Manuel F.; Sánchez García, Luis Miguel; Fernández Muñoz, Javier; García, J. Daniel Universidad de Salamanca (2017-03)Heterogeneous parallel platforms, comprising multiple processing units and architectures, have become a cornerstone in improving the overall performance and energy efficiency of scientific and engineering applications. ... -
An analytical methodology to derive power models based on hardware and software metrics
Dolz, Manuel F.; Kunkel, Julian; Chasapis, Konstantinos; Catalán, Sandra Springer Berlin Heidelberg (2015-09)The use of models to predict the power consumption of a system is an appealing alternative to wattmeters since they avoid hardware costs and are easy to deploy. In this paper, we present a systematic ... -
Analyzing the impact of the MPI allreduce in distributed training of convolutional neural networks
Castelló, Adrián; Catalán Carbó, Mar; Dolz, Manuel F.; Quintana-Orti, Enrique S.; Duato, José Springer (2022-01-10)For many distributed applications, data communication poses an important bottleneck from the points of view of performance and energy consumption. As more cores are integrated per node, in general the global performance ... -
Are our dense linear algebra libraries energy-friendly? Time–power–energy trade-offs in BLAS and LAPACK
Aliaga Estellés, José Ignacio; Barreda Vayá, Maria; Dolz, Manuel F.; Quintana-Orti, Enrique S. Springer Berlin Heidelberg (2015-05)In this paper we conduct a detailed analysis of the sources of power dissipation and energy consumption during the execution of current dense linear algebra kernels on multicore processors, binding these two metrics together ... -
Assessing Power Monitoring Approaches for Energy and Power Analysis of Computers
El Mehdi Diouri, Mohammed; Dolz, Manuel F.; Glück, Olivier; Lefèvre, Laurent; Alonso-Jordá, Pedro; Catalán, Sandra; Mayo, Rafael; Quintana-Orti, Enrique S. Elsevier (2014-06)Large-scale distributed systems (e.g., datacenters, HPC systems, clouds, large-scale networks, etc.) consume and will consume enormous amounts of energy. Therefore, accurately monitoring the power dissipation and energy ... -
Assessing the impact of the CPU power-saving modes on the task-parallel solution of sparse linear systems
Aliaga Estellés, José Ignacio; Barreda Vayá, Maria; Dolz, Manuel F.; Martín Huertas, Alberto F.; Mayo, Rafael; Quintana-Orti, Enrique S. Springer US (2014)We investigate the benefits that an energy-aware implementation of the runtime in charge of the concurrent execution of ILUPACK —a sophisticated preconditioned iterative solver for sparse linear systems— produces on the ... -
Automatic generation of ARM NEON micro‑kernels for matrix multiplication
Alaejos, Guillermo; Martínez, Héctor; Castelló, Adrián; Dolz, Manuel F.; Igual, Francisco; Alonso-Jordá, Pedro; Quintana-Orti, Enrique S. Springer (2024-03-12)General matrix multiplication (gemm) is a fundamental kernel in scientific computing and current frameworks for deep learning. Modern realisations of gemm are mostly written in C, on top of a small, highly tuned micro-kernel ... -
Balancing task- and data-level parallelism to improve performance and energy consumption of matrix computations on the Intel Xeon Phi
Dolz, Manuel F.; Igual, Francisco; Ludwig, Thomas; Piñuel, Luis; Quintana-Orti, Enrique S. Elsevier (2015-08)The emergence of new manycore architectures, such as the Intel Xeon Phi, poses new challenges in how to adapt existing libraries and applications to this type of systems. In particular, the exploitation of manycore ... -
BestOf: an online implementation selector for the training and inference of deep neural networks
Barrachina Mir, Sergio; Castelló, Adrián; Dolz, Manuel F.; Tomás, Andrés E. Springer (2022-05-20)Tuning and optimising the operations executed in deep learning frameworks is a fundamental task in accelerating the processing of deep neural networks (DNNs). However, this optimisation usually requires extensive manual ... -
Convolutional neural nets for estimating the run time and energy consumption of the sparse matrix-vector product
Barreda Vayá, Maria; Dolz, Manuel F.; Castaño Álvarez, María Asunción Sage (2020-08-26)Modeling the performance and energy consumption of the sparse matrix-vector product (SpMV) is essential to perform off-line analysis and, for example, choose a target computer architecture that delivers the best ... -
Detecting semantic violations of lock-free data structures through C++ contracts
López-Gómez, Javier; del Río Astorga, David; Dolz, Manuel F.; Fernández Muñoz, Javier; García, J. Daniel Springer (2019-03)The use of synchronization mechanisms in multithreaded applications is essential on shared-memory multi-core architectures. However, debugging parallel applications to avoid potential failures, such as data races or ... -
DVFS-control techniques for dense linear algebra operations on multi-core processors
Alonso-Jordá, Pedro; Dolz, Manuel F.; Igual, Francisco; Mayo, Rafael; Quintana-Orti, Enrique S. Springer (2012-11)This paper analyzes the impact on power consumption of two DVFS-control strategies when applied to the execution of dense linear algebra operations on multi-core processors. The strategies considered here, prototyped as ... -
Efficient and portable GEMM-based convolution operators for deep neural network training on multicore processors
Barrachina Mir, Sergio; Dolz, Manuel F.; San Juan, Pablo; Quintana-Orti, Enrique S. Elsevier (2022-05-30)Convolutional Neural Networks (CNNs) play a crucial role in many image recognition and classification tasks, recommender systems, brain-computer interfaces, etc. As a consequence, there is a notable interest in developing ... -
Efficient and portable Winograd convolutions for multi-core processors
Dolz, Manuel F.; Martínez, Héctor; Castelló, Adrián; Alonso-Jordá, Pedro; Quintana-Orti, Enrique S. Springer (2023-02-12)We take a step forward towards developing high-performance codes for the convolution operator, based on the Winograd algorithm, that are easy to customise for general-purpose processor architectures. In our approach, ... -
Energy-efficient execution of dense linear algebra algorithms on multi-core processors
Alonso-Jordá, Pedro; Dolz, Manuel F.; Mayo, Rafael; Quintana-Orti, Enrique S. Springer Verlag (2013-09)This paper addresses the efficient exploitation of task-level parallelism, present in many dense linear algebra operations, from the point of view of both computational performance and energy consumption. The strategies ... -
Enhancing performance and energy consumption of runtime schedulers for dense linear algebra
Alonso-Jordá, Pedro; Dolz, Manuel F.; Igual, Francisco; Mayo, Rafael; Quintana-Orti, Enrique S. Wiley (2014-06)The road towards Exascale Computing requires a holistic effort to address three different challenges simultaneously: high performance, energy efficiency, and programmability. The use of runtime task schedulers to orchestrate ...