Assessing the impact of the CPU power-saving modes on the task-parallel solution of sparse linear systems
(Springer US, 2014)
We investigate the benefits that an energy-aware implementation of the runtime in charge of the concurrent execution of ILUPACK, a sophisticated preconditioned iterative solver for sparse linear systems, produces on the ...
Enhancing performance and energy consumption of runtime schedulers for dense linear algebra
(Wiley, 2014-06)
The road towards Exascale Computing requires a holistic effort to address three different challenges simultaneously: high performance, energy efficiency, and programmability. The use of runtime task schedulers to orchestrate ...
Solving dense generalized eigenproblems on multi-threaded architectures
(Elsevier, 2012-07)
We compare two approaches to compute a fraction of the spectrum of dense symmetric definite generalized eigenproblems: one is based on the reduction to tridiagonal form, and the other on the Krylov-subspace iteration. ...
GPU-based Dynamic Wave Field Synthesis using Fractional Delay Filters and Room Compensation
(IEEE, 2017-02)
Wave Field Synthesis (WFS) is a multichannel audio reproduction method of considerable computational cost that renders an accurate spatial sound field using a large number of loudspeakers to emulate virtual sound ...
On the performance of a GPU-based SoC in a distributed spatial audio system
(Springer, 2021-01-04)
Many current system-on-chip (SoC) devices are composed of low-power multicore processors combined with a small graphics accelerator (or GPU) offering a trade-off between computational capacity and low-power consumption. ...
The Impact of the Multi-core Revolution on Signal Processing
(Universidad Politécnica de Valencia, 2010)
This paper analyzes the influence of new multi-core and many-core architectures on Signal Processing. The article covers both the architectural design and the programming models of current general-purpose multi-core ...
Two-sided orthogonal reductions to condensed forms on asymmetric multicore processors
(Elsevier, 2018)
We investigate how to leverage the heterogeneous resources of an Asymmetric Multicore Processor (AMP) in order to deliver high performance in the reduction to condensed forms for the solution of dense eigenvalue and ...
Exploiting nested task-parallelism in the H-LU factorization
(Elsevier, 2019-04)
We address the parallelization of the LU factorization of hierarchical matrices (H-matrices) arising from boundary element methods. Our approach exploits task-parallelism via the OmpSs programming model and runtime, which ...