    • openAccess   Hyperspectral Unmixing on Multicore DSPs: Trading Off Performance for Energy

      Castillo Catalán, María Isabel; Fernández Fernández, Juan Carlos; Igual, Francisco; Plaza, Antonio; Quintana-Orti, Enrique S.; Remón Gómez, Alfredo IEEE (2014)
      Wider coverage of observation missions will increase onboard power restrictions while, at the same time, pose higher demands from the perspective of processing time, thus asking for the exploration of novel high-performance ...
    • closedAccess   Multi-threaded dense linear algebra libraries for low-power asymmetric multicore processors 

      Catalán, Sandra; Herrero Zaragoza, José R.; Igual, Francisco; Rodríguez Sánchez, Rafael; Quintana-Orti, Enrique S.; Adeniyi-Jones, Chris Elsevier (2018-03)
      Dense linear algebra libraries, such as BLAS and LAPACK, provide a relevant collection of numerical tools for many scientific and engineering applications. While there exist high performance implementations of the BLAS ...
    • openAccess   Optimized Fundamental Signal Processing Operations For Energy Minimization on Heterogeneous Mobile Devices 

      Belloch, Jose A.; Badía, José; Igual, Francisco; González, Alberto; Quintana-Orti, Enrique S. IEEE (2018-05)
      Numerous signal processing applications are emerging on both mobile and high-performance computing systems. These applications are subject to responsiveness constraints for user interactivity and, at the same time, must ...
    • openAccess   Out-of-Core Solution of Linear Systems on Graphic Processors 

      Castillo Catalán, María Isabel; Igual, Francisco; Mayo, Rafael; Rubio, Rafael; Quintana-Ortí, Gregorio; Quintana-Orti, Enrique S.; Van de Geijn, Robert A. Departament d' Enginyeria i Ciència dels Computadors, Universitat Jaume I (2008-05)
      We combine two high-level application programming interfaces to solve large-scale linear systems with the data stored on disk using current graphics processors. The result is a simple yet powerful tool that enables a ...
    • openAccess   Practical considerations for acoustic source localization in the IoT era: Platforms, energy efficiency, and performance 

      Belloch, Jose A.; Badía, José; Igual, Francisco; Cobos, Maximo IEEE (2019-06)
      The rapid development of the Internet of Things (IoT) has posed important changes in the way emerging acoustic signal processing applications are conceived. While traditional acoustic processing applications have been ...
    • openAccess   Programming parallel dense matrix factorizations with look-ahead and OpenMP 

      Catalán, Sandra; Castelló, Adrián; Igual, Francisco; Rodríguez Sánchez, Rafael; Quintana-Orti, Enrique S. Springer (2019)
      We investigate a parallelization strategy for dense matrix factorization (DMF) algorithms, using OpenMP, that departs from the legacy (or conventional) solution, which simply extracts concurrency from a multi-threaded ...
    • closedAccess   Reduction to Condensed Forms for Symmetric Eigenvalue Problems on Multi-core Architectures 

      Bientinesi, Paolo; Igual, Francisco; Kressner, Daniel; Quintana-Orti, Enrique S. Springer Berlin Heidelberg (2010)
      We investigate the performance of the routines in LAPACK and the Successive Band Reduction (SBR) toolbox for the reduction of a dense matrix to tridiagonal form, a crucial preprocessing stage in the solution of the symmetric ...
    • openAccess   Revisiting conventional task schedulers to exploit asymmetry in multi-core architectures for dense linear algebra operations 

      Costero, Luis; Igual, Francisco; Olcoz, Katzalin; Catalán, Sandra; Rodríguez Sánchez, Rafael; Quintana-Orti, Enrique S. Elsevier (2017)
      Dealing with asymmetry in the architecture opens a plethora of questions related with the performance- and energy-efficient scheduling of task-parallel applications. While there exist early attempts to tackle this problem, ...
    • closedAccess   Scheduling algorithms-by-blocks on small clusters 

      Igual, Francisco; Quintana-Ortí, Gregorio; Van de Geijn, Robert A. Wiley (2012-03-28)
      The arrival of multicore architectures has generated an interest in reformulating dense matrix computations as algorithms-by-blocks, where submatrices are units of data and computations with those blocks are units of ...
    • closedAccess   Solving dense generalized eigenproblems on multi-threaded architectures 

      Aliaga Estellés, José Ignacio; Bientinesi, Paolo; Davidovic, Davor; Di Napoli, Edoardo; Igual, Francisco; Quintana-Orti, Enrique S. Elsevier (2012-07)
      We compare two approaches to compute a fraction of the spectrum of dense symmetric definite generalized eigenproblems: one is based on the reduction to tridiagonal form, and the other on the Krylov-subspace iteration. ...
    • openAccess   Solving Dense Linear Systems on Graphics Processors 

      Barrachina Mir, Sergio; Castillo Catalán, María Isabel; Igual, Francisco; Mayo, Rafael; Quintana-Orti, Enrique S. Departament d' Enginyeria i Ciència dels Computadors, Universitat Jaume I (2008-02)
      We present several algorithms to compute the solution of a linear system of equations on a GPU, as well as general techniques to improve their performance, such as padding and hybrid GPU-CPU computation. We also show how ...
    • openAccess   Solving weighted least squares (WLS) problems on ARM-based architectures

      Belloch, Jose A.; Bank, Balázs; Igual, Francisco; Quintana-Orti, Enrique S.; Vidal, Antonio M. Springer (2016-11)
      The Weighted Least Squares algorithm (WLS) is applied to numerous optimization problems, but requires the use of high computational resources, especially when complex arithmetic is involved. This work aims to accelerate ...
    • closedAccess   Speeding up the log-polar transform with inexpensive parallel hardware: graphics units and multi-core architectures 

      Antonelli, Marco; Igual, Francisco; Ramos, Jose Francisco; Traver Roig, Vicente Javier Springer-Verlag (2012)
      Log-polar imaging is a kind of foveal, biologically inspired visual representation with advantageous properties in practical applications in computer vision, robotics, and other fields. While the cheapest, most flexible, ...
    • closedAccess   The FLAME approach: From dense linear algebra algorithms to high-performance multi-accelerator implementations 

      Igual, Francisco; Chan, Ernie; Quintana-Orti, Enrique S.; Quintana-Ortí, Gregorio; Van de Geijn, Robert A. Elsevier (2012)
      Parallel accelerators are playing an increasingly important role in scientific computing. However, it is perceived that their weakness nowadays is their reduced “programmability” in comparison with traditional general-purpose ...
    • closedAccess   Time and energy modeling of a high-performance multi-threaded Cholesky factorization 

      Catalán, Sandra; Igual, Francisco; Mayo, Rafael; Rodríguez Sánchez, Rafael; Quintana-Orti, Enrique S. Springer (2016-02-05)
      We present accurate time and energy piece-wise models of high-performance multi-threaded implementations for the general matrix multiplication, triangular system solve with multiple right-hand sides, and symmetric rank-k ...
    • closedAccess   Time and energy modeling of high-performance Level-3 BLAS on x86 architectures

      Alonso-Jordá, Pedro; Catalán, Sandra; Igual, Francisco; Mayo, Rafael; Rodríguez Sánchez, Rafael; Quintana-Orti, Enrique S. Elsevier (2015-06)
      We present accurate piece-wise models for the time and energy costs of high performance implementations of both the matrix multiplication (gemm) and the triangular system solve with multiple right-hand sides (trsm) on x86 ...