• closedAccess   Blocked algorithms for the reduction to Hessenberg-triangular form revisited 

      Kagstrom, B.; Kressner, Daniel; Quintana-Orti, Enrique S.; Quintana-Ortí, Gregorio Springer (2008-09)
      We present two variants of Moler and Stewart’s algorithm for reducing a matrix pair to Hessenberg-triangular (HT) form with increased data locality in the access to the matrices. In one of these variants, a careful ...
    • openAccess   Characterization of Multicore Architectures using Task-Parallel ILU-type Preconditioned CG Solvers 

      Aliaga Estellés, José Ignacio; Barreda Vayá, Maria; Quintana-Orti, Enrique S. (2017-07-05)
      We investigate the efficiency of state-of-the-art multicore processors using a multi-threaded task-parallel implementation of the Conjugate Gradient (CG) method, accelerated with an incomplete LU (ILU) preconditioner. ...
    • openAccess   Characterizing the efficiency of multicore and manycore processors for the solution of sparse linear systems 

      Aliaga Estellés, José Ignacio; Barreda Vayá, Maria; Dufrechou, Ernesto; Ezzatti, Pablo; Quintana-Orti, Enrique S. Springer Berlin Heidelberg (2015-09)
      We analyze the efficiency of servers equipped with state-of-the-art general-purpose multicore processors as well as platforms based on accelerators such as graphics processing units (GPUs) and the Intel Xeon Phi. Following ...
    • openAccess   Communication in task-parallel ILU-preconditioned CG solvers using MPI + OmpSs 

      Aliaga Estellés, José Ignacio; Barreda Vayá, Maria; Flegar, Goran; Bollhöffer, Matthias; Quintana-Orti, Enrique S. Wiley (2017-11-10)
      We target the parallel solution of sparse linear systems via iterative Krylov subspace–based methods enhanced with incomplete LU (ILU)-type preconditioners on clusters of multicore processors. In order to tackle large-scale ...
    • openAccess   Compressed basis GMRES on high-performance graphics processing units 

      Aliaga Estellés, José Ignacio; Anzt, Hartwig; Tomás Domínguez, Andrés Enrique; Quintana-Orti, Enrique S.; Grützmacher, Thomas Sage (2022-08-05)
      Krylov methods provide a fast and highly parallel numerical tool for the iterative solution of many large-scale sparse linear systems. To a large extent, the performance of practical realizations of these methods is ...
    • openAccess   Compression and load balancing for efficient sparse matrix-vector product on multicore processors and graphics processing units 

      Aliaga Estellés, José Ignacio; Anzt, Hartwig; Grützmacher, Thomas; Quintana-Orti, Enrique S.; Tomás Domínguez, Andrés Enrique John Wiley and Sons (2021)
      We contribute to the optimization of the sparse matrix-vector product by introducing a variant of the coordinate sparse matrix format that balances the workload distribution and compresses both the indexing arrays and the ...
    • openAccess   Concurrent and Accurate RNA Sequencing on Multicore Platforms 

      Martínez Pérez, Héctor; Tárraga, Joaquín; Medina, Ignacio; Barrachina Mir, Sergio; Castillo Catalán, María Isabel; Dopazo, Joaquín; Quintana-Orti, Enrique S. Departament d'Enginyeria i Ciència dels Computadors, Universitat Jaume I (2013-04-02)
      In this paper we introduce a novel parallel pipeline for fast and accurate mapping of RNA sequences on servers equipped with multicore processors. Our software, named HPG-aligner, leverages the speed of the Burrows-Whe ...
    • openAccess   Concurrent and Accurate Short Read Mapping on Multicore Processors 

      Martínez Pérez, Héctor; Tárraga, Joaquín; Medina, Ignacio; Barrachina Mir, Sergio; Castillo Catalán, María Isabel; Dopazo, Joaquín; Quintana-Orti, Enrique S. IEEE (2015-09)
      We introduce a parallel aligner with a work-flow organization for fast and accurate mapping of RNA sequences on servers equipped with multicore processors. Our software, HPG Aligner SA, exploits a suffix array to rapidly ...
    • closedAccess   Condensed forms for the symmetric eigenvalue problem on multi-threaded architectures 

      Bientinesi, Paolo; Igual, Francisco; Kressner, Daniel; Petschow, Matthias; Quintana-Orti, Enrique S. Wiley (2011-11-10)
      We investigate the performance of the routines in LAPACK and the Successive Band Reduction (SBR) toolbox for the reduction of a dense matrix to tridiagonal form, a crucial preprocessing stage in the solution of the symmetric ...
    • openAccess   Convolution Operators for Deep Learning Inference on the Fujitsu A64FX Processor 

      Dolz, Manuel F.; Martínez, Héctor; Alonso, Pedro; Quintana-Orti, Enrique S. IEEE (2022)
      The convolution operator is a crucial kernel for many computer vision and signal processing applications that rely on deep learning (DL) technologies. As such, the efficient implementation of this operator has received ...
    • openAccess   Deriving dense linear algebra libraries 

      Bientinesi, Paolo; Gunnels, John A.; Myers, Margaret E.; Quintana-Orti, Enrique S.; Rhodes, Tyler; Van de Geijn, Robert A.; Van Zee, Field G. Springer London (2013-11)
      Starting in the late 1960s, computer scientists including Dijkstra and Hoare advocated goal-oriented programming and the formal derivation of algorithms. The chief impediment to realizing this for loop-based programs was ...
    • openAccess   DMR API: Improving cluster productivity by turning applications into malleable 

      Iserte, Sergio; Mayo, Rafael; Quintana-Orti, Enrique S.; Beltrán, Vicenç; Peña Monferrer, Antonio J. Elsevier (2018)
      Adaptive workloads can change on-the-fly the configuration of their jobs, in terms of number of processes. To carry out these job reconfigurations, we have designed a methodology which enables a job to communicate with ...
    • openAccess   DMRlib: Easy-coding and Efficient Resource Management for Job Malleability 

      Iserte, Sergio; Mayo, Rafael; Quintana-Orti, Enrique S.; Pena, Antonio J. IEEE (2020-09-09)
      Process malleability has proved to have a highly positive impact on the resource utilization and global productivity in data centers compared with the conventional static resource allocation policy. However, the non-negligible ...
    • closedAccess   DVFS-control techniques for dense linear algebra operations on multi-core processors 

      Alonso-Jordá, Pedro; Dolz, Manuel F.; Igual, Francisco; Mayo, Rafael; Quintana-Orti, Enrique S. Springer (2012-11)
      This paper analyzes the impact on power consumption of two DVFS-control strategies when applied to the execution of dense linear algebra operations on multi-core processors. The strategies considered here, prototyped as ...
    • openAccess   DVFS-Technique for Dense Linear Algebra Operations on Multi-Core Processors 

      Alonso-Jordá, Pedro; Dolz, Manuel F.; Mayo, Rafael; Quintana-Orti, Enrique S. Departament d' Enginyeria i Ciència dels Computadors, Universitat Jaume I (2011-05)
      This paper addresses the efficient exploitation of task-level parallelism, present in many dense linear algebra operations, from the point of view of both computational performance and energy consumption. In particular, ...
    • openAccess   Dynamic Management of Resource Allocation for OmpSs Jobs 

      Iserte, Sergio; Peña Monferrer, Antonio J.; Mayo, Rafael; Quintana-Orti, Enrique S.; Beltran Querol, Vicenç Carretero Pérez, Jesús (2016-02)
      The main purpose of this thesis is to investigate the relation between task-based programming models and resource management systems in order to provide a smart autonomous load-balancing and fault-tolerant system. Thus, ...
    • openAccess   Efficient and portable GEMM-based convolution operators for deep neural network training on multicore processors 

      Barrachina Mir, Sergio; Dolz, Manuel F.; San Juan, Pablo; Quintana-Orti, Enrique S. Elsevier (2022-05-30)
      Convolutional Neural Networks (CNNs) play a crucial role in many image recognition and classification tasks, recommender systems, brain-computer interfaces, etc. As a consequence, there is a notable interest in developing ...
    • openAccess   Efficient and portable Winograd convolutions for multi-core processors 

      Dolz, Manuel F.; Martínez, Héctor; Castelló, Adrián; Alonso-Jordá, Pedro; Quintana-Orti, Enrique S. Springer (2023-02-12)
      We take a step forward towards developing high-performance codes for the convolution operator, based on the Winograd algorithm, that are easy to customise for general-purpose processor architectures. In our approach, ...
    • closedAccess   Efficient Implementation of Hyperspectral Anomaly Detection Techniques on GPUs and Multicore Processors 

      Molero, Jose M.; Garzon, E.M.; García, I.; Quintana-Orti, Enrique S.; Plaza, Antonio IEEE (2014)
      Anomaly detection is an important task for hyperspectral data exploitation. Although many algorithms have been developed for this purpose in recent years, due to the large dimensionality of hyperspectral image data, fast ...
    • openAccess   Efficient model order reduction of large-scale systems on multi-core platforms 

      Ezzatti, Pablo; Quintana-Orti, Enrique S.; Remón Gómez, Alfredo Springer (2011)
      We propose an efficient implementation of the Balanced Truncation (BT) method for model order reduction when the state-space matrix is symmetric (positive definite). Most of the computational effort required by this method ...