• openAccess   A Data-Parallel ILUPACK for Sparse General and Symmetric Indefinite Linear Systems 

      Aliaga Estellés, José Ignacio; Bollhöfer, Matthias; Dufrechou, Ernesto; Ezzatti, Pablo; Quintana-Orti, Enrique S. Springer (2017-05-28)
      The solution of sparse linear systems of large dimension is a critical step in problems that span a diverse range of applications. For this reason, a number of iterative solvers have been developed, among which ILUPACK ...
    • openAccess   Adaptive precision solvers for sparse linear systems 

      Anzt, Hartwig; Dongarra, Jack; Quintana-Orti, Enrique S. ACM (2015)
      We formulate an implementation of a Jacobi iterative solver for sparse linear systems that iterates the distinct components of the solution with different precision in terms of mantissa length. Starting with very low ...
    • openAccess   Balanced and Compressed Coordinate Layout for the Sparse Matrix-Vector Product on GPUs 

      Aliaga Estellés, José Ignacio; Anzt, Hartwig; Quintana-Orti, Enrique S.; Tomás Domínguez, Andrés Enrique; Tsai, Yuhsiang M. Springer (2021)
      We contribute to the optimization of the sparse matrix-vector product on graphics processing units by introducing a variant of the coordinate sparse matrix layout that compresses the integer representation of the matrix ...
    • openAccess   Characterization of Multicore Architectures using Task-Parallel ILU-type Preconditioned CG Solvers 

      Aliaga Estellés, José Ignacio; Barreda Vayá, Maria; Quintana-Orti, Enrique S. (2017-07-05)
      We investigate the efficiency of state-of-the-art multicore processors using a multi-threaded task-parallel implementation of the Conjugate Gradient (CG) method, accelerated with an incomplete LU (ILU) preconditioner. ...
    • openAccess   Convolution Operators for Deep Learning Inference on the Fujitsu A64FX Processor 

      Dolz, Manuel F.; Martínez, Héctor; Alonso, Pedro; Quintana-Orti, Enrique S. IEEE (2022)
      The convolution operator is a crucial kernel for many computer vision and signal processing applications that rely on deep learning (DL) technologies. As such, the efficient implementation of this operator has received ...
    • openAccess   Dynamic Management of Resource Allocation for OmpSs Jobs 

      Iserte, Sergio; Peña Monferrer, Antonio J.; Mayo, Rafael; Quintana-Orti, Enrique S.; Beltran Querol, Vicenç; Carretero Pérez, Jesús (2016-02)
      The main purpose of this thesis is to research the relation between task-based programming models and resource management systems in order to provide a smart, autonomous, load-balancing and fault-tolerant system. Thus, ...
    • openAccess   High Performance and Portable Convolution Operators for Multicore Processors 

      San Juan, Pablo; Castelló, Adrián; Dolz, Manuel F.; Alonso-Jordá, Pedro; Quintana-Orti, Enrique S. IEEE (2020-10)
      The considerable impact of Convolutional Neural Networks on many Artificial Intelligence tasks has led to the development of various high performance algorithms for the convolution operator present in this type of networks. ...
    • openAccess   Sobre el paralelismo anidado de tareas en la factorización LU de Matrices Jerárquicas 

      Carratalá-Sáez, Rocío; Quintana-Orti, Enrique S. Universidad de Extremadura (2019-09-20)
      This article presents a parallel version of the LU factorization of hierarchical matrices (H-matrices) arising from Boundary Element Methods (BEM). These matrices contain internal structures whose ...
    • closedAccess   Towards portable realizations of winograd-based convolution with vector intrinsics and OpenMP 

      Dolz, Manuel F.; Castelló, Adrián; Quintana-Orti, Enrique S. IEEE (2022)
      We take a step forward in the direction of developing high performance codes for the convolution, based on the Winograd transformation, that are easy to customize for different processor architectures. In our approach, ...
    • closedAccess   Tuning stationary iterative solvers for fault resilience 

      Anzt, Hartwig; Dongarra, Jack; Quintana-Orti, Enrique S. ACM. Association for Computing Machinery (2015)
      As the transistor’s feature size decreases following Moore’s Law, hardware will become more prone to permanent, intermittent, and transient errors, increasing the number of failures experienced by applications, and ...