Browse by subject "GPUs"

A customized precision format based on mantissa segmentation for accelerating sparse linear algebra

Grützmacher, Thomas; Cojean, Terry; Flegar, Goran; Göbel, Fritz; Anzt, Hartwig Wiley (2019)

In this work, we pursue the idea of radically decoupling the floating point format used for arithmetic operations from the format used to store the data in memory. We complement this idea with a customized precision memory ...

Accelerating BST Methods for Model Reduction with Graphics Processors

Benner, Peter; Ezzatti, Pablo; Quintana-Orti, Enrique S.; Remón Gómez, Alfredo Springer Berlin Heidelberg (2012)

Model order reduction of a dynamical linear time-invariant system appears in many scientific and engineering applications. Numerically reliable SVD-based methods for this task require O(n^3) floating-point arithmetic operations, ...

Accelerating Model Reduction of Large Linear Systems with Graphics Processors

Benner, Peter; Ezzatti, Pablo; Kressner, Daniel; Quintana-Orti, Enrique S.; Remón Gómez, Alfredo Springer Berlin Heidelberg (2012)

Model order reduction of a dynamical linear time-invariant system appears in many applications from science and engineering. Numerically reliable SVD-based methods for this task require in general O(n^3) floating-point ...

Accelerating the Lyapack library using GPUs

Dufrechou, Ernesto; Ezzatti, Pablo; Quintana-Orti, Enrique S.; Remón Gómez, Alfredo Springer (2013)

Lyapack is a package for the solution of large-scale sparse problems arising in control theory. The package has a modular design, and is implemented as a Matlab toolbox, which renders it easy to utilize, modify and extend ...

Accelerating the SRP-PHAT algorithm on multi- and many-core platforms using OpenCL

Badía, José; Belloch, Jose A.; Cobos, Maximo; Igual, Francisco; Quintana-Orti, Enrique S. Springer (2019-03)

The Steered Response Power with Phase Transform (SRP-PHAT) algorithm is a well-known method for sound source localization due to its robust performance in noisy and reverberant environments. This algorithm is used in a ...

Acceleration of PageRank with Customized Precision Based on Mantissa Segmentation

Grützmacher, Thomas; Cojean, Terry; Flegar, Goran; Anzt, Hartwig; Quintana-Orti, Enrique S. Association for Computing Machinery (ACM) (2020-03)

We describe the application of a communication-reduction technique for the PageRank algorithm that dynamically adapts the precision of the data access to the numerical requirements of the algorithm as the iteration converges. ...

An efficient GPU version of the preconditioned GMRES method

Aliaga Estellés, José Ignacio; Dufrechou, Ernesto; Ezzatti, Pablo; Quintana-Orti, Enrique S. Springer (2019-03)

In a large number of scientific applications, the solution of sparse linear systems is the stage that concentrates most of the computational effort. This situation has motivated the study and development of several iterative ...

Balanced and Compressed Coordinate Layout for the Sparse Matrix-Vector Product on GPUs

Aliaga Estellés, José Ignacio; Anzt, Hartwig; Quintana-Orti, Enrique S.; Tomás Domínguez, Andrés Enrique; Tsai, Yuhsiang M. Springer (2021)

We contribute to the optimization of the sparse matrix-vector product on graphics processing units by introducing a variant of the coordinate sparse matrix layout that compresses the integer representation of the matrix ...

Condensed forms for the symmetric eigenvalue problem on multi-threaded architectures

Bientinesi, Paolo; Igual, Francisco; Kressner, Daniel; Petschow, Matthias; Quintana-Orti, Enrique S. Wiley (2011-11-10)

We investigate the performance of the routines in LAPACK and the Successive Band Reduction (SBR) toolbox for the reduction of a dense matrix to tridiagonal form, a crucial preprocessing stage in the solution of the symmetric ...

Consumo energético de métodos iterativos para sistemas dispersos en procesadores gráficos

Pérez Badenes, Joaquín Universitat Jaume I (2016-12-09)

Solving large sparse systems of linear equations is one of the most common operations in scientific and engineering applications. The growth in their size drives the development of techniques ...

Exploring the interoperability of remote GPGPU virtualization using rCUDA and directive-based programming models

Castelló, Adrián; Pena, Antonio J.; Mayo, Rafael; Planas, Judit; Quintana-Orti, Enrique S.; Balaji, Pavan Springer (2016-06-21)

Directive-based programming models, such as OpenMP, OpenACC, and OmpSs, enable users to accelerate applications by using coprocessors with little effort. These devices offer significant computing power, but their use can ...

Extending lyapack for the solution of band Lyapunov equations on hybrid CPU–GPU platforms

Benner, Peter; Remón Gómez, Alfredo; Dufrechou, Ernesto; Ezzatti, Pablo; Quintana-Orti, Enrique S. Springer Verlag (2015)

The solution of large-scale Lyapunov equations is an important tool for the solution of several engineering problems arising in optimal control and model order reduction. In this work, we investigate the case when the ...

FaST-LMM for Two-Way Epistasis Tests on High-Performance Clusters

Martínez Pérez, Héctor; Barrachina Mir, Sergio; Castillo Catalán, María Isabel; Quintana-Orti, Enrique S.; Rambla, Jordi; Farré, Xavier; Navarro, Arcadi Mary Ann Liebert (2018-08)

We introduce a version of the epistasis test in FaST-LMM for clusters of multithreaded processors. This new software maintains the sensitivity of the original FaST-LMM while delivering acceleration that is close to linear ...

Hierarchical approach for deriving a reproducible unblocked LU factorization

Iakymchuk, Roman; Graillat, Stef; Defour, David; Quintana-Orti, Enrique S. Sage (2019-03-17)

We propose a reproducible variant of the unblocked LU factorization for graphics processor units (GPUs). For this purpose, we build upon Level-1/2 BLAS kernels that deliver correctly-rounded and reproducible results for ...

Load-balancing Sparse Matrix Vector Product Kernels on GPUs

Anzt, Hartwig; Cojean, Terry; Yen-Chen, Chen; Dongarra, Jack; Flegar, Goran; Nayak, Pratik; Tomov, Stanimire; Tsai, Yuhsiang M.; Wang, Weichung Association for Computing Machinery (ACM) (2020-03)

Efficient processing of irregular matrices on Single Instruction, Multiple Data (SIMD)-type architectures is a persistent challenge. Resolving it requires innovations in the development of data formats, computational ...

Out-of-core macromolecular simulations on multithreaded architectures

Aliaga Estellés, José Ignacio; Badía, José; Castillo Catalán, María Isabel; Davidovic, Davor; Mayo, Rafael; Quintana-Orti, Enrique S. Wiley (2015)

We address the solution of large-scale eigenvalue problems that appear in the motion simulation of complex macromolecules on multithreaded platforms, consisting of multicore processors and possibly a graphics processor ...

Solving Matrix Equations on Multi-Core and Many-Core Architectures

Benner, Peter; Ezzatti, Pablo; Mena, Hermann; Quintana-Orti, Enrique S.; Remón Gómez, Alfredo MDPI (2013-12)

We address the numerical solution of Lyapunov, algebraic and differential Riccati equations, via the matrix sign function, on platforms equipped with general-purpose multicore processors and, optionally, one or more graphics ...

Solving “Large” Dense Matrix Problems on Multi-Core Processors and GPUs

Marqués-Andrés, Mercedes; Quintana-Ortí, Gregorio; Quintana-Orti, Enrique S.; Van de Geijn, Robert A. Departament d'Enginyeria i Ciència dels Computadors, Universitat Jaume I (2009-01)

Few realize that, for large matrices, many dense matrix computations achieve nearly the same performance when the matrices are stored on disk as when they are stored in a very large main memory. Similarly, few realize ...

Toward a modular precision ecosystem for high-performance computing

Anzt, Hartwig; Flegar, Goran; Grützmacher, Thomas; Quintana-Orti, Enrique S. Sage (2019-05)

With the memory bandwidth of current computer architectures being significantly slower than the (floating point) arithmetic performance, many scientific computations only leverage a fraction of the computational power in ...

Unleashing GPU acceleration for symmetric band linear algebra kernels and model reduction

Benner, Peter; Dufrechou, Ernesto; Ezzatti, Pablo; Quintana-Orti, Enrique S.; Remón Gómez, Alfredo Springer International Publishing AG (2015-12)

Linear algebra operations arise in a myriad of scientific and engineering applications and, therefore, their optimization is targeted by a significant number of high performance computing (HPC) research efforts. In particular, ...