Showing items 1-10 of 10
A Data-Parallel ILUPACK for Sparse General and Symmetric Indefinite Linear Systems
(Springer, 2017-05-28)
The solution of sparse linear systems of large dimension is a critical step in problems that span a diverse range of applications. For this reason, a number of iterative solvers have been developed, among which ILUPACK ...
High Performance and Portable Convolution Operators for Multicore Processors
(IEEE, 2020-10)
The considerable impact of Convolutional Neural Networks on many Artificial Intelligence tasks has led to the development of various high performance algorithms for the convolution operator present in this type of networks. ...
Towards portable realizations of Winograd-based convolution with vector intrinsics and OpenMP
(IEEE, 2022)
We take a step forward in the direction of developing high performance codes for the convolution, based on the Winograd transformation, that are easy to customize for different processor architectures. In our approach, ...
Dynamic Management of Resource Allocation for OmpSs Jobs
(Carretero Pérez, Jesús, 2016-02)
The main purpose of this thesis is to investigate the relationship between task-based programming models and resource management systems in order to provide a smart, autonomous, load-balancing and fault-tolerant system. Thus, ...
Tuning stationary iterative solvers for fault resilience
(ACM. Association for Computing Machinery, 2015)
As the transistor’s feature size decreases following Moore’s Law, hardware will become more prone to permanent, intermittent, and transient errors, increasing the number of failures experienced by applications, and ...
Balanced and Compressed Coordinate Layout for the Sparse Matrix-Vector Product on GPUs
(Springer, 2021)
We contribute to the optimization of the sparse matrix-vector product on graphics processing units by introducing a variant of the coordinate sparse matrix layout that compresses the integer representation of the matrix ...
Adaptive precision solvers for sparse linear systems
(ACM, 2015)
We formulate an implementation of a Jacobi iterative solver for sparse linear systems that iterates the distinct components of the solution with different precision in terms of mantissa length. Starting with very low ...
Convolution Operators for Deep Learning Inference on the Fujitsu A64FX Processor
(IEEE, 2022)
The convolution operator is a crucial kernel for many computer vision and signal processing applications that rely on deep learning (DL) technologies. As such, the efficient implementation of this operator has received ...
Sobre el paralelismo anidado de tareas en la factorización LU de Matrices Jerárquicas
(Universidad de Extremadura, 2019-09-20)
This article presents a parallel version of the LU factorization of hierarchical matrices (H-matrices) arising from Boundary Element Methods (BEM). These matrices contain internal structures whose ...
Characterization of Multicore Architectures using Task-Parallel ILU-type Preconditioned CG Solvers
(2017-07-05)
We investigate the efficiency of state-of-the-art multicore processors using a multi-threaded task-parallel implementation of the Conjugate Gradient (CG) method, accelerated with an incomplete LU (ILU) preconditioner. ...