Exploiting Task and Data Parallelism in ILUPACK's Preconditioned CG Solver on NUMA Architectures and Many-core Accelerators
(Elsevier, 2016-05)
We present specialized implementations of the preconditioned iterative linear system solver in ILUPACK for Non-Uniform Memory Access (NUMA) platforms and many-core hardware co-processors based on the Intel Xeon Phi and ...
Performance versus energy consumption of hyperspectral unmixing algorithms on multi-core platforms
(SpringerOpen, 2013)
Hyperspectral imaging is a growing area in remote sensing in which an imaging spectrometer collects hundreds of images (at different wavelength channels) for the same area on the surface of the Earth. Hyperspectral images ...
Solving Matrix Equations on Multi-Core and Many-Core Architectures
(MDPI, 2013-12)
We address the numerical solution of Lyapunov, algebraic and differential Riccati equations, via the matrix sign function, on platforms equipped with general-purpose multicore processors and, optionally, one or more graphics ...
Adaptive precision in block‐Jacobi preconditioning for iterative sparse linear system solvers
(Wiley, 2019-03-25)
We propose an adaptive scheme to reduce communication overhead caused by data movement by selectively storing the diagonal blocks of a block‐Jacobi preconditioner in different precision formats (half, single, or double). ...
An Algorithm-by-Blocks for SuperMatrix Band Cholesky Factorization
(Springer Verlag, 2008)
We pursue the scalable parallel implementation of the factorization of band matrices with medium to large bandwidth targeting SMP and multi-core architectures. Our approach decomposes the computation into a large ...
Extending lyapack for the solution of band Lyapunov equations on hybrid CPU–GPU platforms
(Springer Verlag, 2015)
The solution of large-scale Lyapunov equations is an important tool for the solution of several engineering problems arising in optimal control and model order reduction. In this work, we investigate the case when the ...
Highly sensitive and ultrafast read mapping for RNA-seq analysis
(Oxford University Press, 2016)
As sequencing technologies progress, the amount of data produced grows exponentially, shifting the bottleneck of discovery towards the data analysis phase. In particular, currently available mapping solutions for RNA-seq ...
Characterizing the efficiency of multicore and manycore processors for the solution of sparse linear systems
(Springer Berlin Heidelberg, 2015-09)
We analyze the efficiency of servers equipped with state-of-the-art general-purpose multicore processors as well as platforms based on accelerators such as graphics processing units (GPUs) and the Intel Xeon Phi. Following ...
The libflame library for dense matrix computations
(IEEE Computer Society, 2009-11)
Researchers from the Formal Linear Algebra Method Environment (Flame) project have developed new methodologies for analyzing, designing, and implementing linear algebra libraries. These solutions, which have culminated in ...
Unleashing GPU acceleration for symmetric band linear algebra kernels and model reduction
(Springer International Publishing AG, 2015-12)
Linear algebra operations arise in a myriad of scientific and engineering applications and, therefore, their optimization is targeted by a significant number of high performance computing (HPC) research efforts. In particular, ...