Browsing ICC_Articles by author "7c1c0bf7-9583-4bf7-b497-213e4df60ccb"
Showing items 1-20 of 28
A Runtime System for Programming Out-of-Core Matrix Algorithms-by-Tiles on Multithreaded Architectures
Quintana-Ortí, Gregorio; Igual, Francisco; Marqués-Andrés, Mercedes; Quintana-Orti, Enrique S.; Van de Geijn, Robert A. ACM (2012-08)Out-of-core implementations of algorithms for dense matrix computations have traditionally focused on optimal use of memory so as to minimize I/O, often trading programmability for performance. In this article we show how ... -
Accelerating the SRP-PHAT algorithm on multi- and many-core platforms using OpenCL
Badía, José; Belloch, Jose A.; Cobos, Maximo; Igual, Francisco; Quintana-Orti, Enrique S. Springer (2019-03)The Steered Response Power with Phase Transform (SRP-PHAT) algorithm is a well-known method for sound source localization due to its robust performance in noisy and reverberant environments. This algorithm is used in a ... -
Algorithm 1022: Efficient Algorithms for Computing a Rank-Revealing UTV Factorization on Parallel Computing Architectures
Heavner, Nathan; Igual, Francisco; Quintana-Ortí, Gregorio; Martinsson, Gunnar Association for Computing Machinery (ACM) (2022-06)Randomized singular value decomposition (RSVD) is by now a well-established technique for efficiently computing an approximate singular value decomposition of a matrix. Building on the ideas that underpin RSVD, the recently ... -
Algorithm 1033: Parallel Implementations for Computing the Minimum Distance of a Random Linear Code on Distributed-memory Architectures
Quintana-Ortí, Gregorio; Hernando, Fernando; Igual, Francisco Association for Computing Machinery (ACM) (2023-03)The minimum distance of a linear code is a key concept in information theory. Therefore, the time required by its computation is very important to many problems in this area. In this article, we introduce a family of ... -
Analytical Modeling is Enough for High Performance BLIS
Low, Tze Meng; Igual, Francisco; Smith, Tyler M.; Quintana-Orti, Enrique S. ACM (2016-09)We show how the BLAS-like Library Instantiation Software (BLIS) framework, which provides a more detailed layering of the GotoBLAS (now maintained as OpenBLAS) implementation, allows one to analytically determine tuning ... -
Architecture-Aware Configuration and Scheduling of Matrix Multiplication on Asymmetric Multicore Processors
Catalán, Sandra; Igual, Francisco; Mayo, Rafael; Rodríguez Sánchez, Rafael; Quintana-Orti, Enrique S. Springer US (2016-09)Asymmetric multicore processors (AMPs) have recently emerged as an appealing technology for severely energy-constrained environments, especially in mobile appliances where heterogeneity in applications is mainstream. ... -
Automatic generation of ARM NEON micro‑kernels for matrix multiplication
Alaejos, Guillermo; Martínez, Héctor; Castelló, Adrián; Dolz, Manuel F.; Igual, Francisco; Alonso-Jordá, Pedro; Quintana-Orti, Enrique S. Springer (2024-03-12)General matrix multiplication (gemm) is a fundamental kernel in scientific computing and current frameworks for deep learning. Modern realisations of gemm are mostly written in C, on top of a small, highly tuned micro-kernel ... -
Balancing task- and data-level parallelism to improve performance and energy consumption of matrix computations on the Intel Xeon Phi
Dolz, Manuel F.; Igual, Francisco; Ludwig, Thomas; Piñuel, Luis; Quintana-Orti, Enrique S. Elsevier (2015-08)The emergence of new manycore architectures, such as the Intel Xeon Phi, poses new challenges in how to adapt existing libraries and applications to this type of systems. In particular, the exploitation of manycore ... -
Color and texture analysis using emerging parallel architectures
Igual, Francisco; Mayo, Rafael; Hartley, Timothy; Çatalyürek, Ümit V.; Ruiz, Antonio; Ujaldon, Manuel SAGE Publications (2011-11)While image texture is effective for use in pattern-recognition and image-analysis algorithms, textural features are time-consuming to calculate on standard CPUs. Therefore, we present novel implementations of textural-feature ... -
Condensed forms for the symmetric eigenvalue problem on multi-threaded architectures
Bientinesi, Paolo; Igual, Francisco; Kressner, Daniel; Petschow, Matthias; Quintana-Orti, Enrique S. Wiley (2011-11-10)We investigate the performance of the routines in LAPACK and the Successive Band Reduction (SBR) toolbox for the reduction of a dense matrix to tridiagonal form, a crucial preprocessing stage in the solution of the symmetric ... -
DVFS-control techniques for dense linear algebra operations on multi-core processors
Alonso-Jordá, Pedro; Dolz, Manuel F.; Igual, Francisco; Mayo, Rafael; Quintana-Orti, Enrique S. Springer (2012-11)This paper analyzes the impact on power consumption of two DVFS-control strategies when applied to the execution of dense linear algebra operations on multi-core processors. The strategies considered here, prototyped as ... -
Enhancing performance and energy consumption of runtime schedulers for dense linear algebra
Alonso-Jordá, Pedro; Dolz, Manuel F.; Igual, Francisco; Mayo, Rafael; Quintana-Orti, Enrique S. Wiley (2014-06)The road towards Exascale Computing requires a holistic effort to address three different challenges simultaneously: high performance, energy efficiency, and programmability. The use of runtime task schedulers to orchestrate ... -
Exploiting the capabilities of modern GPUs for dense matrix computations
Barrachina Mir, Sergio; Castillo Catalán, María Isabel; Igual, Francisco; Mayo, Rafael; Quintana-Orti, Enrique S.; Quintana-Ortí, Gregorio John Wiley & Sons (2009)We present several algorithms to compute the solution of a linear system of equations on a graphics processor (GPU), as well as general techniques to improve their performance, such as padding and hybrid GPU-CPU computation. ... -
Extending OpenMP to Survive the Heterogeneous Multi-Core Era
Ayguadé, Eduardo; Badía Sala, Rosa María; Bellens, Pieter; Cabrera, Daniel; Durán, Alejandro; Ferrer, Roger; González, Marc; Igual, Francisco; Jiménez González, Daniel; Labarta Mancho, Jesús; Martinell, Luis; Martorell, Xavier; Mayo, Rafael; Pérez, Josep M.; Planas, Judit; Quintana-Orti, Enrique S. Springer US (2010)This paper advances the state-of-the-art in programming models for exploiting task-level parallelism on heterogeneous many-core systems, presenting a number of extensions to the OpenMP language inspired in the StarSs ... -
Fast Algorithms for the Computation of the Minimum Distance of a Random Linear Code
Hernando, Fernando; Igual, Francisco; Quintana-Ortí, Gregorio Association for Computing Machinery (ACM) (2019-06)The minimum distance of an error-correcting code is an important concept in information theory. Hence, computing the minimum distance of a code with a minimum computational cost is crucial to many problems in this area. ... -
Hyperspectral Unmixing on Multicore DSPs: Trading Off Performance for Energy
Castillo Catalán, María Isabel; Fernández Fernández, Juan Carlos; Igual, Francisco; Plaza, Antonio; Quintana-Orti, Enrique S.; Remón Gómez, Alfredo IEEE (2014)Wider coverage of observation missions will increase onboard power restrictions while, at the same time, pose higher demands from the perspective of processing time, thus asking for the exploration of novel high-performance ... -
Multi-threaded dense linear algebra libraries for low-power asymmetric multicore processors
Catalán, Sandra; Herrero Zaragoza, José R.; Igual, Francisco; Rodríguez Sánchez, Rafael; Quintana-Orti, Enrique S.; Adeniyi-Jones, Chris Elsevier (2018-03)Dense linear algebra libraries, such as BLAS and LAPACK, provide a relevant collection of numerical tools for many scientific and engineering applications. While there exist high performance implementations of the BLAS ... -
Optimized Fundamental Signal Processing Operations For Energy Minimization on Heterogeneous Mobile Devices
Belloch, Jose A.; Badía, José; Igual, Francisco; González, Alberto; Quintana-Orti, Enrique S. IEEE (2018-05)Numerous signal processing applications are emerging on both mobile and high-performance computing systems. These applications are subject to responsiveness constraints for user interactivity and, at the same time, must ... -
Practical considerations for acoustic source localization in the IoT era: Platforms, energy efficiency, and performance
Belloch, Jose A.; Badía, José; Igual, Francisco; Cobos, Maximo IEEE (2019-06)The rapid development of the Internet of Things (IoT) has posed important changes in the way emerging acoustic signal processing applications are conceived. While traditional acoustic processing applications have been ... -
Programming parallel dense matrix factorizations with look-ahead and OpenMP
Catalán, Sandra; Castelló, Adrián; Igual, Francisco; Rodríguez Sánchez, Rafael; Quintana-Orti, Enrique S. Springer (2019)We investigate a parallelization strategy for dense matrix factorization (DMF) algorithms, using OpenMP, that departs from the legacy (or conventional) solution, which simply extracts concurrency from a multi-threaded ...