Browsing ICC_Articles by author "4545fd2f-f57a-4bb5-82e1-833c9688ed26"

A Case for Malleable Thread-Level Linear Algebra Libraries: The LU Factorization With Partial Pivoting

Catalán, Sandra; Herrero Zaragoza, José R.; Quintana-Orti, Enrique S.; Rodríguez Sánchez, Rafael; Van de Geijn, Robert A. IEEE (2019-01)

We propose two novel techniques for overcoming load-imbalance encountered when implementing so-called look-ahead mechanisms in relevant dense matrix factorizations for the solution of linear systems. Both techniques target ...

A Runtime System for Programming Out-of-Core Matrix Algorithms-by-Tiles on Multithreaded Architectures

Quintana-Ortí, Gregorio; Igual, Francisco D.; Marqués-Andrés, Mercedes; Quintana-Orti, Enrique S.; Van de Geijn, Robert A. ACM (2012-08)

Out-of-core implementations of algorithms for dense matrix computations have traditionally focused on optimal use of memory so as to minimize I/O, often trading programmability for performance. In this article we show how ...

An Algorithm-by-Blocks for SuperMatrix Band Cholesky Factorization

Quintana-Ortí, Gregorio; Quintana-Orti, Enrique S.; Remón Gómez, Alfredo; Van de Geijn, Robert A. Springer Verlag (2008)

We pursue the scalable parallel implementation of the factorization of band matrices with medium to large bandwidth targeting SMP and multi-core architectures. Our approach decomposes the computation into a large ...

Deriving dense linear algebra libraries

Bientinesi, Paolo; Gunnels, John A.; Myers, Margaret E.; Quintana-Orti, Enrique S.; Rhodes, Tyler; Van de Geijn, Robert A.; Van Zee, Field G. Springer London (2013-11)

Starting in the late 1960s computer scientists including Dijkstra and Hoare advocated goal-oriented programming and the formal derivation of algorithms. The chief impediment to realizing this for loop-based programs was ...

Families of Algorithms for Reducing a Matrix to Condensed Form

Van Zee, Field G.; Van de Geijn, Robert A.; Quintana-Ortí, Gregorio; Elizondo, G. Joseph ACM (2012-11)

In a recent paper it was shown how memory traffic can be diminished by reformulating the classic algorithm for reducing a matrix to bidiagonal form, a preprocess when computing the singular values of a dense matrix. The ...

Householder QR Factorization With Randomization for Column Pivoting (HQRRP)

Martinsson, Gunnar; Quintana-Ortí, Gregorio; Heavner, Nathan; Van de Geijn, Robert A. Society for Industrial and Applied Mathematics (2017)

A fundamental problem when adding column pivoting to the Householder QR factorization is that only about half of the computation can be cast in terms of high performing matrix-matrix multiplications, which greatly ...

Programming matrix algorithms-by-blocks for thread-level parallelism

Quintana-Ortí, Gregorio; Quintana-Orti, Enrique S.; Van de Geijn, Robert A.; Van Zee, Field G.; Chan, Ernie Association for Computing Machinery (2009-07)

With the emergence of thread-level parallelism as the primary means for continued improvement of performance, the programmability issue has reemerged as an obstacle to the use of architectural advances. We argue that ...

Restructuring the Tridiagonal and Bidiagonal QR Algorithms for Performance

Van Zee, Field G.; Van de Geijn, Robert A.; Quintana-Ortí, Gregorio ACM Digital Library (2014-04)

We show how both the tridiagonal and bidiagonal QR algorithms can be restructured so that they become rich in operations that can achieve near-peak performance on a modern processor. The key is a novel, cache-friendly ...

Scheduling algorithms-by-blocks on small clusters

Igual, Francisco D.; Quintana-Ortí, Gregorio; Van de Geijn, Robert A. Wiley (2012-03-28)

The arrival of multicore architectures has generated an interest in reformulating dense matrix computations as algorithms-by-blocks, where submatrices are units of data and computations with those blocks are units of ...

The FLAME approach: From dense linear algebra algorithms to high-performance multi-accelerator implementations

Igual, Francisco D.; Chan, Ernie; Quintana-Orti, Enrique S.; Quintana-Ortí, Gregorio; Van de Geijn, Robert A. Elsevier (2012)

Parallel accelerators are playing an increasingly important role in scientific computing. However, it is perceived that their weakness nowadays is their reduced “programmability” in comparison with traditional general-purpose ...

The libflame library for dense matrix computations

Van Zee, Field G.; Chan, Ernie; Van de Geijn, Robert A.; Quintana-Ortí, Gregorio; Quintana-Orti, Enrique S. IEEE Computer Society (2009-11)

Researchers from the Formal Linear Algebra Method Environment (Flame) project have developed new methodologies for analyzing, designing, and implementing linear algebra libraries. These solutions, which have culminated in ...

Using desktop computers to solve large-scale dense linear algebra problems

Quintana-Orti, Enrique S.; Marqués-Andrés, Mercedes; Quintana-Ortí, Gregorio; Van de Geijn, Robert A. Springer Science+Business Media (2011-11)

We provide experimental evidence that current desktop computers feature enough computational power to solve large-scale dense linear algebra problems. While the high computational cost of the numerical methods for solving ...