Showing items 21-30 of 145

A Case for Malleable Thread-Level Linear Algebra Libraries: The LU Factorization With Partial Pivoting

Catalán, Sandra; Herrero Zaragoza, José R.; Quintana-Orti, Enrique S.; Rodríguez Sánchez, Rafael; Van de Geijn, Robert A. (IEEE, 2019-01)

We propose two novel techniques for overcoming load-imbalance encountered when implementing so-called look-ahead mechanisms in relevant dense matrix factorizations for the solution of linear systems. Both techniques target ...

iMODS: internal coordinates normal mode analysis server

López Blanco, José R.; Aliaga Estellés, José Ignacio; Quintana-Orti, Enrique S.; Chacón, Pablo (Oxford University Press, 2014)

Normal mode analysis (NMA) in internal (dihedral) coordinates naturally reproduces the collective functional motions of biological macromolecules. iMODS facilitates the exploration of such modes and generates feasible ...

Acceleration of PageRank with Customized Precision Based on Mantissa Segmentation

Grützmacher, Thomas; Cojean, Terry; Flegar, Goran; Anzt, Hartwig; Quintana-Orti, Enrique S. (Association for Computing Machinery (ACM), 2020-03)

We describe the application of a communication-reduction technique for the PageRank algorithm that dynamically adapts the precision of the data access to the numerical requirements of the algorithm as the iteration converges. ...

Deriving dense linear algebra libraries

Bientinesi, Paolo; Gunnels, John A.; Myers, Margaret E.; Quintana-Orti, Enrique S.; Rhodes, Tyler; Van de Geijn, Robert A.; Van Zee, Field G. (Springer London, 2013-11)

Starting in the late 1960s computer scientists including Dijkstra and Hoare advocated goal-oriented programming and the formal derivation of algorithms. The chief impediment to realizing this for loop-based programs was ...

DMRlib: Easy-coding and Efficient Resource Management for Job Malleability

Iserte, Sergio; Mayo, Rafael; Quintana-Orti, Enrique S.; Pena, Antonio J. (IEEE, 2020-09-09)

Process malleability has proved to have a highly positive impact on the resource utilization and global productivity in data centers compared with the conventional static resource allocation policy. However, the non-negligible ...

Accelerating the Lyapack library using GPUs

Dufrechou, Ernesto; Ezzatti, Pablo; Quintana-Orti, Enrique S.; Remón Gómez, Alfredo (Springer, 2013)

Lyapack is a package for the solution of large-scale sparse problems arising in control theory. The package has a modular design, and is implemented as a Matlab toolbox, which renders it easy to utilize, modify and extend ...

Look-ahead in the two-sided reduction to compact band forms for symmetric eigenvalue problems and the SVD

Rodríguez Sánchez, Rafael; Catalán, Sandra; Herrero, José R.; Quintana-Orti, Enrique S.; Tomás Domínguez, Andrés Enrique (Springer Verlag, 2019)

We address the reduction to compact band forms, via unitary similarity transformations, for the solution of symmetric eigenvalue problems and the computation of the singular value decomposition (SVD). Concretely, in the ...

Toward a modular precision ecosystem for high-performance computing

Anzt, Hartwig; Flegar, Goran; Grützmacher, Thomas; Quintana-Orti, Enrique S. (Sage, 2019-05)

With the memory bandwidth of current computer architectures being significantly slower than the (floating point) arithmetic performance, many scientific computations only leverage a fraction of the computational power in ...

A complete and efficient CUDA-sharing solution for HPC clusters

Peña Monferrer, Antonio J.; Reaño, Carlos; Silla, Federico; Mayo, Rafael; Quintana-Orti, Enrique S.; Duato, José (Elsevier, 2014)

In this paper we detail the key features, architectural design, and implementation of rCUDA, an advanced framework to enable remote and transparent GPGPU acceleration in HPC clusters. rCUDA allows decoupling GPUs from ...

Leveraging task-parallelism in message-passing dense matrix factorizations using SMPSs

Martín, Alberto F.; Reyes, Ruymán; Badía Sala, Rosa María; Quintana-Orti, Enrique S. (Elsevier, 2014)

In this paper, we investigate how to exploit task-parallelism during the execution of the Cholesky factorization on clusters of multicore processors with the SMPSs programming model. Our analysis reveals that the major ...

Authorship

Mayo, Rafael (35)

Igual, Francisco (28)

Access conditions

Open Access (78)

Restricted Access (67)