    • closedAccess   A complete and efficient CUDA-sharing solution for HPC clusters 

      Peña Monferrer, Antonio J.; Reaño, Carlos; Silla, Federico; Mayo, Rafael; Quintana-Orti, Enrique S.; Duato, José Elsevier (2014)
      In this paper we detail the key features, architectural design, and implementation of rCUDA, an advanced framework to enable remote and transparent GPGPU acceleration in HPC clusters. rCUDA allows decoupling GPUs from ...
    • closedAccess   A parallel solver for huge dense linear systems  

      Badía, José; Movilla, Jose L.; Climente, Juan I.; Castillo Catalán, María Isabel; Marqués-Andrés, Mercedes; Mayo, Rafael; Quintana-Orti, Enrique S.; Planelles, Josep Elsevier (2011-11)
      HDSS (Huge Dense Linear System Solver) is a Fortran Application Programming Interface (API) to facilitate the parallel solution of very large dense systems to scientists and engineers. The API makes use of parallelism to ...
    • openAccess   A simulator to assess energy saving strategies and policies in HPC workloads 

      Quintana-Orti, Enrique S.; Mayo, Rafael; Iserte, Sergio; Fernández Fernández, Juan Carlos; Dolz, Manuel F. Association for Computing Machinery (ACM) (2012-07)
      In recent years power consumption of high performance computing (HPC) clusters has become a growing problem due, e.g., to the economic cost of electricity, the emission of carbon dioxide (with negative impact on the ...
    • openAccess   A Survey on Malleability Solutions for High-Performance Distributed Computing 

      Aliaga Estellés, José Ignacio; Castillo, Maribel; Iserte, Sergio; Martín Álvarez, Iker; Mayo, Rafael MDPI (2022-05-22)
      Maintaining a high rate of productivity, in terms of completed jobs per unit of time, in High-Performance Computing (HPC) facilities is a cornerstone in the next generation of exascale supercomputers. Process malleability ...
    • openAccess   Analysis of Threading Libraries for High Performance Computing 

      Castelló, Adrián; Mayo, Rafael; Seo, Sangmin; Balaji, Pavan; Quintana-Orti, Enrique S.; Peña Monferrer, Antonio J. IEEE (2020-01-30)
      With the appearance of multi-many core machines, applications and runtime systems evolved in order to exploit the new on-node concurrency that brought new software paradigms. POSIX threads (Pthreads) was widely-adopted for ...
    • openAccess   Architecture-Aware Configuration and Scheduling of Matrix Multiplication on Asymmetric Multicore Processors 

      Catalán, Sandra; Igual, Francisco; Mayo, Rafael; Rodríguez Sánchez, Rafael; Quintana-Orti, Enrique S. Springer US (2016-09)
      Asymmetric multicore processors (AMPs) have recently emerged as an appealing technology for severely energy-constrained environments, especially in mobile appliances where heterogeneity in applications is mainstream. ...
    • openAccess   Assessing Power Monitoring Approaches for Energy and Power Analysis of Computers 

      El Mehdi Diouri, Mohammed; Dolz, Manuel F.; Glück, Olivier; Lefèvre, Laurent; Alonso-Jordá, Pedro; Catalán, Sandra; Mayo, Rafael; Quintana-Orti, Enrique S. Elsevier (2014-06)
      Large-scale distributed systems (e.g., datacenters, HPC systems, clouds, large-scale networks, etc.) consume and will consume enormous amounts of energy. Therefore, accurately monitoring the power dissipation and energy ...
    • openAccess   Assessing the impact of the CPU power-saving modes on the task-parallel solution of sparse linear systems 

      Aliaga Estellés, José Ignacio; Barreda Vayá, Maria; Dolz, Manuel F.; Martín Huertas, Alberto F.; Mayo, Rafael; Quintana-Orti, Enrique S. Springer US (2014)
      We investigate the benefits that an energy-aware implementation of the runtime in charge of the concurrent execution of ILUPACK —a sophisticated preconditioned iterative solver for sparse linear systems— produces on the ...
    • closedAccess   Color and texture analysis using emerging parallel architectures 

      Igual, Francisco; Mayo, Rafael; Hartley, Timothy; Çatalyürek, Ümit V.; Ruiz, Antonio; Ujaldon, Manuel SAGE Publications (2011-11)
      While image texture is effective for use in pattern-recognition and image-analysis algorithms, textural features are time-consuming to calculate on standard CPUs. Therefore, we present novel implementations of textural-feature ...
    • openAccess   DMRlib: Easy-coding and Efficient Resource Management for Job Malleability 

      Iserte, Sergio; Mayo, Rafael; Quintana-Orti, Enrique S.; Pena, Antonio J. IEEE (2020-09-09)
      Process malleability has proved to have a highly positive impact on the resource utilization and global productivity in data centers compared with the conventional static resource allocation policy. However, the non-negligible ...
    • closedAccess   DVFS-control techniques for dense linear algebra operations on multi-core processors 

      Alonso-Jordá, Pedro; Dolz, Manuel F.; Igual, Francisco; Mayo, Rafael; Quintana-Orti, Enrique S. Springer (2012-11)
      This paper analyzes the impact on power consumption of two DVFS-control strategies when applied to the execution of dense linear algebra operations on multi-core processors. The strategies considered here, prototyped as ...
    • openAccess   Dynamic reconfiguration of noniterative scientific applications: A case study with HPG aligner 

      Iserte, Sergio; Martínez Pérez, Héctor; Barrachina Mir, Sergio; Castillo Catalán, María Isabel; Mayo, Rafael; Peña Monferrer, Antonio J. Springer (2018-09)
      Several studies have proved the benefits of job malleability, that is, the capacity of an application to adapt its parallelism to a dynamically changing number of allocated processors. The most remarkable advantages of ...
    • closedAccess   Energy-efficient execution of dense linear algebra algorithms on multi-core processors 

      Alonso-Jordá, Pedro; Dolz, Manuel F.; Mayo, Rafael; Quintana-Orti, Enrique S. Springer Verlag (2013-09)
      This paper addresses the efficient exploitation of task-level parallelism, present in many dense linear algebra operations, from the point of view of both computational performance and energy consumption. The strategies ...
    • closedAccess   Enhancing performance and energy consumption of runtime schedulers for dense linear algebra 

      Alonso-Jordá, Pedro; Dolz, Manuel F.; Igual, Francisco; Mayo, Rafael; Quintana-Orti, Enrique S. Wiley (2014-06)
      The road towards Exascale Computing requires a holistic effort to address three different challenges simultaneously: high performance, energy efficiency, and programmability. The use of runtime task schedulers to orchestrate ...
    • closedAccess   Exploiting the capabilities of modern GPUs for dense matrix computations 

      Barrachina Mir, Sergio; Castillo Catalán, María Isabel; Igual, Francisco; Mayo, Rafael; Quintana-Orti, Enrique S.; Quintana-Ortí, Gregorio John Wiley & Sons (2009)
      We present several algorithms to compute the solution of a linear system of equations on a graphics processor (GPU), as well as general techniques to improve their performance, such as padding and hybrid GPU-CPU computation. ...
    • openAccess   Exploring the interoperability of remote GPGPU virtualization using rCUDA and directive-based programming models 

      Castelló, Adrián; Pena, Antonio J.; Mayo, Rafael; Planas, Judit; Quintana-Orti, Enrique S.; Balaji, Pavan Springer (2016-06-21)
      Directive-based programming models, such as OpenMP, OpenACC, and OmpSs, enable users to accelerate applications by using coprocessors with little effort. These devices offer significant computing power, but their use can ...
    • closedAccess   Extending OpenMP to Survive the Heterogeneous Multi-Core Era 

      Ayguadé, Eduardo; Badía Sala, Rosa María; Bellens, Pieter; Cabrera, Daniel; Durán, Alejandro; Ferrer, Roger; González, Marc; Igual, Francisco; Jiménez González, Daniel; Labarta Mancho, Jesús; Martinell, Luis; Martorell, Xavier; Mayo, Rafael; Pérez, Josep M.; Planas, Judit; Quintana-Orti, Enrique S. Springer US (2010)
      This paper advances the state-of-the-art in programming models for exploiting task-level parallelism on heterogeneous many-core systems, presenting a number of extensions to the OpenMP language inspired in the StarSs ...
    • openAccess   GSaaS: A service to cloudify and schedule GPUs 

      Iserte, Sergio; Peña-Ortiz, Raúl; Gutiérrez-Aguado, Juan; Claver, José M.; Mayo, Rafael IEEE (2018-07)
      Cloud technology is an attractive infrastructure solution that provides customers with an almost unlimited on-demand computational capacity using a pay-per-use approach, and allows data centers to increase their energy and ...
    • closedAccess   Improving the user experience of the rCUDA remote GPU virtualization framework 

      Reaño, Carlos; Silla, Federico; Castelló, Adrián; Peña Monferrer, Antonio J.; Mayo, Rafael; Quintana-Orti, Enrique S. Wiley (2014-10)
      Graphics processing units (GPUs) are being increasingly embraced by the high-performance computing community as an effective way to reduce execution time by accelerating parts of their applications. remote CUDA (rCUDA) was ...
    • closedAccess   Large-scale linear system solver using secondary storage: self-energy in hybrid nanostructures 

      Badía, José; Movilla, Jose L.; Climente, Juan I.; Castillo Catalán, María Isabel; Marqués-Andrés, Mercedes; Mayo, Rafael; Quintana-Orti, Enrique S.; Planelles, Josep Elsevier (2011-02)
      We present a Fortran library which can be used to solve large-scale dense linear systems, Ax=b. The library is based on the LU decomposition included in the parallel linear algebra library PLAPACK and on its out-of-core ...