Exploiting Task and Data Parallelism in ILUPACK's Preconditioned CG Solver on NUMA Architectures and Many-core Accelerators
dc.contributor.author | Aliaga Estellés, José Ignacio | |
dc.contributor.author | Badía Sala, Rosa María | |
dc.contributor.author | Barreda Vayá, María | |
dc.contributor.author | Bollhöffer, Matthias | |
dc.contributor.author | Dufrechou, Ernesto | |
dc.contributor.author | Ezzatti, Pablo | |
dc.contributor.author | Quintana-Ortí, Enrique S. | |
dc.date.accessioned | 2016-12-16T11:37:43Z | |
dc.date.available | 2016-12-16T11:37:43Z | |
dc.date.issued | 2016-05 | |
dc.identifier.citation | ALIAGA ESTELLÉS, José Ignacio; BADÍA SALA, Rosa María; BARREDA VAYÁ, María; BOLLHÖFFER, Matthias; DUFRECHOU, Ernesto; EZZATTI, Pablo; QUINTANA ORTÍ, Enrique S. Exploiting Task and Data Parallelism in ILUPACK's Preconditioned CG Solver on NUMA Architectures and Many-core Accelerators. Parallel Computing (2016), v. 54, pp. 97-107 | ca_CA |
dc.identifier.uri | http://hdl.handle.net/10234/165072 | |
dc.description.abstract | We present specialized implementations of the preconditioned iterative linear system solver in ILUPACK for Non-Uniform Memory Access (NUMA) platforms and many-core hardware co-processors based on the Intel Xeon Phi and graphics accelerators. For the conventional x86 architectures, our approach exploits task parallelism via the OmpSs runtime as well as a message-passing implementation based on MPI, respectively yielding a dynamic and static schedule of the work to the cores, with different numeric semantics to those of the sequential ILUPACK. For the graphics processor we exploit data parallelism by off-loading the computationally expensive kernels to the accelerator while keeping the numeric semantics of the sequential case. | ca_CA |
dc.description.sponsorShip | The authors from the Universitat Jaume I were supported by the projects EU FP7 318793 (Exa2Green), TIN2011-23283 of the Ministerio de Economía y Competitividad (MINECO) and EU FEDER, and P11B2013-20 of the Fundació Caixa Castelló-Bancaixa and UJI. Rosa M. Badia was supported by project TIN2012-34557 of MINECO and EU FEDER, and by the Generalitat de Catalunya (contract 2009-SGR-980). María Barreda was supported by the FPU program of the Ministerio de Educación, Cultura y Deporte. | ca_CA |
dc.format.extent | 20 p. | ca_CA |
dc.format.mimetype | application/pdf | ca_CA |
dc.language.iso | eng | ca_CA |
dc.publisher | Elsevier | ca_CA |
dc.relation.isPartOf | Parallel Computing (2016), v. 54 | ca_CA |
dc.rights.uri | http://rightsstatements.org/vocab/CNE/1.0/ | * |
dc.subject | Sparse linear systems | ca_CA |
dc.subject | Preconditioned Conjugate Gradient solver | ca_CA |
dc.subject | Task and data parallelism | ca_CA |
dc.subject | Multi-core processors | ca_CA |
dc.subject | Intel Xeon Phi | ca_CA |
dc.subject | Graphics processing units (GPUs) | ca_CA |
dc.title | Exploiting Task and Data Parallelism in ILUPACK's Preconditioned CG Solver on NUMA Architectures and Many-core Accelerators | ca_CA |
dc.type | info:eu-repo/semantics/article | ca_CA |
dc.identifier.doi | http://dx.doi.org/10.1016/j.parco.2015.12.004 | |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | ca_CA |
dc.relation.publisherVersion | http://www.sciencedirect.com/science/article/pii/S0167819115001581 | ca_CA |
This item appears in the following collection(s)
- ICC_Articles [430]