Exploiting Task and Data Parallelism in ILUPACK's Preconditioned CG Solver on NUMA Architectures and Many-core Accelerators

Aliaga Estellés, José Ignacio; Badía Sala, Rosa María; Barreda Vayá, Maria; Bollhöffer, Matthias; Dufrechou, Ernesto; Ezzatti, Pablo; Quintana-Orti, Enrique S.

dc.contributor.author	Aliaga Estellés, José Ignacio
dc.contributor.author	Badía Sala, Rosa María
dc.contributor.author	Barreda Vayá, Maria
dc.contributor.author	Bollhöffer, Matthias
dc.contributor.author	Dufrechou, Ernesto
dc.contributor.author	Ezzatti, Pablo
dc.contributor.author	Quintana-Orti, Enrique S.
dc.date.accessioned	2016-12-16T11:37:43Z
dc.date.available	2016-12-16T11:37:43Z
dc.date.issued	2016-05
dc.identifier.citation	ALIAGA ESTELLÉS, José Ignacio; BADÍA SALA, Rosa María; BARREDA VAYÁ, María; BOLLHÖFFER, Matthias; DUFRECHOU, Ernesto; EZZATTI, Pablo; QUINTANA ORTÍ, Enrique S. Exploiting Task and Data Parallelism in ILUPACK's Preconditioned CG Solver on NUMA Architectures and Many-core Accelerators. Parallel Computing (2016), v. 54, pp. 97-107	ca_CA
dc.identifier.uri	http://hdl.handle.net/10234/165072
dc.description.abstract	We present specialized implementations of the preconditioned iterative linear system solver in ILUPACK for Non-Uniform Memory Access (NUMA) platforms and many-core hardware co-processors based on the Intel Xeon Phi and graphics accelerators. For the conventional x86 architectures, our approach exploits task parallelism via the OmpSs runtime as well as a message-passing implementation based on MPI, respectively yielding a dynamic and static schedule of the work to the cores, with different numeric semantics to those of the sequential ILUPACK. For the graphics processor we exploit data parallelism by off-loading the computationally expensive kernels to the accelerator while keeping the numeric semantics of the sequential case.	ca_CA
dc.description.sponsorShip	The authors from the Universitat Jaume I were supported by the projects EU FP7 318793 (Exa2Green), TIN2011-23283 of the Ministerio de Economía y Competitividad (MINECO) and EU FEDER, and P11B2013-20 of the Fundació Caixa Castelló-Bancaixa and UJI. Rosa M. Badia was supported by project TIN2012-34557 of MINECO and EU FEDER, and by the Generalitat de Catalunya (contract 2009-SGR-980). María Barreda was supported by the FPU program of the Ministerio de Educación, Cultura y Deporte.	ca_CA
dc.format.extent	20 p.	ca_CA
dc.format.mimetype	application/pdf	ca_CA
dc.language.iso	eng	ca_CA
dc.publisher	Elsevier	ca_CA
dc.relation.isPartOf	Parallel Computing (2016), v. 54	ca_CA
dc.rights.uri	http://rightsstatements.org/vocab/CNE/1.0/	*
dc.subject	Sparse linear systems	ca_CA
dc.subject	Preconditioned Conjugate Gradient solver	ca_CA
dc.subject	Task and data parallelism	ca_CA
dc.subject	Multi-core processors	ca_CA
dc.subject	Intel Xeon Phi	ca_CA
dc.subject	Graphics processing units (GPUs)	ca_CA
dc.title	Exploiting Task and Data Parallelism in ILUPACK's Preconditioned CG Solver on NUMA Architectures and Many-core Accelerators	ca_CA
dc.type	info:eu-repo/semantics/article	ca_CA
dc.identifier.doi	http://dx.doi.org/10.1016/j.parco.2015.12.004
dc.rights.accessRights	info:eu-repo/semantics/openAccess	ca_CA
dc.relation.publisherVersion	http://www.sciencedirect.com/science/article/pii/S0167819115001581	ca_CA

Files in this item

Name: Aliaga_2016_Exploiting.pdf
Size: 817.9Kb
Format: PDF
Description: Preprint

This item appears in the following collection(s)

ICC_Articles [430]