
dc.contributor.author: Aliaga Estellés, José Ignacio
dc.contributor.author: Anzt, Hartwig
dc.contributor.author: Grützmacher, Thomas
dc.contributor.author: Quintana-Ortí, Enrique S.
dc.contributor.author: Tomás Domínguez, Andrés Enrique
dc.date.accessioned: 2021-11-04T08:12:21Z
dc.date.available: 2021-11-04T08:12:21Z
dc.date.issued: 2021
dc.identifier.citation: Aliaga, JI, Anzt, H, Grützmacher, T, Quintana-Ortí, ES, Tomás, AE. Compression and load balancing for efficient sparse matrix-vector product on multicore processors and graphics processing units. Concurrency Computat Pract Exper. 2021;e6515. https://doi.org/10.1002/cpe.6515 [ca_CA]
dc.identifier.issn: 1532-0634
dc.identifier.issn: 1532-0626
dc.identifier.uri: http://hdl.handle.net/10234/195374
dc.description.abstract: We contribute to the optimization of the sparse matrix-vector product by introducing a variant of the coordinate sparse matrix format that balances the workload distribution and compresses both the indexing arrays and the numerical information. Our approach is multi-platform, in the sense that the realizations for (general-purpose) multicore processors as well as graphics accelerators (GPUs) are built upon common principles, but differ in the implementation details, which are adapted to avoid thread divergence in the GPU case or maximize compression element-wise (i.e., for each matrix entry) for multicore architectures. Our evaluation on the last two generations of NVIDIA GPUs as well as Intel and AMD processors demonstrates the benefits of the new kernels when compared with the optimized implementations of the sparse matrix-vector product in NVIDIA's cuSPARSE and Intel's MKL, respectively. [ca_CA]
dc.format.extent: 13 p. [ca_CA]
dc.language.iso: eng [ca_CA]
dc.publisher: John Wiley and Sons [ca_CA]
dc.relation.isPartOf: Concurrency and Computation: Practice and Experience, 2021 [ca_CA]
dc.rights: Copyright © John Wiley & Sons, Inc. [ca_CA]
dc.rights.uri: http://rightsstatements.org/vocab/CNE/1.0/ [ca_CA]
dc.subject: compression [ca_CA]
dc.subject: coordinate sparse matrix format [ca_CA]
dc.subject: graphics processing units (GPUs) [ca_CA]
dc.subject: multicore processors (CPUs) [ca_CA]
dc.subject: sparse matrix-vector product [ca_CA]
dc.subject: workload balancing [ca_CA]
dc.title: Compression and load balancing for efficient sparse matrix-vector product on multicore processors and graphics processing units [ca_CA]
dc.type: info:eu-repo/semantics/article [ca_CA]
dc.identifier.doi: https://doi.org/10.1002/cpe.6515
dc.rights.accessRights: info:eu-repo/semantics/openAccess [ca_CA]
dc.relation.publisherVersion: https://onlinelibrary.wiley.com/doi/full/10.1002/cpe.6515 [ca_CA]
dc.description.sponsorship: J. I. Aliaga, E. S. Quintana-Ortí, and A. E. Tomás were supported by TIN2017-82972-R of the Spanish MINECO. H. Anzt and T. Grützmacher were supported by the “Impuls und Vernetzungsfond” of the Helmholtz Association under grant VH-NG-1241 and by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. The authors would like to thank the Steinbuch Centre for Computing (SCC) of the Karlsruhe Institute of Technology for providing access to an NVIDIA A100 GPU.
dc.type.version: info:eu-repo/semantics/acceptedVersion [ca_CA]
project.funder.name: Ministerio de Ciencia, Innovación y Universidades (España) [ca_CA]
project.funder.name: Helmholtz Association [ca_CA]
project.funder.name: United States Department of Energy (DOE) [ca_CA]
oaire.awardNumber: TIN2017-82972 [ca_CA]
oaire.awardNumber: VH-NG-1241 [ca_CA]
oaire.awardNumber: 17-SC-20-SC [ca_CA]

