Compression and load balancing for efficient sparse matrix-vector product on multicore processors and graphics processing units
Authors: Aliaga Estellés, José Ignacio; Anzt, Hartwig; Grützmacher, Thomas; Quintana-Ortí, Enrique S.; Tomás Domínguez, Andrés Enrique
Metadata
Title
Compression and load balancing for efficient sparse matrix-vector product on multicore processors and graphics processing units
Publication date
2021
Publisher
John Wiley and Sons
ISSN
1532-0634; 1532-0626
Bibliographic citation
Aliaga, JI, Anzt, H, Grützmacher, T, Quintana-Ortí, ES, Tomás, AE. Compression and load balancing for efficient sparse matrix-vector product on multicore processors and graphics processing units. Concurrency Computat Pract Exper. 2021;e6515. https://doi.org/10.1002/cpe.6515
Document type
info:eu-repo/semantics/article
Publisher's version
https://onlinelibrary.wiley.com/doi/full/10.1002/cpe.6515
Version
info:eu-repo/semantics/acceptedVersion
Abstract
We contribute to the optimization of the sparse matrix-vector product by introducing a variant of the coordinate sparse matrix format that balances the workload distribution and compresses both the indexing arrays and the numerical information. Our approach is multi-platform, in the sense that the realizations for (general-purpose) multicore processors as well as graphics accelerators (GPUs) are built upon common principles, but differ in the implementation details, which are adapted to avoid thread divergence in the GPU case or maximize compression element-wise (i.e., for each matrix entry) for multicore architectures. Our evaluation on the last two generations of NVIDIA GPUs as well as Intel and AMD processors demonstrates the benefits of the new kernels when compared with the optimized implementations of the sparse matrix-vector product in NVIDIA's cuSPARSE and Intel's MKL, respectively.
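For readers unfamiliar with the coordinate (COO) format mentioned in the abstract, the following is a minimal sketch of a plain COO sparse matrix-vector product in which the nonzeros are split into equal-sized chunks, so that each worker receives the same amount of work regardless of how the nonzeros are distributed over the rows. It is an illustration under stated assumptions, not the format or kernels proposed in the article: it performs no compression, the workers are simulated by a sequential loop, and the function name `coo_spmv_chunked` and its parameters are hypothetical.

```python
# Illustrative sketch only: plain COO SpMV with equal-sized chunks of nonzeros.
# This is NOT the compressed format from the article; all names are hypothetical.

def coo_spmv_chunked(n_rows, rows, cols, vals, x, n_workers):
    """Compute y = A @ x for A stored in COO form, splitting nonzeros evenly."""
    nnz = len(vals)
    chunk = (nnz + n_workers - 1) // n_workers  # ceil(nnz / n_workers)
    partials = []
    for w in range(n_workers):  # each iteration stands in for one thread
        y_local = [0.0] * n_rows
        for k in range(w * chunk, min((w + 1) * chunk, nnz)):
            y_local[rows[k]] += vals[k] * x[cols[k]]
        partials.append(y_local)
    # Reduce the per-worker partial results into the final vector.
    return [sum(p[i] for p in partials) for i in range(n_rows)]


# 3x3 example: A = [[2, 0, 1], [0, 3, 0], [4, 0, 5]]
rows = [0, 0, 1, 2, 2]
cols = [0, 2, 1, 0, 2]
vals = [2.0, 1.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]
y = coo_spmv_chunked(3, rows, cols, vals, x, n_workers=2)
```

Splitting by nonzeros rather than by rows means a chunk boundary can fall in the middle of a row, which is why each simulated worker accumulates into its own partial vector before the final reduction.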
Published in
Concurrency and Computation: Practice and Experience, 2021
Funding body
Ministerio de Ciencia, Innovación y Universidades (Spain) | Helmholtz Association | United States Department of Energy (DOE)
Project or grant code
TIN2017-82972 | VH-NG-1241 | 17-SC-20-SC
Access rights
Copyright © John Wiley & Sons, Inc.
http://rightsstatements.org/vocab/CNE/1.0/
info:eu-repo/semantics/openAccess
Appears in collections
- ICC_Articles [423]