Sparse matrix-vector and matrix-multivector products for the truncated SVD on graphics processors

Aliaga Estellés, José Ignacio; Anzt, Hartwig; Quintana-Orti, Enrique S.; Tomás Domínguez, Andrés Enrique

dc.contributor.author	Aliaga Estellés, José Ignacio
dc.contributor.author	Anzt, Hartwig
dc.contributor.author	Quintana-Orti, Enrique S.
dc.contributor.author	Tomás Domínguez, Andrés Enrique
dc.date.accessioned	2023-10-06T06:56:03Z
dc.date.available	2023-10-06T06:56:03Z
dc.date.issued	2023-08-04
dc.identifier.citation	ALIAGA, José I., et al. Sparse matrix‐vector and matrix‐multivector products for the truncated SVD on graphics processors. Concurrency and Computation: Practice and Experience, 2023, p. e7871.	ca_CA
dc.identifier.uri	http://hdl.handle.net/10234/204435
dc.description.abstract	Many practical algorithms for numerical rank computations implement an iterative procedure that involves repeated multiplications of a vector, or a collection of vectors, with both a sparse matrix A and its transpose. Unfortunately, the realization of these sparse products in current high-performance libraries often delivers much lower arithmetic throughput when the matrix involved in the product is transposed. In this work, we propose a hybrid sparse matrix layout, named CSRC, that combines the flexibility of some well-known sparse formats to offer a number of appealing properties: (1) CSRC can be obtained at low cost from the popular CSR (compressed sparse row) format; (2) CSRC has storage requirements similar to those of CSR; and especially, (3) the implementation of the sparse product kernels delivers high performance for both the direct product and its transposed variant on modern graphics accelerators thanks to a significant reduction of atomic operations compared to a conventional implementation based on CSR. This solution thus renders considerably higher performance when integrated into an iterative algorithm for the truncated singular value decomposition (SVD), such as the randomized SVD or, as demonstrated in the experimental results, the block Golub–Kahan–Lanczos algorithm.	ca_CA
dc.description.sponsorShip	Funding for open access charge: CRUE-Universitat Jaume I
dc.format.extent	12 p.	ca_CA
dc.format.mimetype	application/pdf	ca_CA
dc.language.iso	eng	ca_CA
dc.publisher	Wiley	ca_CA
dc.rights	© 2023 The Authors. Concurrency and Computation: Practice and Experience published by John Wiley & Sons Ltd.	ca_CA
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	ca_CA
dc.subject	graphics processing units	ca_CA
dc.subject	singular value decomposition	ca_CA
dc.subject	sparse matrix-multivector product	ca_CA
dc.subject	sparse matrix-vector product	ca_CA
dc.title	Sparse matrix-vector and matrix-multivector products for the truncated SVD on graphics processors	ca_CA
dc.type	info:eu-repo/semantics/article	ca_CA
dc.identifier.doi	https://doi.org/10.1002/cpe.7871
dc.rights.accessRights	info:eu-repo/semantics/openAccess	ca_CA
dc.type.version	info:eu-repo/semantics/publishedVersion	ca_CA
project.funder.name	US Exascale Computing Project	ca_CA
project.funder.name	U.S. Department of Energy Office of Science	ca_CA
project.funder.name	European High-Performance Computing Joint Undertaking (JU)	ca_CA
project.funder.name	European Union's Horizon 2020 Research and Innovation Programme	ca_CA
project.funder.name	Spanish National Plan for Scientific and Technical Research and Innovation (MCIN/AEI/10.13039/501100011033)	ca_CA
project.funder.name	Universitat Jaume I	ca_CA
oaire.awardNumber	17-SC-20-SC	ca_CA
oaire.awardNumber	955558 (eFlows4HPC project)	ca_CA
oaire.awardNumber	PID2020-113656RB	ca_CA
oaire.awardNumber	UJI-B2021-58	ca_CA

Files in this item

Name: 86643.pdf
Size: 1.354Mb
Format: PDF
Description: Publisher's version

This item appears in the following collection(s)

ICC_Articles [424]
