Algorithm 1022: Efficient Algorithms for Computing a Rank-Revealing UTV Factorization on Parallel Computing Architectures
dc.contributor.author | Heavner, Nathan | |
dc.contributor.author | Igual, Francisco | |
dc.contributor.author | Quintana-Ortí, Gregorio | |
dc.contributor.author | Martinsson, Gunnar | |
dc.date.accessioned | 2022-10-06T11:33:01Z | |
dc.date.available | 2022-10-06T11:33:01Z | |
dc.date.issued | 2022-06 | |
dc.identifier.citation | N. Heavner, F. D. Igual, G. Quintana-Ortí, and P. G. Martinsson. 2022. Algorithm 1022: Efficient Algorithms for Computing a Rank-Revealing UTV Factorization on Parallel Computing Architectures. ACM Trans. Math. Softw. 48, 2, Article 21 (June 2022), 42 pages. https://doi.org/10.1145/3507466 | ca_CA |
dc.identifier.issn | 0098-3500 | |
dc.identifier.issn | 1557-7295 | |
dc.identifier.uri | http://hdl.handle.net/10234/200216 | |
dc.description.abstract | Randomized singular value decomposition (RSVD) is by now a well-established technique for efficiently computing an approximate singular value decomposition of a matrix. Building on the ideas that underpin RSVD, the recently proposed algorithm “randUTV” computes a full factorization of a given matrix that provides low-rank approximations with near-optimal error. Because the bulk of randUTV is cast in terms of communication-efficient operations such as matrix-matrix multiplication and unpivoted QR factorizations, it is faster than competing rank-revealing factorization methods such as column-pivoted QR in most high-performance computational settings. In this article, optimized randUTV implementations are presented for both shared-memory and distributed-memory computing environments. For shared memory, randUTV is redesigned in terms of an algorithm-by-blocks that, together with a runtime task scheduler, eliminates bottlenecks from data synchronization points to achieve acceleration over the standard blocked algorithm based on a purely fork-join approach. The distributed-memory implementation is based on the ScaLAPACK library. The performance of our new codes compares favorably with competing factorizations available on both shared-memory and distributed-memory architectures. | ca_CA |
dc.format.extent | 42 p. | ca_CA |
dc.format.mimetype | application/pdf | ca_CA |
dc.language.iso | eng | ca_CA |
dc.publisher | Association for Computing Machinery (ACM) | ca_CA |
dc.relation.isPartOf | ACM Transactions on Mathematical Software (TOMS), 2022, vol. 48, no 2 | ca_CA |
dc.rights | Copyright © ACM, Inc. | ca_CA |
dc.rights.uri | http://rightsstatements.org/vocab/CNE/1.0/ | ca_CA |
dc.subject | mathematics of computing | ca_CA |
dc.subject | computations on matrices | ca_CA |
dc.title | Algorithm 1022: Efficient Algorithms for Computing a Rank-Revealing UTV Factorization on Parallel Computing Architectures | ca_CA |
dc.type | info:eu-repo/semantics/article | ca_CA |
dc.identifier.doi | https://doi.org/10.1145/3507466 | |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | ca_CA |
dc.relation.publisherVersion | https://dl.acm.org/doi/full/10.1145/3507466 | ca_CA |
dc.type.version | info:eu-repo/semantics/submittedVersion | ca_CA |
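
The abstract describes randUTV as building on the ideas behind RSVD and as dominated by matrix-matrix products and unpivoted QR factorizations. As a rough illustration of that building block only, the following is a minimal NumPy sketch of a basic randomized SVD. It is not randUTV and not the authors' code; the function name `rsvd` and the parameters `k`, `oversample`, `power_iters`, and `seed` are assumptions made for this example.

```python
import numpy as np


def rsvd(A, k, oversample=10, power_iters=0, seed=0):
    """Approximate rank-k SVD of A via random sampling (illustrative sketch).

    All parameter names are hypothetical; this is plain RSVD, not randUTV.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    ell = min(k + oversample, m, n)  # number of random samples, capped by the matrix size

    # Sample the column space of A with a Gaussian test matrix (matrix-matrix product).
    Y = A @ rng.standard_normal((n, ell))

    # Optional power iterations sharpen the sampled subspace when singular values decay slowly.
    for _ in range(power_iters):
        Q, _R = np.linalg.qr(Y)  # unpivoted QR keeps the basis well conditioned
        Y = A @ (A.T @ Q)

    # Orthonormal basis for the sampled range (unpivoted QR).
    Q, _R = np.linalg.qr(Y)

    # Project A onto the basis and take an SVD of the small matrix.
    B = Q.T @ A
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small

    return U[:, :k], s[:k], Vt[:k, :]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Test matrix of exact rank 5, so a rank-5 approximation should reproduce it.
    A = rng.standard_normal((60, 5)) @ rng.standard_normal((5, 40))
    U, s, Vt = rsvd(A, k=5)
    print(np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))
```

Unlike this sketch, which returns only a rank-k approximation, the abstract states that randUTV computes a full factorization of the matrix.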
This item appears in the following collection(s)
- ICC_Articles [425]