Parallel GEMM-based convolutions for deep learning on multicore ARM and RISC-V architectures
dc.contributor.author | Martínez, Héctor | |
dc.contributor.author | Catalán, Sandra | |
dc.contributor.author | Castelló, Adrián | |
dc.contributor.author | Quintana-Ortí, Enrique S. | |
dc.date.accessioned | 2024-07-17T09:53:48Z | |
dc.date.available | 2024-07-17T09:53:48Z | |
dc.date.issued | 2024-05-24 | |
dc.identifier.citation | Martínez, Héctor, et al. "Parallel GEMM-based convolutions for deep learning on multicore ARM and RISC-V architectures." Journal of Systems Architecture (2024): 103186. | ca_CA |
dc.identifier.issn | 1383-7621 | |
dc.identifier.uri | http://hdl.handle.net/10234/208225 | |
dc.description.abstract | We present high performance, multi-threaded implementations of three GEMM-based convolution algorithms for multicore processors with ARM and RISC-V architectures. The codes are integrated into CONVLIB, a library that has the following unique features: (1) scripts to automatically generate a key component of GEMM, known as the micro-kernel, which is typically written in assembly language; (2) a modified analytical model to automatically tune the algorithms to the underlying cache architecture; (3) the ability to select four hyper-parameters: micro-kernel, cache parameters, parallel loop, and GEMM algorithm dynamically between calls to the library, without recompiling it; and (4) a driver to identify the best hyper-parameters. In addition, we provide a detailed performance evaluation of the convolution algorithms, on five ARM and RISC-V processors, and we publicly release the codes. | ca_CA |
dc.format.extent | 40 p. | ca_CA |
dc.format.mimetype | application/pdf | ca_CA |
dc.language.iso | eng | ca_CA |
dc.publisher | Elsevier | ca_CA |
dc.relation.isPartOf | Journal of Systems Architecture, 153 (2024) 103186 | ca_CA |
dc.relation.uri | Data will be made available on request. | ca_CA |
dc.rights | 1383-7621/© 2024 Published by Elsevier B.V. | ca_CA |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | ca_CA |
dc.subject | deep learning | ca_CA |
dc.subject | high performance computing | ca_CA |
dc.subject | low-power devices | ca_CA |
dc.title | Parallel GEMM-based convolutions for deep learning on multicore ARM and RISC-V architectures | ca_CA |
dc.type | info:eu-repo/semantics/article | ca_CA |
dc.identifier.doi | https://doi.org/10.1016/j.sysarc.2024.103186 | |
dc.rights.accessRights | info:eu-repo/semantics/embargoedAccess | ca_CA |
dc.type.version | info:eu-repo/semantics/acceptedVersion | ca_CA |
project.funder.name | MCIN/AEI/10.13039/501100011033 | ca_CA |
project.funder.name | Generalitat Valenciana | ca_CA |
project.funder.name | Junta de Andalucía | ca_CA |
project.funder.name | European Union “NextGenerationEU”/PRTR | ca_CA |
project.funder.name | Universitat Jaume I | ca_CA |
oaire.awardNumber | PID2020-113656RB-C22 | ca_CA |
oaire.awardNumber | PROMETEO 2023-CIPROM/2022/20 | ca_CA |
oaire.awardNumber | POSTDOC_21_00025 | ca_CA |
oaire.awardNumber | RYC2021-033973-I | ca_CA |
oaire.awardNumber | UJI-2023-04 | ca_CA |
dc.subject.ods | 9. Industry, innovation and infrastructure | ca_CA |
This item appears in the following collection(s)
- ICC_Articles [427]