Parallel GEMM-based convolutions for deep learning on multicore ARM and RISC-V architectures

Martinez, Hector; Catalán, Sandra; Castelló, Adrián; Quintana-Orti, Enrique S.

dc.contributor.author	Martinez, Hector
dc.contributor.author	Catalán, Sandra
dc.contributor.author	Castelló, Adrián
dc.contributor.author	Quintana-Orti, Enrique S.
dc.date.accessioned	2024-07-17T09:53:48Z
dc.date.available	2024-07-17T09:53:48Z
dc.date.issued	2024-05-24
dc.identifier.citation	Martínez, Héctor, et al. "Parallel GEMM-based convolutions for deep learning on multicore ARM and RISC-V architectures." Journal of Systems Architecture (2024): 103186.	ca_CA
dc.identifier.issn	1383-7621
dc.identifier.uri	http://hdl.handle.net/10234/208225
dc.description.abstract	We present high-performance, multi-threaded implementations of three GEMM-based convolution algorithms for multicore processors with ARM and RISC-V architectures. The codes are integrated into CONVLIB, a library that has the following unique features: (1) scripts to automatically generate a key component of GEMM, known as the micro-kernel, which is typically written in assembly language; (2) a modified analytical model to automatically tune the algorithms to the underlying cache architecture; (3) the ability to select four hyper-parameters (micro-kernel, cache parameters, parallel loop, and GEMM algorithm) dynamically between calls to the library, without recompiling it; and (4) a driver to identify the best hyper-parameters. In addition, we provide a detailed performance evaluation of the convolution algorithms on five ARM and RISC-V processors, and we publicly release the codes.	ca_CA
dc.format.extent	40 p.	ca_CA
dc.format.mimetype	application/pdf	ca_CA
dc.language.iso	eng	ca_CA
dc.publisher	Elsevier	ca_CA
dc.relation.isPartOf	Journal of Systems Architecture, 153 (2024) 103186	ca_CA
dc.relation.uri	Data will be made available on request.	ca_CA
dc.rights	1383-7621/© 2024 Published by Elsevier B.V.	ca_CA
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	ca_CA
dc.subject	deep learning	ca_CA
dc.subject	high performance computing	ca_CA
dc.subject	low-power devices	ca_CA
dc.title	Parallel GEMM-based convolutions for deep learning on multicore ARM and RISC-V architectures	ca_CA
dc.type	info:eu-repo/semantics/article	ca_CA
dc.identifier.doi	https://doi.org/10.1016/j.sysarc.2024.103186
dc.rights.accessRights	info:eu-repo/semantics/embargoedAccess	ca_CA
dc.type.version	info:eu-repo/semantics/acceptedVersion	ca_CA
project.funder.name	MCIN/AEI/10.13039/501100011033	ca_CA
project.funder.name	Generalitat Valenciana	ca_CA
project.funder.name	Junta de Andalucía	ca_CA
project.funder.name	European Union “NextGenerationEU”/PRTR	ca_CA
project.funder.name	Universitat Jaume I	ca_CA
oaire.awardNumber	PID2020-113656RB-C22	ca_CA
oaire.awardNumber	PROMETEO 2023-CIPROM/2022/20	ca_CA
oaire.awardNumber	POSTDOC_21_00025	ca_CA
oaire.awardNumber	RYC2021-033973-I	ca_CA
oaire.awardNumber	UJI-2023-04	ca_CA
dc.subject.ods	9. Industry, innovation and infrastructure	ca_CA

Files in this item

Name: martínez_2023_parallel.pdf
Size: 663.8Kb
Format: PDF
Description: Postprint

This item appears in the following collection(s)

ICC_Articles [427]