Show simple item record

dc.contributor.author: Dolz, Manuel F.
dc.contributor.author: Martínez, Héctor
dc.contributor.author: Alonso, Pedro
dc.contributor.author: Quintana-Orti, Enrique S.
dc.date.accessioned: 2023-03-06T10:38:28Z
dc.date.available: 2023-03-06T10:38:28Z
dc.date.issued: 2022
dc.identifier.citation: DOLZ, Manuel F., et al. Convolution Operators for Deep Learning Inference on the Fujitsu A64FX Processor. In: 2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). IEEE, 2022. p. 1-10.
dc.identifier.isbn: 9781665451550
dc.identifier.uri: http://hdl.handle.net/10234/201928
dc.description: Paper presented at the 2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), held in Bordeaux, France.
dc.description.abstract: The convolution operator is a crucial kernel for many computer vision and signal processing applications that rely on deep learning (DL) technologies. As such, the efficient implementation of this operator has received considerable attention in the past few years for a fair range of processor architectures. In this paper, we follow the technology trend toward integrating long SIMD (single instruction, multiple data) arithmetic units into high performance multicore processors to analyse the benefits of this type of hardware acceleration for latency-constrained DL workloads. For this purpose, we implement and optimise, for the Fujitsu A64FX processor, three distinct methods for the calculation of the convolution, namely, the lowering approach, a blocked variant of the direct convolution algorithm, and the Winograd minimal filtering algorithm. Our experimental results include an extensive evaluation of the parallel scalability of these three methods and a comparison of their global performance using three popular DL models and a representative dataset.
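Of the three methods named in the abstract, the lowering approach is the simplest to illustrate: input patches are unfolded into a matrix (often called im2col) so the convolution collapses into a single matrix multiplication. The following NumPy sketch shows the idea only; it is not the authors' optimised A64FX implementation, and the function name and shapes are illustrative assumptions.

```python
import numpy as np

def im2col_conv(x, w):
    """Convolution via the lowering (im2col) approach.

    Illustrative sketch: unfold input patches into a matrix, then
    compute the convolution as one GEMM (matrix multiplication).

    x: input of shape (C, H, W); w: filters of shape (K, C, R, S).
    Returns output of shape (K, H-R+1, W-S+1) (no padding, stride 1).
    """
    C, H, W = x.shape
    K, _, R, S = w.shape
    Ho, Wo = H - R + 1, W - S + 1
    # Build the lowered matrix: one column per output pixel,
    # each column holding the C*R*S input values it depends on.
    cols = np.empty((C * R * S, Ho * Wo))
    for i in range(Ho):
        for j in range(Wo):
            cols[:, i * Wo + j] = x[:, i:i + R, j:j + S].ravel()
    # The convolution is now a single GEMM against the flattened filters.
    out = w.reshape(K, -1) @ cols
    return out.reshape(K, Ho, Wo)
```

The trade-off the paper studies is visible even here: lowering reuses a highly tuned GEMM but materialises the (much larger) `cols` buffer, whereas the direct and Winograd variants avoid or reduce that memory overhead.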
dc.format.extent: 10 p.
dc.format.mimetype: application/pdf
dc.language.iso: eng
dc.publisher: IEEE
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: Convolutional neural networks
dc.subject: high performance
dc.subject: SIMD arithmetic units
dc.subject: ARM-based A64FX processor
dc.title: Convolution Operators for Deep Learning Inference on the Fujitsu A64FX Processor
dc.type: info:eu-repo/semantics/conferenceObject
dc.identifier.doi: https://doi.org/10.1109/SBAC-PAD55451.2022.00027
dc.rights.accessRights: info:eu-repo/semantics/openAccess
dc.relation.publisherVersion: https://ieeexplore.ieee.org/document/9980987/authors#authors
dc.type.version: info:eu-repo/semantics/publishedVersion
project.funder.name: Ministerio de Ciencia, Innovación y Universidades
project.funder.name: Generalitat Valenciana
project.funder.name: European High Performance Computing Joint Undertaking (JU)
oaire.awardNumber: TIN2017-82972
oaire.awardNumber: Prometeo/2019/109
oaire.awardNumber: CDEIGENT/2018/014
oaire.awardNumber: Grant agreement No 955558


Files in this item


This item appears in the following collection(s)


Except where otherwise noted, this item's license is described as: http://creativecommons.org/licenses/by-nc-nd/4.0/