dc.contributor.author: Barrachina Mir, Sergio
dc.contributor.author: Castelló, Adrián
dc.contributor.author: Dolz, Manuel F.
dc.contributor.author: Low, Tze Meng
dc.contributor.author: Martinez, Hector
dc.contributor.author: Quintana-Orti, Enrique S.
dc.contributor.author: Upasana, Sridhar
dc.contributor.author: Tomás Domínguez, Andrés Enrique
dc.date.accessioned: 2023-01-30T08:14:06Z
dc.date.available: 2023-01-30T08:14:06Z
dc.date.issued: 2022-12-20
dc.identifier.citation: Barrachina, S., Castelló, A., Dolz, M. F., Low, T. M., Martínez, H., Quintana-Ortí, E. S., ... & Tomás, A. E. (2023). Reformulating the direct convolution for high-performance deep learning inference on ARM processors. Journal of Systems Architecture, 135, 102806. [ca_CA]
dc.identifier.issn: 1383-7621
dc.identifier.uri: http://hdl.handle.net/10234/201463
dc.description.abstract: We present two high-performance implementations of the convolution operator via the direct algorithm that outperform the so-called lowering approach based on the im2col transform plus the gemm kernel on an ARMv8-based processor. One of our methods presents the additional advantage of zero-memory overhead while the other employs an additional yet rather moderate workspace, substantially smaller than that required by the im2col+gemm solution. In contrast with a previous implementation of a similar zero-memory overhead direct convolution, this work exhibits the key advantage of preserving the conventional NHWC data layout for the input/output activations of the convolution layers. [ca_CA]
dc.description.sponsorShip: Funding for open access charge: CRUE-Universitat Jaume I
dc.format.extent: 13 p. [ca_CA]
dc.format.mimetype: application/pdf [ca_CA]
dc.language.iso: eng [ca_CA]
dc.publisher: Elsevier [ca_CA]
dc.relation.isPartOf: Journal of Systems Architecture 135 (2023) 102806 [ca_CA]
dc.rights: 1383-7621/© 2022 The Author(s). Published by Elsevier B.V. [ca_CA]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [ca_CA]
dc.subject: convolution [ca_CA]
dc.subject: direct algorithm [ca_CA]
dc.subject: deep learning [ca_CA]
dc.subject: high performance [ca_CA]
dc.subject: ARMv8 architecture [ca_CA]
dc.title: Reformulating the direct convolution for high-performance deep learning inference on ARM processors [ca_CA]
dc.type: info:eu-repo/semantics/article [ca_CA]
dc.identifier.doi: https://doi.org/10.1016/j.sysarc.2022.102806
dc.rights.accessRights: info:eu-repo/semantics/openAccess [ca_CA]
dc.type.version: info:eu-repo/semantics/publishedVersion [ca_CA]
project.funder.name: Generalitat Valenciana [ca_CA]
project.funder.name: Junta de Andalucía [ca_CA]
project.funder.name: European High-Performance Computing Joint Undertaking (JU) [ca_CA]
project.funder.name: European Union’s Horizon 2020 [ca_CA]
oaire.awardNumber: PID2020-113656RB-C21/-C22 [ca_CA]
oaire.awardNumber: MCIN/AEI/10.13039/501100011033 [ca_CA]
oaire.awardNumber: FJC2019-039222-I [ca_CA]
oaire.awardNumber: MCIN/AEI/10.13039/501100011033 [ca_CA]
oaire.awardNumber: CDEIGENT/2018/014 [ca_CA]
oaire.awardNumber: POSTDOC_21_00025 [ca_CA]
oaire.awardNumber: 955558 [ca_CA]


Except where otherwise noted, this item's license is described as: 1383-7621/© 2022 The Author(s). Published by Elsevier B.V.