Reformulating the direct convolution for high-performance deep learning inference on ARM processors
Other documents of the author: Barrachina Mir, Sergio; Castelló, Adrián; Dolz, Manuel F.; Low, Tze Meng; Martinez, Hector; Quintana-Orti, Enrique S.; Upasana, Sridhar; Tomás Domínguez, Andrés Enrique
Metadata
Title
Reformulating the direct convolution for high-performance deep learning inference on ARM processors
Author(s)
Barrachina Mir, Sergio; Castelló, Adrián; Dolz, Manuel F.; Low, Tze Meng; Martinez, Hector; Quintana-Orti, Enrique S.; Upasana, Sridhar; Tomás Domínguez, Andrés Enrique
Date
2022-12-20
Publisher
Elsevier
ISSN
1383-7621
Bibliographic citation
Barrachina, S., Castelló, A., Dolz, M. F., Low, T. M., Martínez, H., Quintana-Ortí, E. S., ... & Tomás, A. E. (2023). Reformulating the direct convolution for high-performance deep learning inference on ARM processors. Journal of Systems Architecture, 135, 102806.
Type
info:eu-repo/semantics/article
Version
info:eu-repo/semantics/publishedVersion
Abstract
We present two high-performance implementations of the convolution operator via the direct algorithm that outperform the so-called lowering approach based on the im2col transform plus the gemm kernel on an ARMv8-based processor. One of our methods presents the additional advantage of zero-memory overhead while the other employs an additional yet rather moderate workspace, substantially smaller than that required by the im2col+gemm solution. In contrast with a previous implementation of a similar zero-memory overhead direct convolution, this work exhibits the key advantage of preserving the conventional NHWC data layout for the input/output activations of the convolution layers.
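To make the contrast in the abstract concrete, the following is a minimal NumPy sketch of the two general approaches: a direct convolution that reads the NHWC input in place, and the lowering approach that first builds an im2col workspace and then calls a single matrix product. It is not the authors' ARMv8 implementation. The function names, the (KH, KW, C, K) filter layout, unit stride, and the absence of padding are all assumptions made for illustration, and NumPy itself allocates temporaries, so the sketch shows only the loop structure and the size of the explicit im2col buffer.

```python
import numpy as np

def direct_conv_nhwc(x, w):
    """Direct convolution sketch (stride 1, no padding; illustrative only).

    x: input activations, shape (N, H, W, C), i.e. NHWC
    w: filters, shape (KH, KW, C, K) (layout assumed for this sketch)
    returns: output activations, shape (N, HO, WO, K)
    """
    n, h, wd, c = x.shape
    kh, kw, c_w, k = w.shape
    assert c == c_w, "channel mismatch between input and filters"
    ho, wo = h - kh + 1, wd - kw + 1
    y = np.zeros((n, ho, wo, k), dtype=x.dtype)
    for i in range(kh):
        for j in range(kw):
            # Shifted view of the input (no explicit workspace is built);
            # each (WO, C) slab is multiplied by the (C, K) filter slice.
            y += x[:, i:i + ho, j:j + wo, :] @ w[i, j]
    return y

def im2col_gemm_nhwc(x, w):
    """Lowering sketch: explicit im2col buffer followed by one gemm.

    Returns the output and the number of elements in the im2col workspace.
    """
    n, h, wd, c = x.shape
    kh, kw, _, k = w.shape
    ho, wo = h - kh + 1, wd - kw + 1
    # Workspace with one row per output pixel and KH*KW*C columns.
    cols = np.empty((n * ho * wo, kh * kw * c), dtype=x.dtype)
    for i in range(kh):
        for j in range(kw):
            start = (i * kw + j) * c
            cols[:, start:start + c] = x[:, i:i + ho, j:j + wo, :].reshape(-1, c)
    y = cols @ w.reshape(kh * kw * c, k)
    return y.reshape(n, ho, wo, k), cols.size

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((1, 5, 5, 2))
    w = rng.standard_normal((3, 3, 2, 4))
    y_direct = direct_conv_nhwc(x, w)
    y_lowered, workspace = im2col_gemm_nhwc(x, w)
    print("results match:", np.allclose(y_direct, y_lowered))
    print("input elements:", x.size, "im2col workspace elements:", workspace)
```

In this toy configuration the im2col buffer already holds more elements than the input itself, which is the kind of overhead the direct formulations described in the abstract aim to avoid or reduce.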
Is part of
Journal of Systems Architecture 135 (2023) 102806
Funder Name
Generalitat Valenciana | Junta de Andalucía | European High-Performance Computing Joint Undertaking (JU) | European Union’s Horizon 2020
Project code
PID2020-113656RB-C21/-C22 | MCIN/AEI/10.13039/501100011033 | FJC2019-039222-I | MCIN/AEI/10.13039/501100011033 | CDEIGENT/2018/014 | POSTDOC_21_00025 | 955558
Rights
1383-7621/© 2022 The Author(s). Published by Elsevier B.V.
info:eu-repo/semantics/openAccess
This item appears in the following collection(s)
- ICC_Articles [420]