Reformulating the direct convolution for high-performance deep learning inference on ARM processors
Other documents by the authors: Barrachina Mir, Sergio; Castelló, Adrián; Dolz, Manuel F.; Low, Tze Meng; Martinez, Hector; Quintana-Orti, Enrique S.; Upasana, Sridhar; Tomás Domínguez, Andrés Enrique
Metadata
Title
Reformulating the direct convolution for high-performance deep learning inference on ARM processors
Publication date
2022-12-20
Publisher
Elsevier
ISSN
1383-7621
Bibliographic citation
Barrachina, S., Castelló, A., Dolz, M. F., Low, T. M., Martínez, H., Quintana-Ortí, E. S., ... & Tomás, A. E. (2023). Reformulating the direct convolution for high-performance deep learning inference on ARM processors. Journal of Systems Architecture, 135, 102806.
Document type
info:eu-repo/semantics/article
Version
info:eu-repo/semantics/publishedVersion
Abstract
We present two high-performance implementations of the convolution operator via the direct algorithm that outperform the so-called lowering approach based on the im2col transform plus the gemm kernel on an ARMv8-based processor. One of our methods presents the additional advantage of zero-memory overhead while the other employs an additional yet rather moderate workspace, substantially smaller than that required by the im2col+gemm solution. In contrast with a previous implementation of a similar zero-memory overhead direct convolution, this work exhibits the key advantage of preserving the conventional NHWC data layout for the input/output activations of the convolution layers.
Published in
Journal of Systems Architecture 135 (2023) 102806
Funding entity
Generalitat Valenciana | Junta de Andalucía | European High-Performance Computing Joint Undertaking (JU) | European Union’s Horizon 2020
Project or grant code
PID2020-113656RB-C21/-C22 | MCIN/AEI/10.13039/501100011033 | FJC2019-039222-I | MCIN/AEI/10.13039/501100011033 | CDEIGENT/2018/014 | POSTDOC_21_00025 | 955558
Access rights
1383-7621/© 2022 The Author(s). Published by Elsevier B.V.
info:eu-repo/semantics/openAccess
Appears in collections
- ICC_Articles [420]