Reformulating the direct convolution for high-performance deep learning inference on ARM processors
Other documents by the authors: Barrachina Mir, Sergio; Castelló, Adrián; Dolz, Manuel F.; Low, Tze Meng; Martinez, Hector; Quintana-Orti, Enrique S.; Upasana, Sridhar; Tomás Domínguez, Andrés Enrique
Metadata
Title
Reformulating the direct convolution for high-performance deep learning inference on ARM processors
Authors
Barrachina Mir, Sergio; Castelló, Adrián; Dolz, Manuel F.; Low, Tze Meng; Martinez, Hector; Quintana-Orti, Enrique S.; Upasana, Sridhar; Tomás Domínguez, Andrés Enrique
Publication date
2022-12-20
Publisher
Elsevier
ISSN
1383-7621
Bibliographic citation
Barrachina, S., Castelló, A., Dolz, M. F., Low, T. M., Martínez, H., Quintana-Ortí, E. S., ... & Tomás, A. E. (2023). Reformulating the direct convolution for high-performance deep learning inference on ARM processors. Journal of Systems Architecture, 135, 102806.
Document type
info:eu-repo/semantics/article
Version
info:eu-repo/semantics/publishedVersion
Abstract
We present two high-performance implementations of the convolution operator via the direct algorithm that outperform the so-called lowering approach based on the im2col transform plus the gemm kernel on an ARMv8-based processor. One of our methods presents the additional advantage of zero-memory overhead while the other employs an additional yet rather moderate workspace, substantially smaller than that required by the im2col+gemm solution. In contrast with a previous implementation of a similar zero-memory overhead direct convolution, this work exhibits the key advantage of preserving the conventional NHWC data layout for the input/output activations of the convolution layers.
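To make the memory contrast in the abstract concrete, the following is a minimal, unoptimized sketch in plain Python. It is not the authors' ARMv8 implementation: it only illustrates what a direct convolution over NHWC activations computes, and how large the extra buffer of an im2col-based lowering would be for the same layer. The function names, the `w[kh][kw][c][k]` filter ordering, unit dilation, and the absence of padding are all assumptions made for this illustration.

```python
def direct_conv_nhwc(x, w, stride=1):
    """Naive direct convolution without padding (illustrative only).

    x: input activations indexed x[n][h][w][c] (NHWC).
    w: filters indexed w[kh][kw][c][k] (assumed ordering, not from the paper).
    Returns output activations indexed y[n][ho][wo][k] (NHWC).
    """
    N, H, W, C = len(x), len(x[0]), len(x[0][0]), len(x[0][0][0])
    KH, KW, K = len(w), len(w[0]), len(w[0][0][0])
    HO = (H - KH) // stride + 1
    WO = (W - KW) // stride + 1
    # Only the output tensor is allocated; no intermediate workspace is used.
    y = [[[[0.0] * K for _ in range(WO)] for _ in range(HO)] for _ in range(N)]
    for n in range(N):
        for ho in range(HO):
            for wo in range(WO):
                out = y[n][ho][wo]
                for kh in range(KH):
                    for kw in range(KW):
                        pixel = x[n][ho * stride + kh][wo * stride + kw]
                        for c in range(C):
                            xv = pixel[c]
                            wrow = w[kh][kw][c]
                            for k in range(K):
                                out[k] += xv * wrow[k]
    return y


def im2col_workspace_elems(N, H, W, C, KH, KW, stride=1):
    """Element count of the matrix that an im2col lowering materializes:
    one row per output pixel, one column per (kh, kw, c) filter tap."""
    HO = (H - KH) // stride + 1
    WO = (W - KW) // stride + 1
    return (N * HO * WO) * (KH * KW * C)


if __name__ == "__main__":
    # 1x3x3x1 input holding the values 1..9, and a 2x2 all-ones filter.
    x = [[[[float(3 * r + col + 1)] for col in range(3)] for r in range(3)]]
    w = [[[[1.0]] for _ in range(2)] for _ in range(2)]
    print(direct_conv_nhwc(x, w))
    print(im2col_workspace_elems(1, 3, 3, 1, 2, 2))
```

In this toy case the input holds 9 elements while the im2col matrix would hold 16, because overlapping filter windows are replicated; the direct loop nest reads the NHWC input in place instead.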
Published in
Journal of Systems Architecture 135 (2023) 102806
Funding entity
Generalitat Valenciana | Junta de Andalucía | European High-Performance Computing Joint Undertaking (JU) | European Union’s Horizon 2020
Project or grant code
PID2020-113656RB-C21/-C22 | MCIN/AEI/10.13039/501100011033 | FJC2019-039222-I | MCIN/AEI/10.13039/501100011033 | CDEIGENT/2018/014 | POSTDOC_21_00025 | 955558
Access rights
1383-7621/© 2022 The Author(s). Published by Elsevier B.V.
info:eu-repo/semantics/openAccess
Appears in collections
- ICC_Articles [414]