Performance–energy trade-offs of deep learning convolution algorithms on ARM processors
Other documents by the authors: Dolz, Manuel F.; Barrachina Mir, Sergio; Martínez, Héctor; Castelló, Adrián; Maciá, Antonio; Fabregat Llueca, German; Tomás, Andrés E.
Title
Performance–energy trade-offs of deep learning convolution algorithms on ARM processors
Publication date
2023
Publisher
Springer
Bibliographic citation
Dolz, M.F., Barrachina, S., Martínez, H. et al. Performance–energy trade-offs of deep learning convolution algorithms on ARM processors. J Supercomput (2023). https://doi.org/10.1007/s11227-023-05050-4
Document type
info:eu-repo/semantics/article
Publisher's version
https://link.springer.com/article/10.1007/s11227-023-05050-4
Version
info:eu-repo/semantics/publishedVersion
Keywords / Subjects
Abstract
In this work, we assess the performance and energy efficiency of high-performance codes for the convolution operator, based on the direct, explicit/implicit lowering and Winograd algorithms used for deep learning (DL) inference, on a series of ARM-based processor architectures. Specifically, we evaluate the NVIDIA Denver2 and Carmel processors, as well as the ARM Cortex-A57 and Cortex-A78AE CPUs, as part of a recent set of NVIDIA Jetson platforms. The performance–energy evaluation is carried out using the ResNet-50 v1.5 convolutional neural network (CNN) on varying configurations of convolution algorithms, number of threads/cores, and operating frequencies on the tested processor cores. The results demonstrate that the best throughput is obtained on all platforms with the Winograd convolution operator running on all the cores at their highest frequency. However, if the goal is to reduce the energy footprint, there is no rule of thumb for the optimal configuration.
Published in
The Journal of Supercomputing, 2023.
Funding entity
CRUE-CSIC agreement with Springer Nature | Agencia Estatal de Investigación | Generalitat Valenciana | Junta de Andalucía
Project or grant code
PID2020-113656RB-C21/C22 | CDEIGENT/2018/014 | POSTDOC_21_00025 | FJC2019-039222-I | PRE2021-099284
Access rights
© The Author(s) 2023
info:eu-repo/semantics/openAccess
Appears in collections
- ICC_Articles [419]