Efficient and portable Winograd convolutions for multi-core processors
Metadata
Title
Efficient and portable Winograd convolutions for multi-core processors
Authorship
Publication date
2023-02-12
Publisher
Springer
ISSN
0920-8542; 1573-0484
Bibliographic citation
Dolz, M.F., Martínez, H., Castelló, A. et al. Efficient and portable Winograd convolutions for multi-core processors. J Supercomput 79, 10589–10610 (2023). https://doi.org/10.1007/s11227-023-05088-4
Document type
info:eu-repo/semantics/article
Version
info:eu-repo/semantics/publishedVersion
Keywords / Subjects
Abstract
We take a step forward towards developing high-performance codes for the convolution operator, based on the Winograd algorithm, that are easy to customise for general-purpose processor architectures. In our approach, augmenting the portability of the solution is achieved via the introduction of vector instructions from Intel SSE/AVX2/AVX512 and ARM NEON/SVE to exploit the single-instruction multiple-data capabilities of current processors as well as OpenMP pragmas to exploit multi-threaded parallelism. While this comes at the cost of sacrificing a fraction of the computational performance, our experimental results on three distinct processors, with Intel Xeon Skylake, ARM Cortex A57 and Fujitsu A64FX processors, show that the impact is affordable and still renders a Winograd-based solution that is competitive when compared with the lowering GEMM-based convolution.
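For readers unfamiliar with the Winograd algorithm named in the abstract, the following is a minimal scalar sketch of the textbook one-dimensional F(2,3) variant, which produces two outputs of a 3-tap filter with four multiplications instead of six. It is not the authors' implementation: the function names `direct_conv_1d` and `winograd_f23` are hypothetical, and the vector instructions and OpenMP parallelism described in the abstract are not shown.

```python
def direct_conv_1d(d, g):
    """Reference result: two outputs of a 3-tap filter (6 multiplications)."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Textbook Winograd F(2,3): same two outputs with 4 multiplications.

    d: 4 input samples, g: 3 filter taps (illustrative scalar version).
    """
    # Filter transform; in practice this is computed once per filter.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform (additions and subtractions only).
    v0 = d[0] - d[2]
    v1 = d[1] + d[2]
    v2 = d[2] - d[1]
    v3 = d[1] - d[3]

    # Element-wise products: the only 4 multiplications on the data path.
    m0, m1, m2, m3 = u0 * v0, u1 * v1, u2 * v2, u3 * v3

    # Output transform (additions and subtractions only).
    return [m0 + m1 + m2, m1 - m2 - m3]


if __name__ == "__main__":
    d = [0.5, -1.0, 2.0, 3.0]
    g = [2.0, 4.0, 6.0]
    print(direct_conv_1d(d, g))
    print(winograd_f23(d, g))
```

Both functions should agree up to floating-point rounding. The saving comes from moving work into the transforms, which use only additions, subtractions and a filter-side scaling that can be precomputed.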
Published in
The Journal of Supercomputing (2023) 79:10589–10610
Related data
The ImageNet dataset used for the current study is publicly available from the web. See https://www.image-net.org/.
Funding entity
CRUE-CSIC | Generalitat Valenciana | Junta de Andalucía
Project or grant code
PID2020-113656RB-C21/C22 | MCIN/AEI/10.13039/501100011033 | CDEIGENT/2018/014 | POSTDOC_21_00025 | FJC2019-039222-I | MCIN/AEI/10.13039/501100011033
Access rights
© The Author(s) 2023
info:eu-repo/semantics/openAccess
Appears in collections
- ICC_Articles [423]