Listar ICC_Articles por autoría "cb0a5025-3e3c-47d6-a9f9-917521abc79f"
Mostrando ítems 1-3 de 3
-
Automatic generation of ARM NEON micro‑kernels for matrix multiplication
Alaejos, Guillermo; Martínez, Héctor; Castelló, Adrián; Dolz, Manuel F.; Igual, Francisco; Alonso-Jordá, Pedro; Quintana-Orti, Enrique S. Springer (2024-03-12)General matrix multiplication (gemm) is a fundamental kernel in scientifc computing and current frameworks for deep learning. Modern realisations of gemm are mostly written in C, on top of a small, highly tuned micro-kernel ... -
Efficient and portable Winograd convolutions for multi-core processors
Dolz, Manuel F.; Martínez, Héctor; Castelló, Adrián; Alonso-Jordá, Pedro; Quintana-Orti, Enrique S. Springer (2023-02-12)We take a step forward towards developing high-performance codes for the convolution operator, based on the Winograd algorithm, that are easy to customise for general-purpose processor architectures. In our approach, ... -
Performance–energy trade‑ofs of deep learning convolution algorithms on ARM processors
Dolz, Manuel F.; Barrachina Mir, Sergio; Martínez, Héctor; Castelló, Adrián; Maciá, Antonio; Fabregat Llueca, German; Tomás, Andrés E. Springer (2023)In this work, we assess the performance and energy efciency of high-performance codes for the convolution operator, based on the direct, explicit/implicit lowering and Winograd algorithms used for deep learning (DL) ...