Show simple item record
Optimising Convolutions for Deep Learning Inference On ARM Cortex-M Processors
dc.contributor.author | Maciá-Lillo, Antonio | |
dc.contributor.author | Barrachina Mir, Sergio | |
dc.contributor.author | Fabregat Llueca, German | |
dc.contributor.author | Dolz, Manuel F. | |
dc.date.accessioned | 2024-05-13T10:33:59Z | |
dc.date.available | 2024-05-13T10:33:59Z | |
dc.date.issued | 2024-04-30 | |
dc.identifier.citation | Maciá, A., Barrachina Mir, S., Fabregat Llueca, G., & Dolz, M. F. (2024). “Optimising Convolutions for Deep Learning Inference on ARM Cortex-M Processors”. in IEEE Internet of Things Journal. https://doi.org/10.1109/JIOT.2024.3395335 | ca_CA |
dc.identifier.issn | 2327-4662 | |
dc.identifier.uri | http://hdl.handle.net/10234/207311 | |
dc.description.abstract | We perform a series of optimisations on the convolution operator within the ARM CMSIS-NN library to improve the performance of deep learning tasks on Arduino development boards equipped with ARM Cortex-M4 and M7 microcontrollers. To this end, we develop custom microkernels that efficiently handle the internal computations required by the convolution operator via the lowering approach and the direct method, and we design two techniques to avoid register spilling. We also take advantage of all the RAM on the Arduino boards by reusing it as a scratchpad for the convolution filters. The integration of these techniques into CMSIS-NN, when invoked by TensorFlow Lite for Microcontrollers for quantised versions of VGG, SqueezeNet, ResNet, and MobileNet-like convolutional neural networks, enhances the overall inference speed by a factor ranging from 1.13× to 1.50×. | ca_CA |
dc.description.sponsorShip | Funding for open access charge: CRUE-Universitat Jaume I | |
dc.format.extent | 16 p. | ca_CA |
dc.format.mimetype | application/pdf | ca_CA |
dc.language.iso | eng | ca_CA |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | ca_CA |
dc.relation.isPartOf | IEEE Internet of Things Journal, 2024 | ca_CA |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | ca_CA |
dc.subject | ARM Cortex-M | ca_CA |
dc.subject | CMSIS-NN | ca_CA |
dc.subject | Convolution | ca_CA |
dc.subject | Convolution operator | ca_CA |
dc.subject | Deep learning | ca_CA |
dc.subject | Edge computing | ca_CA |
dc.subject | High performance | ca_CA |
dc.subject | Inference algorithms | ca_CA |
dc.subject | Microcontrollers | ca_CA |
dc.subject | Optimization | ca_CA |
dc.subject | Program processors | ca_CA |
dc.subject | Random access memory | ca_CA |
dc.subject | Registers | ca_CA |
dc.subject | Signal processing algorithms | ca_CA |
dc.title | Optimising Convolutions for Deep Learning Inference On ARM Cortex-M Processors | ca_CA |
dc.type | info:eu-repo/semantics/article | ca_CA |
dc.identifier.doi | 10.1109/JIOT.2024.3395335 | |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | ca_CA |
dc.relation.publisherVersion | https://ieeexplore.ieee.org/document/10513367 | ca_CA |
dc.type.version | info:eu-repo/semantics/publishedVersion | ca_CA |
project.funder.name | European Union NextGenerationEU | ca_CA |
oaire.awardNumber | TED2021-129334B | ca_CA |
Files in this item
This item appears in the following Collection(s)
- ICC_Articles [425]