Show simple item record
Optimising Convolutions for Deep Learning Inference On ARM Cortex-M Processors
dc.contributor.author | Maciá-Lillo, Antonio | |
dc.contributor.author | Barrachina Mir, Sergio | |
dc.contributor.author | Fabregat Llueca, German | |
dc.contributor.author | Dolz, Manuel F. | |
dc.date.accessioned | 2024-05-13T10:33:59Z | |
dc.date.available | 2024-05-13T10:33:59Z | |
dc.date.issued | 2024-04-30 | |
dc.identifier.citation | Maciá, A., Barrachina Mir, S., Fabregat Llueca, G., & Dolz, M. F. (2024). “Optimising Convolutions for Deep Learning Inference on ARM Cortex-M Processors”. in IEEE Internet of Things Journal. https://doi.org/10.1109/JIOT.2024.3395335 | ca_CA |
dc.identifier.issn | 2327-4662 | |
dc.identifier.uri | http://hdl.handle.net/10234/207311 | |
dc.description.abstract | We perform a series of optimisations on the convolution operator within the ARM CMSIS-NN library to improve the performance of deep learning tasks on Arduino development boards equipped with ARM Cortex-M4 and M7 microcontrollers. To this end, we develop custom microkernels that efficiently handle the internal computations required by the convolution operator via the lowering approach and the direct method, and we design two techniques to avoid register spilling. We also take advantage of all the RAM on the Arduino boards by reusing it as a scratchpad for the convolution filters. The integration of these techniques into CMSIS-NN, when invoked by TensorFlow Lite for Microcontrollers for quantised versions of VGG, SqueezeNet, ResNet, and MobileNet-like convolutional neural networks, enhances the overall inference speed by a factor ranging from 1.13× to 1.50×. | ca_CA |
dc.description.sponsorShip | Funding for open access charge: CRUE-Universitat Jaume I | |
dc.format.extent | 16 p. | ca_CA |
dc.format.mimetype | application/pdf | ca_CA |
dc.language.iso | eng | ca_CA |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | ca_CA |
dc.relation.isPartOf | IEEE Internet of Things Journal, 2024 | ca_CA |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | ca_CA |
dc.subject | ARM Cortex-M | ca_CA |
dc.subject | CMSIS-NN | ca_CA |
dc.subject | Convolution | ca_CA |
dc.subject | Convolution operator | ca_CA |
dc.subject | Deep learning | ca_CA |
dc.subject | Edge computing | ca_CA |
dc.subject | High performance | ca_CA |
dc.subject | Inference algorithms | ca_CA |
dc.subject | Microcontrollers | ca_CA |
dc.subject | Optimization | ca_CA |
dc.subject | Program processors | ca_CA |
dc.subject | Random access memory | ca_CA |
dc.subject | Registers | ca_CA |
dc.subject | Signal processing algorithms | ca_CA |
dc.title | Optimising Convolutions for Deep Learning Inference On ARM Cortex-M Processors | ca_CA |
dc.type | info:eu-repo/semantics/article | ca_CA |
dc.identifier.doi | 10.1109/JIOT.2024.3395335 | |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | ca_CA |
dc.relation.publisherVersion | https://ieeexplore.ieee.org/document/10513367 | ca_CA |
dc.type.version | info:eu-repo/semantics/publishedVersion | ca_CA |
project.funder.name | European Union NextGenerationEU | ca_CA |
oaire.awardNumber | TED2021-129334B | ca_CA |
Files in this item
This item appears in the following Collection(s)
- ICC_Articles [425]