Evaluating the soft error sensitivity of a GPU-based SoC for matrix multiplication
Authors: León, Germán; Badía, José; Belloch, Jose A.; Lindoso, Almudena; Entrena, Luis
Metadata
Title
Evaluating the soft error sensitivity of a GPU-based SoC for matrix multiplication
Publication date
2020
Publisher
Elsevier
ISSN
0026-2714
Bibliographic citation
LEÓN, Germán, et al. Evaluating the soft error sensitivity of a GPU-based SoC for matrix multiplication. Microelectronics Reliability, 2020, vol. 114, p. 113856.
Document type
info:eu-repo/semantics/article
Publisher's version
https://www.sciencedirect.com/science/article/pii/S0026271420304558
Version
info:eu-repo/semantics/submittedVersion
Abstract
System-on-Chip (SoC) devices can combine low-power multicore processors with a small graphics accelerator
(GPU), offering a trade-off between computational capacity and low power consumption. In this work we use the LLFI-GPU
fault injection tool on one of these devices to compare the sensitivity to soft errors of two different CUDA versions of a matrix
multiplication benchmark. Specifically, we perform fault injection campaigns on a Jetson TK1 development kit, a board equipped
with a SoC that includes an NVIDIA "Kepler" Graphics Processing Unit (GPU). We evaluate how the problem size and the
thread-block size affect the behaviour of the algorithms. Our results show that the block version of the matrix
multiplication benchmark, which leverages the shared memory of the GPU, is not only faster than the element-wise version but
also much more resilient to soft errors. We also use the cuda-gdb debugger to analyze the main causes of the crashes that soft
errors produce in the code. Our experiments show that most of the errors are due to accesses to invalid positions of the different
memories of the GPU and, as a result, the block version suffers a higher percentage of this kind of error.
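
To make the comparison in the abstract concrete, the following is a minimal CPU-side sketch in Python, not the authors' CUDA code and not LLFI-GPU. It contrasts an element-wise matrix multiplication with a tiled (block) one, and models a soft error as a single bit flip in the IEEE-754 single-precision encoding of one input value. All function names (flip_bit, matmul_elementwise, matmul_blocked, classify) are hypothetical, and the "masked" and "SDC" (silent data corruption) labels are conventional fault-injection terms assumed here rather than taken from the abstract. Crashes caused by invalid memory accesses, which the abstract analyzes with cuda-gdb, are not modelled.

```python
import struct


def flip_bit(value, bit):
    """Flip one bit (0..31) of the float32 encoding of value (hypothetical helper)."""
    (word,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", word ^ (1 << bit)))
    return flipped


def matmul_elementwise(a, b, n):
    """Each output element is computed independently, as one GPU thread would."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0.0
            for k in range(n):
                acc += a[i][k] * b[k][j]
            c[i][j] = acc
    return c


def matmul_blocked(a, b, n, tile):
    """Tiled version: each tile of C accumulates products of tiles of A and B.

    On a GPU the tiles would be staged in shared memory; here the tiling
    only changes the loop order.
    """
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, n, tile):
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = c[i][j]
                        for k in range(k0, min(k0 + tile, n)):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc
    return c


def classify(golden, observed):
    """Compare a faulty run against the fault-free ("golden") result."""
    return "masked" if observed == golden else "SDC"


if __name__ == "__main__":
    n = 2
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    golden = matmul_elementwise(a, b, n)

    # Inject one fault: flip the sign bit of a[0][0] before the computation.
    faulty_a = [row[:] for row in a]
    faulty_a[0][0] = flip_bit(faulty_a[0][0], 31)

    print(classify(golden, matmul_elementwise(faulty_a, b, n)))
    print(classify(golden, matmul_blocked(faulty_a, b, n, tile=1)))
```

A real campaign of the kind the abstract describes would inject faults during kernel execution on the device and repeat the experiment many times; this sketch only corrupts an input value once to show the golden-run comparison.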
Published in
Microelectronics Reliability, 2020, vol. 114.
Funding entity
Gobierno de España | European Commission | Generalitat Valenciana
Project or grant code
TIN2017-82972-R | ESP2015-68245-C4-1-P | PROMETEO/2019/109
Access rights
0026-2714/ © 2020 Elsevier Ltd. All rights reserved.
http://rightsstatements.org/vocab/InC/1.0/
info:eu-repo/semantics/openAccess
Appears in collections
- ICC_Articles [418]