dc.contributor.author: Badia, Jose M.
dc.contributor.author: Amor-Martin, Adrian
dc.contributor.author: Belloch, Jose A.
dc.contributor.author: Garcia-Castillo, Luis Emilio
dc.date.accessioned: 2023-01-30T08:48:17Z
dc.date.available: 2023-01-30T08:48:17Z
dc.date.issued: 2022-12-02
dc.identifier.citation: BADIA, Jose M., et al. Strategies to parallelize a finite element mesh truncation technique on multi-core and many-core architectures. The Journal of Supercomputing, 79, 7648–7664 (2023). [ca_CA]
dc.identifier.issn: 0920-8542
dc.identifier.issn: 1573-0484
dc.identifier.uri: http://hdl.handle.net/10234/201465
dc.description.abstract: Achieving maximum parallel performance on multi-core CPUs and many-core GPUs is a challenging task that depends on multiple factors. These include, for example, the number and granularity of the computations and the way the memories of the devices are used. In this paper, we assess those factors by evaluating and comparing different parallelizations of the same problem on a multiprocessor containing a CPU with 40 cores and four P100 GPUs with Pascal architecture. As a case study, we use the convolutional operation behind a non-standard finite element mesh truncation technique in the context of open region electromagnetic wave propagation problems. A total of six parallel algorithms implemented using OpenMP and CUDA have been used to carry out the comparison by leveraging the same levels of parallelism on both types of platforms. Three of the algorithms are presented for the first time in this paper, including a multi-GPU method, and two others are improved versions of algorithms previously developed by some of the authors. This paper presents a thorough experimental evaluation of the parallel algorithms on a radar cross-section prediction problem. Results show that the performance obtained on the GPU clearly exceeds that obtained on the CPU, much more so if multiple GPUs are used to distribute both data and computations. Speedups close to 30 have been obtained on the CPU, while speedups larger than 250 have been achieved with the multi-GPU version. [ca_CA]
dc.description.sponsorShip: Funding for open access charge: CRUE-Universitat Jaume I
dc.format.extent: 17 p. [ca_CA]
dc.format.mimetype: application/pdf [ca_CA]
dc.language.iso: eng [ca_CA]
dc.publisher: Springer [ca_CA]
dc.rights: © The Author(s) 2022 [ca_CA]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [ca_CA]
dc.subject: Parallel computing [ca_CA]
dc.subject: CUDA [ca_CA]
dc.subject: OpenMP [ca_CA]
dc.subject: Finite elements [ca_CA]
dc.subject: GPU [ca_CA]
dc.title: Strategies to parallelize a finite element mesh truncation technique on multi-core and many-core architectures [ca_CA]
dc.type: info:eu-repo/semantics/article [ca_CA]
dc.identifier.doi: https://doi.org/10.1007/s11227-022-04975-6
dc.rights.accessRights: info:eu-repo/semantics/openAccess [ca_CA]
dc.type.version: info:eu-repo/semantics/publishedVersion [ca_CA]
project.funder.name: Gobierno de España [ca_CA]
project.funder.name: Generalitat Valenciana [ca_CA]
project.funder.name: Gobierno de la Comunidad de Madrid [ca_CA]
oaire.awardNumber: PID2020-113656RB-C21 [ca_CA]
oaire.awardNumber: PID2019-106455GB-C21 [ca_CA]
oaire.awardNumber: PROMETEO/2019/109 [ca_CA]
oaire.awardNumber: MIMACUHSPACE-CM-UC3M [ca_CA]


Files in this item

This item appears in the following collection(s)

Except where otherwise noted, this item's license is described as: © The Author(s) 2022