Enabling big data analytics in the hybrid cloud using iterative MapReduce
Metadata
Title
Enabling big data analytics in the hybrid cloud using iterative MapReduce
Publication date
2015-12
Publisher
HAL-Inria
Bibliographic citation
CLEMENTE CASTELLÓ, Francisco José; BOGDAN, Nicolae; KATRINIS, Kostas; RAFIQUE, M. Mustafa; MAYO, Rafael; FERNÁNDEZ FERNÁNDEZ, Juan Carlos; LORETI, Daniela. Enabling big data analytics in the hybrid cloud using iterative MapReduce. UCC'15: The 8th IEEE/ACM International Conference on Utility and Cloud Computing, Dec 2015, Limassol, Cyprus. ⟨hal-01207186⟩
Document type
info:eu-repo/semantics/conferenceObject
Publisher version
https://hal.inria.fr/hal-01207186/en
Abstract
The cloud computing model has seen tremendous commercial success through its materialization via two prominent models to date, namely public and private cloud. Recently, a third model combining the former two service models as on-/off-premise resources has been receiving significant market traction: hybrid cloud. While state-of-the-art techniques that address workload performance prediction and efficient workload execution over hybrid cloud setups exist, work on how to address data-intensive workloads, including Big Data Analytics, in similar environments is nascent. This paper addresses this gap by taking on the challenge of bursting over hybrid clouds for the benefit of accelerating iterative MapReduce applications. We first specify the challenges associated with data locality and data movement in such setups. Subsequently, we propose a novel technique to address the locality issue, without requiring changes to the MapReduce framework or the underlying storage layer. In addition, we contribute a performance prediction methodology that combines modeling with micro-benchmarks to estimate completion time for iterative MapReduce applications, which enables users to estimate cost-to-solution before committing extra resources from public clouds. We show through experimentation in a dual-OpenStack hybrid cloud setup that our solutions manage to bring substantial improvement at predictable cost-control for two real-life iterative MapReduce applications: large-scale machine learning and text analysis.
Access rights
http://rightsstatements.org/vocab/CNE/1.0/
info:eu-repo/semantics/openAccess