Title:
GUESS: monitorinG join qUery Execution in Serverless and Serverful spark
Contributors:
A Symbolic and Human-centric view of dAta MANagement (SHAMAN), GESTION DES DONNÉES ET DE LA CONNAISSANCE (IRISA-D7), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT), University of Oklahoma (OU)
Source:
International Conference on Database Systems for Advanced Applications (DASFAA), Jul 2024, Gifu, Japan
Publisher Information:
CCSD, 2024.
Publication Year:
2024
Collection:
collection:UNIV-RENNES1
collection:CNRS
collection:UNIV-UBS
collection:INSA-RENNES
collection:ENSSAT
collection:IRISA
collection:IRISA_SET
collection:CENTRALESUPELEC
collection:UR1-HAL
collection:UR1-MATH-STIC
collection:UR1-UFR-ISTIC
collection:TEST-UR-CSS
collection:UNIV-RENNES
collection:UR1-MATH-NUM
collection:DDRS-TEST-CJ
Original Identifier:
HAL: hal-04544429
Document Type:
Conference conferenceObject; Conference papers
Language:
English
Rights:
info:eu-repo/semantics/OpenAccess
Accession Number:
edshal.hal.04544429v1
Database:
HAL

Further Information

This paper proposes a monitoring system called GUESS to compare the performance and energy consumption of join query processing on Spark in Serverless and Serverful environments. The system collects metrics on resource utilization, query execution times, and power usage through Prometheus, Grafana, Spark History Server, and OpenManage Enterprise Power Manager. These metrics are visualized through an intuitive web dashboard to enable easy comparison between Serverless and Serverful Spark workloads. Experimental results using the TPC-H benchmark show that the Serverless environment consumes less energy than the Serverful environment due to on-demand resource allocation. However, the Serverful environment exhibits better query performance, especially for workloads with known resource requirements. GUESS provides insights into optimizing resource efficiency and query performance when deploying Spark analytic workloads.