Result: Tuplex ; robust, efficient analytics when Python rules
Title:
Tuplex ; robust, efficient analytics when Python rules
Authors:
Source:
Proceedings of the VLDB Endowment ; volume 12, issue 12, page 1958-1961 ; ISSN 2150-8097
Publisher Information:
Association for Computing Machinery (ACM)
Publication Year:
2019
Document Type:
Academic journal
article in journal/newspaper
Language:
English
DOI:
10.14778/3352063.3352109
Availability:
Accession Number:
edsbas.E2FF36DE
Database:
BASE
Further information
Spark became the de facto industry standard as an execution engine for data preparation, cleaning, distributed machine learning, streaming, and warehousing over raw data. However, with the success of Python the landscape is shifting again: there is strong demand for tools that integrate better with the Python landscape and avoid the impedance mismatch that Spark exhibits. In this paper, we demonstrate Tuplex (short for tuples and exceptions), a Python-native data preparation framework that allows users to develop and deploy pipelines faster and more robustly while providing bare-metal execution times through code compilation whenever possible.
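
As a rough illustration of the "tuples and exceptions" idea named in the abstract, the following is a minimal pure-Python sketch and not Tuplex's actual API or implementation. It assumes a pipeline is a list of per-row functions, and that rows which raise are set aside together with their exception rather than aborting the whole run. The names `run_pipeline`, `add_ratio`, `rows`, and `stages` are hypothetical and chosen only for this example.

```python
from typing import Any, Callable, Iterable, List, Tuple

Stage = Callable[[Any], Any]


def run_pipeline(
    rows: Iterable[tuple], stages: List[Stage]
) -> Tuple[List[Any], List[Tuple[tuple, Exception]]]:
    """Apply every stage to every row, separating successes from failures.

    Hypothetical helper: rows that raise in any stage are recorded with
    their exception instead of stopping the pipeline.
    """
    results: List[Any] = []
    failures: List[Tuple[tuple, Exception]] = []
    for row in rows:
        value: Any = row
        try:
            for stage in stages:
                value = stage(value)
        except Exception as exc:  # keep the offending input row and its error
            failures.append((row, exc))
        else:
            results.append(value)
    return results, failures


def add_ratio(row: tuple) -> tuple:
    """Append numerator / denominator to a (numerator, denominator) row."""
    return (row[0], row[1], row[0] / row[1])


# Illustrative input: the second row divides by zero.
rows = [(10, 2), (7, 0), (9, 3)]
stages = [add_ratio]

results, failures = run_pipeline(rows, stages)
print(results)
print([(row, type(exc).__name__) for row, exc in failures])
```

In this sketch the failing row is kept alongside its exception, so a user could inspect it or apply a corrective function afterwards. Whether and how Tuplex itself does this is not stated in the abstract above.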