Result: TeraHeap

Title:
TeraHeap
Source:
ACM Transactions on Programming Languages and Systems
Publication Year:
2024
Collection:
Australian National University: ANU Digital Collections
Document Type:
Academic journal; article in journal/newspaper
Language:
English
DOI:
10.1145/3700593
Rights:
Publisher Copyright: © 2024 Copyright held by the owner/author(s).
Accession Number:
edsbas.DBF30768
Database:
BASE

Further information

Big data analytics frameworks, such as Spark and Giraph, need to process and cache massive datasets that do not always fit on the managed heap. Therefore, frameworks temporarily move long-lived objects outside the heap (off-heap) on a fast storage device. However, this practice results in (1) high serialization/deserialization (S/D) cost and (2) high memory pressure when off-heap objects are moved back for processing. In this article, we propose TeraHeap, a system that eliminates S/D overhead and expensive GC scans for a large portion of objects in analytics frameworks. TeraHeap relies on three concepts: (1) It eliminates S/D by extending the managed runtime (JVM) to use a second high-capacity heap (H2) over a fast storage device. (2) It offers a simple hint-based interface, allowing analytics frameworks to leverage object knowledge to populate H2. (3) It reduces GC cost by fencing the collector from scanning H2 objects while maintaining the illusion of a single managed heap, ensuring memory safety. We implement TeraHeap in OpenJDK8 and OpenJDK17 and evaluate it with fifteen widely used applications in two real-world big data frameworks, Spark and Giraph. We find that for the same DRAM size, TeraHeap improves performance by up to 73% and 28% compared to native Spark and Giraph. Also, it can still provide better performance by consuming up to and less DRAM than native Spark and Giraph, respectively. TeraHeap can also be used for in-memory frameworks and applying it to the Neo4j Graph Data Science library improves its performance by up to 26%. Finally, it outperforms Panthera, a state-of-the-art garbage collector for hybrid DRAM-NVM memories, by up to 69%.

We thankfully acknowledge the support of the European Commission under the Horizon 2020 Framework Programme for Research and Innovation through the projects AERO (Grant agreement No. 10048318). Iacovos G. Kolokasis is also supported by the Meta Research PhD Fellowship and the State Scholarship Foundation of Cyprus.

Peer-reviewed
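
Illustration (not part of the published abstract): concepts (2) and (3) above, a hint that populates H2 and a collector that is fenced from scanning H2, can be pictured with a toy model. The sketch below is only an assumption-laden simulation in plain Java. Every name in it (`TwoHeapSketch`, `hintMoveToH2`, `addRef`, `collect`) is hypothetical and is not TeraHeap's interface, and the remembered set used for H2-to-H1 references is a stand-in chosen for this sketch, not a mechanism stated in the abstract.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of a two-heap runtime. All names are hypothetical, not TeraHeap's API. */
class TwoHeapSketch {
    /** A simulated heap object with outgoing references. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean hinted; // framework hint: this object group is long-lived
        boolean inH2;   // true once the object lives in the second heap
        Obj(String name) { this.name = name; }
    }

    final Set<Obj> h1 = new HashSet<>();                 // regular DRAM-backed heap
    final Set<Obj> h2 = new HashSet<>();                 // high-capacity heap over fast storage
    private final Set<Obj> remembered = new HashSet<>(); // H1 objects referenced from H2
    int scanned;                                         // objects visited by the last collection

    Obj allocate(String name) {
        Obj o = new Obj(name);
        h1.add(o);
        return o;
    }

    /** Hypothetical hint: the object group rooted at {@code root} may move to H2. */
    void hintMoveToH2(Obj root) { root.hinted = true; }

    /** Reference store with a write barrier that records H2-to-H1 pointers. */
    void addRef(Obj from, Obj to) {
        from.refs.add(to);
        if (from.inH2 && !to.inH2) remembered.add(to);
    }

    /** Mark H1 only, move hinted groups to H2, then sweep unreachable H1 objects. */
    void collect(List<Obj> roots) {
        scanned = 0;
        Set<Obj> live = new HashSet<>();
        for (Obj r : roots) mark(r, live);
        for (Obj r : remembered) mark(r, live); // keeps H1 objects referenced from H2 alive
        for (Obj o : live) if (o.hinted) moveToH2(o);
        remembered.removeIf(o -> o.inH2);
        h1.retainAll(live);
    }

    private void mark(Obj o, Set<Obj> live) {
        if (o.inH2 || !live.add(o)) return; // fence: never trace into H2
        scanned++;
        for (Obj r : o.refs) mark(r, live);
    }

    /** Moves the whole transitive closure, so a move never creates an H2-to-H1 pointer. */
    private void moveToH2(Obj o) {
        if (o.inH2) return;
        o.inH2 = true;
        h1.remove(o);
        h2.add(o);
        for (Obj r : o.refs) moveToH2(r);
    }

    public static void main(String[] args) {
        TwoHeapSketch heap = new TwoHeapSketch();
        Obj cache = heap.allocate("cachedPartition");
        Obj block = heap.allocate("block");
        Obj temp = heap.allocate("temp");
        heap.allocate("garbage");               // unreachable, swept by the first collection
        heap.addRef(cache, block);
        heap.hintMoveToH2(cache);
        List<Obj> roots = Arrays.asList(cache, temp);
        heap.collect(roots);
        System.out.println("first GC scanned " + heap.scanned + ", H2 size " + heap.h2.size());
        heap.collect(roots);
        System.out.println("second GC scanned " + heap.scanned);
    }
}
```

In this toy model the application keeps using `cache` as an ordinary reference after the move, which mirrors the "single managed heap" illusion, while later collections skip everything that already sits in H2. Nothing here involves real storage devices or a real JVM collector.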