Result: Synopses for Summarizing Spatial Data Streams
Further Information
In today’s data-driven landscape, geospatial streams are pivotal in diverse fields, ranging from sociology to network engineering and meteorology. A key challenge in utilizing these streams is to efficiently compute aggregates over ad-hoc spatial ranges, possibly with additional predicates on the stream items. For each application scenario, different aggregates become relevant, such as the number of distinct items, the frequency of each item, or even the variance of the frequencies of the items that fall within a spatial range. Storing the entire stream to compute these aggregates is impractical in scenarios that involve fast-paced and unbounded streams, due to prohibitive storage costs and query execution delays. To address this, we propose two sketches, SpatialSketch and DynSketch, that support range queries over different types of aggregates. Both sketches require little space, and they can summarize fast-paced streams and estimate the aggregates with accuracy guarantees. Importantly, they support new, diverse functionalities in a plug-and-play manner, without requiring novel theoretical analysis. In addition to the theoretical contribution, we evaluate SpatialSketch and DynSketch experimentally. Our experiments demonstrate that the two sketches outperform the state of the art, and that they can be used to address novel functionalities for which no small-space solutions exist to date.
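To make the query semantics concrete, the following is a minimal sketch of the exact baseline that the abstract describes as impractical: it keeps the whole stream in memory and scans it for every query. It is not SpatialSketch or DynSketch, whose internals are not described here. The function name `range_aggregates`, the `(x, y, item)` tuple layout, and the axis-aligned rectangular range are assumptions made purely for illustration.

```python
from collections import Counter
from statistics import pvariance


def range_aggregates(stream, x_lo, x_hi, y_lo, y_hi):
    """Exact aggregates over an axis-aligned range (hypothetical helper).

    `stream` is assumed to be an iterable of (x, y, item) tuples.
    Every query costs a full scan, which is what a sketch avoids.
    """
    # Frequency of each item whose location falls inside the range.
    counts = Counter(
        item
        for (x, y, item) in stream
        if x_lo <= x <= x_hi and y_lo <= y <= y_hi
    )
    freqs = list(counts.values())
    return {
        "distinct": len(counts),            # number of distinct items
        "frequencies": dict(counts),        # frequency of each item
        # Population variance of the item frequencies (0.0 if range is empty).
        "freq_variance": pvariance(freqs) if freqs else 0.0,
    }


# Toy stream: the item at (9, 9) lies outside the queried range.
stream = [(1, 1, "a"), (2, 2, "b"), (2, 3, "a"), (9, 9, "c"), (3, 1, "a")]
result = range_aggregates(stream, 0, 5, 0, 5)
```

Memory in this baseline grows linearly with the stream and each query touches every stored item. That is the cost the proposed sketches aim to avoid, by returning estimates with accuracy guarantees from a small summary.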