LocationSpark: A distributed in-memory data management system for big spatial data
Author | Tang, Mingjie |
Author | Yu, Yongyang |
Author | Malluhi, Qutaibah M. |
Author | Ouzzani, Mourad |
Author | Aref, Walid G. |
Available date | 2024-07-17T07:14:38Z |
Publication Date | 2015 |
Publication Name | Proceedings of the VLDB Endowment |
Resource | Scopus |
Identifier | http://dx.doi.org/10.14778/3007263.3007310 |
ISSN | 2150-8097 |
Abstract | We present LocationSpark, a spatial data processing system built on top of Apache Spark, a widely used distributed data processing system. LocationSpark offers a rich set of spatial query operators, e.g., range search, kNN, spatio-textual operation, spatial-join, and kNN-join. To achieve high performance, LocationSpark employs various spatial indexes for in-memory data, and guarantees that immutable spatial indexes have low overhead with fault tolerance. In addition, we build two new layers over Spark, namely a query scheduler and a query executor. The query scheduler is responsible for mitigating skew in spatial queries, while the query executor selects the best plan based on the indexes and the nature of the spatial queries. Furthermore, to avoid unnecessary network communication overhead when processing overlapped spatial data, we embed an efficient spatial Bloom filter into LocationSpark's indexes. Finally, LocationSpark tracks frequently accessed spatial data, and dynamically flushes less frequently accessed data to disk. We evaluate our system on real workloads and demonstrate that it achieves an order of magnitude performance gain over a baseline framework. |
Language | en |
Publisher | VLDB Endowment |
Subject | Data handling; Fault tolerance; Location; Scheduling; Spatial distribution; Data management system; Distributed data processing; Network communication overhead; Performance gain; Real workloads; Spatial data processing; Spatial indexes; Spatial queries; Information management |
Type | Conference Paper |
Pagination | 1565-1568 |
Issue Number | 13 |
Volume Number | 9 |
This item appears in the following Collection(s)
- Computer Science & Engineering [2402 items]