LocationSpark: A distributed in-memory data management system for big spatial data
Abstract
We present LocationSpark, a spatial data processing system built on top of Apache Spark, a widely used distributed data processing system. LocationSpark offers a rich set of spatial query operators, e.g., range search, kNN, spatio-textual operations, spatial join, and kNN join. To achieve high performance, LocationSpark employs various spatial indexes for in-memory data, and guarantees that immutable spatial indexes have low overhead with fault tolerance. In addition, we build two new layers over Spark, namely a query scheduler and a query executor. The query scheduler is responsible for mitigating skew in spatial queries, while the query executor selects the best plan based on the indexes and the nature of the spatial queries. Furthermore, to avoid unnecessary network communication overhead when processing overlapped spatial data, we embed an efficient spatial Bloom filter into LocationSpark's indexes. Finally, LocationSpark tracks frequently accessed spatial data and dynamically pushes less frequently accessed data to disk. We evaluate our system on real workloads and demonstrate that it achieves an order-of-magnitude performance gain over a baseline framework.
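To illustrate how a spatial Bloom filter can avoid unnecessary network communication, the following is a minimal sketch and not LocationSpark's actual filter, whose design the abstract does not describe. It assumes a simple scheme in which each partition summarizes its points as a Bloom filter over fixed-size grid cells, and a query rectangle is forwarded only to partitions whose filter cannot rule it out. The names `GridBloomFilter`, `partitions_to_query`, and all parameters are hypothetical.

```python
import hashlib


class GridBloomFilter:
    """Bloom filter over grid cells (illustrative assumption, not LocationSpark's design)."""

    def __init__(self, cell_size=1.0, num_bits=1024, num_hashes=3):
        self.cell_size = cell_size
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _cell(self, x, y):
        # Map a coordinate to the integer grid cell that contains it.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def _positions(self, cell):
        # Derive num_hashes bit positions from salted SHA-256 digests of the cell id.
        for i in range(self.num_hashes):
            key = f"{i}:{cell[0]}:{cell[1]}".encode()
            digest = hashlib.sha256(key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add_point(self, x, y):
        # Record the cell of a point stored in this partition.
        for pos in self._positions(self._cell(x, y)):
            self.bits[pos] = 1

    def _might_contain_cell(self, cell):
        return all(self.bits[pos] for pos in self._positions(cell))

    def might_intersect(self, xmin, ymin, xmax, ymax):
        # Test every grid cell overlapped by the query rectangle.
        cx0, cy0 = self._cell(xmin, ymin)
        cx1, cy1 = self._cell(xmax, ymax)
        return any(
            self._might_contain_cell((cx, cy))
            for cx in range(cx0, cx1 + 1)
            for cy in range(cy0, cy1 + 1)
        )


def partitions_to_query(filters, rect):
    """Return ids of partitions whose filter cannot rule out the query rectangle."""
    return [pid for pid, f in filters.items() if f.might_intersect(*rect)]
```

In this sketch a filter may report a false positive, which costs one wasted request, but it never reports a false negative, so no partition holding matching data is skipped. A partition with no recorded points is always pruned.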