LocationSpark: A distributed in-memory data management system for big spatial data
Abstract
We present LocationSpark, a spatial data processing system built on top of Apache Spark, a widely used distributed data processing system. LocationSpark offers a rich set of spatial query operators, e.g., range search, kNN, spatio-textual operations, spatial join, and kNN join. To achieve high performance, LocationSpark employs various spatial indexes for in-memory data, and guarantees that its immutable spatial indexes provide fault tolerance with low overhead. In addition, we build two new layers over Spark, namely a query scheduler and a query executor. The query scheduler is responsible for mitigating skew in spatial queries, while the query executor selects the best plan based on the indexes and the nature of the spatial queries. Furthermore, to avoid unnecessary network communication overhead when processing overlapping spatial data, we embed an efficient spatial Bloom filter into LocationSpark's indexes. Finally, LocationSpark tracks frequently accessed spatial data and dynamically pushes less frequently accessed data to disk. We evaluate our system on real workloads and demonstrate that it achieves an order of magnitude performance gain over a baseline framework.
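The abstract does not describe how the spatial Bloom filter is built, so the following is only a minimal Python sketch of the general idea, under stated assumptions: each partition keeps a Bloom filter over the ids of the grid cells that contain its points, and a range query is sent only to partitions whose filter reports a possible overlap. The class `GridBloomFilter`, the helper `partitions_to_query`, and the parameters `cell_size`, `num_bits`, and `num_hashes` are hypothetical names chosen for illustration; they are not LocationSpark's API.

```python
import hashlib


class GridBloomFilter:
    """Bloom filter over grid-cell ids (illustrative assumption, not LocationSpark's structure)."""

    def __init__(self, cell_size, num_bits=1024, num_hashes=3):
        self.cell_size = cell_size
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _cell(self, x, y):
        # Map a coordinate to the integer id of the grid cell containing it.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def _positions(self, cell):
        # Derive num_hashes bit positions for a cell by seeding SHA-256.
        for seed in range(self.num_hashes):
            key = f"{seed}:{cell[0]}:{cell[1]}".encode()
            digest = hashlib.sha256(key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add_point(self, x, y):
        for pos in self._positions(self._cell(x, y)):
            self.bits[pos] = 1

    def may_intersect(self, xmin, ymin, xmax, ymax):
        # True if any grid cell touched by the rectangle may hold data.
        # False positives are possible; false negatives are not.
        cx0, cy0 = self._cell(xmin, ymin)
        cx1, cy1 = self._cell(xmax, ymax)
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                if all(self.bits[p] for p in self._positions((cx, cy))):
                    return True
        return False


def partitions_to_query(filters, rect):
    # Keep only the partitions whose filter cannot rule out the query rectangle.
    return [pid for pid, f in filters.items() if f.may_intersect(*rect)]


if __name__ == "__main__":
    populated = GridBloomFilter(cell_size=1.0)
    populated.add_point(1.2, 1.7)
    empty = GridBloomFilter(cell_size=1.0)
    filters = {"p0": populated, "p1": empty}
    # The empty partition is skipped, so no query is shipped to it.
    print(partitions_to_query(filters, (0.5, 0.5, 1.5, 1.9)))
```

In this sketch a partition is skipped only when its filter proves that none of the cells under the query rectangle holds data, so a skipped partition can never hide a result; a false positive costs one unnecessary remote lookup but does not affect correctness.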
Collections
- Computer Science & Engineering [2426 items]
Related items
Showing items related by title, author, creator and subject.
- TOWARDS AN UNDERSTANDING OF SPATIALITY OF INDETERMINATE SPACES: DOHA MIGRANT LABOURERS AS SPATIAL ACTOR
  Khalfani, Fatma Abdullah (2016, Master Thesis) This study investigated publicly accessible spaces where the city’s normal forces of control have not shaped their perception, usage and occupancy. The so-called indeterminate spaces were examined in traditional Doha ...
- Pedestrian flow characteristics through different angled bends: Exploring the spatial variation of velocity
  Hannun, Jamal; Dias, Charitha; Taha, Alaa Hasan; Almutairi, Abdulaziz; Alhajyaseen, Wael; Sarvi, Majid; Al-Bosta, Salim (Public Library of Science (PLOS), 2022, Article) Common geometrical layouts could potentially be bottlenecks, particularly during emergency and high density situations. When pedestrians are interacting with such complex geometrical settings, the congestion effect might ...
- Spatial Associations between COVID-19 Incidence Rates and Work Sectors: Geospatial Modeling of Infection Patterns among Migrants in Oman
  Mansour, Shawky; Abulibdeh, Ammar; Alahmadi, Mohammed; Al-Said, Adham; Al-Said, Alkhattab; Watmough, Gary; Atkinson, Peter M. (Routledge, 2022, Article) Migrants are among the groups most vulnerable to infection with viruses due to the social and economic conditions in which they live. Therefore, spatial modeling of virus transmission among migrants is important for ...