LocationSpark: A distributed in-memory data management system for big spatial data
Author | Tang, Mingjie |
Author | Yu, Yongyang |
Author | Malluhi, Qutaibah M. |
Author | Ouzzani, Mourad |
Author | Aref, Walid G. |
Date available | 2024-07-17T07:14:38Z |
Date published | 2015 |
Publication name | Proceedings of the VLDB Endowment |
Source | Scopus |
Identifier | http://dx.doi.org/10.14778/3007263.3007310 |
ISSN | 21508097 |
Abstract | We present LocationSpark, a spatial data processing system built on top of Apache Spark, a widely used distributed data processing system. LocationSpark offers a rich set of spatial query operators, e.g., range search, kNN, spatio-textual operation, spatial-join, and kNN-join. To achieve high performance, LocationSpark employs various spatial indexes for in-memory data, and guarantees that immutable spatial indexes have low overhead with fault tolerance. In addition, we build two new layers over Spark, namely a query scheduler and a query executor. The query scheduler is responsible for mitigating skew in spatial queries, while the query executor selects the best plan based on the indexes and the nature of the spatial queries. Furthermore, to avoid unnecessary network communication overhead when processing overlapped spatial data, we embed an efficient spatial Bloom filter into LocationSpark's indexes. Finally, LocationSpark tracks frequently accessed spatial data, and dynamically flushes less frequently accessed data to disk. We evaluate our system on real workloads and demonstrate that it achieves an order of magnitude performance gain over a baseline framework. |
Language | en |
Publisher | VLDB Endowment |
Subject | Data handling; Fault tolerance; Location; Scheduling; Spatial distribution; Data management system; Distributed data processing; Network communication overhead; Performance Gain; Real workloads; Spatial data processing; Spatial indexes; Spatial queries; Information management |
Type | Conference |
Pages | 1565-1568 |
Issue number | 13 |
Volume number | 9 |
This record appears in the following collections
- Computer Science and Engineering [2402 items]