Efficient Parallel Skyline Query Processing for High-Dimensional Data
Author | Tang M. |
Author | Yu Y. |
Author | Aref W.G. |
Author | Malluhi Q.M. |
Author | Ouzzani M. |
Date available | 2020-03-03T06:19:04Z |
Publication date | 2018 |
Publication name | IEEE Transactions on Knowledge and Data Engineering |
Source | Scopus |
ISSN | 10414347 |
Abstract | Given a set of multidimensional data points, skyline queries retrieve those points that are not dominated by any other points in the set. Due to the ubiquitous use of skyline queries, such as in preference-based query answering and decision making, and the large amount of data that these queries have to deal with, enabling their scalable processing is of critical importance. However, there are several outstanding challenges that have not been well addressed. More specifically, in this paper, we are tackling the data straggler and data skew challenges introduced by distributed skyline query processing, as well as the ensuing high computation cost of merging skyline candidates. We thus introduce a new efficient three-phase approach for large scale processing of skyline queries. In the first preprocessing phase, the data is partitioned along the Z-order curve. We utilize a novel data partitioning approach that formulates data partitioning as an optimization problem to minimize the size of intermediate data. In the second phase, each compute node partitions the input data points into disjoint subsets, and then performs the skyline computation on each subset to produce skyline candidates in parallel. In the final phase, we build an index and employ an efficient algorithm to merge the generated skyline candidates. Extensive experiments demonstrate that the proposed skyline algorithm achieves more than one order of magnitude enhancement in performance compared to existing state-of-the-art approaches. |
Sponsor | The authors would like to thank reviewers for their insightful comments on the paper, as these comments led us to an improvement of the work. This publication was made possible by award NPRP4-1534-1-247 from the Qatar National Research Fund (a member of The Qatar Foundation) and by the National Science Foundation Grant IIS 1117766. |
Language | en |
Publisher | IEEE Computer Society |
Subject | high-dimensional data; parallel computing; query processing; skyline query |
Type | Article |
Pages | 1838 - 1851 |
Issue number | 10 |
Volume number | 30 |
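
The abstract's definition of a skyline and its three-phase outline can be illustrated with a small sketch. This is not the authors' algorithm: it assumes smaller values are preferred in every dimension, uses non-negative integer coordinates, cuts the Z-ordered data into equal-size chunks instead of solving the optimization problem the abstract mentions, and merges candidates with a naive quadratic scan rather than an index. All function names (`dominates`, `skyline`, `z_address`, `three_phase_skyline`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, ...]


def dominates(p: Point, q: Point) -> bool:
    """p dominates q: no worse in every dimension, strictly better in one.

    Assumption: smaller values are better in all dimensions.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points: Sequence[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def z_address(p: Point, bits: int = 8) -> int:
    """Z-order (Morton) address: interleave the low `bits` bits of each coordinate."""
    z = 0
    d = len(p)
    for bit in range(bits):
        for dim, coord in enumerate(p):
            z |= ((coord >> bit) & 1) << (bit * d + dim)
    return z


def three_phase_skyline(points: Sequence[Point],
                        num_partitions: int = 2,
                        bits: int = 8) -> List[Point]:
    # Phase 1 (simplified): order points along the Z-order curve and cut the
    # sequence into contiguous, roughly equal-size chunks.
    ordered = sorted(points, key=lambda p: z_address(p, bits))
    size = max(1, -(-len(ordered) // num_partitions))  # ceiling division
    chunks = [ordered[i:i + size] for i in range(0, len(ordered), size)]

    # Phase 2: compute a local skyline per chunk; a real system would run
    # these in parallel on separate compute nodes.
    candidates = [p for chunk in chunks for p in skyline(chunk)]

    # Phase 3 (simplified): merge candidates with one more skyline pass.
    return skyline(candidates)


if __name__ == "__main__":
    data = [(1, 4), (2, 2), (4, 1), (3, 3), (4, 4)]
    print(sorted(three_phase_skyline(data)))
```

The merge step is sound because every global skyline point is also in the local skyline of its own chunk, so the union of local skylines always contains the full answer; the final pass only has to discard candidates that a point from another chunk dominates.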
This record appears in the following collections
-
Computer Science and Engineering [2426 items]