
Author: Al-Rasbi, Sara
Author: Elsayed, Tamer
Date available: 2024-11-05T06:05:20Z
Publication date: 2020
Publication name: 2020 IEEE International Conference on Informatics, IoT, and Enabling Technologies, ICIoT 2020
Source: Scopus
Identifier: http://dx.doi.org/10.1109/ICIoT48696.2020.9089558
URI: http://hdl.handle.net/10576/60887
Abstract: Search engines have to deal with a huge amount of data in scalable and efficient ways to produce effective search results. In this paper, we address the problem of building an efficient and scalable experimental search engine over Spark, an in-memory distributed big data processing framework. The proposed system, SparkIR, can serve as a research framework for conducting information retrieval (IR) experiments. SparkIR supports a document-based partitioning scheme for indexing and document-at-a-time (DAAT) processing for query evaluation. Moreover, it offers static pruning (using champion lists) to improve retrieval efficiency. We evaluated the performance of SparkIR using the ClueWeb12-B13 collection, which contains about 50M English Web pages. Experiments over different subsets of the collection showed that SparkIR exhibits reasonable efficiency and scalability overall for both indexing and retrieval.
Language: en
Publisher: Institute of Electrical and Electronics Engineers Inc.
Subject: Big Data
Distributed Systems
Efficiency
Information Retrieval
Scalability
Spark
SparkIR
Title: Can We Build a Search Engine over Spark?
Type: Conference Paper
Pages: 345-350
dc.accessType: Full Text
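
The abstract names three techniques: document-based partitioning, document-at-a-time (DAAT) query evaluation, and static pruning with champion lists. The sketch below illustrates those ideas in plain Python on a toy in-memory collection. It is not SparkIR's implementation and does not use Spark. The function names, the integer document IDs, the modulo sharding rule, and the raw term-frequency scoring are all assumptions made for illustration.

```python
import heapq
from collections import defaultdict


def build_index(docs):
    """Map each term to a postings list of (doc_id, tf), sorted by doc_id."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        counts = defaultdict(int)
        for term in docs[doc_id].lower().split():
            counts[term] += 1
        for term, tf in counts.items():
            index[term].append((doc_id, tf))
    return dict(index)


def champion_lists(index, r):
    """Static pruning: keep the r highest-tf postings per term.

    The surviving postings are re-sorted by doc_id so DAAT can still
    traverse them in document order.
    """
    return {
        term: sorted(heapq.nlargest(r, postings, key=lambda p: p[1]))
        for term, postings in index.items()
    }


def daat_search(index, query, k):
    """DAAT: score one document completely before moving to the next.

    Assumed scoring: sum of term frequencies over matching query terms.
    Ties are broken in favor of the smaller doc_id.
    """
    lists = [index[t] for t in query.lower().split() if t in index]
    cursors = [0] * len(lists)
    heap = []  # min-heap of (score, -doc_id) holding the current top k
    while True:
        heads = [lst[c][0] for lst, c in zip(lists, cursors) if c < len(lst)]
        if not heads:
            break
        doc_id = min(heads)  # next document in doc_id order
        score = 0
        for i, lst in enumerate(lists):
            c = cursors[i]
            if c < len(lst) and lst[c][0] == doc_id:
                score += lst[c][1]
                cursors[i] += 1
        item = (score, -doc_id)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            heapq.heapreplace(heap, item)
    return [(-neg_id, score) for score, neg_id in sorted(heap, reverse=True)]


def partitioned_search(docs, query, k, num_partitions, r):
    """Document-based partitioning: each shard indexes its own documents,
    answers the query locally, and the local top-k lists are merged."""
    shards = [{} for _ in range(num_partitions)]
    for doc_id, text in docs.items():
        shards[doc_id % num_partitions][doc_id] = text  # assumed sharding rule
    hits = []
    for shard in shards:
        local_index = champion_lists(build_index(shard), r)
        hits.extend(daat_search(local_index, query, k))
    return sorted(hits, key=lambda h: (-h[1], h[0]))[:k]
```

In this sketch, a small `r` shortens every postings list, which reduces the work DAAT does per query at the cost of possibly missing documents that were pruned away. That is the usual efficiency and effectiveness trade-off of static pruning.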


