Finding the Best of Both Worlds: Faster and More Robust Top-k Document Retrieval
Author | Khattab, Omar |
Author | Hammoud, Mohammad |
Author | Elsayed, Tamer |
Available date | 2024-11-05T06:05:20Z |
Publication Date | 2020 |
Publication Name | SIGIR 2020 - Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval |
Resource | Scopus |
Identifier | http://dx.doi.org/10.1145/3397271.3401076 |
Abstract | Many top-k document retrieval strategies have been proposed based on the WAND and MaxScore heuristics and yet, from recent work, it is surprisingly difficult to identify the "fastest" strategy. This becomes even more challenging when considering various retrieval criteria, like different ranking models and values of k. In this paper, we conduct the first extensive comparison between ten effective strategies, many of which were never compared before to our knowledge, examining their efficiency under five representative ranking models. Based on a careful analysis of the comparison, we propose LazyBM, a remarkably simple retrieval strategy that bridges the gap between the best performing WAND-based and MaxScore-based approaches. Empirically, LazyBM considerably outperforms all of the considered strategies across ranking models, values of k, and index configurations under both mean and tail query latency. |
Sponsor | We thank Yousuf Ahmad, Reem Suwaileh, and Mucahid Kutlu for valuable discussions and insights. This publication was made possible by NPRP grant# NPRP 7-1330-2-483 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors. |
Language | en |
Publisher | Association for Computing Machinery, Inc |
Subject | dynamic pruning; efficiency; query evaluation; web search |
Type | Conference |
Pagination | 1031-1040 |
Collection | Computer Science & Engineering |