Finding the Best of Both Worlds: Faster and More Robust Top-k Document Retrieval
Abstract
Many top-k document retrieval strategies have been proposed based on the WAND and MaxScore heuristics and yet, from recent work, it is surprisingly difficult to identify the "fastest" strategy. This becomes even more challenging when considering various retrieval criteria, like different ranking models and values of k. In this paper, we conduct the first extensive comparison between ten effective strategies, many of which were never compared before to our knowledge, examining their efficiency under five representative ranking models. Based on a careful analysis of the comparison, we propose LazyBM, a remarkably simple retrieval strategy that bridges the gap between the best performing WAND-based and MaxScore-based approaches. Empirically, LazyBM considerably outperforms all of the considered strategies across ranking models, values of k, and index configurations under both mean and tail query latency.
Collections
- Computer Science & Engineering [2402 items ]