HBKU at TREC 2020: Conversational Multi-Stage Retrieval with Pseudo-Relevance Feedback
Abstract
Passage retrieval in a conversational context is extremely challenging due to limited data resources. Queries in a conversational information-seeking setting may contain omissions, implied context, and topic shifts. TREC CAsT promotes research in this field by aiming to create a reusable dataset for open-domain conversational information seeking (CIS). The track pursues this goal by defining a passage retrieval task in a multi-turn conversation setting, where understanding the conversation context and history is a key factor. The proposed solution addresses this challenge with a multi-stage retrieval pipeline inspired by last year's winning algorithm. The first stage is a historical query expansion step, taken from that algorithm, in which context is extracted from the earlier queries in the conversation. The second stage adds a pseudo-relevance feedback step in which the query is expanded using the top-k retrieved passages. Finally, a pre-trained BERT passage re-ranker is applied. The best-performing run achieved an NDCG@3 of 0.3127, and the solution performed better than the median of the other submitted runs.
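The three stages named in the abstract can be illustrated with a minimal, self-contained sketch. It is not the submitted system: every function name here is hypothetical, the history expansion simply appends all content terms from earlier turns (the referenced algorithm selects keywords more carefully), and a toy term-overlap score stands in for both the first-stage ranker and the pre-trained BERT re-ranker.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "of", "is", "are", "it", "its", "what", "how",
             "about", "and", "to", "in", "for", "do", "does", "tell", "me"}

def tokenize(text):
    """Lowercase, split on non-alphanumerics, drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower())
            if t not in STOPWORDS]

def expand_with_history(history, current):
    """Stage 1 (simplified): append unseen content terms from earlier turns."""
    terms = tokenize(current)
    for turn in history:
        for t in tokenize(turn):
            if t not in terms:
                terms.append(t)
    return terms

def overlap_score(query_terms, passage):
    """Toy lexical score: number of passage tokens that occur in the query."""
    q = set(query_terms)
    return sum(1 for t in tokenize(passage) if t in q)

def retrieve(query_terms, passages, k):
    """Return the indices of the k best passages; ties keep corpus order."""
    ranked = sorted(range(len(passages)),
                    key=lambda i: -overlap_score(query_terms, passages[i]))
    return ranked[:k]

def expand_with_prf(query_terms, passages, top_ids, n_terms):
    """Stage 2: add the most frequent new terms from the top-k passages."""
    counts = Counter(t for i in top_ids for t in tokenize(passages[i])
                     if t not in query_terms)
    return query_terms + [t for t, _ in counts.most_common(n_terms)]

def run_pipeline(history, current, passages, k=2, n_terms=2,
                 rerank=overlap_score):
    """Chain the stages; `rerank` is a placeholder for a BERT re-ranker."""
    q1 = expand_with_history(history, current)
    first_pass = retrieve(q1, passages, k)
    q2 = expand_with_prf(q1, passages, first_pass, n_terms)
    candidates = retrieve(q2, passages, len(passages))
    # Stage 3: re-score candidates against the history-expanded query.
    return sorted(candidates, key=lambda i: -rerank(q1, passages[i]))

if __name__ == "__main__":
    corpus = [
        "The stock market closed higher today.",
        "Lung cancer often causes a persistent cough.",
        "Throat cancer symptoms include a persistent cough and a sore throat.",
        "A persistent cough can follow a cold.",
    ]
    print(run_pipeline(["What is throat cancer?"],
                       "What are its symptoms?", corpus))
```

In this toy corpus the current utterance alone ("What are its symptoms?") carries no topic; the history expansion restores "throat" and "cancer", and the feedback step then adds terms shared by the top passages, which brings in a passage that the first pass scored at zero. In a real system the `rerank` argument would be replaced by a model-based scorer.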
DOI/handle
http://hdl.handle.net/10576/60890
Collections
- Computer Science & Engineering [2402 items]