BACKGROUND LINKING OF NEWS ARTICLES
Abstract
It is now rare for a single news article to contain all the information about a given subject or event. This dissertation addresses the news background linking problem, which aims to find news resources that can be linked to a query news article to help readers understand its background and context. We propose multiple approaches that address both the effectiveness and the efficiency aspects of this problem. We first conduct a qualitative analysis of a sample of query articles and their background links to understand the notion of background relevance. Based on the insights drawn from this analysis, we propose two approaches for reranking a candidate set of background links, both of which integrate the lexical and semantic matching signals between the query article and its candidate links. We experiment with both Transformer-based models and large language models, in a zero-shot setting, to semantically link query articles to their background links. We conduct our experiments on the Washington Post news dataset, which was released specifically for news background linking, and show that our approaches achieve new state-of-the-art performance on this problem. We further propose a query reduction approach that speeds up the retrieval of candidate links by up to 13.3x by reducing the query article to representative search queries, which are then used to retrieve the required background links in an ad-hoc setting. Finally, by analyzing the performance of our proposed approaches on different query articles, our work opens future research directions in news background linking, paving the way for enhancing the overall quality and depth of news reporting.
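To make the idea of query reduction concrete, the following is a minimal sketch that reduces an article to its k highest-weighted terms. It is an illustration only, not the method proposed in the dissertation: the TF-IDF weighting, the smoothing used in the IDF formula, and the names `tokenize` and `reduce_query` are all assumptions introduced here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def reduce_query(article, corpus, k=5):
    """Return the k article terms with the highest TF-IDF weight.

    Hypothetical helper: `corpus` is a list of background documents
    used only to estimate document frequencies.
    """
    doc_tokens = [set(tokenize(doc)) for doc in corpus]
    n_docs = len(corpus)
    tf = Counter(tokenize(article))
    scores = {}
    for term, freq in tf.items():
        # Number of corpus documents that contain the term.
        df = sum(1 for tokens in doc_tokens if term in tokens)
        # Smoothed IDF (an assumption), so unseen terms do not divide by zero.
        idf = math.log((n_docs + 1) / (df + 1)) + 1.0
        scores[term] = freq * idf
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]
```

The returned terms could then be joined into a short keyword query and issued to any ad-hoc retrieval system in place of the full article text, which is where a speedup of the kind described above would be expected to come from.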
DOI/handle
http://hdl.handle.net/10576/62817
Collections
- Computing [103 items]