Light-weight, Conservative, yet Effective: Scalable Real-time Tweet Summarization
Author | Suwaileh, Reem |
Author | Hasanain, Maram |
Author | Elsayed, Tamer |
Available date | 2024-03-11T06:03:08Z |
Publication Date | 2016 |
Publication Name | 25th Text REtrieval Conference, TREC 2016 - Proceedings |
Resource | Scopus |
Abstract | Microblogging platforms, and Twitter specifically, have become a major resource for exploring diverse topics of interest, ranging from the world's breaking news to sports, science, religion, and even personal daily updates. Nevertheless, a user cannot easily follow her topics of interest on her own while coping with the challenges that stem from the nature of the Twitter timeline. Among those challenges is the huge volume of posted tweets that are uninteresting, noisy, or redundant. Additionally, manual techniques are not feasible for summarizing tweets on topics that are discussed on the stream and develop rapidly. In this paper, we tackle the problem of summarizing a stream of tweets given a pre-defined set of topics, in the context of Qatar University's participation in the TREC-2016 Real-Time Summarization (RTS) track. We participated in both the push notification and e-mail digest scenarios. Given a set of users' interest profiles, our RTS system for the push notification scenario adopts a light-weight and conservative filtering strategy that monitors the continuous stream of tweets through a pipeline of multiple stages, while scaling to a large number of interest profiles. For the e-mail digest scenario, we adopted a similar but even simpler approach: at the end of each day, a list of potentially relevant tweets is retrieved by issuing a query of topic title terms against an index of all tweets streamed that day. Our push notification runs exhibited the best performance among all automatic runs submitted to the push notification task this year. Moreover, our best-performing e-mail digest run was the second-best among all automatic runs submitted to the e-mail digest task this year. However, the evaluation results show that performance is still far from the level needed for adoption in practice. |
Sponsor | This work was made possible by NPRP grant# NPRP 6-1377-1-257 and NPRP grant# NPRP 7-1313-1-245 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors. |
Language | en |
Publisher | National Institute of Standards and Technology (NIST) |
Subject | Electronic mail; Information retrieval; Pipeline processing systems; User profile; Conservative filtering; Light weight; Manual techniques; Micro-blogging platforms; Performance; Qatar university; Real-time; Sport science; Summarization systems; User interest profile; Social networking (online) |
Type | Conference Paper |