Improving tweet timeline generation by predicting optimal retrieval depth
Author | Hasanain, Maram |
Author | Elsayed, Tamer |
Author | Magdy, Walid |
Date available | 2024-11-05T06:05:21Z |
Publication date | 2015 |
Publication name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Source | Scopus |
Identifier | http://dx.doi.org/10.1007/978-3-319-28940-3_11 |
ISBN | 3029743 |
Abstract | Tweet Timeline Generation (TTG) systems provide users with informative and concise retrospective summaries of topics as they developed over time. To produce a tweet timeline that summarizes a given topic, a TTG system typically retrieves a list of potentially relevant tweets from which the timeline is then generated. In such a design, the performance of the timeline generation step inevitably depends on that of the retrieval step. In this work, we aim to improve the performance of a given timeline generation system by controlling the depth of the ranked list of retrieved tweets considered in generating the timeline. We propose a supervised approach in which we predict the optimal depth of the ranked tweet list for a given topic by combining estimates of list quality computed at different depths. We conducted our experiments on a recent TREC TTG test collection of 243M tweets and 55 topics. We experimented with 14 different retrieval models (used to retrieve the initial ranked list of tweets) and 3 different TTG models (used to generate the final timeline). Our results demonstrate the effectiveness of the proposed approach: it improved TTG performance over a strong baseline in 76% of the cases, 31% of which were statistically significant, with not a single significant degradation observed. |
Sponsor | This work was made possible by NPRP grant #NPRP 6-1377-1-257 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors. |
Language | en |
Publisher | Springer Verlag |
Subject | Dynamic retrieval cutoff; Microblogs; Query difficulty; Query performance prediction; Regression; Tweet summarization |
Type | Conference |
Pages | 135-146 |
Volume number | 9460 |
This record appears in the following collections
-
Computer Science and Engineering [2402 items]