Simple item record

Author: Suwaileh, Reem
Author: Elsayed, Tamer
Author: Imran, Muhammad
Date available: 2024-03-11T06:03:07Z
Publication date: 2023
Publication name: Information Processing and Management
Source: Scopus
ISSN: 0306-4573
URI: http://dx.doi.org/10.1016/j.ipm.2023.103340
URI: http://hdl.handle.net/10576/52846
Abstract: While utilizing Twitter data for crisis management is of interest to different response authorities, a critical challenge that hinders the utilization of such data is the scarcity of automated tools that extract geolocation information. The limited focus on Location Mention Recognition (LMR) in tweets, specifically, is attributed to the lack of a standard dataset that enables research in LMR. To bridge this gap, we present IDRISI-RE, a large-scale human-labeled LMR dataset comprising around 20.5k tweets. The annotated location mentions within the tweets are also assigned location types (e.g., country, city, street, etc.). IDRISI-RE contains tweets from 19 disaster events of diverse types (e.g., flood and earthquake) covering a wide geographical area of 22 English-speaking countries. Additionally, IDRISI-RE contains about 56.6k automatically-labeled tweets that we offer as a silver dataset. To highlight the superiority of IDRISI-RE over past efforts, we present rigorous analyses on reliability, consistency, coverage, diversity, and generalizability. Furthermore, we benchmark IDRISI-RE using a representative set of LMR models to provide the community with baselines for future work. Our extensive empirical analysis shows the promising generalizability of IDRISI-RE compared to existing datasets. We show that models trained on IDRISI-RE better tackle domain shifts and are less susceptible to change in geographical areas.
Sponsor: This work was made possible by the Graduate Sponsorship Research Award (GSRA) #GSRA5-1-0527-18082 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.
Language: en
Publisher: Elsevier
Subject: Dataset
Disaster management
Domain generalizability
Geographical generalizability
Geolocation
Location mention recognition
Twitter
Title: IDRISI-RE: A generalizable dataset with benchmarks for location mention recognition on disaster tweets
Type: Article
Issue number: 3
Volume number: 60

