IDRISI-RA: The First Arabic Location Mention Recognition Dataset of Disaster Tweets
Author | Suwaileh, Reem |
Author | Imran, Muhammad |
Author | Elsayed, Tamer |
Available date | 2024-03-11T06:03:07Z |
Publication Date | 2023 |
Publication Name | Proceedings of the Annual Meeting of the Association for Computational Linguistics |
Resource | Scopus |
ISSN | 0736587X |
Abstract | Extracting geolocation information from social media data enables effective disaster management, as it helps response authorities, for example, in locating incidents for planning rescue activities and affected people for evacuation. Nevertheless, geolocation extraction is greatly understudied for low-resource languages such as Arabic. To fill this gap, we introduce IDRISI-RA, the first publicly available Arabic Location Mention Recognition (LMR) dataset that provides human- and automatically-labeled versions on the order of thousands and millions of tweets, respectively. It contains both location mentions and their types (e.g., district, city). Our extensive analysis shows the decent geographical, domain, location granularity, temporal, and dialectal coverage of IDRISI-RA. Furthermore, we establish baselines using standard Arabic NER models and build two simple, yet effective, LMR models. Our rigorous experiments confirm the need for developing specific models for Arabic LMR in the disaster domain. Moreover, experiments show the promising domain and geographical generalizability of IDRISI-RA under zero-shot learning. |
Sponsor | This work was made possible by the Graduate Sponsorship Research Award (GSRA) #GSRA5-1-0527-18082 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors. We would like to thank the in-house annotators for their valuable help including Noura Abdullah, Aisha Suwaileh, Rasha Hamdoon, Najlaa Alfuhaida, Hana Shamayleh, Lamiaa Basyoni, Sara Alrasbi, and Nada Abo Eita. |
Language | en |
Publisher | Association for Computational Linguistics (ACL) |
Subject | Computational linguistics; Disaster prevention; Disasters; Zero-shot learning; Disaster management; Effective location; Experiment confirm; Geolocations; Low resource languages; Recognition models; Rescue activities; Simple++; Social media datum; Standard arabics; Location |
Type | Conference Paper |
Pagination | 16298-16317 |
Volume Number | 1 |
Collection | Computer Science & Engineering |