When a disaster happens, we are ready: Location mention recognition from crisis tweets
Abstract
Geolocation information is important for humanitarian organizations to gain situational awareness and deliver timely aid during disasters. To address the problem of recognizing locations within social media posts during disasters, i.e., Location Mention Recognition (LMR), past studies mainly proposed techniques that assume abundant training data is available at the disaster onset. In this work, we adopt the more realistic assumption that either no examples (the zero-shot setting) or as few as a few hundred examples (the few-shot setting) from the just-occurred event are available for training. Specifically, we examine the effect of training a BERT-based LMR model on past events under different settings, datasets, languages, and degrees of geo-proximity. Extensive empirical analysis provides several insights for building an effective LMR model during disasters, including: (i) Twitter crisis-related and location-specific data from geographically nearby disaster events is more useful than all other combinations of training datasets in the zero-shot monolingual setting; (ii) using as few as 263–356 training tweets from the target language (i.e., the few-shot setting) remarkably boosts performance in the cross-lingual and multilingual settings; and (iii) labeling about 500 tweets from the target event leads to acceptable LMR performance, with an F1 above 0.7, in the monolingual settings. Finally, we conduct an extensive error analysis and highlight issues related to the quality of the available datasets and weaknesses of the current model.
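To make the F1 threshold mentioned above concrete, the following is a minimal sketch of how a span-level F1 score for location mentions could be computed. It assumes BIO-tagged token sequences with a single `LOC` label and exact-match scoring of spans; the abstract does not state the tagging scheme or the matching criterion, so both are assumptions, and the function names `bio_to_spans` and `span_f1` are hypothetical.

```python
from typing import List, Set, Tuple

# A location mention as (start_token, end_token_exclusive).
Span = Tuple[int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert a BIO tag sequence (B-LOC / I-LOC / O) into a set of spans.

    Assumption: a single entity type, LOC. An I-LOC that does not follow
    a B-LOC is tolerated and treated as the start of a new span.
    """
    spans: Set[Span] = set()
    start = None
    for i, tag in enumerate(tags):
        if tag == "B-LOC":
            if start is not None:
                spans.add((start, i))  # close the previous span
            start = i
        elif tag == "I-LOC":
            if start is None:
                start = i
        else:  # "O" or any other tag ends the current span
            if start is not None:
                spans.add((start, i))
                start = None
    if start is not None:
        spans.add((start, len(tags)))
    return spans


def span_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged exact-match F1 over all tweets (assumed metric)."""
    tp = fp = fn = 0
    for gold_tags, pred_tags in zip(gold, pred):
        gold_spans = bio_to_spans(gold_tags)
        pred_spans = bio_to_spans(pred_tags)
        tp += len(gold_spans & pred_spans)
        fp += len(pred_spans - gold_spans)
        fn += len(gold_spans - pred_spans)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up tags, not data from the paper).
gold = [["O", "B-LOC", "I-LOC", "O"], ["B-LOC", "O"]]
pred = [["O", "B-LOC", "I-LOC", "O"], ["O", "B-LOC"]]
score = span_f1(gold, pred)
```

In this toy example one of the two gold mentions is recovered exactly and one spurious mention is predicted, so precision and recall are both 0.5. A zero-shot or few-shot model would be scored the same way on the target event's held-out tweets.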
Collections
- Computer Science and Engineering