ArabicWeb16: A new crawl for today's Arabic Web
Author | Suwaileh, Reem |
Author | Kutlu, Mucahid |
Author | Fathima, Nihal |
Author | Elsayed, Tamer |
Author | Lease, Matthew |
Available date | 2021-09-01T10:02:44Z |
Publication Date | 2016 |
Publication Name | SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval |
Resource | Scopus |
Abstract | Web crawls provide valuable snapshots of the Web which enable a wide variety of research, be it distributional analysis to characterize Web properties or use of language, content analysis in social science, or Information Retrieval (IR) research to develop and evaluate effective search algorithms. While many English-centric Web crawls exist, existing public Arabic Web crawls are quite limited, limiting research and development. To remedy this, we present ArabicWeb16, a new public Web crawl of roughly 150M Arabic Web pages with significant coverage of dialectal Arabic as well as Modern Standard Arabic. For IR researchers, we expect ArabicWeb16 to support various research areas: ad-hoc search, question answering, filtering, cross-dialect search, dialect detection, entity search, blog search, and spam detection. © 2016 ACM. |
Language | en |
Publisher | Association for Computing Machinery, Inc |
Subject | Internet; Websites; Ad-hoc search; Arabic retrieval; Evaluation; Multi-Dialect; Web collections; Information retrieval |
Type | Conference |
Pagination | 673-676 |