Automatic diacritics restoration for modern standard Arabic text
Author | Zayyan, Ayman A. |
Author | Elmahdy, Mohamed |
Author | Husni, Husniza binti |
Author | Al Ja'am, Jihad M. |
Available date | 2021-07-05T10:58:31Z |
Publication Date | 2016 |
Publication Name | ISCAIE 2016 - 2016 IEEE Symposium on Computer Applications and Industrial Electronics |
Resource | Scopus |
Abstract | This paper investigates the problem of missing diacritic marks in most written Arabic resources. Our aim is to implement a scalable and extensible platform that automatically restores missing diacritic marks in Modern Standard Arabic text. Several rule-based and statistical techniques are proposed: a morphological analyzer-based approach, maximum likelihood estimation, and statistical n-gram models. The diacritization accuracy of each technique was evaluated using Diacritic Error Rate (DER) and Word Error Rate (WER). The proposed platform includes helper tools for text preprocessing and encoding conversion. It yielded a WER of 7.1% and a DER of 3.9%. When case endings were ignored, the platform yielded a WER of 5.1% and a DER of 2.7%. © 2016 IEEE. |
Language | en |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Subject | Arabic; diacritization; text processing; vowelization |
Type | Conference |
Pagination | 221-225 |
Collection | Computer Science & Engineering |
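
The abstract reports results as Diacritic Error Rate (DER) and Word Error Rate (WER) but does not define them. The sketch below assumes the definitions commonly used for diacritization: DER is the share of letters whose restored diacritics differ from the reference, and WER is the share of words containing at least one such letter. The function names, the `ignore_case_ending` flag, and the choice of the Unicode range U+064B to U+0652 as the diacritic set are illustrative assumptions, not the paper's implementation.

```python
# Assumption: the diacritic inventory is the Arabic combining marks
# U+064B (fathatan) through U+0652 (sukun).
DIACRITICS = {chr(cp) for cp in range(0x064B, 0x0653)}


def split_diacritics(word):
    """Return a list of (letter, attached_diacritics) pairs for one word."""
    pairs = []
    for ch in word:
        if ch in DIACRITICS:
            if pairs:  # attach the mark to the preceding letter
                letter, marks = pairs[-1]
                pairs[-1] = (letter, marks + ch)
        else:
            pairs.append((ch, ""))
    return pairs


def error_rates(reference, hypothesis, ignore_case_ending=False):
    """Return (DER, WER) as fractions for two diacritized strings.

    Both strings must contain the same words and base letters; only the
    diacritics may differ. With ignore_case_ending=True the last letter
    of every word is excluded from scoring (a hypothetical way to model
    the "case ending ignored" setting).
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if len(ref_words) != len(hyp_words):
        raise ValueError("reference and hypothesis differ in word count")

    letters = letter_errors = word_errors = 0
    for ref_word, hyp_word in zip(ref_words, hyp_words):
        ref_pairs = split_diacritics(ref_word)
        hyp_pairs = split_diacritics(hyp_word)
        if [l for l, _ in ref_pairs] != [l for l, _ in hyp_pairs]:
            raise ValueError("base letters differ between the two texts")
        if ignore_case_ending:
            ref_pairs, hyp_pairs = ref_pairs[:-1], hyp_pairs[:-1]
        # Compare marks as sorted lists so that mark order does not matter.
        wrong = sum(
            sorted(r) != sorted(h)
            for (_, r), (_, h) in zip(ref_pairs, hyp_pairs)
        )
        letters += len(ref_pairs)
        letter_errors += wrong
        word_errors += wrong > 0

    der = letter_errors / letters if letters else 0.0
    wer = word_errors / len(ref_words) if ref_words else 0.0
    return der, wer


# Two words; the hypothesis gets only the final mark of the first word wrong.
ref = "\u0643\u064E\u062A\u064E\u0628\u064E \u0645\u0650\u0646\u0652"
hyp = "\u0643\u064E\u062A\u064E\u0628\u064F \u0645\u0650\u0646\u0652"
print(error_rates(ref, hyp))                           # (0.2, 0.5)
print(error_rates(ref, hyp, ignore_case_ending=True))  # (0.0, 0.0)
```

In this toy input the only error sits on a word-final letter, so both rates drop to zero once case endings are excluded. That is the same direction as the gap between the two pairs of figures in the abstract.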