Automatic diacritics restoration for modern standard Arabic text
Author | Zayyan, Ayman A. |
Author | Elmahdy, Mohamed |
Author | Husni, Husniza binti |
Author | Al Ja'am, Jihad M. |
Date available | 2021-07-05T10:58:31Z |
Publication date | 2016 |
Publication name | ISCAIE 2016 - 2016 IEEE Symposium on Computer Applications and Industrial Electronics |
Source | Scopus |
Abstract | In this paper, the problem of missing diacritic marks in most written Arabic resources is investigated. Our aim is to implement a scalable and extensible platform that automatically restores missing diacritic marks in Modern Standard Arabic text. Several rule-based and statistical techniques are proposed: a morphological-analyzer-based approach, maximum likelihood estimation, and statistical n-gram models. The diacritization accuracy of each technique was evaluated using Diacritic Error Rate (DER) and Word Error Rate (WER). The proposed platform includes helper tools for text preprocessing and encoding conversion. It yielded a WER of 7.1% and a DER of 3.9%. When the case ending was ignored, the platform yielded a WER of 5.1% and a DER of 2.7%. © 2016 IEEE. |
Language | en |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Subject | Arabic diacritization text processing vowelization |
Type | Conference |
Pages | 221-225 |
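
The abstract names a maximum likelihood estimate technique and two metrics, DER and WER, but this record does not give their exact definitions. The following is a minimal sketch under stated assumptions, not the authors' implementation: the maximum likelihood model is assumed to be a word-level lookup that picks the most frequent diacritized form of each bare word, DER is assumed to be the fraction of letters whose attached diacritics differ from the reference, WER is assumed to be the fraction of words with at least one such letter, and "ignoring the case ending" is approximated by skipping the last letter of every word. All function names are hypothetical.

```python
from collections import Counter, defaultdict

# Arabic diacritic marks (fathatan through sukun), U+064B..U+0652.
DIACRITICS = frozenset(chr(c) for c in range(0x064B, 0x0653))


def strip_diacritics(text):
    """Remove all diacritic marks, leaving the bare letters."""
    return "".join(ch for ch in text if ch not in DIACRITICS)


def marks_per_letter(word):
    """Return one string of diacritics per letter (sorted, so mark order is ignored)."""
    marks = []
    for ch in word:
        if ch not in DIACRITICS:
            marks.append("")       # new letter, no marks seen yet
        elif marks:
            marks[-1] += ch        # attach the mark to the preceding letter
    return ["".join(sorted(m)) for m in marks]


def train_mle(diacritized_words):
    """Assumed word-level MLE: map each bare word to its most frequent diacritized form."""
    counts = defaultdict(Counter)
    for word in diacritized_words:
        counts[strip_diacritics(word)][word] += 1
    return {bare: forms.most_common(1)[0][0] for bare, forms in counts.items()}


def restore(bare_words, model):
    """Diacritize known words; unknown words are returned unchanged."""
    return [model.get(word, word) for word in bare_words]


def error_rates(reference, hypothesis, ignore_case_ending=False):
    """Return (WER, DER) as fractions over aligned word lists."""
    if not reference or len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must be non-empty and aligned")
    word_errors = letter_errors = letters = 0
    for ref, hyp in zip(reference, hypothesis):
        ref_marks, hyp_marks = marks_per_letter(ref), marks_per_letter(hyp)
        if len(ref_marks) != len(hyp_marks):
            raise ValueError("words differ in their bare letters")
        if ignore_case_ending:
            # Simplification: treat the last letter's marks as the case ending.
            ref_marks, hyp_marks = ref_marks[:-1], hyp_marks[:-1]
        wrong = sum(r != h for r, h in zip(ref_marks, hyp_marks))
        letter_errors += wrong
        letters += len(ref_marks)
        word_errors += wrong > 0
    der = letter_errors / letters if letters else 0.0
    return word_errors / len(reference), der
```

With these definitions, a hypothesis that differs from the reference only in the final vowel of a word counts as both a word error and a diacritic error, and counts as neither once `ignore_case_ending=True` is set. That mirrors the gap the abstract reports between the two evaluation settings.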
This item appears in the following collections
- Computer Science and Engineering [2402 items]