Simple item record

Author: Duwairi, Rehab
Author: Al-Refai, Mohammad
Author: Khasawneh, Natheer
Date available: 2009-12-28T05:33:38Z
Publication date: 2007-11-18
Publication name: 4th International Conference on Innovations in Information Technology 2007
Citation: Duwairi, R.; Al-Refai, M.; Khasawneh, N., "Stemming Versus Light Stemming as Feature Selection Techniques for Arabic Text Categorization," Innovations in Information Technology, 2007. IIT '07. 4th International Conference on, pp. 446-450, 18-20 Nov. 2007
URI: http://dx.doi.org/10.1109/IIT.2007.4430403
URI: http://hdl.handle.net/10576/10501
Abstract: This paper compares two feature selection techniques applied to an Arabic corpus: stemming and light stemming. With stemming, words are reduced to their stems; with light stemming, words are reduced to their light stems. Stemming is aggressive in the sense that it reduces words to their three-letter roots. This affects the semantics, as several words with different meanings might share the same root. Light stemming, by comparison, removes frequently used prefixes and suffixes in Arabic words. It does not produce the root and therefore does not affect the semantics of words; it maps several words that have the same meaning to a common syntactical form. The effectiveness of these two feature selection techniques was assessed in a text categorization exercise on an Arabic corpus of 15,000 documents that fall into three categories. The K-nearest neighbors (KNN) classifier was used in this work. Several experiments were carried out using two different representations of the same corpus: the first uses stem vectors and the second uses light-stem vectors as representatives of documents. These two representations were assessed in terms of size, time, and accuracy. The light-stem representation was superior to the stem representation in terms of classifier accuracy.
Language: en
Publisher: IEEE
Subject: Arabic language
K-nearest neighbors classifier
feature selection
light-stemming
stemming
text categorization
Title: Stemming Versus Light Stemming as Feature Selection Techniques for Arabic Text Categorization
Type: Conference Paper


Files in this item

There are no files associated with this item.
