Isolated Handwritten Pashto Characters Recognition using KNN classifier
Abstract
Regional and cultural diversity around the world, and especially in Pakistan, has given rise to a large number of writing systems and scripts with varying character sets. Developing an optimal OCR system for such a large and varied character set is a challenging task. The unlimited variation in handwritten text, caused by mood, writing style, writing medium, and similar factors, puzzles the research community. Slight differences between character shapes in various scripts act as a major barrier to developing character recognition (CR) systems for cursive scripts. The unavailability of benchmark results and corpora for cursive-script CR impedes researchers in developing optimal CR systems. To address these issues, the proposed research work aims to develop an optimal OCR system for the recognition of handwritten Pashto characters. The unavailability of a standard corpus of handwritten Pashto characters is also addressed by developing a medium-sized corpus (14,784 handwritten samples). A K-nearest neighbor (KNN) classifier is used to recognize the Pashto characters, with features extracted by the zoning technique. After testing the proposed OCR system on varying training and test sets, an overall accuracy of 85.31% is obtained.
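As a rough illustration of how zoning features can feed a KNN classifier, the following minimal Python sketch may help. It is not the authors' implementation. The 4x4 zone grid, the pixel-density feature, the Euclidean distance, the value of k, and the toy 8x8 "bar" images are all assumptions made for the example, and the function names are hypothetical.

```python
import numpy as np
from collections import Counter

def zoning_features(image, zones=(4, 4)):
    """Split a 2-D binary image into a grid of zones and return the
    fraction of foreground pixels in each zone (assumed feature)."""
    img = np.asarray(image, dtype=float)
    row_groups = np.array_split(np.arange(img.shape[0]), zones[0])
    col_groups = np.array_split(np.arange(img.shape[1]), zones[1])
    return np.array([img[np.ix_(r, c)].mean()
                     for r in row_groups for c in col_groups])

def knn_predict(train_X, train_y, x, k=1):
    """Label x by majority vote among its k nearest training vectors
    (Euclidean distance; both k and the metric are assumptions)."""
    dists = np.linalg.norm(np.asarray(train_X) - x, axis=1)
    nearest = np.argsort(dists, kind="stable")[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

def bar_image(index, vertical, size=8):
    """Toy stand-in for a character image: one vertical or horizontal stroke."""
    img = np.zeros((size, size), dtype=int)
    if vertical:
        img[:, index] = 1
    else:
        img[index, :] = 1
    return img

# Hypothetical training set: two samples per class.
train_images = [bar_image(3, True), bar_image(4, True),
                bar_image(3, False), bar_image(4, False)]
train_y = ["vertical", "vertical", "horizontal", "horizontal"]
train_X = np.array([zoning_features(im) for im in train_images])

# Classify an unseen stroke at a slightly shifted position.
query = zoning_features(bar_image(2, True))
print(knn_predict(train_X, train_y, query, k=1))
```

Because each zone summarizes a small region, a stroke shifted by one pixel within the same zone produces the same feature vector, which is one reason zoning tolerates small variations in handwriting. In practice the grid size and k would be tuned on held-out data.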