
Author: Al-maadeed, Somaya
Author: Issawi, Fatima
Author: Bouridane, Ahmed
Available date: 2022-04-13T06:55:10Z
Publication Date: 2014
Publication Name: Qatar Foundation Annual Research Conference Proceedings
Resource: qscience
Citation: Al-maadeed, Somaya; Issawi, Fatima; Bouridane, Ahmed (2014). Digitization and Indexing Of Arabic Historical Manuscripts In Qatar. Qatar Foundation Annual Research Conference Proceedings 2014: SSPP0976. https://doi.org/10.5339/qfarc.2014.SSPP0976
ISSN: 2226-9649
URI: https://doi.org/10.5339/qfarc.2014.SSPP0976
URI: http://hdl.handle.net/10576/29633
Abstract:
Background: Hundreds of thousands of rare Arabic manuscripts are available in Qatar. A rich archival heritage of the Islamic world and of Qatar is preserved in the Qatar National Library (QNL). A number of international projects have been carried out in different parts of the world to digitize Arabic manuscripts, for example by the World Digital Library in cooperation with several international bodies, such as UNESCO, the Bibliotheca Alexandrina, and King Abdullah University of Science and Technology. Still, the search engines of these projects have neither the ability nor the interface to find words inside the image of a manuscript. Our indexing system was implemented on different Arabic manuscript datasets, including samples from the QNL.
Objectives: The goal of this research project is to query for words in the images of any manuscript database and to point out both the word's location in the images and the equivalent text. The system shows the query results to the user, who can then view the text on our interactive website. As demonstrated in Figure 1, the website interface helps users find their query easily.
Methods: In this project, we designed and implemented a novel indexing system. We present an algorithm for the automatic segmentation of manuscripts. Each segmented page is then manually annotated to correct segmentation mistakes. During this correction phase, information about the image is extracted and stored in a database. The extracted information is then indexed, and users can use our search interface to find words in any ancient manuscript that has been added to the system. To the best of our knowledge, this is the first word search system for manuscripts that uses text queries to highlight the search terms in the manuscript image.
Results: The focal challenge in this project is segmenting handwritten Arabic manuscripts so that they can be indexed word by word. Therefore, manual correction of the automatic segmentation was added to reach a 100% segmentation rate. This paper discusses a robotic method for indexing Arabic manuscripts that has not previously been developed for handwritten manuscripts: an interactive website for the word search engine that indexes and stores the manuscripts and provides users with searching and highlighting capability in the document image.
Conclusions: We considered the need to convert the words in handwritten documents into electronic data so that they become searchable online. A prototype system applying the proposed approach is being developed and experimentally tested to fully demonstrate the capabilities of the website on Arabic manuscripts. An overview of the initial experimental studies is presented. We expect the proposed word retrieval system to take search in manuscripts to a new level.
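The pipeline outlined in the abstract (segment the page, correct the segmentation manually, store the word regions, then answer text queries by highlighting matches) might be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the `WordBox` record, the `build_index` and `search` helpers, the page identifiers, and the sample coordinates are all hypothetical, and a simple in-memory inverted index stands in for the database the abstract mentions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class WordBox:
    """One manually verified word region in a manuscript page image (hypothetical schema)."""
    page_id: str   # hypothetical page identifier
    x: int         # left edge of the bounding box, in pixels
    y: int         # top edge of the bounding box, in pixels
    width: int
    height: int
    text: str      # transcription entered during manual correction


def build_index(boxes):
    """Map each transcribed word to every region where it occurs."""
    index = defaultdict(list)
    for box in boxes:
        index[box.text].append(box)
    return index


def search(index, query):
    """Return the regions to highlight for a one-word text query."""
    return list(index.get(query.strip(), []))


# Invented sample annotations, not taken from any real manuscript dataset.
annotations = [
    WordBox("ms01_p1", 410, 52, 88, 40, "كتاب"),
    WordBox("ms01_p1", 300, 52, 95, 40, "العلم"),
    WordBox("ms01_p2", 120, 210, 90, 42, "كتاب"),
]

index = build_index(annotations)
hits = search(index, "كتاب")
for hit in hits:
    # A viewer could draw a rectangle at these coordinates on the page image.
    print(hit.page_id, hit.x, hit.y, hit.width, hit.height)
```

Keeping the bounding box alongside the transcription is what would let a text query be answered with a highlighted region in the image rather than with plain text alone.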
Language: en
Publisher: Hamad bin Khalifa University Press (HBKU Press)
Subject: Digitization Of Arabic Historical Manuscripts; Indexing Of Arabic Historical Manuscripts; Qatar National Library (QNL)
Title: Digitization And Indexing Of Arabic Historical Manuscripts In Qatar
Type: Conference Paper
Issue Number: 1
Volume Number: 2014

