Challenges in baseline detection of Arabic script based languages
Journal contribution posted on 2014-01-01, 00:00, authored by S Naz, Imran Razzak, K Hayat, M W Anwar, S Z Khan
In this chapter, we present the challenges of baseline detection for Arabic script based languages, focusing on the Nastaliq and Naskh writing styles. Baseline detection is an important step in OCR because it directly affects the steps that follow: an accurate baseline improves the performance and efficiency of character segmentation and feature extraction. Character recognition for Arabic script is more difficult than for Latin text because Arabic script is cursive, is context sensitive, and is written in several different styles. We provide a comprehensive review of baseline detection methods for the Urdu language. The aim of the chapter is to introduce the challenges that arise during baseline detection in cursive script languages written in the Nastaliq and Naskh styles. © 2014 Springer International Publishing Switzerland.