Deakin University

File(s) under permanent embargo

Urdu Nasta’liq text recognition system based on multi-dimensional recurrent neural network and statistical features

journal contribution
posted on 2017-01-01, 00:00 authored by S Naz, A I Umar, R Ahmad, S B Ahmed, S H Shirazi, Imran Razzak
Character recognition for cursive scripts such as Arabic and handwritten English and French is a challenging task, and it becomes more complicated for Urdu Nasta’liq text because this script is more complex than Arabic. Recurrent neural networks (RNNs) have shown excellent performance for English, French and cursive Arabic script owing to their sequence learning property. Most recent approaches perform segmentation-based character recognition, but the complexity of the Nasta’liq script makes the segmentation error considerably higher than for Arabic Naskh script. RNNs have provided promising results in such scenarios. In this paper, we achieve high accuracy for Urdu Nasta’liq using statistical features and multi-dimensional long short-term memory. We present a robust feature extraction approach that extracts features using a right-to-left sliding window. Results show that the selected features significantly reduce the label error. For evaluation, we use the Urdu printed text images dataset and compare the proposed approach with recent work. The system achieves 94.97% recognition accuracy for unconstrained printed Nasta’liq text lines and outperforms the state-of-the-art results.
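The right-to-left sliding window mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not list the statistical features used in the paper, so the two computed here (ink density and normalised vertical centre of ink) are assumptions chosen for illustration, as are the function name `sliding_window_features`, the `width` and `step` parameters, and the binarised 0/1 image input.

```python
from typing import List, Sequence


def sliding_window_features(
    image: Sequence[Sequence[int]], width: int = 4, step: int = 2
) -> List[List[float]]:
    """Return one feature vector per window, scanning right to left.

    `image` is a binarised text line given as rows of 0 (background)
    and 1 (ink). The features are illustrative, not the paper's set.
    """
    height = len(image)
    n_cols = len(image[0])
    features: List[List[float]] = []

    # Start at the right edge because Urdu is written right to left.
    right = n_cols
    while right > 0:
        left = max(0, right - width)  # last window may be narrower

        # Row index of every ink pixel inside the window.
        ink_rows = [
            r
            for r in range(height)
            for c in range(left, right)
            if image[r][c]
        ]

        # Feature 1: fraction of window pixels that are ink.
        density = len(ink_rows) / (height * (right - left))

        # Feature 2: mean row of the ink, scaled to [0, 1];
        # 0.5 is used as a neutral value for empty windows.
        if ink_rows and height > 1:
            centre = sum(ink_rows) / len(ink_rows) / (height - 1)
        else:
            centre = 0.5

        features.append([density, centre])
        right -= step

    return features


if __name__ == "__main__":
    line = [
        [0, 0, 0, 0, 1, 1],
        [0, 0, 0, 0, 1, 1],
        [0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 0, 0],
    ]
    for vector in sliding_window_features(line):
        print(vector)
```

In a system of the kind the abstract describes, the resulting sequence of feature vectors would be fed, in right-to-left order, to the recurrent network as its input sequence. How the paper's multi-dimensional long short-term memory consumes the features is not stated in the abstract.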

History

Journal

Neural computing and applications

Volume

28

Pagination

219 - 231

Publisher

Springer

Location

London, Eng.

ISSN

0941-0643

Language

eng

Publication classification

C1.1 Refereed article in a scholarly journal

Copyright notice

2015, The Natural Computing Applications Forum