Deakin University
File(s) under permanent embargo

Deep OCR for Arabic script-based language like Pastho

Version 2 2024-06-06, 09:15
Version 1 2020-05-08, 09:23
journal contribution
posted on 2024-06-06, 09:15 authored by S Naz, NH Khan, S Zahoor, Imran Razzak
Developing cursive script recognition systems has always been a challenging task for researchers. This article proposes a ligature-based recognition system for the cursive Pashto script that fine-tunes four pre-trained CNN models. The SqueezeNet, ResNet, MobileNet and DenseNet models are evaluated for the classification and recognition of Pashto sub-words (ligatures). The proposed system is divided into two domains, source and target. The source domain contains the models pre-trained on the ImageNet dataset. These models are then fine-tuned, using a transfer learning approach, for Pashto ligature recognition in the target domain. Two data augmentation techniques, negative and contour, are used to increase the representation of ligature images and the dataset size. The CNN models are evaluated on the benchmark FAST-NU Pashto ligature dataset. The proposed system achieved its highest recognition rate, up to 99.31%, with the DenseNet architecture.
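
The abstract names negative and contour augmentation without defining them, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes an 8-bit grayscale ligature image stored as a list of rows, with dark ink on a light background, and it treats "contour" as the set of ink pixels that touch at least one background pixel. The names `negative`, `contour`, `augment` and the `threshold` parameter are hypothetical.

```python
from typing import List

# Hypothetical representation: 8-bit grayscale image, one inner list per row.
Image = List[List[int]]


def negative(img: Image) -> Image:
    """Invert intensities so that 0 becomes 255 and 255 becomes 0."""
    return [[255 - p for p in row] for row in img]


def contour(img: Image, threshold: int = 128) -> Image:
    """Keep only ink pixels that touch at least one background pixel.

    Assumptions: ink is dark (value < threshold) on a light background,
    and anything outside the image counts as background.
    """
    h, w = len(img), len(img[0])

    def is_ink(r: int, c: int) -> bool:
        return 0 <= r < h and 0 <= c < w and img[r][c] < threshold

    neighbours = ((-1, 0), (1, 0), (0, -1), (0, 1))
    out = [[255] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            interior = all(is_ink(r + dr, c + dc) for dr, dc in neighbours)
            if is_ink(r, c) and not interior:
                out[r][c] = 0  # boundary ink pixel stays black
    return out


def augment(img: Image) -> List[Image]:
    """Return the original image plus its negative and contour variants."""
    return [img, negative(img), contour(img)]
```

Under these assumptions, each source image yields three training samples, which is one plausible way augmentation of this kind would enlarge the dataset before fine-tuning.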

History

Journal

Expert Systems

Volume

37

Article number

ARTN e12565

Location

London, Eng.

ISSN

0266-4720

eISSN

1468-0394

Language

English

Notes

In Press

Publication classification

C1 Refereed article in a scholarly journal

Issue

5

Publisher

WILEY