A hybrid approach for NER system for Scarce Resourced Language-URDU: Integrating n-gram with rules and gazetteers

Naz, Saeeda, Umar, Arif Iqbal and Razzak, Muhammad Imran 2015, A hybrid approach for NER system for Scarce Resourced Language-URDU: Integrating n-gram with rules and gazetteers, Mehran University Research Journal Of Engineering & Technology, vol. 34, no. 4, pp. 349-358.

Title A hybrid approach for NER system for Scarce Resourced Language-URDU: Integrating n-gram with rules and gazetteers
Author(s) Naz, Saeeda
Umar, Arif Iqbal
Razzak, Muhammad Imran (ORCID iD: orcid.org/0000-0002-3930-6600)
Journal name Mehran University Research Journal Of Engineering & Technology
Volume number 34
Issue number 4
Start page 349
End page 358
Total pages 10
Publisher Mehran University of Engineering and Technology
Place of publication Jamshoro, Pakistan
Publication date 2015-10
ISSN 0254-7821
2413-7219
Keyword(s) Science & Technology
Technology
Engineering, Multidisciplinary
Engineering
Entity Recognition
Named Entities
N-Gram Model
Gazetteer Lists
Linguistics--Study and teaching
Arabic language
Research--Methodology
Summary We present a hybrid NER (Named Entity Recognition) system for Urdu script that integrates an n-gram model (unigram and bigram) with rules and gazetteers. For rule construction we used prefix and suffix characters, instead of first name and last name lists or potential terms, on the output list produced by the n-gram model. The system is evaluated on two Urdu text corpora, the IJCNLP NE (Named Entity) corpus and the CRL NE corpus. It achieved 92.65% and 87.6% with the hybrid unigram model, and 92.47% and 86.83% with the hybrid bigram model, on the IJCNLP NE corpus and the CRL NE corpus, respectively.
Language eng
Indigenous content off
HERDC Research category C1.1 Refereed article in a scholarly journal
Copyright notice ©2015, Mehran University of Engineering & Technology
Persistent URL http://hdl.handle.net/10536/DRO/DU:30146651


Citation counts: Cited 2 times in TR Web of Science
Cited 0 times in Scopus
Created: Tue, 12 Jan 2021, 10:31:34 EST
