File(s) under permanent embargo
Classification ensemble to improve medical Named Entity Recognition
conference contribution
posted on 2014-01-01, 00:00 authored by Sara Keretna, Chee Peng Lim, Douglas Creighton, K B Shaban
Accurate Named Entity Recognition (NER) is important for knowledge discovery in text mining. This paper proposes an ensemble machine learning approach to recognise Named Entities (NEs) in unstructured and informal medical text. Specifically, Conditional Random Field (CRF) and Maximum Entropy (ME) classifiers are applied individually to the test data set from the i2b2 2010 medication challenge. Each classifier is trained using a different set of features. The first set focuses on the contextual features of the data, while the second concentrates on the linguistic features of each word. The results of the two classifiers are then combined. The proposed approach achieves an f-score of 81.8%, a considerable improvement over the CRF and ME classifiers individually, which achieve f-scores of 76% and 66.3%, respectively, on the same data set.
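The abstract does not say how the two classifiers' results are combined or how the f-score is computed, so the following is only a minimal sketch under stated assumptions. It assumes entities are represented as hypothetical `(start, end, label)` token spans with an exclusive end, that combination means keeping every CRF entity and adding ME entities that overlap none of them, and that the f-score is the usual harmonic mean of exact-match precision and recall. The function names `combine_union` and `f_score` are illustrative and do not come from the paper.

```python
from typing import Set, Tuple

# Hypothetical entity representation: (start token, end token exclusive, label).
Entity = Tuple[int, int, str]


def combine_union(crf: Set[Entity], me: Set[Entity]) -> Set[Entity]:
    """Assumed combination rule: keep all CRF entities, then add each ME
    entity whose span does not overlap any CRF entity."""
    combined = set(crf)
    for start, end, label in me:
        overlaps = any(start < c_end and c_start < end
                       for c_start, c_end, _ in crf)
        if not overlaps:
            combined.add((start, end, label))
    return combined


def f_score(predicted: Set[Entity], gold: Set[Entity]) -> float:
    """Exact-match F1: harmonic mean of precision and recall."""
    true_pos = len(predicted & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the i2b2 corpus.
    gold = {(0, 2, "medication"), (5, 6, "dosage"), (8, 10, "frequency")}
    crf_pred = {(0, 2, "medication"), (5, 6, "dosage")}
    me_pred = {(0, 1, "medication"), (8, 10, "frequency")}

    ensemble = combine_union(crf_pred, me_pred)
    for name, pred in [("CRF", crf_pred), ("ME", me_pred), ("ensemble", ensemble)]:
        print(f"{name}: f-score = {f_score(pred, gold):.3f}")
```

In this toy case the ensemble scores higher than either classifier alone because each one finds an entity the other misses, which is the general intuition behind combining classifiers trained on different feature sets.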
Event
2014 IEEE International Conference on Systems, Man and Cybernetics
Pagination
2630 - 2636
Publisher
IEEE
Location
San Diego, CA, USA
Place of publication
Piscataway, NJ
Start date
2014-10-05
End date
2014-10-08
ISBN-13
9781479938407
Language
English
Publication classification
E Conference publication; E1 Full written paper - refereed
Copyright notice
2014, IEEE
Title of proceedings
Proceedings of 2014 IEEE International Conference on Systems, Man and Cybernetics
Keywords
Machine learning; biomedical named entity recognition; conditional random field; information extraction; maximum entropy; medical text mining; Science & Technology; Technology; Computer Science, Artificial Intelligence; Computer Science, Cybernetics; Computer Science, Information Systems; Computer Science; IDENTIFICATION; RECORDS; SYSTEM