File(s) under permanent embargo
Speed up health research through topic modeling of coded clinical data
conference contribution
posted on 2014-08-24, 00:00 authored by Wei Luo, Quoc-Dinh Phung, Tien Vu Nguyen, Truyen Tran, Svetha Venkatesh
Although the randomized controlled trial is the gold standard in medical research, researchers are increasingly looking to alternative data sources for hypothesis generation and early-stage evidence collection. Coded clinical data are collected routinely in most hospitals. While they contain rich information directly related to the real clinical setting, they are both noisy and semantically diverse, making them difficult to analyze with conventional statistical tools. This paper presents a novel application of Bayesian nonparametric modeling to uncover latent information in coded clinical data. For a patient cohort, a Bayesian nonparametric model is used to reveal the common comorbidity groups shared by the patients and the proportion in which each comorbidity group is reflected in each individual patient. To demonstrate the method, we present a case study based on hospitalization coding from an Australian hospital. The model recovered 15 comorbidity groups among 1012 patients hospitalized during a month. When patients from two areas of unequal socio-economic status were compared, the model revealed a higher prevalence of diverticular disease in the region of lower socio-economic status. The study builds a convincing case for using routine coded data to speed up hypothesis generation.
Event
IAPR Pattern Recognition for Healthcare Workshop (2nd : 2014 : Stockholm, Sweden)
Pagination
1 - 4
Publisher
International Association of Pattern Recognition
Location
Stockholm, Sweden
Place of publication
[Stockholm, Sweden]
Start date
2014-08-24
End date
2014-08-24
Language
eng
Publication classification
E Conference publication; E1.1 Full written paper - refereed
Copyright notice
2014, IAPR
Title of proceedings
IAPR 2014 : Proceedings of 2nd International Workshop on Pattern Recognition for Healthcare Analytics