Speed up health research through topic modeling of coded clinical data

Luo, Wei, Phung, Dinh, Nguyen, Vu, Tran, Truyen and Venkatesh, Svetha 2014, Speed up health research through topic modeling of coded clinical data, in IAPR 2014: Proceedings of 2nd International Workshop on Pattern Recognition for Healthcare Analytics, International Association of Pattern Recognition, [Stockholm, Sweden], pp. 1-4.


Title Speed up health research through topic modeling of coded clinical data
Author(s) Luo, Wei (ORCID iD: orcid.org/0000-0002-4711-7543)
Phung, Dinh (ORCID iD: orcid.org/0000-0002-9977-8247)
Nguyen, Vu
Tran, Truyen (ORCID iD: orcid.org/0000-0001-6531-8907)
Venkatesh, Svetha
Conference name IAPR Pattern Recognition for Healthcare Workshop (2nd: 2014: Stockholm, Sweden)
Conference location Stockholm, Sweden
Conference dates 24 Aug. 2014
Title of proceedings IAPR 2014: Proceedings of 2nd International Workshop on Pattern Recognition for Healthcare Analytics
Publication date 2014
Start page 1
End page 4
Total pages 4
Publisher International Association of Pattern Recognition
Place of publication [Stockholm, Sweden]
Summary Although the randomized controlled trial is the gold standard in medical research, researchers are increasingly looking to alternative data sources for hypothesis generation and early-stage evidence collection. Coded clinical data are collected routinely in most hospitals. While they contain rich information directly related to the real clinical setting, they are both noisy and semantically diverse, making them difficult to analyze with conventional statistical tools. This paper presents a novel application of Bayesian nonparametric modeling to uncover latent information in coded clinical data. For a patient cohort, a Bayesian nonparametric model is used to reveal the common comorbidity groups shared by the patients and the proportion in which each comorbidity group is reflected in each individual patient. To demonstrate the method, we present a case study based on hospitalization coding from an Australian hospital. The model recovered 15 comorbidity groups among 1012 patients hospitalized during a month. When patients from two areas of unequal socio-economic status were compared, the model revealed a higher prevalence of diverticular disease in the region of lower socio-economic status. The study builds a convincing case for using routine coded data to speed up hypothesis generation.
Language eng
Field of Research 080109 Pattern Recognition and Data Mining
Socio Economic Objective 970108 Expanding Knowledge in the Information and Computing Sciences
HERDC Research category E1.1 Full written paper - refereed
ERA Research output type E Conference publication
Copyright notice ©2014, IAPR
Persistent URL http://hdl.handle.net/10536/DRO/DU:30081485

Document type: Conference Paper
Collection: School of Information Technology
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Created: Fri, 26 Feb 2016, 13:48:58 EST
