Regularizing topic discovery in EMRs with side information by using hierarchical Bayesian models

Li,C, Rana,S, Phung,D and Venkatesh,S 2014, Regularizing topic discovery in EMRs with side information by using hierarchical Bayesian models, in ICPR 2014 : Proceedings of the 22nd International Conference on Pattern Recognition, IEEE, Piscataway, N.J., pp. 1307-1312, doi: 10.1109/ICPR.2014.234.

Title Regularizing topic discovery in EMRs with side information by using hierarchical Bayesian models
Author(s) Li,C
Rana,S (ORCID iD: orcid.org/0000-0003-2247-850X)
Phung,D (ORCID iD: orcid.org/0000-0002-9977-8247)
Venkatesh,S (ORCID iD: orcid.org/0000-0001-8675-6631)
Conference name Pattern Recognition. Conference (22nd : 2014 : Stockholm, Sweden)
Conference location Stockholm, Sweden
Conference dates 2014/8/24 - 2014/8/28
Title of proceedings ICPR 2014 : Proceedings of the 22nd International Conference on Pattern Recognition
Editor(s) [Unknown]
Publication date 2014
Conference series Pattern Recognition Conference
Start page 1307
End page 1312
Total pages 6
Publisher IEEE
Place of publication Piscataway, N.J.
Keyword(s) Medical application
Readmission
Side information
Topic analysis
Tree structure
Words
Summary We propose a novel hierarchical Bayesian framework, the word-distance-dependent Chinese restaurant franchise (wd-dCRF), for topic discovery from a document corpus regularized by side information in the form of word-to-word relations, with an application to Electronic Medical Records (EMRs). Typically, an EMR dataset consists of several patients (documents), and each patient record contains many diagnosis codes (words). We exploit the side information available in the form of a semantic tree structure among the diagnosis codes for semantically coherent disease topic discovery. We introduce novel functions to compute word-to-word distances when side information is available in the form of tree structures. We derive an efficient inference method for the wd-dCRF using an MCMC technique. We evaluate on a real-world medical dataset consisting of about 1000 patients with PolyVascular disease. Compared with the popular topic analysis tool, the hierarchical Dirichlet process (HDP), our model discovers topics that are superior in terms of both qualitative and quantitative measures.
ISBN 9781479952083
ISSN 1051-4651
Language eng
DOI 10.1109/ICPR.2014.234
Field of Research 080109 Pattern Recognition and Data Mining
Socio Economic Objective 970108 Expanding Knowledge in the Information and Computing Sciences
HERDC Research category E1 Full written paper - refereed
ERA Research output type E Conference publication
Copyright notice ©2014, IEEE
Persistent URL http://hdl.handle.net/10536/DRO/DU:30072477

Document type: Conference Paper
Collection: Centre for Pattern Recognition and Data Analytics
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Citation counts: TR Web of Science: cited 0 times
Scopus: cited 1 time
Access Statistics: 193 Abstract Views, 0 File Downloads
Created: Fri, 08 May 2015, 13:50:38 EST
