Succinct contrast sets via false positive controlling with an application in clinical process redesign

Nguyen, Dang; Luo, Wei; Vo, B; Pedrycz, W

File(s) under permanent embargo

journal contribution

posted on 2020-12-01, 00:00, authored by Dang Nguyen, Wei Luo, B Vo, W Pedrycz

Many applications of intelligent systems involve understanding a group with a contrastively different outcome (e.g., all survivors of a deadly cancer, or a top-performing team in a large corporation). The intelligent system needs to identify the attributes (features) that best describe or explain the group versus its alternatives. In data mining, this problem is studied under the framework of contrast set mining (CSM). Although CSM is not new, the era of big data has produced new computational and statistical challenges. In particular, existing algorithms fail (1) to run efficiently on large-scale datasets and (2) to accommodate simultaneous inference on an overwhelming array of features, which are often repetitive and collinear. In this paper, we develop a CSM algorithm that addresses both challenges. The computational challenge is addressed with a tree structure and two theorems, while the statistical challenge is addressed by applying the false discovery rate to multiple testing. Comprehensive experiments demonstrate the computational and statistical advantages of the proposed algorithm over three state-of-the-art algorithms. We also show the effectiveness of the proposed method in an intelligent-system application involving hospital process redesign. The proposed method not only improves the performance of machine learning systems but also generates succinct and insightful patterns directly relevant to clinical decision-making.
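The abstract does not say which false discovery rate procedure the paper uses. As a minimal sketch of the general idea, assuming the standard Benjamini-Hochberg step-up rule (an assumption, not confirmed by this record), the following shows how a set of per-feature p-values could be screened so that the expected share of false positives among the reported patterns stays below a chosen level. The function name `benjamini_hochberg` and the sample p-values are hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Return one reject flag per hypothesis under the BH step-up rule.

    Assumption: this is the textbook Benjamini-Hochberg procedure, used here
    only to illustrate FDR control; the paper's exact procedure may differ.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    reject = [False] * m
    for idx in order[:max_k]:
        reject[idx] = True
    return reject


# Hypothetical p-values for eight candidate contrast patterns.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
flags = benjamini_hochberg(p_vals, alpha=0.05)
```

With these illustrative inputs only the two smallest p-values pass, whereas an uncorrected 0.05 cutoff would have kept five patterns. That difference is the sense in which false discovery rate control yields a more succinct result set.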

History

Journal

Expert systems with applications

Volume

161

Article number

113670

Pagination

1 - 17

Publisher

Elsevier, The Netherlands

Publisher DOI

https://doi.org/10.1016/j.eswa.2020.113670

ISSN

0957-4174

Language

eng

Publication classification

C1 Refereed article in a scholarly journal

Keywords

Data mining; Contrast set mining; Classification; False discovery rate; Emergency department; Length of stay (LOS); Science & Technology; Technology; Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic; Operations Research & Management Science; Computer Science; Engineering; MINING APPROACH CARE