Partial mixture model for tight clustering of gene expression time-course

Yuan, Yinyin, Li, Chang-Tsun and Wilson, Roland 2008, Partial mixture model for tight clustering of gene expression time-course, BMC bioinformatics, vol. 9, pp. 1-17, doi: 10.1186/1471-2105-9-287.


Title Partial mixture model for tight clustering of gene expression time-course
Author(s) Yuan, Yinyin
Li, Chang-Tsun (ORCID iD: orcid.org/0000-0003-4735-6138)
Wilson, Roland
Journal name BMC bioinformatics
Volume number 9
Article ID 287
Start page 1
End page 17
Total pages 17
Publisher BioMed Central
Place of publication London, Eng.
Publication date 2008-06-18
ISSN 1471-2105
Keyword(s) Gene Ontology
Maximum Likelihood Estimator
Simulated Dataset
Partial Regression
Tight Cluster
Science & Technology
Life Sciences & Biomedicine
Biochemical Research Methods
Biotechnology & Applied Microbiology
Mathematical & Computational Biology
Biochemistry & Molecular Biology
Summary BACKGROUND: Tight clustering arose recently from a desire to obtain tighter and potentially more informative clusters in gene expression studies. Scattered genes with relatively loose correlations should be excluded from the clusters. However, little work in the literature is dedicated to this area of research. Meanwhile, maximum likelihood techniques have been used extensively for model parameter estimation, whereas the minimum distance estimator has been largely ignored.
RESULTS: In this paper we show the inherent robustness of the minimum distance estimator, which makes it a powerful tool for parameter estimation in model-based time-course clustering. To apply minimum distance estimation, we formulate a partial mixture model that can naturally incorporate replicate information and allow for scattered genes. We provide experimental results of simulated data fitting, in which the minimum distance estimator outperforms the maximum likelihood estimator. Both biological and statistical validations are conducted on a simulated dataset and two real gene expression datasets. Our proposed partial regression clustering algorithm scores highest in Gene Ontology driven evaluation, in comparison with four other popular clustering algorithms.
CONCLUSION: For the first time, the partial mixture model is successfully extended to time-course data analysis. The robustness of our partial regression clustering algorithm demonstrates the suitability of combining the partial mixture model and the minimum distance estimator in this field. We show that tight clustering is not only capable of generating a more profound understanding of the dataset under study, in accordance with established biological knowledge, but also presents interesting new hypotheses during interpretation of clustering results. In particular, we provide biological evidence that scattered genes can be relevant and are interesting subjects for study, in contrast to prevailing opinion.
Language eng
DOI 10.1186/1471-2105-9-287
Field of Research 06 Biological Sciences
08 Information and Computing Sciences
01 Mathematical Sciences
HERDC Research category C1.1 Refereed article in a scholarly journal
Copyright notice ©2008, Yuan et al
Persistent URL http://hdl.handle.net/10536/DRO/DU:30120141


Citation counts: Web of Science: cited 10 times
Scopus: cited 12 times
Created: Sat, 23 Mar 2019, 10:20:37 EST
