Posted on 2008-01-01, 00:00. Authored by Yongli Ren, Y Ye, Gang Li
Clustering with the agglomerative Information Bottleneck (aIB) algorithm suffers from a sub-optimality problem: it cannot guarantee that as much relative information as possible is preserved. To handle this problem, we introduce a density connectivity chain, by which we consider not only the information between two data elements, but also the information among the neighbors of a data element. Based on this idea, we propose DCIB, a Density Connectivity Information Bottleneck algorithm that applies the Information Bottleneck method to quantify the relative information during the clustering procedure. As a hierarchical algorithm, DCIB produces a pruned clustering tree and yields clustering results of different sizes in a single execution. Experimental results on document clustering indicate that the DCIB algorithm can preserve more relative information and achieve higher precision than the aIB algorithm.
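For readers unfamiliar with the aIB baseline mentioned in the abstract, the following is a minimal sketch of the standard aIB merge criterion: the information lost by merging two clusters is their combined weight times the weighted Jensen-Shannon divergence between their conditional distributions. It illustrates the greedy pairwise step that the abstract describes as sub-optimal; it is not the DCIB algorithm, whose density connectivity chain is not specified in this record. The function names `kl` and `merge_cost` and the argument layout are assumptions made for illustration.

```python
import math


def kl(p, q):
    """KL divergence D(p || q) in bits; terms with p[i] == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def merge_cost(p_zi, p_zj, py_zi, py_zj):
    """Information lost by merging clusters z_i and z_j (standard aIB criterion).

    p_zi, p_zj   : prior probabilities of the two clusters (hypothetical inputs)
    py_zi, py_zj : conditional distributions p(y | z) as equal-length lists
    """
    w = p_zi + p_zj                      # weight of the merged cluster
    pi_i, pi_j = p_zi / w, p_zj / w      # mixing weights inside the merge
    mix = [pi_i * a + pi_j * b for a, b in zip(py_zi, py_zj)]
    # Weighted Jensen-Shannon divergence between the two conditionals.
    js = pi_i * kl(py_zi, mix) + pi_j * kl(py_zj, mix)
    return w * js


# Clusters with identical conditionals cost nothing to merge,
# while clusters with disjoint conditionals are the most expensive.
same = merge_cost(0.3, 0.2, [0.5, 0.5], [0.5, 0.5])
disjoint = merge_cost(0.5, 0.5, [1.0, 0.0], [0.0, 1.0])
```

At each step aIB merges the pair with the smallest such cost, looking only at the two clusters involved. The abstract's proposal is to also take the neighbors of a data element into account.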
History
Pagination
1783 - 1788
Location
Zhang Jia Jie, China
Open access
Yes
Start date
2008-11-18
End date
2008-11-21
ISBN-13
9780769533988
Language
eng
Notes
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
Publication classification
E1 Full written paper - refereed
Copyright notice
2008, IEEE
Editor/Contributor(s)
G Wang, J Chen, M Fellows, H Ma
Title of proceedings
Proceedings of the 9th International Conference for Young Computer Scientists