Finding the optimal cardinality value for information bottleneck method
The Information Bottleneck method can be used for dimensionality reduction by grouping “similar” features together [1]. In applications, a natural question is how many “feature groups” are appropriate. Many Information Bottleneck algorithms require this number to be supplied as prior knowledge, which restricts their applicability. In this paper we alleviate this dependency by formulating the parameter determination as a model selection problem and solving it with the minimum message length principle. An efficient encoding scheme is designed to describe the information bottleneck solutions and the original data; the minimum message length principle is then applied to determine the optimal cardinality value automatically. Empirical results in the document clustering scenario indicate that the proposed method works well for determining the optimal parameter value for the information bottleneck method.
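The selection idea in the abstract can be illustrated with a minimal sketch. It is not the encoding scheme designed in the paper, which this record does not describe. It assumes a generic two-part code: the cost of stating a hard clustering and its parameters, plus the cost of the data given that clustering. The function names `two_part_message_length` and `select_cardinality`, the parameter precision of 0.5 * log2(N) bits, and the hand-supplied candidate clusterings are all hypothetical choices made for illustration.

```python
import math


def two_part_message_length(counts, labels, k):
    """Crude two-part code length in bits for a hard clustering of rows.

    counts[i][j] is the co-occurrence count of feature x_i with y_j.
    labels[i] in range(k) is the group assigned to feature x_i.
    Assumption: a generic MML/MDL-style code, not the paper's scheme.
    """
    n_rows = len(counts)
    n_cols = len(counts[0])
    total = sum(sum(row) for row in counts)

    # Part 1a: state which of the k groups each feature belongs to.
    assignment_bits = n_rows * math.log2(k) if k > 1 else 0.0
    # Part 1b: state k distributions p(y | group), each with n_cols - 1
    # free parameters, at an assumed 0.5 * log2(total) bits per parameter.
    parameter_bits = 0.5 * k * (n_cols - 1) * math.log2(total)

    # Part 2: encode every observed y using its group's empirical p(y | group).
    group_counts = [[0] * n_cols for _ in range(k)]
    for row, t in zip(counts, labels):
        for j, c in enumerate(row):
            group_counts[t][j] += c
    data_bits = 0.0
    for gc in group_counts:
        size = sum(gc)
        for c in gc:
            if c > 0:
                data_bits -= c * math.log2(c / size)

    return assignment_bits + parameter_bits + data_bits


def select_cardinality(counts, candidates):
    """Return the k whose clustering gives the shortest total message.

    candidates maps each cardinality k to a label list for that k.
    """
    return min(
        candidates,
        key=lambda k: two_part_message_length(counts, candidates[k], k),
    )


# Toy co-occurrence matrix with two visibly different kinds of rows.
toy_counts = [[40, 2], [38, 4], [3, 41], [1, 39]]
# Candidate clusterings supplied by hand; in practice they would come
# from running an Information Bottleneck algorithm once per k.
toy_candidates = {
    1: [0, 0, 0, 0],
    2: [0, 0, 1, 1],
    4: [0, 1, 2, 3],
}
best_k = select_cardinality(toy_counts, toy_candidates)
```

On this toy matrix the two-group candidate yields the shortest message: a single group pays heavily in the data part, while four groups pay more in the model part than they save in the data part.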
Journal: Lecture notes in computer science
Volume: 4093
Pagination: 594-605
Publisher: Springer-Verlag
Location: Berlin, Germany
ISSN: 0302-9743
eISSN: 1611-3349
Language: eng
Publication classification: C1 Refereed article in a scholarly journal
Copyright notice: 2006, Springer-Verlag Berlin Heidelberg