Modern datasets are becoming heterogeneous. To this end, we present in this paper Mixed-Variate Restricted Boltzmann Machines for simultaneously modelling variables of multiple types and modalities, including binary and continuous responses, categorical options, multicategorical choices, ordinal assessment and category-ranked preferences. Dependency among variables is modeled using latent binary variables, each of which can be interpreted as a particular hidden aspect of the data. The proposed model, similar to the standard RBMs, allows fast evaluation of the posterior for the latent variables. Hence, it is naturally suitable for many common tasks including, but not limited to, (a) as a pre-processing step to convert complex input data into a more convenient vectorial representation through the latent posteriors, thereby offering a dimensionality reduction capacity, (b) as a classifier supporting binary, multiclass, multilabel, and label-ranking outputs, or a regression tool for continuous outputs and (c) as a data completion tool for multimodal and heterogeneous data. We evaluate the proposed model on a large-scale dataset using the world opinion survey results on three tasks: feature extraction and visualization, data completion and prediction.
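The fast posterior evaluation mentioned in the abstract can be illustrated with a minimal sketch. It assumes an RBM-style factorised posterior in which each latent binary unit receives an additive contribution from every visible variable, and it covers only three of the listed types (binary, continuous, and a single categorical variable). The function name `hidden_posterior`, the weight layout, and the unit-variance treatment of continuous inputs are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-x))


def hidden_posterior(v_bin, v_cont, v_cat, W_bin, W_cont, W_cat, b_hid):
    """Return P(h_k = 1 | v) for every latent unit k.

    Hypothetical parameterisation (an assumption, not the paper's):
      v_bin  : (n_bin,)  array of 0/1 values
      v_cont : (n_cont,) array of real values, assumed unit variance
      v_cat  : integer index of the chosen category
      W_bin  : (n_bin, n_hidden) weights for binary inputs
      W_cont : (n_cont, n_hidden) weights for continuous inputs
      W_cat  : (n_categories, n_hidden) one weight row per category
      b_hid  : (n_hidden,) latent biases
    """
    # Each visible variable adds its own term to the latent activation;
    # selecting a row of W_cat is equivalent to a one-hot dot product.
    activation = b_hid + v_bin @ W_bin + v_cont @ W_cont + W_cat[v_cat]
    return sigmoid(activation)


# Toy example with random parameters (illustrative values only).
rng = np.random.default_rng(0)
n_bin, n_cont, n_categories, n_hidden = 4, 3, 5, 6

v_bin = rng.integers(0, 2, size=n_bin).astype(float)
v_cont = rng.normal(size=n_cont)
v_cat = 2

W_bin = rng.normal(scale=0.1, size=(n_bin, n_hidden))
W_cont = rng.normal(scale=0.1, size=(n_cont, n_hidden))
W_cat = rng.normal(scale=0.1, size=(n_categories, n_hidden))
b_hid = np.zeros(n_hidden)

posterior = hidden_posterior(v_bin, v_cont, v_cat, W_bin, W_cont, W_cat, b_hid)
```

Because the latent units are conditionally independent given the visible variables, the whole posterior is obtained in one pass, and the resulting vector can serve as the fixed-length representation described in task (a).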
History
Pagination
213 - 229
Location
Taoyuan, Taiwan
Open access
Yes
Start date
2011-11-13
End date
2011-11-15
Language
eng
Publication classification
E1.1 Full written paper - refereed
Copyright notice
2011, The Authors
Editor/Contributor(s)
C Hsu, W Lee
Title of proceedings
ACML 2011 : Proceedings of the 3rd Asian Conference on Machine Learning