Deakin University

File(s) under permanent embargo

Topic model kernel classification with probabilistically reduced features

journal contribution
posted on 2015-04-01, 00:00 authored by Tien Vu Nguyen, Quoc-Dinh Phung, Svetha Venkatesh
Probabilistic topic models have become a standard tool in modern machine learning, with a wide range of applications. Representing data by the low-dimensional mixture proportions extracted from a topic model is not only richer in semantic interpretation but can also be informative for classification tasks. In this paper, we describe the Topic Model Kernel (TMK), a topic-based kernel for Support Vector Machine classification of data processed by probabilistic topic models. The applicability of the proposed kernel is demonstrated on several classification tasks with real-world datasets. TMK outperforms existing kernels on distributional features and gives comparable results on non-probabilistic data types.
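The abstract does not state the functional form of the kernel, so the following is only a minimal sketch of what a topic-based kernel on mixture proportions could look like. It assumes an exponentiated Jensen-Shannon divergence between topic-proportion vectors; that choice, the bandwidth parameter `sigma`, the function names `js_divergence` and `topic_kernel_matrix`, and the toy proportions are all illustrative assumptions, not details taken from this record.

```python
import numpy as np


def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence (natural log) between two probability vectors."""
    # Adding eps creates fresh arrays, so the inputs are never modified,
    # and zero entries cannot produce log(0).
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    m = 0.5 * (p + q)
    kl_pm = np.sum(p * np.log(p / m))
    kl_qm = np.sum(q * np.log(q / m))
    return 0.5 * (kl_pm + kl_qm)


def topic_kernel_matrix(A, B, sigma=1.0):
    """Assumed kernel: K[i, j] = exp(-JS(A[i], B[j]) / sigma**2)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    K = np.empty((A.shape[0], B.shape[0]))
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            K[i, j] = np.exp(-js_divergence(a, b) / sigma ** 2)
    return K


# Hypothetical topic proportions for three documents over three topics.
theta = np.array([
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.1, 0.8],
])
K = topic_kernel_matrix(theta, theta)
```

A Gram matrix of this kind could then be handed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.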

History

Journal

Journal of Data Science

Volume

13

Issue

2

Pagination

323 - 340

Publisher

Department of Statistics, Columbia University

Location

New York, N.Y.

ISSN

1680-743X

eISSN

1683-8602

Language

eng

Publication classification

C Journal article; C1 Refereed article in a scholarly journal

Copyright notice

2015, Department of Statistics, Columbia University