Deakin University

File(s) under permanent embargo

A matrix factorization framework for jointly analyzing multiple nonnegative data sources

conference contribution
posted on 2011-01-01, 00:00 authored by Sunil Gupta, Quoc-Dinh Phung, B Adams, Svetha Venkatesh
Methods based on nonnegative matrix factorization provide one of the simplest and most effective approaches to text mining. However, their applicability is mainly limited to analyzing a single data source. In this paper, we propose a novel joint matrix factorization framework that jointly analyzes multiple data sources by exploiting their shared and individual structures. The proposed framework is flexible enough to handle the arbitrary sharing configurations encountered in real-world data. We derive an efficient algorithm for learning the factorization and show that its convergence is theoretically guaranteed. We demonstrate the utility and effectiveness of the proposed framework in two real-world applications: improving social media retrieval using auxiliary sources, and cross-social media retrieval. Representing each social media source by its textual tags, we show that, for both applications, retrieval performance exceeds that of existing state-of-the-art techniques. The proposed solution provides a generic framework and is applicable to a wider context in data mining, wherever one needs to exploit mutual and individual knowledge present across multiple data sources.
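
The abstract does not state the update rules, so the following is only an illustrative sketch of the general idea of shared and individual structure, not the authors' algorithm. It assumes two nonnegative sources X1 and X2 defined over the same rows (for example, a common tag vocabulary), one basis W shared by both, one individual basis per source (V1, V2), a squared-error objective, and standard multiplicative updates. The function name joint_nmf and all parameter names are hypothetical.

```python
import numpy as np


def joint_nmf(X1, X2, k_shared, k1, k2, n_iter=200, eps=1e-9, seed=0):
    """Sketch: X1 ~ [W | V1] @ H1 and X2 ~ [W | V2] @ H2, all factors >= 0.

    W is shared by both sources; V1 and V2 are source-specific.
    This is an assumed two-source configuration, not the paper's general one.
    """
    rng = np.random.default_rng(seed)
    m = X1.shape[0]
    if X2.shape[0] != m:
        raise ValueError("both sources must share the same row dimension")

    # Random nonnegative initialisation of all factors.
    W = rng.random((m, k_shared))
    V1 = rng.random((m, k1))
    V2 = rng.random((m, k2))
    H1 = rng.random((k_shared + k1, X1.shape[1]))
    H2 = rng.random((k_shared + k2, X2.shape[1]))

    losses = []
    for _ in range(n_iter):
        # Update the encodings with the bases held fixed.
        B1, B2 = np.hstack([W, V1]), np.hstack([W, V2])
        H1 *= (B1.T @ X1) / (B1.T @ B1 @ H1 + eps)
        H2 *= (B2.T @ X2) / (B2.T @ B2 @ H2 + eps)

        # Update the shared basis using both sources at once.
        S1, S2 = H1[:k_shared], H2[:k_shared]
        R1, R2 = B1 @ H1, B2 @ H2
        W *= (X1 @ S1.T + X2 @ S2.T) / (R1 @ S1.T + R2 @ S2.T + eps)

        # Update each individual basis using only its own source.
        I1, I2 = H1[k_shared:], H2[k_shared:]
        R1 = np.hstack([W, V1]) @ H1
        V1 *= (X1 @ I1.T) / (R1 @ I1.T + eps)
        R2 = np.hstack([W, V2]) @ H2
        V2 *= (X2 @ I2.T) / (R2 @ I2.T + eps)

        # Track the total squared reconstruction error.
        E1 = X1 - np.hstack([W, V1]) @ H1
        E2 = X2 - np.hstack([W, V2]) @ H2
        losses.append(float(np.sum(E1 ** 2) + np.sum(E2 ** 2)))

    return W, V1, V2, H1, H2, losses
```

In this sketch the shared basis W is the only quantity that sees both sources, which is what lets an auxiliary source influence the representation of the primary one. Handling more than two sources, or bases shared by only a subset of sources, would need a more general bookkeeping of which basis blocks belong to which source.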

Event

Workshop on Text Mining (9th : 2011 : Mesa, Ariz.)

Pagination

6 - 15

Publisher

Society for Industrial and Applied Mathematics

Location

Mesa, Ariz.

Place of publication

[Mesa, Ariz.]

Start date

2011-04-30

Language

eng

Publication classification

E1.1 Full written paper - refereed

Title of proceedings

Proceedings of the 9th Workshop on Text Mining, in conjunction with the 11th SIAM International Conference on Data Mining
