Although tagging has become increasingly popular in online image and video sharing systems, tags are known to be noisy, ambiguous, incomplete and subjective. These factors can seriously degrade the precision of a social tag-based web retrieval system, so improving the precision of such systems has become an increasingly important research topic. To this end, we propose a shared subspace learning framework that leverages a secondary source to improve retrieval performance on a primary dataset. This is achieved by learning a shared subspace between the two sources under a joint Nonnegative Matrix Factorization in which the level of subspace sharing can be explicitly controlled. We derive an efficient algorithm for learning the factorization, analyze its complexity, and provide a proof of convergence. We validate the framework on image and video retrieval tasks in which tags from the LabelMe dataset are used to improve image retrieval performance on a Flickr dataset and video retrieval performance on a YouTube dataset. This has implications for how to exploit and transfer knowledge from readily available auxiliary tagging resources to improve another social web retrieval system. Our shared subspace learning framework is applicable to a range of problems where one needs to exploit the complementary strengths of multiple heterogeneous datasets.
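The idea of a joint factorization with a controllable amount of sharing can be illustrated with a minimal sketch. This is not the algorithm derived in the paper: the function name `joint_nmf`, the weight `lam`, the squared Frobenius objective, and the multiplicative updates below are assumptions chosen for illustration. Two nonnegative matrices `X` and `Y` with the same rows (for example, a common tag vocabulary) are factorized as `X ≈ [W U] H` and `Y ≈ [W V] L`, where `W` holds the shared basis and `k_shared` sets how many basis vectors are shared.

```python
import numpy as np

def joint_nmf(X, Y, k_shared, k_x, k_y, lam=1.0, n_iter=200, seed=0, eps=1e-9):
    """Illustrative joint NMF: X ~ [W U] H and Y ~ [W V] L with shared basis W.

    Minimizes ||X - [W U] H||_F^2 + lam * ||Y - [W V] L||_F^2 using
    Lee-Seung style multiplicative updates (an assumed, generic scheme).
    """
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    W = rng.random((m, k_shared))   # basis shared by both sources
    U = rng.random((m, k_x))        # basis specific to X
    V = rng.random((m, k_y))        # basis specific to Y
    H = rng.random((k_shared + k_x, X.shape[1]))
    L = rng.random((k_shared + k_y, Y.shape[1]))
    history = []
    for _ in range(n_iter):
        A = np.hstack([W, U])
        B = np.hstack([W, V])
        # Update the coefficient matrices with the bases held fixed.
        H *= (A.T @ X) / (A.T @ A @ H + eps)
        L *= (B.T @ Y) / (B.T @ B @ L + eps)
        Hw, Hu = H[:k_shared], H[k_shared:]
        Lw, Lv = L[:k_shared], L[k_shared:]
        AH, BL = A @ H, B @ L
        # Update the bases; W receives contributions from both sources.
        W_new = W * (X @ Hw.T + lam * (Y @ Lw.T)) / (AH @ Hw.T + lam * (BL @ Lw.T) + eps)
        U_new = U * (X @ Hu.T) / (AH @ Hu.T + eps)
        V_new = V * (Y @ Lv.T) / (BL @ Lv.T + eps)
        W, U, V = W_new, U_new, V_new
        # Record the objective value after this iteration.
        loss = (np.linalg.norm(X - np.hstack([W, U]) @ H) ** 2
                + lam * np.linalg.norm(Y - np.hstack([W, V]) @ L) ** 2)
        history.append(loss)
    return W, U, V, H, L, history

# Toy usage on random nonnegative data (synthetic, not from the paper).
rng = np.random.default_rng(1)
X = rng.random((20, 30))
Y = rng.random((20, 25))
W, U, V, H, L, history = joint_nmf(X, Y, k_shared=3, k_x=2, k_y=2)
```

In this sketch, raising `k_shared` relative to `k_x` and `k_y` forces more of the structure to be explained jointly, while `k_shared=0` would reduce to two independent factorizations.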
Pagination
1169-1178
Location
Washington, D. C.
Start date
2010-07-25
End date
2010-07-28
ISBN-13
9781450300551
Language
eng
Publication classification
E1.1 Full written paper - refereed
Copyright notice
2010, IEEE
Title of proceedings
KDD 2010 : Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Event
International Conference on Knowledge Discovery and Data Mining (16th : 2010 : Washington, D. C.)