Learning sparse latent representation and distance metric for image retrieval

Nguyen, Tu Dinh, Truyen, Tran, Phung, Dinh and Venkatesh, Svetha 2013, Learning sparse latent representation and distance metric for image retrieval, in ICME 2013 : Proceedings of the 14th IEEE International Conference on Multimedia and Expo, IEEE, Piscataway, N.J., pp. 1-6, doi: 10.1109/ICME.2013.6607435.

Title Learning sparse latent representation and distance metric for image retrieval
Author(s) Nguyen, Tu Dinh
Truyen, Tran (ORCID: orcid.org/0000-0001-6531-8907)
Phung, Dinh (ORCID: orcid.org/0000-0002-9977-8247)
Venkatesh, Svetha (ORCID: orcid.org/0000-0001-8675-6631)
Conference name Multimedia and Expo. IEEE International Conference (14th : 2013 : San Jose, California)
Conference location San Jose, California
Conference dates 15-19 Jul. 2013
Title of proceedings ICME 2013 : Proceedings of the 14th IEEE International Conference on Multimedia and Expo
Editor(s) [Unknown]
Publication date 2013
Conference series IEEE International Conference on Multimedia and Expo
Start page 1
End page 6
Total pages 6
Publisher IEEE
Place of publication Piscataway, N.J.
Keyword(s) image retrieval
restricted Boltzmann machines
metric learning
Summary The performance of image retrieval depends critically on the semantic representation and the distance function used to estimate the similarity of two images. A good representation should integrate multiple visual and textual (e.g., tag) features and offer a step closer to the true semantics of interest (e.g., concepts). As the distance function operates on the representation, the two are interdependent and thus should be addressed at the same time. We propose a probabilistic solution to learn both the representation from multiple feature types and modalities and the distance metric from data. The learning is regularised so that the learned representation and information-theoretic metric will (i) preserve the regularities of the visual/textual spaces, (ii) enhance structured sparsity, (iii) encourage small intra-concept distances, and (iv) keep inter-concept images separated. We demonstrate the capacity of our method on the NUS-WIDE data. For the well-studied 13-animal subset, our method outperforms state-of-the-art rivals. On the subset of single-concept images, we gain 79.5% improvement over the standard nearest-neighbours approach on the MAP score, and 45.7% on the NDCG.
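The summary reports gains on the MAP and NDCG scores. As background only (this is not the paper's code), a minimal sketch of how these two retrieval metrics are commonly computed from a ranked list of relevance labels:

```python
import math

def average_precision(relevances):
    """Average precision for one query: binary relevance labels in rank order.
    MAP is the mean of this value over all queries."""
    hits, score = 0, 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel:
            hits += 1
            score += hits / rank  # precision at each relevant position
    return score / hits if hits else 0.0

def ndcg(relevances, k=None):
    """Normalised discounted cumulative gain for one query: graded
    relevance labels in rank order, optionally truncated at rank k."""
    rels = relevances[:k] if k else relevances
    dcg = sum(r / math.log2(rank + 1) for rank, r in enumerate(rels, start=1))
    ideal = sorted(relevances, reverse=True)
    ideal = ideal[:k] if k else ideal
    idcg = sum(r / math.log2(rank + 1) for rank, r in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0
```

For example, a ranking that places relevant images at positions 1 and 3 out of 3 gets an average precision of (1/1 + 2/3) / 2 = 5/6; a ranking already in ideal order gets an NDCG of 1.0.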
ISBN 9781479900152
Language eng
DOI 10.1109/ICME.2013.6607435
Field of Research 080109 Pattern Recognition and Data Mining
Socio Economic Objective 970108 Expanding Knowledge in the Information and Computing Sciences
HERDC Research category E1 Full written paper - refereed
HERDC collection year 2013
Copyright notice ©2013, IEEE
Persistent URL http://hdl.handle.net/10536/DRO/DU:30057165

Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Citation counts: 0 in TR Web of Science; 8 in Scopus
Created: Wed, 23 Oct 2013, 10:02:51 EST

Every reasonable effort has been made to ensure that permission has been obtained for items included in DRO. If you believe that your rights have been infringed by this repository, please contact drosupport@deakin.edu.au.