File(s) under permanent embargo
Context-based diversification for keyword queries over XML data
journal contribution
posted on 2015-03-01, 00:00 authored by Jianxin LiJianxin Li, Chengfei Liu, Jeffrey Xu YuWhile keyword query empowers ordinary users to search vast amount of data, the ambiguity of keyword query makes it difficult to effectively answer keyword queries, especially for short and vague keyword queries. To address this challenging problem, in this paper we propose an approach that automatically diversifies XML keyword search based on its different contexts in the XML data. Given a short and vague keyword query and XML data to be searched, we first derive keyword search candidates of the query by a simple feature selection model. And then, we design an effective XML keyword search diversification model to measure the quality of each candidate. After that, two efficient algorithms are proposed to incrementally compute top-k qualified query candidates as the diversified search intentions. Two selection criteria are targeted: the k selected query candidates are most relevant to the given query while they have to cover maximal number of distinct results. At last, a comprehensive evaluation on real and synthetic data sets demonstrates the effectiveness of our proposed diversification model and the efficiency of our algorithms.
History
Journal
IEEE transactions on knowledge and data engineeringVolume
27Issue
3Pagination
660 - 672Publisher
Institute of Electrical and Electronics EngineersLocation
Piscataway, N.J.Publisher DOI
ISSN
1041-4347Language
engPublication classification
C1.1 Refereed article in a scholarly journalCopyright notice
2014, IEEEUsage metrics
Categories
No categories selectedKeywords
Licence
Exports
RefWorks
BibTeX
Ref. manager
Endnote
DataCite
NLM
DC