File(s) under permanent embargo
Top-k keyword search over probabilistic XML data
conference contribution
posted on 2011-01-01, 00:00 authored by Jianxin Li, Chengfei Liu, Rui Zhou, Wei Wang
Despite the proliferation of work on XML keyword queries, supporting keyword queries over probabilistic XML data remains an open problem. Compared with traditional keyword search, answering a keyword query over probabilistic XML data is far more expensive because possible-world semantics must be considered. In this paper, we first define the new problem of top-k keyword search over probabilistic XML data, which is to retrieve the k SLCA results with the k highest probabilities of existence. We then propose two efficient algorithms. The first, PrStack, finds the k SLCA results with the highest probabilities by scanning the relevant keyword nodes only once. To further improve efficiency, we propose a second algorithm, EagerTopK, based on a set of pruning properties that can quickly prune unsatisfied SLCA candidates. Finally, we implement the two algorithms and compare their performance through an analysis of extensive experimental results.
Event
IEEE Computer Society. Conference (27th : 2011 : Hannover, Germany)
Series
IEEE Computer Society Conference
Pagination
673 - 684
Publisher
Institute of Electrical and Electronics Engineers
Location
Hannover, Germany
Place of publication
Piscataway, N.J.
Start date
2011-04-11
End date
2011-04-16
ISBN-13
978-1-4244-8959-6
Language
eng
Publication classification
E1.1 Full written paper - refereed
Copyright notice
2011, IEEE
Editor/Contributor(s)
[Unknown]
Title of proceedings
ICDE 2011 : Proceedings of the 27th International Conference on Data Engineering