Top-k keyword search over probabilistic XML data

Li, Jianxin; Liu, Chengfei; Zhou, Rui; Wang, Wei

conference contribution

posted on 2011-01-01, 00:00 authored by Jianxin Li, Chengfei Liu, Rui Zhou, Wei Wang

Despite the proliferation of work on XML keyword queries, supporting keyword queries over probabilistic XML data remains an open problem. Compared with traditional keyword search, answering a keyword query over probabilistic XML data is far more expensive because possible world semantics must be taken into account. In this paper, we first define the new problem of top-k keyword search over probabilistic XML data, which is to retrieve the k SLCA (smallest lowest common ancestor) results with the k highest probabilities of existence. We then propose two efficient algorithms. The first, PrStack, finds the k SLCA results with the k highest probabilities by scanning the relevant keyword nodes only once. To further improve efficiency, the second algorithm, EagerTopK, relies on a set of pruning properties that quickly prune unsatisfied SLCA candidates. Finally, we implement both algorithms and compare their performance through an analysis of extensive experimental results.
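To make the problem statement concrete, the following is a minimal brute-force sketch of top-k SLCA search under possible world semantics. It is not PrStack or EagerTopK, which the abstract only names. It enumerates every possible world, which is exactly the exponential cost those algorithms are designed to avoid. The data model is an assumption: each node is taken to exist independently with a given probability conditioned on its parent existing, which may be simpler than the model used in the paper. The document `DOC`, its node names, and the functions `possible_worlds`, `slcas`, and `top_k_slca` are all hypothetical.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy document: node -> (parent, P(exists | parent exists), keywords).
# Assumption: parents are listed before their children.
DOC = {
    "r":  (None, 1.0, set()),
    "a":  ("r", 1.0, set()),
    "a1": ("a", 0.5, {"xml"}),
    "a2": ("a", 0.8, {"search"}),
    "b":  ("r", 0.6, {"xml", "search"}),
}


def possible_worlds(doc):
    """Yield (existing_nodes, probability) for every outcome of the coin flips."""
    nodes = list(doc)
    for flips in product([True, False], repeat=len(nodes)):
        prob, alive = 1.0, set()
        for node, kept in zip(nodes, flips):
            parent, p, _ = doc[node]
            prob *= p if kept else 1.0 - p
            # A node exists only if its own flip succeeds and its parent exists.
            if kept and (parent is None or parent in alive):
                alive.add(node)
        if prob > 0.0:
            yield alive, prob


def slcas(doc, alive, query):
    """Return the SLCA nodes for `query` in one deterministic world."""
    covered = {n: set(doc[n][2]) for n in alive}
    # Visit descendants before ancestors so keyword sets propagate upward.
    for n in reversed([n for n in doc if n in alive]):
        parent = doc[n][0]
        if parent is not None:
            covered[parent] |= covered[n]
    full = {n for n in alive if query <= covered[n]}
    # Discard every node that has a descendant also covering the whole query.
    has_full_descendant = set()
    for n in full:
        anc = doc[n][0]
        while anc is not None:
            has_full_descendant.add(anc)
            anc = doc[anc][0]
    return full - has_full_descendant


def top_k_slca(doc, query, k):
    """Sum world probabilities per SLCA node and keep the k most probable."""
    score = defaultdict(float)
    for alive, prob in possible_worlds(doc):
        for n in slcas(doc, alive, query):
            score[n] += prob
    return sorted(score.items(), key=lambda kv: -kv[1])[:k]


result = top_k_slca(DOC, {"xml", "search"}, k=2)
```

In this toy document, node `b` is an SLCA whenever it exists, and node `a` is an SLCA whenever both `a1` and `a2` exist, so `result` ranks `b` ahead of `a`. The root `r` never qualifies because any world in which it covers both keywords also contains a descendant that does.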

Pagination

673-684

Location

Hannover, Germany

Publisher DOI

https://doi.org/10.1109/ICDE.2011.5767875

Start date

2011-04-11

End date

2011-04-16

ISBN-13

978-1-4244-8959-6

Language

eng

Publication classification

E1.1 Full written paper - refereed

Copyright notice

2011, IEEE

Editor/Contributor(s)

[Unknown]

Title of proceedings

ICDE 2011 : Proceedings of the 27th International Conference on Data Engineering

Event

IEEE Computer Society. Conference (27th : 2011 : Hannover, Germany)

Publisher

Institute of Electrical and Electronics Engineers

Place of publication

Piscataway, N.J.

Series

IEEE Computer Society Conference

Keywords

XML; Probabilistic logic; Semantics; Equations; Encoding; Mathematical model
