Deakin University
Matching top-k answers of twig patterns in probabilistic XML

conference contribution
posted on 2010-01-01, 00:00 authored by Bo Ning, Chengfei Liu, Jeffrey Xu Yu, Guoren Wang, Jianxin Li
The flexibility of the XML data model allows a more natural representation of uncertain data than the relational model does. Top-k matching of a twig pattern against probabilistic XML data is essential. Some classical twig pattern algorithms can be adapted to process probabilistic XML. However, when the goal is to find the answers with the top-k probabilities, the existing algorithms perform poorly, because they must process many unnecessary intermediate path results with small probabilities. To address this problem, we propose a new encoding scheme for probabilistic XML, called PEDewey. Based on this encoding scheme, we design two algorithms for finding the answers with the top-k probabilities for twig queries. The first, ProTJFast, processes probabilistic XML data using element streams in document order; the second, PTopKTwig, uses element streams ordered by path probability values. Experiments have been conducted to study the performance of these algorithms.
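To make the notion of "answers with the top-k probabilities" concrete, the following is a minimal sketch under strong simplifying assumptions, not the paper's method. It assumes each node carries the probability of existing given that its parent exists, takes a node's path probability to be the product of those values along its root-to-node path, and matches a single tag rather than a full twig pattern. The names `Node`, `path_probabilities`, and `top_k_paths` are hypothetical, and the sketch implements neither the PEDewey encoding nor ProTJFast or PTopKTwig.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    # Assumption: probability that this node exists given its parent exists.
    prob: float = 1.0
    children: list = field(default_factory=list)


def path_probabilities(root):
    """Yield (path_probability, path) for every node, in document order."""
    stack = [(root, root.prob, (root.tag,))]
    while stack:
        node, p, path = stack.pop()
        yield p, "/".join(path)
        # Push children in reverse so they are popped in document order.
        for child in reversed(node.children):
            stack.append((child, p * child.prob, path + (child.tag,)))


def top_k_paths(root, tag, k):
    """Return the k root-to-node paths ending in `tag` with the highest path probability."""
    matches = (
        (p, path)
        for p, path in path_probabilities(root)
        if path.rsplit("/", 1)[-1] == tag
    )
    return heapq.nlargest(k, matches)


# Illustrative document: three <c> elements with different path probabilities.
doc = Node("a", 1.0, [
    Node("b", 0.5, [Node("c", 0.8)]),
    Node("b", 0.9, [Node("c", 0.5)]),
    Node("c", 0.3),
])
best = top_k_paths(doc, "c", 2)
```

This sketch enumerates every candidate path before ranking, which is the cost the abstract attributes to adapted classical algorithms: low-probability intermediate results are still processed. Ordering the input by path probability, as the abstract describes for PTopKTwig, is aimed at avoiding that work.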

History

Event

Database Systems for Advanced Applications. Conference (15th : 2010 : Tsukuba, Japan)

Series

Database Systems for Advanced Applications Conference

Pagination

125 - 139

Publisher

Springer

Location

Tsukuba, Japan

Place of publication

Berlin, Germany

Start date

2010-04-01

End date

2010-04-04

ISBN-13

978-3-642-12025-1

Language

eng

Publication classification

E1.1 Full written paper - refereed

Copyright notice

2010, Springer-Verlag Berlin Heidelberg

Editor/Contributor(s)

H Kitagawa, Y Ishikawa, Q Li, C Watanabe

Title of proceedings

DASFAA : Proceedings of the 15th International Conference on Database Systems for Advanced Applications
