Deakin University

File(s) under permanent embargo

Suggestion of promising result types for XML keyword search

conference contribution
posted on 2010-01-01, 00:00 authored by Jianxin Li, Chengfei Liu, Rui Zhou, Wei Wang
Although keyword queries let inexperienced users search XML databases without specific knowledge of complex structured query languages or XML data schemas, the ambiguity of a keyword query may generate a large number of results that fall into different types. For users, each result type implies a possible search intention. To improve the performance of keyword queries, it is desirable to efficiently work out the most relevant result type from the data to be retrieved.

Several recent research works have addressed this problem by using data schema information or purely IR-style statistical information. However, the problem remains open because three requirements must be met: (1) the data to be retrieved may not contain schema information; (2) relevant result types should be computed efficiently before keyword query evaluation; (3) the correlation between a result type and a keyword query should be measured by analyzing the distribution of relevant values and structures within the data. To our knowledge, no existing work satisfies all three requirements together. To address the problem, we propose an estimation-based approach to compute the promising result types for a keyword query, which can help a user quickly narrow down to her specific information need. To speed up the computation, we designed new algorithms based on indexes built for this purpose. Finally, we present a set of experimental results that evaluate the proposed algorithms and show the potential of this work.
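
To make the notion of a "result type" concrete, the following is a minimal illustrative sketch and not the paper's algorithm. It assumes a result type can be approximated by a root-to-node label path, and it scores each type by the fraction of its nodes whose subtree contains every query keyword. It uses naive exact counting over the whole document rather than the index-based estimation the abstract describes, and the names `subtree_tokens`, `rank_result_types`, and `SAMPLE` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter


def subtree_tokens(node):
    """Lowercased word tokens appearing anywhere in the node's subtree."""
    text = " ".join(node.itertext())
    return set(re.findall(r"\w+", text.lower()))


def rank_result_types(xml_text, keywords):
    """Rank label paths by the fraction of their nodes covering all keywords.

    Assumption: a label path such as /bib/article stands in for a result type.
    """
    wanted = {k.lower() for k in keywords}
    total = Counter()     # label path -> number of nodes with that path
    covering = Counter()  # label path -> nodes whose subtree has every keyword

    def visit(node, path):
        path = path + "/" + node.tag
        total[path] += 1
        if wanted <= subtree_tokens(node):
            covering[path] += 1
        for child in node:
            visit(child, path)

    visit(ET.fromstring(xml_text), "")
    scores = {p: covering[p] / total[p] for p in covering}
    # Highest score first; ties broken alphabetically by path.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample data, invented for illustration only.
SAMPLE = """<bib>
  <article><author>Li</author><title>XML keyword search</title></article>
  <article><author>Wang</author><title>Graph indexing</title></article>
  <book><author>Li</author><title>XML databases</title></book>
</bib>"""

ranked = rank_result_types(SAMPLE, ["Li", "XML"])
for path, score in ranked:
    print(path, score)
```

In this toy scoring the document root always covers every keyword that occurs anywhere, so a practical ranking would need to discount overly general types; the sketch leaves that out for brevity.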

History

Event

Association for Computing Machinery. Conference (13th : 2010 : Lausanne, Switzerland)

Series

Association for Computing Machinery Conference

Pagination

561 - 572

Publisher

Association for Computing Machinery

Location

Lausanne, Switzerland

Place of publication

New York, N.Y.

Start date

2010-03-22

End date

2010-03-26

ISBN-13

978-1-60558-945-9

Language

eng

Publication classification

E1.1 Full written paper - refereed

Copyright notice

2010, ACM

Editor/Contributor(s)

I Manolescu, S Spaccapietra, J Teubner, M Kitsuregawa, A Léger, F Naumann, A Ailamaki, F Ozcan

Title of proceedings

EDBT 2010 : Proceedings of the 13th International Conference on Extending Database Technology
