File(s) under permanent embargo
Suggestion of promising result types for XML keyword search
conference contribution
posted on 2010-01-01, 00:00 authored by Jianxin Li, Chengfei Liu, Rui Zhou, Wei Wang

Although keyword queries let inexperienced users search XML databases without specific knowledge of complex structured query languages or XML data schemas, their ambiguity can produce a large number of results that fall into different types. For users, each result type implies a possible search intention. To improve the performance of keyword queries, it is desirable to efficiently work out the most relevant result type from the data to be retrieved.
Several recent works have addressed this problem using data schema information or pure IR-style statistical information. However, the problem remains open because of three requirements: (1) the data to be retrieved may not contain schema information; (2) relevant result types should be computed efficiently before keyword query evaluation; (3) the correlation between a result type and a keyword query should be measured by analyzing the distribution of relevant values and structures within the data. To our knowledge, no existing work satisfies all three requirements together. To address the problem, we propose an estimation-based approach to compute the promising result types for a keyword query, which can help a user quickly narrow down to her specific information need. To speed up the computation, we design new algorithms based on the indexes to be built. Finally, we present a set of experimental results that evaluate the proposed algorithms and show the potential of this work.
Event
Association for Computing Machinery. Conference (13th : 2010 : Lausanne, Switzerland)
Series
Association for Computing Machinery Conference
Pagination
561-572
Publisher
Association for Computing Machinery
Location
Lausanne, Switzerland
Place of publication
New York, N.Y.
Start date
2010-03-22
End date
2010-03-26
ISBN-13
978-1-60558-945-9
Language
eng
Publication classification
E1.1 Full written paper - refereed
Copyright notice
2010, ACM
Editor/Contributor(s)
I Manolescu, S Spaccapietra, J Teubner, M Kitsuregawa, A Léger, F Naumann, A Ailamaki, F Ozcan
Title of proceedings
EDBT 2010 : Proceedings of the 13th International Conference on Extending Database Technology