Chinese Journal of Computers
  Title: Efficiently Processing XML Path Queries Using Automata
  Authors: WANG Guo-Ren, YU Yong-Qian, SUN Bing
  Address: School of Information Science and Engineering, Northeastern University, Shenyang 110004
  Year: 2007
  Issue: No. 9 (1520-1532)
  Abstract & Background
Abstract: In XML query processing, the path expressions used by most XML query languages are a powerful means of locating and querying XML data and its structural relationships. Because XML data is semi-structured, processing path expression queries brings new characteristics and challenges compared with traditional database query processing. Several techniques have been proposed for processing path queries; however, when they are applied to large-scale XML documents and complicated path expressions, their performance degrades dramatically. This paper proposes SAM, a highly efficient method for processing XML path expression queries based on automata. Its basic idea is to transform a path expression query into an equivalent automaton and to match that automaton against the schema paths abstracted from the XML document. This paper also presents an approach to computing the "//" operation based on the SAM method. The experimental results show that SAM is an efficient and practical method for computing complicated path expression queries on large-scale XML documents.

Keywords: XML path expression; automata; query processing

Background: Path expressions are the core of most XML query languages, and many methods have recently been proposed for computing them. However, the existing methods are not efficient for complex path expressions, and especially not for long ones. To solve this problem, this paper proposes SAM, an efficient method for computing XML path expression queries based on automata. To compute path expressions containing the "//" operation, this paper proposes a schema automata method that rewrites such path expressions. The experimental results given in this paper show that the proposed method is very efficient for computing long path expressions and complex path expression queries.
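
The general idea of matching a path expression against schema paths with an automaton can be illustrated with a minimal sketch. This is not the SAM algorithm from the paper, whose details are not given here. It assumes a simplified path syntax with only "/", "//", and "*" steps, and it assumes schema paths are given as plain strings of element names. The function names (parse_path, matches, select) and the sample element names are hypothetical.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (axis, name): axis is "/" or "//", name may be "*"


def parse_path(expr: str) -> List[Step]:
    """Split an expression such as '/a//b/*' into (axis, name) steps."""
    steps: List[Step] = []
    i = 0
    while i < len(expr):
        if expr.startswith("//", i):
            axis, i = "//", i + 2
        elif expr[i] == "/":
            axis, i = "/", i + 1
        else:
            raise ValueError("expected '/' or '//' at position %d" % i)
        j = i
        while j < len(expr) and expr[j] != "/":
            j += 1
        if j == i:
            raise ValueError("empty step at position %d" % i)
        steps.append((axis, expr[i:j]))
        i = j
    return steps


def matches(steps: List[Step], labels: List[str]) -> bool:
    """Simulate a nondeterministic automaton over one schema path.

    State k means that the first k steps have been matched. A '//' step
    adds a self-loop on its source state, so any number of intermediate
    elements can be skipped before the step's name is matched.
    """
    final = len(steps)
    states = {0}
    for label in labels:
        nxt = set()
        for k in states:
            if k == final:
                continue  # the path continues below the target node
            axis, name = steps[k]
            if axis == "//":
                nxt.add(k)  # stay in place, consuming any label
            if name == "*" or name == label:
                nxt.add(k + 1)
        states = nxt
        if not states:
            return False
    return final in states


def select(expr: str, schema_paths: List[str]) -> List[str]:
    """Return the schema paths accepted by the automaton built from expr."""
    steps = parse_path(expr)
    return [p for p in schema_paths if matches(steps, p.strip("/").split("/"))]


# Hypothetical schema paths, for illustration only.
schema = [
    "/site/people/person/name",
    "/site/regions/asia/item/name",
    "/site/people/person",
    "/site/regions/asia/item",
]
print(select("/site//name", schema))
```

In this sketch the query is evaluated against the (usually small) set of distinct schema paths rather than against every node in the document, which is why schema-level matching can be attractive for long path expressions.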