| | Chinese Journal of Computers Full Text |
| Title | Complex Query Processing in Large-Scale Distributed System |
| Authors | ZHOU Ao-Ying 1),2), ZHOU Min-Qi 2), QIAN Wei-Ning 1), ZHANG Rong 2) |
| Address | 1)(Institute of Massive Computing, East China Normal University, Shanghai 200062) 2)(Department of Computer Science and Engineering, Fudan University, Shanghai 200433) |
| Year | 2008 |
| Issue | No. 9 (1563–1572) |
| Abstract & Background | Abstract: Complex query processing in large-scale distributed systems is an important problem in bringing peer-to-peer techniques into applications. It has attracted much attention from both the academic and industrial communities. This paper presents GChord, a generalized Chord-like technique for evaluating multi-attribute queries scalably and efficiently. GChord supports not only exact-match queries but also range queries. Its advantage over existing methods is that each tuple is encoded and indexed only once, while query efficiency is still guaranteed. Thus, index maintenance cost and search efficiency are balanced. Additional optimization techniques further improve the performance of GChord. Extensive experiments are conducted to validate the efficiency of the proposed method. Keywords: multi-attribute query processing; overlay network; distributed system. Background: Peer-to-peer (P2P) systems provide a new paradigm for information sharing in large-scale distributed environments. Although the success of file-sharing applications has proved the potential of P2P-based systems, the limited query operators supported by existing systems prevent their use in more advanced applications. Much effort has been devoted to providing fully featured database query processing in P2P systems. Query processing for file sharing differs from database query processing in several ways. First, the data types in databases are much more complex than file names; at a minimum, numerical and categorical data types should be supported. Second, files are searched via keywords, and keyword search is often implemented as an exact-match query. For numerical data types, however, both exact-match queries (or point queries) and range queries should be supported. Last but not least, in database applications users may issue queries with constraints on a varying number of attributes. 
This last requirement poses additional challenges for database-style query processing in P2P systems. Supporting complex query processing in a large-scale distributed environment is therefore an important problem in massive data processing. |