¡¡Chinese Journal of Computers   Full Text
  TitleCSky: An Online Efficient Algorithm for Subspace Skyline Computation in High Dimensional Space
  AuthorsZHOU Hong-Fu GONG Xue-Qing ZHENG Kai ZHOU Ao-Ying
  Address(Department of Computer Science and Engineering, Fudan University, Shanghai 200433)
  Year2007
  IssueNo.8(1409¡ª1417)
  Abstract &
  Background
Abstract Skyline computation aims to find the points that are not dominated by any other point in the dataset. It has been becoming a hot topic due to its potential applications in real-time online services. Usually, such applications expect to return the first Skyline point quickly, without ransacking all the points. This paper focuses on the problem of progressive subspace Skyline queries in high dimensional space. To the best of our knowledge, the existing algorithms and their variations cannot be easily extended to support arbitrary subspace Skyline query efficiently. The BNL(Blocked Nested Loop) method can be used for subspace Skyline queries, but it is very inefficiently, and not progressive. A novel algorithm, called CSky(stands for Count the Skyline), is proposed in this paper to solve this problem. With CSky, the Skyline points can be rapidly obtained in any query subspace of a high dimensional space. It is an online algorithm based on a novel data structure called InvertS. The algorithm scans the dataset at most one pass, and the points are only compared with those detected Skyline points, thus resulting in a much smaller computation overhead in comparison with BNL. Furthermore, CSky not only can efficiently find the Skyline like other index-based algorithms, but also do it in any subspace of the whole query space. The theoretical analysis and experimental comparison show that the CSky algorithm outperforms BNL significantly in various kinds of application cases.

keywords Skyline; subspace; progressive algorithm; online algorithm

background This work is mainly supported by the National Natural Science Foundation of China under grant Nos.60496327 and 60496325. The problem addressed in the paper is from the field of Skyline computation. Nearly, a large number of Skyline algorithms are explored in order to efficiently evaluate the Skyline in the dataset with the fixed dimensions.
The purpose of this work aims to solve the problem of Skyline computation of any subspace of high dimensional space in order to develop a platform of multi-criteria decision making prototype system based on the Skyline view. The WebDB and P2P research group at Fudan University has produced fruitful results in the fields of data mining, Web service, business intelligence, similarity search, and security and trust management in P2P environment. This paper presents their recent research achievement in the study of the Skyline computation of high dimensional space.