¡¡Chinese Journal of Computers   Full Text
  TitleSubspace Searching Based Generalized Principal Component Analysis
  AuthorsCAO Yang LUO Yu-Pin YANG Shi-Yuan
  Address(Department of Automation, Tsinghua University, Beijing 100084)
  Year2007
  IssueNo.12(2151¡ª2155)
  Abstract &
  Background
Abstract GPCA(Generalized Principal Component Analysis) is a new clustering and dimensionality reduction algorithm. It classifies and represents data in some subspaces. GPCA is used to solve some computer vision problems such as image segmentation and face clustering. Original GPCA algorithm has an exponential computational complexity so that it cannot be applied to high-dimension data. A new SGPCA(Subspace Searching Based Generalized Principal Component Analysis) algorithm is proposed in this paper. Clustering problem is reduced to searching of orthogonal vectors of subspaces. It has a polynomial computational complexity because it searches every subspace one at a time. Experiments show that new method runs faster and more robust to noise than the original algorithm.

keywords principal component analysis; subspace segmentation; dimensionality reduction; minimization; local minimum

background Generalized Principal Component Analysis is a new clustering algorithm. It supposes that data points lie in some subspaces. GPCA is used to solve a lot of computer vision problems because image is often self-similarity. The processed image is often divided into small regions such as 5¡Á5 rectangles, which means that data has 25 dimensions. Original GPCA algorithm that has exponential computational complexity cannot deal with such high-dimension data. Previous researches use some dimensionality reduction algorithm such as PCA before using GPCA to reduce computational cost.
In this work, the authors address the issue of how to reduce computational cost. A new Subspace Searching Based Generalized Principal Component Analysis algorithm is proposed to solve the same problem. New algorithm searches subspaces in the data one by one so that it can solve the problem with polynomial computational complexity.