计算机学报

	Chinese Journal of Computers Full Text
Title	Object Classification Based on Latent Local Spatial Relations Learning
Authors	HAN Dong-Feng LI Wen-Hui GUO Wu
Address	(College of Computer Science and Technology, Jilin University, Changchun 130012) (Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012)
Year	2007
Issue	No.8 (1286-1294)
Abstract & Background	Abstract: The Latent Local Spatial Relations (LLSR) model is presented as a novel technique for learning spatial models for visual object classification. LLSR combines a latent local spatial relations model with statistical visual words and variational expectation maximization to yield an object classification algorithm. Learning is unsupervised and captures spatial relations and visual-word appearance simultaneously. In contrast to methods that specify an explicit parameterized spatial model, the proposed algorithm uses a latent class model to reveal latent spatial relations. The advantages of the proposed model are: (1) it uses an unsupervised learning paradigm, which avoids some manual intervention; (2) it is robust to some geometric transformations; (3) it is a dense model; (4) the spatial relations are latent, which gives more insight into the object structure. Experiments on several standard databases show that LLSR is a promising model for object classification, especially under translation, rotation, scale change, affine transformation and partial occlusion.
Keywords: object classification; latent local spatial relations; graph model; variational expectation maximization; local interest points
Background: The problem of image classification is to categorize images using image features and learning methods, with the goal of correctly classifying unseen objects. Although this ability is easy and natural for humans, it remains difficult for computers. A rich palette of diverse ideas has been proposed during the past few years. One of the most difficult problems is how to combine statistical information with spatial information so that the algorithm is invariant to translation, rotation, scale change, affine transformation and occlusion.
In this paper, the authors focus on developing statistical methods that learn object spatial and appearance models from training examples using an unsupervised learning algorithm. The LLSR model is proposed by analyzing latent local spatial relations. Local counts are used to construct the relations between neighbouring regions, so the representation contains both the relations of local regions and the statistics of visual words. LLSR is a hierarchical probabilistic model whose parameters are learnt using variational expectation maximization; variational methods are promising for large-scale problems. Experiments show that the proposed method captures spatial relations and statistical information simultaneously and is robust to translation, rotation, scale change, affine transformation and partial occlusion. This research is supported by the National Natural Science Foundation of China (grant Nos.50338030, 60573182), the Doctor Foundation of China (grant No.20060183042) and the Jilin Science Foundation (grant Nos.20040531, 20060527). The research group has worked in computer vision for many years and has obtained results in related fields. The group has published more than sixty papers in international journals and conferences, of which more than twenty are indexed by SCI/SCIE and more than thirty by EI.
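The abstract does not spell out how the local counts are computed, so the following is only an illustrative sketch of one plausible reading, not the authors' construction. For each interest point it counts the visual words found among its k nearest neighbours. The function name `local_word_counts`, the parameter `k`, and the use of nearest neighbours to define a "neighbourhood region" are all assumptions made for this example.

```python
from collections import Counter
from math import dist


def local_word_counts(points, words, vocab_size, k=3):
    """Hypothetical helper: per-point histogram of neighbouring visual words.

    points     -- list of (x, y) interest-point locations
    words      -- visual-word index assigned to each point (0 .. vocab_size-1)
    vocab_size -- number of visual words in the vocabulary
    k          -- neighbourhood size (an assumption; the abstract gives none)
    """
    histograms = []
    for i, p in enumerate(points):
        # Rank all other points by Euclidean distance to p and keep the k closest.
        neighbours = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )[:k]
        # Count how often each visual word occurs in the neighbourhood.
        counts = Counter(words[j] for j in neighbours)
        histograms.append([counts.get(w, 0) for w in range(vocab_size)])
    return histograms


# Toy data: four interest points carrying three distinct visual words.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
words = [0, 1, 1, 2]
hists = local_word_counts(points, words, vocab_size=3, k=2)

# Shifting every point by the same offset leaves the neighbour sets unchanged.
shifted = [(x + 10.0, y - 3.0) for x, y in points]
hists_shifted = local_word_counts(shifted, words, vocab_size=3, k=2)
```

Because the neighbourhoods here depend only on relative distances, the counts are unchanged by translation, rotation and uniform scaling of the point set, which is in the spirit of the invariances the abstract claims. In the paper's pipeline, such counts would then feed a latent class model fitted by variational expectation maximization, which this sketch does not attempt.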