Chinese Journal of Computers
  Title An sIB Algorithm for Automatically Determining Parameter
  Authors YE Yang-Dong 1), LIU Dong 1), JIA Li-Min 2), LI Gang 3)
  Address 1) (Department of Computer Science, School of Information Engineering, Zhengzhou University, Zhengzhou 450052)
2) (State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing 100044)
3) (School of Information Technology, Deakin University, 221 Burwood Highway, Vic 3125, Australia)
  Year 2007
  Issue No.6 (969-978)
  Abstract & Background
Abstract To address the problem of determining the compression-variable parameter of the sIB algorithm, this paper proposes AsIB, an sIB algorithm that determines the parameter automatically on the basis of the minimum description length principle. An efficient encoding scheme is designed to estimate two description lengths: that of the solution model produced by the sIB algorithm, and that of the original data given the model. The model with the minimum description length is then selected, which yields the number of feature patterns hidden in the dataset. Experimental results show that the encoding scheme in AsIB recovers the true feature patterns in the dataset without requiring the number of feature-pattern categories to be set in advance. AsIB removes the sIB algorithm's dependence on empirical knowledge, which widens its applicability in areas such as automatic dimension reduction and pattern extraction.

keywords IB theory; sIB algorithm; AsIB algorithm; minimum description length principle; model selection

background This work is supported by the National Natural Science Foundation of China under grant No.600332020, titled "Research on Intelligent Synthetical transportation Information System of High Railway and its Key Technologies", and by the Natural Science Foundation of Henan Province under grant No.0411012300, titled "Research on Application of Theories of Modeling of Hybrid System based on Distributed Agent". The research addresses the analysis of multi-source, multi-dimensional data and focuses on methods and technologies for analyzing very large datasets using IB theory. It has so far produced results in areas such as improving the accuracy of IB algorithms and introducing IB into image retrieval. By compressing one variable into a "bottleneck" variable while maximally preserving its relevance to another variable, IB theory effectively handles problems that many classical feature extraction methods cannot solve. As IB theory is applied more deeply across many areas, shortcomings of IB algorithms have emerged when the search space grows or changes. One of these problems is determining the parameter of the compression variable for IB algorithms. To determine this parameter, this paper proposes AsIB, an sIB algorithm that determines the parameter automatically on the basis of the minimum description length principle. An efficient encoding scheme is designed to estimate two description lengths: that of the solution model produced by the sIB algorithm, and that of the original data given the model. The model with the minimum description length is then selected, which yields the number of feature patterns hidden in the dataset. The proposed AsIB algorithm removes the sIB algorithm's dependence on empirical knowledge, and the results could widen its applicability in areas such as automatic dimension reduction and pattern extraction.
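
The model-selection idea summarized above, choosing the number of feature patterns whose model plus data-given-model description is shortest, can be illustrated with a small sketch. It is not the encoding scheme of the paper, which this page does not give. The sketch assumes a generic two-part code: log2(k) bits per row for the cluster assignment, (1/2) log2(N) bits per free parameter of each cluster distribution, and the smoothed negative log-likelihood for the data. A greedy agglomerative merge stands in for the sequential optimization of sIB. Without smoothing, the data cost equals N * H(Y|T), so lowering it corresponds to raising I(T;Y). All function names, the smoothing constant `alpha`, and the toy count matrix are hypothetical.

```python
import math


def data_bits(counts, assign, alpha=0.5):
    """L(D | M): bits to encode all counts with each cluster's smoothed p(y | t)."""
    n_cols = len(counts[0])
    totals = {}
    for row, t in zip(counts, assign):
        acc = totals.setdefault(t, [0.0] * n_cols)
        for j, c in enumerate(row):
            acc[j] += c
    bits = 0.0
    for acc in totals.values():
        denom = sum(acc) + alpha * n_cols
        for c in acc:
            if c:
                bits -= c * math.log2((c + alpha) / denom)
    return bits


def model_bits(counts, k):
    """L(M): assumed cost of the row-to-cluster map plus k cluster distributions."""
    n_rows, n_cols = len(counts), len(counts[0])
    total = sum(sum(row) for row in counts)
    assignment = n_rows * math.log2(k)
    parameters = 0.5 * k * (n_cols - 1) * math.log2(total)
    return assignment + parameters


def merge(assign, a, b):
    """Relabel every member of cluster b as cluster a."""
    return [a if t == b else t for t in assign]


def select_k(counts):
    """Merge greedily from n clusters down to 1; return the k with the fewest bits."""
    assign = list(range(len(counts)))  # start with one cluster per row
    lengths = {}
    while True:
        labels = sorted(set(assign))
        k = len(labels)
        lengths[k] = model_bits(counts, k) + data_bits(counts, assign)
        if k == 1:
            break
        pairs = [(a, b) for i, a in enumerate(labels) for b in labels[i + 1:]]
        # Stand-in for sIB: merge the pair that keeps the data cost lowest.
        a, b = min(pairs, key=lambda p: data_bits(counts, merge(assign, *p)))
        assign = merge(assign, a, b)
    return min(lengths, key=lengths.get), lengths


# Toy co-occurrence counts: 12 rows drawn from 3 planted feature patterns.
patterns = [[20, 20, 0, 0, 0, 0], [0, 0, 20, 20, 0, 0], [0, 0, 0, 0, 20, 20]]
counts = [list(p) for p in patterns for _ in range(4)]

best_k, lengths = select_k(counts)
print(best_k)  # prints 3
```

On this toy matrix the total description length is smallest at three clusters: fewer clusters inflate the data cost, while more clusters add model bits without shortening the data code.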