| | Chinese Journal of Computers Full Text |
| Title | Inconsistency Measure of Database with Similarity Relation |
| Authors | ZHANG Wei-Gang1),2) PAN Quan2) ZHANG Hong-Cai2) |
| Address | 1)(College of Aeronautical Mechanics and Avionics Engineering, Civil Aviation University of China, Tianjin 300300) 2)(School of Automation, Northwestern Polytechnical University, Xi'an 710072) |
| Year | 2008 |
| Issue | No.1(91-103) |
| Abstract & Background | Abstract: As one of the criteria for selecting and appraising data mining algorithms, the inconsistency measure of a database has received much attention in classification rule discovery. However, the classical measure based on information entropy no longer meets these needs as research on incomplete databases advances, because the equivalence relation it requires may not hold in that setting. This paper gives a method for constructing information granularity with belief and plausibility measures, based on the similarity relation and evidence theory. Inconsistency measures analogous to the inconsistency and confusion degrees of fuzzy entropy are also proposed, and some of their properties are proved. The proofs and simulation show that the proposed method describes inconsistency in incomplete databases well and that, when no data are missing, it gives the same result as previous studies. Keywords: similarity relation; inconsistency; incomplete database; evidence theory; fuzzy entropy. Background: This work is partially supported by the National Natural Science Foundation of China under grant No.60172037 and the Science Research Foundation of Civil Aviation University of China under grant No.05qd08q. In data mining and machine learning, how to appraise an algorithm in theoretical research and how to select a proper one for a practical application are crucial questions. The inconsistency measure, one of the most important characteristics of a database for learning classification rules, is widely used in these fields. However, most traditional measures are based on rough entropy theory, which requires that no data be missing. Since real-life databases are often incomplete, and learning from such databases has received much attention in recent years, estimating inconsistency in incomplete databases is significant. 
In this paper, the authors address this problem in two steps: matching missing data and formulating the measures. In the first step, in place of the equivalence relation that underlies the traditional rough-entropy measure, the authors use two different similarity relations, which are based on evidence theory and widely used in fuzzy information queries, to reflect optimistic and conservative attitudes toward missing data. In the second step, the authors extend the traditional measure with these two similarity relations and give two measures by means of evidence theory and fuzzy entropy theory. The measures can be seen as a natural extension of the traditional rough-entropy measure with evidence theory and fuzzy entropy; they are mathematically well founded and intuitively understandable. The difference between the two measures also offers a way to estimate the influence of different learning algorithms that a single measure cannot provide, and it reflects the essence of missing data. The measures may help in evaluating and selecting learning algorithms, and can also be used to adapt entropy-based learning algorithms to incomplete databases. |
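
The idea of replacing the equivalence relation with a similarity relation, and of reading belief and plausibility off the resulting similarity classes, can be illustrated with a minimal sketch. This is not the authors' formulation: the matching rule (a missing value is compatible with any value), the normalization by the number of records, and all names (`similar`, `similarity_classes`, `belief_plausibility`, the toy `rows` and `labels`) are assumptions made here for illustration only. Belief is taken as the fraction of records whose similarity class lies entirely inside a decision class, and plausibility as the fraction whose similarity class overlaps it.

```python
from typing import List, Optional, Sequence, Tuple

Row = Sequence[Optional[str]]  # None marks a missing attribute value


def similar(a: Row, b: Row) -> bool:
    """Assumed matching rule: a missing value is compatible with anything."""
    return all(x is None or y is None or x == y for x, y in zip(a, b))


def similarity_classes(rows: List[Row]) -> List[frozenset]:
    """For each record, the set of record indices it is similar to."""
    return [
        frozenset(j for j, other in enumerate(rows) if similar(row, other))
        for row in rows
    ]


def belief_plausibility(
    rows: List[Row], labels: List[str], target: str
) -> Tuple[float, float]:
    """Belief and plausibility of the decision class `target`.

    Belief counts records whose similarity class is contained in the
    decision class; plausibility counts records whose similarity class
    intersects it. Both are normalized by the number of records.
    """
    members = {i for i, lab in enumerate(labels) if lab == target}
    classes = similarity_classes(rows)
    n = len(rows)
    bel = sum(1 for c in classes if c <= members) / n
    pl = sum(1 for c in classes if c & members) / n
    return bel, pl


# Toy incomplete table (hypothetical data): the second record lacks
# its second attribute, so it matches the first record.
rows = [("a", "x"), ("a", None), ("b", "y"), ("b", "y")]
labels = ["yes", "no", "yes", "yes"]

bel_yes, pl_yes = belief_plausibility(rows, labels, "yes")
```

The gap between plausibility and belief grows as missing values make records with different decisions indistinguishable. When no values are missing, the matching rule above reduces to equality on all attributes, so the similarity classes become ordinary equivalence classes, which mirrors the abstract's remark that the complete-data case agrees with earlier measures.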