Chinese Journal of Computers   Full Text
  Title: Person Name Disambiguation of Searching Results Using Social Network
  Authors: LANG Jun, QIN Bing, SONG Wei, LIU Long, LIU Ting, LI Sheng
  Address: Information Retrieval Laboratory, School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001
  Year: 2009
  Issue: No.7 (1365-1374)
  Abstract & Background
Abstract Person names are highly ambiguous, so the results of searching for a person name are usually a mixture of pages about different namesakes. This paper presents a novel approach that leverages the fact that each namesake has a unique social community. First, the social network of the queried person name is found and extended using the co-occurrence of person names in the snippets returned by a search engine. The network is then automatically clustered into different social communities by an algorithm that combines spectral partition and modularity evaluation. Finally, the search results are clustered into groups, each containing the pages that refer to the same individual. Experimental results on a corpus of Chinese person names show that the overall performance reaches a high level, and that the graph clustering algorithm improves disambiguation by further dividing the connected social network.

Keywords social network; name disambiguation; spectral partition; modularity

Background Searching for information about people is one of the major activities of Internet users. In the real world, however, it is very common for many people to share one name. As a result, the results of searching for a person name are usually a mixture of pages about the namesakes. Although some systems can cluster search results, they treat the person name as a general term, and the labels generated for the clusters are common terms. These systems do not directly distinguish the search results of different namesakes. Person name disambiguation of search results can exploit the context of person names and adopt methods similar to those used for word sense disambiguation. The common approaches take the returned snippets or the corresponding Web pages, extract key context phrases to build a vector space model, and then use vector similarity to cluster the search results.
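The vector-similarity step of these common approaches can be illustrated with a minimal sketch. It assumes each snippet has already been reduced to a list of context terms; the function name `cosine_similarity` and the sample terms are hypothetical and are not taken from any system discussed here.

```python
import math
from collections import Counter


def cosine_similarity(terms_a, terms_b):
    """Cosine similarity between two bag-of-words term-frequency vectors."""
    vec_a, vec_b = Counter(terms_a), Counter(terms_b)
    # Only terms present in both snippets contribute to the dot product.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical context terms from three snippets returned for the same name.
snippet_1 = ["actor", "film", "television"]
snippet_2 = ["actor", "film", "award"]
snippet_3 = ["minister", "government", "policy"]

sim_same = cosine_similarity(snippet_1, snippet_2)   # shares two of three terms
sim_diff = cosine_similarity(snippet_1, snippet_3)   # shares no terms
```

A clustering step built on such scores still needs a similarity threshold or a cluster count chosen by hand.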
A better solution is to extract personal information records, such as gender, nation, native place, birth date, family ties, home address, and position, and to use them to calculate the similarity between people and judge their identity. For person name disambiguation of search results, text-clustering approaches take many useless words into account and require a manually set threshold or number of classes. The approach based on personal information extraction and person similarity depends heavily on the quality of that extraction, and extraction errors of any kind easily cause cascading errors. To solve these problems, this paper proposes using social networks for person name disambiguation of search results. The method is based on the saying "Birds of a feather flock together": different people with the same name have distinct social networks of their own. For example, the social network of the "Wang Gang" in entertainment circles differs significantly from that of the "Wang Gang" in political circles. This paper exploits the social networks hidden in the search results for person name disambiguation. It targets the search results of Chinese person names, and combines spectral partition and modularity evaluation to cluster the network automatically into different social communities.
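The overall idea can be sketched as follows. This is an illustration under stated assumptions, not the algorithm of the paper: it assumes person names have already been extracted from each snippet (with the queried name removed), uses the sign of the Laplacian Fiedler vector as the spectral partition, and keeps a split only when Newman's modularity Q increases. All function names and the sample names (`actor_A`, `official_D`, and so on) are hypothetical.

```python
from collections import Counter
from itertools import combinations

import numpy as np


def build_cooccurrence_graph(snippet_names):
    """Weighted adjacency matrix; an edge counts snippets where two names co-occur."""
    names = sorted({n for in_snippet in snippet_names for n in in_snippet})
    index = {n: i for i, n in enumerate(names)}
    adj = np.zeros((len(names), len(names)))
    for in_snippet in snippet_names:
        for a, b in combinations(sorted(set(in_snippet)), 2):
            adj[index[a], index[b]] += 1.0
            adj[index[b], index[a]] += 1.0
    return names, adj


def modularity(adj, labels):
    """Newman modularity Q of a partition given as one integer label per node."""
    two_m = adj.sum()
    if two_m == 0:
        return 0.0
    degree = adj.sum(axis=1)
    same = labels[:, None] == labels[None, :]
    return float(((adj - np.outer(degree, degree) / two_m) * same).sum() / two_m)


def spectral_bisect(adj, nodes):
    """Split `nodes` by the sign of the Fiedler vector of the subgraph Laplacian."""
    sub = adj[np.ix_(nodes, nodes)]
    laplacian = np.diag(sub.sum(axis=1)) - sub
    _, vecs = np.linalg.eigh(laplacian)   # eigenvalues in ascending order
    fiedler = vecs[:, 1]
    left = [n for n, v in zip(nodes, fiedler) if v < 0]
    right = [n for n, v in zip(nodes, fiedler) if v >= 0]
    return left, right


def detect_communities(adj):
    """Recursively bisect the graph, keeping a split only if modularity increases."""
    labels = np.zeros(len(adj), dtype=int)
    queue = [list(range(len(adj)))]
    next_label = 1
    while queue:
        nodes = queue.pop()
        if len(nodes) < 2:
            continue
        left, right = spectral_bisect(adj, nodes)
        if not left or not right:
            continue
        trial = labels.copy()
        trial[right] = next_label
        if modularity(adj, trial) > modularity(adj, labels) + 1e-12:
            labels = trial
            next_label += 1
            queue.extend([left, right])
    return labels


def assign_pages(snippet_names, names, labels):
    """Give each search result the community most of its co-occurring names belong to."""
    index = {n: i for i, n in enumerate(names)}
    groups = []
    for in_snippet in snippet_names:
        votes = Counter(int(labels[index[n]]) for n in in_snippet)
        groups.append(votes.most_common(1)[0][0] if votes else -1)
    return groups


# Hypothetical names co-occurring with the query "Wang Gang" in five snippets.
SNIPPETS = [
    ["actor_A", "actor_B", "actor_C"],
    ["actor_A", "actor_B", "actor_C"],
    ["official_D", "official_E", "official_F"],
    ["official_D", "official_E", "official_F"],
    ["actor_C", "official_D"],   # a weak bridge between the two circles
]

names, adj = build_cooccurrence_graph(SNIPPETS)
labels = detect_communities(adj)
page_groups = assign_pages(SNIPPETS, names, labels)
```

In this toy graph the bridge snippet keeps the network connected, so the split into an entertainment circle and a political circle comes from the graph clustering step itself, and the modularity check stops further splitting without a manually set cluster count. The sketch assumes a connected network; disconnected components would need to be separated before bisection.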