|  | Chinese Journal of Computers Full Text |
| Title | An Answer Extraction Algorithm Based on Syntax Structure Feature Parsing and Classification |
| Authors | HU Bao-Shun1) WANG Da-Ling2) YU Ge2) MA Ting2) |
| Address | 1)(Department of Computer Science and Technology, Software College, Northeastern University, Shenyang 110004) 2)(Institute of Computer Software and Theory, School of Information Science and Engineering, Northeastern University, Shenyang 110004) |
| Year | 2008 |
| Issue | No. 4 (662-676) |
| Abstract & Background | Abstract: Because of the particular characteristics and difficulty of Chinese natural language processing and the lack of related resources, some mature techniques developed abroad cannot be applied directly in Chinese Question Answering (QA) systems. This paper presents a new answer extraction method for Chinese factoid QA systems, based on syntax structure feature parsing and classification. The method treats answer extraction as a candidate answer classification problem, i.e., candidate answers are classified as correct or incorrect. First, according to the part-of-speech information of candidate answers corresponding to each question type, the candidate answers and their features (both simple and syntactic) are extracted from the sentences in snippets. These features are then used to train a classifier. Finally, the trained classifier is used to decide whether a candidate answer is correct. For Chinese factoid questions, compared with a currently typical pattern-matching-based answer extraction algorithm, the new method improves precision by 6.2% and mean reciprocal rank (MRR) by 9.7%. Keywords: syntax dependency parsing; classification; answer extraction; Chinese Question Answering (QA) system; factoid questions. Background: The problem discussed in this paper belongs to answer extraction in question answering systems, which is relevant to search engine technology. In recent years, question answering has become a focus of research; however, the first contest on Chinese question answering was not held until 2005 (NTCIR5). Currently, the performance of Chinese QA is not good enough to meet users' requirements. This paper presents a new answer extraction method for Chinese QA that improves performance compared with the typical answer extraction method. The research in this paper is part of the National Natural Science Foundation of China project "Research of Deduction Model for Users' Motivation Orienting to New Generation Search Engine", grant No. 60573090. 
A new generation search engine is characterized by interactive searching, classified navigation, accurate related querying, and rapid updating; it also emphasizes personalized and intelligent services. Studying the technologies related to new generation search engines, such as personalized search, user models, QA systems, and users' behavior and motivation, therefore helps improve the quality of search engines. The research group has so far published more than 40 papers in "Chinese Journal of Computers", "Journal of Software", "Journal of Computer Research and Development", "Lecture Notes in Computer Science", and other journals, and at WISE, WAIM, APWeb, NDBC, DBAT, and other conferences. A QA system is a more effective and personalized form of new generation search engine, and the work in this paper is an important part of new generation search engine research. |
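
The abstract reports gains in precision and MRR. As an illustration only, the following minimal sketch shows how these two metrics are commonly computed for ranked candidate answers. It assumes precision means top-1 precision; the function names and toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Set


def reciprocal_rank(ranked: List[str], gold: Set[str]) -> float:
    """Return 1/rank of the first correct candidate, or 0.0 if none is correct."""
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in gold:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[List[str]], golds: List[Set[str]]) -> float:
    """Average the reciprocal rank over all questions."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, g) for r, g in zip(runs, golds)) / len(runs)


def top1_precision(runs: List[List[str]], golds: List[Set[str]]) -> float:
    """Fraction of questions whose top-ranked candidate is correct (assumed definition)."""
    if not runs:
        return 0.0
    hits = sum(1 for r, g in zip(runs, golds) if r and r[0] in g)
    return hits / len(runs)


if __name__ == "__main__":
    # Hypothetical ranked candidates for three questions, best candidate first.
    ranked = [["Beijing", "Shanghai"], ["1949", "1950", "1951"], ["a", "b"]]
    gold = [{"Beijing"}, {"1951"}, {"z"}]
    print(f"MRR = {mean_reciprocal_rank(ranked, gold):.3f}")        # MRR = 0.444
    print(f"top-1 precision = {top1_precision(ranked, gold):.3f}")  # top-1 precision = 0.333
```

In a classification-based extractor such as the one the abstract describes, the ranked lists would plausibly come from sorting candidates by classifier score; that ordering step is an assumption here and is not shown.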