|  | Chinese Journal of Computers Full Text |
| Title | User Attention Analysis Based Video Summarization and Highlight Ranking |
| Authors | HUANG Qing-Ming1),2) ZHENG Yi-Jia1) JIANG Shu-Qiang2),3) GAO Wen4) |
| Address | 1)(Graduate University of Chinese Academy of Sciences, Beijing 100190) 2)(Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190) 3)(Key Laboratory of Intelligent Information Processing, Chinese Academy of Sciences, Beijing 100190) 4)(School of Electronics Engineering and Computer Science, Peking University, Beijing 100871) |
| Year | 2008 |
| Issue | No.9 (1612-1621) |
| Abstract & Background | Abstract: This paper proposes a video content understanding approach based on user attention analysis, which automatically detects the highlights of a video and ranks them according to their impressive values. First, audio classification is performed using the authors' hierarchical bintree framework and classifier selection algorithm. Then the user attention space is established; visual, aural, and temporal mid-level features are extracted to represent the three main modalities of this space, and the corresponding attention values are calculated. A specific fusion strategy called ordinal-decision combines the visual and aural attention models to form the attention curve for a video, from which the highlight segments are extracted. Finally, a support vector regression model and a relevance feedback mechanism are employed to rank the highlight segments and adapt the ranking to individual users. Introducing user attention into video content analysis in this way makes it possible to generate summaries and rank them according to their impressive values. Because the approach is based on changes in human attention while watching a video rather than on simple changes in the content, it is more consistent with human understanding. Experimental results demonstrate that the proposed approach is effective for video summarization and highlight ranking. Keywords: user attention space; attention analysis; attention model; audio-video summarization; highlight ranking. Background: With the growing volume of digital video, fast and automatic extraction of user-oriented, personalized video summaries from massive video databases has become a problem that needs to be solved. 
On the one hand, labeling the semantic information expressed by a video and automating video content analysis and understanding reduces the workload of manually browsing video content and saves retrieval time, which is of great value in both research and application. On the other hand, with the continual emergence of new application scenarios such as 3G wireless communication environments, a good video analysis system needs richer personalized information, so that it can perform target-oriented operations according to the user's personal demands and return the results the user cares about most. If the computer can understand video content the way a human does, it can come closer to the user's request, and video parsing techniques will conform better to the characteristics of human perception. In this regard, building on their previous work, the authors propose new methods for video summarization and highlight ranking that try to understand video content based on the user attention space and attention analysis. The goal is to realize an extensible video summarization and highlight ranking system based on the definition of a user attention space and the construction of a user attention model. |
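
The abstract describes extracting highlight segments from an attention curve and then ranking them. The following is only a minimal illustrative sketch of that idea, not the authors' method: it assumes the fused attention curve is already available as a list of per-frame values, it uses a plain fixed threshold to find segments, and it ranks by mean attention as a stand-in for the paper's support vector regression and relevance feedback. The function names `extract_highlights` and `rank_highlights`, and the parameters `threshold` and `min_len`, are hypothetical.

```python
from typing import List, Tuple

# (start index, end index exclusive, mean attention of the segment)
Segment = Tuple[int, int, float]


def extract_highlights(curve: List[float], threshold: float,
                       min_len: int = 1) -> List[Segment]:
    """Collect contiguous runs where the attention value is >= threshold.

    Assumption: a fixed threshold is a simplification chosen for this
    sketch; the abstract does not say how segments are cut from the curve.
    """
    segments: List[Segment] = []
    start = None
    for i, value in enumerate(curve):
        if value >= threshold and start is None:
            start = i  # a run begins
        elif value < threshold and start is not None:
            if i - start >= min_len:
                mean = sum(curve[start:i]) / (i - start)
                segments.append((start, i, mean))
            start = None  # the run has ended
    # Close a run that extends to the end of the curve.
    if start is not None and len(curve) - start >= min_len:
        mean = sum(curve[start:]) / (len(curve) - start)
        segments.append((start, len(curve), mean))
    return segments


def rank_highlights(segments: List[Segment]) -> List[Segment]:
    """Order segments by mean attention, highest first.

    Assumption: mean attention replaces the learned ranking model here.
    """
    return sorted(segments, key=lambda seg: seg[2], reverse=True)


if __name__ == "__main__":
    attention_curve = [0.1, 0.6, 0.6, 0.2, 0.9, 0.1]
    found = extract_highlights(attention_curve, threshold=0.5)
    for start, end, score in rank_highlights(found):
        print(f"frames [{start}, {end}) mean attention {score:.2f}")
```

In a system like the one the abstract outlines, the score used for sorting would come from a trained regression model and would be adjusted by user feedback; the mean is used here only to keep the sketch self-contained.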