| Chinese Journal of Computers Full Text |
Title | The Sorting Mathematical Model and Algorithm of Written Tibetan Language |
Authors | JIANG Di 1), KANG Cai-Jun 2) |
Address | 1)(Key Laboratory of Computational Linguistics, Institute of Ethnology & Anthropology, Chinese Academy of Social Sciences, Beijing 100081) 2)(Department of Automation, Beijing Institute of Technology, Beijing 100081) |
Year | 2004 |
Issue | No. 4 (524-529) |
Abstract & Background | GB16959-1997 and ISO/IEC 10646-1:1993 define the coded character set for Tibetan information processing, and there is an engineering need to apply this set across all kinds of software and databases, where sorting is an important technology. Tibetan sorting depends on construction order, classes of constitution, and the dictionary order of characters, and a written Tibetan word has a highly complex, multi-level structure. This paper exhaustively analyzes the structure of words, the order of construction categories, and the sequence of characters in each structural position, as well as the length of words and the levels of vertically stacked components, and on that basis establishes a mathematical model for sorting. Using this analysis, the paper assigns a distinct numeric value to every character that can occur in a word, then identifies each character of a word step by step with a dedicated algorithm and matches it against character-to-numeral lists. Finally, the values extracted from the characters of each word are combined, and the combined values of different words are compared to produce an ordered arrangement of any words in the Tibetan language. This processing strategy has been implemented on the Windows 2000/NT operating system. |
Keywords | written Tibetan; construction order; classes of constitution; character sequence; sorting by computer |
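
The strategy summarized in the abstract (assign a numeral to each character by structural position, combine the numerals into one value per word, then compare those values) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes syllables have already been segmented into structural slots and written in a Wylie-style transliteration. The names `Syllable`, `syllable_key`, `word_key`, and `sort_words`, the contents of the rank tables, and the priority given to each slot are all hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple, Tuple


class Syllable(NamedTuple):
    """One pre-segmented syllable; empty string means the slot is unused."""
    root: str
    prefix: str = ""
    superscript: str = ""
    subscript: str = ""
    vowel: str = "a"   # inherent vowel when no vowel sign is written
    suffix: str = ""


# Illustrative character-to-numeral lists (assumed values, not from the paper).
# Only a few root letters are listed to keep the sketch short.
ROOT_RANK = {"k": 1, "kh": 2, "g": 3, "ng": 4}
PREFIX_RANK = {"": 0, "g": 1, "d": 2, "b": 3, "m": 4, "'": 5}
SUPERSCRIPT_RANK = {"": 0, "r": 1, "l": 2, "s": 3}
SUBSCRIPT_RANK = {"": 0, "y": 1, "r": 2, "l": 3, "w": 4}
VOWEL_RANK = {"a": 0, "i": 1, "u": 2, "e": 3, "o": 4}
SUFFIX_RANK = {"": 0, "g": 1, "ng": 2, "d": 3, "n": 4, "b": 5,
               "m": 6, "'": 7, "r": 8, "l": 9, "s": 10}


def syllable_key(s: Syllable) -> Tuple[int, ...]:
    """Combine the per-slot numerals into one comparable value.

    Assumption: the root letter is compared first, then the other slots
    in the order below. The paper's model may rank the slots differently.
    """
    return (
        ROOT_RANK[s.root],
        PREFIX_RANK[s.prefix],
        SUPERSCRIPT_RANK[s.superscript],
        SUBSCRIPT_RANK[s.subscript],
        VOWEL_RANK[s.vowel],
        SUFFIX_RANK[s.suffix],
    )


def word_key(word: List[Syllable]) -> Tuple[Tuple[int, ...], ...]:
    """Key for a whole word: the sequence of its syllable keys."""
    return tuple(syllable_key(s) for s in word)


def sort_words(words: List[List[Syllable]]) -> List[List[Syllable]]:
    """Order words by comparing their combined numeric keys."""
    return sorted(words, key=word_key)


if __name__ == "__main__":
    sample = [
        [Syllable("g")],
        [Syllable("k", prefix="b")],
        [Syllable("k"), Syllable("g")],
        [Syllable("k", vowel="i")],
        [Syllable("k")],
    ]
    for word in sort_words(sample):
        print(word_key(word))
```

Tuples compare element by element, so a word that is a prefix of another sorts first, and the slot listed first in `syllable_key` dominates the comparison. Changing the slot order or the rank tables changes the collation without touching the sorting code.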