| ¡¡ | Chinese Journal of Computers Full Text |
| Title | An Effective Coloring Algorithm for Close Relationship Between the Scales of Element Set and Color Set and Its Application |
| Authors | WANG Jian-Xin LIU Yun-Long CHEN Jian-Er |
| Address | (School of Information Science and Engineering, Central South University, Changsha 410083) |
| Year | 2008 |
| Issue | No.1(32¡ª42) |
| Abstract & Background | Abstract Color-coding is one of the most important solutions to those problems proved to be NP-Hard. The performance of those color-coding based algorithms principally depends on the scale of the coloring schemes generated by coloring algorithms they deploy. Therefore, the ultimate objective of coloring algorithm is to provide a coloring scheme as small as possible. All existing coloring algorithms are mainly based on perfect hash functions, requiring of the number of elements n far greater than the number of colors k, and k relatively small. This paper mainly focuses on the study of effective algorithms under the circumstance where the size of element set and color set are close (n¡Ü2k) and the analysis of performance of coloring algorithm while n¡Ü2k. This paper proposes a novel partition based algorithm PBCC, and proves that the coloring scheme generated by PBCC can cover all the subsets. A detailed method for constructing a (20,16)-coloring scheme with size 403 which can be applied for (l,d)-(20,16) motif finding problem is also provided. Furthermore, this paper analyzes the scale of coloring schemes generated by PBCC, and proves that while n¡Ü2k and n-kªÅ2, no algorithm can create a coloring scheme with its size £üS(n,k)£ü less than [n/2]n-k+nn-k-[n/2]n-k2n-k/(2n-k-2).At last, using asymptotic analysis, it is proved that coloring scheme generated by PBCC is of size O(e2Rootof(ex-e(¦Ìx)+1)(n-k)), and it is O(2.62n-k) when n=2k. Also, it is showen that any coloring scheme with n¡Ü2k will be larger than 2n-k. keywords coloring; coloring algorithm; NP-hard; asymptotic analysis; bound analysis background This work is supported by the National Natural Science Foundation of China (60433020) and the Program for New Century Excellent Talents in University (NCET-05-0683). With the development of computing theories, more and more real-world applications have been proved NP-Hard in the basis of model view. Therefore, to formulate some practical algorithms for these problems, researchers have tried various solutions, for example, a novel technique named color-coding. And this technique is being applied to more and more practical problems in recent years. Color-coding can be applied to some NP-Hard problems in subset selection. The key idea in this technique is make a transformation from the restriction of distinct elements to the restriction of distinct colors, and then the transformation will lead to the shrink of solution space therefore make the problem easier. The application of color-coding can be classified into two categories according to different coloring schemes: Randomized color-coding and deterministic color-coding. Although color-coding improves the efficiency on solve subset selection-like problems, it still has many deficiencies as an inchoate solution: (1)Despite the complexity of randomized color-coding is acceptable in most of the cases, its accuracy cannot meet the demands from time to time, especially in those situations where all solutions except exact ones are invalid; (2)The complexity of deterministic color-coding has been reduced a lot with the efforts of researchers, albeit most coloring schemes provided by these algorithms are oversized and impractical; (3)At present, all deterministic color-codings are based on perfect hash functions and fix parameter theory, therefore there is a presumption that the number of elements is far greater than the number of colors, and there are few colors. This presumption constrains the application of color-coding for it is not always stands. Thus, in order to improve the practicality of color-coding and enable the applications of color-coding in more areas, this paper mainly concerns on the circumstance where the size of element set and color set are close. To make full use of this condition, this paper proposes a non-hash style coloring algorithm named PBCC (Partition-Based Color-Coding). This algorithm intends to make every coloring covers as many subsets as possible therefore minimize the size of coloring scheme through equal partitioning and the consideration of coloring distribution between subsets. PBCC is proved to be efficient under the circumstance where the size of element set and color set are close by both application and theoretical analysis. It can generate much smaller coloring schemes than other algorithms such as those hash-style ones, therefore it can be applied to computing biology problems such as motif-finding and its similar ones. Furthermore, since deciding lower bound of coloring schemes using just maximum coverage cannot points out where the bottleneck on reducing colorings, the authors also study on the lower bound in this situation, and show that repetitions between colorings are the obstacles in reducing colorings, and the obstacles cannot be annihilated. Therefore, a stricter lower bound is got by study of repetitions between colorings. |