|  | Chinese Journal of Computers Full Text |
| Title | A Real-Time Fault-Tolerant Scheduling Algorithm of Periodic Tasks in Heterogeneous Distributed Systems |
| Authors | LUO Wei YANG Fu-Min PANG Li-Ping TU Gang |
| Address | (School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074) |
| Year | 2007 |
| Issue | No. 10 (1740-1749) |
| Abstract & Background | Abstract: This paper proposes a novel reliability model for preemptive periodic tasks. Unlike existing reliability models in the literature, the proposed model accounts for tolerance of a single processor failure, which makes it more realistic and precise. In addition, a real-time fault-tolerant scheduling algorithm for heterogeneous distributed systems, named IRDFTAHS, is presented. IRDFTAHS assigns task copies in a way that aims to improve system reliability. It also supports backup copies in both active and passive status, which makes it more flexible than existing algorithms. Finally, simulation experiments compare the algorithm with existing ones in several respects. The experimental results show that IRDFTAHS generally performs significantly better than existing algorithms. Keywords: real-time periodic tasks; fault tolerance; primary/backup copy; heterogeneous distributed systems; reliability. Background: The work reported in this paper was supported in part by the National Natural Science Foundation of China under grant No. 60603032. Fault tolerance is an inherent requirement of real-time distributed systems, and the primary/backup scheme plays an important role in this area. Unlike homogeneous distributed systems, heterogeneous distributed systems are characterized above all by wide variation in the computing power and reliability of their processors. Therefore, a real-time scheduling algorithm should consider not only the real-time properties and schedulability of tasks but also the reliability of the resulting schedule. Reliability has been a main concern of the computer systems research community for many years. Conventionally, the reliability of a system is defined as the probability that the system functions properly and continuously without any interruption. 
With the emergence of critical business applications and safety-critical systems, the traditional definition of reliability needs to be extended to incorporate fault tolerance. While achieving high reliability, it is critical to ensure that real-time tasks complete before their deadlines even in the presence of failures. A reliable system must therefore take fault-tolerant actions once failures are detected, so that it can continue executing without interruption. However, to the best of our knowledge, most existing reliability models constructed for real-time systems have not addressed fault tolerance. To bridge this gap, this paper proposes a novel reliability model for preemptive periodic tasks. Unlike existing reliability models in the literature, the proposed model accounts for tolerance of a single processor failure. In addition, a real-time fault-tolerant scheduling algorithm for heterogeneous distributed systems, named IRDFTAHS, is presented. IRDFTAHS assigns task copies in a way that aims to improve system reliability. It considers both active and passive backup copies, which makes it more flexible than existing algorithms. Future work in this area is three-fold. First, the authors plan to develop more efficient scheduling algorithms aimed at improving the system's reliability. Second, they will extend the reliability model to handle hybrid real-time applications containing the three different types of real-time tasks. Third, they intend to investigate a more complex version of the scheduling algorithm that incorporates precedence constraints among tasks and the scheduling of aperiodic tasks. |