¡¡Chinese Journal of Computers   Full Text
  TitleA Clustering and Scheduling Algorithm Based on Task Duplication
  AuthorsHE Kun1),2) ZHAO Yong2) HUANG Wen-Qi1)
  Address1)(College of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074)
2)(Department of Control Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074)
  Year2008
  IssueNo.5(733¡ª740)
  Abstract &
  Background
Abstract This paper addresses an important and classic scheduling problem, the static scheduling of dependent tasks in homogeneous environment. It is NP hard even when the resources are unbounded, and finds many applications in the parallel and distributed computation area. Dependent tasks are usually denoted by Directed Acyclic Graph(DAG), and solving heuristics are commonly categorized to priority list based, cluster based and task duplication based schemes. Task-duplication-based(TDB) algorithms are of better performance than non-duplication ones. A new TDB clustering and scheduling algorithm, called the dynamic critical predecessor(DCP) algorithm, is proposed in this paper. DCP algorithm defines a new selective strategy for important ancestors to be duplicated. The primary aim is to get the shortest schedule length, and the next is to utilize as less resources as possible. Based on an improved definition of granularity, DCP algorithm achieves a better performance guarantee for arbitrary DAG than relative works reported in the literature. Experimental results on several benchmarks show that DCP algorithm is quite effective and it exceeds other classic TDB algorithms. Especially for the classic EZ benchmark, DCP algorithm gets an optimal solution with 8 makespan, which is better than the optimal result taken for before with 8.5 makespan. Complement graph of a DAG is defined, and a similar algorithm is developed to produce a 2-optimal schedule for tree graph if task duplication is not allowed for the tasks.
Keywords task duplication; task clustering; scheduling algorithm; DAG; task granularity
Background This work is supported by the National Natural Science Foundation of China under grant No.70471077 and by the National Basic Research Program of China (973 Program) under grant No.2004CB318000.
The problem of static scheduling dependent tasks in homogeneous environment for minimizing the task makespan is a classic scheduling problem in the parallel and distributed computation area. The problem is NP-hard in theory and finds many applications in practice. It has attracted a considerable amount of attention over the years, and various approaches have been developed to handle it efficiently. Dependent tasks are usually denoted by Directed Acyclic Graph(DAG), and heuristics are the representative solving methods. The commonly categorized schemes are priority list based, cluster based and task duplication based schemes. In general, task duplication schemes(TDB) are observed to be better than the other two. Among the TDB algorithms reported in the literature, PY algorithm and MJD algorithm are theoretically best for the reason that they give performance guarantees for arbitrary DAG, while just some of the others give performance guarantees for certain DAG. For arbitrary DAG, PY could produce a schedule at most 2 times the optimal, and MJD could produce a schedule at most 1+¦Å¡¯ times the optimal (0<¦Å¡¯¡Ü1).
This paper proposes a new efficient approach, the dynamic critical predecessor (DCP) algorithm, for solving the problem. Based on an improved granularity definition proposed in this paper, a proof is given that DCP algorithm can produce a schedule at most 1+¦Å times the optimal and 0<¦Å¡Ü¦Å¡¯ for arbitrary DAG. Experimental results show that DCP algorithm exceeds other classic TDB algorithms. Especially on the classic EZ benchmark, DCP algorithm achieves an optimal scheduling with 8 makespan. The result is better than the optimal result taken for before with 8.5 makespan. Further, theoretical proof is given that the makespan of EZ benchmark is 8 for its optimal scheduling. Finally, complement graph of a DAG is defined, and a similar algorithm is developed to produce a 2-optimal schedule for graphs like In-tree and Out-tree if task duplication is not allowed for the tasks.