Abstract:
To fully benefit from a multiprocessor system, the tasks of a program are to be carefully assigned and scheduled on the processors of the system such that the overall execution time is minimal. Due to the reached physical limits of processor technology the improvements are fading out and manufacturers have moved to multi(core)processors. With multiple processors, however, the performance growth is not automatic anymore and can only be achieved when the processors are efficiently employed in parallel. For the performance and efficiency of a parallel program the scheduling of its (sub)tasks is crucial. Unfortunately, scheduling is a fundamental unsolved problem (an NP-hard optimisation problem), as the time needed to solve it optimally grows exponentially with the number of tasks. I.e. the associated task scheduling problem with communication delays, P|prec, cij |Cmax, is a well known NP-hard problem. The tasks that are to be scheduled may or may not be dependent on each other and are represented as an acyclic directed graph. The nodes in the graph represent the tasks and the edges between the nodes, the communications. The node cost is the time required for the task to complete and the edge cost is the communication time between two tasks on different processors. We assume a connected network of processors with identical communication links. Further, there is no multitasking or parallelism within a task. Each processor may execute several tasks but no concurrent execution of tasks is permitted. The tasks are to be assigned in such a way as to minimise the overall schedule length. Existing scheduling algorithms are mostly heuristics (for e.g. the list scheduling algorithm) as they try to produce good rather than optimal schedules. Having optimal schedules can make a fundamental difference, e.g. for time critical systems such as flight control, industrial automation, automotive applications, telecommunication systems, consumer electronics, robotics and multimedia systems. Multiprocessor systems are also popular in small portable devices such as cellphones or navigators to large systems such as industrial robots or aircraft. An optimal schedule may also be used as a benchmark to enable the precise evaluation of scheduling heuristics. Moreover, once an optimal schedule is found, it may be reused when a parallel program is rerun. This work has two main contributions 1) investigate and propose novel Mixed Integer Linear Programming (MILP) solutions to this scheduling problem, despite the fact that scheduling problems are often difficult to handle by MILP solvers. 2) Propose memory limited versions of the A* scheduling algorithm since A* for larger problem instances may run out of memory before finding an optimal schedule. The applicability of existing pruning techniques used on A* are re-investigated and new pruning techniques to further speedup the memory limited A*algorithms are proposed. The first part of the work proposes MILP solutions that use problem specific knowledge to eliminate the need to linearise the bi-linear equations arising out of communication delays. The classic ILP formulation for the scheduling problem and its improved formulations PACKING-USUAL and PACKING-COMPACT are studied. Two new ILP formulations, namely ILP-RBL and ILP-TC are proposed to further speedup the solution. This is attained by re-formulating the constraints and variables needed to linearise the bi-linear forms arising from communication delays between tasks running on different processors. Next, the work in the thesis proposes the formulation ILP-DELTA by introducing new overlap constraints to ensure that no two tasks running on the same processor overlap in time and space. The purpose is to test if re-formulating these constraints would help speedup the formulation. Further, the work in the thesis proposes the MILP formulations SHD-BASIC, SHDRELAXED and SHD-REDUCED wherein the size of the proposed formulations in terms of variables are independent of the number of processors. We analyse and discuss the influence of the different MILP components in respect to characteristics of the task graph such as structure and communication to computation ratio. The proposed MILP formulations are experimentally compared with previous MILP formulations used to solve this scheduling problem. The proposed formulations displays a drastic improvement in performance, which allows to solve larger problems optimally. The work also observe strengths and weaknesses of the formulations related to the input characteristics. The second part of the work proposes two memory limited optimal scheduling algorithms: Iterative Deepening A* (IDA*) and Depth-First Branch and Bound A* (BBA*). When finding a guaranteed near optimal schedule length is sufficient, the proposed algorithms can be combined, reporting the gap while they run. A novel method to find a good initial lower bound close to the optimal schedule length in order to speed-up the execution of IDA* is proposed. Problem specific pruning techniques, which are crucial for good performance, are studied and new pruning techniques proposed for the two memory limited algorithms. Duplicate avoidance without memory and processor normalisation without memory are the pruning techniques proposed. Extensive experiments are conducted to evaluate and compare the proposed algorithms with previous optimal algorithms.