Abstract:
Parallel Task is a Java-based technology that brings the benefits of task parallelism to object-oriented Java applications. However, its existing Help-First scheduling policy performs poorly under fine-grained nested parallelism. The Work-First scheduling policy is well suited to fine-grained nested parallelism, but the typical Work-First approach is not viable because its behaviour conflicts with Parallel Task, so an alternative Work-First strategy is needed. This thesis presents a solution that combines the Help-First and Work-First scheduling policies in Parallel Task to alleviate its performance limitations in fine-grained nested parallelism. The solution draws on the advantages of both policies: Help-First distributes tasks for load balancing, while Work-First executes tasks efficiently during fine-grained nested parallelism. The proposed solution comprises three implementations of Work-First: Global Task Population Control, Local Task Queue Control, and Task Depth Control. Each implementation monitors the load of Parallel Task and decides, from its own perspective, when it is suitable to employ Work-First. Global Task Population Control evaluates the load based on the population of tasks in the system. Local Task Queue Control evaluates the load of each worker's task queue. Task Depth Control examines the hierarchical task structure through the nested depth of a task. Each implementation applies Work-First to newly created tasks if, from its perspective, Parallel Task is performing inefficiently. All three control policies are evaluated in numerous experiments.
The results show that Work-First scheduling in Parallel Task achieves a clear and significant performance improvement over the pure Help-First approach. However, in certain situations this improvement is not enough to achieve a speedup greater than 1.00 for fine-grained nested parallelism. Task Depth Control had the best overall performance among the three implementations in all tested environments.
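
As an illustration only, the three control policies can be sketched as simple threshold checks made when a task is created. The class, enum, method, and parameter names below are hypothetical and do not come from the Parallel Task API. The rule that exceeding a threshold triggers Work-First is likewise an assumption made for the sketch, not the decision logic used in the thesis.

```java
// Hypothetical sketch: none of these names are part of the Parallel Task API.
// Assumption: a policy switches to Work-First once its measure exceeds a threshold.
final class WorkFirstPolicySketch {

    /** How a newly created task is handled. */
    enum Mode {
        HELP_FIRST, // enqueue the task so that other workers can pick it up (load balancing)
        WORK_FIRST  // execute the task immediately on the creating worker
    }

    private WorkFirstPolicySketch() { }

    /** Global Task Population Control: considers the number of tasks in the whole system. */
    static Mode globalTaskPopulation(int tasksInSystem, int populationThreshold) {
        return tasksInSystem > populationThreshold ? Mode.WORK_FIRST : Mode.HELP_FIRST;
    }

    /** Local Task Queue Control: considers the length of one worker's own task queue. */
    static Mode localTaskQueue(int workerQueueLength, int queueThreshold) {
        return workerQueueLength > queueThreshold ? Mode.WORK_FIRST : Mode.HELP_FIRST;
    }

    /** Task Depth Control: considers how deeply the new task is nested in the task hierarchy. */
    static Mode taskDepth(int nestedDepth, int depthThreshold) {
        return nestedDepth > depthThreshold ? Mode.WORK_FIRST : Mode.HELP_FIRST;
    }
}
```

In this sketch, shallow tasks or lightly loaded queues keep the Help-First behaviour and remain available for load balancing, while tasks created beyond the chosen threshold run inline on the creating worker.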