Abstract:
Predicting the effort required to develop a Web application is vital for successful project management. Too often, Web development projects overrun their allocated budget before completion, which degrades the Web applications’ quality or even leads to cancellation [1-3]. The work presented in this thesis is part of a larger research effort that aims to apply a probabilistic modelling technique known as the Bayesian Network (BN) to estimating Web development effort. A Bayesian Network model can be described as a Directed Acyclic Graph (DAG) in which the nodes represent variables (factors) of interest to the modeller and the edges represent dependency (or causation) between the variables. Each node has a Conditional Probability Table (CPT) that holds the conditional probabilities of each of the variable’s possible states, given the states of its parent nodes [4]. In our research, Bayesian Network models are constructed by eliciting information from domain experts (local Web development companies). Under these circumstances, most of the elicitation time is spent acquiring the probabilities for the CPTs. Unfortunately, this process is very time-consuming and can sometimes take years. Therefore, this thesis focused on two objectives: the first was to propose a solution, based on empirical evidence, to the CPT elicitation problem; the second was to propose a methodology for combining (aggregating) independently elicited Web effort estimation BN models. We addressed the first objective by investigating six CPT generation techniques: the Independence of Causal Influence (ICI) method, the Weighted Sum Algorithm, the Ranked Nodes method, and three interpolation techniques.
After an initial comparison, three of the six techniques were selected for further empirical assessment using two real-world case studies. The results showed that the Weighted Sum Algorithm achieved the highest accuracy for large CPTs (over 500 parameters in size), whereas for smaller CPTs the three techniques performed comparably. For the second objective, we proposed an aggregation methodology that uses a mapping scheme to combine Bayesian Network structures. The methodology was applied to six models and resulted in the emergence of a consensus causal structure.
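
To make the CPT elicitation burden concrete, the following minimal Python sketch represents a Bayesian Network as a DAG and counts the probability entries in each node's CPT. The node names and state counts are hypothetical and chosen only for illustration; they are not taken from the models elicited in this thesis.

```python
from math import prod

# Hypothetical toy network (illustrative only):
# each node maps to (number of states, list of parent nodes).
network = {
    "team_experience": (3, []),
    "num_pages": (3, []),
    "num_features": (3, []),
    "total_effort": (5, ["team_experience", "num_pages", "num_features"]),
}


def cpt_size(node, net):
    """Return the number of probability entries in the node's CPT.

    The table has one entry per state of the node for every
    combination of parent states.
    """
    states, parents = net[node]
    return states * prod(net[p][0] for p in parents)


def is_acyclic(net):
    """Return True if following parent links never revisits an active node."""
    status = {}

    def visit(n):
        if status.get(n) == "done":
            return True
        if status.get(n) == "active":
            return False  # reached a node already on the current path: cycle
        status[n] = "active"
        ok = all(visit(p) for p in net[n][1])
        status[n] = "done"
        return ok

    return all(visit(n) for n in net)


if __name__ == "__main__":
    print("acyclic:", is_acyclic(network))
    for name in network:
        print(name, cpt_size(name, network))
```

Because the number of entries is the product of the node's state count and the state counts of all its parents, adding one more parent multiplies the table size. In this assumed example, a fourth three-state parent would raise the `total_effort` table from 135 to 405 entries, which suggests how quickly direct elicitation from experts becomes impractical.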