Abstract:
Machine learning algorithms play a key role in many fields of science, technology, and business. Classification and regression are two fundamental tasks in machine learning, and this thesis is devoted to developing learning algorithms for solving them. A novel learning algorithm, dubbed the adaptive local hyperplane (ALH) algorithm, is proposed to solve the general classification problem. ALH belongs to the nonparametric paradigm and extends the K-local hyperplane distance nearest neighbor (HKNN) classifier. A novel strategy for extending the ALH algorithm to regression problems has also been developed. Experimental results on many real data sets show that the ALH method, for both classification and regression, gives competitive performance compared to several benchmark learning algorithms, including K-nearest neighbors (KNN), linear discriminant analysis (LDA), classification and regression trees (CART), support vector machines (SVM), and HKNN.

In the second part of the thesis, the ALH classifier is applied to the tasks of face recognition, protein fold recognition, protein subcellular localization, and small data set learning. By employing a suitable feature extraction method on the recognition objects (such as images or protein sequences), the ALH classifier can be applied to the resulting numeric data. The experimental results on many benchmark problems suggest that the ALH classifier can serve as an alternative tool for the classification or pattern recognition procedure involved in each task.

Finally, the computational issues of the ALH algorithm are addressed and analyzed. The classification tree and ALH algorithms are combined in an efficient way such that the computational burden of ALH is dramatically reduced while the same level of generalization performance is retained.
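To make the local-hyperplane idea underlying HKNN (and hence ALH) concrete, the following is a minimal illustrative sketch, not the thesis's actual implementation: for each class, a test point's K nearest neighbors within that class span a local affine hyperplane, and the point is assigned to the class whose hyperplane it is closest to. The function name `hknn_predict` and the least-squares projection are assumptions made for this sketch.

```python
import numpy as np

def hknn_predict(X_train, y_train, x, K=3):
    """Classify x by its distance to the local affine hyperplane
    spanned by its K nearest neighbors within each class (HKNN idea)."""
    best_class, best_dist = None, np.inf
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]
        # K nearest neighbors of x within class c
        idx = np.argsort(np.linalg.norm(Xc - x, axis=1))[:K]
        N = Xc[idx]
        mean = N.mean(axis=0)
        V = (N - mean).T  # columns span the local hyperplane directions
        # least-squares projection of (x - mean) onto span(V)
        alpha, *_ = np.linalg.lstsq(V, x - mean, rcond=None)
        dist = np.linalg.norm(x - (mean + V @ alpha))
        if dist < best_dist:
            best_class, best_dist = c, dist
    return best_class
```

With two well-separated classes and K=2, each local hyperplane is a line through the two nearest same-class neighbors, and a query point is assigned to the class whose line passes nearest to it. ALH refines this scheme (e.g. with adaptive feature weighting), which this sketch does not attempt to reproduce.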