Abstract:
This is a report on solving regression (hypersurface fitting, function approximation) problems in high-dimensional spaces by a novel learning rule called the Active Set - Least Squares (AS-LS) algorithm for kernel machines (a.k.a. support vector machines (SVMs)), RBF (a.k.a. regularization) networks, multilayer perceptron NNs and other related networks. Regression is a classic statistical problem of learning from empirical data (i.e., examples, samples, measurements, records, patterns or observations), where the presence of a training data set D = {[x(i), y(i)] ∈ ℝ^n × ℝ, i = 1,...,N} is the starting point in reaching the solution. (N stands for the number of training data pairs and is therefore equal to the size of the set D.) Often, yi is denoted as di (ti), where d (t) stands for a desired (target) value. The data set D is the only information available about the dependency of y upon x. Hence, we are dealing with a supervised learning problem and solution technique here. The basic aim of this report is to give, as far as possible, a condensed (but systematic) presentation of a novel regression learning algorithm for training various data modeling networks. The AS-LS learning rule resulted from an extension of the active set training algorithm for SVMs as presented in (Vogt and Kecman, 2004, 2005). Unlike in SVMs, where one uses only the selected support vectors (SVs) while computing their dual variables αi, in the AS-LS method all the data are used while calculating the weights wi of the regressors (i.e., SVs) chosen at a given iteration step. Our focus will be on the constructive learning algorithm for regression problems (although the same approach applies to classification (pattern recognition) tasks). In AS-LS we do not solve a quadratic programming (QP) problem typical for SVMs. Instead, an overdetermined least squares problem is solved at each step of an iterative learning algorithm.
As in the active set method for SVMs, the single data point violating the Karush-Kuhn-Tucker (KKT) conditions the most is selected and added as the new support vector, i.e., regressor, at each step. However, the weights wi of the selected regressors are computed by using all the available training data points. Thus, unlike in SVMs, the non-regressors (non-SVs) do influence the weights of the regressors (SVs) in the AS-LS algorithm. A QR decomposition of the system matrix H for calculating wi is used in an iterative updating scheme with no need to form the matrix Q at all. This makes AS-LS a fast algorithm. In addition, an AS-LS algorithm with box constraints -C ≤ wi ≤ C, i = 1,...,NSV, has also been developed, and this resembles the soft regression in SVMs. The resulting model is parsimonious, meaning one with a small number of support vectors (hidden layer neurons, regressors, basis functions). Comparisons to the results obtained by classic SVMs are also shown. A weighted AS-LS algorithm that is very close to the active set method for solving the QP-based SVM learning problem has been introduced too.
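The greedy select-then-refit loop described above can be sketched in a few lines of NumPy. This is only an illustrative reconstruction, not the report's implementation: the absolute residual is used as a simple stand-in for the degree of KKT violation, Gaussian RBF regressors are assumed, and the weights are recomputed with a plain least-squares solve rather than the iteratively updated QR factorization. All names (rbf, as_ls_sketch, sigma) are hypothetical.

```python
import numpy as np

def rbf(x, c, sigma=1.0):
    # Gaussian RBF value between sample x and center c
    return np.exp(-np.sum((x - c) ** 2) / (2 * sigma ** 2))

def as_ls_sketch(X, y, n_regressors=5, sigma=1.0):
    """Greedy AS-LS-style fit: at each step add the data point with the
    largest current residual (a proxy for the worst KKT violator) as a
    new regressor, then recompute the weights of ALL selected regressors
    by solving an overdetermined least-squares problem over ALL N data."""
    N = X.shape[0]
    active = []                 # indices of selected regressors (SVs)
    residual = y.copy()         # residuals over the whole training set
    w = np.zeros(0)
    for _ in range(n_regressors):
        cand = int(np.argmax(np.abs(residual)))
        if cand in active:      # no new violator found; stop
            break
        active.append(cand)
        # design matrix H: N rows (all data), one column per regressor
        H = np.array([[rbf(x, X[j], sigma) for j in active] for x in X])
        w, *_ = np.linalg.lstsq(H, y, rcond=None)   # overdetermined LS
        residual = y - H @ w
    return active, w

# toy 1-D regression example
rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 50).reshape(-1, 1)
y = np.sin(X).ravel() + 0.05 * rng.standard_normal(50)
active, w = as_ls_sketch(X, y, n_regressors=8)
```

Note how every least-squares solve uses all N rows, so non-regressors influence the weights wi, in contrast to a standard SVM dual solution.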