Abstract:
Recent advances in "big data" technology have enabled the automated interpretation of large medical databases. Machine Learning (ML) has been applied with increasing success to both cerebral palsy (CP) gait pattern recognition and human activity recognition (HAR). This work trains supervised ML algorithms in two studies: (i) CP gait pattern recognition and (ii) HAR using different accelerometers. The aims of the first study are (i) to determine whether incorporating kinetic features improves classification results and (ii) to determine whether principal component analysis (PCA) improves classification results. The aims of the second study are (i) to determine the performance of classification models based on different accelerometers (Actigraph GT3X+ and GT9) and (ii) to determine whether PCA improves classification results. The algorithms used to induce classification models are Decision Tree, Support Vector Machine (SVM), k-Nearest Neighbour and Ensemble classifiers. Ten-fold cross-validation was performed during training. In the first study, data for over 200 cerebral palsy patients, recorded using a Vicon motion capture system, were sourced from a publicly available database. Of these patients, 51 subjects contributed 74 legs that fell into one of the four classification groups. Random Forest performed best for both the original kinematic features and the mixed kinematic-kinetic features. Overall sensitivity and specificity were the same for both feature sets, at 78.4% and 92.8% respectively, whilst average AUC improved by 2% with the addition of kinetic features. In the second study, triaxial accelerometer data were obtained using Actigraph GT3X+ and GT9 devices placed at the hip and thigh of the right leg. The participant was a healthy individual (20 years old, 1.67 m tall, 61.6 kg). SVM performed best across both individual and combined sensor modalities.
The overall accuracy of the model combining the two different accelerometers attached to the hip increased by 0.6% and 3.3% compared with the individual GT9 and GT3X+ models respectively (96.7% overall). For the thigh, the overall accuracy of the combined sensor model was the same as that of the GT9, but 0.3% lower than that of the GT3X+ alone (93.0%). PCA did not improve results in either study, suggesting that key information was lost during the feature reduction process. This approach to CP gait classification facilitates automated classification in an objective and reliable manner, potentially reducing time and cost relative to traditional methods. However, the combination of a high number of features (18), a small sample (74 strides), imbalanced data and the number of classes (four) limits the generalisability of the model developed. Further work with a much larger and balanced data set is vital to address these issues. The results of the second study demonstrate the potential to establish device-independent HAR models for off-the-shelf activity monitors. The realisation of a recognised device-independent standard and device-independent models will enhance communication between researchers, and ultimately lead to better measurement and treatment of physical activity (PA)-related pathology. These findings are exploratory in nature and account for data from only one subject. Further work with a larger data set from many subjects is required for full validation.
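
For readers unfamiliar with how an overall sensitivity and specificity can be reported for a four-class problem, the following is a minimal sketch, not the method used in this work. It assumes a one-vs-rest treatment of each class followed by an unweighted (macro) average; the abstract does not state which averaging scheme was applied. The function names and the confusion matrix `toy_cm` are hypothetical and purely illustrative; the numbers are not taken from either study.

```python
from typing import List, Tuple


def one_vs_rest_rates(cm: List[List[int]]) -> List[Tuple[float, float]]:
    """Return (sensitivity, specificity) for each class.

    cm[i][j] is the number of samples whose true class is i
    and whose predicted class is j (square confusion matrix).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    rates = []
    for k in range(n):
        tp = cm[k][k]                               # class k predicted as k
        fn = sum(cm[k]) - tp                        # class k predicted as something else
        fp = sum(cm[i][k] for i in range(n)) - tp   # other classes predicted as k
        tn = total - tp - fn - fp                   # everything else
        sens = tp / (tp + fn) if (tp + fn) else float("nan")
        spec = tn / (tn + fp) if (tn + fp) else float("nan")
        rates.append((sens, spec))
    return rates


def macro_average(rates: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Unweighted mean of per-class sensitivity and specificity (an assumption)."""
    n = len(rates)
    return (sum(s for s, _ in rates) / n, sum(p for _, p in rates) / n)


# Hypothetical four-class confusion matrix, for illustration only.
toy_cm = [
    [8, 1, 1, 0],
    [2, 6, 1, 1],
    [0, 1, 9, 0],
    [1, 0, 1, 8],
]

per_class = one_vs_rest_rates(toy_cm)
macro_sens, macro_spec = macro_average(per_class)
print(f"macro sensitivity = {macro_sens:.3f}, macro specificity = {macro_spec:.3f}")
```

With imbalanced classes, as noted above for the CP data set, a macro average weights every class equally regardless of its size, so a sample-weighted average could give noticeably different figures.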