Abstract:
Neural networks have attracted significant attention from both academia and industry owing to their success on challenging tasks such as computer vision, speech recognition and natural language processing. However, to achieve good performance on supervised learning tasks, neural networks must be trained on large amounts of labelled data, which often requires human experts to painstakingly label every example. This labelling process can be costly and time-consuming. Yet while labelled data are often scarce and expensive to collect in practice, unlabelled data are usually available in abundance. The goal of this thesis is to improve the generalisation performance of neural networks on classification tasks by utilising unlabelled data. We study three strategies to this end: pretraining, semi-supervised learning and active learning.
First, we propose a self-supervised pretraining method for tabular data that learns to distinguish real data from randomly shuffled data; the weights learned during pretraining are then reused as the initial weights for the original task on the labelled training set. Second, we break the common assumption in semi-supervised learning that the labelled and unlabelled data come from the same distribution. We show empirically that novel classes in the unlabelled data can degrade the generalisation performance of semi-supervised algorithms, and we propose a 1-nearest-neighbour-based method that assigns a weight to each unlabelled example to reduce this negative effect. Lastly, we propose a new uncertainty-based active learning method, designed specifically for neural networks trained with stochastic gradient descent, that queries the examples whose predictions change the most during training. Experimental results show that the proposed method is most effective when a large labelled training set is already available. We also show that different types of active learning methods perform differently under different settings, which suggests that fully evaluating an active learning algorithm requires experiments under a wide range of settings.
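To make the pretext task concrete, the following is a minimal sketch in PyTorch. It assumes the shuffled data are produced by permuting each column of the table independently, which destroys cross-feature structure while preserving the marginal distributions; the column-wise shuffling, the network shapes and names such as `make_shuffled` are illustrative assumptions rather than the exact design used in the thesis.

```python
# Minimal sketch of real-vs-shuffled pretraining on tabular data (assumptions noted above).
import torch
import torch.nn as nn

def make_shuffled(x: torch.Tensor) -> torch.Tensor:
    """Shuffle each column independently, breaking dependencies between features."""
    perm = torch.stack([torch.randperm(x.size(0)) for _ in range(x.size(1))], dim=1)
    return torch.gather(x, 0, perm)

backbone = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU())
pretext_head = nn.Linear(64, 1)  # real-vs-shuffled discriminator
opt = torch.optim.Adam(list(backbone.parameters()) + list(pretext_head.parameters()))
bce = nn.BCEWithLogitsLoss()

x_unlabelled = torch.randn(256, 16)  # stand-in for the unlabelled table
for _ in range(100):
    x_fake = make_shuffled(x_unlabelled)
    x = torch.cat([x_unlabelled, x_fake])
    y = torch.cat([torch.ones(256, 1), torch.zeros(256, 1)])  # 1 = real, 0 = shuffled
    loss = bce(pretext_head(backbone(x)), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

# The pretrained backbone is then reused, with a fresh head, for the labelled task.
task_head = nn.Linear(64, 10)
model = nn.Sequential(backbone, task_head)
```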
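The weighting idea in the second contribution can likewise be sketched. Here each unlabelled example is down-weighted according to the distance to its single nearest labelled neighbour, so examples far from every labelled example (and hence plausibly from a novel class) contribute little; the exponential decay below is an illustrative choice, not necessarily the weighting used in the thesis.

```python
# Sketch of 1-NN-based weighting of unlabelled examples (illustrative decay, see above).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def nn_weights(x_labelled: np.ndarray, x_unlabelled: np.ndarray) -> np.ndarray:
    """Weight each unlabelled example by its distance to the nearest labelled example."""
    nn = NearestNeighbors(n_neighbors=1).fit(x_labelled)
    dist, _ = nn.kneighbors(x_unlabelled)
    # Far from all labelled data -> likely a novel class -> small weight.
    return np.exp(-dist.ravel())
```

In a semi-supervised objective, these weights would multiply the per-example loss on the unlabelled data, shrinking the influence of suspected novel-class examples.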
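Finally, the prediction-change criterion for active learning admits a very small sketch. Assuming the predicted labels of the unlabelled pool are recorded after each SGD epoch, the query score below simply counts label flips per example; this bookkeeping is hypothetical, but the idea of ranking pool examples by how unstable their predictions are during training is the one described above.

```python
# Sketch of a prediction-change query score for active learning (assumed bookkeeping).
import numpy as np

def prediction_change_score(snapshots: np.ndarray) -> np.ndarray:
    """Count label flips per pool example; snapshots has shape (num_epochs, pool_size)."""
    return (snapshots[1:] != snapshots[:-1]).sum(axis=0)

snapshots = np.array([[0, 1, 2],
                      [0, 2, 2],
                      [1, 2, 2],
                      [1, 0, 2]])
scores = prediction_change_score(snapshots)  # -> [1, 2, 0]
query = np.argsort(scores)[::-1][:2]         # query the most unstable examples
```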