Abstract:
Deep learning has become the state-of-the-art approach in many areas of machine learning research. It can achieve high performance, but it requires large amounts of training data and is not readily interpretable. This makes it difficult to develop algorithms for new applications: errors cannot easily be resolved when the algorithm's reasoning is obscure, and large datasets cannot always be obtained. To address these limitations, logic knowledge bases can be used for interpretable reasoning, while deep learning is used for any processing that is not suitable for representation by a knowledge base. A knowledge base requires no training data and provides interpretable outputs. In addition, the intermediate outputs of the knowledge base components can be used to guide learning in the deep learning components, making training more efficient. These advantages were central to the design of a novel architecture, which used deep learning for image processing and a combination of non-monotonic logic and decision tree induction for interpretable reasoning. Performance was evaluated with visual question answering (VQA), and a state-of-the-art planning architecture was used to show the advantages of knowledge bases in dynamic domains. The VQA architecture's performance was evaluated in two domains: estimating the stability of simulated block towers, and determining the messages conveyed by traffic signs. Experimental results showed that the proposed architecture achieved interpretability and high performance with small training datasets. The planning architecture was evaluated in a dynamic tower-building domain. A further domain, in which a simulated Turtlebot acted as an office assistant, was used to show that the VQA and planning architectures could be combined. Automated axiom learning improved the performance of knowledge base reasoning in all four domains.
In the stability and traffic domains, using axiom learning to add to the knowledge base improved VQA accuracy, while in the tower-building and office assistant domains, the action plans developed after axiom learning were more sensible and efficient than those developed before.