Enhancing Reinforcement Learning Efficiency: Novel Distributed Algorithms, Human-Inspired Reward Mechanisms, and State Space Analysis for Simulated and Real-World Applications

Degree Grantor

The University of Auckland

Abstract

Reinforcement Learning (RL), a subset of Machine Learning (ML), has garnered significant attention because it enables systems to learn to solve problems without pre-collected datasets. Despite its long history and proven efficacy, mostly in simulation and video games, RL algorithms face several limitations: a lack of standardisation in definitions and concepts, sample inefficiency in sparse-reward environments, difficulty with high-dimensional state spaces, the sim-to-real gap, complex training strategies, and the need for meticulous hyperparameter tuning. Consequently, applying RL to intricate real-world scenarios with straightforward implementations remains an unresolved challenge. This thesis endeavours to address these challenges by first providing clear explanations of RL concepts and definitions. It then identifies and analyses the limitations of diverse RL algorithms before presenting two novel solutions inspired by the human brain and the contextual intricacies of the real world.

Specifically, two new algorithms, NaSA-TD3 and CTD4, are introduced. NaSA-TD3 explores the impact of human-inspired stimuli, presented as a reward bonus, on exploration and sample efficiency in both dense- and sparse-reward environments. CTD4 mitigates the overestimation bias of Q-values and simplifies the implementation and training of traditional categorical RL methods. Furthermore, this thesis critically examines the applicability of Model-Based Reinforcement Learning (MBRL) to a real-world scenario in terms of sample efficiency and hyperparameter robustness, identifying the strengths and weaknesses of its dynamics-model representations.
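The overestimation-bias problem that CTD4 targets is commonly countered by bootstrapping from a pessimistic combination of two independent critics. The sketch below shows the classic clipped double-Q target in that family (a TD3-style illustration under assumed scalar critics, not the categorical CTD4 update itself; the function name and values are hypothetical):

```python
def clipped_double_q_target(reward, q1_next, q2_next, gamma=0.99, done=False):
    """TD target that bootstraps from the minimum of two critic estimates.

    Because each critic's noise inflates its own maximum, taking the
    pairwise minimum yields a pessimistic value and curbs the
    overestimation bias that plagues single-critic Q-learning.
    """
    bootstrap = 0.0 if done else min(q1_next, q2_next)
    return reward + gamma * bootstrap

# Toy illustration: the target follows the smaller of the two estimates.
target = clipped_double_q_target(reward=1.0, q1_next=10.4, q2_next=9.8)
```

Distributional methods such as CTD4 extend this idea from point estimates to full return distributions, which is where the simplified training strategy described above comes in.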

Through rigorous experimentation and analysis in both real-world and simulated environments, this thesis advances the understanding and application of RL. Validation on physical robots and standardised virtual platforms confirms the applicability of the proposed algorithms, and detailed experiences and recommendations are reported. By providing clear definitions, thorough explanations, and effective, innovative algorithms with trustworthy implementations, this thesis opens new avenues for the practical application of RL in diverse domains; all source code, with clear usage instructions, is provided.

Keywords

Reinforcement Learning, Machine Learning, Robotics
