dc.contributor.advisor |
Liu, Jiamou |
|
dc.contributor.author |
Chen, Yang |
|
dc.date.accessioned |
2022-05-18T20:35:30Z |
|
dc.date.available |
2022-05-18T20:35:30Z |
|
dc.date.issued |
2022 |
en |
dc.identifier.uri |
https://hdl.handle.net/2292/59350 |
|
dc.description.abstract |
bandit algorithm that achieves a good trade-off in this complex environment setting,
which was previously challenging to handle.
Then, for the multi-agent case, we develop an MARL algorithm that can reproduce
the emergence of structural norms in multi-agent systems where agents
are connected in a networked manner, i.e., the formation of particular structural
properties as a function of agent interactions. Our method bridges the gap between
structural norms and MARL, as existing approaches for norm emergence either
ignore the structures in multi-agent systems or fail to analyse structural norms on
the grounds of MARL.
Lastly, for the many-agent case, where the number of agents is far more than two,
we study IRL for large-scale multi-agent systems, which has long been intractable
due to the curse of dimensionality. To achieve tractability, we adopt mean field
games as the model for multi-agent systems. We propose two novel IRL algorithms,
which we collectively call mean field IRL. The first algorithm builds the theoretical
foundations and justifications for IRL in mean field games, while the second
offers an efficient probabilistic framework for reward inference in mean field
games. Together, these two algorithms extend IRL to mean field games both theoretically
and practically, broadening our scope towards modelling purposeful behaviours for
large populations.
We empirically evaluate these new algorithms on simulated and real-world
scenarios, including recommender systems, simulated social interactions and simulated
economic problems. Experimental results confirm the effectiveness of these new
algorithms and demonstrate that they outperform the existing methods in the
literature. |
|
dc.publisher |
ResearchSpace@Auckland |
en |
dc.relation.ispartof |
PhD Thesis - University of Auckland |
en |
dc.relation.isreferencedby |
UoA |
en |
dc.rights |
Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated. |
en |
dc.rights.uri |
https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm |
en |
dc.rights.uri |
http://creativecommons.org/licenses/by-nc-nd/3.0/nz/ |
|
dc.title |
From One to Infinity: New Algorithms for Reinforcement Learning and Inverse Reinforcement Learning |
|
dc.type |
Thesis |
en |
thesis.degree.discipline |
Computer Science |
|
thesis.degree.grantor |
The University of Auckland |
en |
thesis.degree.level |
Doctoral |
en |
thesis.degree.name |
PhD |
en |
dc.date.updated |
2022-04-20T21:21:16Z |
|
dc.rights.holder |
Copyright: The author |
en |
dc.rights.accessrights |
http://purl.org/eprint/accessRights/OpenAccess |
en |