Openly accessible

An Evaluation Methodology for Interactive Reinforcement Learning with Simulated Users

Bignold, A, Cruz, F, Dazeley, Richard, Vamplew, P and Foale, C 2021, An Evaluation Methodology for Interactive Reinforcement Learning with Simulated Users, Biomimetics, vol. 6, no. 1, pp. 1-15, doi: 10.3390/biomimetics6010013.

Attached Files
Name | Description | MIMEType | Size | Downloads
dazeley-evaluationmethodology-2021.pdf | Published version | application/pdf | 1.18MB | 5

Title An Evaluation Methodology for Interactive Reinforcement Learning with Simulated Users
Author(s) Bignold, A
Cruz, F
Dazeley, Richard (ORCID iD: orcid.org/0000-0002-6199-9685)
Vamplew, P
Foale, C
Journal name Biomimetics
Volume number 6
Issue number 1
Start page 1
End page 15
Total pages 15
Publisher MDPI
Place of publication Basel, Switzerland
Publication date 2021
ISSN 2313-7673
Keyword(s) interactive reinforcement learning
methodology for simulated users
reinforcement learning
reward shaping
Summary Interactive reinforcement learning methods utilise an external information source to evaluate decisions and accelerate learning. Previous work has shown that human advice can significantly improve learning agents’ performance. When evaluating reinforcement learning algorithms, it is common to repeat experiments as parameters are altered or to gain a sufficient sample size. Requiring human interaction every time an experiment is restarted is therefore undesirable, particularly when the expense of doing so can be considerable. Additionally, reusing the same people for the experiment introduces bias, as they will learn the behaviour of the agent and the dynamics of the environment. This paper presents a methodology for evaluating interactive reinforcement learning agents by employing simulated users, which allow human knowledge, bias, and interaction to be simulated. Simulated users support the development and testing of reinforcement learning agents and can provide indicative results of agent performance under defined human constraints. While simulated users are no replacement for actual humans, they do offer an affordable and fast alternative for evaluating assisted agents. We introduce a method for performing a preliminary evaluation with simulated users to show how performance changes depending on the type of user assisting the agent. Moreover, we describe how human interaction may be simulated, and present an experiment illustrating the applicability of simulated users in evaluating agent performance when assisted by different types of trainers. Experimental results show that this methodology gives greater insight into the performance of interactive reinforcement learning agents when advised by different users. Using simulated users with varying characteristics allows the impact of those characteristics on the behaviour of the learning agent to be evaluated.
Language eng
DOI 10.3390/biomimetics6010013
Indigenous content off
HERDC Research category C1 Refereed article in a scholarly journal
Free to Read? Yes
Use Rights Creative Commons Attribution licence
Persistent URL http://hdl.handle.net/10536/DRO/DU:30148126

Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Every reasonable effort has been made to ensure that permission has been obtained for items included in DRO. If you believe that your rights have been infringed by this repository, please contact drosupport@deakin.edu.au.

Created: Thu, 18 Feb 2021, 08:06:46 EST