Human feedback in continuous actor-critic reinforcement learning

Millán, C; Fernandes, B; Cruz Naranjo, F

File(s) under permanent embargo


conference contribution

Posted on 2019-01-01, 00:00. Authored by C Millán, B Fernandes, F Cruz Naranjo.

© 2019 ESANN (i6doc.com). All rights reserved.

Reinforcement learning is used in settings where an agent learns from its environment. Continuous actions can improve performance compared with discrete actions; however, they also increase the time needed to find a suitable policy. In this work, we focus on including human feedback in reinforcement learning with a continuous action space. We unify the policy and the feedback to favor actions of low probability density. Furthermore, we compare the performance of the feedback for the continuous actor-critic algorithm and evaluate the approach on the cart-pole balancing task. The results show that the proposed approach increases the accumulated reward in comparison to the autonomous learning method.
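The abstract does not spell out how the policy and the feedback are unified, and the full text is embargoed, so the following is only an illustrative sketch and not the authors' method. It assumes a one-dimensional Gaussian policy and models the human feedback as a second Gaussian centred on an advised action; the two are combined as a mixture so that an action the policy alone would rarely pick becomes likely. The names `combined_pdf`, `sample_action`, `advised_action`, `advice_std`, and `trust` are all hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def combined_pdf(x, policy_mean, policy_std, advised_action, advice_std, trust):
    """Mixture of policy density and feedback density (assumed form, not from the paper)."""
    policy = gaussian_pdf(x, policy_mean, policy_std)
    feedback = gaussian_pdf(x, advised_action, advice_std)
    return (1.0 - trust) * policy + trust * feedback


def sample_action(policy_mean, policy_std, advised_action=None,
                  advice_std=0.1, trust=0.5, rng=random):
    """Sample from the policy alone, or from the mixture when advice is given."""
    if advised_action is not None and rng.random() < trust:
        # Follow the (hypothetical) human advice with probability `trust`.
        return rng.gauss(advised_action, advice_std)
    return rng.gauss(policy_mean, policy_std)


if __name__ == "__main__":
    # The advised action lies far out in the tail of the policy.
    before = gaussian_pdf(1.0, 0.0, 0.2)
    after = combined_pdf(1.0, 0.0, 0.2, advised_action=1.0, advice_std=0.1, trust=0.5)
    print(f"policy density at advised action:   {before:.6f}")
    print(f"combined density at advised action: {after:.6f}")
```

A mixture is only one of several plausible ways to combine the two sources; it is used here because it keeps the autonomous policy unchanged when no advice is available (`advised_action=None` or `trust=0`).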

History

Event

Artificial Neural Networks, Computational Intelligence and Machine Learning. European Symposium (27th : 2019 : Bruges, Belgium)

Pagination

661 - 666

Publisher

ESANN

Location

Bruges, Belgium

Place of publication

[Bruges, Belgium]

Start date

2019-04-24

End date

2019-04-26

ISBN-13

9782875870650

Language

eng

Publication classification

E1.1 Full written paper - refereed

Copyright notice

2019, ESANN

Title of proceedings

ESANN 2019 : Proceedings, 27th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning

