
File(s) under permanent embargo

Proxy Experience Replay: Federated Distillation for Distributed Reinforcement Learning

Version 2 2024-06-06, 01:48
Version 1 2020-07-06, 16:15
journal contribution
posted on 2024-06-06, 01:48 authored by H Cha, Jihong Park, H Kim, M Bennis, SL Kim
Traditional distributed deep reinforcement learning (RL) commonly relies on exchanging the experience replay memory (RM) of each agent. Since the RM contains all state observations and the action policy history, exchanging it can incur significant communication overhead and violate each agent's privacy. Alternatively, this article presents a communication-efficient and privacy-preserving distributed RL framework, coined federated reinforcement distillation (FRD). In FRD, each agent exchanges its proxy experience replay memory (ProxRM), in which policies are locally averaged over proxy states that cluster actual states. To provide FRD design insights, we present ablation studies on the impact of ProxRM structures, neural network architectures, and communication intervals. Furthermore, we propose an improved version of FRD, coined mixup augmented FRD (MixFRD), in which the ProxRM is interpolated using the mixup data augmentation algorithm. Simulations validate the effectiveness of MixFRD in reducing the variance of mission completion time and communication cost, compared to the benchmark schemes: vanilla FRD, federated reinforcement learning (FRL), and policy distillation (PD).
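
The following is a minimal sketch, not the authors' reference implementation, of the ProxRM idea described in the abstract. It assumes states are feature vectors, proxy states are k-means cluster centroids, and the per-state "policy" is a softmax action distribution; the function names `build_proxrm` and `mixup_proxrm` and all parameter values are hypothetical choices for illustration only.

```python
# Sketch of proxy experience replay memory (ProxRM) construction and a
# MixFRD-style mixup augmentation, under the assumptions stated above.
import numpy as np


def build_proxrm(states, policies, n_proxy=8, n_iters=20, seed=0):
    """Cluster actual states into proxy states (plain k-means) and locally
    average the per-state action distributions within each cluster."""
    rng = np.random.default_rng(seed)
    centroids = states[rng.choice(len(states), n_proxy, replace=False)]
    for _ in range(n_iters):
        # assign each actual state to its nearest proxy state
        dists = np.linalg.norm(states[:, None, :] - centroids[None, :, :], axis=-1)
        assign = dists.argmin(axis=1)
        for k in range(n_proxy):
            members = assign == k
            if members.any():
                centroids[k] = states[members].mean(axis=0)
    # proxy policy = local average of the policies assigned to each proxy state
    proxy_policies = np.stack([
        policies[assign == k].mean(axis=0) if (assign == k).any()
        else np.full(policies.shape[1], 1.0 / policies.shape[1])
        for k in range(n_proxy)
    ])
    # the (proxy state, averaged policy) table is the ProxRM that agents exchange
    return centroids, proxy_policies


def mixup_proxrm(centroids, proxy_policies, n_aug=16, alpha=0.4, seed=0):
    """MixFRD-style augmentation: linearly interpolate random pairs of ProxRM
    entries with mixup coefficients drawn from a Beta(alpha, alpha) distribution."""
    rng = np.random.default_rng(seed)
    i = rng.integers(len(centroids), size=n_aug)
    j = rng.integers(len(centroids), size=n_aug)
    lam = rng.beta(alpha, alpha, size=(n_aug, 1))
    mixed_states = lam * centroids[i] + (1 - lam) * centroids[j]
    mixed_policies = lam * proxy_policies[i] + (1 - lam) * proxy_policies[j]
    return mixed_states, mixed_policies


if __name__ == "__main__":
    # toy data: 100 four-dimensional states with 3-action softmax policies
    rng = np.random.default_rng(1)
    states = rng.normal(size=(100, 4))
    logits = rng.normal(size=(100, 3))
    policies = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

    prox_states, prox_policies = build_proxrm(states, policies)
    aug_states, aug_policies = mixup_proxrm(prox_states, prox_policies)
    print(prox_states.shape, prox_policies.shape)  # (8, 4) (8, 3)
    print(aug_states.shape, aug_policies.shape)    # (16, 4) (16, 3)
```

In this reading, only the compact (proxy state, averaged policy) table leaves the agent, rather than the raw replay memory, which is where the communication and privacy savings described in the abstract would come from.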

History

Journal

IEEE Intelligent Systems

Volume

35

Season

July-Aug

Pagination

94-101

Location

Piscataway, N.J.

ISSN

1541-1672

eISSN

1941-1294

Language

eng

Publication classification

C1 Refereed article in a scholarly journal

Issue

4

Publisher

Institute of Electrical and Electronics Engineers