Federated Reinforcement Learning#
Reinforcement Learning is a machine learning paradigm in which a trained agent takes actions, given the current state, in order to maximize some reward function. See for example the Wikipedia page on Reinforcement Learning.
Reinforcement Learning has been used to train agents that successfully carry out tasks in areas such as Go, computer games, and autonomous driving.
OctaiPipe implements a framework for training Reinforcement Learning agents in Federated settings. It uses the same commands and structure as other FL workloads, with some adjustment for RL specific components and configuration. A sample config file for Federated Reinforcement Learning (FRL) is shown below.
name: federated_learning

infrastructure:
  device_ids:
    - device-0
    - device-1
  image_tag: 3.1.0

model_specs:
  type: frl
  name: test_rl_model
  model_params:
    env:
      path: ./envs/env_test.py
      params:
        gravity: 9.8
        masscart: 1.0
    policy:
      name: PPO
      params:
        learning_rate: 0.0003
        gamma: 0.99
        clip_range: 0.2
        n_steps: 128
        batch_size: 64
        n_epochs: 10

run_specs:
  backend: frl
The config file follows a similar structure to the configs of other FL experiments. The name at the top should be federated_learning. The infrastructure section defines which devices the experiment runs on and which version of OctaiPipe to use. In the run_specs, the only key field is backend, which should be set to frl.
The data_specs and model_specs are explained in more detail below:
Data specs#
As opposed to other FL workloads in OctaiPipe, FRL does not use input_data_specs or evaluation_data_specs. This is because some RL environments do not necessarily need to read data for training or evaluation. If data loading is required, this should be handled in the RL environment Python file. See more at RL Environments.
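To illustrate how data loading can live inside the environment file itself, here is a minimal sketch. The class name, constructor signature, and Gymnasium-style reset/step interface are assumptions for illustration; the exact interface OctaiPipe expects is described in its RL Environments documentation.

```python
import csv
import tempfile

import numpy as np


class SensorEnv:
    """Hypothetical Gymnasium-style environment sketch. It loads its own
    training data from a local CSV, since FRL does not use OctaiPipe's
    input_data_specs."""

    def __init__(self, data_path):
        # Data loading is handled inside the environment file itself.
        with open(data_path) as f:
            rows = list(csv.reader(f))
        self.data = np.array(rows[1:], dtype=float)  # skip header row
        self.t = 0

    def reset(self):
        self.t = 0
        return self.data[self.t]

    def step(self, action):
        self.t += 1
        obs = self.data[self.t]
        # Toy reward: how closely the action tracks the first data column.
        reward = -abs(float(action) - obs[0])
        done = self.t >= len(self.data) - 1
        return obs, reward, done, {}


# Demo: write a tiny CSV and drive the environment for one step.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as f:
    f.write("temp,pressure\n20.0,1.0\n21.0,1.1\n22.0,1.2\n")
    path = f.name

env = SensorEnv(path)
obs = env.reset()
obs, reward, done, _ = env.step(21.0)
```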
Model specs#
The model_specs configure the details for what model to run and how. For FRL, the type should always be frl and the name should be something that helps you understand what the model does, e.g. thermostat_control_agent.
The model_params contain two fields. The env field has a path specifying the local filepath to a Python file containing the RL environment, and params, which contains the parameters to set for that environment. Details on environments can be found at RL Environments.
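One way these pieces might fit together is sketched below. The class and the assumption that OctaiPipe passes the config's env.params to the environment's constructor as keyword arguments are both hypothetical, shown only to illustrate the relationship between the config and the environment file.

```python
class CartEnv:
    """Hypothetical environment class; the constructor parameters mirror
    the env.params keys (gravity, masscart) in the sample config."""

    def __init__(self, gravity=9.8, masscart=1.0):
        self.gravity = gravity
        self.masscart = masscart


# The params block from the sample config, as a dict. Passing it as
# keyword arguments is an assumption about the calling convention.
params = {"gravity": 9.8, "masscart": 1.0}
env = CartEnv(**params)
```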
The policy field contains a name, which specifies which FRL policy to use, and params to hand to that policy. Policies are explained in detail in RL Policies.
RL Environments and Policies#