Networks are divided depending on their context.
For some reason it’s common to find a convention of heads and bodies, and that’s why we’re keeping it here.
If you haven’t heard of these before, think about Frankenstein’s monster.
A body is not a whole body but rather a body part, e.g. arms and legs.
Obviously(!), they don’t work by themselves, so you need a head which will control them.
Some heads take body parts explicitly and build the whole monstrosity, and some heads are predefined
to closely match a suggestion in a paper. So, in general, a head is more complex and does more than a body,
but for some agents a single body part, e.g. a fully connected network, is good enough.
Constructs a layered network over torch.nn.Conv2d. The number of layers is set by the hidden_layers argument.
To update other arguments, e.g. kernel_size or bias, pass either a single value or a tuple of the same
length as hidden_layers.
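The "single value or tuple" rule above can be sketched as follows. This is a minimal illustration only: expand_per_layer is a hypothetical helper, not a function from the library, and it assumes that a tuple always means "one entry per layer".

```python
def expand_per_layer(value, num_layers):
    """Return one entry per layer (hypothetical helper, not the library's code)."""
    if isinstance(value, (tuple, list)):
        # Assumption: a tuple is interpreted as per-layer values.
        if len(value) != num_layers:
            raise ValueError(
                f"Expected {num_layers} values, one per layer, got {len(value)}"
            )
        return tuple(value)
    # A single value is repeated for every layer.
    return (value,) * num_layers


hidden_layers = (32, 64, 64)
kernel_sizes = expand_per_layer(3, len(hidden_layers))
biases = expand_per_layer((True, True, False), len(hidden_layers))
```

Here kernel_sizes holds the same kernel size for all three layers, while biases switches the bias off only for the last layer.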
Although the recipe for forward pass needs to be defined within
this function, one should call the Module instance afterwards
instead of this since the former takes care of running the
registered hooks while the latter silently ignores them.
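To see why calling the instance differs from calling forward directly, here is a toy stand-in for the pattern. ToyModule is purely illustrative and is not torch.nn.Module; it only mimics the idea that the call operator wraps forward and then runs the registered hooks.

```python
class ToyModule:
    """Toy stand-in for the call-vs-forward pattern; NOT torch.nn.Module."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        self._forward_hooks.append(hook)

    def forward(self, x):
        # The "recipe" for the forward pass lives here.
        return 2 * x

    def __call__(self, x):
        # Calling the instance runs forward and then every registered hook.
        out = self.forward(x)
        for hook in self._forward_hooks:
            hook(self, x, out)
        return out


seen = []
net = ToyModule()
net.register_forward_hook(lambda module, inp, out: seen.append(out))

via_call = net(3)             # hooks run, so the output is recorded in `seen`
via_forward = net.forward(4)  # hooks are silently skipped
```

Both calls return a result, but only the output of the first one reaches the hook.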
Mainly used to estimate the state-action value function in actor-critic agents.
By default, actions are injected into the first hidden layer; this can be changed with inj_action_layer.
Since the main purpose of this network is value function estimation, the output is a single value.
in_features (tuple of ints) – Dimension of the input features.
inj_action_size (int) – Dimension of the action vector that is injected into inj_action_layer.
out_features (tuple of ints) – Dimension of the critic’s output. Default: (1,).
hidden_layers (tuple of ints) – Shape of the hidden layers. Default: (100, 100).
inj_action_layer (int) – Index of the layer that will have actions injected as an additional input.
By default that’s the first hidden layer, i.e. (state) -> (out + actions) -> (out) … -> (output).
Default: 1.
Keyword Arguments:
bias (bool) – Whether to include bias in network’s architecture. Default: True.
gate (callable) – Activation function for each layer, except the last. Default: Identity layer.
gate_out (callable) – Activation function after the last layer. Default: Identity layer.
device – Device where to allocate memory. CPU or CUDA. Default: CUDA if available.
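The effect of inj_action_layer on layer sizes can be sketched as below. This is an illustration under assumptions: critic_layer_shapes is a hypothetical helper, and it uses plain ints for the input and output sizes rather than the tuples listed above.

```python
def critic_layer_shapes(in_size, inj_action_size, hidden_layers=(100, 100),
                        out_size=1, inj_action_layer=1):
    """Return (fan_in, fan_out) for each linear layer (hypothetical helper)."""
    sizes = [in_size, *hidden_layers, out_size]
    shapes = []
    for idx in range(len(sizes) - 1):
        fan_in = sizes[idx]
        if idx == inj_action_layer:
            # The action vector is concatenated to this layer's input.
            fan_in += inj_action_size
        shapes.append((fan_in, sizes[idx + 1]))
    return shapes


# An 8-dimensional state with a 2-dimensional action, default injection layer.
shapes = critic_layer_shapes(in_size=8, inj_action_size=2)
```

With the defaults, only the second linear layer grows by the action size, which matches the (state) -> (out + actions) -> (out) … -> (output) chain described above.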
We use tanh as the default activation because it was observed to perform much better than, e.g., ReLU
for policy networks [1]. The last gate, however, may need to be changed depending on the actual task.
A linear layer with added noise perturbations in training as described in [1].
For a fully connected network of NoisyLayers see NoisyNet.
Parameters:
in_features (tuple of ints) – Dimension of the input.
out_features (tuple of ints) – Dimension of the output.
sigma (float) – Used to initialise the noise distribution. Default: 0.4.
factorised (bool) – Whether to use independent Gaussian (False) or Factorised Gaussian (True) noise.
For DQN and Duelling nets, [1] suggests using factorised noise as it’s quicker.
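The difference between the two noise types can be sketched in plain Python. This is a minimal illustration, not the library's implementation: all function names are hypothetical, and the factorised variant uses the common scaling f(x) = sign(x) * sqrt(|x|) on per-input and per-output noise vectors.

```python
import math
import random


def scale_noise(x):
    # f(x) = sign(x) * sqrt(|x|), applied to each raw Gaussian sample.
    return math.copysign(math.sqrt(abs(x)), x)


def sample_weight_noise(in_size, out_size, factorised=True, rng=random):
    """Return an out_size x in_size noise matrix (hypothetical helper)."""
    if factorised:
        # Only in_size + out_size samples are drawn; the matrix is their outer product.
        eps_in = [scale_noise(rng.gauss(0, 1)) for _ in range(in_size)]
        eps_out = [scale_noise(rng.gauss(0, 1)) for _ in range(out_size)]
        return [[eo * ei for ei in eps_in] for eo in eps_out]
    # Independent noise draws one sample per weight: in_size * out_size samples.
    return [[rng.gauss(0, 1) for _ in range(in_size)] for _ in range(out_size)]


def noisy_linear(x, mu_w, sigma_w, eps_w, mu_b, sigma_b, eps_b):
    """y = (mu_w + sigma_w * eps_w) x + (mu_b + sigma_b * eps_b), element-wise noise."""
    return [
        sum((mu_w[j][i] + sigma_w[j][i] * eps_w[j][i]) * x[i] for i in range(len(x)))
        + mu_b[j] + sigma_b[j] * eps_b[j]
        for j in range(len(mu_b))
    ]
```

Factorised noise needs far fewer random draws per layer, which is where the speed advantage mentioned above comes from.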
in_features (tuple of ints) – Dimension of the input.
out_features (tuple of ints) – Dimension of the output.
hidden_layers (sequence of ints) – Sizes of latent layers. The length of the sequence is the number of hidden
layers and its values are the nodes per layer. If None is passed then the input goes straight to the output.
Default: (100, 100).
sigma (float) – Variance value for generating noise in noisy layers. Default: 0.4 per layer.
factorised (bool) – Whether to use independent Gaussian (False) or Factorised Gaussian (True) noise.
For DQN and Duelling nets, [1] suggests using factorised noise as it’s quicker.
Keyword Arguments:
gate (callable) – Function to apply after each layer pass. For the best performance it is suggested
to use non-linear functions such as tanh. Default: tanh.
gate_out (callable) – Function to apply on network’s exit. Default: identity.
device (str or torch.device) – Whether and where to cast the network. Default: CUDA if available, else CPU.
Heads are built on Brains.
Like in real life, heads do all the difficult parts: receiving stimuli,
being above everything else and not falling apart.
You take the brains out and they just do nothing. Lazy.
The most common use case is when one head contains one brain.
But who are we to say what you can and cannot do.
You want two brains and a head within your head? Sure, go crazy.
What we’re trying to do here is to keep things relatively simple.
Unfortunately, not everything can be achieved [citation needed] with a serial
topology and at some point you’ll need branching.
Heads are “special” in that each is built on networks/brains and will likely need
some special piping when attaching to your agent.
Computes a discrete probability distribution for the state-action Q function.
CategoricalNet [1] learns in a significantly different way from the other nets here.
For this reason it isn’t suitable as a simple replacement in most (current) agents.
Please check whether the Agent supports it.
The algorithm is also used in RainbowNet, though not through this particular net.
num_atoms – Number of atoms that discretize the probability distribution.
v_min – Minimum (edge) value of the shifted distribution.
v_max – Maximum (edge) value of the shifted distribution.
net – (Optional) A network used for estimation. If net is provided then hidden_layers has no effect.
obs_space – Size of the observation.
action_size – Length of the output.
hidden_layers – Shape of the hidden layers that are fully connected networks.
Note that either net, or both obs_space and action_size, must not be None.
If obs_space and action_size are provided then the default net is created as
a fully connected network with layers sized according to hidden_layers.
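The role of num_atoms, v_min and v_max can be sketched as follows. This is an illustration under assumptions, not the library's code: the helper names are hypothetical, the atoms are evenly spaced between v_min and v_max, and the net is assumed to output one logit per atom for a given action.

```python
import math


def atom_support(num_atoms, v_min, v_max):
    """Evenly spaced atom values z_i between v_min and v_max (inclusive)."""
    delta = (v_max - v_min) / (num_atoms - 1)
    return [v_min + i * delta for i in range(num_atoms)]


def softmax(logits):
    # Subtract the maximum for numerical stability.
    top = max(logits)
    exps = [math.exp(value - top) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_q(logits, support):
    """Q(s, a) = sum_i p_i * z_i for a single action's atom logits."""
    probs = softmax(logits)
    return sum(p * z for p, z in zip(probs, support))


support = atom_support(num_atoms=21, v_min=-10, v_max=10)
q_uniform = expected_q([0.0] * 21, support)  # uniform distribution over atoms
```

Because the net outputs a distribution per action rather than a single Q value, an agent has to reduce it (for example with the expectation above) before it can compare actions.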
in_features (tuple of ints) – Shape of the input.
out_features (tuple of ints) – Shape of the expected output.
Keyword Arguments:
hidden_layers (tuple of ints) – Shape of fully connected networks. Default: (200, 200).
num_atoms (int) – Number of atoms used in estimating distribution. Default: 21.
v_min (float) – Value distribution minimum (leftmost) value. Default: -10.
v_max (float) – Value distribution maximum (rightmost) value. Default: 10.
noisy (bool) – Whether to use Noisy version of FC networks.
pre_network_fn (func) – A shared network that is used before value and advantage networks.
device (None, str or torch.device) – Device where to cast the network. Can be assigned with strings, or
directly passing torch.device type. If None then it tries to use CUDA then CPU. Default: None.
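How a value stream and an advantage stream are typically merged per atom can be sketched as below. This is the standard duelling aggregation, written under the assumption that this net combines its value and advantage networks the same way; dueling_logits is a hypothetical helper.

```python
def dueling_logits(value, advantages):
    """Combine streams per atom: V_i + A_{a,i} - mean_a(A_{a,i}).

    value:      list of length num_atoms
    advantages: list of num_actions lists, each of length num_atoms
    """
    num_actions = len(advantages)
    num_atoms = len(value)
    # Mean advantage over actions, computed separately for each atom.
    mean_adv = [
        sum(advantages[a][i] for a in range(num_actions)) / num_actions
        for i in range(num_atoms)
    ]
    return [
        [value[i] + advantages[a][i] - mean_adv[i] for i in range(num_atoms)]
        for a in range(num_actions)
    ]


# Two actions and two atoms.
logits = dueling_logits([1.0, 2.0], [[1.0, 0.0], [3.0, 0.0]])
```

The resulting per-action logits would then be normalised over the atoms (e.g. with a softmax) to obtain one value distribution per action.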