This project implements a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent to optimize deep neural network controllers. It is an implementation of the research paper of the same name, with the Flappy Bird game used as the test environment.
- Python 3
- TensorFlow
- pygame
- OpenCV-Python
- First clone the project into your local system
git clone https://github.com/Nuclearstar/Asynchronous-Methods-for-Deep-Reinforcement-Learning.git
- Then change directory to this project
cd Asynchronous-Methods-for-Deep-Reinforcement-Learning
- Then set up a virtual env
python -m venv myenv
- Then activate your virtual env (on Windows)
myenv\Scripts\activate
- On Linux/macOS, activate with
source myenv/bin/activate
- Next install all the required packages in the virtual env
pip install -r requirements.txt
- Now you are ready to run the program
python deep_q_network_actual.py
We present asynchronous variants of three standard reinforcement learning algorithms:
- One-step Q-learning
- Advantage actor-critic
- n-step Q-learning
We present multi-threaded asynchronous n-step Q-learning and advantage actor-critic.
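To make the n-step update concrete, here is a minimal sketch of the n-step return it bootstraps from; the function name and arguments are illustrative assumptions, not code from this repository.

```python
# n-step return: rewards collected over up to n steps, plus a bootstrapped
# value estimate for the state reached at the end.
def n_step_return(rewards, bootstrap_value, gamma=0.99):
    """rewards: [r_t, ..., r_{t+n-1}]; bootstrap_value: max_a Q(s_{t+n}, a)."""
    g = bootstrap_value
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# e.g. n_step_return([0.0, 0.0, 1.0], 0.5) == 0.99**2 * 1.0 + 0.99**3 * 0.5
```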
First, we use asynchronous actor-learners, similarly to the Gorila framework, but instead of using separate machines and a parameter server, we use multiple CPU threads on a single machine.
Second, we make the observation that multiple actor-learners running in parallel are likely to be exploring different parts of the environment, so their updates are less correlated in time than those of a single online agent.
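A minimal sketch of this pattern is shown below, assuming a toy chain environment and a tabular Q-function standing in for the deep network; every name here is illustrative, not the repository's code. Each thread reads and updates the shared parameters without locks, and each uses its own exploration rate.

```python
import threading
import numpy as np

N_STATES, N_ACTIONS = 6, 2
GAMMA, ALPHA, STEPS = 0.9, 0.1, 20000
shared_q = np.zeros((N_STATES, N_ACTIONS))  # shared "network" parameters

def env_step(s, a):
    # Toy chain: action 1 moves right, action 0 moves left; reward at the end.
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    return s2, float(s2 == N_STATES - 1)

def worker(thread_id):
    rng = np.random.default_rng(thread_id)
    epsilon = rng.choice([0.1, 0.3, 0.5])  # per-thread exploration rate
    s = 0
    for _ in range(STEPS):
        # epsilon-greedy action from the *shared* parameters
        a = int(rng.integers(N_ACTIONS)) if rng.random() < epsilon \
            else int(shared_q[s].argmax())
        s2, r = env_step(s, a)
        # one-step Q-learning target; no bootstrap past the terminal state
        target = r + GAMMA * shared_q[s2].max() * (s2 != N_STATES - 1)
        shared_q[s, a] += ALPHA * (target - shared_q[s, a])  # lock-free update
        s = 0 if s2 == N_STATES - 1 else s2

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(np.round(shared_q, 2))  # right-moving actions should dominate
```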
A Deep Q-Network (DQN) is a convolutional neural network trained with a variant of Q-learning.
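As a hedged sketch of what that variant looks like in TensorFlow, the one-step Q-learning objective regresses Q(s, a) toward the bootstrapped target y = r + gamma * max_a' Q(s', a'); the function and argument names below are assumptions for illustration.

```python
import tensorflow as tf

def q_learning_loss(q_network, states, actions, rewards, next_states,
                    not_done, n_actions, gamma=0.99):
    # Q-values of the actions that were actually taken
    q_taken = tf.reduce_sum(q_network(states) * tf.one_hot(actions, n_actions), axis=1)
    # one-step bootstrapped targets; stop_gradient keeps the target branch fixed
    targets = rewards + gamma * not_done * tf.reduce_max(q_network(next_states), axis=1)
    return tf.reduce_mean(tf.square(tf.stop_gradient(targets) - q_taken))
```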
The Q-function is approximated with a neural network. A naive formulation takes the state and action as input and outputs a single Q-value; here, instead, the network takes only the state as input and outputs a Q-value for every action, so one forward pass makes all Q-values available at once.
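A minimal sketch of such a network follows (the repository's actual architecture may differ): a stack of preprocessed game frames goes in, and a single forward pass returns one Q-value per action.

```python
import numpy as np
import tensorflow as tf

N_ACTIONS = 2  # Flappy Bird: flap or do nothing

# State in, one Q-value per action out (architecture details are assumptions)
q_network = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 8, strides=4, activation="relu",
                           input_shape=(80, 80, 4)),   # 4 stacked 80x80 frames
    tf.keras.layers.Conv2D(64, 4, strides=2, activation="relu"),
    tf.keras.layers.Conv2D(64, 3, strides=1, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.Dense(N_ACTIONS),                  # linear Q-value head
])

state = np.zeros((1, 80, 80, 4), dtype=np.float32)     # dummy preprocessed state
q_values = q_network(state)                            # one forward pass
action = int(tf.argmax(q_values[0]))                   # greedy action
```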
We have presented asynchronous versions of three standard reinforcement learning algorithms and shown that they are able to train neural network controllers on a variety of domains in a stable manner.
Combining other existing reinforcement learning methods or recent advances in deep reinforcement learning with our asynchronous framework presents many possibilities for immediate improvements to the methods we presented.