# DRL Robot Navigation in IR-SIM
Deep Reinforcement Learning algorithm implementations for simulated robot navigation in IR-SIM. Using 2D laser sensor data and information about the goal point, a robot learns to navigate to a specified point in the environment.
## Installation

- Package versioning is managed with Poetry:
  ```sh
  pip install poetry
  ```
- Clone the repository:
  ```sh
  git clone https://github.com/reiniscimurs/DRL-robot-navigation.git
  ```
- Navigate to the cloned location and install the dependencies with Poetry:
  ```sh
  poetry install
  ```
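To sanity-check the installation, IR-SIM can be run on its own inside the Poetry environment. The snippet below follows the quickstart pattern from the IR-SIM repository; `robot_world.yaml` is a placeholder world config, not a file shipped with this repository:

```python
import irsim

# Build a simulation environment from a world config (placeholder file name)
env = irsim.make("robot_world.yaml")

for _ in range(300):
    env.step()      # advance the simulation by one step
    env.render()    # draw the current world state
    if env.done():  # stop once the episode has finished
        break

env.end()  # close the render window and clean up
```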
## Training the model

- Run the training by executing the train.py file:
  ```sh
  poetry run python robot_nav/train.py
  ```
- To open TensorBoard, execute in a new terminal:
  ```sh
  tensorboard --logdir runs
  ```
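Conceptually, training follows the standard off-policy pattern: roll out the policy in the simulator, store transitions in a replay buffer, and take gradient steps on replayed batches. The sketch below is illustrative only; `agent`, `env`, and every method name are hypothetical stand-ins rather than this repository's actual API:

```python
def run_training(agent, env, episodes=1000, max_steps=500):
    """Generic off-policy training loop (illustrative, not the repo's train.py)."""
    for _ in range(episodes):
        state = env.reset()            # new episode: robot pose and goal are re-sampled
        for _ in range(max_steps):
            action = agent.act(state)  # e.g. a (linear, angular) velocity command
            next_state, reward, done = env.step(action)
            agent.buffer.add(state, action, reward, next_state, done)
            agent.train()              # one gradient update from replayed transitions
            state = next_state
            if done:                   # goal reached or collision
                break
```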
## Sources

| Package | Description | Source |
|---|---|---|
| IR-SIM | Lightweight robot simulator | https://github.com/hanruihua/ir-sim |
| PythonRobotics | Python code collection of robotics algorithms (path planning) | https://github.com/AtsushiSakai/PythonRobotics |
## Models
| Model | Description | Model Source | 
|---|---|---|
| TD3 | Twin Delayed Deep Deterministic Policy Gradient model | https://github.com/reiniscimurs/DRL-Robot-Navigation-ROS2 | 
| SAC | Soft Actor-Critic model | https://github.com/denisyarats/pytorch_sac | 
| PPO | Proximal Policy Optimization model | https://github.com/nikhilbarhate99/PPO-PyTorch | 
| DDPG | Deep Deterministic Policy Gradient model | Updated from TD3 | 
| CNNTD3 | TD3 model with 1D CNN encoding of laser state | - | 
| RCPG | Recurrent Convolution Policy Gradient: CNNTD3 with added recurrence layers (LSTM/GRU/RNN) | - |
| MARL: TD3-G2ANet | G2ANet attention encoder for TD3 model in MARL setting | G2ANet adapted from https://github.com/starry-sky6688/MARL-Algorithms | 
| MARL: TD3-IGS | In-Graph Softmax attention model for TD3 model in MARL setting | - | 
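Several of the models above (TD3, CNNTD3, RCPG, and the MARL variants) build on TD3, whose core trick is clipped double-Q learning: bootstrap from the smaller of two target critics, with smoothing noise added to the target action. A minimal PyTorch sketch of that target computation, with illustrative function and argument names:

```python
import torch

@torch.no_grad()
def td3_target(reward, not_done, next_state, actor_t, critic1_t, critic2_t,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, max_action=1.0):
    # Target policy smoothing: perturb the target action with clipped noise
    next_action = actor_t(next_state)
    noise = (torch.randn_like(next_action) * noise_std).clamp(-noise_clip, noise_clip)
    next_action = (next_action + noise).clamp(-max_action, max_action)
    # Clipped double-Q: take the smaller of the two target critics' estimates
    q_next = torch.min(critic1_t(next_state, next_action),
                       critic2_t(next_state, next_action))
    return reward + not_done * gamma * q_next
```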
## Max Upper Bound Models

Models that support an additional loss term for Q-values that exceed the maximal possible Q-value in the episode. Q-values above this upper bound contribute an extra loss, which helps control the overestimation of Q-values in off-policy actor-critic networks.

To enable the max upper bound loss, set `use_max_bound=True` when initializing a model.
| Model | 
|---|
| TD3 | 
| DDPG | 
| CNNTD3 | 
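The repository's exact formulation is not spelled out here, but one natural reading: if the per-step reward is bounded by some `r_max`, the discounted return with `n` steps remaining cannot exceed the geometric sum `r_max * (1 - gamma**n) / (1 - gamma)`, so any Q estimate above that cap is provably too optimistic and can be penalized. A hedged PyTorch sketch of such a penalty (an assumption-based illustration, not the repository's code):

```python
import torch
import torch.nn.functional as F

def max_upper_bound_loss(q_values, steps_remaining, r_max, gamma=0.99, weight=1.0):
    """Penalty for Q estimates above the largest achievable discounted return."""
    # Geometric-series cap on the return: r_max * (1 - gamma**n) / (1 - gamma)
    q_max = r_max * (1.0 - gamma ** steps_remaining) / (1.0 - gamma)
    # Only the portion of the estimate exceeding the bound contributes loss
    excess = F.relu(q_values - q_max)
    return weight * excess.pow(2).mean()
```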
## MARL Models

Multi-agent RL setting in which multiple robots are trained at the same time with a single policy. The implementation does not use sensor information; navigation relies solely on graph messages exchanged between the robots, roughly as illustrated below.
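As a rough illustration of graph-message navigation, the sketch below shows a soft-attention message layer in which each robot aggregates the other robots' state embeddings, with no sensor input, in the spirit of G2ANet-style encoders. It is an assumption-laden sketch, not this repository's actual architecture:

```python
import torch
import torch.nn as nn

class SoftAttentionMessages(nn.Module):
    """Each agent attends over the other agents' state embeddings (illustrative)."""

    def __init__(self, state_dim, embed_dim=64):
        super().__init__()
        self.embed = nn.Linear(state_dim, embed_dim)
        self.query = nn.Linear(embed_dim, embed_dim)
        self.key = nn.Linear(embed_dim, embed_dim)

    def forward(self, states):                  # states: (n_agents, state_dim)
        h = torch.relu(self.embed(states))      # per-agent embeddings
        scores = self.query(h) @ self.key(h).T  # pairwise attention logits
        mask = torch.eye(states.size(0), dtype=torch.bool)
        scores = scores.masked_fill(mask, float("-inf"))  # no self-messages
        attn = torch.softmax(scores, dim=-1)    # weights over the other agents
        return attn @ h                         # aggregated message per agent
```

The aggregated message for each robot could then be combined with that robot's goal information and fed to the policy network in place of raw sensor readings.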
Video of results: