Maze problem with Reinforcement Learning

Maze Environment

The environment can be represented as:

  • States: tiles

  • Actions: Left, Right, Up, Down

  • Reward: +1 for reaching the gold state, -1 for reaching a black state, 0 for all other states.
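The state, action, and reward description above can be sketched as a small Python class. This is only an illustration under assumptions that the README does not state: the grid size, the positions of the start, gold, and black tiles, the rule that moving into a wall leaves the agent in place, and the rule that an episode ends on a gold or black tile. The names `Maze`, `ACTIONS`, `reset`, and `step` are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple

State = Tuple[int, int]  # (row, col) of a tile

# Action name -> (row delta, col delta)
ACTIONS: Dict[str, Tuple[int, int]] = {
    "Left": (0, -1),
    "Right": (0, 1),
    "Up": (-1, 0),
    "Down": (1, 0),
}


class Maze:
    """Hypothetical grid maze; every layout value here is an assumption."""

    def __init__(
        self,
        rows: int = 4,
        cols: int = 4,
        start: State = (0, 0),
        gold: State = (3, 3),
        black: FrozenSet[State] = frozenset({(1, 1), (2, 2)}),
    ) -> None:
        self.rows, self.cols = rows, cols
        self.start, self.gold, self.black = start, gold, black
        self.state = start

    def reset(self) -> State:
        """Put the agent back on the start tile."""
        self.state = self.start
        return self.state

    def step(self, action: str) -> Tuple[State, int, bool]:
        """Apply one action and return (next_state, reward, done)."""
        dr, dc = ACTIONS[action]
        # Assumption: moving off the grid leaves the agent on the edge tile.
        row = min(max(self.state[0] + dr, 0), self.rows - 1)
        col = min(max(self.state[1] + dc, 0), self.cols - 1)
        self.state = (row, col)
        if self.state == self.gold:
            return self.state, 1, True    # +1 for the gold state
        if self.state in self.black:
            return self.state, -1, True   # -1 for a black state
        return self.state, 0, False       # 0 for every other tile
```

A typical interaction would call `env = Maze()`, then `env.reset()`, then `env.step("Right")` repeatedly until the returned `done` flag is true.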

Results

After 50 episodes, the number of movements per episode converges to the optimal value, and the reward converges to 1.
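The README does not say which learning algorithm produced these curves. As one possible way to collect the same two trends (movements and reward per episode), here is a minimal sketch that assumes tabular Q-learning with an epsilon-greedy policy. The maze layout, the hyperparameters `alpha`, `gamma`, and `epsilon`, the `max_steps` cap, and the function names `step` and `train` are all assumptions made for illustration, so the numbers it yields will not necessarily match the plot.

```python
import random
from typing import Dict, List, Tuple

State = Tuple[int, int]
MOVES = {"Left": (0, -1), "Right": (0, 1), "Up": (-1, 0), "Down": (1, 0)}
SIZE, START, GOLD = 4, (0, 0), (3, 3)  # assumed layout, not from the README
BLACK = {(1, 1), (2, 2)}                # assumed black tiles


def step(state: State, action: str) -> Tuple[State, int, bool]:
    """Return (next_state, reward, done); walls keep the agent in place."""
    dr, dc = MOVES[action]
    nxt = (min(max(state[0] + dr, 0), SIZE - 1),
           min(max(state[1] + dc, 0), SIZE - 1))
    if nxt == GOLD:
        return nxt, 1, True
    if nxt in BLACK:
        return nxt, -1, True
    return nxt, 0, False


def train(episodes: int = 50, alpha: float = 0.5, gamma: float = 0.9,
          epsilon: float = 0.1, max_steps: int = 100, seed: int = 0):
    """Tabular Q-learning; returns the Q-table and per-episode trends."""
    rng = random.Random(seed)
    q: Dict[Tuple[State, str], float] = {}  # unseen pairs default to 0.0
    moves_per_episode: List[int] = []
    reward_per_episode: List[int] = []

    def greedy(state: State) -> str:
        # Pick a highest-valued action, breaking ties at random.
        top = max(q.get((state, a), 0.0) for a in MOVES)
        return rng.choice([a for a in MOVES if q.get((state, a), 0.0) == top])

    for _ in range(episodes):
        state, total, moves, done = START, 0, 0, False
        while not done and moves < max_steps:
            explore = rng.random() < epsilon
            action = rng.choice(list(MOVES)) if explore else greedy(state)
            nxt, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best next value.
            future = 0.0 if done else max(q.get((nxt, a), 0.0) for a in MOVES)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * future - old)
            state, total, moves = nxt, total + reward, moves + 1
        moves_per_episode.append(moves)
        reward_per_episode.append(total)
    return q, moves_per_episode, reward_per_episode
```

Calling `q, moves, rewards = train(episodes=50)` returns two lists that can be plotted against the episode index to produce a movement and reward trend like the one shown below.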

Movement and reward trend