I have been able to find so little code on reinforcement learning that I decided to write a code review of a simple exercise I came across whilst watching a data science video by Edureka!
The update equation for Q-learning, which is based on the Bellman equation, can be seen below:-
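In its commonly cited textbook form, the Q-learning update is written as follows. The symbols are the usual notation and are an assumption about which variant is shown here: $s$ and $a$ are the current state and action, $s'$ is the next state, $r$ is the reward received, $\alpha$ is the learning rate and $\gamma$ is the discount factor.

```latex
Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
```

With $\alpha = 1$ this reduces to the simpler form often used in introductory room-navigation examples:

```latex
Q(s, a) \leftarrow r + \gamma \max_{a'} Q(s', a')
```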
The reinforcement learning exercise discussed in this post is to move the marker to room 5, which is outside. A diagram of the floorplan can be seen below:-
Reinforcement learning is based on trial and error, so a system of rewards has been created to help the program learn to move the marker from room to room until it reaches room 5 (outside):-
Based on this system of rewards, a reward matrix must be defined that gives the reward for moving the marker from each room to every other room:-
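As a rough illustration of how such a matrix can be built, here is a minimal sketch in plain Python. The door layout in `DOORS` and the reward values (-1 for no door, 0 for an ordinary move, 100 for reaching the goal) are assumptions chosen for the example rather than values taken from the floorplan or the reward table, and `build_reward_matrix` is a hypothetical helper name.

```python
# Hypothetical door layout: each pair is a two-way door between two rooms.
# The real layout comes from the floorplan; these pairs are an assumption.
DOORS = [(0, 4), (1, 3), (1, 5), (2, 3), (3, 4), (4, 5)]
N_ROOMS = 6   # rooms 0 to 4 plus room 5 (outside)
GOAL = 5

# Assumed reward values, for illustration only.
NO_DOOR = -1       # move is not possible
PLAIN_MOVE = 0     # legal move that does not reach the goal
GOAL_REWARD = 100  # legal move that ends in the goal room


def build_reward_matrix(doors, n_rooms, goal):
    """Return R, where R[i][j] is the reward for moving from room i to room j."""
    R = [[NO_DOOR] * n_rooms for _ in range(n_rooms)]
    for a, b in doors:
        # Doors work in both directions, so fill both entries.
        for src, dst in ((a, b), (b, a)):
            R[src][dst] = GOAL_REWARD if dst == goal else PLAIN_MOVE
    # Staying in the goal once it has been reached is also rewarded.
    R[goal][goal] = GOAL_REWARD
    return R


R = build_reward_matrix(DOORS, N_ROOMS, GOAL)
for row in R:
    print(row)
```

Each row is the room the marker is currently in and each column is the room it moves to, so reading across a row shows which moves are available from that room and what each one earns.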
It is now time to code the reinforcement learning maze in Python. I created the program in Google Colab, a free online Jupyter Notebook environment hosted by Google. Google Colab is a great place to code data science projects because it is…