As part of a lecture at my university, I've developed a line-follower robot which learns to follow a line on its own. It uses Q-Learning, a type of reinforcement learning, to achieve this goal. The original robot used an Arduino, two servos and a light sensor array. I've replaced the Arduino with a Raspberry Pi with the intention of using a camera instead of the light sensor array to find the line.
But first I started small and programmed a simple simulation in Python with Tkinter to learn how Q-Learning works. The simulation consists of a square environment with 20×20 tiles. The light sensor array is represented as a red rectangle which can enclose three of those tiles. It has three actions: move forward, turn left and turn right. The environment wraps around, which means that if you leave it on the left side you re-enter it on the right side (similar to the game Snake).
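To make the wrap-around behavior concrete, here is a minimal sketch of how one movement step on such a grid could look. It is not taken from the actual simulation: the names `GRID_SIZE`, `HEADINGS` and `step`, as well as the assumption that the sensor moves one tile per step and turns in 90° increments, are my own choices for illustration.

```python
GRID_SIZE = 20  # 20x20 tiles, as in the simulation described above

# Headings as (dx, dy) unit steps, ordered clockwise: up, right, down, left.
# The 90-degree turning scheme is an assumption made for this sketch.
HEADINGS = [(0, -1), (1, 0), (0, 1), (-1, 0)]


def step(x, y, heading, action):
    """Apply one action and return the new (x, y, heading).

    Turning changes only the heading. Moving forward advances one tile
    and wraps around the grid edges using the modulo operator.
    """
    if action == "left":
        return x, y, (heading - 1) % 4
    if action == "right":
        return x, y, (heading + 1) % 4
    if action == "forward":
        dx, dy = HEADINGS[heading]
        return (x + dx) % GRID_SIZE, (y + dy) % GRID_SIZE, heading
    raise ValueError(f"unknown action: {action}")
```

For example, `step(0, 5, 3, "forward")` moves left off the edge of the grid and returns `(19, 5, 3)`, so the sensor reappears on the right side.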


Each tile can be black (1) or white (0), which results in 8 possible states for the light sensor (000, 001, 010, 011, 100, 101, 110, 111). The state 000, where the sensor is not on the line at all, gets a negative reward of -1. The state 010, where the sensor has reached the best position on the line, gets a positive reward of +1. The Q-Learning algorithm then tries to reach the best state and optimizes which action it takes in each state. At the beginning, its behavior is purely random. The result can be seen in the video below:
The next step was to bring this simulation to real life. The simulation was pretty successful, but the same approach didn't work at all with the actual robot. One major issue was that the robot left the environment very quickly, so it didn't learn anything. The rewards needed to be changed and an additional action was implemented (moving backwards). So now we've got four actions: forward, left, right and backwards. The result can be seen in the videos below:

learning process
learned
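For readers who want to see the idea in code, here is a minimal sketch of tabular Q-Learning using the state encoding and the rewards from the simulation described above. It is not the code from the repository. The values of `ALPHA`, `GAMMA` and `EPSILON`, the reward of 0 for all states other than 000 and 010, the epsilon-greedy action choice and all function names are assumptions for illustration. The changed rewards used on the real robot are not reproduced here.

```python
import random

ACTIONS = ["forward", "left", "right"]  # the three simulation actions
ALPHA = 0.5    # learning rate (assumed value)
GAMMA = 0.9    # discount factor (assumed value)
EPSILON = 0.1  # exploration probability (assumed value)


def encode_state(tiles):
    """Turn three sensor readings, e.g. (0, 1, 0), into an index 0..7."""
    left, middle, right = tiles
    return left * 4 + middle * 2 + right


def reward(state):
    """+1 for 010 and -1 for 000 as in the text; 0 otherwise (an assumption)."""
    if state == 0b010:
        return 1.0
    if state == 0b000:
        return -1.0
    return 0.0


def choose_action(q_table, state, rng=random):
    """Epsilon-greedy: usually pick the best known action, sometimes explore."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    row = q_table[state]
    return row.index(max(row))


def update(q_table, state, action, next_state):
    """One Q-Learning step: Q(s,a) += alpha * (r + gamma * max Q(s',.) - Q(s,a))."""
    target = reward(next_state) + GAMMA * max(q_table[next_state])
    q_table[state][action] += ALPHA * (target - q_table[state][action])


# One row per sensor state (8), one column per action, all starting at zero.
q_table = [[0.0] * len(ACTIONS) for _ in range(8)]
```

In a training loop you would read the sensor, call `choose_action`, perform the action, read the sensor again and then call `update` with the old and new state. Supporting the fourth action of the real robot would only require adding `"backward"` to `ACTIONS` in this sketch.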
An interesting and funny fact: the robot had never been tested on the big course before. The lines there are thicker than on my small test course. I doubted that it would work, but reality proved me wrong. It worked flawlessly and was proof that the Q-Learning algorithm was working. This was also when the professors evaluated my work.
You can find the code on my GitHub page. Feel free to try it out: https://github.com/denczo/RaspberryBot