After successfully testing extremely basic character recognition 6 months ago, I was introduced to the MNIST handwriting dataset generously provided to ML researchers by Yann LeCun, Corinna Cortes, and Christopher J.C. Burges.
The initial result from testing AIRIS on 1,000 characters from this dataset with no prior training was a meager ~51% accuracy. I quickly realized that I would need to complete several more development milestones to achieve better results. So I buried my head in the code, and these are the resulting improvements:
Previous versions of AIRIS associated the causal relationships between states with all of the pixels in their inputs. In the case of MNIST, that resulted in a poor ability to predict an image's label unless the image was very similar to one it had seen before. In the case of the Puzzle Game Agency Test, knowledge about object interactions such as keys and doors would only partially persist from one level into the next, and only if the next level was very similar.
However, AIRIS can now narrow the causal relationships it deduces between states down to just the relevant pixels. This conceptual understanding allows it to broadly apply its learned knowledge to novel situations.
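The idea can be sketched as follows. This is a hypothetical illustration, not AIRIS's actual representation: a learned rule stores only the pixels an action changed (and their prior values), rather than the entire input grid.

```python
def extract_rule(before, after, action):
    """Return a minimal rule: the action plus only the pixels it affected."""
    changed = {
        pos: (before[pos], after[pos])
        for pos in before
        if before[pos] != after[pos]
    }
    # Condition: the prior values of the changed pixels.
    condition = {pos: old for pos, (old, _new) in changed.items()}
    # Effect: their new values.
    effect = {pos: new for pos, (_old, new) in changed.items()}
    return {"action": action, "condition": condition, "effect": effect}

# Example: the "robot" at (2, 3) moves right into an empty cell at (2, 4).
before = {(2, 3): "robot", (2, 4): "empty", (0, 0): "wall"}
after  = {(2, 3): "empty", (2, 4): "robot", (0, 0): "wall"}

rule = extract_rule(before, after, "move_right")
# The wall at (0, 0) did not change and is excluded from the rule,
# so the rule transfers to any level where the local pattern matches.
```

Because the rule's condition no longer mentions the irrelevant pixels, it applies in any state where the local pattern matches, which is what lets knowledge carry over to levels the agent has never seen.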
Effect on MNIST Test
An MNIST test of just 100 characters with no prior training shows a slightly increased accuracy of 55%. Since it has no prior training, it also has to learn the labels as it goes. For example, the first image it is shown is a 7. It has no guess because it has no data at all yet. The second image is a 2. It only knows of 7, so that's what it guesses. The third image is a 1. Since it only knows of 7 and 2, it determines that the 1 is closest to the 7.
Below are all 100 of the images in the order they were shown to AIRIS. The number on the right is best described as its "doubt": the smaller the number, the higher its confidence in its answer.
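The learn-as-you-go behavior above can be sketched as incremental nearest-match labeling with a doubt score. The distance metric and doubt formula here are illustrative assumptions, not AIRIS's actual internals.

```python
def pixel_distance(a, b):
    """Count of mismatched pixels between two equal-length flat images."""
    return sum(1 for x, y in zip(a, b) if x != y)

def classify(image, memory):
    """Guess the label of the closest stored image; lower doubt = more confident."""
    if not memory:
        return None, None          # first image ever: no guess possible
    best_label, best_dist = min(
        ((label, pixel_distance(image, stored)) for stored, label in memory),
        key=lambda t: t[1],
    )
    doubt = best_dist / len(image)  # fraction of pixels that disagree
    return best_label, doubt

memory = []
seven = (0, 1, 1, 0, 0, 1)   # toy 6-pixel "images"
two   = (1, 1, 0, 1, 1, 0)
one   = (0, 1, 0, 0, 0, 1)

guess, doubt = classify(seven, memory)   # no data yet -> no guess
memory.append((seven, 7))
guess, doubt = classify(two, memory)     # only knows of 7, so guesses 7
memory.append((two, 2))
guess, doubt = classify(one, memory)     # the 1 is closer to the 7 than to the 2
memory.append((one, 1))
```

With only one pixel differing between the toy 1 and 7, the guess is 7 with low doubt, mirroring the third step described above.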
After 1,000 characters its accuracy improves to 69.9%. After 2,000 characters its accuracy is 76.5%. I have not yet tested it beyond 2,000 characters because it took a little over 8 days non-stop to get that far with the current language and serial processing hardware I’m using. I’m looking forward to getting a final accuracy on all 70,000 MNIST images once AIRIS is ported to a faster language with parallel processing.
Effect on Agency Test
The most dramatic improvement is with the Puzzle Game Agency Test. Now, instead of having to play through or be shown the solution to each level, AIRIS can be shown simple human-controlled "tutorials" such as this:
And just by observing the pixels and the commands input by the human, it can deduce the interactions between the objects (pixels) and apply that knowledge to achieve its goal of collecting batteries in levels it has never seen before. In the example below, it was shown 689 tutorial frames (such as the 33 frames shown above). From that small amount of data it was able to deduce how the state changes with each command, and then apply that knowledge to model the sequences of commands needed to solve the levels.
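The observe-then-plan loop above can be sketched in two steps: learn a transition model from observed (state, command, next state) tutorial frames, then search that model for a command sequence reaching the goal. The whole-state encoding and breadth-first search here are simplifying assumptions; AIRIS's real model operates on pixel-level rules.

```python
from collections import deque

def learn_model(frames):
    """frames: iterable of (state, command, next_state) tuples from a tutorial."""
    model = {}
    for state, command, next_state in frames:
        model[(state, command)] = next_state
    return model

def plan(model, start, goal, commands):
    """Breadth-first search over the learned model for a command sequence."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for cmd in commands:
            nxt = model.get((state, cmd))
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [cmd]))
    return None  # no known sequence of commands reaches the goal

# Toy tutorial: three observed transitions between abstract states.
tutorial = [("A", "right", "B"), ("B", "right", "C"), ("B", "down", "D")]
model = learn_model(tutorial)
sequence = plan(model, "A", "D", ["right", "down"])  # -> ["right", "down"]
```

The key point is that the agent never needs to be shown a solution to the target level; it only needs enough observed transitions to model how each command changes the state.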
AIRIS can also more effectively reason through situations it has not been trained on. In the example below, it has been taught all of the object interactions except for fire on the other side of a door. Standing in a doorway produces a different pixel value than standing in an empty space (represented by a blue square inside the brown square). So while it knew that the "robot" pixel could put out a fire by moving onto the fire from an empty space after collecting a fire extinguisher, it did not know that the "robot in a doorway" pixel could do the same.
It quickly learns that moving onto the fire from a doorway resets the level, but it isn't certain whether the fire is the cause or whether that happens every time it moves right from inside a doorway. That is why, at the pause point in the middle of the example, the AI's model of the world on the right has far more colored pixels than it should: it thought that moving right against the wall would reset the level just as moving against the fire did. It immediately saw that wasn't the case and deduced that a wall simply makes it stay put. Eventually, through experimentation, it learns that it needs a fire extinguisher to put out the fire on the other side of a door, just as it does when standing in an empty space.
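This over-generalize-then-prune behavior can be sketched as hypothesis elimination. The rule format below is an assumption made for illustration only.

```python
# After the level resets once, the agent's candidate causes include both
# the fire and the doorway move itself (the over-generalization).
hypotheses = {
    ("move_right_from_doorway", "onto_fire"),
    ("move_right_from_doorway", "onto_wall"),   # over-generalized cause
}

def observe(hypotheses, action_context, outcome):
    """Drop any hypothesis whose predicted level reset did not occur."""
    if outcome != "reset":
        hypotheses = {h for h in hypotheses if h != action_context}
    return hypotheses

# The agent tries moving right into a wall from a doorway and merely
# stays put, so that hypothesis is discarded; only the fire remains.
hypotheses = observe(
    hypotheses, ("move_right_from_doorway", "onto_wall"), "stay_put"
)
```

Each failed prediction removes a spurious cause, which is why the cloud of extra colored pixels in the agent's world model shrinks as it experiments.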
Effect on Contextual Memory
The efficiency of AIRIS's Contextual Memory has also improved. By narrowing each state change down to only the relevant pixels, significantly less data is stored and accessed for prediction modeling.
All of the existing features of the Contextual Memory also remain intact, such as the ability to seamlessly merge two uniquely trained agents into a third agent that has the knowledge of both. The agent in the example below was created by merging one agent trained only on one-way arrows with another trained only on doors and fire. Independently, neither original agent could have completed the level without experimenting to learn about the elements it wasn't trained on. The combined agent has no difficulty whatsoever.
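Merging can be sketched as a union of rule sets, assuming (purely for illustration) that each agent's Contextual Memory is a mapping from (condition, action) to effect. How AIRIS actually resolves overlaps is not described here; this sketch simply requires shared rules to agree.

```python
def merge_agents(memory_a, memory_b):
    """Combine two rule sets; shared keys must agree (deterministic world)."""
    merged = dict(memory_a)
    for key, effect in memory_b.items():
        if key in merged and merged[key] != effect:
            raise ValueError(f"conflicting rules for {key}")
        merged[key] = effect
    return merged

# Hypothetical rule sets for the two independently trained agents.
arrows_agent = {(("arrow_right",), "step"): "pushed_right"}
doors_agent  = {(("door", "key"), "step"): "door_opens",
                (("fire", "extinguisher"), "step"): "fire_out"}

combined = merge_agents(arrows_agent, doors_agent)
# The merged agent knows all three interactions without any retraining.
```

Because the rules are keyed only on the relevant pixels, rules from different training histories rarely collide, which is what makes the merge seamless.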
The Contextual Memory also allows for multi-domain knowledge to co-exist without interference. For example, all of the memories from the MNIST test and all of the memories from the Agency test can be contained within a single agent. The agent would be capable of both playing the puzzle game and recognizing handwritten numbers with no degradation to either ability.
There is still a lot of work to be done! There are two new domain types that I will be testing AIRIS in next.
The first is an information-incomplete version of the puzzle game, where only a small portion of the game world is visible at a time. AIRIS will have to remember and model parts of the level that it can't currently see in order to achieve its goals. Edit: See https://airis-ai.com/2017/10/17/modeling-information-incomplete-worlds/
The second is the classic board game: Checkers.