I happened to come across the lecture videos for Stanford's CS 229 (Machine Learning), taught by Andrew Ng, on YouTube. Here's the link to the first video: https://www.youtube.com/watch?v=UzxYlbK2c7E. Here is the course webpage with notes: http://cs229.stanford.edu. (There's also a Coursera version of the class, but it looks different and seems to be pitched at a lower level of difficulty. I'm hoping that my background is enough to figure this out just from watching the lectures.)
This is going to be a far more technical skill to learn than making colored dots on a screen, so it will be a bit more of a journey. But I think the end result will be something pretty great if I can get there. I'm going to try to get a computer to learn Tic-Tac-Toe. But I don't want to program a strategy for it to follow. I want it to use data from randomly generated games and see whether it can learn to play well from that information. I need to take the time to think through the details carefully, as watching the videos at 2x speed (as I've been doing) is great for getting general ideas but not so good for setting things up. I would like to do this with both supervised and unsupervised learning, and I'll need to program my own Tic-Tac-Toe game to generate the data for the computer to learn from.
The long-term goal is to apply this type of machine learning to other games and maybe even something like horse racing. But that's pretty far off. Right now, I just want to think about Tic-Tac-Toe.
What's the plan? I don't have one. Yet. But I have the components of one.
- Build a Tic-Tac-Toe game that can play itself by making random moves and can log the games that it plays. How exactly this will look is an open question. It will probably need to be tinkered with several times, depending on the model that's being created and the way the log needs to be formatted to work with the learning algorithm.
- Create a supervised learning model. I barely know what that means, but I think in the space of all possible games (at most 9! = 362,880 move orderings, and fewer in practice because a game stops as soon as someone wins), there are winners and losers (say, if you are the first player). I can imagine the games being embedded into a 9-dimensional space, where we're trying to estimate when we'll have a winner and when we'll have a loser, and we can use that estimate to make future moves. I think that the section on generative learning algorithms is what I should be looking at, but it's entirely possible that I have no idea what I'm talking about. I may have to backtrack and do some of the problem sets to understand what I'm saying.
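To make the first component a little more concrete, here is a minimal sketch in Python of a game that plays itself with random moves and logs each game. Everything in it is an assumption made for illustration, not a settled design: the names `play_random_game`, `generate_log`, and `encode_final_board`, the log format (a list of cell indices plus a result), and the +1 / -1 / 0 encoding, which is just one possible reading of the "9-dimensional space" idea.

```python
import random

# All eight winning lines, as index triples into a flat 9-cell board
# (cells are numbered 0..8, left to right, top to bottom).
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None


def play_random_game(rng):
    """Play one game where both sides pick uniformly random empty cells.

    Returns (moves, result). `moves` lists the cell indices in the order
    they were played (X always moves first); `result` is 'X', 'O', or 'draw'.
    This log format is an assumption and will likely need to change.
    """
    board = [None] * 9
    moves = []
    player = 'X'
    while True:
        empty = [i for i, cell in enumerate(board) if cell is None]
        if not empty:
            return moves, 'draw'
        move = rng.choice(empty)
        board[move] = player
        moves.append(move)
        if winner(board) is not None:
            return moves, player
        player = 'O' if player == 'X' else 'X'


def generate_log(n_games, seed=0):
    """Generate a reproducible list of randomly played games."""
    rng = random.Random(seed)
    return [play_random_game(rng) for _ in range(n_games)]


def encode_final_board(moves):
    """Map a finished game to a 9-dimensional vector (hypothetical encoding).

    +1 marks a cell taken by X, -1 a cell taken by O, 0 an empty cell.
    """
    vec = [0] * 9
    for turn, cell in enumerate(moves):
        vec[cell] = 1 if turn % 2 == 0 else -1
    return vec


if __name__ == "__main__":
    for moves, result in generate_log(3, seed=42):
        print(moves, result, encode_final_board(moves))
```

Logging the move order rather than only the final board keeps more information around, so the format can be reshaped later for whichever learning algorithm ends up being used.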