Tic Tac Toe - Creating Unbeatable AI with Minimax Algorithm
By Greg Surma
12 - 15 minutes
In today’s article, I am going to show you how to create an unbeatable AI agent that plays the classic Tic Tac Toe game. You will learn the concept of the Minimax algorithm that is widely and successfully used across the fields like Artificial Intelligence, Economics, Game Theory, Statistics or even Philosophy.
Before we go into the AI part, let’s make sure that we understand the game.
Tic-tac-toe (also known as noughts and crosses or Xs and Os) is a paper-and-pencil game for two players, X and O, who take turns marking the spaces in a 3×3 grid. The player who succeeds in placing three of their marks in a horizontal, vertical, or diagonal row wins the game.
I recommend you to play the game yourself, feel free to check out my iOS Tic Tac Toe app.
In order to solve Tic Tac Toe, we need to go deeper than just to think about it as a game where two players place X’s and O’s on the board. Formally speaking, Tic Tac Toe is a zero-sum and perfect information game. It means that each participant’s gain is equal to the other participants’ losses and we know everything about the current game state.
In a two-player (A vs B) game, if player A scores x points (utility units), player B loses x points. Total gains/losses always sum up to 0.
With that in mind, let’s proceed to the Minimax algorithm that’s suited for such cases.
Minimax is a recursive algorithm which is used to choose an optimal move for a player assuming that the opponent is also playing optimally. As its name suggests, its goal is to minimize the maximum loss (minimize the worst case scenario).
You can think of the algorithm as a representation of the human thought process of saying, “OK, if I make this move, then my opponent can only make two moves, and each of those would let me win. So this is the right move to make.”
Let’s take a look at the code!
In order to calculate the minimax score, let’s feed the minimax function with the current board state ([Player]) and an opponent player (Player)
Then let’s check if the board already has a winner. If it’s the player that was passed into the function, we are returning 1, otherwise -1.
Afterward, let’s try every possible move and recursively calculate a minimax score for it.
If the minimax function does not find a terminal state, it keeps recursively going level by level deeper into the game. This recursion happens until it reaches a terminal state and returns a score one level up.
If the score was updated we are returning it as a minimax score; otherwise, it’s a draw so we are returning 0.
I know. It may sound intimidating and not intuitive at first. Mostly because of its recursive nature but once you see it in action, you will easily understand it.
Without further ado, let’s go to the example to get a better understanding of what’s going on.
Let’s look at the following board state. We will try to solve it as X.
How would you solve it?
I guess that would you think that the best move in such a scenario would be to place an X in the middle of the board and win the game.
And you would be correct, but is this the only winning solution for X? How to derive this solution?
In order to determine this, let’s draw a tree of all possible board states.
As you can see above, starting from the initial state 0.0, we have 3 possibilities (1.0, 1.1, 1.2).
1.0 gives us a win (+1), but let’s explore other paths as well.
1.1 gives our opponent 2 possibilities (2.0, 2.1). 2.0 is a winning state for our opponent, so it’s a losing state for us (-1). 2.1 gives only one possibility 3.0 in which we are winning (+1).
1.2 gives our opponent 2 possibilities (2.2, 2.3). 2.2 is a winning state for our opponent, so it’s a losing state for us (-1). 2.3 gives only one possibility 3.1 in which we are winning (+1).
Okay, but how can we interpret this?
Let’s start from the terminal states at the bottom and calculate the minimax scores.
At the depth 3 we are maximizing, so we are propagating +1 scores to the previous moves at the depth 2.
At the depth 2 we are minimizing so we are propagating -1 scores to the previous moves at the depth 1.
At the depth 1 we are maximizing so we are propagating +1 to the previous move at the depth 0.
Ultimately at the depth 0, where we actually are, we should pick the move associated with the +1 score we’ve ended up with.
Tic Tac Toe AI would decide to go to the 1.0 node and win the game.
Our Tic Tac Toe AI performs such simulations for every move thus making itself an unbeatable opponent.
But what makes it unbeatable?
Due to the relatively smallstate space (9!=362 880 possible board combinations), it can easily search the whole game tree for an optimal solution, treating the game as a fully deterministic environment.
On the other hand, chess for example has an enormously large state space of ~10¹²⁰ possibilities.
With such extensive search spaces, we can still use Minimax algorithm, but we have to remember to limit the depth of our search. Otherwise, we can end up computing results for a very long time or even forever.
Long story short, the smaller the state space, the better results we can achieve with the Minimax algorithm.
tl;dr Tic Tac Toe AI is unbeatable. You can draw at most and only with a perfect game. If you think that you can outsmart it and win the game, let me know how to do it in the comments section🙂.
If you are curious about other Tic Tac Toe based games, don’t hesitate to check out Achi! It’s a Tic Tac Toe extension that allows moving the tiles. This subtle addition significantly complicates the game and makes it much more challenging!
By now you should be able to understand the principles behind the Minimax algorithm that allowed us to create an unbeatable Tic Tac Toe AI agent. Minimax is a very powerful and universal algorithm that can be applied in a wide variety of applications. Don’t hesitate to apply it to other decision problems. I am looking forward to seeing your results!
Questions? Comments? Feel free to leave your feedback in the comments section or contact me directly at https://gsurma.github.io.