Implementing Minimax in Scala: Alpha-Beta Pruning

Last time, we built a naive minimax algorithm in Scala. While it was effective, it was unfortunately very slow. To increase the performance of the algorithm, we’ll implement a common variant of minimax with what is called “alpha-beta pruning”. Again, if you want to see the full code right away, head over to GitHub.

Recall that the minimax algorithm works by playing out all possible games given a starting game state. During the course of the analysis, paths will undoubtedly be encountered that cannot produce a better outcome than one we have already found. Alpha-beta pruning detects those paths and stops searching any deeper down them. This is accomplished by checking, on each iteration, whether the score returned is already worse than the best score encountered so far.

  private def minimax(state:GameState):Int = {
    // Kick off the search with the widest possible (alpha, beta) window.
    minimize(state, maxDepth, Integer.MIN_VALUE, Integer.MAX_VALUE)
  }

  private def minimize(state:GameState, depth:Int, alpha:Int, beta:Int):Int = {
    if(state.isTerminal || depth == 0) return state.hVal
    var newBeta = beta
    state.availableMoves.foreach(move => {
      val newState = state.makeMove(move)
      // Keep the smallest score seen so far at this node.
      newBeta = math.min(newBeta, maximize(newState, depth - 1, alpha, newBeta))
      // The maximizing player already has a better option elsewhere; prune.
      if (alpha >= newBeta) return alpha
    })
    newBeta
  }

You should see a lot of similarity here. The big difference is that instead of just accepting the result of the maximize function, β is now the smaller of the previous β value and the result of maximize. Next comes the critical check: if α ≥ β, we immediately exit the function and return α, since the maximizing player already has a better option elsewhere and will never allow play to reach this position.

The maximize function is of course quite similar, except this time we return β if no better scores are found.

  private def maximize(state:GameState, depth:Int, alpha:Int, beta:Int):Int = {
    if(state.isTerminal || depth == 0) return state.hVal
    var newAlpha = alpha
    state.availableMoves.foreach(move => {
      val newState = state.makeMove(move)
      // Keep the largest score seen so far at this node.
      newAlpha = math.max(newAlpha, minimize(newState, depth - 1, newAlpha, beta))
      // The minimizing player already has a better option elsewhere; prune.
      if (newAlpha >= beta) return beta
    })
    newAlpha
  }

When I ran this code on the same game as the naive algorithm, it finished roughly an order of magnitude faster.
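
If you want to reproduce that comparison in your own game, a small timing helper is enough. This is only a sketch: the time helper, naiveSolver, alphaBetaSolver, and startingState names are placeholders I made up, not part of the posted code.

  def time[A](label: String)(block: => A): A = {
    val start = System.nanoTime()
    val result = block
    // Report elapsed wall-clock time in milliseconds.
    println(s"$label took ${(System.nanoTime() - start) / 1000000} ms")
    result
  }

  // Example usage, assuming `naiveSolver`, `alphaBetaSolver`, and
  // `startingState` exist in your game:
  // time("naive minimax") { naiveSolver.play(startingState) }
  // time("alpha-beta")    { alphaBetaSolver.play(startingState) }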

Implementing Minimax in Scala: Naive Minimax

Minimax is an algorithm commonly used by AIs in two-player games to determine the best possible move given the current state of the game. It gets its name from the intuitive concept that a player wants to minimize his opponent’s score while maximizing his own. The core of the algorithm essentially plays out all of the possible games until the game is over. Then the play that resulted in the best end game (minimizing the opponent’s score, maximizing the player’s score) is chosen. If you want to see the full code right away, head over to GitHub. Otherwise, read on and let’s see how this is implemented.

  def play(state:GameState):(Int, Int) = {
    val moveScores = {
      for {
        move <- state.availableMoves
      } yield move -> minimax(state.makeMove(move))
    }
    val bestMove = moveScores.minBy(_._2)._1
    bestMove
  }

We start with a GameState, which I have intentionally left out of this post. You are free to implement GameState however you see fit (although stay tuned for an actual game that uses this algorithm). We then iterate over all of the available moves in the game state and yield a collection of move -> score pairs. The best move is the one that minimizes the opponent’s score, so given the list of move -> score pairs, we find the pair with the smallest score and return its move. Scala’s syntax for this last bit is pretty terse: moveScores.minBy(_._2)._1. The underscore confused me at first; it is Scala’s placeholder for an anonymous function parameter, and you could just as well write the function with an explicitly named parameter such as “pair” or “moveToScoreTuple”. The underscore simply signals that we never need to refer to the pair by name.
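
To make that shorthand concrete, here are three equivalent ways to write the same expression; the explicit names pair and score are purely illustrative.

  // Placeholder syntax, exactly as used above.
  val best1 = moveScores.minBy(_._2)._1
  // The same anonymous function with an explicit parameter name.
  val best2 = moveScores.minBy(pair => pair._2)._1
  // Pattern matching on the tuple, naming only the score we compare by.
  val best3 = moveScores.minBy { case (_, score) => score }._1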

If you recall, I said earlier that minimax plays out all possible games until the end. In practice, this would take far too long. Consider a game like Chess, in which there are approximately 10^120 possible games (more than the number of atoms in the universe). Playing out all of the games would be impractical. So we usually implement a “depth” concept in minimax, which tells the algorithm how many moves deep it is allowed to search. When the depth is reached, the algorithm returns the score of that game state (even though the game is not necessarily over). We kick off the search using the chosen depth:

  val maxDepth = 3

  private def minimax(state:GameState):Int = {
    minimize(state, maxDepth)
  }

  private def minimize(state:GameState, depth:Int):Int = {
    // The game is over or we've searched deep enough; score this state.
    if(state.isTerminal || depth == 0) return state.hVal
    var bestResult = Integer.MAX_VALUE
    state.availableMoves.foreach(move => {
      val newState = state.makeMove(move)
      // The opponent moves next and will try to maximize his own score.
      val bestChildResult = maximize(newState, depth - 1)
      bestResult = math.min(bestResult, bestChildResult)
    })
    bestResult
  }

Note that the minimax function returns an Int. This is the score of the game state in which the algorithm terminated. To start the process we jump down into the minimize function.

Immediately we want to see if the game is over (state.isTerminal) or if the algorithm has reached the maximum number of moves (depth == 0). If either of those conditions is true, we return the “heuristic value” (hVal) of that state, which is the score. If neither is true, we continue as normal.
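
Because GameState is intentionally left out of this post, here is a minimal sketch of the interface the code above relies on. The member names come from the code itself, but the Seq return type and the (Int, Int) move type (inferred from play’s signature) are my assumptions, not the actual implementation.

  trait GameState {
    def isTerminal: Boolean                     // true when the game is over
    def hVal: Int                               // heuristic score of this state
    def availableMoves: Seq[(Int, Int)]         // legal moves from this state
    def makeMove(move: (Int, Int)): GameState   // state resulting from a move
  }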

Remember that at this point we have already made one move (in the play() function), so we’re now evaluating the plays the opponent will make. Since we are trying to minimize the opponent’s score, the worst outcome for us is one in which the opponent scores +∞, which we represent by Integer.MAX_VALUE. We want to see if there is any move for which our opponent will score lower than +∞, so we loop over the available moves our opponent could make in this game state. Of course, our opponent is also trying to win, so the best play for the opponent is the one that maximizes his score! More on this in a moment, but that is the reason we run val bestChildResult = maximize(newState, depth - 1). Finally, we update bestResult to be the smaller of the current bestResult and bestChildResult. After checking all possibilities, we have minimized the score and return bestResult.

Now we need to see the other half of the algorithm.

  private def maximize(state:GameState, depth:Int):Int = {
    if(state.isTerminal || depth == 0) return state.hVal
    var bestResult = Integer.MIN_VALUE
    state.availableMoves.foreach(move => {
      val newState = state.makeMove(move)
      // The opponent moves next and will try to minimize our score.
      val bestChildResult = minimize(newState, depth - 1)
      bestResult = math.max(bestResult, bestChildResult)
    })
    bestResult
  }

No surprise that it looks very similar! All we’re doing is flipping a few things around. This time, the worst outcome is the one in which we score -∞. Additionally, we keep the larger of bestResult and bestChildResult because we are trying to maximize our own score.

Looking at both sides of the algorithm, we can now see the flip-flop that occurs. The minimize function calls maximize and vice versa. This flipping back and forth simulates the two players taking alternating turns. Each player wants to make the move that is best for him and worst for his opponent. To keep the running time manageable, we limit how deep the search goes, which is why each successive call is made with depth - 1.

Unfortunately, this is still pretty slow. In the next post we’ll see how we can save the algorithm some time by not evaluating games that have already been lost.

Raspberry Pi Minimally Required Boot Files

I got a Raspberry Pi for Christmas, so naturally I wanted to get started hacking on it right away. But the tutorials I found online assume you’re starting with an SD card that has something like NOOBS or Raspbian installed. I was going through the Baking Pi tutorials, which walk you through writing your own kernel, so I had no use for an existing one.

I could not easily find just the Raspberry Pi boot files. So to save others some time, I created a GitHub repo for just that purpose. Instructions and files are available at Raspberry Pi Minimal Boot.

Just a quick post for now, but expect to see more about Raspberry Pi in the coming weeks.

NaNoGenMo 2014 Dev Diary #3: Results

NaNoGenMo is an idea created by Darius Kazemi to parody NaNoWriMo. Instead of writing a novel, developers write programs that generate 50k+ word “novels”. This series of posts will document my participation throughout the month.

With the scaffolding in place that I described in my last post, I was able to pretty easily swap the sample words out for real words. The most significant change I made was implementing a simple cache: there’s no reason to search the graph and calculate the probabilities for the same word more than once.

As I expected, this method resulted in mostly gibberish. The reason is that using only word concordances guarantees grammatical correctness only between each pair of adjacent words; there is no guarantee that the sentence as a whole is grammatically correct. Nonetheless, I did end up with some humorous text. Here are just a few of my favorites.

GEORGE: OH THAT’S NEWMAN ALL RIGHT. THE CORNER.
ELAINE: I’M VERY VERY AWKWARD.

ELAINE: SO HE HAD ORGASMS?
GEORGE: WELL MAYBE. HOW? MOVE EDWARD SCISSORHANDS. I’M SPECIAL. NO PROBLEM FOR BREAKFAST. SO LENNY BRUCE USED.

JERRY: FIRST DATE WAS IT WAS.
GEORGE: YOU TELL ME? SHE SAID THANK YOU SLEEPY.

KRAMER: MINIPLEX MULTITHEATER.
JERRY: GIDDYUP.

GEORGE: OH THAT SOUP NAZI.

If I redid this, I would want to find a cleaner data source. My source was filled with typos, inconsistencies, and non-standard punctuation, which led to some difficulties seeding the database. Also, I think I would count “…” as its own distinct sentence terminator. Finally, a similar approach that would produce less gibberish would be to use Markov chains instead of simple concordances.

All of my code and the final “Missing Season” are available on GitHub.

NaNoGenMo 2014 Dev Diary #2: Setting up the template

NaNoGenMo is an idea created by Darius Kazemi to parody NaNoWriMo. Instead of writing a novel, developers write programs that generate 50k+ word “novels”. This series of posts will document my participation throughout the month.

Having loaded sample data into Neo4j, I was able to move on to loading the full Seinfeld scripts. It was a bit slow, about an hour, although that was with minimal optimization on my part. When finished, the graph contained over 16,000 nodes and almost 500,000 relationships.

Although I know my final product will be nonsense, I want it to have the feel of an actual transcript. To do this, I generated some statistics about the number of lines per script, words per sentence, etc. I discovered a handy function buried in scipy that accepts a set of values with their probabilities and returns random numbers distributed accordingly. For example, here is some code that returns a number of sentences for a given line:

from scipy import stats

line_lengths = {1: 0.37012475970532455, 2: 0.13680991884813376, 3: 0.05066508405475493, 4: 0.019468706859848535, 5: 0.00896670064121586, 6: 0.004683327097498355, 7: 0.002425524777767743, 8: 0.001535305577416816, 9: 0.0010579416583880582, 10: 0.0003741500986982157, 11: 0.0005289708291940291, 12: 0.0003096414609916268, 13: 0.0001419190029544956, 14: 7.74103652479067e-05, 15: 0.00012901727541317782, 16: 6.450863770658891e-05, 17: 5.160691016527113e-05, 18: 0.00010321382033054225, 19: 5.160691016527113e-05, 20: 3.870518262395335e-05, 21: 1.2901727541317782e-05, 22: 3.870518262395335e-05, 23: 2.5803455082635563e-05, 24: 1.2901727541317782e-05, 26: 1.2901727541317782e-05, 34: 1.2901727541317782e-05, 37: 1.2901727541317782e-05}

probability = stats.rv_discrete(a=1,
    values = (list(line_lengths.keys()), list(line_lengths.values())))

line_length = probability.rvs(size=1)[0] 

Applying these probabilities with a nonsense sample sentence, I was able to achieve something like this:

SEASON UNKNOWN: EPISODE 1 — THE PLACEHOLDER
=============================

KRAMER: The goats. The goats? The goats are going!
GEORGE: The goats are going! The goats are going? The.
ELAINE: The goats?
GEORGE: The.
JERRY: The?
GEORGE: The goats are?

The next step will be getting words out of the Neo4j graph instead of the sample sentence.