The Chess SuperTree

Part Of: Chess sequence
Followup To: Decision Trees In Chess
Content Summary: 1500 words, 15 min read

Chess Is A Nested Algorithm

In An Introduction To Chess, I introduced you to the basics of chess analysis: how to tell when one position is better than another. Specifically, we looked at material, pawn structure, space, king safety, piece development, piece cohesion, and initiative. These ways of thinking about different chess positions were combined into a score. The first aspect of chess thought lies in the maturation of such an scoring function.

In Decision Trees In Chess, I introduced you to the basics of chess decision making: how to evaluate different alternatives when you are not sure what to do. Specifically, we looked at how imagining different scenarios can be represented in decision trees, how these trees ultimately use scoring functions in their leaf nodes. These decision trees produce decisions after application of the minimax algorithm.

Together, scoring functions and decision trees describe the entirety of what it is to do chess. Had this series been written by world champion Magnus Carlsen, I submit that even he could communicate effectively using this two-part framework. The only things that need be different would be his analysis (his scoring function would be more accurate) and his choice of alternatives to analyze (his considered choices would contain fewer bad alternatives). In other words, skill in chess is not doing more than score functions and decision trees. It is doing them well.

Two Kinds Of Score

Consider the following. White’s turn, can you find mate in two moves?

Chess Computation- Example Mate In Two (1)

The solution to this puzzle, displayed on the rightmost decision tree, is Qxb8+. If Black recaptures, White checkmates with Bb5# (note that the Bg5 prevents Black’s king from escaping (and if Black runs away Kd7, he is met with the same fate).

Now, imagine if White had not found this clever solution immediately. Suppose White instead tries to attack Black with Bb5+. White’s reflections could lead him to consider the other decision tree, which ultimately values Bb5 as -3.2 points (do you see why the minimax algorithm requires this to be the case?)

Consider the returned values (1-0 WIN) for Qxb8+, and (-3.2) for Bb5. This incongruity is itchy. Why would the scoring function return different types of answers depending on the position?

Meet The SuperTree

Imagine for a moment the decision tree for all chess games. Call this object the SuperTree. What does it look like?

Consider the following facts:

  1. On move zero, all chess games ever begin in the same position.
  2. After white’s first move, all chess games ever are in one of twenty possible positions (White has 20 legal moves)
  3. After Black’s first move, all chess games ever are in one of four hundred possible positions (20*20)
  4. All chess games ever conclude with a result (White wins, Black wins, or tie). Some games finish quickly, others take many moves.

These facts can be consolidated into the following visualization:

Chess Computation- Birds Eye View (1)

Consider the top of the SuperTree: as we move our gaze downwards, its width increases. This corresponds to our observation above that, as more moves are made, the number of possible positions rapidly increases.

Why then does the number of possible positions decrease as we move into the endgame? Well, you have surely noticed how endgames tend to feature fewer pieces. Consider all chess games that have transitioned into King+Queen vs. King. There are only a few thousand ways to place these three pieces on the board (64*63*62 = 249,984 is an upper bound). Fewer pieces means fewer states (which is why the Supertree is shown as a diamond, not of a triangle).

At the end of every game, at the end of a game tree, are result nodes. This are represented by the dark orange boundary.

We have so far analyzed two game positions: my game from the Decision Trees In Chess post, and the Mate-In-Two position above. Since the above cartoon claims to comprise all chess games that can ever be played, let’s see how these games fit into the SuperTree.

Chess Computation- Games Within BEV

On the left, we see that the Mate-In-Two game line from this post is shorter: this reflects that it (presumably) terminated in the early middlegame, whereas my opponent in the last post survived until the 53rd move.

On the right, we see the decision trees we constructed at moments in each of these games. Pay attention to the coloring of these trees: in the earlier analysis, our endpoints we numerical scores (green) which propagate up to the top of the triangle. In today’s analysis, our endpoints contain a mixture of numerical scores (green) and final results (orange). In this case, by minimax logic, the orange style of result successfully propagated to the top of the decision tree.

Can Chess Be Solved?

Final results, like numerical scores, propagate up a decision tree. If we start at the end of chess games, and work our way backwords, could we color the entire cartoon orange? Could we banish these numerical scores and solve chess? By “solve chess”, I mean be able to prove perfect play; that is, to be able to prove, from any position, the correct move (including move 1).

Lest I be accused of dreaming of something too implausible, precisely this has been done for games like Checkers and Connect-4 (more examples here).

It turns out that chess has been partially solved: computer scientists have managed to solve all games that land in King+Queen vs. King positions. Don’t believe me? Go here, and experience for yourself what it is like to play against God.

We can visualize on our cartoon how computer scientists have managed to “pull orange upwards” in an attempt to solve the SuperTree. These results are typically referred to as tablebases.

Chess Computation- BEV with Endgame Tablebase

Why haven’t tablebases reached the top of the SuperTree? Compare the size of tablebases as we increase the number of pieces. A three-piece tablebase would include, for example, our classic example of King+Queen vs. King. Such a tablebase only requires 62 kilobytes of memory. But consider what happens when we accomodate more pieces:

Number Of Pieces Tablebase Size (Bytes)
3 62,000 62 kB
4 30,000,000 30 MB
5 7,100,000,000 7.1 GB
6 1,200,000,000,000 1.2 TB
7 140,000,000,000,000 140 TB

To my knowledge, to date our modern supercomputers have failed to construct a complete 8-piece tablebase. Chess begins with 32 pieces. From the table above, when do you think the human race will possess enough computing resources to implement a 32-piece tablebase? 🙂

A Role For Numeric Scores

We now see why our scoring function cannot return game results for most moves in our path.

Consider the case where the results of the Supertree propagate upwards in a completely unpredictable manner. In this situation, we could do no better than rolling a dice to make our move, because no move can increase or chances of winning (or losing, or drawing).

But chess is not like this. Suppose you convince two evenly-matched players to play 100 games, and you remove the queen from the first player’s starting position before each match begins. The second player will probably win many more than 50 games. This tells us that chess results are not completely unpredictable. Locations in the SuperTree in which one player has a material advantage have an above-average chance of leading to a Win result, just as material disadvantages tend to balance the scale towards the Lose result. Given this window of improved predictability, we can prefer moves that improve our material situation.

Most moves in chess, regrettably, do not yield a material advantage. Can we do better than rolling the dice? Do there exist materially-equal positions where, when given to evenly-matched players, one player will win more frequently? Of course there are. We met some of these measures in An Introduction To Chess, for example:

  • Pawn structure. Some pawn configurations are more stable and defensible than others.
  • Space. Some pawn configurations allow a player relatively more room to maneuver.
  • King safety. How many defensive resources does a player have near his King?
  • Piece development. At what rate has one managed to bring pieces into the fight?
  • Piece cohesion. Are one’s pieces working together well?
  • Initiative. Who is making more threats, and dictating the course of the game?

Skill in chess is nothing more, and nothing less, than this: the ability to discover new features (patterns in the Supertree) and the ability to “mix & match” all features to produce a single move recommendation.

This is why our “green” numeric results are are so vital. Despite not knowing the true state of the underlying SuperTree, iteratively maturing your score function provides a player with an increasingly confident prediction of the SuperTree’s true value. In other words, these aggregations of different features comprise a heuristic flashlight that gives clues about the SuperTree.

Takeaways

  • Chess decisions are made via a nested algorithm.
    • At the higher level, a player must compute a decision tree, and make decisions by rolling up the leaf scores via the minimax procedure.
    • At the lower level, for each node in the decision tree, a player must invoke a scoring function to evaluate the desirability of that position.
  • Some decision trees can be evaluated as Win/Lose/Draw, rather than a particular number.
  • The SuperTree – the decision tree that contains all possible moves of any possible game – is a useful way to think about the game of chess.
  • We can think of specific games as traces in the supertree, and understand why attempts to solve the SuperTree (tablebases) have proved intractable.
  • Scoring functions only work if they provide access to regularities within the underlying SuperTree.

Decision Trees In Chess

Part Of: Chess sequence
Followup To: An Introduction To Chess
Content Summary: 1500 words, 15 min read

Setting The Stage

Let’s pick up where we left off. I had just played:

  1. e5 dxe5
  2. f5

A decision which as White’s most promising option, yielding a small (~0.2 pawn) increase in our advantage.

Starting Score End Score Difference
Estimated Total +0.4 +0.6 +0.2

Black responds by attacking White’s bishop, forcing another important decision.

  1. … f6

first_chess_position

In this post, I will be discussing several chess positions in depth. Unless you are a trained professional able to simulate move sequences in your head, it would be much better if you followed along on a board. I cannot underscore this point enough. You should be able to do this digitally, by re-enacting moves via this link: http://bit.ly/1GyEcQ5

Constructing Options

Back to the position at hand. Black is threatening my Bishop: other things being equal, I should remove it from danger.  Where to?  Bh4 feels wrong: the Bishop is misplaced.  Be3 is a natural retreat.  Bh6, offering a trade of Bishops (Bxh6 Qxh6), is also possible.

Our spidey-sense should be tingling here. Can we transcend a retreat, and pursue a more aggressive course of action? Can we create threats of our own, scary enough to dissuade Black from capturing our Bishop? If we captured his pawn (fxg6) could Black afford to capture our bishop? Similarly, what if we simply moved our Knight towards his vulnerable King?  (Ne4 fxg5 Nxg5) while losing two points, should be examined because our Knight is about to land in the powerful e6 square, and make Black’s life rather bleak.

Please note, explicitly, the kind of tradeoff we are here considering: position vs material (quality vs quantity).  Surrendering the latter for the former brings with it a large amount of risk – material factors “persist forever”, whereas positional advantages tend to be fleeting.  Further, note that as the pawns are move aside and our pieces make inroads into each others armies, our game will become highly volatile, with the entire game often hinging on one individual moment.  When the pace of the game accelerates in this way, it is said to produce sharp positions.

To restate our four options:

  • 22. Be3
  • 22. Bh6
  • 22. fxg6
  • 22. Ne4

I shall now survey my analysis of these options. I spent four hours combing over the implications of these options. I want you to pay attention to the language I use, and the inter-analysis differences you can spot.  I will be drawing from such observations in my next post.

Analyzing “Quiet” Moves

For our first positional option, if I play the quiet 22. Be3, Black will most likely resolve the f5-g6 pawn tension:

Suppose Black plays 22 … gxf5.  

Here, White cannot play the aggressive 23. Bc2 f4 24. Qd3 because Black’s King escapes after 24 …fxe3 25. Qxh7+ Kf8 (-3.0, [pos 1a]).  

Perhaps White does best with the quiet 23. Rxf5 e6 24. dxe6 Rxe6 25. Rh5 with control of the light squares (+0.2, [pos 1b]).

If he instead locks down the Kingside with 22 … g5, I will perhaps want to open it up with 22. h4.  Play may proceed 22 …h6 23. Kf2 Nf8 24. Rh1, with play on the Kingside (+0.5, [pos 1c]).

For our second positional option, 22. Bh6 is very similar to Be3, but forces the trade of the dark Bishops.

After 22 …g5 23. Bxg7 Kxg7 Black has more room to breathe than the former line, and can swing his Rooks over to the King’s defense (-0.2, [pos 2a]).

Here are the final positions & scores of the four lines we considered here:

Chess Decision Tree- Quiet Position Comparison

We can represent these analyses graphically, as follows:

Chess Decision Trees- Positional DTree after f5 f6

We are now done with half of our analysis. Take a deep breath, get a glass of water. We’re almost there! 😛

Analyzing “Sharp” Moves

Consider our first tactical option, 22. fxg6. Black may “play my game” and accept the pawn trade, or “call my bluff” and annihilate my bishop.

On 22 … hxg6 White’s threat is dissolved and he must bring his bishop to safety. After 23. Bh6, Black may try to protect his king by gaining space, or by holing up:

On 23 … f5 White must hammer on Black’s new center before it becomes too strong. 24. g4 can only be met with 24 … f4, after which 25. Bxg7 Kxg7 Ne4 and White’s attack is dead, and is left down a pawn (-0.8, [pos 3a])

Black may instead try the defensive 23 … Bxh6 24. Qxh6 Nf8, but White has a fighting chance after 25. Bg4 Rcd8 where his bishop has a promising view of the light squares (+0.0, [pos 3b])

On 22 … fxg5, Black dares White to prove his attack is worth more than the sacrificed Bishop. White is obliged to blow away Black’s safety net with 23. gxh7, after which Black has two options:

If Black recaptures 23 … Kxh7, after 24. Qxg5 White has a strong attack, potentially leveraging his Re4 to Rh4 (+1.4, [pos 3c])

Black can “hide behind” White’s pawn with 23 … Kh8. A plausible line here may be 24. Ne4 Bh6 25. Ng5 Bxg5 Qxg5, where again White dreams of bringing other pieces to bear in the attack (+0.8, [pos 3d])

Chess Decision Tree- Sharp Position Comparison Part One
Consider instead the second tactical option 22. Ne4. Here, Black may play defensively, capture my bishop, or capture my pawn:

On 22 … Nf8 a plausible line may proceed: 23. Bh6 Rb8 24. Bc2 Bc8 25. Ng3 with strong positional pressure (+0.5, [pos 4a])

On 22 … fxg5 23. Nxg5 Black is faced with several defensive options, of which three stand out:

On 23 … Nf8 24. fxg6 Nxg6 25. Rf7 Black can avoid the Ne6 fork with 25 … Qd6, with a very sharp, unclear game ahead (+0.0, [pos 4b])

On 23 … Qd6 24. fxg6 hxg6 25. Ne6 White’s control of the light squares is very powerful. Play might continue 25 … Nf8 26. Bg4 (+1.0, [pos 4c])

On 23 … Nf6 24. Ne6 Qd6 25. fxg6 hxg6 26. Qg5 White’s Queen bears down on the weak g-file (+0.8, [pos 4d])

Chess Decision Tree- Sharp Position Comparison Part Two

On 22 … gxf5 23. Rxf5 Black can challenge the Rook with 23 … e6 after which 24. dxe6 Rxe625. Bh6 and White is comfortable (+1.0, [pos 4e])

But Black can simply capture the bishop on instead. After 23 … fxg5 24. Nxg5, Black is again faced with many defensive options.

On 24 … Rf8 Qd3 White threatens to move the rook and checkmate via Qh7. (+0.9, [pos 4f])

On 24 … Nf8 Bg4 White is poised to threaten the Rc8, and counter Black moving his e-pawn (+0.7, [pos 4g])

On 24 … e6 Nxe6 25. Qd6 Qg5 White threatens Qg7 checkmate, but this threat can be blocked (+0.5, [pos 4h])

Chess Decision Tree- Sharp Position Comparison Part Three

We can again represent the above analyses graphically, as follows:

Chess Decision Trees- Sharp DTree after f5 f6

Chess Decisions Are Graph Computations

Congratulations on surviving the past two sections! We’re almost done. Take a deep breath.

The last question on our plate today is:

How should we take the above analyses and produce a move?

This seems difficult to answer. But consider the following simpler question

What is the score of our original four choices?

If we knew the score attached to our four possible moves, we would have an answer: make the move with the best possible score! 🙂

How to solve this simplified question? If we need scores in the head of the tree, graphically our task becomes propagating the already-known leaf scores back up the tree. Let’s see this in action.

Chess Decision Trees- Positional Minimax after f5 f6

We have a clear understanding of the score for the position after 22. Bh6: it is -0.2.

But what is the score after 22. Be3?

To answer this, we must first evaluate the score after 22 … gxf5. It is White’s turn, should he choose 23. Bc2 or 23. Rxf5? He should choose Rxf5, because it has a higher score! In mathematical terms, we simply take the maximum of all white moves: max(0.2, -3.0) = 0.2!

Now that we know that the state after gxf5 is +0.2, what is the score made by Be3? It is Black’s turn, should he choose gxf5 or g5? He should choose gxf5, becuase it has a lower score! In mathematical terms, we simply take the minimum of all of black’s option scores: min(0.2, 0.5) = 0.2

The above two paragraphs express the basics of the minimax algorithm. The name comes from the application of max() for White and min() for Black. Its procedure is how computers, and as I will argue brains, do chess.

Can you apply the same logic to fxg6 and Ne4?  Take a moment to valuate their scores, and then come back to check your work!

  • Be3: 0.2
  • Bh6: -0.2
  • fxg6: 0.0
  • Ne4: 0.0

Now, given these scores for White’s candidate moves, which should White pick? The maximum, or 22. Be3. And that is precisely the move I chose. 🙂

Here’s a graphical representation of the above results:

Chess Decision Trees- Complete Minimax after f5 f6

It is my hope that chess decisions now remind you of lightning physics:

Takeaways

  • When considering possible moves, it is helpful to graph moves as nodes in a decision tree.
  • Analyzing chess is hard work!
  • When you are done constructing the decision tree, the minimax algorithm translates its data into a decision.

Appendix

If anybody is interested in how the game concluded, let me attach a link to it here for completeness. While I did win, I feel like the win was not particularly instructive. This particular moment analyzed above, however, contained a nice mix of different strategies and thereby was pedagogically useful.

1. d4 Nf6 2. c4 g6 3. Nc3 Bg7 4. e4 d6 5. Nf3 O-O 6. Be2 c5 7. d5 a6 8. a4 Nbd7 9. O-O b6 10. h3 Bb7 11. Bf4 Re8 12. Qd2 Qc7 13. Rae1 Rad8 14. Bd3 Nh5 15. Bg5 Ne5 16. Be2 Nxf3+ 17. Bxf3 Nf6 18. Bd1 Rc8 19. f4 Nd7 20. e5 dxe5 21. f5 f6 22. Be3 g5 23. b3 Rb8 24. h4 gxh4 25. Qf2 h3 26. Qg3 Kh8 27. Bh5 hxg2 28. Kxg2 Rg8 29. Bg6 Bh6 30. Rh1 Nf8 31. Rxh6 e6 32. Qh4 exf5 33. Qxf6+ Rg7 34. Kf2 Nxg6 35. Rxg6 hxg6 36. Bh6 Kg8 37. Bxg7 Qxg7 38. Qe6+ Qf7 39. Qxe5 Rf8 40. Qe6 Qxe6 41. Rxe6 Kg7 42. Rxb6 Rf7 43. b4 cxb4 44. Rxb4 Kf6 45. Rb6+ Kg5 46. c5 f4 47. Ne4+ Kh4 48. Nd6 Bxd5 49. Nxf7 Bxf7 50. c6 Be6 51. c7 Bf5 52. Rxg6 Kh5 53. Rg8 1-0