# Rock Paper Scissors With Matrices

I had a go at the first couple of weeks of problems from Advent of Code this year. I’ve posted my answers over here. I stopped after day 13 because the problems were getting harder but the time I had to devote to them could not increase. I hope to come back to them at some point.

But today I wanted to point out something interesting about the day 2 puzzle. Spoilers ahoy for day 2 of Advent of Code 2022. This post is mainly an excuse to test MathJax on the blog.

The puzzle involves figuring out the results of a Rock Paper Scissors competition. I actually did two solutions. The first just relies on the fact that if you map Rock, Paper, Scissors (henceforth RPS) to 0, 1, 2, then the difference between my move and yours modulo 3 determines who won: 0 is a draw, 1 is a win for me, 2 is a win for you. Or, if you like, since we’re dealing with modular arithmetic here, −1 is a win for you. Then it’s just a matter of translating those results into points: 6 for a win, 3 for a draw, 0 for a loss. The other wrinkle is that you get points for the move you played: Rock gets you 1, Paper 2 and Scissors 3. Again, pretty easy to add in. Incidentally, I wonder what this additional points thing would do to the optimal strategies in RPS…
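In Python, that first approach might look something like the following. This is just a sketch of the idea, not my actual solution (the function name `score` and the dictionary lookup are mine for illustration):

```python
# A sketch of the mod-3 idea, not my actual solution.
# Moves are mapped Rock=0, Paper=1, Scissors=2.
def score(you: int, me: int) -> int:
    outcome = (me - you) % 3            # 0 = draw, 1 = I win, 2 = you win
    result_points = {0: 3, 1: 6, 2: 0}[outcome]
    move_points = me + 1                # Rock scores 1, Paper 2, Scissors 3
    return result_points + move_points

# My Paper (1) against your Rock (0): 6 for the win + 2 for Paper = 8
```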

But it’s the second solution that I want to try to unpack a little. So in figuring out the first solution, I drew myself a few little 3×3 arrays to try to figure out which plays result in which results. And in doing so, I thought, “hm, I bet I could solve this just using matrices”. Once I’d finished my first solution, I decided to have a go at a second one using matrix multiplication.

So, here’s the first step. Define this matrix:

$$M_1 = \begin{pmatrix} 3 & 6 & 0 \\ 0 & 3 & 6 \\ 6 & 0 & 3 \end{pmatrix}$$

Note that the first row lists the outcomes in points if you were to play Rock. (For now we’re ignoring the points for playing a specific move; we’ll come back to that.) That is, if I also play Rock we look at the first value – 3, a draw – if I play Paper we look at the second value – 6, a win for me – etc. Likewise, the second row lists the results if you were to play Paper, and the third, Scissors.

Now, instead of mapping RPS to 0, 1, 2, map them to the unit vectors $R = (1,0,0), P = (0,1,0), S = (0,0,1)$. Then you can work out who won in the following way: if we multiply your vector by the matrix $M_1$, we get the row of results described above. So if we multiply, say, $P$ by $M_1$:

$$(0,1,0) \times \begin{pmatrix} 3 & 6 & 0 \\ 0 & 3 & 6 \\ 6 & 0 & 3 \end{pmatrix} = \begin{pmatrix} 0 & 3 & 6 \end{pmatrix}$$

These unit vectors basically pick a row of our matrix. (If we multiply the other way round, i.e. $M_1\times y$ with $y$ as a column vector, we select a column of the matrix.) And if we multiply this output vector by my move, we get the result. So the outcome of a game of RPS in score terms is just $(y\times M_1)\times x$.

What about the bonus score for making specific moves? Well, we can just multiply my move by $(1,2,3)$ to get the score we need to add. So the solution to part one of AoC is simply to turn the inputs into the relevant unit vectors, and then do $(y\times M_1)\times x + x\times (1,2,3)$. ($x$ is a row vector, so I guess that last term should be a column, but that would take up too much vertical space. You get what I mean, and numpy is pretty forgiving about multiplying vectors…) Part one, job done. For now…
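In numpy, that whole part-one recipe is only a few lines. Again this is a sketch rather than my actual code, and the names (`M1`, `move_bonus`, `score_part_one`) are mine:

```python
import numpy as np

# Part one as matrix multiplication.
M1 = np.array([[3, 6, 0],
               [0, 3, 6],
               [6, 0, 3]])
move_bonus = np.array([1, 2, 3])

# The unit vectors for Rock, Paper, Scissors are the rows of the identity
R, P, S = np.eye(3, dtype=int)

def score_part_one(y, x):
    # (y @ M1) picks out the row for your move;
    # dotting with x then reads off my result
    return (y @ M1) @ x + x @ move_bonus

# You play Paper, I play Scissors: 6 for the win + 3 for Scissors = 9
```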

On to part two. Here we learn that the input that we thought was telling us what move to make is actually telling us what result to aim for (i.e. win, lose or draw). Again, all we need to do is keep track of points, so figuring out how many points we get for our result is way simpler now: multiply my unit vector $x$ by $(0,3,6)$. That is, if it’s telling me to draw ($(0,1,0)$) then I get $(0,3,6) \times (0,1,0)= 3$. However, figuring out how many bonus points we get for the move we play becomes trickier. Think about it this way: if you play Rock, and I want to lose, I need to play Scissors, which earns me 3 points for the move. Figure out all those combinations and write them into a matrix as we did for part one, and we get:

$$M_2 = \begin{pmatrix} 3 & 1 & 2\\ 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$$

So then part two is the same deal as before, but with different matrices: $(y\times M_2)\times x + x \times (0,3,6)$. Part two done. That’s basically the most readable way of seeing this approach to the solution. Each part is a sum of two terms, and the two terms correspond to the score you get for the result and the score you get for the move you make.
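And the part-two version is the same sketch with the matrices swapped out (names again mine for illustration):

```python
import numpy as np

# Part two: x now encodes my target result (lose, draw, win), and M2
# gives the move bonus for whichever move achieves that result.
M2 = np.array([[3, 1, 2],
               [1, 2, 3],
               [2, 3, 1]])
result_points = np.array([0, 3, 6])

R, P, S = np.eye(3, dtype=int)            # your move
LOSE, DRAW, WIN = np.eye(3, dtype=int)    # my target result

def score_part_two(y, x):
    # (y @ M2) @ x is the bonus for the move I must play;
    # x @ result_points is the score for the result itself
    return (y @ M2) @ x + x @ result_points

# You play Rock and I must win: I play Paper (2) + 6 for the win = 8
```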

But we can noodle around with this a little more. We can actually get rid of that second term in both parts. Remember that we calculated $M_1$ by figuring out what you’d score depending on your move and your opponents. Well, we can just repeat that process taking into account the points you also score for your move. We get:

$$\begin{pmatrix} 4 & 8 & 3 \\ 1 & 5 & 9 \\ 7 & 2 & 6 \end{pmatrix}$$

So if we just multiply your move by this matrix, and then by my move, as we did before, we’ll arrive at the same result.

We can do the same thing for part 2. We get:

$$\begin{pmatrix} 3 & 4 & 8 \\ 1 & 5 & 9 \\ 2 & 6 & 7 \end{pmatrix}$$

So we could just hard-code these two matrices into our solution and solve things pretty quickly – it’s arguably a slightly more elegant solution, in that we do fewer operations per game of RPS. But hard-coding the payoff matrices obscures a symmetry between the solutions to parts one and two. So let’s break them down again.

Let’s start with the part one solution matrix. Let’s note that it breaks down as follows:

$$\begin{pmatrix} 4 & 8 & 3 \\ 1 & 5 & 9 \\ 7 & 2 & 6 \end{pmatrix} = M_1 + \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}$$

Note also that:

$$M_1 = 3\times \begin{pmatrix} 1 & 2 & 0 \\ 0 & 1 & 2 \\ 2 & 0 & 1 \end{pmatrix}$$

And we can break down the payoff matrix to part two as follows:

$$\begin{pmatrix} 3 & 4 & 8 \\ 1 & 5 & 9 \\ 2 & 6 & 7 \end{pmatrix} = M_2+ \begin{pmatrix} 0 & 3 & 6\\ 0 & 3 & 6\\ 0 & 3 & 6\\ \end{pmatrix}$$

Further (reading the scalar $1$ numpy-style, i.e. as adding $1$ to every entry):

$$M_2 = 1 + \begin{pmatrix} 2 & 0 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 0 \\ \end{pmatrix}$$

We’re seeing the same patterns in parts one and two. Each payoff matrix is a sum of a part that depends on your opponent’s move (that’s $M_1$ or $M_2$) and a part that doesn’t (that’s the part where each row is the same).

Let’s start with the second component, since it’s simpler. Looking at the two second components, we can see that they’re both derivable from the same base matrix:

$$B = \begin{pmatrix} 0 & 1 & 2 \\ 0 & 1 & 2 \\ 0 & 1 & 2 \\ \end{pmatrix}$$

The second component of part one is $B+1$ and the second component of part two is $3 \times B$. Remember, this second component is the part of the score that doesn’t depend on your opponent’s move. In part one that was the score you got for the move you chose (1, 2 or 3), and in part two it’s the score you got for the result (0, 3 or 6). The other component of the score – the part that depends on your opponent’s move, the $M_i$s – is the other way round: in part one it’s the score for the result (0, 3 or 6), and in part two it’s the score for the move you chose (1, 2 or 3). You might think we can treat this first component just as we did the second. We first define:

$$A = \begin{pmatrix} 1 & 2 & 0 \\ 0 & 1 & 2 \\ 2 & 0 & 1 \\ \end{pmatrix}$$

and then define the first components for parts one and two as $3\times A$ and $A+1$ respectively. This almost works. It is indeed true that $M_1 = 3\times A$, but if you look at $M_2$ you’ll see that it’s “upside down” compared to $A+1$. This kind of makes sense, because in part one you’re calculating (a score for a) result by plugging in your move and your opponent’s move, whereas in part two you’re calculating your move (or the score for your move) based on a fixed result and your opponent’s move. So for part two we need to flip the rows of the matrix upside down. We can flip the rows of a matrix by multiplying it on the left by a matrix with ones on the anti-diagonal (that is, the diagonal that goes from North-East to South-West; the “Z”-diagonal as opposed to the “N”-diagonal).

Call this matrix:

$$F = \begin{pmatrix} 0 & 0 & 1\\ 0 & 1 & 0\\ 1 & 0 & 0\\ \end{pmatrix}$$
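A quick numpy check (a sketch, not part of my actual solution) that left-multiplying by $F$ reverses the rows, turning $A+1$ into $M_2$:

```python
import numpy as np

F = np.array([[0, 0, 1],
              [0, 1, 0],
              [1, 0, 0]])
A = np.array([[1, 2, 0],
              [0, 1, 2],
              [2, 0, 1]])
M2 = np.array([[3, 1, 2],
               [1, 2, 3],
               [2, 3, 1]])

# Left-multiplying by F permutes the rows into reverse order
assert (F @ (A + 1) == M2).all()
```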

Since the rows of $3\times B$ are all the same, we can move the “flipping” multiplication outside the brackets to emphasise the symmetry of the payoff matrices for parts one and two. The payoff matrices are thus:

• Part One: $3\times A + B + 1$
• Part Two: $F\times (A+1 + 3\times B)$
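As a final sanity check, a few lines of numpy (a sketch, with names of my choosing) confirm that these two expressions reproduce the hard-coded payoff matrices from earlier:

```python
import numpy as np

A = np.array([[1, 2, 0], [0, 1, 2], [2, 0, 1]])
B = np.array([[0, 1, 2], [0, 1, 2], [0, 1, 2]])
F = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]])

# Rebuild both payoff matrices from the shared components A, B, F
part_one = 3 * A + B + 1
part_two = F @ (A + 1 + 3 * B)

assert (part_one == np.array([[4, 8, 3], [1, 5, 9], [7, 2, 6]])).all()
assert (part_two == np.array([[3, 4, 8], [1, 5, 9], [2, 6, 7]])).all()
```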

I thought that was kind of neat. The code for my solution to day 2 using matrices is on Codeberg. This post has explained the maths behind the solution. There are a couple of slightly unusual language features I make use of that I’ll explain in a follow-up post.