Ferguson and Game Theory

Note: In this post I use words like "player," "game," and "payout" to refer to the interactions between criminals and police officers. I do not mean to trivialize deadly encounters between criminals and police officers, and I am not implying that police officers are playing a game at their jobs. These are the standard terms in discussions of game theory, and I use them to stay consistent with that literature.

In this blog, I usually talk about how ideas from physics apply to problems "outside physics". As far as I know, game theory isn't too useful in physics, but physicists have contributed to its theory, as in [1]. Game theory is an interesting way to mathematically model interactions between people.

In a game such as the prisoner's dilemma, each player chooses whether to cooperate with or defect against the opponent. If both players cooperate, they both get a mid-sized reward. If one player cooperates while the other defects, the cooperator gets nothing while the defector wins a large reward. If the player defects while the opponent cooperates, the converse occurs. If both players defect, they both get a small reward. While seemingly simple, there is a rich theory to games like these. In particular, the iterated prisoner's dilemma (where this game is repeated many times) is used to study population dynamics.

In a game, the rewards (or payouts as they will be referred to later) are characterized by a payout matrix. The payout matrix shows, for each player, what his or her payout will be given each of the possible outcomes.



Fig. 1: Examples of payout matrices for each player in the prisoner's dilemma. If both players cooperate, they both get a payout of 3. If both players defect, they both get a payout of 1. If one player defects while the other cooperates, the defector gets 5 while the cooperator gets 0. Note that for the prisoner's dilemma, the game looks the same for each player. In the game I consider below, this is not the case.
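As a minimal sketch of how a payout matrix is read, the snippet below encodes the values from Fig. 1 and looks up the best reply to each opponent move. The dictionary layout and the name `best_response` are hypothetical choices made for illustration only.

```python
# One player's payouts from Fig. 1, indexed as PAYOUT[my_move][opponent_move].
# The game is symmetric, so the same table serves both players.
PAYOUT = {
    "cooperate": {"cooperate": 3, "defect": 0},
    "defect": {"cooperate": 5, "defect": 1},
}


def best_response(opponent_move):
    """Return the move that maximizes my payout against a fixed opponent move."""
    return max(PAYOUT, key=lambda my_move: PAYOUT[my_move][opponent_move])


for opponent_move in ("cooperate", "defect"):
    print(opponent_move, "->", best_response(opponent_move))
```

With these numbers, defecting is the best reply to either move, even though mutual cooperation (3 each) pays more than mutual defection (1 each). That tension is what makes the game a dilemma.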

We will apply this type of game theory logic to whether a police officer should shoot a criminal or not. This game is more complicated than the classical prisoner's dilemma because the payoffs are not the same for the two players (police officer and criminal), even if we assume each player values his or her life the same. Further, the criminal's payoff matrix is certain, while the police officer's payoff matrix has some uncertainty.

The scenario I am imagining is the following, inspired by the Ferguson incident. A criminal has committed a crime and is being chased down by one or two police officers. I will take cooperating to mean that a player does not attack the other player, and defecting to mean that a player does attack the other player. I assume that if a player has a gun and defects, the other player dies, though most of the following argument would also hold if the other player were severely injured instead. For payouts, I assume a hierarchy where the payout for nothing happening >> the payout for arrest or suspension >> the payout for death (or severe injury). I will take specific values of 0, -100, and -100000 for these payouts, mostly to make the arguments easier to understand, but the general results should hold regardless of the specific values chosen, as long as this hierarchy is in place. I assume that the game is played only once between the criminal and police officer(s), but it would be possible to think of this as an iterated game, where both players have to assess at each stage how much danger they are in and decide whether to defect or cooperate.

The payout matrix for the law enforcement officer is uncertain. The payout looks like the following, where the top consequences are the payouts when the criminal has a gun (or a lethal weapon) and the bottom are consequences when the criminal does not.



Fig. 2: Payout matrix for the law enforcement officer. The top consequences are the consequences when the criminal has a lethal weapon, whereas the bottom consequences are when the criminal does not.

I would like to assign numerical values to these scenarios. In order to average the two scenarios (criminal having a lethal weapon and not), I will assume a prior distribution: the fraction of arrested criminals who are carrying lethal weapons. This prior distribution presumably depends on the type of crime the criminal committed, but I will not take this factor into account in the simple model I am considering. I was not able to find information about this distribution easily (though I hope this information is available to law enforcement officers), so I assume that 50% of criminals are carrying lethal weapons. However, I expect that this is an overestimate.



Fig. 3: Numerical values for the payout matrix for the law enforcement officer. The values are obtained by averaging the payouts of each of the consequences, weighted by the probability that the criminal has a lethal weapon.

Thus, if the law enforcement officer knows that the criminal is cooperating, the optimal thing for him or her to do is to cooperate as well, as this is what maximizes the payout. Unfortunately, there is a chance the criminal will defect. In this case, the police officer should still cooperate, as there is a chance the criminal does not have a lethal weapon. Then, the officer has no consequences, which is better than facing suspension. Note, however, that if the criminal is defecting, the difference in payout for the two strategies of the police officer is quite small (0.1% with the numbers I have chosen). The two strategies are almost equivalent to the police officer.
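The averaging over the prior can be sketched in a few lines. The cell values below are an assumption inferred from the paragraph above rather than taken from the figure: an officer facing an armed, attacking criminal dies whichever move the officer makes, and against an unarmed criminal the officer faces nothing when cooperating and a suspension when defecting. The death payout is set to -100000, the value that reproduces the 0.1% gap quoted above. The function and constant names are hypothetical.

```python
# Payout scale: nothing happens, suspension, death.
NOTHING, SUSPENSION, DEATH = 0, -100, -100_000


def officer_payouts_if_criminal_defects(p_weapon):
    """Expected officer payouts, given that the criminal attacks.

    Assumed cells (inferred from the text, not read off the figure):
      officer cooperates: death if the criminal is armed, else nothing
      officer defects:    death if the criminal is armed, else suspension
    """
    cooperate = p_weapon * DEATH + (1 - p_weapon) * NOTHING
    defect = p_weapon * DEATH + (1 - p_weapon) * SUSPENSION
    return cooperate, defect


# The post assumes half of all criminals carry a lethal weapon.
cooperate, defect = officer_payouts_if_criminal_defects(0.5)
relative_gap = (cooperate - defect) / abs(cooperate)
print(cooperate, defect, f"{relative_gap:.1%}")
```

Under these assumptions the gap between the two strategies grows as `p_weapon` shrinks, which is one way to see why a better estimate of the prior matters to the officer.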

Now consider the criminal. The payout matrix for the criminal is certain, as it is common knowledge that law enforcement officers carry weapons. It would look like the matrices below.



Fig. 4: Descriptive and numerical payout matrices for the criminal. We take the criminal defecting, police officer cooperating case to be -1000 as it is part way between -100 and -100000. This may be an underestimate as if a criminal manages to shoot a police officer, there is a chance that another police officer will shoot the criminal.

The interesting thing here is that if the police officer is defecting (which, recall, is not such a bad strategy for the police officer if the police officer assumes the criminal will defect), there is no optimal strategy for the criminal. The criminal dies in both cases. Since the criminal's choice makes no difference when the police officer defects, the criminal can only optimize the case in which the police officer cooperates, meaning the criminal should cooperate. Thus, the Nash equilibrium is for the criminal and police officer to both cooperate. This is presumably what happens most of the time. Criminals are arrested every day but we (comparatively) rarely hear about incidents where the criminal or police officer is killed. So does this mean game theory cannot explain what happened at Ferguson? Not quite.
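The equilibrium claim can be checked by brute force. This is a sketch under stated assumptions: the criminal's numbers are -100 for arrest, -1000 for attacking a cooperating officer, and -100000 for death; the officer's numbers are the 50% weighted averages, with -100 (a suspension) assumed for shooting a cooperating criminal and death assumed whenever an armed criminal attacks. None of these cells are read directly off the figures, and the names are hypothetical.

```python
from itertools import product

MOVES = ("cooperate", "defect")

# Both tables are indexed as TABLE[criminal_move][officer_move].
CRIMINAL = {
    "cooperate": {"cooperate": -100, "defect": -100_000},
    "defect": {"cooperate": -1_000, "defect": -100_000},
}
OFFICER = {
    "cooperate": {"cooperate": 0, "defect": -100},  # -100 is an assumption
    "defect": {"cooperate": -50_000, "defect": -50_050},
}


def is_nash(criminal_move, officer_move):
    """True if neither player gains by changing only his or her own move."""
    criminal_ok = all(
        CRIMINAL[criminal_move][officer_move] >= CRIMINAL[alt][officer_move]
        for alt in MOVES
    )
    officer_ok = all(
        OFFICER[criminal_move][officer_move] >= OFFICER[criminal_move][alt]
        for alt in MOVES
    )
    return criminal_ok and officer_ok


equilibria = [pair for pair in product(MOVES, MOVES) if is_nash(*pair)]
print(equilibria)
```

With these assumed numbers, mutual cooperation is the only pair of pure strategies that survives the check.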

Let's say the strategy of the police officer is to defect a fraction x of the time and to cooperate 1-x of the time. Then the expected payout (the sum over cases of probability multiplied by payout) for the criminal is -100(1-x)-100000x if the criminal cooperates and -1000(1-x)-100000x if the criminal defects. As an example, take x to be 0.01. At this value, the expected payout for the criminal is -1099 if he or she cooperates and -1990 if he or she defects. Clearly, the defecting strategy is still worse, but the two payouts now differ by less than a factor of two, compared with a factor of ten when x is 0. The difference would be even smaller if the fraction x were higher or if the payout of death were even more negative. Thus, if there's any reason for the criminal to think that the police officer may defect (and given news reports of incidents like Ferguson, there is probably reason for the criminal to expect this), cooperating and defecting are not so different in terms of payout. This means the actions of the criminal become unpredictable, which means the police officer may need to defect to preserve his or her life.
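The two expected-payout expressions above are easy to tabulate. The sketch below uses the payouts from the text (-100 for arrest, -1000 for attacking a cooperating officer, -100000 for death); the constant and function names are hypothetical, and the list of x values is chosen only for illustration.

```python
ARREST, ATTACK_AND_SURVIVE, DEATH = -100, -1_000, -100_000


def criminal_expected_payouts(x):
    """Expected payouts for the criminal when the officer defects with probability x."""
    cooperate = ARREST * (1 - x) + DEATH * x
    defect = ATTACK_AND_SURVIVE * (1 - x) + DEATH * x
    return cooperate, defect


for x in (0.0, 0.01, 0.1, 0.5):
    cooperate, defect = criminal_expected_payouts(x)
    ratio = defect / cooperate  # how many times worse defecting is
    print(f"x={x:<5} cooperate={cooperate:>10.0f} defect={defect:>10.0f} ratio={ratio:.2f}")
```

The ratio column falls toward 1 as x grows, which is the near-degeneracy described above: the death term dominates both expressions, so the criminal's own choice matters less and less.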

The main cause of this issue is that dying is really bad. Thus, if there's even a small chance that the police officer will defect, the criminal's choices become mostly equivalent. Now suppose police officers are only able to use non-lethal means against the criminal. Then the payout matrix for the criminal looks quite different.



Fig. 5: Descriptive and numerical payout matrices for the criminal when the police officer cannot use lethal means against him or her. I choose a value of -200 for getting hurt, though it may not necessarily be this bad. If the criminal can prove that the police officer attacked him or her without just cause, the criminal may even receive some sympathy.

Now, if the criminal defects, the payout is -1000 no matter what, while if the criminal cooperates the expected payout is somewhere between -100 and -200, depending on the police officer's strategy. The expected payout is no longer (approximately) degenerate as above. The clear choice here is to cooperate, as defecting is five to ten times worse. If the criminals and police officers are acting rationally, then an assurance by law enforcement not to kill criminals seems to strengthen the criminals' incentive to cooperate. Of course, the problem with this is that while the criminal's payoff matrix has changed by ensuring the police officer will not kill the criminal, the police officer's payoff matrix remains the same.
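For comparison with the lethal case, here is the same expected-payout calculation with the non-lethal values from Fig. 5 (-100 for arrest, -200 for getting hurt, -1000 for defecting). As before, x is the fraction of the time the officer defects, and the names are hypothetical.

```python
ARREST, HURT, ATTACK = -100, -200, -1_000


def criminal_expected_payouts_nonlethal(x):
    """Expected payouts for the criminal when the officer cannot use lethal force."""
    cooperate = ARREST * (1 - x) + HURT * x
    defect = ATTACK  # -1000 whatever the officer does
    return cooperate, defect


for x in (0.0, 0.01, 0.5, 1.0):
    cooperate, defect = criminal_expected_payouts_nonlethal(x)
    print(f"x={x:<5} cooperate={cooperate:>7.0f} defect={defect:>7.0f} ratio={defect / cooperate:.1f}")
```

Unlike the lethal case, the ratio never gets close to 1 for any x, so cooperating stays clearly preferable whatever the criminal believes about the officer.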

As mentioned earlier, unless the probability that the criminal has a lethal weapon is extremely small, if there is a chance the criminal will defect, the expected payout for the police officer is also nearly degenerate no matter the police officer's strategy. The police officer's situation could be improved, though, if the chance of the police officer dying were smaller. This could be achieved through protective equipment and training on how to avoid being fatally wounded in these encounters. In addition, there is a crucial difference between the criminal and police officer in that the police officer has more experience with these encounters than the criminal does (though the police officer is disadvantaged in that he or she has less information about the opponent). Thus, it is imperative that the police officer is trained to make the right decisions in these tough situations. As stated before, this is presumably what happens most of the time when criminals get arrested, and thus the police officers are usually doing a good job. If the payoff to the police officer were truly degenerate, we would expect the police officers to defect about half of the time. However, there are incidents where the police officer unnecessarily perceives danger. Police officers should be aware of their risks so that tragedies do not occur.

The payoff matrix for the police officer also looks quite different based on the prior distribution (or expectation) of the criminal's likelihood to be carrying a lethal weapon. I assume I have overestimated the likelihood a criminal carries a lethal weapon, but with this information the police officer's "strategy" could be improved, and as I stated above, this could prevent people from unnecessarily dying.

Most of the analysis above has depended on the players of the game acting at least approximately rationally. It is an interesting question, though, how rationally we can expect the players in this game to act. The criminal is presumably "on edge" from just having committed a crime, and being chased down by police officers would probably not help the situation. In a study of the game show Golden Balls, contestants were found to be more altruistic, in general, than a simple game theory analysis of rational players would predict [2]. This shows that even in scenarios where life or death is not on the line, people cannot be expected to act fully rationally. With information about how criminals are likely to act, police officers can develop better strategies for dealing with situations such as this one.

References
1. Press, W.H., Dyson, F.J., 2012. Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent. Proc. Natl. Acad. Sci. 109(26), 10409–10413.
2. Van den Assem, M.J., van Dolder, D., Thaler, R.H., 2012. Split or Steal? Cooperative Behavior When the Stakes Are Large. Management Science 58(1), 2–20.