# Bayesian Statistics and the Law

There were many winners at the Golden Globes this year, including The People v. O.J. Simpson: American Crime Story, which came away with 2 wins and 5 nominations. The program, which aired at the beginning of 2016, was praised for bringing the famous case back into the public eye after 20 years. The court case is also an excellent example of the prosecutor’s fallacy, a well-known error in reasoning about conditional probabilities that Bayesian statistics helps to expose.

Bayesian statistics has a different philosophy to the classical (or frequentist) statistics that most people know. Bayes’ Theorem is credited to Reverend Thomas Bayes (1701-1761) and is a piece of probability theory; it was not published until 1763, after Bayes’ death, by his friend Richard Price. [1] The difference between Bayesian and frequentist approaches to statistics comes from the methods of inference built on this theorem. One key difference is that Bayesians take the data observed in an experiment as fixed and the parameters in their model as uncertain. By contrast, frequentists take the parameters as fixed and work out how likely the observed data are. Whilst both have their advantages, there is controversy over which is the superior method.

Now let us return to Bayes’ Theorem. It links the probability of event A given event B to the probability of event B given event A. In notational form it states:

**P(A|B) = [P(B|A) * P(A)] / P(B)**

Here **P(A|B)** stands for the probability of an event A given that we have evidence of event B, and the formula requires that P(B) is greater than zero. This theorem is not controversial and is widely accepted (the controversy comes from the inference built on the theorem). There are various ways to write this theorem, although this is the most basic form, describing a discrete case with just two events.
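As a minimal sketch of the formula in action, here is a small Python function. The function name `bayes_posterior` and the playing-card example are illustrative assumptions, not part of the original argument: we draw one card from a standard 52-card deck and ask for the probability that it is a king (A) given that it is a face card (B).

```python
from fractions import Fraction


def bayes_posterior(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) * P(A) / P(B); requires P(B) > 0."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_b_given_a * p_a / p_b


# Illustrative example (not from the article): one card from a 52-card deck.
p_king = Fraction(4, 52)          # P(A): 4 kings in the deck
p_face = Fraction(12, 52)         # P(B): 12 face cards (J, Q, K in 4 suits)
p_face_given_king = Fraction(1)   # P(B|A): every king is a face card

p_king_given_face = bayes_posterior(p_face_given_king, p_king, p_face)
print(p_king_given_face)  # 1/3
```

Exact fractions are used here so that the result can be checked by hand: 4 of the 12 face cards are kings.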

Statistics is incredibly versatile: we use it in many areas of life. It is no surprise, therefore, that statistics are so often quoted in courtrooms. The 1994 case of O.J. Simpson, in which the former American football player was on trial for the murder of his ex-wife, Nicole Brown Simpson, became famous and fuelled media coverage for months. During the trial, Simpson’s defense attorneys employed an advisor who claimed that the accusation of spousal abuse was not relevant. This claim was supported by evidence that “*only about one in 2500 men who batter their significant others… go on to kill them*”, i.e. the probability that a man killed his wife, given that he abused her, is 1/2500. [2]

At the time this seemed like fair evidence in favour of the claim of irrelevance, and you may be thinking the same thing. However, in 1995 I. J. Good published a paper suggesting that the statistic in question should instead have been the probability that an abusive man is guilty of murder given that his abused partner has been killed. This probability, according to Good’s calculations, was actually one in three, making the accusation of abuse much more important. [2]

What is the difference? Well, let’s look at it in mathematical terms. We shall denote “man abused his partner” as A, “woman is murdered” as M and “woman killed by her abuser” as K. We can compare the statistic given in the trial, **P(K|A)**, with the probability suggested by Good, **P(K|M)**. Instead of Good’s method, which used simplifying assumptions, Skorupski and Wainer use Bayes’ Theorem, **P(K|M) = P(M|K) P(K) / P(M)**, to calculate the new probability as 0.29, or about 1 in 3.5. [2] This shows that, to some degree, Good was right: there is a significant difference between the statistic reported and the one he considers more relevant. The evidence given in court was correct; however, it was arguably irrelevant. “*A murdered woman having been murdered by her batterer is 709 times more likely than a battered woman being murdered by her batterer: P(K|M) ≈ 709 × P(K|B)*”. [2] (The quote writes B, for “battered”, where this article writes A.)
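As a quick sanity check, the figures quoted above can be compared with one another. The sketch below uses only the numbers already given (1/2500, the factor of 709 and the reported 0.29); the variable names are my own, and the small gap between the implied and reported values is presumably down to rounding in the quoted figures.

```python
# Figures quoted in the text above.
p_k_given_a = 1 / 2500   # statistic presented at the trial
ratio = 709              # multiplier quoted from Skorupski and Wainer
reported_p_k_given_m = 0.29

# P(K|M) implied by the quoted ratio.
implied_p_k_given_m = ratio * p_k_given_a
print(round(implied_p_k_given_m, 4))       # 0.2836

# The reported probability expressed as "1 in N".
print(round(1 / reported_p_k_given_m, 2))  # 3.45
```

Either way, the probability is in the region of 0.28 to 0.29, hundreds of times larger than the 0.0004 quoted in court.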

Whilst this is an incredibly important error that can be made in court, it is also made in other situations. Often we are given the probability of a false positive on a medical test when we should be concerned with the probability that a positive result is false. Again, the difference is subtle but important: P(positive | no disease) isn’t the same as P(no disease | positive). However, Bayes’ Theorem connects the two, and it may be possible to find P(no disease | positive) by calculating P(positive | no disease) * P(no disease) / P(positive). Despite its simplicity, this calculation isn’t always easy because the necessary values may not be available.
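To see how far apart these two probabilities can be, here is a minimal sketch in Python. All three input values (a 1% prevalence, a 95% detection rate and a 5% false positive rate) are hypothetical numbers chosen for illustration, and P(positive) is obtained by adding up the two ways a positive result can arise.

```python
# Hypothetical inputs, chosen only for illustration.
p_disease = 0.01                 # prevalence: P(disease)
p_pos_given_disease = 0.95       # P(positive | disease)
p_pos_given_no_disease = 0.05    # false positive rate: P(positive | no disease)

p_no_disease = 1 - p_disease

# Law of total probability: a positive comes from a sick or a healthy person.
p_positive = (p_pos_given_disease * p_disease
              + p_pos_given_no_disease * p_no_disease)

# Bayes' Theorem: P(no disease | positive).
p_no_disease_given_pos = p_pos_given_no_disease * p_no_disease / p_positive

print(f"P(positive | no disease) = {p_pos_given_no_disease:.2f}")
print(f"P(no disease | positive) = {p_no_disease_given_pos:.2f}")
```

With these assumed inputs, a test that wrongly flags only 5% of healthy people still produces positive results that are wrong most of the time, because healthy people vastly outnumber sick ones.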

In the case of the O.J. Simpson trial, the statistic was easy to compute because all of the probabilities involved could be found. However, a controversial element of Bayesian statistics is the use of priors. Priors are values that encode what is known before the data are observed. In this example, each probability was based on evidence and fact, but this is not always the case: we don’t always have all the facts available to us, or some are not in numerical form. Consider, for example, the probability that the sun won’t rise tomorrow. It is sometimes necessary to assign numbers to this knowledge in a subjective manner, which is why there is some controversy. How can we use subjective techniques within something like statistics? Well, there are several arguments in favour of this method, including the ability to choose priors that have minimal effect on the result, and the view that letting a piece of prior knowledge make a small difference is surely better than dismissing it completely. The xkcd comic strip “Frequentists vs. Bayesians” (linked in the image credits) captures this argument well.

Bayesian statistics is becoming more popular and it is worth considering in more detail. In our society, which is so dependent on mathematics, we must be aware of the impact of the statistics we use in our lives. As you’ve read, it even has the power to change the direction of a murder trial. The prosecutor’s fallacy is an excellent example of statistics being misused and a reminder of the importance of checking that we know what we are looking for before jumping into a pool of numbers.

References:

1. University of St Andrews, MT4531 Bayesian Inferences Module Notes, Len Thomas, 6/10/16.
2. William P. Skorupski and Howard Wainer, “The Bayesian flip: Correcting the prosecutor’s fallacy”, Significance, The Royal Statistical Society, 2015.

feature image: landwehrlawmn.com/2016/02/lemon-law/

image 1 – en.wikipedia.org

image 2 – www.goldenglobes.com

image 3 (comic): https://www.explainxkcd.com/wiki/index.php/1132:_Frequentists_vs._Bayesians