Before the evidence involving the defendant's acts in a case is considered an
assumption may be made about the likelihood that the defendant is guilty. This
is different from the presumption of innocence, which simply means that the
prosecutor must prove that the defendant is guilty and that the defendant does
not have to prove innocence (see R v
Wanhalla 24/8/06, CA321/05, [2007] 2 NZLR 573, (2006) 22 CRNZ 843 at [49],
mentioned here
on 25 August 2006). But, at least on a mathematical approach to conditional
probabilities, there must be some starting assumption about whether the
defendant is guilty in order that the effect of evidence can be determined. The
ultimate decision, the verdict, will depend on how the evidence has affected
the prior likelihood of guilt.
The “priors” can be expressed as a likelihood of guilt
compared to a likelihood of innocence, each assessed before the evidence as to what the defendant did is considered. A ratio of probabilities is one way of expressing this
comparison of likelihoods. Are there appropriate numbers for making this
comparison?
For some people, in the absence of evidence on the point given at trial, a starting point may be that a
probability of guilt of 0.02 (that is, two chances out of a hundred), and a
corresponding probability of innocence of 0.98, are a good way of reflecting
the need to be fair. Almost certainly innocent, but recognising that there
could be room for error about that, seems like a fair starting point. This
might be the same ratio of prior probabilities as for anyone chosen at random. Indeed, a criterion of random selection can lead to very small probabilities of guilt, for example if the population of a large city is taken as the reference group.
Other people might say, well, the defendant is either
guilty or not guilty, so an equal chance of each alternative is a neutral
starting point. For these people a probability of guilt of 0.5, and the same
probability of innocence, is a fair starting point. Refusing to start with an
inclination either way seems fair.
Currently, it is not routine for this to be mentioned in a trial. This may
be because it is by no means clear that a starting point is necessary: why not
just listen to the evidence and get on with it? The answer is that, without a
starting point, logical errors are likely to occur. A fact-finder will naturally ask, how much more
consistent with guilt than with innocence is this evidence? This is the same as
asking, what is the probability of the evidence existing on the assumption that
the defendant is guilty, compared with the probability of that evidence
existing on the assumption that the defendant is innocent. Having estimated
that ratio, it would be tempting, but wrong, to conclude that the ratio
expressed the defendant’s probability of guilt compared to probability of
innocence. For example, if the issue to be determined was whether an unseen
animal was a sheep, and the evidence was that it had four legs, the probability
of getting the evidence that it had four legs if it was a sheep (P = 1) is not
the same as the probability that it was a sheep if all that is known is that it
was a four-legged animal. The error is called transposing the conditional.
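The sheep example can be made concrete with a small calculation. The sketch below assumes a purely hypothetical collection of animals from which the unseen one might come (the counts are invented for illustration and are not taken from any source), and shows how far P(four legs | sheep) can be from P(sheep | four legs).

```python
# Hypothetical counts of animals that the unseen animal might be.
# These figures are illustrative assumptions only.
animals = {
    "sheep": {"count": 20, "p_four_legs": 1.0},
    "cow": {"count": 30, "p_four_legs": 1.0},
    "dog": {"count": 40, "p_four_legs": 1.0},
    "hen": {"count": 10, "p_four_legs": 0.0},
}


def p_evidence_given(kind):
    """P(four legs | kind): probability of the evidence, given the hypothesis."""
    return animals[kind]["p_four_legs"]


def p_given_evidence(kind):
    """P(kind | four legs): share of the four-legged animals that are `kind`."""
    four_legged = sum(a["count"] * a["p_four_legs"] for a in animals.values())
    return animals[kind]["count"] * animals[kind]["p_four_legs"] / four_legged


print(p_evidence_given("sheep"))            # 1.0
print(round(p_given_evidence("sheep"), 3))  # 0.222
```

On these assumed counts, the evidence is certain if the animal is a sheep, yet fewer than one in four of the four-legged animals is a sheep. Treating the first figure as if it were the second is the transposition error.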
Another reason for the priors not being mentioned at trial may be that there is no need to do so. Some evidence setting the scene, background evidence, is likely to have been given as part of the narrative. For example, if a crime was committed by a person in a building, video surveillance evidence may be that only 10 people were in the building around the relevant time, including the defendant. This supports priors of P'(G) / P'(NG) = 0.1 / 0.9. Another example is where it is conceded by the prosecutor that only one of two people could have committed the crime, the defendant being one. It would be intuitive to think that this gave equal priors of P'(G) = P'(NG) = 0.5. But the prior likelihood of each suspect being the offender may not be equal, and the question becomes to what extent the fact-finder should be given evidence of the unevenness of the respective prior likelihoods.
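The arithmetic behind the building example might be sketched as follows. The function name priors_from_group is hypothetical, and the sketch assumes that every person in the reference group is equally likely to be the offender, which is exactly the assumption that the two-suspect example calls into question.

```python
from fractions import Fraction


def priors_from_group(group_size):
    """Return (P'(G), P'(NG)) assuming each of `group_size` people
    is equally likely to be the offender (an assumption, not a given)."""
    p_guilt = Fraction(1, group_size)
    return p_guilt, 1 - p_guilt


# Ten people in the building, including the defendant.
p_g, p_ng = priors_from_group(10)
print(p_g, p_ng, p_g / p_ng)  # 1/10 9/10 1/9

# Two possible offenders, the defendant being one.
p_g2, p_ng2 = priors_from_group(2)
print(p_g2, p_ng2, p_g2 / p_ng2)  # 1/2 1/2 1
```

Where background evidence shows that the members of the group are not equally placed, the equal split would have to be replaced by priors drawn from that evidence.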
To get from the evidential likelihood ratio P(E|G) / P(E|NG)
[which is read as: the probability of getting the evidence, given that the
defendant is guilty, compared with the probability of getting the evidence,
given that the defendant is innocent] to the ultimate issue ratio of P(G|E) /
P(NG|E) – that is, to legitimately achieve the transposition – it is necessary
to multiply the combined (that is, multiplied) likelihood ratios for each item of evidence on the relevant issue by the priors. The need to do this comes
from mathematical logic, in a rule known as Bayes’ Rule or Bayes’ Theorem. A
form of the rule useful for lawyers is the “odds form of Bayes’ Rule” described, for example, in Bernard Robertson, GA Vignaux and Charles EH Berger, Interpreting Evidence – Evaluating Forensic Science in the Courtroom
(2nd ed, John Wiley and Sons Ltd, Chichester, 2016) at 189, [A.2.7].
The logic applies to all forms of conditional probability evidence, not just to
scientific evidence. And anything whose probability of occurrence varies according to context can be expressed in terms of
conditional probability.
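In the notation used in this discussion, the odds form of Bayes' Rule can be written out as below. The second line is a sketch for several items of evidence E_1 to E_n (labels introduced here for illustration), and it assumes that the items are independent of one another given guilt and given innocence. Without that assumption the likelihood ratios cannot simply be multiplied.

```latex
% Odds form of Bayes' Rule: posterior odds = likelihood ratio x prior odds
\frac{P(G \mid E)}{P(NG \mid E)}
  = \frac{P(E \mid G)}{P(E \mid NG)} \times \frac{P'(G)}{P'(NG)}

% Several items of evidence, assumed conditionally independent
\frac{P(G \mid E_1, \dots, E_n)}{P(NG \mid E_1, \dots, E_n)}
  = \left( \prod_{i=1}^{n} \frac{P(E_i \mid G)}{P(E_i \mid NG)} \right)
    \times \frac{P'(G)}{P'(NG)}
```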
This ratio of priors is the starting point mentioned
above, and the problem is, how should it be assessed? The risk is that individual
jurors might choose different starting points and indeed may choose any
position between the alternatives mentioned above. This is why sufficient evidence needs to be given to establish the prior probabilities.
People who think that the priors should be P’(G) = P’(NG)
= 0.5 have the advantage of being able, without error of logic, to say that
P(E|G) / P(E|NG) = P(G|E) / P(NG|E). This is because, for them, the
ratio of the priors is 1 and does not affect the result. Using Bayes’ formula reveals that to find the defendant guilty, a
person who starts by understanding the priors to mean P’(G) = P’(NG) = 0.5 will
only need the combined likelihood ratios of the other evidence in the case to be about 50 to 1, meaning that the combined evidence is 50 times more likely to have
been obtained if the defendant is guilty than if the defendant is innocent.
But a person who understands the priors to mean P’(G)
= 0.02 and P’(NG) = 0.98, will, to find the defendant guilty, require the other evidence to be about 2400 times more likely to have been obtained if the
defendant is guilty than if the defendant is innocent. Leaving the assessment of the priors to individual jurors has obvious dangers.
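The two figures can be checked with a short calculation. The sketch below assumes that proof of guilt corresponds to a posterior probability of guilt of 0.98. That threshold is my assumption for illustration (it is not laid down anywhere), chosen because it reproduces the approximate figures of 50 and 2400. The function names are likewise hypothetical.

```python
def odds(p):
    """Convert a probability into odds in favour."""
    return p / (1.0 - p)


def required_likelihood_ratio(prior_guilt, threshold):
    """Combined likelihood ratio needed for P(G|E) to reach `threshold`,
    from the odds form of Bayes' Rule: posterior odds = LR x prior odds."""
    return odds(threshold) / odds(prior_guilt)


THRESHOLD = 0.98  # assumed standard of proof, for illustration only

for prior in (0.5, 0.02):
    print(prior, round(required_likelihood_ratio(prior, THRESHOLD)))
# 0.5 49
# 0.02 2401
```

A different assumed threshold changes both figures, but not the point: the lower the prior probability of guilt, the stronger the remaining evidence must be.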
In a civil case, for example an action for
compensation for wrongful conviction, the ultimate issue must be proved to a
probability of at least just over 0.5. Again, the level of proof required of
the evidence depends on the priors. In civil cases it is especially tempting to
think that priors of 0.5 each way are fair. To succeed in a claim for
compensation the former defendant (now, plaintiff) would have to prove that the
evidence in the criminal trial was slightly more likely to have been obtained
if the defendant had been innocent than it was to have been obtained if the
defendant had been guilty. But it still may be objected that the prior
assumption of a probability of guilt of 0.5 is too high and that the
probability attaching to a randomly chosen person should be used.
So a person who has been found not guilty, even on the
assumption that the priors are 0.5 each way, may nevertheless fail to obtain
compensation: this is because, although the evidence was less than 50 times
more likely to have been found if the defendant was guilty than it was to have
been found if the defendant was innocent, it may have still been more likely to
have been found if the defendant was guilty than if the defendant was innocent.
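A worked example may help. The sketch below uses a hypothetical combined likelihood ratio of 20, priors of 0.5 each way, an assumed criminal threshold of a 0.98 probability of guilt, and a civil threshold of just over 0.5. All of these numbers and names are illustrative assumptions.

```python
def posterior_guilt(prior_guilt, likelihood_ratio):
    """P(G|E) from the odds form of Bayes' Rule."""
    prior_odds = prior_guilt / (1.0 - prior_guilt)
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


CRIMINAL_THRESHOLD = 0.98  # assumed standard for proof of guilt
CIVIL_THRESHOLD = 0.5      # innocence must be more probable than not

# Hypothetical: evidence 20 times more likely if guilty than if innocent.
p_guilt = posterior_guilt(0.5, 20.0)
p_innocent = 1.0 - p_guilt

acquitted = p_guilt < CRIMINAL_THRESHOLD
compensated = p_innocent > CIVIL_THRESHOLD

print(round(p_guilt, 3), acquitted, compensated)  # 0.952 True False
```

On these figures the defendant is properly acquitted, because guilt is not proved to the assumed criminal standard, but the later claim for compensation fails, because innocence is far from being more probable than not.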
The point is that to make presuppositions about the defendant's guilt or innocence legitimate, those probabilities must be assessed from evidence given at trial.
It is appropriate to ask whether assessment of
evidence outside a trial context should attract the same logic. For example,
does the logic apply to assessing the sufficiency of evidence to meet a
requirement of reasonable grounds to suspect that evidence will be found in a
search? As may be illustrated by the case I discussed here
on 31 July 2017, some judges might think it does, some that it doesn’t.
Judicial explanations do not go far enough for us to be sure.
I should add that when mentioning “guilt” in the above
discussion I am referring to single-issue cases (for example, who did it, or
was it done intentionally?). Where several issues are at play in a case, guilt
on each will need to be considered separately. That will avoid the swamping
effect of a large likelihood of evidence being obtained on one issue (for
example DNA evidence proving the defendant’s presence) overwhelming proof of
another issue (such as the defendant’s state of mind).