Wednesday, September 20, 2017

Beyond reasonable doubt

We may agree on what something is, even if we disagree on how it should be described. We may both be looking at a circle; you may describe it as having a circumference of a particular length, while I may describe it as having a radius of a particular length.

Do we have to agree on how to describe what “reasonable doubt” means? Does it have a utilitarian or a pragmatic function? Is it a quality like “good” (remembering GE Moore’s difficulty in defining “good”)? Does it have a function at all, or is it just a feeling?

Is it describable in terms of knowledge? To ask “what do I know” is to summon the ideas of knowledge and belief. What are the conditions that I require to be satisfied before I am prepared to say I believe something? Do I rely on experience, feeling, logic, or persuasive rhetoric? Do I have to use the same criteria for belief as you use?

Often juries will ask judges for a definition of “beyond reasonable doubt”. While courts may differ in their responses, do their differences conceal an agreement?

I have previously discussed the leading New Zealand case on this: R v Wanhalla [2006] NZCA 229; [2007] 2 NZLR 573. Now the High Court of Australia has considered the same issue: The Queen v Dookheea [2017] HCA 36 (13 September 2017).

The HCA prefers that explanations of what proof beyond reasonable doubt means should not be attempted, and in particular a contrast with proof beyond any doubt should be avoided. It is, however, acceptable and even useful to contrast the high criminal standard of proof beyond reasonable doubt with the lower civil standard of proof on the balance of probabilities. “[A] reasonable doubt is a doubt which the jury as a reasonable jury considers to be reasonable (albeit, of course, that different jurors might have different reasons for their own reasonable doubt)” (at [34]), and ([35]):

“... it is the votes of each of the individual members of the jury that are determinative of the verdict of the jury as a whole. Each juror is appointed to consider the evidence and to decide whether it satisfies him or her of guilt beyond reasonable doubt; and, in order to discharge that function, each individual member of the jury must in effect enquire of himself or herself whether he or she entertains a reasonable doubt. In practical reality, each individual juror may at some point in the course of the juror's consideration of an issue have a doubt which, upon reflection and evaluation, he or she is disposed to discard as an unreasonable doubt.”

Clumsily put, if one objects to unnecessary gendered pronouns, but there you are.

In New Zealand, explanations of beyond reasonable doubt may be attempted: acceptable is, “an honest and reasonable uncertainty left in your mind about the guilt of the accused after you have given careful and impartial consideration to all of the evidence.” But this is not mandatory, and it “is sufficient to make it clear that the concept [of proof beyond reasonable doubt] involves a high standard of proof which is discharged only if the jury is sure or feels sure of guilt.” Focusing on doubt may be misleading, because a doubt need not be articulable and what is required is proof to the required standard. It is acceptable to tell a jury that proof to a certainty is not required. But it is wrong to tell a jury that they need to be as sure of guilt as they would be about an important decision in their own lives.

Don’t ask for more: lawyers are not philosophers. The law does, however, cherry-pick the philosophies it wants.

Utilitarianism asks, what is in the best interests of society? Individual interests are subordinate to society’s interests, individual rights are minimised, and as far as crime is concerned, a deterrent policy is pursued to protect the peace of the community. On the topic of the criminal standard of proof, a utilitarian would acknowledge that it must be higher than the civil standard, but not all that much higher.

A pragmatist would ask, what works? The ends justify the means. Pragmatism may strive for a workable balance between utilitarianism and morality. While absolute proof of a criminal charge is not required, pragmatism justifies a high, but not too high, standard of proof.

A moral view (and here I acknowledge that these are all moral theories, but I just say "moral" here to avoid having to say deontological) is that it is better to let (insert your preferred number) guilty people go free than to convict one innocent person. It reflects a judgement about what is right or good in the context of a criminal trial, and it favours a very high standard of proof.

These themes are found in the various approaches to instructions on the standard of proof. To say that the standard is higher than the civil standard is to make a utilitarian point. It doesn’t get very far by way of explanation, but it is a start. To add that the fact-finder must “be sure” on a “reasonable” assessment of the evidence, is a pragmatic theme, taking the explanation beyond the utilitarian but not pushing it as far as morality would claim it should be taken. Also pragmatic is the illustration of taking the care one would take over an important decision in one’s own life. To say that proof to a mathematical certainty is not required but the standard is nevertheless very high, is to emphasise the moral theme.

Another area of the law in which these three themes are illustrated is the part of the law of evidence which concerns the decision whether to exclude improperly obtained evidence. A balancing model is widely used for this (and in New Zealand is enacted in s 30 of the Evidence Act 2006). Factors favouring admission of improperly obtained evidence are utilitarian: society is best served if people charged with offences have trials on all the available evidence. Exclusion factors are moral, reflecting the idea that those who enforce the law should obey the law. The balance between these factors is struck pragmatically: what is required for an effective and credible system of justice?

You can't really be surprised when a jury wants assistance with the concept of proof beyond reasonable doubt. Nor at the reluctance of judges to get into the extent to which deontological ethics may be modified by pragmatism. It should be reassuring, however, to remember that philosophy is just simple ideas dressed up in hard words, in contrast to law which is hard ideas dressed up in simple words.

Friday, August 04, 2017

Ruminate on this!

No doubt you enjoy puzzles:

There is a group of animals, one third of which are sheep. One animal was examined and found to have four legs. The scientist who did that examination tells us that the result “four legs” was three times more likely if the examined animal was a sheep than if it was not a sheep. What is the probability that the examined animal was a sheep?

Most trial lawyers should be able to correctly answer that almost instantly. The answer is in the comment posted by me.
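
For readers who would rather check the arithmetic, one way to work the puzzle through is the odds form of Bayes' Rule: the prior odds multiplied by the likelihood ratio give the posterior odds. The sketch below is only an illustration; the function name `posterior_probability` is invented for this example and does not come from any library.

```python
from fractions import Fraction

def posterior_probability(prior, likelihood_ratio):
    """Update a prior probability with a likelihood ratio (odds form of Bayes' Rule)."""
    prior_odds = prior / (1 - prior)               # 1/3 sheep gives odds of 1 to 2
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)   # convert odds back to a probability

# One third of the animals are sheep; "four legs" is 3 times more likely for a sheep.
p_sheep = posterior_probability(Fraction(1, 3), 3)
print(p_sheep, float(p_sheep))
```

On these figures the posterior odds are 3 to 2, which is a probability of 3/5.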

Thursday, August 03, 2017

Legitimate presuppositions about guilt and innocence

Before the evidence involving the defendant's acts in a case is considered, an assumption may be made about the likelihood that the defendant is guilty. This is different from the presumption of innocence, which simply means that the prosecutor must prove that the defendant is guilty and that the defendant does not have to prove innocence (see R v Wanhalla 24/8/06, CA321/05, [2007] 2 NZLR 573, (2006) 22 CRNZ 843 at [49], mentioned here on 25 August 2006). But, at least on a mathematical approach to conditional probabilities, there must be some starting assumption about whether the defendant is guilty in order that the effect of evidence can be determined. The ultimate decision, the verdict, will depend on how the evidence has affected the prior likelihood of guilt.

The “priors” can be expressed as a likelihood of guilt compared to a likelihood of innocence, each assessed before the evidence as to what the defendant did is considered. A ratio of probabilities is one way of expressing this comparison of likelihoods. Are there appropriate numbers for making this comparison?

For some people - in the absence of evidence on the point given at trial - a starting point may be that a probability of guilt of 0.02 (that is, two chances out of a hundred), and a corresponding probability of innocence of 0.98, are a good way of reflecting the need to be fair. Almost certainly innocent, but recognising that there could be room for error about that, seems like a fair starting point. This might be the same ratio of priors as for anyone chosen at random. Indeed, a criterion of random selection can lead to very small probabilities of guilt, for example if the population of a large city is taken as the reference group.

Other people might say, well, the defendant is either guilty or not guilty, so an equal chance of each alternative is a neutral starting point. For these people a probability of guilt of 0.5, and the same probability of innocence, is a fair starting point. Refusing to start with an inclination either way seems fair.

Currently, it is not routine for this to be mentioned in a trial. This may be because it is by no means clear that a starting point is necessary: why not just listen to the evidence and get on with it? The reason a starting point is needed is that, without one, logical errors are likely to occur. A fact-finder will naturally ask, how much more consistent with guilt than with innocence is this evidence? This is the same as asking, what is the probability of the evidence existing on the assumption that the defendant is guilty, compared with the probability of that evidence existing on the assumption that the defendant is innocent? Having estimated that ratio, it would be tempting, but wrong, to conclude that the ratio expressed the defendant’s probability of guilt compared to probability of innocence. For example, if the issue to be determined was whether an unseen animal was a sheep, and the evidence was that it had four legs, the probability of getting the evidence that it had four legs if it was a sheep (P = 1) is not the same as the probability that it was a sheep if all that is known is that it was a four-legged animal. The error is called transposing the conditional.
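
The sheep example can be made concrete with a small count. The population below is entirely hypothetical, invented only to show that the two conditional probabilities come apart; nothing turns on the particular numbers.

```python
from fractions import Fraction

# Hypothetical population, invented for illustration: (species, count, has four legs)
animals = [("sheep", 20, True), ("cow", 60, True), ("chicken", 20, False)]

sheep = sum(count for species, count, legs in animals if species == "sheep")
sheep_with_four_legs = sum(
    count for species, count, legs in animals if species == "sheep" and legs
)
four_legged = sum(count for species, count, legs in animals if legs)

# P(four legs | sheep): the share of sheep that have four legs.
p_legs_given_sheep = Fraction(sheep_with_four_legs, sheep)
# P(sheep | four legs): the share of four-legged animals that are sheep.
p_sheep_given_legs = Fraction(sheep_with_four_legs, four_legged)

print(p_legs_given_sheep, p_sheep_given_legs)
```

With these made-up counts the first probability is 1 and the second is 1/4. Treating one as if it were the other is the transposition error.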

Another reason for the priors not being mentioned at trial may be that there is no need to do so. Some evidence setting the scene, background evidence, is likely to have been given as part of the narrative. For example, if a crime was committed by a person in a building, video surveillance evidence may be that only 10 people were in the building around the relevant time, including the defendant. This supports priors of P'(G) / P'(NG) = 0.1 / 0.9. Another example is where it is conceded by the prosecutor that only one of two people could have committed the crime, the defendant being one. It would be intuitive to think that this gave equal priors of P'(G) = P'(NG) = 0.5. But the prior likelihood of each suspect being the offender may not be equal, and the question becomes to what extent the fact-finder should be given evidence of the unevenness of the respective prior likelihoods.

To get from the evidential likelihood ratio P(E|G) / P(E|NG) [which is read as: the probability of getting the evidence, given that the defendant is guilty, compared with the probability of getting the evidence, given that the defendant is innocent] to the ultimate issue ratio of P(G|E) / P(NG|E) – that is, to legitimately achieve the transposition – it is necessary to multiply the combined (that is, multiplied) likelihood ratios for each item of evidence on the relevant issue by the priors. The need to do this comes from mathematical logic, in a rule known as Bayes’ Rule or Bayes’ Theorem. A form of the rule useful for lawyers is the “odds form of Bayes’ Rule” described, for example, in Bernard Robertson, GA Vignaux and Charles EH Berger, Interpreting Evidence – Evaluating Forensic Science in the Courtroom (2nd ed, John Wiley and Sons Ltd, Chichester, 2016) at 189, [A.2.7]. The logic applies to all forms of conditional probability evidence, not just to scientific evidence. And anything, the probability of occurrence of which varies according to context, can be expressed in terms of conditional probability.
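
As a rough sketch of that multiplication, the snippet below starts from priors of 0.1 / 0.9 (one of ten people present in a building) and applies two likelihood ratios. Those two ratios are hypothetical values chosen for illustration, the function name `posterior_odds` is invented here, and multiplying the ratios together assumes that the items of evidence are independent of each other given guilt and given innocence.

```python
from fractions import Fraction
from math import prod

def posterior_odds(prior_guilt, likelihood_ratios):
    """Odds form of Bayes' Rule: P(G|E)/P(NG|E) = [P'(G)/P'(NG)] x combined LR."""
    prior_odds = prior_guilt / (1 - prior_guilt)
    return prior_odds * prod(likelihood_ratios)

# Prior of 0.1: one of ten people who could have committed the crime.
# The likelihood ratios 6 and 3 are hypothetical, not taken from any case.
odds = posterior_odds(Fraction(1, 10), [Fraction(6), Fraction(3)])
probability = odds / (1 + odds)
print(odds, probability)
```

Here the prior odds of 1 to 9 and a combined likelihood ratio of 18 give posterior odds of 2 to 1, a probability of guilt of 2/3.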

This ratio of priors is the starting point mentioned above, and the problem is, how should it be assessed? The risk is that individual jurors might choose different starting points and indeed may choose any position between the alternatives mentioned above. This is why sufficient evidence needs to be given to establish the prior probabilities.

People who think that the priors should be P’(G) = P’(NG) = 0.5 have the advantage of being able, without error of logic, to say that P(G|E) / P(NG|E) = P(E|G) / P(E|NG). This is because, for them, the ratio of the priors is 1 and does not affect the result. Using Bayes’ formula reveals that to find the defendant guilty, a person who starts by understanding the priors to mean P’(G) = P’(NG) = 0.5 will only need the combined likelihood ratios of the other evidence in the case to be about 50 to 1, meaning that the combined evidence is 50 times more likely to have been obtained if the defendant is guilty than if the defendant is innocent.

But a person who understands the priors to mean P’(G) = 0.02 and P’(NG) = 0.98, will, to find the defendant guilty, require the other evidence to be about 2400 times more likely to have been obtained if the defendant is guilty than if the defendant is innocent. Leaving the assessment of the priors to individual jurors has obvious dangers.
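
The figures of about 50 and about 2400 can be reproduced if conviction is taken to require a probability of guilt of about 0.98 (odds of 49 to 1). That threshold is an assumption inferred from the numbers themselves and is not a figure endorsed by any court; the function name `required_likelihood_ratio` is likewise invented for this sketch.

```python
from fractions import Fraction

def required_likelihood_ratio(prior_guilt, threshold):
    """Combined likelihood ratio needed for P(G|E) to reach the threshold."""
    prior_odds = prior_guilt / (1 - prior_guilt)
    target_odds = threshold / (1 - threshold)
    return target_odds / prior_odds

# Assumed threshold for proof beyond reasonable doubt: P(G|E) = 0.98.
threshold = Fraction(98, 100)
equal_priors = required_likelihood_ratio(Fraction(1, 2), threshold)
low_priors = required_likelihood_ratio(Fraction(2, 100), threshold)
print(equal_priors, low_priors)
```

On that assumption the equal-priors juror needs a combined likelihood ratio of 49, and the juror who starts at 0.02 needs 2401.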

In a civil case, for example an action for compensation for wrongful conviction, the ultimate issue must be proved to a probability of at least just over 0.5. Again, the level of proof required of the evidence depends on the priors. In civil cases it is especially tempting to think that priors of 0.5 each way are fair. To succeed in a claim for compensation the former defendant (now, plaintiff) would have to prove that the evidence in the criminal trial was slightly more likely to have been obtained if the defendant had been innocent than it was to have been obtained if the defendant had been guilty. But it still may be objected that the prior assumption of a probability of guilt of 0.5 is too high and that the probability attaching to a randomly chosen person should be used.

So a person who has been found not guilty, even on the assumption that the priors are 0.5 each way, may nevertheless fail to obtain compensation: this is because, although the evidence was less than 50 times more likely to have been found if the defendant was guilty than it was to have been found if the defendant was innocent, it may have still been more likely to have been found if the defendant was guilty than if the defendant was innocent.
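
A short sketch may help to show the gap between the two standards. It assumes priors of 0.5 each way, a criminal threshold of 0.98 (an assumed figure, chosen because it corresponds to the 50 to 1 ratio mentioned above), a civil threshold of just over 0.5, and a hypothetical combined likelihood ratio of 10 in favour of guilt.

```python
from fractions import Fraction

def probability_of_guilt(prior_guilt, likelihood_ratio):
    """Posterior probability of guilt from a prior and a combined likelihood ratio."""
    prior_odds = prior_guilt / (1 - prior_guilt)
    odds = prior_odds * likelihood_ratio
    return odds / (1 + odds)

CRIMINAL_THRESHOLD = Fraction(98, 100)  # assumed figure, for illustration only
CIVIL_THRESHOLD = Fraction(1, 2)        # the claimant must get above this

# Hypothetical case: evidence 10 times more likely if guilty than if innocent.
p_guilt = probability_of_guilt(Fraction(1, 2), 10)
p_innocent = 1 - p_guilt

acquitted = p_guilt < CRIMINAL_THRESHOLD      # not proved beyond reasonable doubt
compensated = p_innocent > CIVIL_THRESHOLD    # innocence proved on the balance of probabilities
print(p_guilt, acquitted, compensated)
```

With these numbers the probability of guilt is 10/11: too low to convict on the assumed criminal threshold, yet far too high for the former defendant to show innocence on the balance of probabilities.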

The point is that to make presuppositions about the defendant's guilt or innocence legitimate, those probabilities must be assessed from evidence given at trial.

It is appropriate to ask whether assessment of evidence outside a trial context should attract the same logic. For example, does the logic apply to assessing the sufficiency of evidence to meet a requirement of reasonable grounds to suspect that evidence will be found in a search? As may be illustrated by the case I discussed here on 31 July 2017, some judges might think it does, some that it doesn’t. Judicial explanations do not go far enough for us to be sure.


I should add that when mentioning “guilt” in the above discussion I am referring to single-issue cases (for example, who did it, or was it done intentionally?). Where several issues are at play in a case, guilt on each will need to be considered separately. That will avoid the swamping effect of a large likelihood of evidence being obtained on one issue (for example DNA evidence proving the defendant’s presence) overwhelming proof of another issue (such as the defendant’s state of mind).