Voice identification parades
The Module Evidence And Forensic Investigation
Should it be used? Is it admissible?
This casebook will provide a general review of voice identification parades as a method of forensic investigation through the examination of cases. It will explore such issues as: its use historically and contemporarily, procedures, admissibility issues, reliability, current guidance for use, drawbacks and comparisons to other forensic techniques.
As early as the 15th century, attempts had been made to establish guilt by voice recognition evidence in England. As cited by Professor David Ormerod, an academic authority on voice identification, the courts in England and in other jurisdictions were initially quite accepting of voice recognition where the witness had prior familiarity with the voice of the offender. However, modern research warns against its use. Especially concerning is the use of voice identification where the voice is not previously known to the witness. This arises with dock identifications of the voice of the accused and out-of-court voice identification by the witness. In the United States case of N.J. v. Hauptmann, a conviction relied in part on a witness's purported identification of the offender from two words spoken three years previously. The case provoked revolutionary research into the reliability of voice identification, and continued concerns about its admissibility as a forensic method have resulted in extensive enquiries.
Although voice identification has been largely ignored in the legal literature in England, an increasing number of cases rely on such evidence. In the absence of legislation, courts are adapting, on a case-by-case basis, the rules governing eyewitness evidence.
In recent cases, the Court of Appeal has avoided laying down strict guidelines. Code D of the Police and Criminal Evidence Act 1984 (PACE), the ‘Code of Practice for the Identification of Persons by Police Officers’, makes no provision for voice identification procedures by witnesses, though it does not rule out the use of an ‘aural identification’ procedure where the police judge it to be appropriate. The only mandatory regulation is paragraph 18 of Annexe B to Code D, which requires that eyewitnesses be asked to make a visual identification before voice identification is attempted. Historically, therefore, the procedures used have drawn on some of the features of procedures for visual identification, as recommended by experts.
While adapting eyewitness procedures might appear to have been a satisfactory approach in the past, using those procedures without a detailed examination of the relationship between visual and voice identification is rather crude. Ormerod, in his 2001 article on the dangers of using voice identification and recognition evidence, advises:
“Adapting rules evolved for eyewitnesses could be more dangerous than having no rules at all if it engenders a false sense of security against misidentification. To avoid injustices such as occurred with eyewitness evidence, it is necessary to make a detailed examination of voice evidence to construct appropriate regulations.”
As a result of Ormerod and other researchers and phoneticians expressing their concerns about the ability of persons to recognise or identify a voice, a Home Office Circular (HOC) was issued advising police on the use of voice identification parades. The HOC is recommended as an ‘example of good practice’ and is to be followed closely. It has been anticipated that voice identification may, in due course, be incorporated into Code D.
A voice identification procedure may be sought where a witness professes to recognise the offender by his or her speech or expresses an ability later to identify the voice; however, the HOC offers no guidance as to the circumstances in which a procedure should be held. Unlike visual identification, there is no right to a procedure where the identity is in dispute.
The principal direction required in an identification case is the Turnbull direction, given to a jury by the trial judge. The precedent-setting case of R v Turnbull was decided following the wrongful convictions and imprisonment of two men whose cases became the subject of the Devlin report on ‘Evidence of Identification in Criminal Cases’. The report concluded that juries needed special instruction as to their approach to identification evidence. In a voice identification case, the essential elements of an adapted Turnbull direction are necessary.
The principal voice identification case which explained the application of the direction in detail was R v Hersey. The case concerned a defendant convicted of robbery. A shop was robbed by two men wearing balaclavas. A considerable amount of conversation passed between the robbers during the robbery, and the shopkeeper became convinced that one of the voices was that of the defendant, whom he knew as a regular customer.
A voice identification parade was held. After the defendant was found guilty, an appeal was brought on the ground, amongst others, that the judge had failed to deal adequately with the identification evidence in his summing up.
The appeal was dismissed. The court held that there was little authority on how a judge should direct a jury in respect of voice identification, and that the judge should therefore direct the jury on the basis laid down by the Court of Appeal and in the Judicial Studies Board specimen directions in respect of visual identification, tailored for the purpose of voice identification or recognition. These directions would follow, suitably adapted, the guidelines given in R v Turnbull and subsequent cases. It was vital that the judge spelt out the risk of mistaken identification and the reason why a witness may be mistaken, pointed out that a truthful witness may yet be mistaken, and dealt with the strengths and weaknesses in the case before him.
It is clear from analysis of this case that, if the courts are to rely with any confidence on voice identification evidence, some safeguards ought to be adopted to ensure reliability. Whilst the blanket acceptance of the present procedure for visual identification provides some very clear safeguards, it is not likely to be the ideal solution. Of course, adopting identical procedures for voice identification will protect the accused in some ways, but without confidence that the procedure for visual identification will also serve to protect against the dangers inherent in voice identification or recognition, there is a risk of the admission of unreliable evidence. An express procedure for establishing the accuracy of voice identification would be welcomed by all those involved in criminal trials, but it is vital that the correct one is implemented. The surest way to do so is to conduct a detailed inquiry into the matter. It has previously been submitted that a number of issues need to be addressed: whether and how a witness should record an early description of the offender; the correct number of people for the parade; whether the police are capable of selecting the correct number of voices which are sufficiently similar to amount to a fair comparison; whether the text to be read on parade should be of a certain content, style, length, etc.; whether the text should be read “live” or played from a tape; whether the witness ought to be allowed to hear the voice repeatedly; what safeguards can be made against the accused disguising his voice; how to deal with strong regional accents, etc.; and whether the parade should be videotaped or audio-taped. It is also crucial that safeguards against the misuse of the identification evidence by the jury are established.
The particular dangers associated with voice identification need to be established. This calls for further research into the dangers of mistaken voice identification and recognition. While much of the Turnbull warning will probably be appropriate, it does not deal with some issues which might need clarification. For example, unlike appearance in visual identification, the speaker's voice may itself be affected by the stress of a situation such as a violent crime.
Continuing with this issue, in the case of R v Gummerson and Steadman, the complainant had been robbed and beaten by four masked men and had identified the men by their voices as the four accused, whom he had known well for some years. They were charged with robbery and causing grievous bodily harm with intent. One co-accused pleaded guilty to the charges. The defences of the other three were mistaken identity and alibi. The judge gave the jury a direction as to the evidence of recognition or identification in accordance with the Turnbull directions. One of the remaining accused was acquitted and the other two were convicted. They appealed, contending, inter alia, that the judge should have excluded the evidence of the complainant that he had recognised the voices of the accused, that there should have been a voice identification parade, and that the failure to have one was a breach of paragraph 2.3 of Code D. The appeal was dismissed. It was held that there was no duty to hold a voice identification parade under Code D since the code had no application: it related only to visual identification. The matter was properly to be dealt with by the careful application of suitably adapted Turnbull guidelines. Moreover, the judge had approached the matter correctly. He had refused to rule the evidence inadmissible, although he stated that he would consider the matter at the conclusion of the prosecution case if asked to do so. Absent the requirement of a voice identification parade, there was no sensible basis for excluding the evidence at the outset.
Ironically, in this case, unlike Hersey, the defendant was claiming to be disadvantaged by having no parade. In the long term, a code of practice for voice identification parades is desirable, but only when the best procedure is confirmed by appropriate research.
The court suggested that a properly adapted version of the Turnbull warning should be administered in voice identification cases. The potential dangers of this approach were noted in both of these cases, but at least it would provide some safeguards for the defendant.
Psychologists have recognised that voice identification is less reliable than visual identification, and so the direction ought to be worded in more fitting terms. The trial judge will have to be alert to the many factors that may affect reliability. For example, in this case one of the offenders had attempted to disguise his voice, and psychologists have found that this reduces the likelihood of accurate identification to a very low level. One of the difficulties with adopting Turnbull is that the trial judge will not necessarily be familiar with other, less obvious factors that will affect the reliability of voice identification. For this reason, the sensible long-term solution is to produce a voice identification warning specifying the particular dangers. Again, this should follow appropriate research into the area.
Another significant case on this point is R v Roberts. Here, the defendant was identified by the victim of an indecent assault at a street identification after the offence. At the trial it became apparent that the identification of the defendant rested not on the victim's description of his appearance and clothing but rather on the sound of his voice when he spoke to the police officers. Counsel for the defendant applied unsuccessfully for the jury to be discharged so that the defence could reassess the evidence. The defendant denied assaulting the victim and was convicted. He appealed against his conviction, by leave of the single judge, on the ground that the conviction was unsafe. Counsel relied on material from Professor Bull, a researcher of identification by voice.
The appeal was allowed and it was held that the judge should have acceded to the defendant's request for an adjournment. Professor Bull's research indicated that voice identification was more difficult than visual identification, and he concluded that the warning given to jurors should be even more stringent than that given in relation to visual identification. Bull also concluded that identification of a stranger by voice was especially difficult, even where there was a good opportunity to listen to the voice. In this case, the complainant did not have a good opportunity. On the evidence the conviction could not be regarded as safe.
The case has significance for voice identification evidence for a number of reasons. Most importantly, the court acknowledged that voice identification evidence poses not only the same types of danger as those that are well recognised in visual identification cases, but also some additional, distinct dangers. These included the fact that the victim was a Polish speaker identifying an English voice; that the duration of speech in question was very short; that the speech occurred in circumstances of extreme stress; and that the later purported identification was also made in circumstances of stress. It has been suggested, and is supported by much research, that no prosecution should be based solely on the voice identification of a suspect. The case raised the issue that it is vital that judges recognise the relevant dangers and are prepared to withdraw cases, in accordance with the Turnbull directions, where the evidence is of poor quality. Once again, the case demonstrates the need for further research in this area and for clear guidelines to be established for the use of such evidence.
Given the need for formal procedures, it is appropriate to note here the admissibility issues in R v Deenik. In this case, the identification was no more than a confrontation. The defendant was alleged to have been engaged in the importation of cannabis resin, during the course of which he spoke by telephone to a customs officer who was masquerading as an accomplice's wife. He was arrested and interviewed. During the interview, the customs officer was allowed to overhear the defendant speaking and purported to recognise his voice as belonging to the person spoken to on the telephone. It was argued on appeal that the evidence ought to have been excluded on two grounds (although not exclusively): first, that no consideration had been given as to how to reduce the chance of mistaken identification; and secondly, that the defendant was not given the opportunity to refuse to let the officer hear his voice. The Court of Appeal dismissed the appeal, holding that the scheme for identification parades under Code D (which did not apply) provided little help in the context of voice identification and that nothing could have been done to reduce the chance of error. It was explained that although a person had the right not to incriminate himself, he could not prevent the gathering of evidence against him, including the hearing of his voice by a witness. A suspect had the opportunity to change his voice, unlike his appearance, and if forewarned could destroy the object of the exercise. There was no obligation to inform the defendant of what was going on. It was not unfair to admit the evidence, and there was no ulterior motive or unfair advantage in listening to the defendant.
On the question of whether it was “fair” to allow the suspect to provide information against himself without either warning him that his voice would be listened to, or setting up some more formal procedure for voice identification analogous to an identification parade, the court gives little consideration to the argument that there is a breach of the right against self-incrimination: a suspect may have a right not to answer incriminating questions, but this affords no general right to prevent the gathering of incriminating evidence. It might be objected here that it is the right not to answer questions which is being infringed. The defendant did in fact refuse to answer questions put to him in relation to the offence, and the officer appears to have overheard the defendant's answers to the more formal inquiries of the custody officer, such as where the defendant was staying in London. Had he realised that his answers were in a sense “incriminating,” the defendant might well have refrained from giving the officer the opportunity to identify him.
In R. v. Director of the Serious Fraud Office, ex parte Smith the House of Lords took a restrictive view of the common law right of silence. Lord Mustill described it as not a single right but “a disparate group of immunities which differed in nature, origin, incidence and importance” and which were motivated by a variety of factors, including a reaction against abuses of judicial interrogation, a reluctance to place the accused in the dilemma of being convicted whatever he said, and the desire to reduce the incidence of untrue confessions. The accused's right to keep his own counsel with regard to questions about the offence is clearly rooted in all of Lord Mustill's “motivations”; however, none of them provides a policy reason against the surreptitious obtaining of voice evidence. Quite the reverse is true, in fact. As the court in the present case notes, if a policy can be identified, it is one which acknowledges that it is easy to disguise one's voice and that a prior warning may enable the accused to obstruct the course of justice by doing so.
Voice identification made by a jury, by comparing the defendant's voice in the witness box with that on a recording, was a contentious issue with little judicial guidance until the case of R v O'Doherty. The maker of a 999 call was identified as the defendant both by a police officer who purported to recognise his voice and by an expert who compared it with the defendant's known voice. The Court, drawing on authorities concerning visual identification, stated that:
“If evidence of voice recognition is relied on by the prosecution, the jury should be allowed to listen to a tape-recording on which the recognition is based, assuming that the jury have heard the accused give evidence.”
It was also held that the jury should be warned that they are not trained experts, that they might be concentrating on what the defendant was saying rather than comparing it with the voice on the tape, and that they might have a subconscious bias because the defendant was in the dock. The Court continued by asserting that the warning in each case would be governed by its own set of circumstances, and that in the absence of such a warning the convictions were unsafe.
The issue that evokes some criticism is the Court's apparent permission for the jury, independently of the expert, to compare the defendant's live voice with that on the recording. Ormerod identifies that the serious risks involved in allowing jurors to conduct ad hoc voice identification in the courtroom are those inherent in any stranger voice identification, but worsened by the delay between hearing the voices, the stress of the exercise in the courtroom, the danger of bias, and the risk of over-confidence on the part of the jurors. These issues, alone or combined, point strongly against this being a valuable exercise in terms of the likely accuracy of the outcome.
As with juries, there is, according to the Turnbull guidelines, no prohibition in principle against a potential witness listening to a recording of the suspect's voice at the time of the offence, where one is available, to see whether the voice is recognisable. R v Robb discussed the reliability of voice identification as an exact science, and what qualifies one to be an expert in it. In R v Robb a wealthy businessman was kidnapped and it was alleged that demands for ransom were made by telephone to his wife, some of which were recorded. There were circumstantial links involving R with the kidnap and the co-conspirators. The Crown sought to adduce evidence from a lecturer in phonetics that the voice of the person making the ransom demands by telephone and the voice of R as recorded on a videotape found at his home (“control tape”) were indistinguishable and were of the same person. The Crown also sought to adduce evidence from police officers who recognised R's voice on the “ransom tapes” from his voice when he had been speaking to them during the investigation of the offence. At the trial R's counsel objected to the admissibility of the evidence of the expert, alternatively submitting that it should have been excluded under section 78 of PACE. The judge ruled that it was admissible. R, who did not give evidence, was convicted and appealed, submitting that the judge had erred in admitting the expert evidence.
The appeal was dismissed. It was common ground that voice identification was a field where expert opinion was admissible, but it had been submitted that the evidence of the witness was inadmissible because he failed to meet the standard of skill required of such an expert. Although he held a Ph.D. in phonetics and had years of experience as a university lecturer and an expert witness, it had been contended that his method, an auditory one of listening carefully to the “ransom” and “control” tapes for comparisons, was not scientific and was not generally respected in the field of phonetics: there was no acoustic analysis and no objective measurement of resonance. In cross-examination the witness had conceded that voice identification was not an exact science and that such identification by auditory technique alone was not received in evidence elsewhere in Western Europe. It might therefore be that the jury should treat such evidence with caution. The approach of English courts to expert opinion was, however, simple: study and experience would give a witness authority which the opinion of a layman would not have, and the weight of the evidence must be for the jury. Although his view was a minority one within his profession, the expert's judgment was not shown to be wrong. It was significant that no challenge had been mounted to the voice identification itself by contrary expert evidence, and R himself had not given evidence, so that the jury did not hear his voice. The judge had correctly directed the jury that no expert could usurp their function, and the matter had been left fairly for the jury's decision. No unfairness under section 78 of the Act had been shown.
The case identifies the acute danger with voice identification, and with any other type of forensic method, that the courts may be misled into accepting either the need for expertise, or the qualifications of a particular person to give expert testimony, because of the nature of the adversarial process. Where there are financial differences, situations can arise where only one side (in a criminal trial, usually the prosecution) is supported by a number of experts, and the trial judge may be “unwittingly misled about the scientific credentials of a novel field by being privy to only a subset of current scientific thinking on the issue”. It was perhaps this sort of situation which Bingham L.J. had in mind in the present case when he said:
“We are alive to the risk that if, in a criminal case, the Crown are permitted to call an expert witness of some but tenuous qualifications the burden of proof may imperceptibly shift … A defendant cannot fairly be asked to meet evidence of opinion given by a quack, a charlatan or an enthusiastic amateur. But we do not regard Dr Baldwin as falling anywhere near these categories.”
While bogus experts are, it is to be hoped, not a common phenomenon, a qualified practitioner working on the edge of scientific knowledge is another matter. Expert evidence on voice identification seems, as the court in the present case acknowledges, to have become acceptable, despite some doubts as to its accuracy. The point about auditory techniques with which the court was concerned has been before the Court of Appeal in R v Bentum, where it was held that a witness who was qualified in mechanical science, and who had prepared a tape recording comparing the defendant's voice with the voice in dispute, could give his opinion, based on listening to the tape, that the two voices were probably different. While an opinion such as this may not be worth a great deal, it is thought to assist the jury. Given the risk that the jury may make too much of it, the question arises whether the situation would be helped if experts were appointed by the court rather than by the parties.
It is also important for the courts to accept voice identification evidence as admissible only when the reliability of the witness has been examined, in particular whether their familiarity with the accused is sufficient for them to make a positive identification. This issue can arise particularly where covert bugging evidence is relied upon and the speaker is named by the police officer who has prepared the transcripts. R v Chenia explores this point. The defendant was alleged to have been supplying drugs from a golf club. A total of 179 covert recordings were made of conversations at the club, from which a compilation, together with transcripts prepared by the police, was adduced in evidence. The Court of Appeal held that none of the police officers who were trying to deduce who was speaking when they named a particular speaker on the transcript was an expert in voice identification. If the evidence was to be relied upon, the basis of that evidence should have been set out in a written statement so that its admissibility and reliability, based on whether the officer was sufficiently familiar with the voice to recognise it, could be clarified.
More recently, another covert recording case, R v Flynn explored this point. Turning to the question of how similar voice recognition evidence should be treated in future, the court observed that “the key to the admissibility [of lay listener evidence] is the degree of familiarity of the witness with the suspect's voice” and that “where the prosecution wish to rely on such evidence it is desirable that an expert should be instructed to give an independent opinion on the validity of such evidence”.
The case raised some pertinent issues regarding the current contradictory guidance for lay voice recognition evidence based on a recording. On the issue of lay listeners making comparisons of voices on covert recordings, the Court of Appeal did not insist that such comparisons should always be guided by expert advice but merely stated that it was “desirable” that “an expert should be instructed to give an independent opinion on the validity of such evidence”. To put this in context with other forensic methods, precedent exists in the case of R v Tilly that a jury must not undertake what must be the significantly less challenging task of handwriting comparison without expert assistance.
In the light of the UK government's expressed intention that covertly captured interception evidence should be used in trials, and the general “increasing use sought to be made of lay listener evidence”, it is clear that future judgments need to address the issues involved in greater detail and establish thorough guidance, as appears to be required in all areas of voice identification.