Audio Enhancement vs Forensic Transcription

Thanks to movies and shows like CSI and NCIS, people have unrealistic expectations about the audio enhancement capabilities of forensic audio examiners. In the world of fiction, experts not only do the job with a click or two of the mouse, but they also make completely unintelligible speech clear and easy to understand.

In the real world, this is not how it works. Audio enhancement services sometimes improve speech intelligibility, but more often than not, enhancement does little for intelligibility problems. A forensic audio examiner cannot enhance what is not there. If speech is masked by other noises that cover the same range of frequencies, eliminating those noises will also eliminate whatever speech is underneath them.

Audio enhancement is best suited for improving the listenability of a recording. If you were to record an interview for a documentary but discovered an electric buzzing noise or a loud constant humming sound, a forensic audio engineer could eliminate it or at least reduce it. Dog barking, car horns, doorbells, and other noises can also often be removed, leading to improved listenability of the interview.
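As a rough illustration of how a constant hum can be reduced, the sketch below applies a narrow notch filter to a synthetic signal. Everything in it is an assumption made for the example: the 60 Hz hum, the 440 Hz tone standing in for wanted audio, the 8000 Hz sample rate, and the filter sharpness `q`. It is not the workflow of any particular examiner or software package.

```python
import math

def notch_coefficients(f0, fs, q):
    """Biquad notch centered on f0 Hz (standard audio-EQ cookbook form)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a

def apply_biquad(samples, b, a):
    """Run the second-order filter over the samples (direct form I)."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out

def tone_amplitude(samples, freq, fs):
    """Estimate the amplitude of one frequency by correlating with sin/cos."""
    n = len(samples)
    re = sum(s * math.cos(2.0 * math.pi * freq * i / fs) for i, s in enumerate(samples))
    im = sum(s * math.sin(2.0 * math.pi * freq * i / fs) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

FS = 8000         # assumed sample rate in Hz
HUM_HZ = 60.0     # assumed mains hum frequency
VOICE_HZ = 440.0  # stand-in for wanted audio, well away from the hum

# Two seconds of synthetic audio: hum plus a tone representing the interview.
recording = [
    0.5 * math.sin(2.0 * math.pi * HUM_HZ * i / FS)
    + 0.3 * math.sin(2.0 * math.pi * VOICE_HZ * i / FS)
    for i in range(2 * FS)
]

b, a = notch_coefficients(HUM_HZ, FS, q=30.0)
cleaned = apply_biquad(recording, b, a)

# Measure only the second half so the filter's start-up transient has decayed.
tail_in, tail_out = recording[FS:], cleaned[FS:]
print("hum before:  ", round(tone_amplitude(tail_in, HUM_HZ, FS), 3))
print("hum after:   ", round(tone_amplitude(tail_out, HUM_HZ, FS), 3))
print("voice after: ", round(tone_amplitude(tail_out, VOICE_HZ, FS), 3))
```

The hum is strongly attenuated while the 440 Hz tone passes almost unchanged, because the two occupy different frequencies. Any wanted sound sitting at the notch frequency would be removed along with the hum, which is why noise that overlaps speech cannot be filtered out without damaging the speech.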

If, on the other hand, your interview is covered up by TV noise, or was recorded improperly or with poor-quality equipment so that speech intelligibility is subpar, enhancement techniques aren't likely to help much. The same goes for dialog that is difficult to understand because of a speech impediment, overlapping speech, inebriated speech, or heavy accents.

If your goal is to decipher what is said and a transcript is sufficient, a forensic transcription service is what you need. When it isn't possible to improve your recording enough for your audience to understand it easily, subtitles or a transcript of the speech is the next best thing.

The professionals who are best suited to decipher marginally intelligible speech are forensic linguists, or more specifically, forensic phoneticians. Forensic linguists have advanced degrees and a strong background in phonetics and linguistics, and they are able to use special software to visually inspect the speech to help determine the words used.

This special software displays a speech spectrogram. A spectrogram is a visual representation of how the frequency content of the speech changes over time. It is used to identify the speech sounds in a recording. When a person speaks, he makes sounds using various speech organs such as the lips, tongue, and hard and soft palates. These different sounds have names like plosive, fricative, and diphthong. Identifying them on a spectrogram can help the linguist determine which words are spoken, even if he can’t be 100% certain when just listening to them.
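To make the idea of a spectrogram concrete, here is a minimal sketch that computes one with a short-time Fourier transform using only the Python standard library. The test signal (a 500 Hz tone followed by a 2000 Hz tone), the 8000 Hz sample rate, and the frame sizes are all assumptions chosen for the example. Real phonetic analysis tools work on actual speech and use far more refined settings.

```python
import cmath
import math

def spectrogram(samples, frame_len=64, hop=32):
    """Return a list of frames, each a list of magnitudes per frequency bin."""
    # A Hann window tapers each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        magnitudes = []
        # Plain DFT over the non-negative frequencies (slow but easy to read).
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames

FS = 8000        # assumed sample rate in Hz
FRAME_LEN = 64   # each bin then spans FS / FRAME_LEN = 125 Hz

# Synthetic "recording": 500 Hz for the first half, 2000 Hz for the second.
signal = [
    math.sin(2.0 * math.pi * (500.0 if i < 512 else 2000.0) * i / FS)
    for i in range(1024)
]

frames = spectrogram(signal, frame_len=FRAME_LEN, hop=32)

# The strongest bin in each frame shows how the dominant frequency moves in time.
peak_hz = [max(range(len(m)), key=m.__getitem__) * FS / FRAME_LEN for m in frames]
print("first frame peak:", peak_hz[0], "Hz")
print("last frame peak: ", peak_hz[-1], "Hz")
```

Each row of `frames` is one time slice and each column is one frequency band, which is the grid a spectrogram display paints as an image. In real speech, different classes of sounds leave different patterns on that grid, and those patterns are what the linguist reads.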

The linguist also relies on his education to examine the linguistic aspects of dialog such as syntax, pragmatics, phonetics, and phonology to help determine what is being said or not said.

Forensic phoneticians also have experience with audio technology and speech enhancement software. Thus, when audio enhancement techniques are useful, the linguist, like the audio forensics expert, is able to apply them, giving you the best of both worlds.

Forensic transcription services are unlike regular transcription services and are much more expensive. You are hiring a forensic expert with advanced degrees to use specialized software and hardware to examine audio recordings. For clean audio, a general, legal, or medical transcriptionist can transcribe an hour of audio in four to six hours. By contrast, one minute of marginally intelligible speech may take a forensic linguist a number of hours to transcribe. It all depends on the individual recording.

Before beginning work, the linguist will usually perform an evaluation in order to determine how much of the recording is likely to be successfully decoded, how long it will take, and how much the project will cost.