Voice Protection Units
Secure speech solutions

This section deals with secure speech equipment, such as voice encryption devices, from a variety of manufacturers. Such devices come in many flavours, ranging from simple speech scramblers to digital voice encryptors. Most of the devices shown below are also featured elsewhere on this website, as they fall into multiple categories. Secure telephones are a class of their own, but since they also belong to the group of voice encryption devices, they are linked from this page as well.
Voice encryption units on this website
Racal MA-4204 MA-4014B Audio Encryption Unit Racal MA-4204 Racal MA-4470 Voice Crypto Unit Racal MA-4225 Portable voice encryption unit Racal MA-44741 Secure Phone Adapter Racal Digital voice encryptor Tadiran SEC-13
Tadiran SEC-15 Telsy TS-500 Teltron SP-810 Teltron SP-612 voice scrambler Hagelin CRM-008 (HC-235, Cryptocom) BBC Cryptophon 1100 BBC Vericrypt 1100 voice scrambler Siemens MSC-2001
Wide-band Voice and Data Encryption Unit Narrow-band Voice and Data Terminal Philips Spendex-10, Narrow-band Voice and Data Terminal Telsy TDS-2004M mobile voice encryptor Telsy TDS-2003 mobile voice encryptor Telsy TX-900S mobile phone encryptor Telsy TX-1020C narrow-band radio voice encryptor Teltron SP-850 voice scrambler
Mobile secure radio voice system Motorola Saber II secure portable radio Motorola STX secure trunking radio
PFX-PM portable radio with digital encryption SE-20 secure handheld VHF/UHF radio with scrambler Ascom Cryptovox SE-160 secure handheld VHF/UHF radio Skanti DS-6001 digital voice scrambler The Siemens DSM Voice telephone encryptor
Tele Security Timmann, TST-7595 Voice scrambler for HF radio Tele Security Timmann, TST-7700 voice and data encryption system Gretacoder 101, speech scrambler Elcrovox 1-4D narrow band voice and data terminal (STU-II compatible) Tait T-2000/II mobile radio with optional voice scrambler Tait T-3000/II handheld radio with optional voice scrambler BID/250 SAVILLE-based voice encryption unit for Clansman DMU Replacement for the BID/250 and other (obsolete) cryptographic units
Voice scrambler handset Wide-band voice encryption unit used by the former Yugoslav Army Narrow-band voice encryption unit used in the former Yugoslav Republic Racal PRM-4515 Cougar handheld radio with encryption Racal Cougar PRM-4735 body-wearable covert radio with voice encryption Selex/Marconi H-4855 Personal Role Radio (PRR) - not really crypto, but listed because LPI/LPD and use of Spread Spectrum technology
Selex H4855 ELSA Enhanced Encrypted Personal Role Radio (EZ-PRR) SIGSALY secure telephony system
Secure Telephones (Crypto Phones)

 More crypto phones

Secure speech systems are known by various names, such as Voice Privacy Unit, Secure Speech System, Voice Protection Device, Speech Encryptor, etc. In principle, there are four methods of voice protection, the first three of which are analogue scrambling techniques:

  1. Frequency domain voice scrambler
    In this analogue system, the frequency spectrum of the speech signal is mirrored around a given center frequency, so that it becomes unintelligible. Such systems can easily be broken, even if the audio band is split into multiple smaller bands first.

  2. Time domain voice scrambler
    In this system, the human speech is first stored in some kind of memory, after which the individual parts are then scrambled in the time domain. It is more secure than a frequency domain scrambler, but can still be broken as the individual sound samples still bear the properties of human speech.

  3. Frequency and Time Domain voice scrambler
    This system, also known as the F/T Scrambler, is a combination of the above methods. It is the most complex one, but can still be broken with the right equipment, no matter how complex the randomizer is, as the individual samples still bear the properties of speech.

  4. Digital Encryption
    This method uses a digital representation of the analogue voice signal (samples), which is mixed with a digital key stream. This method is much safer than the ones above and is the only one that can really be called encryption.

Voice scramblers
Before digital speech encryption became widely available, an analogue technique was used to protect voice transmissions. This technique is commonly known as Voice Scrambling and comes in three flavours, which are further explained below. Scramblers are inherently insecure and only provide protection against the occasional eavesdropper, such as a telephone exchange operator.
Frequency domain scrambling   FD
This technique is based on frequency inversion and is commonly called voice scrambling. It is based on mirroring of the audio frequency spectrum around a given center frequency, sometimes divided over multiple frequency bands. This principle is best explained using a simplified model:

The audio spectrum of the voice data (1) is mixed with a carrier frequency fc (2). This results in two spectra: one that is the sum of the original spectrum and the carrier (3), and another one that is the difference of the two signals (4). A low-pass filter (LPF) is then applied to filter off the sum and leave only the difference, effectively resulting in a mirrored audio band (5). At the receiving end, this process of mirroring the spectrum is repeated to make the speech intelligible again.
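
The mixing and filtering steps described above can be illustrated with a short Python sketch. It is not taken from any real scrambler: the sample rate, the carrier of 3300 Hz, the single test tone that stands in for speech and all function names are assumptions chosen for illustration, and the low-pass filter is an idealised one that simply zeroes the unwanted part of the spectrum.

```python
import cmath
import math

SAMPLE_RATE = 16000   # Hz, assumed for this sketch
CARRIER_HZ = 3300     # Hz, assumed carrier frequency fc

def tone(freq_hz, n_samples=160):
    """A pure test tone that stands in for the speech signal."""
    return [math.cos(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]

def mix(signal, carrier_hz=CARRIER_HZ):
    """Multiply by the carrier: yields the sum and difference spectra."""
    return [s * math.cos(2 * math.pi * carrier_hz * i / SAMPLE_RATE)
            for i, s in enumerate(signal)]

def dft(signal):
    """Plain discrete Fourier transform (slow, but dependency-free)."""
    n = len(signal)
    return [sum(s * cmath.exp(-2j * math.pi * k * i / n)
                for i, s in enumerate(signal))
            for k in range(n)]

def low_pass(signal, cutoff_hz):
    """Idealised low-pass filter: zero every component above the cutoff."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        freq = min(k, n - k) * SAMPLE_RATE / n
        if freq > cutoff_hz:
            spectrum[k] = 0
    return [sum(c * cmath.exp(2j * math.pi * k * i / n)
                for k, c in enumerate(spectrum)).real / n
            for i in range(n)]

def invert_spectrum(signal, carrier_hz=CARRIER_HZ):
    """Mix with the carrier and keep only the difference (mirrored) band."""
    return low_pass(mix(signal, carrier_hz), carrier_hz)

def dominant_frequencies(signal, threshold=0.25):
    """Frequencies (Hz) whose magnitude exceeds a fraction of the peak."""
    n = len(signal)
    mags = [abs(c) for c in dft(signal)[: n // 2 + 1]]
    top = max(mags)
    return [k * SAMPLE_RATE // n for k, m in enumerate(mags)
            if m > threshold * top]

clear = tone(1000)
print(dominant_frequencies(mix(clear)))        # [2300, 4300]: difference and sum
scrambled = invert_spectrum(clear)
print(dominant_frequencies(scrambled))         # [2300]: mirrored around fc
print(dominant_frequencies(invert_spectrum(scrambled)))  # [1000]: restored
```

With these assumed values, a 1000 Hz tone moves to 3300 - 1000 = 2300 Hz, and applying the same operation a second time brings it back to 1000 Hz, which is why the receiver can use the same circuit as the transmitter.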

To make things more complex, one could vary the carrier frequency and also split the audio band into several (e.g. five) smaller bands that are then mirrored individually. Continuously varying these parameters under digital control can make it harder to decode the signal.

The advantage of this technique is that it takes place completely within the audio bandwidth of a channel, whereas digital encryption generally requires more bandwidth. This allows scrambling to be used in existing systems. At the time, scramblers were also cheaper than digital encryptors, which is why scramblers were used by the police in many countries from the 1970s well into the 1990s.

The disadvantage of this method is that an eavesdropper can easily reverse the mirroring process with a simple electronic circuit. In addition, experienced listeners could sometimes even extract useful information from the seemingly garbled speech directly, without a descrambling circuit.
Time domain scrambling   TD
Another method for speech protection is the so-called time-division or time-domain (TD) speech scrambling. This method is more secure than the simpler frequency-inversion system, but far less secure than modern digital speech encryptors. The simplified diagram below shows how it works.

Human speech is cut into a number of small time segments which are then scrambled in an ever-changing order. The order in which the segments are scrambled is determined by a pseudo-random number generator (PRNG) which is seeded or initialised by the user by means of a secret KEY.

In this diagram, the top row shows the clear speech (input) in time. The second row shows the speech after it is scrambled. Finally, the bottom row shows the speech once it is descrambled again (output). The whole process of scrambling and descrambling causes a noticeable delay, which is typically in the range of 0.3 to 0.6 seconds. This delay sometimes causes confusion.
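
The segment shuffling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the segment length, the number of segments per frame and the function names are hypothetical, and Python's `random.Random` merely stands in for the PRNG of a real device (it is not cryptographically secure). Synchronisation between both ends is taken for granted here.

```python
import random

def _frame_orders(key, frame_count, segments_per_frame):
    """Derive one pseudo-random segment order per frame from the key."""
    rng = random.Random(key)          # stand-in PRNG, seeded by the KEY
    orders = []
    for _ in range(frame_count):
        order = list(range(segments_per_frame))
        rng.shuffle(order)
        orders.append(order)
    return orders

def _split(samples, segment_len, segments_per_frame):
    """Cut the samples into frames, each a list of equal-length segments."""
    frame_len = segment_len * segments_per_frame
    if len(samples) % frame_len:
        raise ValueError("input must be a whole number of frames")
    frames = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        frames.append([frame[i:i + segment_len]
                       for i in range(0, frame_len, segment_len)])
    return frames

def scramble(samples, key, segment_len=4, segments_per_frame=8):
    """Transmit the segments of each frame in a key-dependent order."""
    frames = _split(samples, segment_len, segments_per_frame)
    orders = _frame_orders(key, len(frames), segments_per_frame)
    out = []
    for segments, order in zip(frames, orders):
        for source in order:
            out.extend(segments[source])
    return out

def descramble(samples, key, segment_len=4, segments_per_frame=8):
    """Regenerate the same orders from the key and undo them."""
    frames = _split(samples, segment_len, segments_per_frame)
    orders = _frame_orders(key, len(frames), segments_per_frame)
    out = []
    for segments, order in zip(frames, orders):
        restored = [None] * segments_per_frame
        for position, source in enumerate(order):
            restored[source] = segments[position]
        for segment in restored:
            out.extend(segment)
    return out

speech = list(range(64))              # dummy samples standing in for audio
sent = scramble(speech, key=1234)
received = descramble(sent, key=1234)
print(received == speech)
```

In this sketch a complete frame has to be collected before its segments can be reordered, and again before they can be put back, which is consistent with the delay mentioned above.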

As the time segments are scrambled in an ever-changing pattern, it is important that transmitter and receiver are correctly synchronised. To ensure that both ends are kept in sync, a pilot signal is transmitted with the scrambled speech by means of Audio Frequency Shift Keying (AFSK). An example of a speech scrambler that uses Time Domain Scrambling is the BBC Cryptophon 1100.

Although scramblers of this type are not safe, police forces and other law enforcement agencies around the world used this method for many years to secure their conversations, as it has the advantage that it can be used on existing narrow-band FM radio channels. Even though an experienced listener cannot make any sense of the garbled speech, the system is prone to cryptanalytic attacks, as it is possible to reconstruct the original signal (and hence the cryptographic key) by examining the output signal on an oscilloscope or by means of a modern computer.
Frequency and Time domain scrambling   F/T
The third and most complex type of voice scrambler is the so-called Frequency and Time Domain Scrambler, also known as the F/T Scrambler, which is basically a combination of the two methods explained above. Although scrambling and descrambling are much more complex with this method, the system is just as prone to cryptanalysis as the previous ones. It is inherently insecure.
Below are some examples of scrambled speech. These samples were recorded by Barry Wels [1] from the built-in analogue voice scrambler of the Icom IC-H11 radio. If you listen carefully to the scrambled audio, you may actually be able to descramble some of it yourself.

Digital Encryption
Most - if not all - modern secure voice terminals use digital encryption. Speech is first digitized by means of an Analog-to-Digital Converter (ADC) or a Vocoder. The resulting digital data stream is then mixed by means of an XOR-operation with a data stream from a pseudo-random number generator (PRNG), which in turn is seeded by a KEY. This principle is also known as the Vernam Cipher. The resulting encrypted data stream is then converted back to the analog domain (modem), so that it can be transmitted. This process is shown in the simplified diagram below:
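
The XOR mixing step can be sketched as follows. This is a minimal illustration, not the algorithm of any real device: the key stream generator below (SHA-256 over the key and a counter) is an assumed stand-in for the PRNG, and the function names are hypothetical.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Assumed stand-in PRNG: hash the key together with a running counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:length])

def xor_with_keystream(data: bytes, key: bytes) -> bytes:
    """Mix the digitized speech with the key stream, bit by bit."""
    stream = keystream(key, len(data))
    return bytes(d ^ k for d, k in zip(data, stream))

digitized_speech = bytes(range(32))   # dummy data standing in for vocoder output
key = b"secret key"

encrypted = xor_with_keystream(digitized_speech, key)
decrypted = xor_with_keystream(encrypted, key)
print(decrypted == digitized_speech)
```

Because XOR is its own inverse, the same function serves for encryption and decryption, as long as both ends generate the same key stream. In this sketch the key stream is identical for every call with the same key, so a practical system would also need a way to ensure that a key stream is never reused.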

In the 1970s many systems, such as the KY-57, used Continuous Variable Slope Delta Modulation (CVSD) to convert speech into digital data. This wide-band solution was only suitable for VHF and UHF radios. In the 1980s narrow-band systems were introduced, such as the KY-99, that used (enhanced) Linear Predictive Coding (LPC), limiting the data rate to 2400 bps or even 800 bps.

The Pseudo Random Number Generator (PRNG) is seeded by a KEY that is either entered manually or by means of a key fill device. Modern systems sometimes use asymmetric methods (e.g. RSA or Diffie-Hellman) to exchange the keys over an insecure channel. This is known as Public Key Cryptography.
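
As an illustration of how a key can be agreed over an insecure channel, the sketch below uses a Diffie-Hellman exchange, one well-known public key method. It is a toy example only: the prime, the generator and the function names are assumptions chosen for readability, the parameters are far too small for real use, and the exchange is not authenticated.

```python
import secrets

P = 2**127 - 1   # a Mersenne prime, toy-sized modulus (assumption)
G = 3            # assumed generator for this sketch

def make_keypair():
    """Pick a random private value and derive the public value from it."""
    private = secrets.randbelow(P - 3) + 2    # in the range 2 .. P-2
    public = pow(G, private, P)
    return private, public

def shared_secret(own_private, peer_public):
    """Combine our private value with the public value of the other party."""
    return pow(peer_public, own_private, P)

# Only the public values travel over the insecure channel.
a_private, a_public = make_keypair()
b_private, b_public = make_keypair()

key_at_a = shared_secret(a_private, b_public)
key_at_b = shared_secret(b_private, a_public)
print(key_at_a == key_at_b)
```

Both ends arrive at the same number without ever transmitting it, and that number could then serve as the KEY that seeds the PRNG.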
Before human voice data can be encrypted, it has to be converted to the digital domain by means of a sound sampler or digitizer. Generally speaking, a digital signal needs more bandwidth than its analogue equivalent (typically twice the bandwidth), but methods have been developed to reduce this effect, by analysing the properties of human speech and sending these parameters to the other end, where they are used to reconstruct or synthesize the human speech again.

This method is known as a Vocoder. The reconstructed speech is intelligible, but not always good enough to recognise a person's voice. The first vocoder was developed at Bell Labs in the 1930s, and its speech synthesizer, named VODER, was demonstrated in 1939. The principle was first used during WWII in the transatlantic SIGSALY crypto phone. A speech analyser/synthesizer is also known as a CODEC (coder-decoder). Here are some examples of speech digitisers:
  • PCM - Pulse Code Modulation
    PCM is a general expression for digitizing an analogue signal. A PCM signal is in fact the numerical or digital representation of the analogue signal. Sending PCM data typically requires twice the bandwidth of the analogue original, but the quality is unsurpassed. Data rates are typically in the range of 16 to 32 kbps.

  • CVSD - Continuous Variable Slope Delta modulation
    Reasonable quality vocoder with 1 bit/sample that only registers the difference between the current sample and the previous one (1 = higher, 0 = lower). It has a sample rate between 8 and 16 kHz, which results in 8 to 16 kbps data. Examples of equipment that used CVSD are the Philips Spendex 10, the Spendex 50 and the American KY-68.

  • LPC - Linear Predictive Coding
    Early vocoder for narrow bandwidth connections. LPC-10 has a sampling rate of 8 kHz and a coding rate of 2.4 kbps. Developed by the NSA and used in the first generation secure terminal units (STU).

  • RELP - Residual-Excited Linear Prediction
    Improved (but now obsolete) variant of LPC, and predecessor of CELP.

  • CELP - Code-Excited Linear Prediction
    Improved variant of LPC and RELP that provides better speech quality at lower bitrates. CELP exists in many variants and is also used in MPEG-4 audio coding. It is the most widely used speech coding algorithm today.

  • MELP - Mixed-Excitation Linear Prediction
    Medium quality vocoder, mainly used by the US Department of Defense for secure communication via satellites and military radios. It has a sampling rate of 8 kHz and a coding rate of 2.4 kbps.

  • MELPe - Enhanced Mixed-Excitation Linear Prediction
    High-quality low-bitrate enhanced version of MELP with a sampling rate of 8 kHz and a coding rate of 2400, 1200 or 600 bps.

  • MRELP - Modified Residually-Excited Linear Prediction
    Modified variant of RELP that produces better results at higher bitrates, such as 9600 bps.
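
To make the CVSD entry in the list above more concrete, here is a simplified Python sketch of a 1 bit/sample delta modulator with a variable step size. The step limits, the growth and decay factors and the rule of three equal bits are assumptions chosen for illustration, and do not represent the parameters of any specific piece of equipment.

```python
MIN_STEP = 0.01   # assumed smallest step size
MAX_STEP = 0.5    # assumed largest step size

def _update(estimate, step, history, bit):
    """Shared state update, so encoder and decoder stay in lock-step."""
    history = (history + [bit])[-3:]
    if len(history) == 3 and len(set(history)) == 1:
        step = min(step * 1.5, MAX_STEP)   # three equal bits: slope too small
    else:
        step = max(step * 0.9, MIN_STEP)   # otherwise let the step decay
    estimate += step if bit else -step
    return estimate, step, history

def cvsd_encode(samples):
    """One bit per sample: 1 if the input is at or above the estimate."""
    estimate, step, history = 0.0, MIN_STEP, []
    bits = []
    for sample in samples:
        bit = 1 if sample >= estimate else 0
        estimate, step, history = _update(estimate, step, history, bit)
        bits.append(bit)
    return bits

def cvsd_decode(bits):
    """Rebuild the estimate from the bit stream alone."""
    estimate, step, history = 0.0, MIN_STEP, []
    out = []
    for bit in bits:
        estimate, step, history = _update(estimate, step, history, bit)
        out.append(estimate)
    return out

def bitrate_bps(sample_rate_hz, bits_per_sample=1):
    """Data rate that results from a given sample rate."""
    return sample_rate_hz * bits_per_sample

print(cvsd_encode([0.0] * 6))   # [1, 0, 1, 0, 1, 0]: silence gives alternating bits
print(bitrate_bps(16000))       # 16000, i.e. 16 kbps at a 16 kHz sample rate
```

Since only one bit is sent per sample, the data rate equals the sample rate, which matches the 8 to 16 kbps quoted for CVSD above.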

Below are some sound samples of digitally encrypted speech, recorded by Barry Wels [1] from an Icom IC-H10SR radio. The first file contains the original audio. The second file plays the encrypted audio. Finally, the last file contains the resulting audio once it has been decrypted.

  1. Barry Wels
    Audio samples of ICOM equipment.

  2. George Sugar, Voice Privacy Equipment for Law Enforcement Communication Systems
    US Department of Justice. LESP-RPT-0204.00. May 1974. Page 16.

Crypto Museum. Created: Tuesday 04 August 2009. Last changed: Sunday, 11 December 2016 - 15:05 CET.