Why Breaking Encryption Will Not Work

There is a fair amount of “chatter” on news shows today about the use of the Sony PlayStation by Islamic Terrorists for communications and the desire by many to have more “cooperation” from Silicon Valley companies in putting back doors into things like encryption methods.

Many talking heads are gravely pronouncing that Edward Snowden is slightly worse than Hitler but not quite as bad as the Devil Himself, at least on some days… /sarc;

Frankly, I find it tedious.

It displays a gross ignorance of the technology of coded messages and encryption.

In short, that horse left the barn a hundred years ago.

Yes, I know the great victory pulled off by British Intelligence in breaking Enigma. I’ve taught it as part of my History Of Computing segment in college classes on computing. And it WAS great. Yet even then the Germans had available to them unbreakable coded messages. That they were so arrogant as to think they had a tech solution was their downfall.

There exist, and have existed for a long time, many secure encoding methods. Fundamentally unbreakable. Well known. You do not need an ‘end to end encrypting app’ to use them. I’ll give just 2 examples. They may in fact be the same example, but we’ve never broken the first one so can’t really say.

Number Codes

I will be using Wiki links as much as possible just to show how well known all of this is.

Anyone who has used a shortwave radio has heard stations that just consist of long sequences of number groups. Sometimes the language changes. Sometimes the length of each group. They were, for decades, suspected of being some kind of spy operations, but nobody who knew was talking. Lately they have.

The technology still works, and as near as I can tell has not been broken. It is most likely a kind of ‘one time pad’ code, but who knows.

https://en.wikipedia.org/wiki/Numbers_station

A numbers station is a type of shortwave radio station characterized by unusual broadcasts, reading out lists of numbers or incomprehensible coded messages. The voices are often created by speech synthesis and are transmitted in a wide variety of languages. The voices are usually female, although sometimes men’s or children’s voices are used. Some voices are synthesized and created by machines; however, some stations used to have live readers. Many numbers stations went off the air due to the end of the Cold War in 1989, but many still operate and some have even continued operations but changed schedules and operators.

The first known use of numbers stations was during World War I, and the first possible listener was Anton Habsburg of Austria. The numbers were transmitted in Morse. The Czech Ministry of Interior and the Swedish Security Service have both acknowledged the use of numbers stations by Czechoslovakia for espionage, with declassified documents proving the same. With a few exceptions, no QSL responses have been received from numbers stations by shortwave listeners who sent reception reports to said stations, which is the expected behavior of a non-clandestine station.

The best known of the numbers stations was the “Lincolnshire Poacher”, which is thought to have been run by the British Secret Intelligence Service.

In 2001, the United States tried the Cuban Five on the charge of spying for Cuba. That group had received and decoded messages that had been broadcast from Cuban numbers stations. Also in 2001, Ana Belen Montes, a senior US Defense Intelligence Agency analyst, was arrested and charged with espionage. The federal prosecutors alleged that Montes was able to communicate with the Cuban Intelligence Directorate through encoded messages, with instructions being received through “encrypted shortwave transmissions from Cuba”. In 2006, Carlos Alvarez and his wife, Elsa, were arrested and charged with espionage. The U.S. District Court Florida stated that “defendants would receive assignments via shortwave radio transmissions”.

In June 2009, the United States similarly charged Walter Kendall Myers with conspiracy to spy for Cuba and receiving and decoding messages broadcast from a numbers station operated by the Cuban Intelligence Directorate to further that conspiracy.

It has been reported that the United States used numbers stations to communicate encoded information to persons in other countries. There are also claims that State Department operated stations, such as KKN50 and KKN44, used to broadcast similar “numbers” messages or related traffic.

Suspected origins and use

According to the notes of The Conet Project, which has compiled recordings of these transmissions, numbers stations have been reported since World War I. If accurate, this would count numbers stations among the earliest radio broadcasts.

It has long been speculated, and was argued in court in one case, that these stations operate as a simple and foolproof method for government agencies to communicate with spies working undercover. According to this theory, the messages are encrypted with a one-time pad, to avoid any risk of decryption by the enemy. As evidence, numbers stations have changed details of their broadcasts or produced special, nonscheduled broadcasts coincident with extraordinary political events, such as the August Coup of 1991 in the Soviet Union.

The theory is that they use a “one time pad”, but exactly what they do, or if they do different things, is not known to the public… But at least one is known to have used such a system:

The Atención spy case evidence

The “Atención” station of Cuba became the world’s first numbers station to be officially and publicly accused of transmitting to spies. It was the centerpiece of a United States federal court espionage trial following the arrest of the Wasp Network of Cuban spies in 1998. The U.S. prosecutors claimed the accused were writing down number codes received from Atención, using Sony hand-held shortwave receivers, and typing the numbers into laptop computers to decode spying instructions. The FBI testified that they had entered a spy’s apartment in 1995, and copied the computer decryption program for the Atención numbers code. They used it to decode Atención spy messages, which the prosecutors unveiled in court.

United States government evidence included the following three examples of decoded Atención messages. (Not reported whether the original clear texts were in Spanish, although the phrasing of “Day of the Woman” would indicate so.)

“prioritize and continue to strengthen friendship with Joe and Dennis” [68 characters]
“Under no circumstances should [agents] German nor Castor fly with BTTR or another organization on days 24, 25, 26 and 27.” [112 characters] (BTTR is the anti-Castro airborne group Brothers to the Rescue)
“Congratulate all the female comrades for International Day of the Woman.” [71 characters] (Probably a simple greeting for International Women’s Day on 8 March)

At the rate of one spoken number per character per second, each of these sentences takes more than a minute to transmit.

The moderator of an e-mail list for global numbers station hobbyists claimed, “Someone on the Spooks list had already cracked the code for a repeated transmission [from Havana to Miami] if it was received garbled.” Such code-breaking is possible if a one-time pad decoding key is used more than once. If used properly, however, the code cannot be broken.

Notice that the USA had to capture the “decryption program” in the computer in order to decrypt the messages. It is not clear if this was a ‘one time pad’ or not.
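The quoted remark about a key being “used more than once” can be shown in a few lines. This is only a sketch under my own assumptions: the two messages are invented, and XOR on bytes stands in for whatever combining step the real system used. The point is that when the same pad encrypts two messages, the pad cancels out of the pair.

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings position by position (stops at the shorter one)."""
    return bytes(x ^ y for x, y in zip(a, b))

# Two invented messages of equal length (illustration only).
msg1 = b"MEET AT THE DOCKS"
msg2 = b"STAY HOME TONIGHT"

# A random pad that is (wrongly) used for both messages.
pad = secrets.token_bytes(len(msg1))

ct1 = xor_bytes(msg1, pad)
ct2 = xor_bytes(msg2, pad)

# The pad cancels: XOR of the two ciphertexts equals XOR of the two plaintexts.
leak = xor_bytes(ct1, ct2)

# Anyone who knows or guesses one message now gets the other, with no pad at all.
recovered = xor_bytes(leak, msg1)
print(recovered)
```

Use each pad once and this shortcut does not exist, which is the whole point of “one time”.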

The One Time Pad

At its simplest, you could have a pad with matching sentences at each end. As a page is used, it is destroyed.

Going to lunch. Poison the water supply.
I’m tired of this. You are being watched, leave now.
Are you hungry? Call me soon.
See you at Christmas! Attack now!
I hate Thanksgiving. Poison the water supply.
Do you like movies? You are being watched, leave now.

etc.

As the cover text never repeats, you can not recover the real meaning without capturing the pad.
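For the programmers in the audience, a minimal sketch of that phrase pad, assuming it is held as a simple lookup table. The names `make_pad` and `decode` are mine, invented for illustration. Each entry is removed the moment it is used, which is the software equivalent of tearing off the page.

```python
def make_pad():
    """Build a fresh pad: cover phrase -> real meaning (pairs taken from the list above)."""
    return {
        "Going to lunch.": "Poison the water supply.",
        "I'm tired of this.": "You are being watched, leave now.",
        "Are you hungry?": "Call me soon.",
        "See you at Christmas!": "Attack now!",
    }

def decode(pad, cover_phrase):
    """Return the meaning of a cover phrase and destroy that entry.

    A phrase that is unknown, or already used, decodes to None.
    """
    return pad.pop(cover_phrase, None)

pad = make_pad()
first = decode(pad, "Are you hungry?")   # meaning is returned, entry is destroyed
second = decode(pad, "Are you hungry?")  # same phrase again: nothing left to find
print(first, second)
```

Both ends hold the same table, so there is nothing mathematical for an eavesdropper to attack.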

https://en.wikipedia.org/wiki/One-time_pad

In cryptography, the one-time pad (OTP) is an encryption technique that cannot be cracked if used correctly. In this technique, a plaintext is paired with a random secret key (also referred to as a one-time pad). Then, each bit or character of the plaintext is encrypted by combining it with the corresponding bit or character from the pad using modular addition. If the key is truly random, is at least as long as the plaintext, is never reused in whole or in part, and is kept completely secret, then the resulting ciphertext will be impossible to decrypt or break. It has also been proven that any cipher with the perfect secrecy property must use keys with effectively the same requirements as OTP keys. However, practical problems have prevented one-time pads from being widely used.

First described by Frank Miller in 1882, the one-time pad was re-invented in 1917. On July 22, 1919, U.S. Patent 1,310,719 was issued to Gilbert S. Vernam for the XOR operation used for the encryption of a one-time pad. It is derived from the Vernam cipher, named after Gilbert Vernam, one of its inventors. Vernam’s system was a cipher that combined a message with a key read from a punched tape. In its original form, Vernam’s system was vulnerable because the key tape was a loop, which was reused whenever the loop made a full cycle. One-time use came later, when Joseph Mauborgne recognized that if the key tape were totally random, then cryptanalysis would be impossible.

The “pad” part of the name comes from early implementations where the key material was distributed as a pad of paper, so that the top sheet could be easily torn off and destroyed after use. For ease of concealment, the pad was sometimes reduced to such a small size that a powerful magnifying glass was required to use it. The KGB used pads of such size that they could fit in the palm of one’s hand, or in a walnut shell. To increase security, one-time pads were sometimes printed onto sheets of highly flammable nitrocellulose, so that they could be quickly burned after use.

There is some ambiguity to the term because some authors use the terms “Vernam cipher” and “one-time pad” synonymously, while others refer to any additive stream cipher as a “Vernam cipher”, including those based on a cryptographically secure pseudorandom number generator (CSPRNG).
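The “modular addition” in that description is easy to sketch. This is a toy version under stated assumptions: capital letters only, no spaces, and Python's `secrets` module standing in for a truly random source. The function names are mine.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase  # A=0 ... Z=25; spaces and punctuation are not handled

def make_pad(length: int) -> str:
    """Generate a random pad of the given length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def encrypt(plaintext: str, pad: str) -> str:
    """Add each pad letter to the matching plaintext letter, modulo 26."""
    if len(pad) < len(plaintext):
        raise ValueError("pad must be at least as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
        for p, k in zip(plaintext, pad)
    )

def decrypt(ciphertext: str, pad: str) -> str:
    """Subtract each pad letter from the matching ciphertext letter, modulo 26."""
    if len(pad) < len(ciphertext):
        raise ValueError("pad must be at least as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 26]
        for c, k in zip(ciphertext, pad)
    )

message = "ATTACKATDAWN"
pad = make_pad(len(message))
ciphertext = encrypt(message, pad)
print(ciphertext, decrypt(ciphertext, pad))
```

With a fixed pad the arithmetic can be checked by hand: `encrypt("HELLO", "XMCKL")` gives `EQNVZ`. Nothing here needs a computer; the same sums can be done with pencil and paper.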

Talking Code

But there are many other kinds of cipher and coding. One, used by the USA, was so secure that it was a top secret until fairly recently. The Navajo Code Talkers were showcased in a movie about them. First, it was all in Navajo, a fairly little known language and not very easy to learn to begin with. Yet even when the Japanese captured a native speaker of Navajo, he could not decrypt the messages. They were encoded. Things like “turtle” meant “tank” and a specific kind of bird was a fighter, another a bomber. Unless you know the encoding, you can’t break the code without a very large body of material AND some idea what it is talking about.

We also used other native languages, and other encodings. Nothing at all prevents that system from being used again by others.

https://en.wikipedia.org/wiki/Code_talker

Code talkers are people in the 20th century who used obscure languages as a means of secret communication during wartime. The term is now usually associated with the United States soldiers during the world wars who used their knowledge of Native American languages as a basis to transmit coded messages. In particular, there were approximately 400–500 Native Americans in the United States Marine Corps whose primary job was the transmission of secret tactical messages. Code talkers transmitted these messages over military telephone or radio communications nets using formal or informally developed codes built upon their native languages. Their service improved the speed of encryption of communications at both ends in front line operations during World War II.

The name code talkers is strongly associated with bilingual Navajo speakers specially recruited during World War II by the Marines to serve in their standard communications units in the Pacific Theater. Code talking, however, was pioneered by Cherokee and Choctaw Indians during World War I.

Other Native American code talkers were deployed by the United States Army during World War II, including Lakota, Meskwaki, and Comanche soldiers. Soldiers of Basque ancestry were also used for code talking by the U.S. Marines during World War II in areas where other Basque speakers were not expected to be operating.

Of note, that article also mentions use by China and the British with other languages and goes into some depth about how the code layer is added to the language layer.

Many Other Codes and Cyphers

There are many, many other kinds of codes and cyphers. The current fad of computer encryption is NOT special. In some ways it is less secure than carefully crafted codes. Its major advantage is that you do not need to control and distribute code books.

https://en.wikipedia.org/wiki/Code_%28cryptography%29

In cryptology, a code is a method used to transform a message into an obscured form so it cannot be understood. Special information or a key is required to read the original message. The usual method is to use a codebook with a list of common phrases or words matched with a codeword. Encoded messages are sometimes termed codetext, while the original message is usually referred to as plaintext.

Terms like code and cipher are often used to refer to any form of encryption. However, there is an important distinction between codes and ciphers in technical work; it is, essentially, the scope of the transformation involved. Codes operate at the level of meaning; that is, words or phrases are converted into something else. Ciphers work at the level of individual letters, or small groups of letters, or even, in modern ciphers, with individual bits. While a code might transform “change” into “CVGDK” or “cocktail lounge”, a cipher transforms elements below the semantic level, i.e., below the level of meaning. The “a” in “attack” might be converted to “Q”, the first “t” to “f”, the second “t” to “3”, and so on. Ciphers are more convenient than codes in some situations, there being no need for a codebook, with its inherently limited number of valid messages, and the possibility of fast automatic operation on computers.

Codes were long believed to be more secure than ciphers, since (if the compiler of the codebook did a good job) there is no pattern of transformation which can be discovered, whereas ciphers use a consistent transformation, which can potentially be identified and reversed (except in the case of the one-time pad).

Further down the article, it has another reference to the difficulty in breaking even a simple code:

Idiot code

An idiot code is a code that is created by the parties using it. This type of communication is akin to the hand signals used by armies in the field.

Example: Any sentence where ‘day’ and ‘night’ are used means ‘attack’. The location mentioned in the following sentence specifies the location to be attacked.

Plaintext: Attack X.
Codetext: We walked day and night through the streets but couldn’t find it! Tomorrow we’ll head into X.

An early use of the term appears to be by George Perrault, a character in the science fiction book Friday by Robert A. Heinlein:

The simplest sort [of code] and thereby impossible to break. The first ad told the person or persons concerned to carry out number seven or expect number seven or it said something about something designated as seven. This one says the same with respect to code item number ten. But the meaning of the numbers cannot be deduced through statistical analysis because the code can be changed long before a useful statistical universe can be reached. It’s an idiot code… and an idiot code can never be broken if the user has the good sense not to go too often to the well.

Terrorism expert Magnus Ranstorp said that the men who carried out the September 11, 2001, attacks on the United States used basic e-mail and what he calls “idiot code” to discuss their plans.

Note that last sentence. I’m not “spilling the beans” to point out these codes…

My Own Code

I’ve pondered codes and encryption a great deal. The idea for this system first came to me after watching one too many movies… In one, the key was to find a particular book and use it to decode a message. I’ve since seen the same plot in several other movies and now can’t keep straight which one was the first one ;-) I think it involved a travel guide to a particular city… or that might be the later movie…

Just think of it as an ISBN enhancement of that system.

The basic idea is that you have a set of numbers. (Sound familiar?) Each number is a page, paragraph, sentence, and word. So 12,3,1,5 would be the fifth word of the first sentence of paragraph 3 on page 12. Since common words like “the” are on many pages and in many sentences, it is easy to ‘mix it up’. Since it does NOT encode letters, the frequency count of “e” being the most common in English is of no use in doing a ‘frequency count’ attack; it isn’t a cypher.
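A minimal sketch of that lookup, assuming the book has already been split into pages, paragraphs and sentences. The `BOOK` text and the function names are invented for illustration; a real book would need more careful sentence splitting.

```python
# Invented "book": page number -> list of paragraphs -> list of sentences.
BOOK = {
    12: [
        ["The harbor was quiet at first light."],
        ["Nobody came to the market that day.", "Rain fell all afternoon."],
        ["We will meet by the old bridge at noon.", "Bring the maps."],
    ],
}

def lookup(book, page, paragraph, sentence, word):
    """Return the word at a 1-based (page, paragraph, sentence, word) address."""
    text = book[page][paragraph - 1][sentence - 1]
    words = [w.strip(".,;:!?\"'") for w in text.split()]
    return words[word - 1]

def decode(book, addresses):
    """Decode a list of address tuples into a space-separated message."""
    return " ".join(lookup(book, *address) for address in addresses)

# 12,3,1,5 is the fifth word of the first sentence of paragraph 3 on page 12.
print(lookup(BOOK, 12, 3, 1, 5))
print(decode(BOOK, [(12, 3, 1, 3), (12, 3, 1, 8), (12, 3, 1, 9)]))
```

In this toy book the address 12,3,1,5 lands on “the”, which is exactly the kind of common word that can be reached from many different addresses.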

What I thought to add to this is some “forward mutation” semantics and an encryption of the ISBN# into the text so that one didn’t have to arrange in advance what book to use. There can be MANY variations on the exact syntax of the mutation, so knowing how to attack it becomes extremely hard.

So, as an example, we could start with the ‘base book’ being something like a particular dictionary or a common translation of the Bible. You use that to encode the first 10 or “whatever” semantic items. They can be coded to mean something like: Distance to start, direction to read, count to read, skip distance, shift amount, stride of additional text, final skip.

5, 23, 43, 98, 2, 8, 11,

So (remember this is the decrypt of the initial top text) 5 tells us to skip the next 5 entries, to 11.
11 is odd, so we read forward.

75, 45, 3, 25, 34

read 75 numbers, after skipping the first 45, shift each number by 3, use the next 25, skip another 34.

Personally, I’d also use the first letter of words as the skip and shift numbers, read as a digit and taken modulo 16. So “apple” is 10, “fast” would be 15, and “go” (rolling past the end at 15…) becomes 0 with “hello” being 1, etc. But I’m a bit masochistic when it comes to numbers ;-) Probably overkill…
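One way to sketch that first-letter trick, assuming the letters are counted the way hexadecimal counts them (a=10 through f=15) and then wrapped modulo 16. The function name is mine.

```python
def letter_value(word: str) -> int:
    """Map a word's first letter to 0-15: a=10 ... f=15, then wrap (g=0, h=1, ...)."""
    first = word[0].lower()
    if not ("a" <= first <= "z"):
        raise ValueError("word must start with a letter a-z")
    # int(x, 36) reads a letter as a base-36 digit: a=10, b=11, ... z=35.
    return int(first, 36) % 16

for word in ("apple", "fast", "go", "hello"):
    print(word, letter_value(word))
```

Any innocent looking word can then carry a skip or shift number without a digit ever appearing in the message.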

Now at any time, you can insert “ISBN#” (or a coded equivalent) and that changes what book is the decrypt pad. “Yodle504923470x” meaning to use “Is God A Mathematician?”. Oh, and I have the ISBN number written backwards just to add another twist to things.

At any time another code word can mean “apply another skip / shift set”. So “Mumble, 24, 3, 1, 8, 14” would mean “Read 24 after skipping the first 3, shift each number by 1, and use the next 8, then skip 14 more”
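Here is one possible reading of such a directive, sketched in code. Everything in it is an assumption on my part: that the shift is added rather than subtracted, that the “next 8” are used unshifted, and the order of the five parameters. The stream of numbers is made up. Pick different rules and you have a different code, which is rather the point.

```python
def apply_directive(numbers, read_count, skip_first, shift, use_next, final_skip):
    """Apply one reading of "read R after skipping K, shift by S, use next U, skip F".

    Returns (selected numbers, remaining stream).
    """
    pos = skip_first
    # Read `read_count` numbers, shifting each one (addition is an assumption).
    shifted = [n + shift for n in numbers[pos:pos + read_count]]
    pos += read_count
    # Use the next `use_next` numbers as they stand.
    plain = numbers[pos:pos + use_next]
    # Skip `final_skip` more before handing back the rest of the stream.
    pos += use_next + final_skip
    return shifted + plain, numbers[pos:]

# 60 made-up numbers standing in for a received message.
stream = list(range(100, 160))

# "Mumble, 24, 3, 1, 8, 14"
selected, rest = apply_directive(stream, 24, 3, 1, 8, 14)
print(len(selected), selected[:3], rest[:3])
```

The selected numbers would then go through the book lookup, and whatever remains of the stream is processed under the next directive.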

Remember that this mutation is now encoded with “Is God A Mathematician?”, not with the first book. Or the second if there was a similar directive in that “read 75”… so you can’t just look for “Yodle” to figure out what’s up.

Cumbersome? Yes. But have a computer do it and it is less so. (Though don’t let the computer be copied or stolen…)

Oh, and if you DO have a computer assist, by using “special” copies of the library on it, ones where, for example, some paragraph breaks are changed or some sentences have a word left out like ‘a’ or ‘an’, even if the idea is known and somehow the ISBN is figured out, the code is sporadically ‘wrong’ and it will drive them nuts…

As you can see, too, there are a thousand and one variations on this system possible. Carefully done, it would be impossible to break, IMHO. Capture of the device would be needed, or capture of a person who knew the code.

In Conclusion

So while cyphers make it fast and easy for anyone to have a crypto-message, and encryption is really the only practical solution for things like whole hard disks and images, for simple messaging it is hardly needed at all.

It will achieve NOTHING to force all of us to have open communications, as the Bad Guys can just shift over to well proven code systems. It WILL provide convenient “back doors” to be found, exploited, and used by all of:

China,
Russia,
Iran,
India,
Japan,
EU ‘Agencies’,
UK ‘Agencies’,
CIA,
NSA,
Anonymous,
Hackers World Wide,
Your Competitors,
That high school kid next door,
Your Spouse,
Me,
You,
That creepy guy next to you at Starbucks…

So please, despite what all the police agencies are asking to be given, do not mistake a cryptographic back door into common applications as providing any security to anyone. All it does is expose all of US to a police state (a few dozen of them, actually…) and let the Bad Guys know they need to use another method, of which there are dozens that are better.

Oh, and ‘by the way’: All the best crypto-code is already public. Nothing at all prevents them from just using it to encrypt a file and send it through your ‘back door broken’ system… A minor inconvenience at most.


About E.M.Smith

A technical managerial sort interested in things from Stonehenge to computer science. My present "hot buttons" are the mythology of Climate Change and ancient metrology; but things change...
This entry was posted in Political Current Events, Tech Bits.

4 Responses to Why Breaking Encryption Will Not Work

  1. Larry Ledwick says:

    Nice intro. Also even for codes which are vulnerable to being broken, there are minimum numbers of intercepts needed to successfully crack the code. Take for example a 3rd grade level of sophistication where every letter in the alphabet is shifted right by 3 letters. A becomes D, G becomes J.

    A message “meet at dawn” becomes “PHHW DW GDZQ”
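The shift-by-3 example can be sketched in a few lines, along with the obvious attack of simply trying all 26 shifts. This is only an illustration and assumes plain English letters; the function name is invented.

```python
def caesar(text: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` places, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(result)

ciphertext = caesar("MEET AT DAWN", 3)

# Brute force: undo every possible shift and read down the list for English.
candidates = [caesar(ciphertext, -s) for s in range(26)]
for shift, guess in enumerate(candidates):
    print(shift, guess)
```

One intercept and 26 guesses are enough, but only because the attacker assumes English and a standard alphabet.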

    Most school kids could decipher that code text with only a little effort. It only takes one code sample to break it, “provided” you know the code is highly likely to be English text using the common 26 letter alphabet. Change either of those assumptions and suddenly you open up a huge number of other possibilities. Mutate the alphabet sequence but stick with English, and unless you know the mutation of the alphabet it would take a period of pounding on the code to figure out the likely meaning. But what happens if the phrase “meet at dawn” does not mean meet at dawn but is itself a code phrase which has one unique meaning? Perhaps representing an entire paragraph of instructions which were prearranged, like “you have been compromised burn all code material and execute plan B immediately”.

    But wait what is plan B??

    In some respects “idiot codes” can be more difficult to break than complex ciphers, as you literally have no way of knowing how many layers of obfuscation are being used in the message. In the case of a book code as you mentioned above, where you notate the page, paragraph, line and word of a prearranged text, that source text could be any widely available text which both participants have access to. For example, a book code based on the front page of a 1975 newspaper from a major city. Both parties could access that information at the local library by going through microfilm of old newspapers. The paper would not even need to be from the same country and language as the participants, now that translation tools are easily available online.

    The major weakness of these simplified methods is that both persons need to meet ahead of time or share information to agree on the conventions of use, or be trained in the same conventions.

    In cases like Enigma, they needed thousands of intercepts of the coded messages to work out the basic structure of the code (i.e., how many wheels were in the code machines) and then they needed some careless operation methods by the users to get their first hints about how the machines were configured. Things like always signing the messages with the same closing line of text gave a common reference to allow them to determine when the codes changed.

    One-time idiot codes where a key phrase has a unique meaning would be like the WWII BBC codes broadcast in the open by shortwave, with phrases like “The picture is broken” having a prearranged meaning for a single clandestine group, and might actually refer to a long set of instructions memorized ahead of time.

  2. Pingback: Secure Communication, App for that, and Comments on ISIL | Musings from the Chiefio

  3. simple-touriste says:

    “known known to the public” ?

    [ Reply: Fixed. Thanks. Sometimes if the brain thinks about something else for a second the fingers type a repeat while they wait… It’s been a few decades since I was actively involved in how typing happens ;-) Down in the brain stem now…]

  4. Pingback: Apple vs. FBI vs. Spy vs. Spy | Musings from the Chiefio
