I’ve always loved learning about different ways messages are encoded. As a little kid I used simple substitution ciphers and created my own alphabets to encode messages to myself. From different books and my computer science classes in college, I learned more about cryptography. I found that, no matter how complex a code is, there are always ways to crack it, whether by figuring out how it was designed, through social engineering, or through interception at a point where encryption isn’t used.
Encryption is becoming more and more standard on the Internet and on storage devices like hard drives and USB sticks because people want to protect their personal information. Every company that handles private information should use up-to-date cryptography techniques to prevent hackers from accessing and using their data.
Here’s an explanation of a few very simple codes and more complicated modern cryptography techniques that are frequently used on the Internet today.
Any way of writing a message by hand that would be hard for someone else to read can work as a simple code. This includes writing things in a different alphabet. I’ve played with Icelandic runes and the International Phonetic Alphabet (IPA), as well as more niche created alphabets like the Deseret Alphabet (which was extremely tedious to write a message in).
Language can also be used as a code. I’ve looked into created languages like Elvish and Esperanto, but real languages can also be effective. The book Code Talker is a memoir by Chester Nez and Judith Schiess Avila that tells how the Navajo language was used in extremely intense situations during World War II as a code that was never cracked. When there weren’t words in Navajo for a specific concept, the code talkers decided on a word that would be used instead. For instance, a fighter plane became the Navajo word for “hummingbird,” and Germany was “iron hat.”
I love books like Tamora Pierce’s Trickster’s Choice and Trickster’s Queen, where spies and secret messages play a prominent role. Many fiction and nonfiction books have examples of simple cryptography techniques. Here are some older, simpler codes that can be used and cracked by hand.
In a substitution cipher, each letter is replaced by another. If I had a five-letter alphabet (a, b, c, d, e), I might say:
a => c, b => e, c => b, d => d, e => a
A message is then encoded by replacing each letter with its value (a becomes c, etc.). Decoding the message just requires reversing the key (c becomes a, etc.).
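The five-letter key above can be tried out in a few lines of Python. This is only a minimal sketch: the function name `substitute` and the sample message are made up for illustration, and any character that is not in the key is simply passed through unchanged.

```python
# Key from the five-letter example: plaintext letter -> cipher letter.
KEY = {"a": "c", "b": "e", "c": "b", "d": "d", "e": "a"}

# Decoding uses the same key reversed: cipher letter -> plaintext letter.
REVERSE_KEY = {cipher: plain for plain, cipher in KEY.items()}


def substitute(message, table):
    """Replace each character using the table; leave unknown characters alone."""
    return "".join(table.get(ch, ch) for ch in message)


encoded = substitute("a bad cab", KEY)
decoded = substitute(encoded, REVERSE_KEY)
print(encoded)  # c ecd bce
print(decoded)  # a bad cab
```

The same function handles both directions because encoding and decoding differ only in which lookup table is used.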
This kind of code is extremely easy to crack. Given time, it can be solved by hand like a puzzle, starting with guessing letters for shorter words. It is even easier to solve with a computer, using frequency analysis of different letters. In each language, some letters are more common than others. For example, in English, “e” is the most common letter, so if the message is long enough to have many reused characters, whichever letter appears most often probably stands for “e.”
There are many variations on this kind of cipher and many ways that people try to make it more difficult to crack. Two well-known versions are the Caesar cipher and ROT13. Morse code is also an example of a substitution cipher, but instead of replacing letters with other letters, letters are replaced with sequences of long and short beeps.
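As a rough illustration of frequency analysis, the sketch below encodes a short English sentence with a made-up substitution key and then counts the letters in the result. The key `SCRAMBLED`, the sample sentence, and the helper name `most_common_letters` are all assumptions chosen for the demo. A real attack would need a much longer message and would compare the whole frequency table, not only the top letter.

```python
import string
from collections import Counter

# Hypothetical substitution key: a fixed scramble of the lowercase alphabet.
SCRAMBLED = "qwertyuiopasdfghjklzxcvbnm"
ENCODE_TABLE = str.maketrans(string.ascii_lowercase, SCRAMBLED)


def most_common_letters(text, count=3):
    """Return the most frequent letters in the text with their counts."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    return Counter(letters).most_common(count)


plaintext = "meet me near the green tree at seven"
ciphertext = plaintext.translate(ENCODE_TABLE)

# The most frequent cipher letter is the best first guess for "e".
top_letter, occurrences = most_common_letters(ciphertext)[0]
print(top_letter, occurrences)  # t 11
```

In this sample, "e" is encoded as "t", so the guess happens to be right. With short or unusual messages the top letter can easily be something else.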
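The Caesar cipher shifts every letter a fixed number of places through the alphabet, and ROT13 is the special case where the shift is 13. Here is a small sketch of both. The `caesar` function is written for this example, while `codecs.encode(..., "rot_13")` is the ROT13 codec that ships with Python.

```python
import codecs


def caesar(text, shift):
    """Shift each ASCII letter by `shift` places, wrapping around the alphabet."""
    result = []
    for ch in text:
        if "a" <= ch <= "z":
            base = ord("a")
        elif "A" <= ch <= "Z":
            base = ord("A")
        else:
            result.append(ch)  # leave spaces and punctuation alone
            continue
        result.append(chr((ord(ch) - base + shift) % 26 + base))
    return "".join(result)


secret = caesar("Hello, world", 3)
print(secret)              # Khoor, zruog
print(caesar(secret, -3))  # Hello, world

# ROT13 is a Caesar cipher with a shift of 13, so applying it twice
# returns the original text.
print(caesar("Hello", 13))                # Uryyb
print(codecs.encode("Uryyb", "rot_13"))   # Hello
```

Because there are only 25 useful shifts, a Caesar cipher can also be cracked by simply trying every shift and reading the results.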
A substitution cipher could also use another language’s alphabet, but that may be a little more complicated than replacing one letter with another since there’s not always a 1:1 transfer of letters and sounds between languages.
Another cryptography technique relies on both the writer and the recipient having exact copies of a book or some other written material. The person writing the message can indicate the page, line, and number of the character in the line they are referencing for each letter in their message. So a single letter may look like 35:7:18.
If a codebreaker does not know which book was used, this code can be hard to crack, because there may be no repetition in the coded message at all (the first time I code the letter “e,” I may take it from a different place in the book than the second time).
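The page:line:character scheme described above can be sketched with a tiny stand-in for the shared book. Everything here is hypothetical: the two-page `BOOK`, the function names, and the sample message. The sketch assumes every character of the message appears somewhere in the book and raises a `KeyError` otherwise.

```python
import random

# Hypothetical two-page "book"; a real one would be a printed edition
# that the sender and recipient both own.
BOOK = [
    ["the quick brown fox", "jumps over the lazy dog"],
    ["pack my box with", "five dozen liquor jugs"],
]


def build_index(book):
    """Map each character to every (page, line, position) where it occurs."""
    index = {}
    for page_no, page in enumerate(book, start=1):
        for line_no, line in enumerate(page, start=1):
            for pos, ch in enumerate(line, start=1):
                index.setdefault(ch, []).append((page_no, line_no, pos))
    return index


def encode(message, book):
    """Pick a random location for each character, so repeats look different."""
    index = build_index(book)
    return [random.choice(index[ch]) for ch in message]


def decode(references, book):
    """Look each (page, line, position) reference back up in the book."""
    return "".join(book[p - 1][l - 1][c - 1] for p, l, c in references)


references = encode("meet me", BOOK)
print(" ".join(f"{p}:{l}:{c}" for p, l, c in references))
print(decode(references, BOOK))  # meet me
```

Choosing a random location each time is what lets the same letter show up as different number triples in the coded message.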
The movie National Treasure has an example of this kind of code. The main characters find a code made of numbers and figure out that the numbers point to words in the Silence Dogood letters. Knowing that their enemies are watching the text, they pay a boy to go count through the words in the documents and find the clue they need.
If someone can guess or find out what book or written source is being used for encoding, then this code is cracked.
Steganography or hidden messages
Another method used to transfer information secretly is steganography. A message can’t be decoded if it can’t be found. Using invisible ink or putting the message in an unexpected place, like hidden in a painting, are examples of steganography.
National Treasure also has a good example of steganography. A message hidden in invisible ink on the back of the Declaration of Independence gives the characters clues about where to go.
Hidden messages have to be found before they can be cracked. They can be hidden in obvious or very unexpected places, and the hidden message can also be encoded in other ways, making it more difficult to decipher even if it is found.
The Code Book by Simon Singh shows how codes have evolved over time, and it is a great read if you are interested in finding out how we went from simple ciphers to the more complex encryption methods we use today. Here are a few modern cryptography techniques.
A different base
A relatively simple way to encode a message, although one that can be extremely tedious to perform by hand, involves using a different base. While we are used to the base-10 number system, programmers are generally familiar with binary (base-2), hexadecimal (base-16, commonly called hex), and other systems as well. Once the basic concept behind different bases is understood, it can be easily applied to any base.
To convert letters to a different base, you can use the letters’ ASCII values or create your own method for linking your alphabet to numbers. Then it’s a question of converting a number from one base to another. If you’re not familiar with that process, here’s an example of turning a decimal number into binary and into hex.
100 is a nice round number in a base-10 system, so we’ll use that.
In binary, 1 is 1, 10 is 2, 11 is 3, 100 is 4, and so on. So if you wanted to do this conversion just in your head, you could see that 1000 is 8, 10000 is 16, 100000 is 32, and 1000000 is 64 (all powers of 2). So 100 = 64 + 32 + 4, which is 1100100 in binary.
The same thing can be done in the hexadecimal system, which has 16 symbols (0-9 and A-F). Generally “0x” is placed before a hex number to indicate that it is base 16.
So, in hex, 0x1 is 1, 0x10 is 16, and 0x100 is 256 (16*16). 0x20 is 32, 0x40 is 64, and 0x60 is 96, as each 0x10 is worth 16. So 100 = 96 + 4, which is 0x64.
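For anyone who would rather let the computer do the arithmetic, here is one way to sketch the conversion in Python using repeated division. The function name `to_base` is made up for this example and it only handles non-negative numbers in bases 2 through 16. Python also has this built in: `bin()`, `hex()`, and `int(text, base)`.

```python
DIGITS = "0123456789ABCDEF"


def to_base(number, base):
    """Convert a non-negative integer to a string in the given base (2-16)."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        # The remainder is the next digit, starting from the right.
        number, remainder = divmod(number, base)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))


print(to_base(100, 2))   # 1100100
print(to_base(100, 16))  # 64

# Going back to base 10 with the built-in int().
print(int("1100100", 2), int("64", 16))  # 100 100

# Encoding letters: take each ASCII value and write it in binary.
print([to_base(ord(ch), 2) for ch in "Hi"])  # ['1001000', '1101001']
```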
While some of these calculations can be performed by hand, it becomes much easier to transform bigger numbers into larger bases, such as Base64, with a script. Many languages also have libraries with functions that will encode and decode common bases for you.
Messages that have simply been transformed into a different base are generally easy to recognize and easy to decipher. A message that is all 1s and 0s is almost certainly binary, while a string with letters and numbers that ends in “=” is probably Base64. A simple script can then be used for deciphering.
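A deciphering script for these cases can be very short. The sketch below uses only Python's standard library. The three function names are made up for this example, and it assumes the hidden text is plain ASCII with binary and hex values separated by spaces.

```python
import base64


def decode_binary(text):
    """Decode space-separated binary numbers into ASCII characters."""
    return "".join(chr(int(chunk, 2)) for chunk in text.split())


def decode_hex(text):
    """Decode hex byte values (whitespace is ignored) into ASCII text."""
    return bytes.fromhex(text).decode("ascii")


def decode_base64(text):
    """Decode a Base64 string into ASCII text."""
    return base64.b64decode(text).decode("ascii")


print(decode_binary("01001000 01101001"))  # Hi
print(decode_hex("48 69"))                 # Hi
print(decode_base64("SGk="))               # Hi
```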
Symmetric encryption is when both the sender and the receiver of a message have the same key, which is used both to encrypt and decrypt a message.
Substitution ciphers (like the ones mentioned above) are technically symmetric encryption, but modern symmetric encryption can be much more complicated.
A stream of symbols at least as long as the message that is being encrypted can be used to encode and decode the message. Here is a very simple example, where the key is made of numbers that are added to the ASCII symbols to encode and then subtracted to decode. I used single-digit numbers for the key for simplicity.
| Key Added to ASCII Values | 73 | 105 | 115 | 112 | 119 | 53 | 34 | 122 | 112 | 118 | 104 | 94 | 42 |
| ASCII Symbols of Encoded Message | I | i | s | p | w | 5 | " | z | p | v | h | ^ | * |
Without the key, it can be impossible to decipher the encoded message. An important aspect of this is the key not repeating itself. If the key repeats itself or is reused, then statistics and the frequencies of letters and words can be used to crack the code. This is one of the reasons for the one-time pad, a list of keys that are each used once and never reused. This type of encryption could also be applied to every bit of a binary message or in other ways that could be much more secure than just adding numbers to ASCII values like I did in the example.
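Here is a small sketch of the add-and-subtract idea, followed by the bit-level version using XOR. The message, the key, and the function names are all invented for this illustration and are not the values from the table. The XOR helper draws its key from Python's `secrets` module so that the key is random and exactly as long as the message.

```python
import secrets


def add_key(message, key):
    """Encode by adding one key number to each character's ASCII value."""
    if len(key) < len(message):
        raise ValueError("the key must be at least as long as the message")
    return "".join(chr(ord(ch) + k) for ch, k in zip(message, key))


def subtract_key(encoded, key):
    """Decode by subtracting the same key numbers again."""
    return "".join(chr(ord(ch) - k) for ch, k in zip(encoded, key))


def xor_bytes(data, key):
    """Bit-level version: XOR each byte with a key byte (same call decodes)."""
    if len(key) < len(data):
        raise ValueError("the key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


key = [1, 4, 9]                  # hypothetical single-digit key
encoded = add_key("Hi!", key)
print(encoded)                   # Im*
print(subtract_key(encoded, key))  # Hi!

message = b"meet me at noon"
pad = secrets.token_bytes(len(message))  # random key, used only once
ciphertext = xor_bytes(message, pad)
print(xor_bytes(ciphertext, pad))  # b'meet me at noon'
```

XOR is convenient because the same operation both encrypts and decrypts, so one function covers both directions.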
The main problem with good symmetric key encryption is finding a secure way to share the symmetric key. If someone finds out what the symmetric keys are, then the code is totally cracked, so finding a secure way to transfer that information, either in person or with another type of encryption, is important. Methods like Diffie-Hellman key exchange make it possible to communicate and establish a shared key that is secret, even if others have tapped into the communication.
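To give a feel for how Diffie-Hellman works, here is a toy version with deliberately tiny numbers. The prime 23 and base 5 are a common textbook choice and the function names are made up for this sketch. Real systems use primes that are thousands of bits long (or elliptic curves) and a vetted library, never hand-rolled code like this.

```python
import secrets

P = 23  # public prime modulus (far too small for real use)
G = 5   # public base


def make_private_key():
    """Pick a secret number between 1 and P - 2."""
    return secrets.randbelow(P - 2) + 1


def public_value(private_key):
    """The value that is sent in the open: G^private mod P."""
    return pow(G, private_key, P)


def shared_secret(their_public, my_private):
    """Combine the other side's public value with my own private key."""
    return pow(their_public, my_private, P)


# Fixed private keys so the numbers can be followed by hand.
alice_private, bob_private = 6, 15
alice_public = public_value(alice_private)  # 8
bob_public = public_value(bob_private)      # 19

# An eavesdropper sees P, G, 8, and 19, but neither private key.
print(shared_secret(bob_public, alice_private))  # 2
print(shared_secret(alice_public, bob_private))  # 2
```

Both sides arrive at the same number without ever sending it, and that number can then serve as the symmetric key.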
Asymmetric encryption: a private and public key
The other main type of encryption is asymmetric encryption, where each person has a private key and a public key. If I want to write a message to John, I would look up John’s public key and use it to encode my message, and then he would use his private key to decipher it.
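One well-known asymmetric scheme is RSA, and a toy version shows the public/private split in action. This is only a sketch with tiny textbook primes (61 and 53), made-up function names, and no padding, so it is not secure in any way. It assumes Python 3.8 or later for `pow(E, -1, PHI)`, which computes the modular inverse.

```python
# Toy key generation for John (tiny primes, for illustration only).
P, Q = 61, 53
N = P * Q                # 3233, part of both keys
PHI = (P - 1) * (Q - 1)  # 3120
E = 17                   # public exponent: (E, N) is the public key
D = pow(E, -1, PHI)      # private exponent 2753: (D, N) is the private key


def encrypt(message, e, n):
    """Encrypt each character's code point with the public key."""
    return [pow(ord(ch), e, n) for ch in message]


def decrypt(numbers, d, n):
    """Decrypt each number with the private key and rebuild the text."""
    return "".join(chr(pow(c, d, n)) for c in numbers)


ciphertext = encrypt("Hi John", E, N)  # anyone can do this with the public key
print(ciphertext)
print(decrypt(ciphertext, D, N))       # Hi John (only John knows D)
```

Encrypting one character at a time makes this no stronger than a substitution cipher. It is done here only to keep the numbers small enough to follow.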
One theoretical problem with asymmetric encryption is that the public key is public. A hacker could publish a fake public key under someone else’s name, and then intercept the messages encoded with that key in order to decipher them. So there needs to be a way to verify that public keys are real and are connected to the actual recipient of the message.
Because asymmetric encryption is much more complex and takes more time and computing power than symmetric encryption, it will commonly be used only to encrypt the symmetric key that is then used for the rest of the conversation.
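The combination described above can be sketched by wrapping a random symmetric key with a toy asymmetric key. Everything here is an assumption made for illustration: the tiny RSA numbers (3233, 17, 2753), the XOR cipher standing in for a real symmetric cipher, and the function names. Real protocols use vetted algorithms and libraries for both halves.

```python
import secrets

# Recipient's toy RSA key pair (tiny textbook numbers, not secure).
N, E, D = 3233, 17, 2753


def xor_bytes(data, key):
    """Stand-in symmetric cipher: XOR each byte with a key byte."""
    return bytes(d ^ k for d, k in zip(data, key))


def hybrid_encrypt(message):
    """Encrypt the message symmetrically and wrap the key asymmetrically."""
    session_key = secrets.token_bytes(len(message))
    # The slow asymmetric step is applied only to the session key.
    wrapped_key = [pow(byte, E, N) for byte in session_key]
    return wrapped_key, xor_bytes(message, session_key)


def hybrid_decrypt(wrapped_key, ciphertext):
    """Unwrap the session key with the private key, then decrypt the message."""
    session_key = bytes(pow(value, D, N) for value in wrapped_key)
    return xor_bytes(ciphertext, session_key)


wrapped_key, ciphertext = hybrid_encrypt(b"see you at noon")
print(hybrid_decrypt(wrapped_key, ciphertext))  # b'see you at noon'
```

In this toy the key is as long as the message, so nothing is saved. In practice the session key is short and fixed in size, while the message it protects can be any length.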
While codes can be fun to play with, because most people now have access to the Internet and use it for everything from games to medical records and banking, encryption has become a vital part of keeping information private. More complex cryptography techniques are constantly developed and cracked. If you’re interested in learning more about cryptography, check out The Code Book and the other books I mentioned.
Bonus round: Can you crack the codes?
Vs guvf zrffntr unq orra jevggra va Rfcrenagb, vg jbhyq unir orra zhpu uneqre gb penpx.
49 66 20 79 6f 75 20 77 61 6e 74 20 74 6f 20 77 6f 72 6b 20 61 74 20 74 68 65 20 62 65 73 74 20 63 6f 6d 70 61 6e 79 20 69 6e 20 74 68 65 20 77 6f 72 6c 64 2c 20 63 68 65 63 6b 20 6f 75 74 20 4c 75 63 69 64 21
Sometimes computers don’t work do what you want them to at when you really need it. Llike when a battery dies right bufore a precentation. I We all know what that feels like, but some people improvise so well you woulddn’t even know they were having a bad day.