Oxford University Press's
Academic Insights for the Thinking World

# Codes and Ciphers

My book group recently read a 2017 mystery called The Lost Book of the Grail by Charlie Lovett. In the novel, an English bibliophile and an American digitizer track down a mysterious book thought to lead to the Holy Grail. The chief clue: a secret message hidden in the rare books collection of the fictional Barchester Cathedral Library. The message is a complex polyalphabetic substitution cipher that can only be solved by finding key words hidden in the books. Coded messages are common plot devices, used not just by Dan Brown but also by Edgar Allan Poe, Sir Arthur Conan Doyle, Jules Verne, Dorothy Sayers, Agatha Christie, and Neal Stephenson, among many others.

Aficionados distinguish between codes and ciphers. They also talk about steganography, which involves hiding messages, sometimes covertly, as in a microdot, and sometimes in plain sight, as when the first letters of the paragraphs of a text spell out a word. Aficionados also refer to anagrams, which are expressions made by rearranging the letters (or numbers) of another expression. My name, for example, anagrams as BATTLED WAISTLINE.

What is the difference between a code and a cipher? A code is a technique for rendering one set of meanings using other, usually shorter, symbols. In early Morse Code telegraphy, for example, a word in the code book could be used to stand for a whole sentence or phrase, enabling efficient messaging. Stenographers and journalists use shorthand, and the US Secret Service uses code names for its protectees, such as Lancer (for JFK) and Rawhide (for Ronald Reagan). A cipher, by contrast, systematically alters a message by some algorithm, such as substituting one symbol for another. Cryptography covers both ciphers and codes.

How do ciphers work? The classic example is the Caesar shift, an encryption in which each character is replaced by the one a fixed number of places away in the alphabet. Julius Caesar’s encrypted messages were said to use a shift of three characters to the left. With that shift, Edwin Battistella would become BATFK YXQQFPQBIIX. Simple ciphers like the Caesar shift are (said to be) easy to decrypt.
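For readers who want to experiment, here is a minimal Python sketch of the shift just described. The function name `caesar_shift` and the choice to uppercase everything are illustrative assumptions, not part of any historical method; a negative shift moves letters to the left.

```python
import string

def caesar_shift(text: str, shift: int) -> str:
    """Shift each letter by `shift` places, wrapping around the alphabet.

    Negative shifts move left (toward A). Non-letters pass through
    unchanged, and the output is uppercased to match the example above.
    """
    alphabet = string.ascii_uppercase
    result = []
    for ch in text.upper():
        if ch in alphabet:
            result.append(alphabet[(alphabet.index(ch) + shift) % 26])
        else:
            result.append(ch)
    return "".join(result)

encrypted = caesar_shift("Edwin Battistella", -3)
print(encrypted)                   # BATFK YXQQFPQBIIX
print(caesar_shift(encrypted, 3))  # EDWIN BATTISTELLA
```

Because there are only 25 useful shifts, trying each one in turn is enough to recover a message, which is one reason the cipher is considered easy to break.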

In literary works, such as mystery and spy fiction, encrypted messages can be used as plot devices, obstacles for the protagonists to overcome. Or they can be part of the plot itself, where the technique of decipherment is a major part of the story. In Arthur Conan Doyle’s short story “The Adventure of the Dancing Men,” a woman named Elsie Patrick is harassed by coded messages in which each character looks like a dancing person. Realizing the messages are written in a substitution cipher, Sherlock Holmes deciphers them by analyzing the frequency of the symbols. He explains to Watson that “E is the most common letter in the English alphabet, and it predominates to so marked an extent that even in a short sentence one would expect to find it most often.” Noting that T, A, O, I, N, S, H, R, D, and L are the next most frequent letters, he quickly deciphers the message, which says, “Elsie, prepare to meet thy God.”
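Holmes’s counting step is easy to try. The sketch below is only an illustration: it tallies letters in the deciphered sentence rather than dancing-men symbols, and the helper name `letter_frequencies` is made up for this example.

```python
from collections import Counter

def letter_frequencies(text: str) -> Counter:
    """Count how often each letter occurs, ignoring case and non-letters."""
    return Counter(ch for ch in text.upper() if ch.isalpha())

message = "Elsie, prepare to meet thy God."
counts = letter_frequencies(message)

# List the letters from most to least frequent; E comes out on top.
for letter, count in counts.most_common():
    print(letter, count)
```

A real attack runs the same tally on the cipher symbols and then guesses that the most frequent symbol stands for E.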

Sometimes the cipher appears quite complex. Edgar Allan Poe used one as a plot device in his story “The Gold Bug.” The cipher was supposedly devised by Captain William Kidd, the Scottish pirate, and gives directions to his buried treasure. It’s a simple letter-to-symbol cipher using numbers and punctuation marks, but with no spaces between words. Poe’s fictional cryptographer solves the cipher by using frequency analysis. You can give it a try yourself. Here’s a clue: the letters E T A O I N S H R D L are represented by 8 ; 5 ‡ 6 * ) 4 ( † 0.

53‡‡†305))6*;4826)4‡.)4‡);806*;48†8

¶60))85;;]8*;:‡*8†83(88)5*†;46(;88*96

*?;8)*‡(;485);5*†2:*‡(;4956*2(5*—4)8

¶8*;4069285);)6†8)4‡‡;1(‡9;48081;8:8‡

1;48†85;4)485†528806*81(‡9;48;(88;4

(‡?34;48)4‡;161;:188;‡?;
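If you would rather let a computer do the bookkeeping, the following sketch applies the eleven-letter clue to the first line of the cipher and marks every symbol it cannot yet translate with an underscore. The names `CLUE` and `apply_clue` are assumptions made for this illustration; the remaining symbols still have to be guessed from context.

```python
# Clue from the text: E T A O I N S H R D L are written as 8 ; 5 ‡ 6 * ) 4 ( † 0
LETTERS = "ETAOINSHRDL"
SYMBOLS = "8;5‡6*)4(†0"
CLUE = dict(zip(SYMBOLS, LETTERS))

def apply_clue(ciphertext: str, unknown: str = "_") -> str:
    """Replace known symbols with letters; mark everything else as unknown."""
    return "".join(CLUE.get(symbol, unknown) for symbol in ciphertext)

first_line = "53‡‡†305))6*;4826)4‡.)4‡);806*;48†8"
print(apply_clue(first_line))
```

Running it leaves only a handful of blanks in the first line, which is usually enough to start guessing whole words.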

Figuring out the cipher in The Lost Book of the Grail was more complex, and the deciphering takes place over many pages of the novel. Frequency analysis leads the protagonists to the letters U, Q and D, which they associate with the Latin words unus, quinque and decem: 1, 5, and 10. The numbers point to books and chapters in the library’s medieval manuscript collection where the key words are found. That discovery allows the cipher to be decrypted by using the key to partially scramble the alphabet. So the keyword corpus goes first, followed by the rest of the English alphabet minus the letters in the key. The keyed English alphabet is then aligned with the slightly shorter Latin alphabet (missing J and W, which were absent in classical Latin).

C O R P U S A B D E F G H I J K L M N Q T V W X Y Z

A B C D E F G H I K L M N O P Q R S T U V X Y Z

Ultimately, the key allows the protagonists to decipher strings like JULMCURQF CMQJLCHIQ UGBCULUFD as PERSAECUL ASUPRANOV EMHAERELI or per saecula supra novem hae reli-. Finding successive keys and applying them to further bits of text, they decipher the full Latin message. It’s a complex puzzle spread over nearly sixty pages.
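The keyed-alphabet idea can be sketched in a few lines of Python. This is an illustration built only from the two alphabet rows printed above, not the novel’s full procedure, and the function names are invented for the example. One wrinkle: the rows as printed pair Q with U, so the sketch produces NOU where the transcription above has NOV; Latin spelling treats U and V as interchangeable.

```python
import string

def keyed_alphabet(keyword: str) -> str:
    """Keyword letters first (duplicates dropped), then the unused letters A-Z."""
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch not in seen:
            seen.append(ch)
    return "".join(seen)

# The 24-letter Latin alphabet described in the text: no J and no W.
LATIN = "".join(ch for ch in string.ascii_uppercase if ch not in "JW")

def decipher(ciphertext: str, keyword: str) -> str:
    """Map each keyed-alphabet letter to the Latin letter in the same position."""
    # zip stops at the shorter alphabet, so the last two keyed letters
    # (Y and Z for this keyword) have no partner and pass through unchanged.
    table = dict(zip(keyed_alphabet(keyword), LATIN))
    return "".join(table.get(ch, ch) for ch in ciphertext.upper())

print(keyed_alphabet("corpus"))
# CORPUSABDEFGHIJKLMNQTVWXYZ
print(decipher("JULMCURQF CMQJLCHIQ UGBCULUFD", "corpus"))
# PERSAECUL ASUPRANOU EMHAERELI
```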

Not all secrets are so complex. In The Da Vinci Code, symbologist Robert Langdon is confronted with, among other clues, the lines:

13-3-2-21-1-1-8-5

O, Draconian devil!

Oh, lame saint!

Each line is an anagram. “O, Draconian devil” yields “Leonardo Da Vinci” and “Oh, lame saint” becomes “The Mona Lisa.” The line of numbers is a scrambled version of the beginning of the Fibonacci sequence, in which each number after the first two is the sum of the two preceding numbers: 1-1-2-3-5-8-13-21. It later turns out to be the combination to a lockbox.
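Both claims are easy to check mechanically. The sketch below is a hedged illustration with made-up helper names: `is_anagram` compares letter counts while ignoring case, spaces, and punctuation, and `fibonacci` builds the sequence from two starting 1s.

```python
from collections import Counter

def is_anagram(a: str, b: str) -> bool:
    """True if a and b use the same letters and digits the same number of times."""
    def tally(s: str) -> Counter:
        return Counter(ch for ch in s.lower() if ch.isalnum())
    return tally(a) == tally(b)

def fibonacci(n: int) -> list:
    """Return the first n Fibonacci numbers, starting 1, 1."""
    seq = [1, 1]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq[:n]

print(is_anagram("O, Draconian devil!", "Leonardo Da Vinci"))  # True
print(is_anagram("Oh, lame saint!", "The Mona Lisa"))          # True

# Sorting the scrambled numbers should restore the start of the sequence.
scrambled = [int(x) for x in "13-3-2-21-1-1-8-5".split("-")]
print(sorted(scrambled) == fibonacci(8))                       # True
```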

Hidden messages, from anagrams to codes and ciphers, are part of a long literary tradition. Take some time to enjoy them or create one yourself.

Featured image credit: “Enlightening Math” by John Moeses Bauan. CC0 via Unsplash.