The Code Book (4 page)

Read The Code Book Online

Authors: Simon Singh

Tags: ##genre

BOOK: The Code Book
6.01Mb size Format: txt, pdf, ePub

Figure 3
The Caesar cipher applied to a short message. The Caesar cipher is based on a cipher alphabet that is shifted a certain number of places (in this case three), relative to the plain alphabet. The convention in cryptography is to write the plain alphabet in lower-case letters, and the cipher alphabet in capitals. Similarly, the original message, the plaintext, is written in lower case, and the encrypted message, the ciphertext, is written in capitals.

Although Suetonius mentions only a Caesar shift of three places, it is clear that by using any shift between 1 and 25 places it is possible to generate 25 distinct ciphers. In fact, if we do not restrict ourselves to shifting the alphabet and permit the cipher alphabet to be any rearrangement of the plain alphabet, then we can generate an even greater number of distinct ciphers. There are over 400,000,000,000,000,000,000,000,000 such rearrangements, and therefore the same number of distinct ciphers.

Each distinct cipher can be considered in terms of a general encrypting method, known as the
algorithm
, and a
key
, which specifies the exact details of a particular encryption. In this case, the algorithm involves substituting each letter in the plain alphabet with a letter from a cipher alphabet, and the cipher alphabet is allowed to consist of any rearrangement of the plain alphabet. The key defines the exact cipher alphabet to be used for a particular encryption. The relationship between the algorithm and the key is illustrated in
Figure 4
.

An enemy studying an intercepted scrambled message may have a strong suspicion of the algorithm, but would not know the exact key. For example, they may well suspect that each letter in the plaintext has been replaced by a different letter according to a particular cipher alphabet, but they are unlikely to know which cipher alphabet has been used. If the cipher alphabet, the key, is kept a closely guarded secret between the sender and the receiver, then the enemy cannot decipher the intercepted message. The significance of the key, as opposed to the algorithm, is an enduring principle of cryptography. It was definitively stated in 1883 by the Dutch linguist Auguste Kerckhoffs von Nieuwenhof in his book
La Cryptographie militaire:
“Kerckhoffs’ Principle: The security of a cryptosystem must not depend on keeping secret the crypto-algorithm. The security depends only on keeping secret the key.”

Figure 4
To encrypt a plaintext message, the sender passes it through an encryption algorithm. The algorithm is a general system for encryption, and needs to be specified exactly by selecting a key. Applying the key and algorithm together to a plaintext generates the encrypted message, or ciphertext. The ciphertext may be intercepted by an enemy while it is being transmitted to the receiver, but the enemy should not be able to decipher the message. However, the receiver, who knows both the key and the algorithm used by the sender, is able to turn the ciphertext back into the plaintext message.

In addition to keeping the key secret, a secure cipher system must also have a wide range of potential keys. For example, if the sender uses the Caesar shift cipher to encrypt a message, then encryption is relatively weak because there are only 25 potential keys. From the enemy’s point of view, if they intercept the message and suspect that the algorithm being used is the Caesar shift, then they merely have to check the 25 possibilities. However, if the sender uses the more general substitution algorithm, which permits the cipher alphabet to be any rearrangement of the plain alphabet, then there are 400,000,000,000,000,000,000,000,000 possible keys from which to choose. One such is shown in
Figure 5
. From the enemy’s point of view, if the message is intercepted and the algorithm is known, there is still the horrendous task of checking all possible keys. If an enemy agent were able to check one of the 400,000,000,000,000,000,000,000,000 possible keys every second, it would take roughly a billion times the lifetime of the universe to check all of them and decipher the message.

Figure 5
An example of the general substitution algorithm, in which each letter in the plaintext is substituted with another letter according to a key. The key is defined by the cipher alphabet, which can be any rearrangement of the plain alphabet.

The beauty of this type of cipher is that it is easy to implement, but provides a high level of security. It is easy for the sender to define the key, which consists merely of stating the order of the 26 letters in the rearranged cipher alphabet, and yet it is effectively impossible for the enemy to check all possible keys by the so-called brute-force attack. The simplicity of the key is important, because the sender and receiver have to share knowledge of the key, and the simpler the key, the less the chance of a misunderstanding.

In fact, an even simpler key is possible if the sender is prepared to accept a slight reduction in the number of potential keys. Instead of randomly rearranging the plain alphabet to achieve the cipher alphabet, the sender chooses a
keyword
or
keyphrase
. For example, to use JULIUS CAESAR as a keyphrase, begin by removing any spaces and repeated letters (JULISCAER), and then use this as the beginning of the jumbled cipher alphabet. The remainder of the cipher alphabet is merely the remaining letters of the alphabet, in their correct order, starting where the keyphrase ends. Hence, the cipher alphabet would read as follows.

The advantage of building a cipher alphabet in this way is that it is easy to memorize the keyword or keyphrase, and hence the cipher alphabet. This is important, because if the sender has to keep the cipher alphabet on a piece of paper, the enemy can capture the paper, discover the key, and read any communications that have been encrypted with it. However, if the key can be committed to memory it is less likely to fall into enemy hands. Clearly the number of cipher alphabets generated by keyphrases is smaller than the number of cipher alphabets generated without restriction, but the number is still immense, and it would be effectively impossible for the enemy to unscramble a captured message by testing all possible keyphrases.

This simplicity and strength meant that the substitution cipher dominated the art of secret writing throughout the first millennium
A.D
. Codemakers had evolved a system for guaranteeing secure communication, so there was no need for further development-without necessity, there was no need for further invention. The onus had fallen upon the codebreakers, those who were attempting to crack the substitution cipher. Was there any way for an enemy interceptor to unravel an encrypted message? Many ancient scholars considered that the substitution cipher was unbreakable, thanks to the gigantic number of possible keys, and for centuries this seemed to be true. However, codebreakers would eventually find a shortcut to the process of exhaustively searching all keys. Instead of taking billions of years to crack a cipher, the shortcut could reveal the message in a matter of minutes. The breakthrough occurred in the East, and required a brilliant combination of linguistics, statistics and religious devotion.

The Arab Cryptanalysts

At the age of about forty, Muhammad began regularly visiting an isolated cave on Mount Hira just outside Mecca. This was a retreat, a place for prayer, meditation and contemplation. It was during a period of deep reflection, around
A.D
. 610, that he was visited by the archangel Gabriel, who proclaimed that Muhammad was to be the messenger of God. This was the first of a series of revelations which continued until Muhammad died some twenty years later. The revelations were recorded by various scribes during the Prophet’s life, but only as fragments, and it was left to Abū Bakr, the first caliph of Islam, to gather them together into a single text. The work was continued by Umar, the second caliph, and his daughter Hafsa, and was eventually completed by Uthmān, the third caliph. Each revelation became one of the 114 chapters of the Koran.

The ruling caliph was responsible for carrying on the work of the Prophet, upholding his teachings and spreading his word. Between the appointment of Abū Bakr in 632 to the death of the fourth caliph, Alī, in 661, Islam spread until half of the known world was under Muslim rule. Then in 750, after a century of consolidation, the start of the Abbasid caliphate (or dynasty) heralded the golden age of Islamic civilization. The arts and sciences flourished in equal measure. Islamic craftsmen bequeathed us magnificent paintings, ornate carvings, and the most elaborate textiles in history, while the legacy of Islamic scientists is evident from the number of Arabic words that pepper the lexicon of modern science such as
algebra, alkaline
and
zenith
.

The richness of Islamic culture was to a large part the result of a wealthy and peaceful society. The Abbasid caliphs were less interested than their predecessors in conquest, and instead concentrated on establishing an organized and affluent society. Lower taxes encouraged businesses to grow and gave rise to greater commerce and industry, while strict laws reduced corruption and protected the citizens. All of this relied on an effective system of administration, and in turn the administrators relied on secure communication achieved through the use of encryption. As well as encrypting sensitive affairs of state, it is documented that officials protected tax records, demonstrating a widespread and routine use of cryptography. Further evidence comes from many administrative manuals, such as the tenth-century
Adab al-Kuttāb
(“The Secretaries’ Manual”), which include sections devoted to cryptography.

The administrators usually employed a cipher alphabet which was simply a rearrangement of the plain alphabet, as described earlier, but they also used cipher alphabets that contained other types of symbols. For example, a in the plain alphabet might be replaced by # in the cipher alphabet, b might be replaced by +, and so on. The
monoalphabetic substitution cipher
is the general name given to any substitution cipher in which the cipher alphabet consists of either letters or symbols, or a mix of both. All the substitution ciphers that we have met so far come within this general category.

Other books

Satan's Pony by Robin Hathaway
The Quick Red Fox by John D. MacDonald
The Nerdy Dozen #2 by Jeff Miller
Trash by Dorothy Allison
Love Me Crazy by Camden Leigh
The Montgomery Murder by Cora Harrison
No Greater Love by Janet MacLeod Trotter
Accelerando by Charles Stross