Enigma Machine
Sunday, May 24, 2009
Polyalphabetic Ciphers - Playfair
Polygraphic ciphers allow the encoding of multiple characters at one time (rather than substituting a single character at a time). There are a number of schemes that encode 2 or more characters simulataneously. One of the most popular of these is the Playfair cipher.
Playfair used a 5x5 box of characters. A keyword is chosen (with all unique characters.) The remaining characters of the alphabet are listed in alphabetical order left to right and top to bottom. One more note, the 5x5 box fits only 25 charcters. The letters i and j share a cell. As an example, using the keyword sleuth, the box would be:
S L E U T
H A B C D
F G IJ K M
N O P Q R
V W X Y Z
The message to be encoded, is divided into pairs of letters. These pairs are called a digraph. If there is a repeated letter in a digraph, an x is inserted between the letters and the digraphs are reorganized. Also, in cases where the messages have an odd number of characters, an x is added at the end.
Now, there are three different ways the letters can be related in the 5x5 encoding box. They can:
1.Both be in the same row
2.Both be in the same column
3.Be neither in the same row nor the same column.
If they are in the same row, substitute the next character in that row for each of the two characters. If you are at the end of a row, wrap around to the first character in the same row. Thus, UT becomes TS.
If the letters are in the same column, use the character below each letter in the box. If the letter is at the bottom of a column, use the character at the top of the column. For example, GO becomes OW.
If the letters are in different rows and different columns, then do the following:
1.Starting at the first letter, scan across the row until you are in the same column as the second letter (remember you can wrap around to the beginning of the row if this letter is further 'right' in the box). Use that letter as the ciphertext.
2.Starting at the second letter, scan across its row until you are in the same column as the first letter (again wrapping around if necessary).
Here is an example of encoding that digraph: If you have NE as your plain text, start with N and travel to the right until you are in the same column as E (column 3). The letter is P. Now starting at E, scan to the right (and wrap around) to end over the letter N (column 1). The letter is S. So NE becomes PS.
Note that in this case, the two plaintext letters and the resulting ciphertext letters form a small box within the 5x5 encoding box.
Let's try a full message. With the keyword SLEUTH, we now encrypt the message:
This is so silly. It is elementary my dear watson.
Removing punctuation and pairing letters, we get:
TH IS IS SO SI LX LY IT IS EL EM EN TA RY MY DE AR WA TS ON
And the resulting cipher is:
SD FE FE LN EF EW UW ME FE UE TI SP LD QZ KZ BT DO LG SL PO
Regrouping as we should in collections of 5 letters we get:
SDFEF ELNEF EWUWM EFEUE TISPL DQZKZ BTDOL GSLPO
This is quite a mess, but a cryptanalyst does have some ways to try and break the encoding. We will look at those techniques in a future column.
Friday, May 15, 2009
Polyalphabetic Ciphers
Well, I've been gone for a while. Things have been busy. I'm going to try to be more regular at this again.
So let's start talking about more substitution ciphers. This time, we'll look at Polyalphabetic ciphers. Remember that Monoalphabetic substitution ciphers used an alphabet that mapped your plaintext characters into the associated ciphertext character. But since there was only one map or table, the same substitution takes place each time a particular character shows up (i.e. 's' always becomes 't' when the alphabet is shifted by 1 place). This is a problem because it allows someone trying to break your code to use character frequency counting to guess at which characters translate to certain frequently used characters. Also, doubled characters are easy to pick out (i.e. 'see' becomes 'tff' since 'e' always becomes 'f').
Starting in the 1500s and 1600s, some people began devising ways of using multiple alphabets within an encoded message (hence, the term 'polyalphabetic'). These maps or tables were either fixed (as with Trimethius' Tabula Recta) or could be based on key words or phrases (Vignere's table). The former used 24 alphabets each shifted 1 additional character to the left. The characters 'i' and 'j' were considered equivalent as well as 'u' and 'v' leaving 24 different alphabets. Successive characters in a message used the next alphabet down for translation. After 24 characters, the encoding with start with the original alphabet again. This masked patterns found in monoalphabetic methods (i.e. the double character problem). Vignere's table as a square with the alphabet across the top and down the left side. A keyword or phrase would be picked and that would be used as the key 'alphabet' to use for each successive character. The keyword was repeated as many times as needed to cover the original message.
Here is part of the table:
a b c d e f g h i …
A A B C D E F G H I …
B B C D E F G H I J …
C C D E F G H I J K …
D D E F G H I J K L
...
The full table is 26x26. The labels on the top (lowercase) are the letters from the plaintext. The letters along the side are from the 'keyword' and select the alphabet used for each successive letter. Using the keyword 'CAB', we can encode the word 'cage' as follows:
c a g e
C A B C
e a h g
The final ciphertext is 'eahg'. For longer messages, this would skew character count statistics. Additionally, using a longer keyword or phrase would be much stronger. You may want to write out the whole 26x26 table and try this with a much longer message and a longer keyword or phrase. Next time, we will look at other polyalphabetic ciphers with some more interesting properties.
Enjoy.