Algorithm Sundays: Encrypting and Decrypting the Vigenère Cipher

We're back for another Algorithm Challenge. Today we will be looking at a cipher that was considered to be unbreakable for over three centuries. It was only in the middle of the 19th century, that the code breakers finally caught up with this ingenious method of encryption. The Vigenère Cipher originated at a time when cryptography was flourishing. Encrypting sensitive messages had not only become an important part of military and diplomatic communication but found its way into private households as well. Talented cryptanalysts were sought out across nations to decipher encrypted messages of enemies. More and more ciphers were broken by those cryptanalysts and in consequence, did not offer the security that was needed to ensure safe communication anymore. It was at that time that new ciphers were popping up on a regular basis, claiming to be unbreakable. It turned out, that most ciphers were simple variations of the caesar cipher and could be broken almost as easily by conducting a frequency analysis of the encrypted text. I won't go into further detail about the frequency analysis in this post, if you are curious how it works, you can find a good write-up on practical cryptography.

What made the Vigenère Cipher so special? The most important part to the cipher is the so called Vigenère Square, a combination of 26 Caesar Ciphers, which were shifted by 1 value in each row:

ABCDEFGHIJKLMNOPQRSTUVWXYZ
AABCDEFGHIJKLMNOPQRSTUVWXYZ
BBCDEFGHIJKLMNOPQRSTUVWXYZA
CCDEFGHIJKLMNOPQRSTUVWXYZAB
DDEFGHIJKLMNOPQRSTUVWXYZABC
EEFGHIJKLMNOPQRSTUVWXYZABCD
FFGHIJKLMNOPQRSTUVWXYZABCDE
GGHIJKLMNOPQRSTUVWXYZABCDEF
HHIJKLMNOPQRSTUVWXYZABCDEFG
IIJKLMNOPQRSTUVWXYZABCDEFGH
JJKLMNOPQRSTUVWXYZABCDEFGHI
KKLMNOPQRSTUVWXYZABCDEFGHIJ
LLMNOPQRSTUVWXYZABCDEFGHIJK
MMNOPQRSTUVWXYZABCDEFGHIJKL
NNOPQRSTUVWXYZABCDEFGHIJKLM
OOPQRSTUVWXYZABCDEFGHIJKLMN
PPQRSTUVWXYZABCDEFGHIJKLMNO
QQRSTUVWXYZABCDEFGHIJKLMNOP
RRSTUVWXYZABCDEFGHIJKLMNOPQ
SSTUVWXYZABCDEFGHIJKLMNOPQR
TTUVWXYZABCDEFGHIJKLMNOPQRS
UUVWXYZABCDEFGHIJKLMNOPQRST
VVWXYZABCDEFGHIJKLMNOPQRSTU
WWXYZABCDEFGHIJKLMNOPQRSTUV
XXYZABCDEFGHIJKLMNOPQRSTUVW
YYZABCDEFGHIJKLMNOPQRSTUVWX
ZZABCDEFGHIJKLMNOPQRSTUVWXY

Once we have prepared the square, the encyrption process is quite straightforward. We first define a message: LETSMEETATFIVE and a key that is easy to share: CAKE.

Now we match the key with our message. This is done by repeating the key over and over again until we've covered all characters of our message:

LETSMEETATFIVE
CAKECAKECAKECA

In the next step, we start looking up values in the Vigenère Square. Our first pair is L and C. We will look in the column L and the row C. Our encrypted value is N. For the next pair we have E and A. The value for column E and row A is E. With this pattern we continue to match pairs and create the encrypted message. To visualize this process, I've highlighted the matched pairs in the Vigenère Square below:

ABCDEFGHIJKLMNOPQRSTUVWXYZ
AABCDEFGHIJKLMNOPQRSTUVWXYZ
BBCDEFGHIJKLMNOPQRSTUVWXYZA
CCDEFGHIJKLMNOPQRSTUVWXYZAB
DDEFGHIJKLMNOPQRSTUVWXYZABC
EEFGHIJKLMNOPQRSTUVWXYZABCD
FFGHIJKLMNOPQRSTUVWXYZABCDE
GGHIJKLMNOPQRSTUVWXYZABCDEF
HHIJKLMNOPQRSTUVWXYZABCDEFG
IIJKLMNOPQRSTUVWXYZABCDEFGH
JJKLMNOPQRSTUVWXYZABCDEFGHI
KKLMNOPQRSTUVWXYZABCDEFGHIJ
LLMNOPQRSTUVWXYZABCDEFGHIJK
MMNOPQRSTUVWXYZABCDEFGHIJKL
NNOPQRSTUVWXYZABCDEFGHIJKLM
OOPQRSTUVWXYZABCDEFGHIJKLMN
PPQRSTUVWXYZABCDEFGHIJKLMNO
QQRSTUVWXYZABCDEFGHIJKLMNOP
RRSTUVWXYZABCDEFGHIJKLMNOPQ
SSTUVWXYZABCDEFGHIJKLMNOPQR
TTUVWXYZABCDEFGHIJKLMNOPQRS
UUVWXYZABCDEFGHIJKLMNOPQRST
VVWXYZABCDEFGHIJKLMNOPQRSTU
WWXYZABCDEFGHIJKLMNOPQRSTUV
XXYZABCDEFGHIJKLMNOPQRSTUVW
YYZABCDEFGHIJKLMNOPQRSTUVWX
ZZABCDEFGHIJKLMNOPQRSTUVWXY

If we follow this pattern for every letter of our message, the encrypted version will look like this:

LETSMEETATFIVE
NEDWOEOXCTPMXE

If we know the key, we are able to revert the changes by repeating the key over all the letters like we did before.

NEDWOEOXCTPMXE
CAKECAKECAKECA

Looking up the correct values in the Vigenère square is simple. We take the current letter of our key (for example the first one, 'C') and locate the encrypted letter 'N' in the 'C' row of our Vigenère square. The letter at the top of row 'C' at the position of 'N' is our decrypted letter ('L'). We then continue this pattern for every letter. We search for 'E' in row 'A' and restore the top letter in this column ('E'). Then we search for 'X' in row 'K' and take the top letter of that column ('T'). By repeating this a couple of times we will restore the entire message.

As you can see this is quite a lengthy process that is prone to human error. Now after this lengthy introduction, let's finally write some code and see how we can automate the encrpytion and decryption of messages.

We start by writing the encoding function that will accept our original message and a key as arguments. We already know that we will have to iterate over an alphabet. Since our computers do not know what an alphabet is, we have to declare a variable that we will use as a foundation for our iterations. I've also split this string into an array, to make it a bit easier to work with indexes.

If we run a for loop and log the positions of each letter in our key (cake) in the alphabet, we can log 2, 0, 10, 4. These numbers will tell us which rows to use for the lookup. If you scroll up to the original square you can see that each of these numbers corresponds to the correct row in our Vigenère square. (e.g C = row 2, A = row 0, K = row 10, E = row 4). Remember that we start counting at 0 and not at 1 (because the first index in our array is 0, not 1).

let key = 'cake';

var encode = function (message, key) {  
  var alphabet = 'abcdefghijklmnopqrstuvwxyz'.split('');
  for (var i = 0; i < key.length; i++) {
    console.log(alphabet.indexOf(key[i])); // logs 2, 0, 10, 4
  }
}

encode('letsmeetatfive', key);  

Now we need to shift our starting index correctly to reflect the right row in our square. The third row for example starts with a C and not with an A. If we look at the structure of the Vigenère square, we can see that each row has been moved forward by X letters, where X is the number of the row. So the third row has been moved ahead by two letters - it starts with C instead of A. The fifth row starts with E instead of A, it has been moved ahead by four letters. This is why we can take the number of the row that we were logging to the console before (2, 0, 10, 4) and move our starting index ahead by that amount of characters.

Now we have the correct starting index but we still need to define how many steps we need to move to the right to pick an encrypted character. If you look at the square above, the column we want to pick from is simply the index of each character of our message in the original alphabet (e.g. E = 4, B = 1, A = 0 etc.).

This means we take our modified starting index rowIndex and add the column colIndex, which will tell us the index of the right encrypted character:

We also need to continue looping over our key for the length of the entire message. Right now we complete our for loop after a single iteration of our key. In the code below i've wrapped the for loop in a while loop that continues to run until we have iterated over each character of our message. To keep track how far we've progressed through the message, I've added the variable letterCounter.

let key = 'cake';

var encode = function (message, key) {  
  var letterCounter = 0;
  var alphabet = 'abcdefghijklmnopqrstuvwxyz'.split('');
  var rowIndex;
  var colIndex;
  while (letterCounter < message.length) { // continue to run until we're done with our message
    for (var i = 0; i < key.length; i++) {
      alphabet = 'abcdefghijklmnopqrstuvwxyz'.split('');
      rowIndex = alphabet.indexOf(key[i]);
      colIndex = alphabet.indexOf(message[letterCounter]);
      letterCounter++; // increment our counter after each iteration
    }
  }
}

encode('letsmeetatfive', key);  

This code is still not quite what we need. First of all, we have a problem if we reach the last letter of our message within the for loop - it will then proceed to loop over our key but we're already of of characters in our message. To prevent this, we need to break of the loop. We also need to return our result from the function. Finnally, we need to add a way to handle indexes that go past our intial alphabet, imagine we have a row that starts near the end of the alphabet - in that case we need to make sure we reset the index and start back at the beginning of the alphabet. This can be achieved using the modulus operator. If we have an alphabet that is 27 characters long, and we have an index that is less than 27, it'll return that index in the alphabet (e.g. 23, 13 or 5). However, if it is higher than 27, we will in return have the remainder to work with. Let's say we are looking for an index of 30, we will then have a remainder of 3, which means we start back at the beginning of the alphabet and take the third letter.

let key = 'cake';

var encode = function (message, key) {  
  var result = [];
  var letterCounter = 0;
  var alphabet = 'abcdefghijklmnopqrstuvwxyz'.split('');
  var rowIndex;
  var colIndex;
  while (letterCounter < message.length) {
    for (var i = 0; i < key.length; i++) {
      rowIndex = alphabet.indexOf(key[i]);
      colIndex = alphabet.indexOf(message[letterCounter]);
      // using the modulus operator to make sure that we start back at the beginning of the alphabet
      result.push(alphabet[(rowIndex + colIndex) % alphabet.length]); 
      letterCounter++;
      // breaking from the loop if we've reached the end of our message
      if (letterCounter >= message.length) break;
    }
  }
  return result.join('');
}

encode('letsmeetatfive', key);  

Now we're getting the correct output. There are still a few things this code doesn't cover. How are whitespaces and special characters supposed to be handled? Does the key continue to run over those or do we pause the key and just ignore special characters? This would largely depend on the way that you want to implement the encryption, you could use a RegExp for example to only accept regular characters as input. As soon as there are any characters in the message that we did not specify in our alphabet array, we will run into issues. I will leave it to you to play around with the code and handle the edge cases as you see fit. Let me know your thoughts in the comment section.

As this post has gotten quite long already, I will follow up with an approach to decryption in the next post. Until then, I hope you enjoyed following along. If you found any mistakes in the code or have a suggestion to improve the examples, let me know in the comments.

Further down the rabbit hole