Vigenère Cipher in Python

A while ago I wrote a post on implementing the Caesar Shift Cipher in Python. I will now expand on the theme by implementing the Vigenère Cipher.

The Vigenère Cipher

The Vigenère Cipher was invented in 1553 by the Italian Giovan Battista Bellaso but is now erroniously named after the Frenchman Blaise de Vigenère. It was not broken until 1863 although these days it is highly insecure and is now only useful as an interesting programming exercise. If you want to read up on it in full check out the Wikipedia article

The problem with the Caesar Shift Cipher is that each letter is always enciphered the same way, so for example if E, the most common letter, is always enciphered as S then simple frequency analysis will reveal the shift if you have a reasonably long piece of encrypted text.

The Vigenère Cipher is more sophisticated in that a letter may be encrypted as any other letter depending on its position in the plaintext. It may even be encrypted as itself which the famous German WW2 Enigma machine could not do, providing the Allies with a way to crack the code.

The processes of both encipherement and decipherement involve the use of a table of letters, variously know as a Vigenère Square, Vigenère Table of Tabula Recta, which looks like this.

Tabula Recta used by Vigenère Cipher

To encipher a string of text you need a keyword. This can be anything but ideally should be of a reasonable length. For this example I will use "Imagination is more important than knowledge." as plaintext and "orchestra" as a keyword.

The plaintext is stripped of all characters except the letters a-z and A-Z, and then converted to upper case. The keyword is then repeated to create a string the same length as the plaintext. Using the examples mentioned this gives us the following plaintext and keyword:

IMAGINATIONISMOREIMPORTANTTHANKNOWLEDGE

ORCHESTRAORCHESTRAORCHESTRAORCHESTRAORC

Each letter in the plaintext now has a corresponding letter in the keyword. If we find the plaintext letter along the top of the Tabula Recta, the keyword letter along the left side, the letter where the column and row meet is used to encipher that particular instance of the plaintext letter.

For example, the first letter of the plaintext is I and its keyword letter is O, and they intersect at W. The second I is paired with E, so is enciphered as M. The entire plaintext is enciphered as:

WDCNMFTKICEKZQGKVIAGQYXSGKTVRPRRGPCERXG

To decipher the text we just reverse the process: go to the row for the corresponding keyword letter, search along to find the enciphered letter, and then take the letter at that column's heading.

There is also an algebraic method of enciphering/deciphering with the Vigenère Cipher which does not need a Tabula Recta. It uses zero-based alphabet to integer mappings, so A = 0, B = 1 etc., an easy concept for any programmer to understand!

The formula for encipherement is:

enciphered index = (plaintext index + keyword index) mod 26

For the first letter of the example I = 8 and O = 14 so:

enciphered index = (8 + 14) mod 26 = 22 = W

Algebraic decipherement uses the formula:

deciphered index = (enciphered index - keyword index) mod 26

so:

enciphered index = (22 - 14) mod 26 = 8 = I

The actual encipherement/decipherement can be carried out in just one line of code whether you are using a table or algebra so I will implement both methods which you can try out by commenting/uncommenting the relevant lines.

Project Plan

I will implement the Vigenère Cipher as a class with two methods, one to encipher and one to decipher. The plaintext, enciphered text and keyword will be method arguments so the class will be stateless except for the Tabula Recta which will be created in __init__.

As well as the methods we'll need a few utility functions for the following tasks:

  • Creating the Tabula Recta

  • Processing the plaintext by removing anything other than letters and converting to upper case

  • Creating a keyword the length of the plaintext by repeating the base keyword

The Code

This project consists of the following two Python files. You can download the source code as a zip or clone/download from Github if you prefer.

  • vigenerecipher.py
  • vigenerecipherdemo.py

Source Code Links

ZIP File
GitHub

This is vigenerecipher.py

vigenerecipher.py

import re

class VigenereCipher(object):

    """
    This class provides methods for enciphering and
    deciphering text using the Vigenere cipher.
    """

    def __init__(self):

        self.tabularecta = self.__create_tabula_recta()

    def __create_tabula_recta(self):

        tabularecta = []

        for r in range(0, 26):

            offset = 0
            row = []

            for column in range(0, 26):
                row.append(chr(r + 65 + offset))
                offset += 1
                if offset > (25 - r):
                    offset = offset - 26

            tabularecta.append(row)

        return tabularecta

    def encipher(self, plaintext, keyword):

        """
        The plaintext argument can be any string, but
        only the letters a-z and A-Z will be included
        in the encrypted text.
        """

        plaintext = self.__process_plaintext(plaintext)
        keywordrepeated = self.__get_keyword_repeated(keyword, len(plaintext))
        ciphertext = []

        for index, letter in enumerate(plaintext):

            plaintextindex = ord(letter.upper()) - 65
            keywordindex = ord(keywordrepeated[index]) - 65

            #--------------------#
            # Using tabula recta #
            #--------------------#
            encipheredletter = self.tabularecta[keywordindex][plaintextindex]

            #---------------#
            # Using algebra #
            #---------------#
            # encipheredletter = chr(((plaintextindex + keywordindex) % 26) + 65)

            ciphertext.append(encipheredletter)

        return "".join(ciphertext)

    def decipher(self, ciphertext, keyword):

        """
        Decrypts the ciphetext using the keyword.
        Only the letters a-z and A-Z in the
        original text will be present in the
        decrypted text.
        """

        keywordrepeated = self.__get_keyword_repeated(keyword, len(ciphertext))
        decipheredtext = []

        for index, letter in enumerate(ciphertext):

            keywordindex = ord(keywordrepeated[index]) - 65

            #--------------------#
            # Using tabula recta #
            #--------------------#
            decipheredletter = chr(self.tabularecta[keywordindex].index(letter) + 65)

            #---------------#
            # Using algebra #
            #---------------#
            # decipheredletter = chr((((ord(letter) - 65) - keywordindex) % 26) + 65)

            decipheredtext.append(decipheredletter)

        return "".join(decipheredtext)

    def __process_plaintext(self, plaintext):

        plaintext = plaintext.upper()
        plaintext = re.sub("[^A-Z]", "", plaintext)

        return plaintext

    def __get_keyword_repeated(self, keyword, length):

        keyword = keyword.upper()
        keywordrepeated = []
        keywordlength = len(keyword)
        keywordindex = 0

        for i in range(0, length):
            keywordrepeated.append(keyword[keywordindex])
            keywordindex += 1
            if keywordindex > keywordlength - 1:
                keywordindex = 0

        return "".join(keywordrepeated)

Firstly notice that we import re (Regex).

In __init__ we simply call __create_tabula_recta.

The __create_tabula_recta function itself is a bit fiddly because as we populate the rows we have to add an offset each time which is then incremented. Every time this is done we need to check whether it has run off the end of the alphabet. If so we just subtract 26 to go back to the beginning.

Next comes the encipher method which firstly gets a processed version of the plaintext and a repeated version of the keyword, and then creates an empty list for the enciphered letters.

Then we iterate the plaintext letters, calculating the indexes of the two letters in the current pair, subtracting 65 from the ASCII codes. There are then two implementations of the actual encipherement, one using the Tabula Recta and the other using algebra. We then append the enciphered letter and after the loop join the list to create and return the enciphered text as a string.

The decipher method works in very much the same way, and again gives you a choice of Tabula Recta or algebraic implementations.

The __process_plaintext function uses very simple Regex to remove everything except the letters A-Z. (Regex makes my brain hurt but even I can understand this one!)

Lastly the __get_keyword_repeated function repeats the keyword as many times as necessary to create a string the same length as the plaintext.

So now we just need a bit of code to try it out.

vigenerecipherdemo.py

import vigenerecipher

def main():

    print("-------------------")
    print("| Vigenere Cipher |")
    print("-------------------\n")

    vc = vigenerecipher.VigenereCipher()

    keyword = "orchestra"

    plaintext = "Imagination is more important than knowledge."
    print(f'Plaintext:  {plaintext}')

    enciphered = vc.encipher(plaintext, keyword)
    print(f'Enciphered: {enciphered}')

    deciphered = vc.decipher(enciphered, keyword)
    print(f'Deciphered: {deciphered}')


main()

This is all very straightforward, just creating a VigenereCipher object and calling its methods. Now let's run it.

Run

python3.8 main.py

The output is:

Program Output

-------------------
| Vigenere Cipher |
-------------------

Plaintext:  Imagination is more important than knowledge.
Enciphered: WDCNMFTKICEKZQGKVIAGQYXSGKTVRPRRGPCERXG
Deciphered: IMAGINATIONISMOREIMPORTANTTHANKNOWLEDGE

Remember that both encipherement and decipherement is implemented in two ways so you can try out both with a bit of commenting/uncommenting in the right places.