Geometric and Harmonic Means in Python

The best-known and most widely used statistical mean is the arithmetic mean, calculated by adding all the values and dividing the result by the number of values. The arithmetic mean is one of a "family" of three means called the Pythagorean means, the other two being the geometric mean and the harmonic mean. In this post I will explain when you might need to use these alternatives and then show how to calculate them using Python.
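As a rough sketch of the three definitions, the function below calculates each of the Pythagorean means directly. The function name pythagorean_means and the sample values are my own choices for illustration, and the code assumes all values are positive, which the geometric and harmonic means need in order to be meaningful.

```python
import math


def pythagorean_means(values):
    """Return (arithmetic, geometric, harmonic) means of positive numbers.

    Hypothetical helper for illustration; assumes a non-empty
    sequence of strictly positive values.
    """
    n = len(values)
    # Arithmetic mean: sum of the values divided by their count.
    arithmetic = sum(values) / n
    # Geometric mean: nth root of the product of the values.
    geometric = math.prod(values) ** (1 / n)
    # Harmonic mean: count divided by the sum of the reciprocals.
    harmonic = n / sum(1 / v for v in values)
    return arithmetic, geometric, harmonic


if __name__ == "__main__":
    a, g, h = pythagorean_means([1, 2, 4])
    print(f"arithmetic={a:.4f} geometric={g:.4f} harmonic={h:.4f}")
```

From Python 3.8 the standard library's statistics module also provides mean, geometric_mean and harmonic_mean, which are probably the better choice in real code.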


Zipf’s Law in Python

In this post I will write a project in Python to apply Zipf's Law to analysing word frequencies in a piece of text.

Zipf's Law describes a probability distribution where each frequency equals the highest frequency multiplied by the reciprocal of its rank. The second highest frequency is therefore the highest multiplied by 1/2, the third highest is the highest multiplied by 1/3, and so on.

This is best illustrated with a graph.

[Figure: Zipf's Law in Python]
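The relationship between rank and frequency can also be sketched in a few lines of code. This is only a minimal illustration, not the project from the full post: the function name zipf_table is hypothetical, and splitting on whitespace is a simplifying assumption that ignores punctuation.

```python
from collections import Counter


def zipf_table(text):
    """Return (rank, word, actual count, Zipf-predicted count) rows.

    Hypothetical helper for illustration; words are split on
    whitespace only, so punctuation is not stripped.
    """
    words = text.lower().split()
    # most_common() sorts by count, highest first.
    ranked = Counter(words).most_common()
    if not ranked:
        return []
    top_count = ranked[0][1]
    # Zipf's Law predicts highest frequency * (1 / rank).
    return [
        (rank, word, count, top_count / rank)
        for rank, (word, count) in enumerate(ranked, start=1)
    ]


if __name__ == "__main__":
    for row in zipf_table("a a a a a a b b b c c d"):
        print(row)
```

Comparing the last two columns of each row shows how closely a piece of text follows the distribution.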


Frequency Analysis in Python

Simple codes such as substitution cyphers can be broken using a technique called frequency analysis, which I will implement in Python.

In a previous post I implemented a very simple and very insecure substitution cypher. It is insecure because each letter in the original text is always encrypted the same way. For example, the most common letter "e" might always be encrypted as "h", so if we find that "h" is the most common letter in the encrypted text then we can assume it represents "e". This can be carried out for all letters, a process called frequency analysis, which I will implement in Python in this post.
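The idea can be sketched as follows. This is a minimal illustration under a couple of assumptions rather than the implementation from the full post: the function names are hypothetical, and ENGLISH_ORDER is one commonly quoted approximate ordering of English letters from most to least frequent, which varies between sources and texts.

```python
from collections import Counter

# Assumption: an approximate ordering of English letters by
# frequency, most common first. Other sources differ slightly.
ENGLISH_ORDER = "etaoinshrdlcumwfgypbvkjxqz"


def letter_frequencies(ciphertext):
    """Return (letter, count) pairs for a-z, most common first."""
    letters = [ch for ch in ciphertext.lower() if "a" <= ch <= "z"]
    return Counter(letters).most_common()


def guess_mapping(ciphertext):
    """Guess an encrypted-letter -> plain-letter mapping.

    The most common encrypted letter is paired with "e", the next
    with "t", and so on. This is only a first guess.
    """
    ranked = [letter for letter, _ in letter_frequencies(ciphertext)]
    return dict(zip(ranked, ENGLISH_ORDER))


if __name__ == "__main__":
    print(guess_mapping("hhh xx a"))
```

On short texts the guesses will often be wrong for the rarer letters, so in practice a mapping like this is a starting point to be refined by hand.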
