Playing With Bits in Python Part 1: Representing Integers

The first in a series of articles on dropping down to the level of bits and bytes in Python.

Overview

Python is a high level programming language and Python programmers rarely if ever need to dive down to the level of bits and bytes, the building blocks of both variables and the programming instructions that operate on them. That's fine and enables us to concentrate on the big picture but it may leave some of us feeling curious and unfulfilled. Even if you are never going to mess around with bits and bytes in earnest you might have a yearning to know a bit (excuse the pun!) about what's going on inside your microchips.

In this, the first of what could grow to be a long and comprehensive series of articles, I'll look at how integers (whole numbers) are represented as a series of 0s and 1s. You may well be thinking "I know that already" and if you are just thinking of positive numbers then you probably do. However, it's also necessary to be able to handle negative numbers which are a bit more complex and don't work in the obvious way.

Positive and Negative Integers in Python

Let's first have a quick refresher on storing positive (or to be precise non-negative, ie. 0 or higher) integers in binary. Take a look at the following table.

27 = 128 26 = 64 25 = 32 24 = 16 23 = 8 22 = 4 21 = 2 20 = 1
1 1 0 0 1 0 1 0

The headings show the values of each bit in a byte, calculated as 2 to the power of the bit position, zero-based but running from right to left. (Note that any number to the power of 0 is 1, and any number to the power of 1 is itself.) Using combinations of the numbers 1, 2, 4, 8, 16, 32, 64 and 128 we can build up any number between 0 and 255 by setting the corresponding bit to 1. For 0 all bits are 0, for 255 all bits are 1 and so on. The byte in this example is 128 + 64 + 8 + 2 = 202.

As I mentioned that was a quick refresher of something you may well have understood already but now let's take a look at how negative numbers are represented. The numbers 0 to 255 take up all possible combinations of 0s and 1s so with 8 bits we cannot respresent all numbers between -255 and +255. What we can do though is store all numbers between -128 and +127 (not +128 - we'll see why shortly).

The first key point to remember here is that the so-called most significant bit (MSB), ie. the leftmost bit which has the highest value, signifies the sign. If the number is >= 0 the MSB is 0. If the number is negative the MSB is 1.

The second key point to remember is that for negative numbers the other 7 bits aren't just a binary representation of the equivalent positive number. This means you can't just flip between negative and positive by changing the MSB.

Negative numbers are actually represented by what is known as the two's complement of the equivalent positive number. I'll get to what that means in a moment but I think it's best to first shown an example and explain how to comprehend its meaning.

-27 = -128 26 = 64 25 = 32 24 = 16 23 = 8 22 = 4 21 = 2 20 = 1
1 1 0 0 1 0 1 0

This table is almost identical to the first but with one slight difference. The MSB is shown as representing minus 27 which is -128. The bits are the same but this time represent -54.

Possibly the easiest way to understand how this can be is to add up all the values for bits set to 1, but treat the MSB as -128, not +128. This gives us -128 + 64 + 8 + 2 = -54. Another subtly different approach is to interpret the rightmost 7 bits as a positive number, counting up from -128 if the MSB is 1 or from 0 if it is 0.

We'll see plenty more examples when we get to writing code but before that I need to explain how to flip between negative and positive by calculating the two's complement. Let's have another table, this time starting with +88, converting it to -88 and then back to +88.

Start with +88 in binary01011000
Flip bits with binary NOT to get -8910100111
Add 1 to get -8810101000
Flip bits with binary NOT to get 8701010111
Add 1 to get back to 8801011000

So here you can see the simple two-step process of taking the two's complement of a number to flip between positive and negative: NOT the number to flip the bits and then add 1.

Before we finally get to writing a bit of Python you might be wondering why we use two's complement for negative instead of just the positive equivalent with the MSB set to 1. The simple answer is that it simplifies arithmetic, although I'll go into this in more detail in a future article.

The Project

This project consists of a single file called integers.py which you can downloaded as a zip, or you can clone/download the Github repository if you prefer. For future articles in this series I'll add the source code to the same zip and repository, therefore depending on how far in the future you are reading this you may find additional files.

Source Code Links

ZIP File
GitHub

The Code

The purpose of this little program is to print a list of numbers from -128 to +127 with their binary representation and a brief interpretation of each one. The list also includes the Python generated binary numbers which for negative numbers are simply the positive equivalents suffixed with a minus sign and with no leading 0s. My feeling is that anybody nerdy enough to want to see a number in binary is going to want to see the actual bits as they exist in memory, not a dumbed-down version. However, I won't moan about it too much . . . ! At least it gives us the chance to write an interesting little bit of code to create a string representation of the correct bit pattern.

So here it is, the source code from integers.py.

integers.py

def main():

    print("---------------------------------")
    print("| codedrome.com                 |")
    print("---------------------------------")
    print("| Playing With Bits in Python   |")
    print("| Part 1: Representing Integers |")
    print("---------------------------------\n")

    iterate_8_bit_ints()


def iterate_8_bit_ints():

    heading = "Decimal     Python Binary     Actual Binary     Interpretation"

    print(heading)
    print("-" * len(heading))

    for i in range(-128, 129, 8):

        # botch to iterate to 127
        if i == 128:
            i = 127

        bin_string = as_binary(i)

        print(f"{i:>7}", end="")
        print(f"{i:>18b}", end="")
        print(f"{bin_string:>18}", end="")
        print(f"{interpret_binary_string(bin_string):>19}")


def as_binary(x):

    # Generates a string of the correct Two's Complement 
    # representation of an 8-bit integer, not the
    # minus sign and positive number shown by Python.

    if(x < 0):
        x = 256 + x

    return("{0:b}".format(x).zfill(8))


def interpret_binary_string(bin_string):

    # Positive numbers have the MSB set to 0
    # and are 0 + the number represented by the other 7 bits.
    # Negative numbers have the MSB set to 1
    # and are -128 + the number represented by the other 7 bits.

    interpretation = ""
    offset = int(bin_string[1:], 2)

    if(bin_string[0] == "1"):
        interpretation += "-128 + "
    else:
        interpretation += "0 + "

    interpretation += str(offset)

    return interpretation


if __name__ == "__main__":

    main()

iterate_8_bit_ints

The main function calls iterate_8_bit_ints which, after printing a series of headings, loops from -128 to +128 with an interval of 8. (You can edit or remove the interval if you wish, but I didn't want the output to be excessively long.) The reason for looping to 128 is so that we can cheat slightly to show 127 in the list by decrementing the i variable when it gets to 128.

For each value we obtain its binary representation using the as_binary function we'll see next. Then the value is printed in decimal, Python-generated binary and the binary generated by as_binary.

The > plus number notation in the print statements stretches the output to the specified number of characters, the actual value being right-aligned. You can use < or ^ to left-align or centre the output. The second print statement also includes a b character to format the number as binary.

as_binary

The if statement in this function looks a bit cryptic but is actually very simple. If the number is non-negative we can just use Python's binary formatting. If not then adding the negative value to 256 (effectively subtracting it) will give us an unsigned number which, when interpreted as signed, will be the correct negative value. This will be clearer when we see a few more examples in the program output.

interpret_binary_string

This function generates a concise interpretation of the binary string argument. For negative numbers the return value is "-128 + " followed by the number represented by the rightmost 7 bits. For positive values the function returns "0 + ", again followed by the number represented by the rightmost 7 bits.

Running the Program

Now we can run the program:

Running the Program

python3 integers.py

This is the end result.

Let's look at a few examples from the output.

-128 - I mentioned earlier that for negative numbers Python-generated binary consists of a minus sign and the bits for the positive number. In most cases this is different to the two's complement but is correct for -128 (if you ignore the superfluous minus sign). If you look at the bits in the Actual Binary column you can see the MSB is 1, indicating the number is negative, and the other 7 bits are 0. This indicates that, as shown in the Interpretation column, the decimal value is -128 + 0.

Note that if interpreted as an unsigned integer the bits 10000000 are 128. For -128 the as_binary function calculates 256 + (-128) = 128, which is then formatted by Python. This is an example of how the as_binary works for negative numbers.

-88 - the Python Binary column shows the same bits as for +88, confirming how Python generates binary output which we know is completely different from two's complement. The rightmost 7 bits in the Actual Binary column represent 40 (32 + 8) so, as the MSB is 1, the decimal equivalent is -128 + 40. You might want to also check how as_binary would work for -88.

0 - for non-negative numbers the Python binary is correct, albeit without leading zeros. For 0 the MSB is 0, as are all the other bits. This gives us the rather boring interpretation of 0 + 0.

8 - again we get the correct but truncated bits in the Python Binary column, the same but with the full 8 bits in the Actual Binary column, and therefore "0 + 8" as the interpretation.

127 - as I mentioned, with 8 bits we can represent numbers between -128 and +127. Looking at the last row of the output shows that at +127 we have run out of possible combinations of bits. There are 256 combinations of bits in an 8-bit byte which, being an even number, has no centre value. The possible range of decimal numbers is therefore asymmetrical.

Going Forward . . .

This article only scratches the surface of bits, bytes and binary. In future articles I'll look at binary arithmetic, using individual bits as binary values, bitwise operators and floating point numbers.

Me on Mastodon (not Twitter!) where I post new CodeDrome articles and other interesting or useful programming stuff