Image Editing with NumPy Part 1: Intro

The first in a series of articles on editing images in Python using the NumPy library.

Overview

I've written a few articles on the Pillow library, which is used for manipulating images in Python. It's an excellent library for editing entire images as a single entity, but if you need to manipulate image data at the level of individual pixels, or code any manipulations not provided by Pillow, then it's fiddly and slow. Fortunately it is very easy to grab the pixel data from an image loaded with Pillow and copy it into a NumPy array. We can then do whatever we like with the data (the only limit is your imagination) before copying it back to a Pillow image to save.

In this introductory article I'll show how to copy pixel data into NumPy and apply a few simple edits before saving the changes. In future articles I'll dive a lot deeper with more complex and sophisticated manipulations. In the longer term I'll begin to look at the scikit-image library, which provides a very comprehensive range of image manipulation functions.

The Project

This project consists of a single Python file called numpyimageintro.py and a tiny 3 pixel by 3 pixel image. I'm not using such a small image to save your bandwidth: the program prints the raw RGB pixel data to the terminal, so I wanted to keep the output as small as possible.

The files can be downloaded as a zip, or you can clone/download the GitHub repository if you prefer. I'll add files from subsequent articles in this series to the same zip and repository.

Source Code Links

ZIP File
GitHub

This is the image, named 3x3.png, enlarged to 12800%, i.e. each pixel in the original is 128 pixels square here.

The Code

This is the numpyimageintro.py file.

numpyimageintro.py

from PIL import Image
import numpy as np


def main():

    print("-----------------")
    print("| codedrome.com |")
    print("| NumPy Image   |")
    print("| Part 1: Intro |")
    print("-----------------\n")
    
    # open Pillow image and create NumPy array from pixel data
    npimage = np.array(Image.open('3x3.png'))

    # print size and color depth
    print("IMAGE INFO\n----------")
    print(f"shape        {npimage.shape}")
    print(f"height       {npimage.shape[0]}")
    print(f"width        {npimage.shape[1]}")
    print(f"colour depth {npimage.shape[2]}")

    # first dimension: rows
    print("\nROWS\n----")
    print(f"1st row\n {npimage[0]}\n")
    print(f"2nd row\n {npimage[1]}\n")
    print(f"3rd row\n {npimage[2]}")

    # second dimension: columns
    print("\nCOLUMNS\n-------")
    print(f"1st column\n {npimage[:,0]}\n")
    print(f"2nd column\n {npimage[:,1]}\n")
    print(f"3rd column\n {npimage[:,2]}")

    # third dimension: color channels
    print("\nCOLOUR CHANNELS\n---------------")
    print(f"RED   \n {npimage[:,:,0]}\n")
    print(f"GREEN \n {npimage[:,:,1]}\n")
    print(f"BLUE  \n {npimage[:,:,2]}\n")

    # create copy of NumPy array
    npimagecopy = np.copy(npimage)

    # multiply green channel by 0.5
    npimagecopy[:,:,1] = npimagecopy[:,:,1] * 0.5

    # set top right pixel to orange
    npimagecopy[0,2] = (255,128,0)

    # create new Pillow image from copy of NumPy array
    imagecopy = Image.fromarray(npimagecopy)
    # and save it
    imagecopy.save("3x3edited.png")


if __name__ == "__main__":

    main()

Imports

Firstly we need to import Pillow's Image module from the PIL package, as well as NumPy, aliased as np as per convention. Both are on PyPI and can be installed with pip. These are the respective links.

https://pypi.org/project/Pillow/

https://pypi.org/project/numpy/

Pillow is a fork of the now-defunct PIL, or Python Imaging Library, so the package is still imported as PIL rather than pillow for backward compatibility.

You don't need any knowledge of either Pillow or NumPy to follow this code, and I'll explain how they are used as we go along.

Getting Pixels into NumPy

As this is a short and simple program I have kept all the code in the main function. The first thing we need to do is open an image file using the open function of Pillow's Image module, and then create a NumPy array from the result. For brevity I have combined these two operations into one line, and also omitted error handling. Of course in The Real World any code reading or writing files should have try/except blocks.
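As a rough sketch of what that error handling might look like, the function below wraps the open-and-convert step in a try/except block. The name load_pixels and the decision to return None on failure are my own assumptions for illustration; they are not part of numpyimageintro.py.

```python
from PIL import Image, UnidentifiedImageError
import numpy as np


def load_pixels(path):
    """Return the pixel data of the image at path as a NumPy array.

    Hypothetical helper: returns None if the file is missing or
    Pillow cannot identify it as an image.
    """
    try:
        # the with block closes the file once the data has been copied
        with Image.open(path) as image:
            return np.array(image)
    except FileNotFoundError:
        print(f"File not found: {path}")
    except UnidentifiedImageError:
        print(f"Not a recognised image file: {path}")
    return None
```

In the program above you could then write npimage = load_pixels('3x3.png') and check for None before carrying on.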

Image Info

Having got a NumPy array containing the pixel data we can pick up a few pieces of information about it using its shape attribute. This is a tuple and the first element is the number of rows, so its value is the image height in pixels. The second is the number of columns, so its value is the width. Finally, assuming we've opened a colour image, the third element is the number of colour channels.

Generally colour images will have 3 colour channels, i.e. red, green and blue, which is the value the program prints as "colour depth". Some image types such as PNG can have transparency and if so there will be an additional alpha channel. For monochrome images there is no third dimension, in which case indexing shape[2] will raise an IndexError, something which must be allowed for in production code.
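One way of allowing for images without a third dimension is to check the number of dimensions before reading the channel count. This is only a sketch: the function name image_dimensions is hypothetical, and treating a two-dimensional array as having a single channel is my own assumption.

```python
import numpy as np


def image_dimensions(pixels):
    """Return (height, width, channels) for a 2D or 3D pixel array."""
    height, width = pixels.shape[:2]
    # a monochrome image has no third dimension, so count it as 1 channel
    channels = pixels.shape[2] if pixels.ndim == 3 else 1
    return height, width, channels


rgb = np.zeros((3, 3, 3), dtype=np.uint8)    # red, green, blue
rgba = np.zeros((3, 3, 4), dtype=np.uint8)   # red, green, blue, alpha
grey = np.zeros((3, 3), dtype=np.uint8)      # no third dimension

print(image_dimensions(rgb))
print(image_dimensions(rgba))
print(image_dimensions(grey))
```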

Accessing RGB Values

As I mentioned the first dimension of the NumPy array contains the rows starting at the top of the image, so we can access a whole row of RGB data just by indexing the row number. Here I have printed each of the three rows separately.

To get column data we index the rows using a : (colon) to grab them all and then index a column by its position starting from the left, for example npimage[:,2] gives us all rows, third column.

To get colour channels we apply the same principle but dig deeper, this time using, for example, npimage[:,:,0] for red. Remember monochrome images have no third dimension, and some files may also have an alpha or transparency channel. For an RGBA image this is the last channel, indexed 3, with red, green and blue still being 0, 1 and 2 respectively.
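If you want to experiment with these indexing patterns without an image file, you can build a small array by hand. The one below follows the layout of 3x3.png as described in this article, but the exact values (pure red, green and blue at 255) are an assumption on my part. The last few lines add a fully opaque alpha channel to show that it sits at index 3.

```python
import numpy as np

RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)

# rows from top to bottom, pixels from left to right
pixels = np.array([[RED,   BLUE,  BLUE],
                   [GREEN, RED,   BLUE],
                   [GREEN, GREEN, RED]], dtype=np.uint8)

first_row = pixels[0]          # every pixel in the top row
third_column = pixels[:, 2]    # every row, third column
red_channel = pixels[:, :, 0]  # every row, every column, red values only

# append an alpha channel as a fourth layer along the third dimension
opaque = np.full((3, 3), 255, dtype=np.uint8)
rgba = np.dstack((pixels, opaque))
alpha_channel = rgba[:, :, 3]

print(red_channel)
print(rgba.shape)
```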

Creating and Editing a Copy

Let's now create a copy of the NumPy array, make a couple of changes to it, and then use it to create and save a new Pillow image. We create a copy called npimagecopy using NumPy's copy function, passing the original array as an argument.

A key feature of NumPy arrays is that we can manipulate them in a single operation. This is shown here where I have multiplied the entire green channel by 0.5, effectively halving its brightness, in a single line. Because the array holds 8-bit integers, the results are truncated to whole numbers when they are assigned back. Without this capability it would be necessary to use a loop. And imagine having to carry out an operation on each colour of each pixel: you'd need a loop within a loop within a loop. Not nice!
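To make the comparison concrete, here is a sketch of the single-line version next to the nested loop it replaces. The all-white test array and the variable names are assumptions for illustration. Note that 255 * 0.5 is 127.5, which becomes 127 when stored in an 8-bit integer array.

```python
import numpy as np

# a 3x3 all-white image to experiment on
pixels = np.full((3, 3, 3), 255, dtype=np.uint8)

# one line: halve the green channel of every pixel
vectorised = pixels.copy()
vectorised[:, :, 1] = vectorised[:, :, 1] * 0.5

# the equivalent nested loop over rows and columns
looped = pixels.copy()
height, width = looped.shape[:2]
for row in range(height):
    for column in range(width):
        looped[row, column, 1] = int(looped[row, column, 1] * 0.5)

same_result = np.array_equal(vectorised, looped)
print(same_result)
print(vectorised[0, 0])
```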

The next line sets the top right pixel (row 0, column 2) to orange by assigning a tuple of RGB values. This illustrates that we can, of course, get and set individual pixels if needed.
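Getting a pixel works the same way as setting one. The short sketch below uses an all-black array I made up for the purpose; it sets a pixel, reads it back, and then changes just one channel of that pixel by adding a third index.

```python
import numpy as np

# a 3x3 all-black image to experiment on
pixels = np.zeros((3, 3, 3), dtype=np.uint8)

# set the top right pixel (row 0, column 2) to orange
pixels[0, 2] = (255, 128, 0)

# read the same pixel back as three plain integers
red, green, blue = (int(value) for value in pixels[0, 2])
print(red, green, blue)

# a third index addresses one channel of one pixel
pixels[0, 2, 1] = 64
print(pixels[0, 2])
```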

Finally we create a new Pillow image from the edited array using the Image.fromarray function, and save it with a different filename.
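If you want to reassure yourself that nothing is lost on the way back to a file, the sketch below saves an array as a PNG and reads it straight back in. The temporary folder and the filename roundtrip.png are assumptions of mine so that the example does not touch your own files.

```python
import os
import tempfile

import numpy as np
from PIL import Image

# a 3x3 all-black image with an orange top right pixel
pixels = np.zeros((3, 3, 3), dtype=np.uint8)
pixels[0, 2] = (255, 128, 0)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "roundtrip.png")

    # NumPy array -> Pillow image -> PNG file
    Image.fromarray(pixels).save(path)

    # PNG file -> Pillow image -> NumPy array
    with Image.open(path) as reopened:
        restored = np.array(reopened)

# PNG is lossless so the two arrays should match exactly
round_trip_ok = np.array_equal(pixels, restored)
print(round_trip_ok)
```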

Running the Program

Now let's run the program:

python3 numpyimageintro.py

This is the output, complete with the enlarged 3x3.png file to compare.

Firstly we see the shape as a tuple and then broken down into height, width and colour depth.

Next we have the RGB values by row. The pixels of the first row are all red, all blue and then all blue again. The pixels of the first column are all red, all green and all green, and so on.

The colour channels are a bit more interesting as they are laid out in the same way as the original image. You can see the red diagonal, the green bottom left corner and the blue top right corner.

There is no console output for the edit/save but you'll find another image, 3x3edited.png, in the folder where you saved the Python code and 3x3.png file. This is it, again blown up so you don't need a microscope. As you can see the green pixels are only half as bright, and the top right pixel is now orange.

Conclusion

The code in this article is very simple but we now know how to copy an image file's pixel data into a NumPy array, access that data by row, column, colour channel or individual pixel, edit the data, and save it back to an image file. These build a firm foundation for any image manipulation task you can think of.

For the next article in this series I'll create some histograms of the sort you might be familiar with from image editing software or on your camera's screen. These show the relative frequencies of the various RGB values in the image and represent the overall lightness or darkness of the image.
