Graph Theory in Python

Graph Theory is a vast area of study based around the simple idea of individual points - known as vertices - connected by lines known as edges, each of which may have an associated numeric values called a weight and perhaps also a direction.

These simple sets of vertices, edges, weights and directions are known as graphs (not to be confused with the more familiar graph of data or a mathematical function) and can be used to represent many real-world situations or abstract concepts. Once you have a graph stored in some kind of data structure there are many algorithms which can be applied to solve problems, especially those of optimisation.

This post is the first of potentially many, and in it I will develop a data structure to store a graph. In future posts I will implement some of the algorithms which operate on graphs.

The most obvious use of graphs is to represent geographical locations and routes between them. The graph shown below represents four Scottish cities and the lines and numbers represent connections and distances (in miles) between them.

Probably the best known problem in graph theory is the Travelling Salesman Problem. In this problem a hypothetical salesman needs to visit a number of cities as efficiently as possible. There are six possible combinations - which should he choose to minimize distance or time?

That's a problem for a future post but for now we need to create a data structure to hold the graph, and I will start off by writing a class to do just that.

The majority of data can easily be fitted into some sort of row/column structure, in Python this might be a list of lists, a dictionary of tuples, some other combination of these, or maybe your own custom data structures. Graphs are a bit more messy but there are two main ways to represent them.

The first is called an adjacency list, which is what I will be using here. It consists of a list of vertices (eg cities), each of which contains a list of edges (eg a list of other cities to which it is connected along with distance).

The graph of Scottish cities which I will be using in this project would look like this when represented in an adjacency list.

Vertices Edges
Inverness Aberdeen 103 Glasgow 168 Edinburgh 154
Aberdeen Inverness 103 Glasgow 139 Edinburgh 119
Glasgow Inverness 168 Aberdeen 139 Edinburgh 44
Edinburgh Inverness 154 Aberdeen 119 Glasgow 44

There are a couple of points worth noticing here. Firstly each edge exists twice, for example there is an edge from Inverness to Aberdeen with a weight (distance) of 103, and also one from Aberdeen to Inverness with the same weight. The edge is undirected but by omitting one of these edges we can represent a directed edge, eg a one-way street.

Secondly, in this example all cities are connected to all others. If we were to add all the cities, towns and villages in Scotland none of them would be connected to all of the others; in fact most would only have direct roads to a handful of others. Using an adjacency list is efficient as it only needs to store actual edges.

This brings us to the second main type of data structure used to represent graphs, the adjacency matrix. This is a two-dimensional grid looking like this:

Inverness Aberdeen Glasgow Edinburgh
Inverness - 103 168 154
Aberdeen 103 - 139 119
Glasgow 168 139 - 44
Edinburgh 154 119 44 -

In this case there is little wasted memory but if such a data structure were to be used to represent all cities, towns and villages in Scotland most of the memory would be wasted. Adjacency lists are therefore only suitable if most or all vertices are connected.

The Project

This project consists of the following four files which can be downloaded as a zip, or you can download or clone the Github repository if you prefer.

  • vertex.py
  • edge.py
  • graphadjacencylist.py
  • graphtheory.py

Source Code Links

Zip File
GitHub

Firstly there are a couple of very simple classes to represent vertices and edges. This is the Vertex class.

vertex.py

class Vertex(object):

    def __init__(self, label):

        self.label = label
        self.edges = []
        self.visited = False

As well as a label the Vertex has a list of edges as well as an attribute called visited which will be used in the future for various algorithms.

edge.py

class Edge(object):

    def __init__(self, to, weight):

        self.to = to
        self.weight = weight

The Edge class has attributes for to (ie the vertex it is to) and weight. The starting vertex is the vertex to which the edge is attached.

Now let's look at the GraphAdjacencyList class, the core of this project.

graphadjacencylist.py

import vertex
import edge


class GraphAdjacencyList(object):

    def __init__(self):

        self.vertices = []

    def __str__(self):

        hline = ("-" * 44) + "\n"
        leftspace = ("|" + (" " * 16) + "|")
        string = []

        string.append(hline)
        string.append("|    Vertices    |    Edge To     | Weight |\n")
        string.append(hline)

        for vertex in self.vertices:

            string.append("|{0: <16}|".format(vertex.label))

            if len(vertex.edges) > 0:

                for index, edge in enumerate(vertex.edges):

                    if index > 0:
                        string.append(leftspace)

                    string.append("{0: <16}|".format(edge.to))

                    if edge.weight is not None:
                        string.append("{0: 8}|".format(edge.weight))
                    else:
                        string.append((" " * 8) + "|")

                    string.append("\n")

                string.append(hline)

            else:

                string.append((" " * 16) + "|" + (" " * 8) + "|\n")
                string.append(hline)

        return "".join(string)

    def add_vertex(self, label):

        """
        Adds a vertex with the given label to the list.
        Raises ValueError if already exists.
        """

        if self.__find_vertex_index(label) == -1:
            v = vertex.Vertex(label)
            self.vertices.append(v)
        else:
            raise ValueError("Vertex with this label already exists")

    def add_edge(self, label1, label2, directed, weight):

        """
        Adds edge between given two vertices with given weight.
        Recursively adds "reverse" edge if directed is False.
        Raises ValueError if edge already exists, or if
        either or both vertices do not exist.
        """

        if self.__edge_exists(label1, label2) is True:
            raise ValueError("Edge already exists")
        else:
            index1 = self.__find_vertex_index(label1)
            index2 = self.__find_vertex_index(label2)

            if index1 == -1 or index2 == -1:
                raise ValueError("One or both vertices not found")

            else:
                e1 = edge.Edge(label2, weight)
                self.vertices[index1].edges.append(e1)

                if directed is False:
                    self.add_edge(label2, label1, True, weight)

    def edit_vertex(self, label, new_label):

        """
        Changes vertex name to new value.
        Raises ValueError if label does not exist.
        """

        found = False

        for vertex in self.vertices:

            if vertex.label == label:
                vertex.label = new_label
                found = True

            for edge in vertex.edges:
                if edge.to == label:
                    edge.to = new_label

        if not found:
            raise ValueError("Vertex with this label does not exist")

    def edit_edge(self, from_label, to_label, weight):

        """
        Changes the weight of the edge to/from the
        given labels to new value.
        Raises ValueError if edge does not exist.
        """

        found = False

        for vertex in self.vertices:
            if vertex.label == from_label:
                for edge in vertex.edges:
                    if edge.to == to_label:
                        edge.weight = weight
                        found = True

        if not found:
            raise ValueError("Edge between labels does not exist")

    def delete_vertex(self, label):

        """
        Deletes the vertex with the given label.
        Raises ValueError if label does not exist.
        """

        found = False

        for vertex in self.vertices:

            for edge in vertex.edges:
                if edge.to == label:
                    vertex.edges.remove(edge)

            if vertex.label == label:
                self.vertices.remove(vertex)
                found = True

        if not found:
            raise ValueError("Vertex with this label does not exist")

    def delete_edge(self, from_label, to_label):

        """
        Deletes edge from/to given labels.
        Raises ValueError if edge does not exist.
        """

        found = False

        for vertex in self.vertices:
            if vertex.label == from_label:
                for edge in vertex.edges:
                    if edge.to == to_label:
                        vertex.edges.remove(edge)
                        found = True

        if not found:
            raise ValueError("Edge between labels does not exist")

    def __find_vertex_index(self, label):

        for index, vertex in enumerate(self.vertices):
            if vertex.label == label:
                return index

        return -1

    def __edge_exists(self, label1, label2):

        vertex_index = self.__find_vertex_index(label1)

        for edge in self.vertices[vertex_index].edges:
            if edge.to == label2:
                return True

        return False

__init__

Well this is nice and simple isn't it? Just create an empty list.

__str__

A rather tedious but essential method which provides a neat string representation of the object, primarily for us by the print function.

In the table above demonstrating an adjacency matrix I showed the edges of each vertex horizontally but here they are shown vertically as we'll see in a moment. If a vertex has no edges, or if an edge has no weight, then blanks are displayed.

add_vertex

Firstly we check to make sure a vertex with the given label does not already exist, and if not a new Vertex object is created and added.

add_edge

We need to check the edge does not already exist, and also that the given vertex labels do exist, raising errors as necessary. If everything is OK we create a new Edge object and add it to the vertex.

If the edge is not directed we also need to add the "reverse" edge which is done using a recursive call to add_edge. This call needs to have directed set to True or the method will attempt to re-create the first edge, thereby hitting an error.

edit_vertex

This method is used to rename vertices, and therefore needs to change the to attributes of edges. This is done by iterating vertices, and using an inner loop to iterate edges. Note that an error is raised if no vertex with the given label exists.

edit_edge

This method changes edge weights by iterating the vertices to find the "from" vertex, and then iterating its edges to find the "to" vertex. An error is raised if the edge does not exist.

The method does not work in reverse: if there is also an edge from the "to" vertex to the "from" vertex the weight is not changed. This is deliberate and allows for situations where weights are different in each direction.

delete_vertex

This method deletes te vertex with the given label, as well as any edges in other vertices pointing back to it. As usual an error is raised if the vertex does not exist.

delete_edge

This method iterates the vertices to find the "from", and then iterates its edges to find the "to" before deleting it. As with edit_edge the method does not affect edges in the opposite direction.

__find_vertex_index

This is a utility function which finds the index, if any, of the vertex with the given label.

__edge_exists

This function checks whether an edge exists between the two given vertices.

That our GraphAdjacencyList class finished so we just need to try it out in graphtheory.py.

graphtheory.py

import graphadjacencylist


def main():

    print("----------------")
    print("| codedrome.com |")
    print("| GraphTheory   |")
    print("----------------\n")

    g = create_graph()

    print(g)

    # g = edit_graph(g)

    # print(g)


def create_graph():

    g = graphadjacencylist.GraphAdjacencyList()

    g.add_vertex("Inverness")
    g.add_vertex("Aberdeen")
    g.add_vertex("Glasgow")
    g.add_vertex("Edinburgh")

    g.add_edge("Inverness", "Aberdeen", directed=False, weight=103)
    g.add_edge("Inverness", "Glasgow", directed=False, weight=168)
    g.add_edge("Inverness", "Edinburgh", directed=False, weight=154)

    g.add_edge("Aberdeen", "Glasgow", directed=False, weight=139)
    g.add_edge("Aberdeen", "Edinburgh", directed=False, weight=119)

    g.add_edge("Glasgow", "Edinburgh", directed=False, weight=44)

    return g


def edit_graph(g):

    g.edit_vertex("Inverness", "Inerness")
    g.edit_vertex("Aberdeen", "Aiberdeen")
    g.edit_vertex("Glasgow", "Glesga")

    return g


main()

In main we first call a function to create and populate a graph with the Scottish cities data and then print it.

Run the code as it is...

Run

python3.7 graphtheory.py

...which will give you this output.

Program Output

-----------------
| codedrome.com |
| Graph Theory  |
-----------------

--------------------------------------------
|    Vertices    |    Edge To     | Weight |
--------------------------------------------
|Inverness       |Aberdeen        |     103|
|                |Glasgow         |     168|
|                |Edinburgh       |     154|
--------------------------------------------
|Aberdeen        |Inverness       |     103|
|                |Glasgow         |     139|
|                |Edinburgh       |     119|
--------------------------------------------
|Glasgow         |Inverness       |     168|
|                |Aberdeen        |     139|
|                |Edinburgh       |      44|
--------------------------------------------
|Edinburgh       |Inverness       |     154|
|                |Aberdeen        |     119|
|                |Glasgow         |      44|
--------------------------------------------

The edit_graph function translates the city names into Scots (Edinburgh is the same in English and Scots). Comment out the first print in main, then uncomment the call to edit_graph and the second print. Run the program again which gives you...

Program Output

-----------------
| codedrome.com |
| Graph Theory  |
-----------------

--------------------------------------------
|    Vertices    |    Edge To     | Weight |
--------------------------------------------
|Inerness        |Aiberdeen       |     103|
|                |Glesga          |     168|
|                |Edinburgh       |     154|
--------------------------------------------
|Aiberdeen       |Inerness        |     103|
|                |Glesga          |     139|
|                |Edinburgh       |     119|
--------------------------------------------
|Glesga          |Inerness        |     168|
|                |Aiberdeen       |     139|
|                |Edinburgh       |      44|
--------------------------------------------
|Edinburgh       |Inerness        |     154|
|                |Aiberdeen       |     119|
|                |Glesga          |      44|
--------------------------------------------

Thinking of Starting Your Own Blog?

If you are thinking of starting your own blog or looking to migrate to a better host check out BlueHost. Free domain name for 1 year, free SSL certificate and 1-Click WordPress Install.