Exploring Set Theory with Python

There is plenty of information around on Python's set data structure but it usually approaches the topic from a programming perspective. In this article I will look at Python sets from the mathematical point of view.

Set Theory

Set theory has been around since the 1870 so although it is quite new compared to many branches of mathematics it's still a lot older than Python!

A set is basically a list of objects, each of which can exist within a given set once and is therefore unique in the set. The objects can be anything but here I will concentrate on sets of numbers.

Set theory has many uses, both theoretical and practical (or to be precise, in both pure and applied mathematics). In this article I will look at the functionality provided by Python for representing and manipulating sets, as well as creating a few handy bits of additional functionality. I also have a number of further articles in the pipeline showing sets in use.

In this article I will cover the following topics:

  • Symbols and notation used in set theory

  • Creating a set, adding and removing items

  • Using a few Python builtin functions with sets

  • Cardinality notation, or the number of items in a set

  • Testing items and sets for membership of a set

  • Combinations of 2 sets

  • Using various update methods

The Project

This project consists of the following files:

  • setutilities.py

  • setdemo.py

The files can be downloaded as a zip from the Downloads page, or you can clone/download the Github repository if you prefer.

Source Code Links

ZIP File
GitHub

The Code

Before getting stuck in to creating and using sets let's look at a short bit of Python containing a handy dictionary of symbols and a simple function.

setutilities.py

from types import MappingProxyType


symbols = MappingProxyType({ "emptySet": "∅",
                             "union": "∪",
                             "intersection": "∩",
                             "is_element_of": "∈",
                             "is_not_element_of": "∉",
                             "is_subset_of": "⊆",
                             "is_proper_subset_of": "⊂",
                             "is_not_subset_of": "⊄",
                             "is_superset_of": "⊇",
                             "is_proper_superset_of": "⊃",
                             "is_not_superset_of": "⊅",
                             "natural_numbers": "ℕ",
                             "integers": "ℤ",
                             "rational_numbers": "ℚ",
                             "real_numbers": "ℝ",
                             "complex_numbers": "ℂ",
                             "power_set": "ℙ",
                             "universal_set": "ξ",
                             # this is minuscule Greek letter xi
                             # minuscule zeta ζ. V, U or S are also used.
                             "cartesian_product": "×",
                             "set_difference": "\\",
                             "symmetric_difference": "",
                             "cardinality": "|",
                             "complement": "'",})


def cardinality_string(set, setname):

    result = f"{symbols['cardinality']}{setname}{symbols['cardinality']} = {len(set)}"

    return result

The symbols dictionary (wrapped as a MappingProxyType to make it immutable) contains a number of symbols used in set theory, many of which I'll use later on.

The cardinality_string function takes a set and its name, and returns a string showing the cardinality of length using the standard notation.

Now let's get to the main part of the code in setdemo.py.

setdemo.py

import setutilities as su


def main():

    print("-----------------")
    print("| codedrome.com |")
    print("| Set Demo      |")
    print("-----------------\n")

    print(su.symbols)

    # create_and_manipulate()

    # builtins()

    # cardinality_string()

    # tests_and_comparisons()

    # combinations()

    # update()


def create_and_manipulate():

    A = set()
    print(A)

    A.add(98)
    print(A)

    A.update([101, 103, 108])
    print(A)

    n = A.pop()
    print(n)
    print(A)

    try:
        A.remove(103)
    except KeyError as err:
        print(err)
    else:
        print("element 103 removed")

    print(A)

    try:
        A.remove(103)
    except KeyError as err:
        print("element not in set")
    else:
        print("103 removed")

    print(A)

    # does not raise exception if element not present
    A.discard(103)

    print(A)

    # clear
    A.clear()
    print(A)


def builtins():

    title = "Selection of built in functions applied to set"
    print(title)
    print("-" * len(title))

    A = {7,-67,85,-86,4,0}

    print(f"set A    {A}")

    print(f"len(A)   {len(A)}")
    print(f"min(A)   {min(A)}")
    print(f"max(A)   {max(A)}")
    print(f"sum(A)   {sum(A)}")
    print(f"all(A)   {all(A)}")
    print(f"any(A)   {any(A)}")

    B = set(filter(lambda n: n > 0, A))
    print(f"filter   {B}")

    P = set(map(lambda n: n ** 2, A))
    print(f"map      {P}")


def cardinality_string():

    s = {7,-67,85,-86,4,0}

    cs = su.cardinality_string(s, "myset")

    print(cs)


def tests_and_comparisons():

    A = {1,2,4,7,9,13,16,22}
    B = {1,2,4,7}
    C = {3,5,6,8}

    print(f"is element of\n9 {su.symbols['is_element_of']} {A}\n{9 in A}\n")

    print(f"is not element of\n3 {su.symbols['is_not_element_of']} {A}\n{3 not in A}\n")

    print(f"is disjoint\n{A} {su.symbols['intersection']} {B} = {su.symbols['emptySet']}\n{A.isdisjoint(B)}\n")

    print(f"is subset of\n{B} {su.symbols['is_subset_of']} {A}\n{B.issubset(A)}\n") # or B <= A

    print(f"is proper subset of\n{B} {su.symbols['is_proper_subset_of']} {A}\n{B < A}\n")
    # no method for proper subset, can only use <

    print(f"is superset of\n{A} {su.symbols['is_superset_of']} {B}\n{A.issuperset(B)}\n") # or A >= B

    print(f"is proper superset of\n{A} {su.symbols['is_proper_superset_of']} {B}\n{A > B}\n")
    # no method for proper superset, can only use >


def combinations():

    A = {1,2,4,7,9}
    B = {1,2,4,8,10}

    U = A.union(B)
    # or U = A | B

    I = A.intersection(B)
    # or I = A & B

    D = A.difference(B)
    # or D = A - B

    S = A.symmetric_difference(B)
    # or D = A ^ B

    print(f"{A} {su.symbols['union']} {B} = {U}")
    print(f"{A} {su.symbols['intersection']} {B} = {I}")
    print(f"{A} {su.symbols['set_difference']} {B} = {D}")
    print(f"{A} {su.symbols['symmetric_difference']} {B} = {S}")


def update():

    A = {1,2,3,4}
    B = {4,5,6,7}
    C = {5,6,7,8,9}
    D = {7,8,9,10}
    E = {6,11,12}

    print(f"A = {A}\n")

    # A is now union of itself and B
    print(f"{A}.update({B})")
    A.update(B) # or |=
    print(f"= {A}\n")

    # A now contains only elements in itself and C
    print(f"{A}.intersection_update({C})")
    A.intersection_update(C) # or &=
    print(f"= {A}\n")

    # A now contains only elements not also in D
    print(f"{A}.difference_update({D})")
    A.difference_update(D) # or -=
    print(f"= {A}\n")

    # A now contains elements in itself but not E, and in E but not itself
    print(f"{A}.symmetric_difference_update({E})")
    A.symmetric_difference_update(E) # or ^=
    print(f"= {A}\n")


if __name__ == "__main__":
    main()

After importing setutilities the main function calls seven functions which we can uncomment and run one at a time.

Symbols

Firstly I have printed the symbols dictionary to show its contents. As I mentioned I'll be using a number of them later. Run the program with this command.

Running the Code

python3 setdemo.py

Which will give you this output.

Program Output: symbols

{'emptySet': '∅', 'union': '∪', 'intersection': '∩', 'is_element_of': '∈', 'is_not_element_of': '∉', 'is_subset_of': '⊆', 'is_proper_subset_of': '⊂', 'is_not_subset_of': '⊄', 'is_superset_of': '⊇', 'is_proper_superset_of': '⊃', 'is_not_superset_of': '⊅', 'natural_numbers': 'ℕ', 'integers': 'ℤ', 'rational_numbers': 'ℚ', 'real_numbers': 'ℝ', 'complex_numbers': 'ℂ', 'power_set': 'ℙ', 'universal_set': 'ξ', 'cartesian_product': '×', 'set_difference': '\\', 'symmetric_difference': 'Δ', 'cardinality': '|', 'complement': "'"}

create_and_manipulate

This function creates a set, performs a few basic operations on it, and prints the set after each one. The methods used for these operations are:

  • add - adds an item to the set

  • update - adds a list of items to the set. (Note that add and update silently ignore any duplicate items.)

  • pop - returns and removes an item. Sets are unordered so you might get any value back irrespective of insertion order.

  • remove - removes the specified item. Raises KeyError if item not present as demonstrated here by attempting to remove the same value twice.

  • discard - same as remove but does not raise an error if the item is not present.

  • clear - removes all items

Uncomment create_and_manipulate in main and run the program again. The output shows the set after each operation.

Program Output: create_and_manipulate

set()
{98}
{98, 108, 101, 103}
98
{108, 101, 103}
element 103 removed
{108, 101}
element not in set
{108, 101}
{108, 101}
set()

builtins

This function demonstrates a few builtin functions used on sets. Note the set A is created with an alternative syntax using curly brackets containing values to initialize the set with. I have used uppercase letters for set names in this project which follows mathematical convention.

The functions used here are self-explanatory and include filter and map to create new sets of positive values and squares respectively.

This is the output of the function.

Program Output: builtins

Selection of built in functions applied to set
----------------------------------------------
set A {0, 4, 7, -86, 85, -67}
len(A) 6
min(A) -86
max(A) 85
sum(A) -57
all(A) False
any(A) True
filter {4, 85, 7}
map {0, 7396, 4489, 16, 49, 7225}

cardinality_string

A very short function which creates a set and then passes it to the cardinality_string function, printing the result which looks like this.

Program Output: cardinality_string

|myset| = 6

tests_and_comparisons

This function creates a few sample sets and compares them with various methods and operators, printing out the Trueness or Falseness of each test. The tests are:

  • is element of - here we simply use the in keyword, eg 9 in A

  • is not element of - nearly as simple, we use 3 not in A

  • is disjoint - 2 sets are disjoint if they have no items in common, tested using the isdisjoint method. The notation for disjointedness is a bit convoluted, and basically states that the intersection (ie set of shared values) is the empty set.

  • is subset of - the first set is a subset of the second if the second contains all the items in the first, and Python provides the issubset method. We can also use the <= operator. Note that the first set is still a subset if the sets are identical.

  • is proper subset of - this is similar to subset but the second set must also contain at least one other item. There is no method for this, we can only use the < operator.

  • is superset of - this is like subset but reversed.

  • is proper superset of - a reversal of proper subset. Again we have to use an operator, >, as there is no method.

You probably noticed that I have used various symbols from setutilities, for example symbols['is_element_of']. This is the output.

Program Output: tests_and_comparisons

is element of
9 ∈ {1, 2, 4, 7, 9, 13, 16, 22}
True

is not element of
3 ∉ {1, 2, 4, 7, 9, 13, 16, 22}
True

is disjoint
{1, 2, 4, 7, 9, 13, 16, 22} ∩ {1, 2, 4, 7} = ∅
False

is subset of
{1, 2, 4, 7} ⊆ {1, 2, 4, 7, 9, 13, 16, 22}
True

is proper subset of
{1, 2, 4, 7} ⊂ {1, 2, 4, 7, 9, 13, 16, 22}
True

is superset of
{1, 2, 4, 7, 9, 13, 16, 22} ⊇ {1, 2, 4, 7}
True

is proper superset of
{1, 2, 4, 7, 9, 13, 16, 22} ⊃ {1, 2, 4, 7}
True

combinations

Now we'll combine two sets into new ones in various ways using the following methods:

  • union - the new set contains the items of both sets, without duplicates of course. The | operator can also be used.

  • intersection - the new set contains only items in both the original sets. Can also use the & operator.

  • difference - items in the first set but not the second. You can also use the Python operator - (minus sign) although the mathematical notation is \ (backslash).

  • symmetric_difference - the set of items unique to each set. Can also use the ^ operator.

This is the function's output, using the correct symbols. Yours outputs might be ordered differently as sets have no inherent order and can be displayed any which way.

Program Output: combinations

{1, 2, 4, 7, 9} ∪ {1, 2, 4, 8, 10} = {1, 2, 4, 7, 8, 9, 10}
{1, 2, 4, 7, 9} ∩ {1, 2, 4, 8, 10} = {1, 2, 4}
{1, 2, 4, 7, 9} \ {1, 2, 4, 8, 10} = {9, 7}
{1, 2, 4, 7, 9} Δ {1, 2, 4, 8, 10} = {7, 8, 9, 10}

update

The four methods (and equivalent operators) used in the previous function all return new sets. There are also four methods which do basically the same thing but update an existing set. Three have the same names suffixed with _update, the exception just being called update. (If they had asked me I would have called it union_update. But they didn't . . . !). The four methods also have operator equivalents, |=, &=, -= and ^=.

This is the output of the function.

Program Output: update

A = {1, 2, 3, 4}

{1, 2, 3, 4}.update({4, 5, 6, 7})
= {1, 2, 3, 4, 5, 6, 7}

{1, 2, 3, 4, 5, 6, 7}.intersection_update({5, 6, 7, 8, 9})
= {5, 6, 7}

{5, 6, 7}.difference_update({8, 9, 10, 7})
= {5, 6}

{5, 6}.symmetric_difference_update({11, 12, 6})
= {5, 11, 12}

I hope you found this overview of set theory in Python interesting and possibly even useful. As I mentioned at the beginning there are many practical applications of set theory and in the future I'll look at a few of them.