Working with Python Sets

In mathematics, a set is a collection of distinct objects. Python lets you create sets and perform basic operations on them. Before we get into how to work with sets, here’s a quick review of some key terms.

Set Terminology

  • x is a member of B if x is an element in the set B
  • An empty set is a set with no members
  • A is a subset of B if every member of A is also in B. Another way to express this is that B includes A.
  • If sets A and B have the same members, then they are equal sets
  • If A is a subset of B, then B is a superset of A. If B also has members that are not in A, then B is a proper superset of A (and A is a proper subset of B)
  • Cardinality is the number of members in a set
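
To make these terms concrete, here’s a minimal sketch of how each one maps onto Python. The sets A, B and empty are hypothetical examples picked just for illustration, and len is how Python reports cardinality.

```python
# Hypothetical example sets, chosen only for illustration.
A = {1, 2}
B = {1, 2, 3}
empty = set()

is_member = 1 in B              # membership
is_subset = A <= B              # every member of A is also in B
is_superset = B >= A            # B includes A
are_equal = A == {2, 1}         # same members, order is irrelevant
cardinality = len(B)            # number of members
empty_cardinality = len(empty)  # an empty set has no members

print(is_member, is_subset, is_superset, are_equal, cardinality, empty_cardinality)
# prints: True True True True 3 0
```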

Set Creation

Creating a set literal

What’s neat is that the syntax for sets in Python is the same as the mathematical notation! Just put your elements between braces:

>>> a = {1, 2, 3, 4}
>>> type(a)
<class 'set'>

The cardinality of a is 4.

If you include duplicate elements, they’ll get excluded from the set automatically.

>>> b = {1, 1, 1, 1}
>>> b
{1}

So while you included four elements, Python enforces set properties so that the cardinality of the set is still 1.
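
One place where the brace syntax does not carry over is the empty set. As a quick sketch (the variable names are just for illustration): empty braces give you a dictionary, so the usual way to get an empty set is to call set() with no arguments.

```python
empty_set = set()   # the usual way to create an empty set
not_a_set = {}      # empty braces create a dict, not a set

print(type(empty_set).__name__)  # prints: set
print(type(not_a_set).__name__)  # prints: dict
```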

Sets don’t have to be homogeneous - they can contain mixed types. However, they are limited to hashable objects. Generally speaking, built-in immutable types like strings and integers are hashable by default, but mutable structures like lists and dictionaries are not.
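
Here’s a small sketch of what the hashability rule looks like in practice. The values are made up for illustration. The exact wording of the error message varies between Python versions, so the code only keeps the message rather than relying on its text.

```python
# Integers, strings, floats and tuples of hashable items can all share a set.
mixed = {1, "two", 3.0, (4, 5)}

# Lists are mutable and therefore unhashable, so this raises TypeError.
try:
    bad = {[1, 2], [3, 4]}
except TypeError as exc:
    error_message = str(exc)

# frozenset is the immutable, hashable set type, so sets can contain frozensets.
nested = {frozenset({1, 2}), frozenset({3})}

print(len(mixed), len(nested))  # prints: 4 2
```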

Converting iterables to sets

One thing you’ll do often is convert some iterable (usually a list) into a set. You can turn pretty much any iterable into a set using the set builtin.

Here’s converting a dictionary:

>>> a = {"name": "joe", "age": 15}
>>> set(a)
{'age', 'name'}

Note that only the keys end up in the set, because iterating over a dictionary yields its keys.

And here’s converting a list of integers:

>>> b = [1, 2, 3, 4, 5]
>>> set(b)
{1, 2, 3, 4, 5}
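
A few more conversions, as a rough sketch with made-up data: removing duplicates from a list, splitting a string into its distinct characters, and building a set from a dictionary’s values instead of its keys.

```python
# Hypothetical data for illustration.
names = ["ann", "bob", "ann", "cy", "bob"]
person = {"name": "joe", "age": 15}

unique_names = set(names)        # duplicates collapse to one member each
letters = set("hello")           # a string is an iterable of characters
values = set(person.values())    # use .values() to convert values, not keys

print(len(unique_names), len(letters))  # prints: 3 4
```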

Basic set operations

Now let’s look at some common set operations. Let A be the set {1, 2, 3, 4} and B be the set {4, 5, 6}. For each operation, I’ve included its definition:

Union

The union of A and B, denoted by A ∪ B, is the set of all things that are members of either A or B.

You can call the union method on the set:

>>> A.union(B)
{1, 2, 3, 4, 5, 6}

Or you can use the binary | (bitwise OR) operator:

>>> A | B
{1, 2, 3, 4, 5, 6}

One common mistake I make is attempting to use the + operator:

>>> A + B
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'set' and 'set'
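
The method and the operator are not quite interchangeable. As a sketch using the same A and B: the union method accepts any iterables (and several at once), while | needs a set on both sides. The |= form updates a set in place; C here is just a copy made for the demonstration.

```python
A = {1, 2, 3, 4}
B = {4, 5, 6}

# The method takes any iterables, and more than one of them.
from_method = A.union(B, [7, 8])

# The operator only works between sets, so a list on the right fails.
try:
    A | [7, 8]
    operator_needs_sets = False
except TypeError:
    operator_needs_sets = True

# In-place union on a copy, leaving A itself untouched.
C = set(A)
C |= B

print(from_method == {1, 2, 3, 4, 5, 6, 7, 8}, operator_needs_sets)  # prints: True True
```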

Intersection

The intersection of A and B, denoted by A ∩ B, is the set of all things that are members of both A and B.

You can call intersection on the set:

>>> A.intersection(B)
{4}

Or you can use the binary & (bitwise AND) operator:

>>> A & B
{4}
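
A typical use for intersection is finding what two lists have in common. Here’s a minimal sketch with made-up lists named first and second:

```python
# Hypothetical lists for illustration.
first = [1, 2, 2, 3, 4]
second = [4, 3, 3, 9]

# Convert both to sets and intersect them with the operator.
common = set(first) & set(second)

# The method form accepts the list directly, no second conversion needed.
also_common = set(first).intersection(second)

print(common == also_common)  # prints: True
```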

Complements

The relative complement of B in A is the set of all elements that are members of A but not members of B.

Just like set notation, you use the subtraction operator:

>>> A - B
{1, 2, 3}
>>> B - A
{5, 6}
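
There’s a method form here too. A short sketch with the same A and B, showing that difference matches the - operator, that the order of the operands matters, and that the method accepts any iterable:

```python
A = {1, 2, 3, 4}
B = {4, 5, 6}

only_in_a = A.difference(B)           # same result as A - B
only_in_b = B.difference(A)           # same result as B - A
from_list = A.difference([4, 5, 6])   # the method accepts any iterable

print(only_in_a == A - B, only_in_a == only_in_b)  # prints: True False
```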

Cartesian Product

The Cartesian product of two sets A and B, denoted by A × B, is the set of all ordered pairs (a, b) such that a is a member of A and b is a member of B.

This isn’t something I do often - and sets have no built-in method for it - but here’s some code using a set comprehension, which also demonstrates set iteration.

>>> {(a, b) for a in A for b in B}
{(4, 4), (2, 4), (3, 4), (1, 5), (4, 6), (1, 4), (4, 5), (2, 6), (3, 6), (1, 6), (2, 5), (3, 5)}
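
If you’d rather not write the comprehension yourself, the standard library’s itertools.product yields the same pairs. This is a sketch of that alternative, not a set method; the result is wrapped in set() to get a set back. The cardinality of the product is the product of the two cardinalities.

```python
from itertools import product

A = {1, 2, 3, 4}
B = {4, 5, 6}

# product yields (a, b) tuples; set() collects them into a set.
pairs = set(product(A, B))

# Same result as the comprehension approach.
from_comprehension = {(a, b) for a in A for b in B}

print(pairs == from_comprehension, len(pairs))  # prints: True 12
```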

Membership Operations

  • Testing if object x is a member of A: x in A
  • Testing if two sets are equal: A == B
  • Testing if A is a subset of B: A <= B (use A < B to test for a proper subset)
  • Testing if A is a superset of B: A >= B (use A > B to test for a proper superset)
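
Here’s a quick sketch of these tests in action. A and B are the sets from earlier, and C is a hypothetical extra set added for the subset checks. Note the difference between <= and <: a set is always a subset of itself, but never a proper subset of itself.

```python
A = {1, 2, 3, 4}
B = {4, 5, 6}
C = {1, 2}  # hypothetical extra set for illustration

member = 4 in A                 # True: 4 is a member of A
equal = A == B                  # False: different members
subset = C <= A                 # True: every member of C is in A
proper_subset = C < A           # True: A also has members not in C
self_subset = A <= A            # True: a set is a subset of itself
self_proper = A < A             # False: but not a proper subset
superset = A >= C               # True: A includes C

# The method forms issubset and issuperset match <= and >=.
same_as_methods = C.issubset(A) and A.issuperset(C)

print(member, equal, subset, proper_subset, self_subset, self_proper, superset)
# prints: True False True True True False True
```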

References

https://en.wikipedia.org/wiki/Set_(mathematics)