4.2. Sets and Relations¶
4.2.1. Set Notation¶
The concept of a set in the mathematical sense has wide application in computer science. The notations and techniques of set theory are commonly used when describing and implementing algorithms because the abstractions associated with sets often help to clarify and simplify algorithm design.
A set is a collection of distinguishable members or elements. The members are typically drawn from some larger population known as the base type. Each member of a set is either a primitive element of the base type or is a set itself. There is no concept of duplication in a set. Each value from the base type is either in the set or not in the set. For example, a set named \(\mathbf{P}\) might consist of the three integers 7, 11, and 42. In this case, \(\mathbf{P}\)’s members are 7, 11, and 42, and the base type is integer.
The following table shows the symbols commonly used to express sets and their relationships.
Table 4.2.1
Here are some examples of this notation in use. First define two sets, \(\mathbf{P}\) and \(\mathbf{Q}\).
\(|\mathbf{P}| = 3\) (because \(\mathbf{P}\) has three members) and \(|\mathbf{Q}| = 2\) (because \(\mathbf{Q}\) has two members). Both of these sets are finite in length. Other sets can be infinite, for example, the set of integers.
The union of \(\mathbf{P}\) and \(\mathbf{Q}\), written \(\mathbf{P} \cup \mathbf{Q}\), is the set of elements in either \(\mathbf{P}\) or \(\mathbf{Q}\), which is {2, 3, 5, 10}. The intersection of \(\mathbf{P}\) and \(\mathbf{Q}\), written \(\mathbf{P} \cap \mathbf{Q}\), is the set of elements that appear in both \(\mathbf{P}\) and \(\mathbf{Q}\), which is {5}. The set difference of \(\mathbf{P}\) and \(\mathbf{Q}\), written \(\mathbf{P} - \mathbf{Q}\), is the set of elements that occur in \(\mathbf{P}\) but not in \(\mathbf{Q}\), which is {2, 3}. Note that \(\mathbf{P} \cup \mathbf{Q} = \mathbf{Q} \cup \mathbf{P}\) and that \(\mathbf{P} \cap \mathbf{Q} = \mathbf{Q} \cap \mathbf{P}\), but in general \(\mathbf{P} - \mathbf{Q} \neq \mathbf{Q} - \mathbf{P}\). In this example, \(\mathbf{Q} - \mathbf{P} = \{10\}\). Finally, the set {5, 3, 2} is indistinguishable from set \(\mathbf{P}\), because sets have no concept of order. Likewise, set {2, 3, 2, 5} is also indistinguishable from \(\mathbf{P}\), because sets have no concept of duplicate elements.
The set product or Cartesian product of two sets \(\mathbf{Q} \times \mathbf{P}\) is a set of ordered pairs. For our example sets, the set product would be
The powerset of a set \(\mathbf{S}\) (denoted \(2^S\)) is the set of all possible subsets for \(\mathbf{S}\). Consider the set \(\mathbf{S} = \{ a, b, c \}\). The powerset of \(\mathbf{S}\) is
A collection of elements with no order (like a set), but with duplicate-valued elements is called a bag 1. To distinguish bags from sets, we will use square brackets [] around a bag’s elements. For example, bag [3, 4, 5, 4] is distinct from bag [3, 4, 5], while set {3, 4, 5, 4} is indistinguishable from set {3, 4, 5}. However, bag [3, 4, 5, 4] is indistinguishable from bag [3, 4, 4, 5].
A sequence is a collection of elements with an order, and which may contain duplicate-valued elements. A sequence is also sometimes called a tuple or a vector. In a sequence, there is a 0th element, a 1st element, 2nd element, and so on. We will use angle brackets \(\langle\rangle\) to enclose the elements of a sequence. For example, \(\langle3, 4, 5, 4\rangle\) is a sequence. Note that sequence \(\langle3, 5, 4, 4\rangle\) is distinct from sequence \(\langle3, 4, 5, 4\rangle\), and both are distinct from sequence \(\langle3, 4, 5\rangle\).
- 1
The object referred to here as a bag is sometimes called a multilist. But, the term multilist also refers to a list that may contain sublists.
4.2.1.1. Relations¶
A relation \(R\) over set \(\mathbf{S}\) is a set of ordered pairs from \(\mathbf{S}\). As an example of a relation, if \(\mathbf{S}\) is \(\{a, b, c\}\), then
is a relation, and
is a different relation. If tuple \(\langle x, y\rangle\) is in relation \(R\), we may use the infix notation \(xRy\). We often use relations such as the less than operator (\(<\)) on the natural numbers, which includes ordered pairs such as \(\langle1, 3\rangle\) and \(\langle2, 23\rangle\), but not \(\langle3, 2\rangle\) or \(\langle2, 2\rangle\). Rather than writing the relationship in terms of ordered pairs, we typically use an infix notation for such relations, writing \(1<3\).
Define the properties of relations as follows, with \(R\) a binary relation over set \(\mathbf{S}\).
\(R\) is reflexive if \(aRa\) for all \(a \in \mathbf{S}\).
\(R\) is irreflexive if \(aRa\) is not true for all \(a \in \mathbf{S}\).
\(R\) is symmetric if whenever \(aRb\), then \(bRa\), for all \(a, b \in \mathbf{S}\).
\(R\) is antisymmetric if whenever \(aRb\) and \(bRa\), then \(a = b\), for all \(a, b \in \mathbf{S}\).
\(R\) is transitive if whenever \(aRb\) and \(bRc\), then \(aRc\), for all \(a, b, c \in \mathbf{S}\).
As examples, for the natural numbers, \(<\) is irreflexive (because \(aRa\) is never true), antisymmetric (because there is no case where \(aRb\) and \(bRa\)), and transitive. Relation \(\leq\) is reflexive, antisymmetric, and transitive. Relation \(=\) is reflexive, symmetric (and antisymmetric!), and transitive. For people, the relation “is a sibling of” is symmetric and transitive. If we define a person to be a sibling of themself, then it is reflexive; if we define a person not to be a sibling of themself, then it is not reflexive.
4.2.2. Equivalence Relations¶
\(R\) is an equivalence relation on set \(\mathbf{S}\) if it is reflexive, symmetric, and transitive. An equivalence relation can be used to partition a set into equivalence classes. If two elements \(a\) and \(b\) are equivalent to each other, we write \(a \equiv b\). A partition of a set \(\mathbf{S}\) is a collection of subsets that are disjoint from each other and whose union is \(\mathbf{S}\). An equivalence relation on set \(\mathbf{S}\) partitions the set into disjoint subsets whose elements are equivalent. The UNION/FIND algorithm efficiently maintains equivalence classes on a set. One application for such disjoint sets computing a minimal cost spanning tree.
Example 4.2.1
For the integers, \(=\) is an equivalence relation that partitions each element into a distinct subset. In other words, for any integer \(a\), three things are true.
\(a = a\),
if \(a = b\) then \(b = a\), and
if \(a = b\) and \(b = c\), then \(a = c\).
Of course, for distinct integers \(a\), \(b\), and \(c\) there are never cases where \(a = b\), \(b = a\), or \(b = c\). So the requirements for symmetry and transitivity are never violated, and therefore the relation is symmetric and transitive.
Example 4.2.2
If we clarify the definition of sibling to mean that a person is a sibling of themself, then the sibling relation is an equivalence relation that partitions the set of people.
Example 4.2.3
We can use the modulus function to define an equivalence relation. For the set of integers, use the modulus function to define a binary relation such that two numbers \(x\) and \(y\) are in the relation if and only if \(x \bmod m = y \bmod m\). Thus, for \(m = 4\), \(\langle1, 5\rangle\) is in the relation because \(1 \bmod 4 = 5 \bmod 4\). We see that modulus used in this way defines an equivalence relation on the integers, and this relation can be used to partition the integers into \(m\) equivalence classes. This relation is an equivalence relation because
\(x \bmod m = x \bmod m\) for all \(x\);
if \(x \bmod m = y \bmod m\), then \(y \bmod m = x \bmod m\); and
if \(x \bmod m = y \bmod m\) and \(y \bmod m = z \bmod m\), then \(x \bmod m = z \bmod m\).
4.2.3. Partial Orders¶
A binary relation is called a partial order if it is antisymmetric and transitive. If the relation is reflexive, it is called a non-strict partial order. If the relation is irreflexive, it is called a strict partial order. The set on which the partial order is defined is called a partially ordered set or a poset. Elements \(x\) and \(y\) of a set are comparable under a given relation \(R\) if either \(xRy\) or \(yRx\). If every pair of distinct elements in a partial order are comparable, then the order is called a total order or linear order.
Example 4.2.4
For the integers, relations \(<\) and \(\leq\) define partial orders. Operation \(<\) is a total order because, for every pair of integers \(x\) and \(y\) such that \(x \neq y\), either \(x < y\) or \(y < x\). Likewise, \(\leq\) is a total order because, for every pair of integers \(x\) and \(y\) such that \(x \neq y\), either \(x \leq y\) or \(y \leq x\).
Example 4.2.5
For the powerset of the integers, the subset operator defines a partial order (because it is antisymmetric and transitive). For example, \(\{1, 2\}\subseteq\{1, 2, 3\}\). However, sets {1, 2} and {1, 3} are not comparable by the subset operator, because neither is a subset of the other. Therefore, the subset operator does not define a total order on the powerset of the integers.