3.Review§

How do we prove that a language is regular?

We have a number of approaches in our toolbox.

How do we know if a language is non-regular?

Given so many tools for creating a regular language, are there languages that are not regular?

(The very fact that we are concerned with this question is a hint that this can happen.)

If a language \(L\) is finite, is \(L\) regular?

If \(L\) is infinite, is \(L\) regular?

Something to Think About, Revisited

\(L = \{a^nb^n \mid n>0\}\)

Is language \(L\) regular? Can you draw a DFA, regular expression, or Regular grammar for this language?

Our First Non-regular Language (1)

Prove that \(L_2 = \{a^nb^n | n > 0 \}\) is not regular.

Proof 1 (by contradiction)
If \(L_2\) is regular then \(\exists\) DFA \(M\) that recognizes \(L_2\).
\(M\) has a finite number of states, say \(k\) states.
Consider a long string \(a^kb^k \in L_2\).
Since there are \(k\) states and \(k\) a’s, some state in \(M\) must be reached more than once when following the path of \(a^k\).
In that case, there is a loop with one or more a’s (say \(t\) a’s for some \(t > 1\)) along the path.

Our First Non-regular Language (2)

Proof 1 (continued)
Suppose we start at the initial state, traverse the same path for \(a^kb^k\), but we traverse the loop of \(a\) ‘s one additional time.
We will end up in the same final state that \(a^kb^k\) did, but our actual number of a’s is some other value (call it \(m\)).
Therefore, the string \(a^mb^k\) is accepted by \(M\), but this string is not in \(L_2\). Contradiction!
Thus, \(L_2\) is not regular.

Pigeonhole Principle

This is an example of the Pigeonhole Principle.

The Pigeonhole Principle states that, given \(n\) pigeonholes and \(n+1\) pigeons, when all of the pigeons go into the holes we can be sure that at least one hole contains more than one pigeon.

In our case, the number of \(a\) ‘s are the pigeons, and the states in the DFA are the pigeonholes.

We can’t distinguish the various possibilities for the number of \(a\) ‘s, so we can’t verify that they properly match the number of \(b\) ‘s.

Pumping Concept (1)

We introduce the concept of “pumping” the string as we go around the loop.
Loops are how we get infinite languages.
They are also how we lose count or otherwise lose the ability to distinguish various properties of the string being processed.

Another way of looking at it is that the memory of a DFA is embodied explicitly in its set of states.

Since the number of states is finite, the memory is finite.

A program in a traditional programming language might have one integer (that conceptually at least stores an infinite number of values, even if that is not literally true), a DFA has no such thing.

Pumping Concept (2)

In general, consider DFAs with or without cycles.
If there is no cycle, the language accepted is finite.
If there is one or more cycle, then the language is infinite.
Let’s assume that there is exactly one cycle.
The cycle might be skipped, executed once, or executed more than once.
We can consider any string accepted by this DFA when the cycle is executed exactly once to be of the form \(w = w_1vw_2\) where the \(v\) represents the part of the string captured by the cycle.
If the cycle is skipped, then we get \(w = w_1w_2\)
If its executed twice we get \(w = w_1v^2w_2\).
In general, the DFA accepts all strings like \(w = w_1v^*w_2\).

Pumping Lemma

Let \(L\) be an infinite regular language.
Then there exists a constant \(m > 0\) such that any \(w \in L\) with \(|w| \ge m\) can be decomposed into three parts as \(w=xyz\) with:
\(|xy| \le m\)
\(|y| \ge 1\)
\(xy^iz \in L\) for all \(i\ge 0\)

Meaning: Every sufficiently long string in \(L\) (the constant \(m\) corresponds to the finite number of states in \(M\)) can be partitioned into three parts such that the middle part can be “pumped”, resulting in strings that must be in \(L\).

P.L. Used to Prove L Non-regular

Proof by Contradiction.
Assume L is regular.
Therefore \(L\) satisfies the pumping lemma.
Choose a long string \(w \in L\), \(|w| \ge m\).
(The choice of the string is crucial. We must pick a string that will yield a contradiction).
Show that there is NO division of \(w\) into \(xyz\)
(must consider all possible divisions) such that \(|xy| \le m\), \(|y| \ge 1\) and \(xy^iz \in L \forall i \ge 0\).
The pumping lemma does not hold. Contradiction!
\(\Rightarrow L\) is not regular.

Example

Prove that \(L = \{a^nb^n | n \geq 0\}\) is not regular.
Assume \(L\) is regular, therefore the pumping lemma holds.
Choose \(w = a^mb^m\) where \(m\) is the constant in the pumping lemma.
(Note that \(w\) must be chosen such that \(|w| \ge m\).)
Choose \(n = m\). Therefore, substring \(y\) must be some number of \(a\) ‘s.
So the partition is \(x=a^{n-k}\quad |\quad y=a^k\quad |\quad z=b^n\) where \(n \leq m\) and \(k > 0\).
It should be true that \(xy^iz \in L\) for all \(i\ge 0\).
But clearly this is not true. Contradiction!

Observations

Unfortunately, the pumping lemma is one-way:
For (some) languages we can use the pumping lemma to prove that they are not regular.
But we cannot use the pumping lemma to help us prove that a language is regular.
And the pumping lemma is not a universal solution for determining that a language is non-regular. Its just a tool in the toolbox.

More Observations

The pumping lemma says that there is some way to define the language that meets the criteria.

It is not enough to pick your favorite value of \(m\) for which the language would/would not be regular.

You have to show that no satisfactory \(m\) can exist.

Can View as Adversary Argument

Your want to establish a contradiction (to prove the language is not regular)
Meanwhile, the opponent tries to stop the proof.
The moves in the game are:
1. The opponent picks \(m\).
2. We pick string \(w\) in \(L\) of length equal or greater than \(m\).
We are free to chose any \(w\), so long as \(w \in L\) and \(|w| \geq m\).
3. The opponent chooses the decomposition \(xyz\), such that \(|xy| \leq m, |y| \geq 1\).
The opponent will make the choice that is hardest for us to win the game.
4. We try to pick \(i\) so that the pumped string \(w_i = xy^iz\) is not in \(L\).
If we can always do this, we win (\(L\) is not regular).

Example

Prove \(L = \{ww^R : w \in \Sigma^*\}\) is not regular.
For any value \(m\), we pick the string \(a^mb^mb^ma^m\).
Since \(|xy| \leq m\), \(y\) must consist entirely of \(a\) ‘s.
If we pick \(i = 0\), then the resulting string has fewer \(a\) ‘s on the left than on the right and so cannot be of the form \(ww^R\).
Therefore, \(L\) is not regular.

Example: Failure (as expected)

If the language is indeed regular, you should find it impossible to use the pumping lemma to prove it non-regular!

Prove \(L = \{a^mb^n \mid n+m\) is odd \(\}\) is not regular.
If the opponent picks \(m = 1\), then we can pick \(w = abb\).
Whatever the adversary picks for \(xyz\), we end up with \(y\) such that we can pump strings not in the language.
SO… does this mean that \(L\) is non-regular?
NO!! The adversary will not pick a bad choice for \(m\) if they don’t have to!

Example: Failure (as expected)

Prove \(L = \{a^mb^n \mid n+m\) is odd \(\}\) is not regular.
Say that the opponent picks \(m = 3\).
We can choose this string that is in the language: \(w = aaabb\) so as to constrain the opponent to picking values for \(y\) with all \(a\) ‘s.
But unfortunately, the opponent picks decomposition \(a(aa)^ibb\).
We can’t pick \(i\) that is not in the language.
The point is that we cannot find a string, for all values of \(m\), such that the opponent cannot also pick workable values for \(x, y, z\).

Adversary Argument Explained (1)

Consider the Pumping Lemma definition again:
Let \(L\) be an infinite regular language. There exists a constant \(m > 0\) such that any \(w \in L\) with \(|w| \ge m\) can be decomposed into three parts as \(w=xyz\) with:
\(|xy| \le m\)
\(|y| \ge 1\)
\(xy^iz \in L\) for all \(i\ge 0\)
1. The opponent picks \(m\).
2. We pick string \(w\).
3. The opponent chooses the decomposition \(xyz\).
4. We try to pick \(i\).

Adversary Argument Explained (2)

WE seek to prove the language non-regular.
The adversary seeks to stop us.
  1. There exists a constant \(m > 0\) [= Adversary picks a value for \(m\).]
  2. … such that any \(w \in L\) with \(|w| \ge m\) [= WE pick our choice for \(w\).]
  3. can be decomposed into three parts as \(w=xyz\) [= Adversary picks \(xyz\)] (that meets the length criteria on \(xy\) and \(y\))
  4. … such that \(xy^iz \in L\) for all \(i\ge 0\) [= WE pick a value for \(i\).]

Example

Prove that \(L = \{a^ncb^n | n > 0\}\) is not regular.
Assume \(L\) is regular, therefore the pumping lemma holds.
Choose \(w = a^mcb^m\) where \(m\) is the constant in the pumping lemma. (Note that \(w\) must be choosen such that \(|w|\ge m\).)
The only way to partition \(w\) into three parts, \(w=xyz\), is such that \(x\) contains 0 or more a’s, \(y\) contains 1 or more a’s, and \(z\) contains 0 or more a’s concatenated with \(cb^m\). This is because of the restrictions \(|xy| \le m\) and \(|y|> 0\).
So the partition is: \(x=a^k\quad |\quad y=a^j\quad |\quad z=a^{m-k-j}cb^m\) where \(k \ge 0\), \(j > 0\), and \(k + j \le m\) for some constants \(k\) and \(j\).
It should be true that \(xy^iz \in L\) for all \(i\ge 0\).
\(xy^0z = a^{m-j}cb^{m} \not \in L\). Contradiction!

Example (1)

Prove that \(L = \{a^3b^nc^{n-3} | n > 3 \}\) is not regular.
We can do this with the Pumping Lemma, but it is very complicated!

Using Closure Properties

Using closure properties of regular languages, construct a language that should be regular, but for which you have already shown is not regular. Contradiction.

Proof Outline:
Assume \(L\) is regular.
Apply closure properties to \(L\) and other regular languages, constructing \(L'\) that you know is not regular.
Closure properties \(\Rightarrow L'\) is regular.
Contradiction. So \(L\) is not regular.

Example

Prove that \(L = \{a^3b^nc^{n-3} | n > 3 \}\) is not regular.
Assume \(L\) is regular.
Define a homomorphism \(h: \Sigma \rightarrow \Sigma^*\) such that \(h(a) = a\quad |\quad h(b) = a\quad |\quad h(c) = b\).
\(h(L) = \{a^3a^nb^{n-3} | n > 3 \} = \{a^{n+3}b^{n-3} | n > 3\}\)
\(L\) is regular and closure under homomorphism \(\Rightarrow h(L)\) is regular.
The language \(\{b^6\}\) is a regular language.
By closure under concatenation, \(L' = h(L)\{b^6\} = \{a^{n+3}b^{n+3} | n > 3\}\) is regular.
The language \(L'' = \{ab, aabb, aaabbb, aaaabbbb, aaaaabbbbb, aaaaaabbbbbb\}\) is regular.
By closure under union, \(L' \cup L'' = \{a^nb^n | n > 0\}\) is regular.
But, we showed earlier that \(\{a^nb^n | n > 0 \}\) is not regular! Contradiction.

Example

Prove that \(L = \{a^nb^ma^{m}\ |\ m \ge 0, n \ge 0 \}\) is not regular.
Assume \(L\) is regular.
\(L1 = \{ bb^{*}aa^{*}\}\)
\(L2 = L \cap L1 = \{b^na^n \mid n > 0\}\)
Define a homomorphism \(h: \Sigma \rightarrow \Sigma^*\) such that \(h(a) = b\quad |\quad h(b) = a\).
\(h(L2) = \{a^nb^n | n>0 \}\) should be regular.
We showed earlier that \(\{a^nb^n | n > 0 \}\) is not regular. Contradiction.

Example