3.Review
How do we prove that a language is regular?
We have a number of approaches in our toolbox.
How do we know if a language is non-regular?
Given so many tools for creating a regular language, are there
languages that are not regular?
(The very fact that we are concerned with this question is a hint that
this can happen.)
If a language \(L\) is finite, is \(L\) regular?
If \(L\) is infinite, is \(L\) regular?
Something to Think About, Revisited
\(L = \{a^nb^n \mid n>0\}\)
Is language \(L\) regular?
Can you draw a DFA, regular expression, or Regular grammar for this
language?
Our First Non-regular Language (1)
Prove that \(L_2 = \{a^nb^n | n > 0 \}\) is not regular.
Proof 1 (by contradiction)
If \(L_2\) is regular then \(\exists\) DFA \(M\)
that recognizes \(L_2\).
\(M\) has a finite number of states, say \(k\) states.
Consider a long string \(a^kb^k \in L_2\).
Since there are \(k\) states and \(k\) a’s,
some state in \(M\) must be reached more than once when
following the path of \(a^k\).
In that case, there is a loop with one or more a’s
(say \(t\) a’s for some \(t > 1\)) along the path.
Our First Non-regular Language (2)
Proof 1 (continued)
Suppose we start at the initial state, traverse the same path for
\(a^kb^k\), but we traverse the loop of \(a\) ‘s one
additional time.
We will end up in the same final state that \(a^kb^k\) did,
but our actual number of a’s is some other value (call it
\(m\)).
Therefore, the string \(a^mb^k\) is accepted by \(M\),
but this string is not in \(L_2\). Contradiction!
Thus, \(L_2\) is not regular.
Pigeonhole Principle
This is an example of the Pigeonhole Principle.
The Pigeonhole Principle states that, given \(n\) pigeonholes
and \(n+1\) pigeons, when all of the pigeons go into the holes
we can be sure that at least one hole contains more than one pigeon.
In our case, the number of \(a\) ‘s are the pigeons,
and the states in the DFA are the pigeonholes.
We can’t distinguish the various possibilities for the number of
\(a\) ‘s, so we can’t verify that they properly match the number
of \(b\) ‘s.
Pumping Concept (1)
We introduce the concept of “pumping” the string as
we go around the loop.
Loops are how we get infinite languages.
They are also how we lose count or otherwise lose the ability to
distinguish various properties of the string being processed.
Another way of looking at it is that the memory of a DFA is embodied
explicitly in its set of states.
Since the number of states is finite, the memory is finite.
A program in a traditional programming language might
have one integer (that conceptually at least stores an infinite number
of values, even if that is not literally true), a DFA has no such
thing.
Pumping Concept (2)
In general, consider DFAs with or without cycles.
If there is no cycle, the language accepted is finite.
If there is one or more cycle, then the language is infinite.
Let’s assume that there is exactly one cycle.
The cycle might be skipped, executed once, or executed more than once.
We can consider any string accepted by this DFA when the cycle is
executed exactly once to be of the form
\(w = w_1vw_2\) where the \(v\) represents the part of the
string captured by the cycle.
If the cycle is skipped, then we get \(w = w_1w_2\)
If its executed twice we get \(w = w_1v^2w_2\).
In general, the DFA accepts all strings like \(w = w_1v^*w_2\).
Pumping Lemma
Let \(L\) be an infinite regular language.
Then there exists a constant \(m > 0\) such that any
\(w \in L\) with \(|w| \ge m\) can be decomposed into three
parts as \(w=xyz\) with:
\(|xy| \le m\)
\(|y| \ge 1\)
\(xy^iz \in L\) for all \(i\ge 0\)
Meaning: Every sufficiently long string in \(L\)
(the constant \(m\) corresponds to the finite number of states in
\(M\))
can be partitioned into three parts such that the middle
part can be “pumped”, resulting in strings that must be in \(L\).
P.L. Used to Prove L Non-regular
Proof by Contradiction.
Assume L is regular.
Therefore \(L\) satisfies the pumping lemma.
Choose a long string \(w \in L\), \(|w| \ge m\).
(The choice of the string is crucial.
We must pick a string that will yield a contradiction).
Show that there is NO division of \(w\) into \(xyz\)
(must consider all possible divisions) such that
\(|xy| \le m\), \(|y| \ge 1\) and \(xy^iz \in L \forall i \ge 0\).
The pumping lemma does not hold. Contradiction!
\(\Rightarrow L\) is not regular.
Example
Prove that \(L = \{a^nb^n | n \geq 0\}\) is not regular.
Assume \(L\) is regular, therefore the pumping lemma holds.
Choose \(w = a^mb^m\)
where \(m\) is the constant in the pumping lemma.
(Note that \(w\) must be chosen such that \(|w| \ge m\).)
Choose \(n = m\).
Therefore, substring \(y\) must be some number of
\(a\) ‘s.
So the partition is \(x=a^{n-k}\quad |\quad y=a^k\quad |\quad z=b^n\)
where \(n \leq m\) and \(k > 0\).
It should be true that \(xy^iz \in L\) for all \(i\ge 0\).
But clearly this is not true. Contradiction!
Observations
Unfortunately, the pumping lemma is one-way:
For (some) languages we can use the pumping lemma to prove that they
are not regular.
But we cannot use the pumping lemma to help us prove that a language
is regular.
And the pumping lemma is not a universal solution for determining that
a language is non-regular.
Its just a tool in the toolbox.
More Observations
The pumping lemma says that there
is some way to define the language that meets the criteria.
It is not enough to pick your favorite value of \(m\) for which
the language would/would not be regular.
You have to show that no satisfactory \(m\) can exist.
Can View as Adversary Argument
Your want to establish a contradiction (to prove the language is
not regular)
Meanwhile, the opponent tries to stop the proof.
The moves in the game are:
1. The opponent picks \(m\).
2. We pick string \(w\) in \(L\) of length equal or greater
than \(m\).
We are free to chose any \(w\), so long as \(w \in L\) and
\(|w| \geq m\).
3. The opponent chooses the decomposition \(xyz\), such that
\(|xy| \leq m, |y| \geq 1\).
The opponent will make the choice that is hardest for us to win the
game.
4. We try to pick \(i\) so that the pumped string
\(w_i = xy^iz\) is not in \(L\).
If we can always do this, we win (\(L\) is not regular).
Example
Prove \(L = \{ww^R : w \in \Sigma^*\}\) is not regular.
For any value \(m\), we pick the string
\(a^mb^mb^ma^m\).
Since \(|xy| \leq m\), \(y\) must consist entirely of
\(a\) ‘s.
If we pick \(i = 0\), then the resulting string has fewer
\(a\) ‘s on the left than on the right and so cannot be of
the form \(ww^R\).
Therefore, \(L\) is not regular.
Example: Failure (as expected)
If the language is indeed regular, you should find it impossible to
use the pumping lemma to prove it non-regular!
Prove \(L = \{a^mb^n \mid n+m\) is odd \(\}\) is not regular.
If the opponent picks \(m = 1\), then we can pick
\(w = abb\).
Whatever the adversary picks for
\(xyz\), we end up with \(y\) such that we can pump
strings not in the language.
SO… does this mean that \(L\) is non-regular?
NO!! The adversary will not pick a
bad choice for \(m\) if they don’t have to!
Example: Failure (as expected)
Prove \(L = \{a^mb^n \mid n+m\) is odd \(\}\) is not regular.
Say that the opponent picks \(m = 3\).
We can choose this string that is in the language:
\(w = aaabb\) so as to constrain the opponent to picking
values for \(y\) with all \(a\) ‘s.
But unfortunately, the opponent picks decomposition
\(a(aa)^ibb\).
We can’t pick \(i\) that is not in the language.
The point is that we cannot find a string, for all values
of \(m\), such that the opponent cannot also pick workable
values for \(x, y, z\).
Adversary Argument Explained (1)
Consider the Pumping Lemma definition again:
Let \(L\) be an infinite regular language.
There exists a constant \(m > 0\) such that any
\(w \in L\) with \(|w| \ge m\) can be decomposed into three
parts as \(w=xyz\) with:
\(|xy| \le m\)
\(|y| \ge 1\)
\(xy^iz \in L\) for all \(i\ge 0\)
1. The opponent picks \(m\).
2. We pick string \(w\).
3. The opponent chooses the decomposition \(xyz\).
4. We try to pick \(i\).
Adversary Argument Explained (2)
WE seek to prove the language non-regular.
The adversary seeks to stop us.
- There exists a constant \(m > 0\)
[= Adversary picks a value for \(m\).]
- … such that any \(w \in L\) with \(|w| \ge m\)
[= WE pick our choice for \(w\).]
- … can be decomposed into three parts as \(w=xyz\)
[= Adversary picks \(xyz\)]
(that meets the length criteria on \(xy\) and \(y\))
- … such that \(xy^iz \in L\) for all \(i\ge 0\)
[= WE pick a value for \(i\).]
Example
Prove that \(L = \{a^ncb^n | n > 0\}\) is not regular.
Assume \(L\) is regular, therefore the pumping lemma holds.
Choose \(w = a^mcb^m\)
where \(m\) is the constant in the pumping lemma.
(Note that \(w\) must be choosen such that \(|w|\ge m\).)
The only way to partition \(w\) into three parts,
\(w=xyz\), is such that \(x\) contains 0 or more a’s,
\(y\) contains 1 or more a’s, and \(z\) contains 0 or
more a’s concatenated with \(cb^m\).
This is because of the restrictions \(|xy| \le m\) and
\(|y|> 0\).
So the partition is: \(x=a^k\quad |\quad y=a^j\quad |\quad z=a^{m-k-j}cb^m\)
where \(k \ge 0\), \(j > 0\), and \(k + j \le m\)
for some constants \(k\) and \(j\).
It should be true that \(xy^iz \in L\) for all \(i\ge 0\).
\(xy^0z = a^{m-j}cb^{m} \not \in L\). Contradiction!
Example (1)
Prove that \(L = \{a^3b^nc^{n-3} | n > 3 \}\) is not regular.
We can do this with the Pumping Lemma, but it is very complicated!
Using Closure Properties
Using closure properties of regular languages, construct a language
that should be regular, but for which you have already shown is
not regular. Contradiction.
Proof Outline:
Assume \(L\) is regular.
Apply closure properties to \(L\) and other regular
languages, constructing \(L'\) that you know is not regular.
Closure properties \(\Rightarrow L'\) is regular.
Contradiction. So \(L\) is not regular.
Example
Prove that \(L = \{a^3b^nc^{n-3} | n > 3 \}\) is not regular.
Assume \(L\) is regular.
Define a homomorphism \(h: \Sigma \rightarrow \Sigma^*\)
such that \(h(a) = a\quad |\quad h(b) = a\quad |\quad h(c) = b\).
\(h(L) = \{a^3a^nb^{n-3} | n > 3 \} = \{a^{n+3}b^{n-3} | n > 3\}\)
\(L\) is regular and closure under homomorphism
\(\Rightarrow h(L)\) is regular.
The language \(\{b^6\}\) is a regular language.
By closure under concatenation,
\(L' = h(L)\{b^6\} = \{a^{n+3}b^{n+3} | n > 3\}\)
is regular.
The language
\(L'' = \{ab, aabb, aaabbb, aaaabbbb, aaaaabbbbb, aaaaaabbbbbb\}\) is regular.
By closure under union, \(L' \cup L'' = \{a^nb^n | n > 0\}\)
is regular.
But, we showed earlier that \(\{a^nb^n | n > 0 \}\) is not
regular! Contradiction.
Example
Prove that \(L = \{a^nb^ma^{m}\ |\ m \ge 0, n \ge 0 \}\) is not regular.
Assume \(L\) is regular.
\(L1 = \{ bb^{*}aa^{*}\}\)
\(L2 = L \cap L1 = \{b^na^n \mid n > 0\}\)
Define a homomorphism \(h: \Sigma \rightarrow \Sigma^*\)
such that \(h(a) = b\quad |\quad h(b) = a\).
\(h(L2) = \{a^nb^n | n>0 \}\) should be regular.
We showed earlier that \(\{a^nb^n | n > 0 \}\) is not
regular. Contradiction.
Example