2.Regular Expressions§

Regular expressions are a way to specify the set of strings that define a language.

There are three operators that can be used:

+ union (or)
⋅ concatenation (AND)
∗ star-closure (repeat 0 or more times)

We often omit showing the

$\cdot$ for concatenation.

Examples

Example 1:

$(a + b)^* \cdot a \cdot (a + b)^* = (a + b)^*a(a + b)^*$

What language is this? All strings from

$\{a, b\}^*$ that contain at least one

$a$ .

Example2 :

$(aa)*$

What language is this? Strings of

$a$ ‘s with an even number of

$a$ ‘s

Definition for Regular Expression

Given $\Sigma$ ,

$\emptyset$ , $\lambda$ , and $a \in \Sigma$ are regular expressions.
If $r$ and $s$ are regular expressions then

$r + s$ is a regular expression.

$r s$ is a regular expression.

$(r)$ is a regular expression.

$r^*$ is a regular expression.
$r$ is a regular expression iff it can be derived from (1) with a finite number of applications of (2).

Definition for Regular Language

$L(r) =$ language denoted by some regular expression $r$ .

$\emptyset$ , $\{\lambda\}$ , and $\{a \in \Sigma\}$ are languages denoted by a regular expression.
If $r$ and $s$ are regular expressions, then

$L(r + s) = L(r) \cup L(s)$

$L(r s) = L(r) \cdot L(s)$

$L((r)) = L(r)$

$L((r)^*) = (L(r)^*)$

Precedence Rules

$^*$ highest
$\cdot$
$+$ lowest

Example: $ab^* + c = (a(b)^*) + c$

Examples of Regular Languages

$\Sigma = \{a,b\}$ , $\{w \in {\Sigma}^{*} \mid w$ has an odd number of $a$ ‘s followed by an even number of $b$ ‘s $\}$ :

??
$\Sigma=\{a,b\}$ , $\{w \in {\Sigma}^{*} \mid w$ has no more than three $a$ ‘s and must end in $ab\}$ :

??
Regular expression for positive and negative integers:

??

Regular Expressions vs. Regular Languages

Recall that we previously defined the term regular language to mean the languages that are recognized by a DFA.

(Which is the same as the languages recognized by an NFA.)

How do regular expressions relate to these?

We can easily see NFAs for

$\emptyset$ ,

$\lambda$ , and

$a \in \Sigma$ .

But what about the “more interesting” regular expressions?

Can every regular language

$r$ be described by some regular expression?

Regular Expression $\rightarrow$ NFA (1)

Theorem: Let $r$ be a R.E. Then $\exists$ NFA $M$ such that $L(M) = L(r)$ .

Proof: By simple simulation/construction.

(This is a standard approach to proving such things!)

We aleady know that we can easily do $\emptyset$ , $\{\lambda\}$ , and $\{a\}$ for $a \in \Sigma$ .

Regular Expression $\rightarrow$ NFA (1a)

Here is an NFA that accepts nothing ( $\emptyset$ ).

Here is an NFA that accepts an empty string ( $\lambda$ ).

Here is an NFA that accepts $a \in \Sigma$ .

Regular Expression $\rightarrow$ NFA (2)

Suppose that $r$ and $s$ are regular expressions. (By induction…) That means that there is an NFA for $r$ and an NFA for $s$ .

$r + s$ . Simply add a new start state and a new final state, each connected (in parallel) with $\lambda$ transitions to both $r$ and $s$ .

$r \cdot s$ . Add new start state and new final state, and connect them with $\lambda$ transitions in series.

$r^*$ . Add new start and final states, along with $\lambda$ transitions that allow free movement between them all.

<< Question: Why are we using NFAs for this proof?>>

Regular Expression $\rightarrow$ NFA: Schematic

1 / 4 Settings

<<<>>>

Suppose that we have a regular expression $r$. We want to find a simple representation for the NFA that accepts $r$.

Saving...

Server Error
Resubmit

Regular Expression $\rightarrow$ NFA: Or

1 / 8 Settings

<<<>>>

Suppose that we have a reqular expression $r+s$.

Saving...

Server Error
Resubmit

Regular Expression $\rightarrow$ NFA: Concatenation

1 / 8 Settings

<<<>>>

Lets suppose that we have a reqular expression $r.s$

Saving...

Server Error
Resubmit

Regular Expression $\rightarrow$ NFA: Star

1 / 9 Settings

<<<>>>

Lets suppose that we have a reqular expression $r^*$

Saving...

Server Error
Resubmit

Example

$ab^* + c$

1 / 4 Settings

<<<>>>

In this example, we will convert the NFA that accepts $ab^*+c$ to a minimized DFA.

Saving...

Server Error
Resubmit

NFA $\rightarrow$ Regular Expression

Theorem: Let $L$ be regular. Then $\exists$ a regular expression such that $L = L(r)$ .

Perhaps you see that any regular expression can be implemented as a NFA.

For most of us, its not obvious that any NFA can be converted to a regular expression.

Proof Idea:

Remove states sucessively, generating equivalent generalized transition graphs (GTG) until only two states are left (initial state and one final state).

The transition between these states is a regular expression that is equivalent to the original NFA.

Generalized Transition Graph (GTG)

A Generalized Transition Graph (GTG) is a transition graph whose edges can be labeled with any regular expression. Thus, it “generalizes” the standard transition graph.

Definition: A complete GTG is a complete graph, meaning that every state has a transition to every other state. Any GTG can be converted to a complete GTG by adding edges labeled $\emptyset$ as needed.

Proof (1)

$L$ is regular

$\Rightarrow \exists$ NFA

$M$ such that

$L = L(M)$ .

1. Assume

$M$ has one final state, and

$q_0 \notin F$ .

2. Convert

$M$ to a complete GTG.

Let

$r_{ij}$ stand for the label of the edge from

$q_i$ to

$q_j$ .

3. If the GTG has only two states, then it has this form:

rii

rjj

rji

rij

Add an arrow to the start state.

Then, the corresponding regular expression is:

$r = (r^*_{ii}r_{ij}r^*_{jj}r_{ji})^*r^*_{ii}r_{ij}r^*_{jj}$

Proof (2)

(See homework question about why its fair to just assume that there is a single final state, and that the start state is not final.)

If the GTG has three states, then it must have the following form:

rkk

rjj

rii

rkj

rjk

rij

rji

rik

rki

Proof (3)

In this case, make the following replacements:

REPLACE  WITH
riirii+rikr∗kkrki
rjjrjj+rjkr∗kkrkj
rijrij+rikr∗kkrkj
rjirji+rjkr∗kkrki

After these replacements, remove state $q_k$ and its edges.

Proof (4)

If the GTG has four or more states, pick any state

$q_k$ that is not the start or the final state. It will be removed.

For all

$o \neq k, p \neq k$ , replace

$r_{op}$ with

$r_{op} + r_{ok}r^*_{kk}r_{kp}$ .

When done, remove qk and all its edges.
Continue eliminating states until only two states are left.
Finish with step 3 of the “Proof (1)” slide.

Proof (5)

In each step, we can simplify regular expressions

$r$ and

$s$ with any of these rules that apply:

r+r=r
s+r∗s=r∗s
r+∅=r
r∅=∅
∅∗={λ}
rλ=r
(λ+r)∗=r∗
(λ+r)r∗=r∗

And similar rules.

Example

1 / 9 Settings

<<<>>>

Suppose we want to convert the following FA to a Regular Expression.

Saving...

Server Error
Resubmit

Conclusion

We have now demonstrated that R.E. is equivalent (meaning, goes both directions) to DFA.

2.Regular Expressions§

Examples

Definition for Regular Expression

Definition for Regular Language

Precedence Rules

Examples of Regular Languages

Regular Expressions vs. Regular Languages

Regular Expression →\rightarrow NFA (1)

Regular Expression →\rightarrow NFA (1a)

Regular Expression →\rightarrow NFA (2)

Regular Expression →\rightarrow NFA: Schematic

Regular Expression →\rightarrow NFA: Or

.

Regular Expression →\rightarrow NFA: Concatenation

Regular Expression →\rightarrow NFA: Star

Example

.

NFA →\rightarrow Regular Expression

Generalized Transition Graph (GTG)

Proof (1)

Proof (2)

Proof (3)

Proof (4)

Proof (5)

Example

Conclusion

Regular Expression $\rightarrow$ NFA (1)

Regular Expression $\rightarrow$ NFA (1a)

Regular Expression $\rightarrow$ NFA (2)

Regular Expression $\rightarrow$ NFA: Schematic

Regular Expression $\rightarrow$ NFA: Or

Regular Expression $\rightarrow$ NFA: Concatenation

Regular Expression $\rightarrow$ NFA: Star

NFA $\rightarrow$ Regular Expression