Now that we know the definition for Regular Expressions and have a bit
of experience with writing them, the next order of business is
understanding how powerful they are.
In particular, a natural question to ask is:
What is the relationship between Regular Expressions and Regular
Languages?
Recall that a Regular Language is defined to be any language that can
be accepted by a DFA (and equivalently, any language that can be
accepted by an NFA).
In this section, we will use our standard approach of simulation to
show that Regular Expressions are equivalent to Regular Languages.
By this, we mean that a Regular Expression can be converted to a
representation for a Regular Language (in particular, an NFA).
Therefore, any Regular Expression represents a Regular Language.
Going the other way, any Regular Language (in the form of an NFA) can
be converted to a Regular Expression.
Thus, any Regular Language can be represented by a Regular Expression.
The conclusion is then that these are equivalent.
35.4.1. Every Regular Expression has an Equivalent NFA
Part 1. Recall that we define the term regular language to mean the languages that are recognized by a DFA. And we know these are the same as the languages recognized by an NFA, because we know that every NFA can be converted to a DFA (and vice versa).
Now, we will show the relationship between regular languages (and thus, DFAs and NFAs) and Regular Expressions.
Summary: We have now shown that (1) an RE consisting of
$\lambda$ or of a single symbol from the alphabet can be
represented by an NFA, and (2) we can convert any NFA to an equivalent
NFA with a single final state.
This simplifies the rest of the constructions that we will use.
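The two results in this summary can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the representation of an NFA as a triple `(start, finals, edges)` with integer state names, the `LAMBDA` marker, and the function names `nfa_lambda`, `nfa_symbol`, `single_final`, and `accepts` are all hypothetical choices made for this sketch, not notation from the text.

```python
LAMBDA = None  # assumed marker for a lambda (empty-string) transition

def nfa_lambda():
    """Base case: an NFA for the RE lambda (one lambda edge from start to final)."""
    return (0, {1}, [(0, LAMBDA, 1)])

def nfa_symbol(a):
    """Base case: an NFA for the RE consisting of the single symbol a."""
    return (0, {1}, [(0, a, 1)])

def single_final(nfa):
    """Return an equivalent NFA with exactly one final state.

    A fresh state becomes the only final state, and every old final
    state gets a lambda edge into it.
    """
    start, finals, edges = nfa
    states = {start} | set(finals) | {q for s, _, d in edges for q in (s, d)}
    new_final = max(states) + 1
    links = [(f, LAMBDA, new_final) for f in sorted(finals)]
    return (start, {new_final}, edges + links)

def accepts(nfa, word):
    """Simulate the NFA on word by tracking the set of reachable states."""
    start, finals, edges = nfa

    def closure(states):
        # Add every state reachable through lambda edges alone.
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for src, label, dst in edges:
                if src == q and label is LAMBDA and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({start})
    for ch in word:
        current = closure({d for s, label, d in edges
                           if s in current and label == ch})
    return bool(current & finals)
```

For instance, applying `single_final` to the two-final-state NFA `(0, {1, 2}, [(0, "a", 1), (0, "b", 2)])` gives a machine with one final state that still accepts exactly the strings `a` and `b`.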
Part 2. In Part 1, we showed how to convert the base case REs ($\lambda$ and any symbol from $\Sigma$) to NFAs. And we showed that any NFA can be converted to an equivalent NFA with a single final state.
Now we will see how to convert more complex REs to an NFA.
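Parts 3 and 4 cover concatenation and the Kleene star, which leaves union ($r + s$) as the remaining RE operator. Assuming that is the construction intended here, one possible sketch in Python follows. The triple `(start, final, edges)`, the `LAMBDA` marker, the `_fresh` state counter, and all function names are hypothetical choices for this illustration. It relies on each operand having a single final state, which Part 1 shows we can always arrange.

```python
from itertools import count

LAMBDA = None      # assumed marker for a lambda transition
_fresh = count()   # hands out unused integer state names

def symbol(a):
    """NFA fragment (start, final, edges) for the single symbol a."""
    s, f = next(_fresh), next(_fresh)
    return (s, f, [(s, a, f)])

def union(r, s):
    """NFA for r + s: a new start branches by lambda into r and s,
    and both old final states lead by lambda into a new final state.
    Assumes r and s use disjoint state names."""
    r_start, r_final, r_edges = r
    s_start, s_final, s_edges = s
    start, final = next(_fresh), next(_fresh)
    links = [
        (start, LAMBDA, r_start),
        (start, LAMBDA, s_start),
        (r_final, LAMBDA, final),
        (s_final, LAMBDA, final),
    ]
    return (start, final, r_edges + s_edges + links)

def accepts(nfa, word):
    """Simulate the NFA on word by tracking the set of reachable states."""
    start, final, edges = nfa

    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for src, label, dst in edges:
                if src == q and label is LAMBDA and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({start})
    for ch in word:
        current = closure({d for s, label, d in edges
                           if s in current and label == ch})
    return final in current
```

With `a_or_b = union(symbol("a"), symbol("b"))`, the simulator accepts `a` and `b` but rejects the empty string and `ab`.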
Part 3. Next, we will define a construction for the NFA that can accept the RE $r \cdot s$, given that we have NFAs that are equivalent to $r$ and $s$.
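A minimal Python sketch of one way to build the NFA for $r \cdot s$: link the final state of the NFA for $r$ to the start state of the NFA for $s$ with a lambda transition. The triple `(start, final, edges)`, the `LAMBDA` marker, the `_fresh` state counter, and the function names are hypothetical choices for this illustration, and the sketch assumes the two operands use disjoint state names and each have a single final state.

```python
from itertools import count

LAMBDA = None      # assumed marker for a lambda transition
_fresh = count()   # hands out unused integer state names

def symbol(a):
    """NFA fragment (start, final, edges) for the single symbol a."""
    s, f = next(_fresh), next(_fresh)
    return (s, f, [(s, a, f)])

def concat(r, s):
    """NFA for r . s: run r, then cross a lambda edge into s.
    The start is r's start and the final state is s's final state."""
    r_start, r_final, r_edges = r
    s_start, s_final, s_edges = s
    link = [(r_final, LAMBDA, s_start)]
    return (r_start, s_final, r_edges + s_edges + link)

def accepts(nfa, word):
    """Simulate the NFA on word by tracking the set of reachable states."""
    start, final, edges = nfa

    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for src, label, dst in edges:
                if src == q and label is LAMBDA and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({start})
    for ch in word:
        current = closure({d for s, label, d in edges
                           if s in current and label == ch})
    return final in current
```

With `ab = concat(symbol("a"), symbol("b"))`, only the string `ab` is accepted. The old final state of $r$ is no longer final, so a string from $r$ alone is rejected.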
Part 4. The last operator that we need to implement is the Kleene star ($*$) operator. Applied to a language, it yields every string formed by concatenating zero or more strings from that language.
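One common way to realize the star construction, sketched in Python: add a new start and a new final state, allow skipping the inner NFA entirely (zero repetitions), and allow looping from its final state back to its start (further repetitions). The triple `(start, final, edges)`, the `LAMBDA` marker, the `_fresh` state counter, and the function names are hypothetical choices for this illustration, which again assumes the inner NFA has a single final state.

```python
from itertools import count

LAMBDA = None      # assumed marker for a lambda transition
_fresh = count()   # hands out unused integer state names

def symbol(a):
    """NFA fragment (start, final, edges) for the single symbol a."""
    s, f = next(_fresh), next(_fresh)
    return (s, f, [(s, a, f)])

def star(r):
    """NFA for r*, built around the NFA for r."""
    r_start, r_final, r_edges = r
    start, final = next(_fresh), next(_fresh)
    links = [
        (start, LAMBDA, r_start),    # enter r for a first pass
        (r_final, LAMBDA, final),    # stop after any completed pass
        (start, LAMBDA, final),      # zero passes: accept the empty string
        (r_final, LAMBDA, r_start),  # go around again
    ]
    return (start, final, r_edges + links)

def accepts(nfa, word):
    """Simulate the NFA on word by tracking the set of reachable states."""
    start, final, edges = nfa

    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for src, label, dst in edges:
                if src == q and label is LAMBDA and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({start})
    for ch in word:
        current = closure({d for s, label, d in edges
                           if s in current and label == ch})
    return final in current
```

With `a_star = star(symbol("a"))`, the simulator accepts the empty string, `a`, and `aaaa`, and rejects any string containing another symbol.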
We now have a proof that any RegEx can be converted to an NFA. And we know some mechanics: in particular, we know how to combine two NFAs that represent RegExes into a single NFA using one of the RegEx builder rules. Unfortunately, knowing the individual rules does not tell us how to proceed when faced with a complex RegEx that we want to convert to an NFA. In this frameset, we show an algorithm for doing this.
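One possible way to organize such a conversion, sketched in Python, is to parse the RegEx recursively and apply the matching builder rule at each operator. This is an illustration, not necessarily the algorithm presented in the frameset. The input syntax is an assumption: `+` for union, juxtaposition for concatenation, `*` for star, parentheses for grouping, and any other character as an alphabet symbol. The names `regex_to_nfa`, `accepts`, and `LAMBDA` are hypothetical.

```python
from itertools import count

LAMBDA = None  # assumed marker for a lambda transition

def regex_to_nfa(pattern):
    """Return (start, final, edges) for pattern, one builder rule per operator."""
    fresh, edges, pos = count(), [], 0

    def peek():
        return pattern[pos] if pos < len(pattern) else None

    def parse_union():            # lowest precedence: r + s
        nonlocal pos
        frag = parse_concat()
        while peek() == "+":
            pos += 1
            other = parse_concat()
            s, f = next(fresh), next(fresh)
            edges.extend([(s, LAMBDA, frag[0]), (s, LAMBDA, other[0]),
                          (frag[1], LAMBDA, f), (other[1], LAMBDA, f)])
            frag = (s, f)
        return frag

    def parse_concat():           # juxtaposition: r s
        frag = parse_star()
        while peek() is not None and peek() not in "+)":
            nxt = parse_star()
            edges.append((frag[1], LAMBDA, nxt[0]))
            frag = (frag[0], nxt[1])
        return frag

    def parse_star():             # highest precedence: r*
        nonlocal pos
        frag = parse_atom()
        while peek() == "*":
            pos += 1
            s, f = next(fresh), next(fresh)
            edges.extend([(s, LAMBDA, frag[0]), (frag[1], LAMBDA, f),
                          (s, LAMBDA, f), (frag[1], LAMBDA, frag[0])])
            frag = (s, f)
        return frag

    def parse_atom():             # a single symbol or a parenthesized RegEx
        nonlocal pos
        ch = peek()
        if ch is None or ch in "+*)":
            raise ValueError(f"unexpected {ch!r} at position {pos}")
        pos += 1
        if ch == "(":
            frag = parse_union()
            if peek() != ")":
                raise ValueError("missing ')'")
            pos += 1
            return frag
        s, f = next(fresh), next(fresh)
        edges.append((s, ch, f))
        return (s, f)

    start, final = parse_union()
    if pos != len(pattern):
        raise ValueError(f"unexpected {pattern[pos]!r} at position {pos}")
    return (start, final, edges)

def accepts(nfa, word):
    """Simulate the NFA on word by tracking the set of reachable states."""
    start, final, edges = nfa

    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for src, label, dst in edges:
                if src == q and label is LAMBDA and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({start})
    for ch in word:
        current = closure({d for s, label, d in edges
                           if s in current and label == ch})
    return final in current
```

For example, `accepts(regex_to_nfa("(a+b)*abb"), "aababb")` evaluates to `True`, while the same NFA rejects `ab`. The sketch favors a direct correspondence with the builder rules over a small machine, so the resulting NFAs contain many lambda transitions.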
Since every regular expression has an NFA that implements it, the languages described by regular expressions are a subset of the regular languages. The next question is: Does every regular language have a regular expression?
Perhaps you thought it fairly intuitive to see that any regular expression can be implemented as an NFA. But for most of us, going the other way is not at all obvious. The proof that any NFA can be converted to a regular expression is rather difficult, and we are just going to give a sketch.