The Regular Expression (also known as RegEx or RE) is another
way to define a language.
They are used a lot, especially by programmers for defining simple
search patterns.
This adds another way to define languages along with the ones that we
already know: Grammars, DFAs and NFAs.
Or, we could just describe the language using an English description.
Why do we need another one?
The problem with an English description (or any other language that
people speak) is that it is too imprecise, and not something that we
can easily implement.
Using a DFA or NFA requires typically requires some sort of graphical
editor, and this takes a bit of time to enter.
We will see that regular expressions are easy to just type, and they
tend to be relatively short descriptions for common languages that we
want to represent.
Of course, even a relatively small precise specification for a
language can be hard to come up with (or to understand).
But at least with a regular expression, it is usually quick and easy
to type once you have it.
4.1.1. Definition and Examples of Regular Expressions¶
Part 1. Recall that we define the term regular language to mean the languages that are recognized by a DFA. And we know these are the same as the languages recognized by an NFA, because we know that every NFA can be converted to a DFA (and vice versa).
Now, we will show the relationship between regular languages (and thus, DFAs and NFAs) and Regular Expressions.
Part 2. In Part 1, we showed how to convert the base case REs ($\lambda$ and any symbol from $\Sigma$) to NFAs. And we showed that any NFA can be converted to an equivalent NFA with a single final state.
Now we will see how to convert more complex REs to an NFA.
Part 3. Next, we will define a construction for the NFA that can accept the RE $r \cdot s$, given that we have NFAs that are equivalent to $r$ and $s$.
Part 4. The last operator that we need to implement is the Kleene star ($*$) operator. The operator will concatenate the language with itself zero or more times.