Processing math: 100%
Close
Close Window

CS4114 Formal Languages and Automata

Chapter 6 Context-Free Grammars and Languages

Show Source |    | About   «  5.1. Identifying Non-regular Languages   ::   Contents   ::   6.2. CFG exercises  »

6.1. Context-Free Languages

6.1.1. Context-Free Languages

In the previous chapters, we saw that some languages are regular languages, which means that we can define a DFA, NFA, regular expression, or regular grammar for them.

Examples of regular languages:

  • keywords in a programming language

  • names of identifiers

  • integers

  • a finite list of miscillaneous symbols: = ;

Then we covered the Pumping Lemma which is a tool to prove that if a language is not regular.

Examples for Not Regular languages:

  • expressions: ((a+b)c)

  • block structures ({} in Java/C++ and beginend in Pascal)

We know that not all languages are not regular, since we’ve proved some are not. Now we will look at a class of languages that is larger than the regular languages, and more powerful: Context-free languages.

1 / 14 Settings
<<<>

Definition: A grammar $G=(V,T,S,P)$ is context-free if all productions are of the form
$A \rightarrow x$
where $A \in V$ and $x \in (V∪T)^∗$.
(T includes $\lambda$.)

Proficient Saving... Error Saving
Server Error
Resubmit

6.1.2. String Derivation

1 / 38 Settings
<<<>

Given a grammar $G$, the most important question to ask is whether a given string $w$ can be derived from $G$. This process is called string derivation.

String derivation begins with the start variable for the grammar (we often use $S$ as the start variable). A series of replacements on variables is performed until there are no more variables to replace. At each step, a variable is replaced by the right hand side of some production rule that has that variable as its left hand side. The goal is to select the proper replacement steps that will lead to $w$.

Proficient Saving... Error Saving
Server Error
Resubmit

6.1.3. Derivation Trees

1 / 13 Settings
<<<>

We have covered how to derive a string by using sentential forms. Now we a more visual derivation method.

Derivation Trees (also known as parse trees) represent a derivation, but do not show the order in which the productions were applied.

Created with Raphaël 2.1.2
Proficient Saving... Error Saving
Server Error
Resubmit

6.1.4. Derivation Trees Example

1 / 7 Settings
<<<>>>

Here is an example that shows how can we build a parse tree for the string $aacbb$ for the given grammar.

  1. S
  2. AcB
  1. A
  2. aAa
  1. A
  2. λ
  1. B
  2. Bbb
  1. B
  2. λ
Proficient Saving... Error Saving
Server Error
Resubmit

6.1.5. Practice question 1

6.1.6. Membership Problem

1 / 24 Settings
<<<>

The most important question that we can ask about a grammar is whether a given string can be derived from it. This is know as the Membership Problem.

Proficient Saving... Error Saving
Server Error
Resubmit

6.1.7. Practice question 2

6.1.8. Ambiguity

1 / 22 Settings
<<<>

$\textbf {Definition}$: A CFG $G$ is ambiguous if there is any string $w$ in the language $L(G)$ that has two distinct derivation trees.

Proficient Saving... Error Saving
Server Error
Resubmit

6.1.8.1. Ambiguous Grammars (1)

6.1.8.2. Ambiguous Grammars (2)

This problem is again about determining how many parse trees a given string has in a given grammar.

6.1.8.3. Ambiguous Grammars (3)

This problem is once more about determining how many parse trees a given string has in a given grammar.

6.1.8.4. Ambiguous Grammars (4)

This problem will help you discover ambiguities in grammars as well as convince yourself that a grammar is not ambiguous.

6.1.9. Precedence Practice

6.1.10. Unambiguous grammar parse tree Example

1 / 13 Settings
<<<>>>

Here is an example that shows how can we build a parse tree for the string $a+b*a$ for the given grammar.

  1. E
  2. E+T
  1. E
  2. T
  1. T
  2. T*F
  1. T
  2. F
  1. F
  2. I
  1. F
  2. (E)
  1. I
  2. a
  1. I
  2. b
Proficient Saving... Error Saving
Server Error
Resubmit

6.1.10.1. Associativity

6.1.10.2. Precedence and Associativity

This problem illustrates how grammatical structure impacts the associativity property and order of precedence of arithmetic operators.

Backus-Naur Form of a grammar:

Nonterminals are enclosed in brackets <>
For “” use instead “::=

Sample C++ Program::

main () {
  int a;     int b;   int sum;
  a = 40;    b = 6;   sum = a + b;
  cout << "sum is "<< sum << endl;
}

“Attempt” to write a CFG for C++ in BNF (Note: <program> is start symbol of grammar.)

<program>::=main() <block><block>::={ <stmt-list> }<stmt-list>::=<stmt> | <stmt> <stmt-list> | <decl> | <decl><stmt-list><decl>::=int <id> ; | double <id> ;<stmt>::=<asgn-stmt> | <cout-stmt><asgn-stmt>::=<id> = <expr> ;<expr>::=<expr> + <expr> | <expr>  <expr> | ( <expr> ) | <id><cout-stmt>::=cout <out-list>

etc., Must expand all nonterminals!

So a derivation of the program test would look like:

<program> main() <block> main() { <stmt-list> } main() { <decl> <stmt-list> } main() { int <id> <stmt-list> } main() { int a <stmt-list> } complete C++ program

This problem asks you to provide a characterization in English of the language generated by a BNF grammar. After you finish it, there is one more problem about Extended Backus-Naur Form, which is described before the problem.

6.1.11. Extended BNF

The symbols we have used in our representation of grammars collectively comprise what is known as Backus-Naur Form (BNF). In Extended Backus-Naur Form (EBNF) we add five meta-symbols to those already used in BNF notation:

  1. Kleene closure operator , which means “zero or more”. Hence if <fn_name> were a non-terminal representing a valid function name and <argument> were a non-terminal representing a valid argument, then the EBNF notation for function calls with zero or more arguments (with no commas between them) would be

    <fn_name>"("<argument>")"
  2. Positive closure operator +. The EBNF notation for function calls that must have at least one argument would be

    <fn_name>"("<argument>+")"
  3. The two paired parenthesis symbols (), which are used for grouping. For example, if <positive_number> were the non-terminal denoting a valid positive number, then the following EBNF would dictate that we must have a plus or minus sign preceding a number

(+|)<positive_number>
  1. The “optional operator” ?, which specifies that you can have zero or one of whatever grammatical structure precedes the operator. For example, if our language allowed an optional plus or minus sign in front of a number, we would use the EBNF

    (+|)?<positive_number>

EBNF is used to reduce the number of productions a grammar needs to specify a language. However, it does not increase the expressive power of grammars, that is, any grammatical structure that can be expressed in EBNF can also be expressed in BNF if one is willing to use more productions.

This last problem is about the equivalence between a given BNF grammar (the same one as in part 4 above) and a smaller EBNF grammar.

More on CFG for C++

Last time we “attempted” to write a CFG for C++, it is possible to write a CFG that recognizes all syntactically correct C++ programs, but there is a problem that the CFG also accepts incorrect programs. For example, it can’t recognize that it is an error to declare the same variable twice, once as an integer and once as a char.

We can write a CFG G such that L(G)={syntactically correct C++ programs}.

But note that {semantically correct C++ programs}L(G).

Another example: Can’t recognize if formal parameters match actual parameters in number and type:

declare: int Sum(int a, int b, int c) …
call: newsum = Sum(x,y);

   «  5.1. Identifying Non-regular Languages   ::   Contents   ::   6.2. CFG exercises  »

nsf
Close Window