10.4.1. CYK Parsing
Invented by J. Cocke, D.H. Younger, and T. Kasami
Requires \(|w|^3\) steps to parse string \(w\).
Dynamic Programming remembers the answer to small subproblems
so that it won’t have to solve them again.
For CYK Parsing, the grammar must be in Chomsky Normal Form (CNF)
first.
Definition: A CFG is in Chomsky Normal Form (CNF) if all
productions are of the form
\(A \rightarrow BC\) or \(A \rightarrow a\)
where \(A, B, C \in V\) and \(a \in T\).
10.4.1.1. CYK Parsing Algorithm
Assume \(G = (V, T, S, P)\) is in CNF, and \(w =
a_1a_2...a_n\).
Define substrings \(w_{ij} = a_i...a_j\).
Define subsets of
\(V, V_{ij} = \{A \in V \mid A \stackrel{*}{\Rightarrow} w_{ij} \}\)
Then \(w \in L(G)\) if and only if \(S \in V_{1n}\).
Note
\(A \in V_{ii}\) if and only if there is what kind of production?
Answer: \(A \rightarrow a_i\)
All \(V_{ii}\) are easy, just based on if there is a production.
Note: Compute \(V_{ij}\).
For \(j >i\), \(A\) derives \(w_{ij}\) if and only if
there is a production \(A \rightarrow BC\) with
\(B \stackrel{*}{\Rightarrow} w_{ik}\) and
\(C \stackrel{*}{\Rightarrow} w_{{k+1}j}\) for some \(k\)
with \(i \le k, k < j\).
10.4.1.2. Algorithm
Compute \(V_{11}, V_{22}, V_{33}, \ldots, V_{nn}\)
Compute \(V_{12}, V_{23}, V_{34}, \ldots, V_{{n-1}n}\)
Compute \(V_{13}, V_{24}, V_{35}, \ldots, V_{{n-2}n}\)
\(\ldots\)
Last step is? Compute \(V_{1n}\)
How do we know if it worked?
If the last step is \(S\)
Example 10.4.1
\(S \rightarrow CD \mid CF\)
\(B \rightarrow HB \mid c\)
\(C \rightarrow a\)
\(D \rightarrow SE\)
\(E \rightarrow GG\)
\(F \rightarrow BE\)
\(G \rightarrow b\)
\(H \rightarrow c\)
Build the CYK Parse table:
\[\begin{split}\begin{array}{|l|l|l|l|l|l|l|}
\ a& \ a & \ c &\ b & \ b & \ b& \ b \\ \hline
\ \ \ & \ \ \ & \ \ \ & \ \ \ & \ \ \ & \ \ \ & \ \ \ \\ \hline
\ \ \ & \ \ \ & \ \ \ & \ \ \ & \ \ \ & \ \ \ \\ \cline{1-6}
\ \ \ & \ \ \ & \ \ \ & \ \ \ & \ \ \ \\ \cline{1-5}
\ \ \ & \ \ \ & \ \ \ & \ \ \ \\ \cline{1-4}
\ \ \ & \ \ \ & \ \ \ \\ \cline{1-3}
\ \ \ & \ \ \ \\ \cline{1-2}
\ \ \ \\ \cline{1-1}
\end{array}\end{split}\]