28.12. NP-Completeness Proofs¶

28.12.1. NP-Completeness Proofs¶

To start the process of being able to prove problems are NP-complete, we need to prove just one problem \(H\) is NP-complete. After that, to show that any problem \(X\) is NP-hard, we just need to reduce \(H\) to \(X\). When doing NP-completeness proofs, it is very important not to get this reduction backwards! If we reduce candidate problem \(X\) to known hard problem \(H\), this means that we use \(H\) as a step to solving \(X\). All that means is that we have found a (known) hard way to solve \(X\). However, when we reduce known hard problem \(H\) to candidate problem \(X\), that means we are using \(X\) as a step to solve \(H\). And if we know that \(H\) is hard, that means \(X\) must also be hard (because if \(X\) were not hard, then neither would \(H\) be hard).

So a crucial first step to getting this whole theory off the ground is finding one problem that is NP-hard. The first proof that a problem is NP-hard (and because it is in NP, therefore NP-complete) was done by Stephen Cook. For this feat, Cook won the first Turing award, which is the closest Computer Science equivalent to the Nobel Prize. The “grand-daddy” NP-complete problem that Cook used is called SATISFIABILITY (or SAT for short).

A Boolean expression is comprised of Boolean variables combined using the operators AND (\(\cdot\)), OR (\(+\)), and NOT (to negate Boolean variable \(x\) we write \(\overline{x}\)). A literal is a Boolean variable or its negation. A clause is one or more literals OR’ed together. Let \(E\) be a Boolean expression over variables \(x_1, x_2, ..., x_n\). Then we define Conjunctive Normal Form (CNF) to be a Boolean expression written as a series of clauses that are AND’ed together. For example,

\[E = (x_5 + x_7 + \overline{x_8} + x_{10}) \cdot (\overline{x_2} + x_3) \cdot (x_1 + \overline{x_3} + x_6)\]

is in CNF, and has three clauses. Now we can define the problem SAT.

Cook proved that SAT is NP-hard. Explaining Cook’s proof in detail is beyond the scope of this course. But we can briefly summarize it as follows. Any decision problem \(F\) can be recast as some language acceptance problem \(L\):

\[F(I) = \mbox{YES} \Leftrightarrow L(I') = \mbox{ACCEPT}.\]

That is, if a decision problem \(F\) yields YES on input \(I\), then there is a language \(L\) containing string \(I'\) where \(I'\) is some suitable transformation of input \(I\). Conversely, if \(F\) would give answer NO for input \(I\), then \(I\) ‘s transformed version \(I'\) is not in the language \(L\). For example, recognizing that a given graph has a solution to TRAVELING SALESMAN with cost less than \(k\) is the same as recognizing that a representation for that graph as a string is in the language of all such strings that correspond to graphs for which the answer to TRAVELING SALESMAN is YES.

Turing machines are a simple model of computation for writing programs that are language acceptors. There is a “universal” Turing machine that can take as input a description for a Turing machine, and an input string, and return the execution of that machine on that string. (The next is the crucial step in the proof.) This Turing machine in turn can be cast as a Boolean expression such that the expression is satisfiable if and only if the Turing machine yields ACCEPT for that string. Cook used Turing machines in his proof because they are simple enough that he could develop this transformation of Turing machines to Boolean expressions, but rich enough to be able to compute any function that a regular computer can compute. The significance of this transformation is that any decision problem that is performable by the Turing machine is transformable to SAT. Thus, SAT is NP-hard.

To show that a decision problem \(X\) is NP-complete, we prove that \(X\) is in NP (normally easy, and normally done by giving a suitable polynomial-time, non-deterministic algorithm) and then prove that \(X\) is NP-hard. To prove that \(X\) is NP-hard, we choose a known NP-complete problem, say \(A\). We describe a polynomial-time transformation that takes an arbitrary instance \(I\) of \(A\) to an instance \(I'\) of \(X\). We then describe a polynomial-time transformation from the solution for \(I'\) in problem \(X\) (call it \(SLN'\)) to \(SLN\) such that \(SLN\) is the solution for \(I\) in problem \(A\).

Terminology: Showing that a problem is NP-complete fundamentally involves a reduction from a known NP-complete problem to the problem in question. Of course getting it backwards would merely mean that a known hard problem is being used to solve a problem that might or might not be hard. Note the terminology sometimes gets used in either order. Some people say that we reduce FROM the known NP-complete problem TO the problem in question. Other say that we reduce TO the problem in question FROM the known NP-complete problem. These sentences both mean the same thing, but alternate phrasing sometimes can be confusing.

The following modules show a number of known NP-complete problems, and also some proofs that they are NP-complete. The various proofs will link the problems together as shown here:

OpenDSA Data Structures and Algorithms Modules Collection

Chapter 28 Limits to Computing

28.12. NP-Completeness Proofs¶

28.12.1. NP-Completeness Proofs¶