28.22. Turing Machines¶
28.22.1. Turing Machines¶
28.22.1.1. A General Model of Computation¶
We would like to define a general model of computation that is as simple as possible. The reason is that we want to be able to understand the limits of what is possible in computing, but that is rather hard to do with a complicated definition for a “computer” is. But then, we need to be confident that whatever model we do pick, that it actually represents all of the fundamental capabilities of a “computer”.
“State machines” are simple to understand. There are a number of different state machines, with a range of capabilities. We will discuss a particular one, called a Turing machine. As we define “capability”, the key is ability, not efficiency.
The necessary capabilites for any such “machine” are these:
Read
Write
Compute
A Turing machine is defined as follows. It has a tape, divided into squares, with a fixed left end and extending infinitely to the right. Each square can store one character. The machine has a single I/O head that at any instant in time is “on” one of the squares. The control unit of the machine is defined by a set of abstract states. At any given instant, the machine is said to be “in” one of the states, and has a set of actions that can be performed when in that state. From the current state, the machine will read the symbol on the current state, and can then do one of the following:
Change the current symbol.
Move the I/O head one square to either the left or the right.
By convention, the machine ceases to perate if the head moves off the left end of the tape, or if the control unit sends the machine into a specially designated halt state.
The input to the machine is the intial contents of the tape, which is described by listing all of the tape squares from the left to the rightmost non-blank tape. Naturally, there must be a finite number of non-blank symbols on the tape. The alphabet of the machine consists of some letters, including the special symbol \(\#\) which means a blank symbol on the given square.
A Turing machine is formally defined as a quadruple (\(K\), \(\Sigma\), $delta$, $s$) where
\(K\) is a finite set of states (not including \(h\), the halt state).
\(\Sigma\) is an alphabet (containing \(\#\), not \(L\) or \(R\)).
\(s \in K\) is the initial state.
\(\delta\) is a function from \(K \times \Sigma\) to \((K \cup \{h\}) \times (\Sigma \cup \{L, R\})\).
Note that including \(\#\) in the language is for convenience only. We want to be able to read our specifications without being confused.
If \(q \in K\), \(a \in \Sigma\) and \(\delta(q, a) = (p, b)\), then when in state \(q\) and scanning \(a\), enter state \(p\) and
If \(b \in \Sigma\) then replace \(a\) with \(b\).
Else (\(b\) is \(L\) or \(R\)): move head.
Example 28.22.1
\(M = (K, \Sigma, \delta, s)\) where
\(K = \{q_0, q_1\}\),
\(\Sigma = \{a, \#\}\),
\(s = q_0\)
\(\delta =\)
\[\begin{split}\begin{array}{lll} \hline q&\sigma&\delta(q, \sigma)\\ \hline q_0&a&(q_1, \#)\\ q_0&\#&(h, \#)\\ q_1&a&(q_0, a)\\ q_1&\#&(q_0, R)\\ \end{array}\end{split}\]
Note that state \((q_1, a)\) cannot happen if the start state is \(q_0\). This is included only for completness (to make \(\delta\) a total function.
This machine will scan right, changing any \(a\) that it sees to a \(\#\). When it first hits a \(\#\), it will halt.
Example 28.22.2
\(M = (K, \Sigma, \delta, s)\) where
\(K = \{q_0\}\),
\(\Sigma = \{a, \#\}\),
\(s = q_0\),
\(\delta =\)
\[\begin{split}\begin{array}{lll} \hline q&\sigma&\delta(q, \sigma)\\ \hline q_0&a&(q_0, L)\\ q_0&\#&(h, \#)\\ \end{array}\end{split}\]
This machine will scan left until it encounters \(\#\), and then halt.
28.22.1.2. Interpreting Turing Machines¶
A configuration for a Turing machine looks like this:
A halted configuration occurs when \(q\) is \(h\), the halt state.
A hanging configuration occurs when the I/O head moves to the left from the left-most square of the tape.
A computation is a sequence of configurations for some length \(n \geq 0\). Execution on the first machine example from the starting configuration show would appear as follows:
\(M\) is said to halt on input :math:`w` iff \((s, \#w\underline{\#})\) yields some halted configuration.
\(M\) is said to hang on input :math:`w` if \((s, \#w\underline{\#})\) yields some hanging configuration. That means either move left from left end or go into an infinite loop.
Turing machines compute functions from strings to strings. Formally: Let \(f\) be a function from \(\Sigma^*_0\) to \(\Sigma^*_1\). Turing machine \(M\) is said to compute \(f\) when, for any \(w \in \Sigma^*_0\), if \(f(w) = u\) then
Such a function \(f\) is said to be a Turing-computable function.
Here is how we express multiple parameters: For \(f(w_1, ..., w_k) = u\),
One way to express functions on natural numbers is to represent a number using unary notation. (Remember, we are not concerned about is efficient, we are concerned about what is possible.) In this case, we represent the value 0 as an empty string. We say that \(f: \mathbb{N} \rightarrow \mathbb{N}\) is computed by \(M\) if \(M\) computes \(f': \{I\}^* \rightarrow \{I\}^*\) where \(f'(I^n) = I^{f(n)}\) for each \(n \in \mathbb{N}\).
Example 28.22.3
Compute \(f(n) = n + 1\) for each \(n \in \mathbb{N}\).
An example computation:
In general, \((q_0, \#I^n\underline{\#}) \vdash^*_M (h, \#I^{n+1}\underline{\#})\). What about \(n = 0\)? The input is no marks in unary, and it works OK.
28.22.1.3. Turing-Decideable vs. Turing-Acceptable Languages¶
A language \(L \subset \Sigma_0^*\) is Turing-decidable iff function \(\chi_L: \Sigma^*_0 \rightarrow \{\fbox{Y}, \fbox{N}\}\) is Turing-computable, where for each \(w \in \Sigma^*_0\),
Example: Let \(\Sigma_0 = \{a\}\), and let \(L = \{w \in \Sigma^*_0: |w|\ \mbox{is even}\}\).
\(M\) erases the marks from right to left, with current parity encode by state. Once blank at left is reached, mark \(\fbox{Y}\) or \(\fbox{N}\) as appropriate.
There are many views of computation. One is functions mapping input to output (\(N \rightarrow N\), or strings to strings, for examples). Another is deciding if a string is in a language.
\(M\) accepts a string \(w\) if \(M\) halts on input \(w\).
\(M\) accepts a language iff :math:M` halts on \(w\) iff \(w \in L\).
A language is \(Turing-acceptable\) if there is some Turing machine that accepts it.
Example: \(\Sigma_0 = \{a, b\}\), \(L = \{w \in \Sigma^*_0: w\ \mbox{contains at least one}\ a\}\).
Is this language Turing decidable? Of course. Instead of just running left, invoke another state that means “seen an \(a\)”, and print \(\fbox{Y}\) if we reach \(\#\) in that state, \(\fbox{N}\) otherwise.
Every Turing-decidable language is Turing-acceptable, because if the machine would have printed \(\fbox{Y}\), then the machine can halt instead, or if the machine would have printed \(\fbox{N}\), then it can hang left.
Is every Turing-acceptible language Turing decidable? This is the Halting Problem.
Of course, if the Turing-acceptible language would halt, we write \(\fbox{Y}\). But if the Turing-acceptible language would hang, can we always replace it with logic to write \(\fbox{N}\) instead? Example: Collatz function.
28.22.1.4. Making More Complicated Machines¶
Lemma: If
for string \(w\) and
then
Insight: Since \((q_2, w_2\underline{a_2}u_2) \vdash^*_M (q_3, w_3\underline{a_3}u_3)\), this computation must take place without moving the head left of \(w_2\) The machine cannot “sense” the left end of the tape. (And if it had moved left, it would have hung.) Thus, the head won’t move left of \(w_2\) even if it is not at the left end of the tape.
This means that Turing machine computations can be combined into larger machines:
\(M_2\) prepares string as input to \(M_1\).
\(M_2\) passes control to \(M_1\) with I/O head at end of input.
\(M_2\) retrieves control when \(M_1\) has completed.
Here are some basic machines and notation
\(|\Sigma|\) symbol-writing machines (one for each symbol): Any give letter \(\sigma\) has a symbol-writing machine named \(\sigma\).
Head-moving machines, named \(R\) and \(L\), move the head appropriately.
Start state indicated with \(>\).
Transitions on anything other than (for example) \(\#\) are labeled \(\overline{\#}\)
Multiple copies of a machine get a superscript: \(R^2\) means move right twice.
28.22.1.5. Turing Machine Extensions¶
When we give extentions or new functionality to a computing system, sometimes they change something fundamental about the capabilies of the system. For example, when we add non-determinism to an algorithm, we might change the cost of the underlying problem from exponential to polynomial time. But, other changes do nothing fundamental. In terms of Turing machines, our concern is what the machine can do, rather than how long it takes to do it. Does non-determinism help us to solve the Halting problem? No. Likewise, the following extensions do not increase the power of Turing Machines.
Provide a two-way infinite tape
This does not give Turing machines new capability. To make this clear, we can simulate the behavior of a two-way infinite tape using a standard one-way infinite tape. Just bend infinite tape in the middle, and store both directions of the tape into a single cell. This requires a greatly expanded alphabet, because we now need to be able to represent any combination of two characters. This will need more states, and probably more time. But it does not allow anything new in terms of capability.
Multiple tapes (each with its own head)
Again, we can simulate this with encoding multiple symbols into a single table cell. For example, to simulate two tapes (each with a head), we encode in each cell the corresponding two symbols, and a two binary markers to indicate if the tape head is currently in the corresponding cell of the two tapes.
Multiple heads on one tape
This is easier than encoding multiple tapes. We merely encode the heads onto the tape, and simulate moving them around.
A two-dimensional
tape
All that we need to do is find a mapping from 2D to 1D, which is fairly easy. One approach is to work in diagonals, in the order (0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0), and so on.
Non-determinism
We can simulate nondeterministic behavior in sequence, doing all length 1 computations, then length 2, etc., until we reach a halt state for one of the non-deteriministic choices. So we see that while non-determinism can save a lot of time, it does not change what can (eventually) be done.