1. Turing Machines
1.1. A General Model of Computation
We would like to define a general model of computation that is as simple as possible. The reason is that we want to be able to understand the limits of what is possible in computing, but that is rather hard to do if our definition of a “computer” is complicated. At the same time, we need to be confident that whatever model we do pick actually represents all of the fundamental capabilities of a “computer”.
“State machines” are simple to understand. There are a number of different kinds of state machines, with a range of capabilities. We will discuss a particular one, called a Turing machine. As we talk about “capability”, keep in mind that the key is ability, not efficiency.
The necessary capabilities for any such “machine” are these:
Read
Write
Compute
A Turing machine is defined as follows. It has a one-dimensional tape, divided into squares. This tape extends infinitely to the left and to the right. Each square can store one character. The machine has a single I/O head that at any instant in time is “on” one of the squares. The control unit of the machine is defined by a set of abstract states. At any given instant, the machine is said to be “in” one of the states, and has a set of actions that can be performed when in that state. From the current state, the machine will read the symbol on the current square, and will then do the following (depending on the value of the symbol that it sees):
Overwrite that symbol with a new symbol (possibly the same one),
Move the tape head either left (\(L\)), right (\(R\)), or stay in place (\(S\)), and
Change to a new state (possibly the same state).
The letters that can appear on the tape are an important part of the definition for a given Turing machine. The alphabet of the machine is the set of letters that may appear in the input. In addition to the letters of the alphabet that can define an input string, there is also the blank character. Since a blank is hard to see when we write out strings, we will use the \(\#\) character to represent it. Note that writing \(\#\) for the blank is purely a convenience: we want to be able to read our specifications without being confused.
The input to the machine is the initial contents of the tape, which is described by listing all of the tape squares from the leftmost non-blank tape cell to the rightmost non-blank tape cell. Naturally, there must be a finite number of non-blank symbols on the tape. And the input string might have some blank squares in between the non-blank squares that define the input.
Now, we know that at any instant the machine is in some state, and that the input head is under some square on the tape. The machine reads the symbol, and responds by going to some state (possibly the current state, possibly another state), writing some letter onto the square (possibly the same letter as is currently in the square, possibly another), and then moving the head either one square left, one square right, or leaving the head in the current square. That is everything that the machine can do.
A Turing machine is formally defined as (\(Q\), \(\Sigma\), \(\Gamma\), \(s\), \(F\), \(\delta\)) where
\(Q\) is a finite set of states.
\(\Sigma\) is an alphabet. This is used to define the input.
\(\Gamma\) is another alphabet that at least includes \(\Sigma\) and \(\#\). It might include other symbols, and is the alphabet used by the computational process.
\(s \in Q\) is the initial state.
\(F \subset Q\) is the set of final states.
\(\delta\) is a partial function from \(Q \times \Gamma\) to \(Q \times \Gamma \times \{L, R, S\}\).
The machine operates as follows: For \(q \in Q\), \(a \in \Gamma\), and \(\delta(q, a) = (p, b, m)\): when in state \(q\) and scanning \(a\), enter state \(p\), replace \(a\) with \(b\), and move the head as directed by \(m\) (either \(L\), \(R\), or \(S\)).
To do computation, we have to have some conventions about starting and ending the process. The machine stops immediately if (1) it enters any final state, or (2) it is in a state and scans a character for which there is no transition. (Note that there are many ways to define Turing Machines, and some definitions require an explicit reject state. We do not.)
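To make these conventions concrete, here is a minimal simulator sketch in Python. It is not part of the formal definition; the names (`run_tm`, `BLANK`, `max_steps`), the dictionary-based tape, and the use of a step limit to cut off runs that never halt are our own choices.

```python
# A minimal Turing-machine simulator sketch (names and step limit are
# our own choices, not part of the formal definition).
BLANK = "#"

def run_tm(delta, finals, start, tape_str, max_steps=10_000):
    """delta maps (state, symbol) -> (state, symbol, move) with move in
    {"L", "R", "S"}; finals is the set F; start is s.  Returns
    (halting_state, non-blank tape contents), or None if the machine is
    still running after max_steps."""
    # The two-way infinite tape is a dict from position to symbol;
    # positions that are absent hold the blank.
    tape = {i: ch for i, ch in enumerate(tape_str)}
    state, head, steps = start, 0, 0   # head starts on the leftmost input square
    while state not in finals:         # rule (1): halt on entering any final state
        symbol = tape.get(head, BLANK)
        if (state, symbol) not in delta:
            break                      # rule (2): no transition defined, so halt
        if steps == max_steps:
            return None                # give up: the machine may never halt
        state, new_symbol, move = delta[(state, symbol)]
        tape[head] = new_symbol
        head += {"L": -1, "R": 1, "S": 0}[move]
        steps += 1
    # Report the tape from the leftmost to the rightmost non-blank square.
    marked = [p for p, ch in tape.items() if ch != BLANK]
    if not marked:
        return state, ""
    return state, "".join(tape.get(p, BLANK)
                          for p in range(min(marked), max(marked) + 1))
```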
Example 1
\(M = (Q, \Sigma, \Gamma, s, F, \delta)\) where
\(Q = \{q_0, q_1\}\),
\(\Sigma = \{a\}\),
\(\Gamma = \Sigma \cup \{\#\}\),
\(s = q_0\),
\(F = \{q_1\}\),
\(\delta =\)
\[\begin{split}\begin{array}{lll} \hline q&\Gamma&(q, \Gamma, \{L, R, S\})\\ \hline q_0&a&(q_0, \#, R)\\ q_0&\#&(q_1, \#, S)\\ \end{array}\end{split}\]
This machine will scan right, changing any \(a\) that it sees to a \(\#\). When it first sees a \(\#\), it will halt.
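Using the simulator sketch above, the transition table of Example 1 can be transcribed as a Python dictionary and run on the input “aaaa”:

```python
# delta from Example 1 as a Python dict (using the run_tm sketch above).
delta1 = {
    ("q0", "a"): ("q0", "#", "R"),
    ("q0", "#"): ("q1", "#", "S"),
}
print(run_tm(delta1, finals={"q1"}, start="q0", tape_str="aaaa"))
# ('q1', '')  -- every a has been erased, so no non-blank squares remain
```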
We can also describe a Turing Machine as a graph, whose nodes are the states in \(Q\) and with edges corresponding to the transitions in \(\delta\). Further, we can visualize the processing of the machine as the movement of a head across the tape. In the visualizations, the current tape square (the one where the head is currently located) is highlighted.
Example 2
Here is an example of a machine that is slightly more complicated. This Turing machine accepts the language \(L(a^*b^*c^*)\).
\(M = (Q, \Sigma, \Gamma, s, F, \delta)\) where
\(Q = \{q_0, q_1, q_2, q_3\}\),
\(\Sigma = \{a, b, c\}\),
\(\Gamma = \Sigma \cup \{\#\}\),
\(s = q_0\),
\(F = \{q_3\}\),
\(\delta =\)
\[\begin{split}\begin{array}{lll} \hline q&\Gamma&(q, \Gamma, \{L, R, S\})\\ \hline q_0&a&(q_0, a, R)\\ q_0&b&(q_1, b, R)\\ q_0&c&(q_2, c, R)\\ q_0&\#&(q_3, \#, S)\\ q_1&b&(q_1, b, R)\\ q_1&c&(q_2, c, R)\\ q_1&\#&(q_3, \#, S)\\ q_2&c&(q_2, c, R)\\ q_2&\#&(q_3, \#, S)\\ \end{array}\end{split}\]
However, this specification is missing something important. Regardless of what input you give it on the tape, it will execute something and eventually halt. But how do we know whether the machine has determined that the string is in the language or not? The answer is that we use a convention. First, we only care about what happens when the machine starts with the head scanning the first non-blank character. Second, we use the convention that the string is accepted as being in the language if the machine halts in a Final State (\(q_3\) in this case), and the string is rejected if the machine halts because it reaches a state and scanned symbol for which no transition is defined. For example, on the string “abac”, when the second ‘a’ is encountered there is no transition for what to do. So the machine halts, and we interpret this to mean that the string has been rejected, since the machine is not in a Final State.
Here is the graphical view of the machine and a demonstration for how it works.
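We can also check this convention with the run_tm sketch from earlier (the dictionary below is just Example 2's \(\delta\) transcribed; run_tm is our own helper, not part of the machine's definition):

```python
# delta for the L(a*b*c*) machine of Example 2.
delta2 = {
    ("q0", "a"): ("q0", "a", "R"), ("q0", "b"): ("q1", "b", "R"),
    ("q0", "c"): ("q2", "c", "R"), ("q0", "#"): ("q3", "#", "S"),
    ("q1", "b"): ("q1", "b", "R"), ("q1", "c"): ("q2", "c", "R"),
    ("q1", "#"): ("q3", "#", "S"), ("q2", "c"): ("q2", "c", "R"),
    ("q2", "#"): ("q3", "#", "S"),
}
finals2 = {"q3"}

for w in ["aabbcc", "abac"]:
    state, _ = run_tm(delta2, finals2, "q0", w)
    print(w, "accepted" if state in finals2 else "rejected")
# aabbcc accepted  (halts in the Final State q3)
# abac rejected    (halts in q1 on the second 'a', where no transition is defined)
```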
1.2. Interpreting Turing Machines
A configuration for a Turing machine looks like this: \((q, \underline{a}aba)\)
This means that the TM is in state \(q\), the tape contains \(\underline{a}aba\), and the read/write head is on the underlined ‘a’. Recall that we assume that at the start of processing input for any TM, the read/write head is on the leftmost non-blank character.
Don’t forget that the tape is infinite in both directions. So to the left of the leftmost ‘a’ in this configuration is an infinite number of blank squares, and to the right of the rightmost a is also an infinite number of blank squares.
A halted configuration occurs when the machine does not find a move from the current state using the current tape letter (that is, from the current configuration). In other words, a TM halts if no \(\delta\) is defined for its current state and scanned symbol. Note that we never define any transitions out of a Final State, so there is some redundancy in saying earlier that the machine halts either when it enters a Final State or when there is no applicable transition. But having both halting conditions makes it easy to define the difference between accepting and rejecting a string.
A computation is a sequence of configurations of some length \(n \geq 0\).
Example 3
Recall the TM example that erases all a’s from the tape. Here are the configurations for the input “aaaa”.
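One way to write these out, dropping the blank squares that pile up to the left of the head, is:
\[(q_0, \underline{a}aaa) \vdash_M (q_0, \underline{a}aa) \vdash_M (q_0, \underline{a}a) \vdash_M (q_0, \underline{a}) \vdash_M (q_0, \underline{\#}) \vdash_M (q_1, \underline{\#})\]
The last configuration is halted, since \(q_1\) is a Final State.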
Notation: Given a string \(w\), the notation \(\underline{w}\) for a configuration means that the read/write head is scanning the leftmost character in \(w\).
\(M\) is said to halt on input \(w\) iff \((s, \underline{w})\) yields some halted configuration.
\(M\) is said to hang on input \(w\) if \((s, \underline{w})\) yields some hanging configuration.
Wait, what? What is a “hanging” configuration? The machine hangs when it goes into an infinite loop. Anytime you provide the mechanism to create loops that only end on a condition, you have also created the conditions that might allow an infinite loop to happen. Consider the following machine on strings of a’s and b’s that scans right until it sees a ‘b’. If it never sees a ‘b’, then it will never halt. This means that it goes into an infinite loop (or hangs) anytime the input string does not contain a ‘b’.
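One possible transition table for such a machine (our own reconstruction of the example just described), run through the simulator sketch from earlier:

```python
# A machine that scans right until it sees a 'b'.  It loops forever if
# no 'b' ever appears (our own reconstruction, run with the run_tm sketch).
delta3 = {
    ("q0", "a"): ("q0", "a", "R"),
    ("q0", "#"): ("q0", "#", "R"),   # keeps marching right over blanks forever
    ("q0", "b"): ("q1", "b", "S"),
}
print(run_tm(delta3, finals={"q1"}, start="q0", tape_str="aab"))  # ('q1', 'aab')
print(run_tm(delta3, finals={"q1"}, start="q0", tape_str="aaa"))  # None: cut off by the step limit
```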
1.3. Turing Acceptors and Turing Transducers
Turing machines compute functions from strings to strings. Formally: Let \(f\) be a function from \(\Sigma^*_0\) to \(\Sigma^*_1\). Turing machine \(M\) is said to compute \(f\) when, for any string \(w \in \Sigma^*_0\), if \(f(w) = u\) then
\[(s, \#\underline{w}\#) \vdash^*_M (h, \#u\underline{\#})\]
for some state \(h \in F\) (that is, a Final State for \(M\)).
Such a function \(f\) is said to be a Turing-computable function.
Here is how we express multiple parameters: For \(f(w_1, ..., w_k) = u\),
\[(s, \#\underline{w_1}\#w_2\# \ldots \#w_k\#) \vdash^*_M (h, \#u\underline{\#})\]
for some \(h \in F\).
One way to express functions on natural numbers is to represent a number using unary notation. (Remember, we are not concerned about what is efficient, we are concerned about what is possible.) In this case, we represent the value 0 as an empty string. We say that \(f: \mathbb{N} \rightarrow \mathbb{N}\) is computed by \(M\) if \(M\) computes \(f': \{I\}^* \rightarrow \{I\}^*\) where \(f'(I^n) = I^{f(n)}\) for each \(n \in \mathbb{N}\).
Example 4
Compute \(f(n) = n + 1\) for any \(n \in \mathbb{N}\).
An example computation:
In general, \((q_0, \#\underline{I^n}\#) \vdash^*_M (h, \#I^{n+1}\underline{\#})\). What about \(n = 0\)? The input is then the empty string (no marks in unary), and the machine still works correctly: the result is the head just to the right of a single mark.
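One transition table that produces exactly this behavior (the state names are our own) is \(Q = \{q_0, h\}\), \(F = \{h\}\), and
\[\begin{split}\begin{array}{lll} \hline q&\Gamma&(q, \Gamma, \{L, R, S\})\\ \hline q_0&I&(q_0, I, R)\\ q_0&\#&(h, I, R)\\ \end{array}\end{split}\]
The machine moves right across the marks, and when it reaches the blank at the right end it writes one more mark, steps one square further right, and halts.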
1.4. Turing-Decidable vs. Turing-Acceptable Languages
Recall that we defined a convention for deciding whether an input string is in a specified language: The string is accepted as being in the language if the machine halts in a Final State, and the string is rejected if the machine halts because no transition is defined for its current state and scanned symbol. The key here is that the machine halts either way (with separate, distinguishable mechanisms for accept and reject). We define a language to be Turing-decidable if some Turing machine gives every string one of these two outcomes.
Unfortunately, there is a third possible outcome: The machine might go into an infinite loop.
We can define another concept: Turing-acceptable. We say that machine \(M\) accepts a string \(w\) if \(M\) halts on input \(w\). Then,
\(M\) accepts a language \(L\) iff, for every string \(w\), \(M\) halts on \(w\) iff \(w \in L\).
A language is Turing-acceptable if there is some Turing machine that accepts it.
So, a language is Turing-decidable if some machine halts on every input, in one of two distinguishable ways, so that we can tell whether or not each string is in the language. Separately, a language is Turing-acceptable if some machine halts on exactly the strings in the language, and does not halt on the strings that are not in the language.
It is easy to turn a machine that decides a language into one that merely accepts it. If the machine would reject the string, then simply have it go into an infinite loop instead, moving right regardless of the value of the symbol that it sees (a sketch of this conversion appears below). But can every accepting machine be converted into a deciding machine?
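Here is that conversion sketched in Python. The helper name make_acceptor and the state name q_spin are our own, and the example reuses delta2 and run_tm from earlier.

```python
# Sketch: convert a decider into an acceptor by replacing every
# "reject by undefined transition" with an infinite rightward loop.
def make_acceptor(delta, finals, states, gamma, spin="q_spin"):
    new_delta = dict(delta)
    for q in states:
        if q in finals:
            continue                      # final states still halt (accept)
        for sym in gamma:
            if (q, sym) not in new_delta:
                new_delta[(q, sym)] = (spin, sym, "R")
    for sym in gamma:                     # q_spin moves right no matter what it sees
        new_delta[(spin, sym)] = (spin, sym, "R")
    return new_delta

# Applied to Example 2's machine: strings not in L(a*b*c*) now hang
# instead of halting outside a Final State.
looping_delta = make_acceptor(delta2, finals2,
                              states={"q0", "q1", "q2", "q3"},
                              gamma={"a", "b", "c", "#"})
print(run_tm(looping_delta, finals2, "q0", "abac"))   # None: never halts
```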
Consider this example: \(\Sigma_0 = \{a, b\}\), \(L = \{w \in \Sigma^*_0: w\ \mbox{contains at least one}\ a\}\).
This language is Turing-acceptable: the machine halts if it sees an ‘a’, and it hangs if there is no ‘a’.
Is this language Turing-decidable? Of course. Instead of moving right when a \(\#\) is seen, the machine can halt. Here is the modified machine:
All that we have done is remove the transition for what to do when a blank symbol is seen. Thus, the machine halts instead of moving to the right (thus starting the infinite loop).
(You might ask: But what if there is an ‘a’ to the right of the #? Recall that we only care about the machine’s behavior when it begins in a legal start configuration.)
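Concretely, one transition table matching the accepting version (the state names are our own) is \(F = \{q_1\}\) with
\[\begin{split}\begin{array}{lll} \hline q&\Gamma&(q, \Gamma, \{L, R, S\})\\ \hline q_0&a&(q_1, a, S)\\ q_0&b&(q_0, b, R)\\ q_0&\#&(q_0, \#, R)\\ \end{array}\end{split}\]
Dropping the last row (the transition on \(\#\)) gives the deciding version: on a string with no ‘a’, the machine now halts on the first blank, outside the Final State, and so rejects.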
But we can ask again: Is every Turing-acceptable language Turing-decidable? In other words, whenever the Turing-acceptable machine would hang, can we always replace that behavior with logic that triggers a non-existent transition instead? This is known as the Halting Problem.
It turns out that we can prove that there are languages that are Turing-acceptable but not Turing-decidable. In other words, we can prove that the Halting Problem is unsolvable.
1.5. Making More Complicated Machines
Obviously, Turing Machines can take an input and modify it. We will see examples of how this leads to powerful computational capability, even if they do not yet seem all that powerful. To get a quick idea of their power, consider the following relatively simple machine to accept \(L(a^nb^nc^n)\). This is significant, because this language is in fact not context free! That means that this simple Turing Machine is doing something that no DFA, NFA, or PDA can do!
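Here is one possible set of transitions for such a machine, written as a Python dictionary and run through the run_tm sketch from earlier. It is our own construction (the machine in the original figure may differ): mark one a as X, one b as Y, and one c as Z, return to the left, and repeat; when the a's run out, check that only Y's and Z's remain.

```python
# One possible machine for {a^n b^n c^n} (a sketch of our own).
delta_anbncn = {
    ("q0", "a"): ("q1", "X", "R"),   # mark an a
    ("q0", "Y"): ("q4", "Y", "R"),   # no a's left: verify the rest
    ("q0", "#"): ("q5", "#", "S"),   # empty input is in the language
    ("q1", "a"): ("q1", "a", "R"),
    ("q1", "Y"): ("q1", "Y", "R"),
    ("q1", "b"): ("q2", "Y", "R"),   # mark the matching b
    ("q2", "b"): ("q2", "b", "R"),
    ("q2", "Z"): ("q2", "Z", "R"),
    ("q2", "c"): ("q3", "Z", "L"),   # mark the matching c, then head back left
    ("q3", "X"): ("q0", "X", "R"),   # back at the last marked a; start the next pass
    ("q4", "Y"): ("q4", "Y", "R"),
    ("q4", "Z"): ("q4", "Z", "R"),
    ("q4", "#"): ("q5", "#", "S"),   # everything matched: accept
}
# q3 walks left over anything it may meet on the way back to the X.
for sym in "abYZ":
    delta_anbncn[("q3", sym)] = ("q3", sym, "L")

for w in ["aabbcc", "aabbc", "abcb"]:
    result = run_tm(delta_anbncn, {"q5"}, "q0", w)
    print(w, "accepted" if result and result[0] == "q5" else "rejected")
# aabbcc accepted; aabbc rejected; abcb rejected
```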
But while Turing machines might be able to do powerful things, when operating at the individual state level, it can get rather difficult and tedious to program them. In fact, it might feel in some ways like writing machine code or assembly language. The secret to success in modern software development is to build up more powerful tools, especially by packaging behavior together and manipulating the packages. We can hope to build up similar capability with Turing Machines.
Lemma: If
\[(q_1, w_1\underline{a_1}u_1) \vdash_M^* (q_2, ww_2\underline{a_2}u_2)\]
for string \(w\), and
\[(q_2, w_2\underline{a_2}u_2) \vdash^*_M (q_3, w_3\underline{a_3}u_3),\]
then
\[(q_1, w_1\underline{a_1}u_1) \vdash^*_M (q_3, ww_3\underline{a_3}u_3).\]
Insight: Since \((q_2, w_2\underline{a_2}u_2) \vdash^*_M (q_3, w_3\underline{a_3}u_3)\), this computation must take place without the head ever moving left of \(w_2\). (If it had moved left, it would have hung.) And since the machine cannot “sense” the left end of the tape, the head will not move left of \(w_2\) even when \(w_2\) is not at the left end of the tape.
This means that Turing machine computations can be combined into larger machines:
\(M_2\) prepares a string as input to \(M_1\).
\(M_2\) passes control to \(M_1\) with I/O head at the end of the input.
\(M_2\) retrieves control when \(M_1\) has completed.
Here are some basic machines and notation:
\(|\Sigma|\) symbol-writing machines (one for each symbol): Any given letter \(\sigma\) has a symbol-writing machine named \(\sigma\).
Head-moving machines, named \(R\) and \(L\), move the head appropriately.
Start state indicated with \(>\).
Transitions on anything other than (for example) \(\#\) are labeled \(\overline{\#}\).
Multiple copies of a machine get a superscript: \(R^2\) means move right twice.
1.6. Turing Machine Extensions
When we give extensions or new functionality to a computing system, sometimes they change something fundamental about the capabilities of the system. For example, when we add non-determinism to an algorithm, we might change the cost of the underlying problem from exponential to polynomial time. But, other changes do nothing fundamental. In terms of Turing machines, our concern is what the machine can do, rather than how long it takes to do it. Does non-determinism help us to solve the Halting problem? No. Likewise, the following extensions do not increase the power of Turing Machines.
Limit the tape to be infinite in only one direction
Our first example actually demonstrates that some limitations do not make a difference. Many textbooks on formal languages present the basic Turing Machine as having a tape that is infinite in only one direction. The following diagram shows that we can easily simulate a tape that is infinite in two directions with a one-direction infinite tape.
The idea is to just bend the 2-way infinite tape in the middle, and store both directions of the tape into a single cell. This requires a greatly expanded alphabet, because we now need to be able to represent any combination of two characters. This will need more states, and probably more time. But it does not allow anything new in terms of capability.
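A sketch of the encoding in Python (the names and the exact indexing are our own): cell \(i\) of the one-way tape holds the pair of symbols from squares \(i\) and \(-i\) of the original tape.

```python
# Fold a two-way infinite tape in the middle: position i of the one-way
# tape stores the pair (square i, square -i) of the original tape, so the
# simulating machine's alphabet is (roughly) Gamma x Gamma.  (Our own
# sketch; a full construction also needs a marker for the fold point.)
def fold_tape(two_way, width, blank="#"):
    """two_way maps integer positions to symbols; absent positions are blank."""
    return [(two_way.get(i, blank), two_way.get(-i, blank))
            for i in range(width)]

print(fold_tape({-2: "b", -1: "a", 0: "x", 1: "c"}, 4))
# [('x', 'x'), ('c', 'a'), ('#', 'b'), ('#', '#')]
```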
Multiple tapes (each with its own head)
Again, we can simulate this by encoding multiple symbols into a single tape cell. For example, to simulate two tapes (each with a head), we encode in each cell the corresponding two symbols, plus two binary markers to indicate whether each tape's head is currently on the corresponding cell.
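For example, a single composite cell of the simulating tape could look like this (a sketch; the field names are our own):

```python
# One composite cell: the two symbols at this position, plus two markers
# recording whether each simulated head is currently here.
from dataclasses import dataclass

@dataclass
class TwoTapeCell:
    sym1: str     # symbol on tape 1 at this position
    head1: bool   # is tape 1's head on this square?
    sym2: str     # symbol on tape 2 at this position
    head2: bool   # is tape 2's head on this square?

# Tape 1 reads "ab" with its head on the 'a'; tape 2 reads "xy" with its
# head on the 'y'.
combined = [TwoTapeCell("a", True, "x", False), TwoTapeCell("b", False, "y", True)]
```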
Multiple heads on one tape
This is easier than encoding multiple tapes. We merely encode the heads onto the tape, and simulate moving them around.
A two-dimensional “tape”
All that we need to do is find a mapping from 2D to 1D, which is fairly easy. One approach is to work in diagonals, in the order (0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0), and so on.
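That diagonal ordering is just a pairing function; here is a small sketch of it (the function name is ours):

```python
# Map square (r, c) of the 2D tape to a position on the 1D tape by
# walking the diagonals (0,0), (0,1), (1,0), (0,2), (1,1), (2,0), ...
def diag_index(r: int, c: int) -> int:
    s = r + c                      # which diagonal the square sits on
    return s * (s + 1) // 2 + r    # squares on earlier diagonals, plus the offset

print([diag_index(r, c) for (r, c) in
       [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0)]])
# [0, 1, 2, 3, 4, 5]
```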
Non-determinism
We can simulate non-deterministic behavior in sequence, doing all length 1 computations, then length 2, etc., until we reach a halt state for one of the non-deterministic choices. So we see that while non-determinism can save a lot of time, it does not change what can (eventually) be done.
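A sketch of this simulation in Python (our own helper, not a standard construction): explore all computations in breadth-first order, so every 1-step computation is tried before any 2-step computation, and stop as soon as any branch halts in a final state.

```python
from collections import deque

# ndelta maps (state, symbol) to a list of (state, symbol, move) choices.
def run_ntm(ndelta, finals, start, tape_str, max_steps=1000, blank="#"):
    """Breadth-first search over configurations of a nondeterministic TM.
    Returns True if some branch reaches a final state within max_steps."""
    queue = deque([(start, {i: ch for i, ch in enumerate(tape_str)}, 0, 0)])
    while queue:
        state, tape, head, steps = queue.popleft()   # shortest computations first
        if state in finals:
            return True                              # some choice sequence accepts
        if steps == max_steps:
            continue
        symbol = tape.get(head, blank)
        for (q, b, move) in ndelta.get((state, symbol), []):
            new_tape = dict(tape)
            new_tape[head] = b
            new_head = head + {"L": -1, "R": 1, "S": 0}[move]
            queue.append((q, new_tape, new_head, steps + 1))
    return False                                     # no branch accepted in time
```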