..and how they can help us understand our DNA
Turing machines are the ultimate computers. A Universal Turing machine can compute so well, it can even compute other Turing machines that can then compute everything else (that is, every problem that can be solved by an algorithm).
Turing machines are the foundations of every computer you will encounter. The famous Von Neumann architecture was inspired by Turing’s ideas about how to set up computing devices in principle (although you add things like a Random Access Memory to make things go much more efficiently).
The Turing machine as envisioned by Turing in 1936 is very simple. You can implement it physically in a hundred different ways. It’s much more than a specific computing device. It’s a mathematical model more than it is an actual machine. And it is very powerful: in the last 80 years, no one has come up with anything that can’t be computed by a Universal Turing machine.
Therefore, Turing machines don’t only give guidance into how to set up a computer, but they transcend their pragmatic applications. Their fundamental structure defines the laws of computation itself in a generality that makes it likely to assume that they have to hold everywhere in the universe.
Turing ended up using them to solve Hilbert’s Entscheidungsproblem (see here for the original paper), the question of whether there is an algorithm that can test wether a statement is provable in a given set of axioms.
But how do you get one of these magical Turing machines? How can you even think of building a machine with such powers? The answer is: you cannot only build one. You can even be part of one.
Get a piece of tape, a pen and a notebook.
Like I said, Turing machines can be physically implemented in many ways. The one I’m talking about here is close to Turing’s original conception, but is in some way even simpler, because there is not technical, abstract machinery (beside your brain) involved whatsoever.
The tape is the information storage of the Turing machine. It can literally just be a long piece of paper (in the best case, infinitely long, but you probably don’t have one of those at home at the moment), subdivided into small rectangles. On the tape, you have a lot of blank spaces that you can draw things into.
The things that you draw onto the rectangles are called symbols. Their purpose is to transport information forward in time and make it acessible to computation.
The information they carry forward in time always depends on the code you use. Symbols only mean something in a context: this is what Shannon’s information theory teaches us. The symbols are part of a finite set called an alphabet. The actual alphabet is a good example of an alphabet: a finite set of arbitrary shapes that we have agreed on to correspond to noises we make with our mouth.
You can draw apples, if you know what an apple is supposed to mean. You can use your fantasy, or even use the word “fantasy” as a symbol. The possibilities are endless.
Computers usually stick to what they know: they use ones and zeros, which is our way of representing two minimally distinguishable symbols. You could use 1s and 2s instead, or apples and oranges (although that could lead to problems if you need to compare them) and so on and so on. The only important thing is that they can be distinguished when read out.
You, equipped with a Pen
You are sitting besides the tape with a pen in your hands. You know how to draw really well so you can draw all the possible symbols onto the tape. Further, you know how to identify and distinguish all possible symbols after looking at them.
The notebook carries the instructions that tell you how to work with the tape. They tell you what to do in all possible situations you will be faced with when operating the machine.
There are a couple of different, numbered lines in the notebook, carrying two kinds of information:
- what you are supposed to do on the tape when seeing a certain symbol
- to which line to jump afterwards in the notebook
You can do two different things on the tape:
- You can move one rectangle to the right or to the left, or you stay where you are
- You can delete the symbol you are currently seeing and draw a different symbol, or you can do nothing
The lines in the notebook tell you what to do. Let’s say there are 8 lines in the notebook. Line 1 holds the first instructions. Assuming we have just two kinds of symbols, 1 and 0, and in the beginning the tape is by default filled with zeros, they can for example read:
“If you read a 0: leave the 0, move one to the left and go to line 5 in the notebook.
If you read a 1: draw a 0, move one to the right, go to line 4.”
After carrying them out, you find yourself in a different line in your notebook. You move around in the notebook and on the tape until you end up in the final line. Let’s say this is line 8 of the instructions:
“If you read a 0: draw a 1 and end the procedure.
If you read a 1: draw a 0 and end the procedure.”
This final line ends all moving along and reading and writing on the tape.
That’s it. This is the basic setup of a Turing machine. While things can still become a little more difficult when trying to do actual computations with it, the basic premise is extremely simple.
The Main Components
To summarize, the main components of a Turing machine are
- A finite set of symbols called an alphabet
- The (infinite) tape which stores the symbols and functions as the memory of the machine.
- A movable read/write head operating on the tape. It can move one step to the right, one to the left, or stay where it is, depending on the instructions.
- A notebook carrying the instructions, including the initial state and the final state of the machine. This can be thought of as the processor of the machine.
Doing actual Computations
To carry out a computation, you need to find a procedure (lines of instructions in the notebook) that can manipulate the symbols on the tape in a sensible way as to carry out the desired computation.
You can work out procedures for adding, substracting, multiplying and dividing numbers on a Turing machine. And even more than that: while things get messy quickly, Turing’s claim that you can do every possible computation with a Turing machine is still unrefuted after 80 years.
Let’s say you are trying to add two numbers. What you first need to know is how to represent the numbers on the tape. The representations you choose determine what symbols you have to use, and the code tells you what the symbols mean. This is a crucial point. The machine itself does not need to know what the symbols it is manipulating mean. If you use 1s and 0s as your basic symbols, a composite symbol like 1110000010101001 can mean infinitely many things depending on the code you are using.
If we’re talking about numbers, there is a standard way of representing them with 1s and 0s. This a possible code to use, but you could also use the so-called unary code: a 1 is a 1, a 11 is a 2, a 111 is a 3 and so on.
You start out with the two numbers you want to add up written onto the tape in a way that is consistent with the instructions: the lines in your notebook need to be tailored to the code and the symbols you are using.
Let’s say you want to add 4 and 8. You write them onto the tape in binary:
Input: 1 0 0 $ 1 0 0 0
Note here that the dollar sign is a new symbol that tells the machine where the space between the numbers is, so in effect you have three different symbols in your alphabet. This isn’t strictly necessary. Any possible finite alphabet can be encoded by 1s and 0s, just as our decimal numbers can be encoded by it. This just to make the point that you can add symbols to make things more concise.
Then you can work out a procedure (lines in a notebook) that manipulates them, ending up with the result 12 in binary:
Output: 1 1 0 0
These procedures are readily found for simple arithmetic problems, although I’m leaving out the details here (you can read up here on how to add binary numbers with Turing machines explicitly).
Universal Turing Machines
Turing realized that you don’t really need a notebook when you can put all the information from the notebook straight onto the tape itself. Universal Turing Machines are kind of like Turing machines, but without the notebook. The notebook is essentially just a piece of paper carrying information, which the tape already is as well. So you don’t need two pieces of paper to get the job done.
This means that the tape can carry instructions for a specific Turing machine (like the 8 lines of instructions we had in the notebook earlier).
You first put these 8 lines of code onto the tape, and further add to the tape the input you would have given to the Turing machine.
The Universal Turing Machine then executes the function as if it were the Turing machine described by these 8 lines of instructions, and gives the same output as if we had used this Turing Machine ( with notebook) in the first place.
Note that this is pretty much what a modern computer does: the instructions on how to do run the computer are part of the computer itself. We call them software.
Turing Machines and DNA
There are a lot of correspondences between the workings of a Turing machine and how our DNA functions, although they differ in some regards: the purpose of the DNA is different to that of a Turing machine, as DNA is not computing anything. It nevertheless provides a cool angle on the computer-scientific foundations of our genes.
Let’s go through the components one by one:
- The alphabet consists of the four bases ( A, T, G and C). These are the symbols written on the tape.
- The code connects the symbols to the amino acids they are coding for. It is very interesting to note that the code really is purely symbolic, as in there is no chemical necessity that explains the structure of the codon (like a base pair sequence ACG or CAG) and the amino acid it is coding for.
- The instructions on how to build a protein are called a gene. They can be thought of as the overall instructions for the machine, like an algorithm for building a certain protein.
- The tape is the DNA itself. It carries the composite symbols (the codons).
- The read-write head is a bit different from the one in a conventional Turing machine: it is more of a read-head. It moves along the DNA in one direction transcribing the codons to the messenger RNA, which in turn end up telling the ribosomes which amino acids to string together.
- Transcription factors are proteins that bind to a DNA sequence and turn on and off the transcription of certain genes. They can in some sense be thought of as determining the state of the machine by changing what part of the tape is transcribed and what isn’t.
- There are also start/stop codons (bases sequences) which mark the initial state and the final state of the transcription process.
- The output itself can be thought of as the proteins, althought this kind of output is very different in nature than that of a normal Turing machine (which would be something written on the tape).
It is absolutely remarkable to me that nature has found a way to construct this machinery through an evolutionary mechanism, and even more so from the formal perspective of Turing machines.
And it’s similarily incredible how far humans have come in the last 80 years when it comes to building computers. Turing’s work was the starting point for understanding the possibilities of computing devices (although Charles Babbage and Ada Lovelace had done great work a century before with the difference machine), and for understanding that all computations could be broken down to the execution of extremely simple tasks.
Nevertheless it would have been madness back then to have the audacity to even think of building the device you are probably reading this article on right now (this was even before the days when Watson alledgedly said that he thought that “…there is a world market for maybe five computers.”).
The modern computer stands tall as one of the greatest testaments to the ingenuity of the human imagination. Of our ability to dream up something in our mind, to show that it could in principle work, and to have the boldness and perseverance to carry through with making it happen.