The Undecidability of the Halting Problem
DRAFT
One of the foundational results of Computer Science is that there is no program which can reliably tell if another program will halt on a given input.
This page presents a formal proof of this fact.
FIXME: some more blurb:
The unsolvability of the halting problem means that we can't (for most program languages) write programs that will soundly and completely automatically check a program for some property. E.g.:
Will this program ever output 1?
Will this program ever output a number between 0 and 100?
Will this program ever issue an instruction to
Vocabulary
Ways of Building Programs and Data
The following function symbols give us ways of building programs and data values.
true()
,false()
,loop()
are all basic programs that ignore their input and (a) returntrue
, (b) returnfalse
, (c) loop forever producing no answer.pair(x,y)
represents a pair of data itemsx
andy
. We will model programs taking multiple inputs by giving them inputs as pairs.duplicate(p)
is a program that duplicates its inputx
to a pairpair(x,x)
, and then executes as the programp
with that pair as input.if(p1, p2, p3)
is a program that runsp1
on the input, ifp1
returnstrue
then it runsp2
and ifp1
returnsfalse
then it runsp3
.
We will only need to assume this minimal set of function symbols for building programs to prove that the halting problem is undeciable. We do not say in our axiomatisation below that these are the only ways of building programs, only that the underlying model of computation must have at least these ways of constructing programs.
The Execution Predicate
We define our world of computational things via one predicate:
exec(program, input, output)
-- meaning that when we runprogram
oninput
the result isoutput
. Note that there may be no output for a given input (i.e. the program never gives us an answer), or there may be multiple possible answers for an input (i.e. the program may be non-deterministic).
We do not distinguish between things that are program-like and things that are data-like. In particular, a program can take itself as an input. This flexibility of self reference will be crucial for stating the halting problem and proving that it is undecidable.
Axioms
Next we have axioms that tell us how each of the different kinds of program executes. We'll not actually need all of these axioms to complete the proof of the undecidability of the halting problem, but they serve to show how one can use logic to specify how programs execute.
These axioms are meant to establish some basic properties of computation that any realistic kind of programs ought to satisfy: we can output fixed values, duplicate data, not give answers, and make decisions.
The true
and false
programs
The program true()
outputs true()
for any input:
∀x. exec(true(), x, true())
and if it outputs anything, then that thing is equal to true()
:
∀x. ∀y. exec(true(), x, y) → y = true()
Similarly, false()
outputs false()
for any input:
∀x. exec(false(), x, false())
and if it outputs anything, then that thing is equal to false()
:
∀x. ∀y. exec(false(), x, y) → y = false()
The loop
program
The program loop
never outputs anything:
∀x. ∀y. ¬exec(loop(), x, y)
The duplicate
program
The program duplicate(p)
acts like I said above:
∀p. ∀x. ∀y. exec(p, pair(x, x), y) → exec(duplicate(p), x, y)
and this is the only way it acts:
∀p. ∀x. ∀y. exec(duplicate(p), x, y) → exec(p, pair(x, x), y)
The if
program
The program if(p1,p2,p3)
acts like I said above. We split into two cases, one for when the condition p1
returns true
:
∀p1. ∀p2. ∀p3. ∀x. ∀y. exec(p1, x, true()) → exec(p2, x, y) → exec(if(p1, p2, p3), x, y)
and one when the condition returns false
:
∀p1. ∀p2. ∀p3. ∀x. ∀y. exec(p1, x, false()) → exec(p3, x, y) → exec(if(p1, p2, p3), x, y)
We also need to specify that this is the only way that if
executes:
∀p1. ∀p2. ∀p3. ∀x. ∀y. exec(if(p1, p2, p3), x, y) → ((exec(p1, x, true()) ∧ exec(p2, x, y)) ∨ (exec(p1, x, false()) ∧ exec(p3, x, y)))
Explanation
Why do we need “both directions” for the axioms for dup
and if
? This is because we are going to have to reason backwards about program execution to answer questions like “if the output was y
, then what happened during the program?”.
What does it mean for a program to halt?
We can use the exec
predicate to define what it means for a program to halt. A program p
halts on an input x
if there exists an answer y
that executing p
with input x
gives the output y
:
∃y. exec(p, x, y)
What does it mean for a program to solve the halting problem?
To specify when we have a program that solves the halting problem, we use a predicate solution(p)
. Any solution must satisfy the following four properties, so we say that solution(p)
implies each one:
A solution must always say “true” or “false”:
∀p. solution(p) → (∀q. ∀x. exec(p, pair(q, x), true()) ∨ exec(p, pair(q, x), false()))
A solution must never say “true” and “false”:
∀p. solution(p) → (∀q. ∀x. exec(p, pair(q, x), true()) → exec(p, pair(q, x), false()) → F)
If a solution says “true” for a given
p
andx
, then executingp
onx
halts:∀p. solution(p) → (∀q. ∀x. exec(p, pair(q, x), true()) → (∃y. exec(q, x, y)))
If a solution says “false” for a given
p
andx
, then executingp
onx
does not halt (i.e. loops):∀p. solution(p) → (∀q. ∀x. exec(p, pair(q, x), false()) → ¬(∃y. exec(q, x, y)))
FIXME: put some quiz questions here.
Any solution to the halting problem always halts.
If we didn't have this property, then there is an easy solution to the halting problem: run the program, if it halts then output 'true', otherwise never give an answer.
What if we had a solution to the Halting Problem?
If we had a solution to the halting problem, then we could use it to make larger programs.
One program we can make is the following. Let's say we have a solution p
to the halting problem, then we could make the following program (in pseudocode):
define spoiler(x):
if(p(x,x)):
loop-forever
else:
return true
So this program takes an input x
and asks p
whether or not x
will halt when given itself as an input. If p
says “true”, then it loops forever; if p
says “false”, then it returns “true”. We'll prove these facts formally from the axioms below.
We call this program spoiler
because we can use it to prove that there cannot be any solution to the halting problem.
We can write this program in the format described above like so:
if(duplicate(p), loop(), true())
Solution says “halts”, then spoiler loops
We can prove formally from the axioms that, if the solution p
says “true” then the spoiler program does not halt:
FIXME: put a rough plan of the proof here.
Solution says “loops”, then spoiler halts
Conversely, we can prove from the axioms that, if the solution p
says “false” then the spoiler program does halt:
FIXME: put a rough plan of the proof here
Undecidability of the Halting Problem
Given these two facts we have proved about the spoiler program, we can now prove that there can be no solution to the halting problem. The proof goes like this:
To prove that there cannot be a solution, we assume that there is a solution
p
and proveF
(“false” as a logical proposition) to show that there is a contradiction.Since
p
is a solution it must either say “true” or “false” when given the spoiler program as both the program and its input. This means the proof splits into two cases:If
p
says “true”, then we know that:The spoiler program must halt when given itself as an input, because
p
is a solution to the halting problem; andThe spoiler program must not halt when given itself as an input, because
p
said “true” so we can use the first result above.
We cannot have that a program both halts and does not halt, so we can prove
F
.If
p
says “false”, then we know that:The spoiler program must not halt when given itself as an input, because
p
is a solution to the halting problem; andThe spoiler program must halt when given itself as an input, because
p
said “false” so we can use the second result above.
We cannot have that a program both halts and does not halt, so we can prove
F
.
We have proved
F
in both branches of the proof, so we have a contradiction. The only assumption we made (apart from our basic assumptions about programs) is that there is a solution to the halting problem. So it must be impossible for such a solution to exist.
We can carry this proof out formally from our axioms of computation and the two results we proved above:
What does this proof show?
Ultimately, this proof shows that it is not possible to have the following things simultaneously:
Further reading
TBD...