A certain type of researchers value papers based on whether
they have a soundness and completeness theorem. I will refer to
them as the Cult of the Sound People. In this post I will try
to explain in simple terms what soundness and completeness means,
what it does not mean, why it is important, and why it is not
important. Follow me.
As an example, consider a tool that is supposed to detect
bugs in programs. We first define some abstract but precise
ideal for what the tool should do. Then we design a real tool and,
in the process, we might deviate from the ideal. In this example,
the ideal might be the following: If there is any input for which
the analyzed program ends with a runtime exception then the tool should
tell us that something is wrong; otherwise the tool should be
quiet. When you get to implementing the tool you might decide
in some cases that some bugs are so hard to find and so
unlikely that you are not even going to try. That makes the tool
unsound. In other situations you see maybe that a certain
usage pattern is almost always a bug but in certain situations,
also hard to detect, it is not. So you decide to give a warning
always. That makes the tool incomplete. In short, sound
means that no bugs are missed, and complete means that correct
programs will not make the tool complain about bugs.
To be more precise we consider a language and a set of rules
to manipulate propositions in that language. The set of rules is
sound if it generates only true propositions when it starts with
true propositions. The set of rules (plus a set of propositions
known to be true) is complete if it can prove any true proposition.
In the example above the (unspecified) language is something
capable of saying "the program (enclose program here) does
not produce run-time exceptions." The tool is basically a
set of rules that tries to prove such statements. A sound tool
will prove such a statement only if the statement is correct;
a complete tool will prove such a statement if the statement
is true.
The language and the rules should be precise enough so that
meta-proofs about them can be done. To take an example,
consider boolean expressions that use the (associative)
operators \/ and /\ with operands being variables (x, y, ...)
and constants (0, 1). An example proposition in this language
is a\/(b/\0). Is this `true'? Don't confuse `true'
as used earlier with the constant 1 in this particular language!
In fact, most people say `valid' for what I named `true'
earlier just to avoid the confusion with 1 (which some
people call `true'). OK, I hope you're not too confused
about this terminology mambo-jumbo. Anyway, the idea is
that we need some definition of when a proposition is
valid/true/OK. In the case of this language the tradition is to
say that you get the answer 1 whatever constants you substitute
for variables and then evaluate according to the usual rules
to compute \/ and /\ between constants. This
definition also gives an algorithm to decide whether
a proposition is true. But for more complicated languages
it may be possible to have a definition but not an algorithm!
In particular, whenever you say "for all real numbers x,
property (insert property here) must hold," you are
potentially in trouble because the trivial algorithm of
iterating over all real numbers... does not exist. There
are way to many real numbers, there's more of them than
natural numbers.
Anyway, it's probably clear by now that whenever possible
it is desirable to have sound and complete algorithms, sets of rules,
tools, and so on. It should also be clear that you can't claim
something to be sound or complete unless the language and the
set of rules are formal enough to allow meta-proofs about them.
In particular the ideal of "not producing run-time
exceptions" needs to be complemented with a mathematical
definition of what the program does when it is run, the so-called
semantics of the language.
Even if such mathematical definitions exist, the tool is proven
sound (or complete) with respect to those rules.
It is still possible to have a program crash
even if the tool says that won't happen because... the interpreter
that actually runs the program does not correspond to the
mathematical definitions of the semantics of the language.
Some more proofs are needed. Even worse, even if the
interpreter is proven correct with respect with those rules,
well, we might have a malfunctioning DRAM and the program
can still go wrong. Does this mean the battle is lost?
Does it mean we should stop coming up with all these proofs
that do nothing?
Of course not! A program that was proven correct has much
less chances of being wrong than one that was not. (Folklore fact:
Proven programs need testing.) And a program that was
certified as bug free by a program that was proven
to be correct has even less chances of being wrong.
And if it is run by an interpreter (or, what do they say these days?
ah, virtual machine) that was proven to be correct then the
chances of being wrong are even less. And... well you get the
idea. All we do with these proofs and meta proofs is considerably
reducing the chances of programs going wrong. But there is some
effort involved in this proving and the challenge is to find
a sweet-spot. When does proving become more of a time-consuming
sport and a self-justifying contemplation than something really
useful?
If you would develop a tool like the one described here would
you strive for soundness? Completeness? Both? Would you do the
same if the goal is to obtain a tool that is useful to
programmers in the industry and that makes them more
efficient?