## 12 July 2007

### Sound and Complete

A certain type of researchers value papers based on whether they have a soundness and completeness theorem. I will refer to them as the Cult of the Sound People. In this post I will try to explain in simple terms what soundness and completeness means, what it does not mean, why it is important, and why it is not important. Follow me.

As an example, consider a tool that is supposed to detect bugs in programs. We first define some abstract but precise ideal for what the tool should do. Then we design a real tool and, in the process, we might deviate from the ideal. In this example, the ideal might be the following: If there is any input for which the analyzed program ends with a runtime exception then the tool should tell us that something is wrong; otherwise the tool should be quiet. When you get to implementing the tool you might decide in some cases that some bugs are so hard to find and so unlikely that you are not even going to try. That makes the tool unsound. In other situations you see maybe that a certain usage pattern is almost always a bug but in certain situations, also hard to detect, it is not. So you decide to give a warning always. That makes the tool incomplete. In short, sound means that no bugs are missed, and complete means that correct programs will not make the tool complain about bugs.

To be more precise we consider a language and a set of rules to manipulate propositions in that language. The set of rules is sound if it generates only true propositions when it starts with true propositions. The set of rules (plus a set of propositions known to be true) is complete if it can prove any true proposition.

In the example above the (unspecified) language is something capable of saying "the program (enclose program here) does not produce run-time exceptions." The tool is basically a set of rules that tries to prove such statements. A sound tool will prove such a statement only if the statement is correct; a complete tool will prove such a statement if the statement is true.

The language and the rules should be precise enough so that meta-proofs about them can be done. To take an example, consider boolean expressions that use the (associative) operators \/ and /\ with operands being variables (x, y, ...) and constants (0, 1). An example proposition in this language is a\/(b/\0). Is this true'? Don't confuse true' as used earlier with the constant 1 in this particular language! In fact, most people say valid' for what I named true' earlier just to avoid the confusion with 1 (which some people call `true'). OK, I hope you're not too confused about this terminology mambo-jumbo. Anyway, the idea is that we need some definition of when a proposition is valid/true/OK. In the case of this language the tradition is to say that you get the answer 1 whatever constants you substitute for variables and then evaluate according to the usual rules to compute \/ and /\ between constants. This definition also gives an algorithm to decide whether a proposition is true. But for more complicated languages it may be possible to have a definition but not an algorithm! In particular, whenever you say "for all real numbers x, property (insert property here) must hold," you are potentially in trouble because the trivial algorithm of iterating over all real numbers... does not exist. There are way to many real numbers, there's more of them than natural numbers.

Anyway, it's probably clear by now that whenever possible it is desirable to have sound and complete algorithms, sets of rules, tools, and so on. It should also be clear that you can't claim something to be sound or complete unless the language and the set of rules are formal enough to allow meta-proofs about them. In particular the ideal of "not producing run-time exceptions" needs to be complemented with a mathematical definition of what the program does when it is run, the so-called semantics of the language.

Even if such mathematical definitions exist, the tool is proven sound (or complete) with respect to those rules. It is still possible to have a program crash even if the tool says that won't happen because... the interpreter that actually runs the program does not correspond to the mathematical definitions of the semantics of the language. Some more proofs are needed. Even worse, even if the interpreter is proven correct with respect with those rules, well, we might have a malfunctioning DRAM and the program can still go wrong. Does this mean the battle is lost? Does it mean we should stop coming up with all these proofs that do nothing?

Of course not! A program that was proven correct has much less chances of being wrong than one that was not. (Folklore fact: Proven programs need testing.) And a program that was certified as bug free by a program that was proven to be correct has even less chances of being wrong. And if it is run by an interpreter (or, what do they say these days? ah, virtual machine) that was proven to be correct then the chances of being wrong are even less. And... well you get the idea. All we do with these proofs and meta proofs is considerably reducing the chances of programs going wrong. But there is some effort involved in this proving and the challenge is to find a sweet-spot. When does proving become more of a time-consuming sport and a self-justifying contemplation than something really useful?

If you would develop a tool like the one described here would you strive for soundness? Completeness? Both? Would you do the same if the goal is to obtain a tool that is useful to programmers in the industry and that makes them more efficient?