21 September 2009

There is no segfault

A short puzzle related to induction.

To prove that all elements of a set S have a property P we can use induction:

• [induction step] we show that the elements of size s have the property P, assuming that all elements of size <s have it, and
• [base case] we show that the smallest elements have the property P

For example, if S is the set of natural numbers and the size of a natural number is itself, then we can prove that 2Σknk=n(n+1) by showing that:

• [base case] 2Σk0k=0, and
• [induction step] 2Σk<nk=n(n-1) ⇒ 2Σknk=n(n+1)

As another example, if S is the set of (non-empty) binary trees in which each node either is a leaf or has exactly two children then we can show that the number of leafs is one more than the number of non-leafs by induction. In this case the size is the number of nodes in the tree. If we denote by l(t) the number of leafs in the tree t and by k(t) the number of non-leafs, then the statement we want to prove is l(t)-k(t)=1 for all trees t. Induction says that it suffices to show:

• [base case] l(•)-k(•)=1, and
• [induction step] (l(t1)-k(t1)=1 ∧ l(t2)-k(t2)=1) ⇒ l((t1,•,t2))-k((t1,•,t2))=1

You may want to take a moment to convince yourself (if you haven't already) that the base case and the induction step are fairly easy to check for both examples above.

But there is another type of induction! Suppose you have a program Q that is a set S of functions and you want to prove the following theorem: Program Q will not dereference a null pointer. One way to approach this task is to prove for each function that it will not dereference a null pointer ("throw NullPointerException" in Java, segfault in C, etc.) assuming that all the other functions in the program do not dereference a null pointer.

However, there is something troublesome about this. Until now we had an order: We think about the first proof by induction as first showing that an identity holds for n=0 then for n=1 then for n=2,... and whenever we get to proving for some value we rely on what was proven "before". But the last proof enjoys no such property: We can happily have a program made out of n functions, each calling all the n functions. Let's put it differently. Say we have four functions: a, b, c, and d. Is it possible for a, b, and c to be OK but d not? No, it isn't, because one of the proof obligations was to show that d is OK given that the others are. But... what's stopping c and d to be both not OK?

Still, I claim the proof is fine. Can you see why it works? What are the general conditions under which such proofs work?

Anonymous said...

The first type of (general) induction you introduced is called well-founded induction and the base case is subsumed by the induction case, so you only need one.

Your last proof is a coinductive proof, not an inductive proof.

rgrig said...

Thanks! Did you see my "answer"? I mention there both these things, and I also mention that I don't understand why the latter is the same as coinduction. I'm trying (during evenings) to go thru some tutorial that I find hard to digest but if you know of a nicer explanaition/resource I'd like to know.

rgrig said...

(Well, I mention the base case is redundant but I didn't mention any of the "well-founded induction"/"strong induction"/"complete induction" names.)