27 September 2009

There is no segfault: sort of an answer

[edit: updated link to dead cat]

My attempt at answering my own puzzle.

But first I should acknowledge that the title comes from the song There is no dead cat. Second, I should note that my presentation of what induction means is unnecessarily redundant: The base case is included in the induction step given that there is a smallest element. (And, yes, that’s my strange sense of humor.)

OK, now the challenge: Under which conditions can we conclude P(n) for all nS given that ∧kn P(k) implies P(n) for all nS?

Read literally the question is not interesting: “Under which conditions p implies q?” has the obvious and useless answer “q, since pq implies q.” So, why is this an interesting question? Because we reduce the former to the later whenever P(n) has the form “the piece of code n is OK,” for some value of OK, one example being “does not dereference a null pointer.” Another reason that makes the question interesting to me is that I feel we are justified in looking at code one function at a time to assess whether it “goes wrong” but I don’t know it, because I never read or gave a coherent explanation: The matter is always treated as “obvious” as far as I can tell. Also, I formulated the problem in more general terms so that the answer has a chance of being useful in a different setting than program verification. Finally, I vaguely remembered reading that modular reasoning about a system is a form of coinduction: After reading a little about coinduction I’m not so sure anymore. (I looked at slides introducing coinduction in Coq by Yves Bertot and at blogs like n-category and a neighborhood of infinity. Reading through this tutorial is slow.)

The intuitive reason why we can analyze program pieces one at a time is that any counterexample is an execution that has one place where null is dereferenced. A bit more precise: An execution is a sequence of states. A state is a pointer to the next program statement to be executed plus the content of the memory (that is, a map from variables to values). The wrong state is special. The assumption ∧kn P(k) amounts to “ignore executions that go to the wrong state after executing a statement of a module kn.”

Problem. Can we use modular reasoning to put upper bounds (other than zero) on the number of errors that can occur in an execution? For example, is it possible even in principle to conclude that “this program may break its contract at most three times during its execution” without doing a global analysis? I suspect not.

Is there a useful way to generalize the answer with “executions”? Since I’ve been reading about model theory I’m inclined to view the program as a theory whose models are its executions. In this language the trick is that we can systematically explore parts of a model where a formula may not hold, assuming that in all other parts it does hold.

For the record, I’m not satisfied with my answer.

No comments:

Post a Comment

Note: (1) You need to have third-party cookies enabled in order to comment on Blogger. (2) Better to copy your comment before hitting publish/preview. Blogger sometimes eats comments on the first try, but the second works. Crazy Blogger.