20 December 2006

Specifications as comments

Programming bugs are annoying for the users and embarrassing for the programmers. Like real bugs, they are pervasive. If we chose a random bug to analyze, chances are it either stems from a special case the programmer forgot about or it reflects the programmer's intent. Sometimes we can't tell which, but most of the time we can. When the programmer intends something different than the user wants, it is either because he does not know what the user wants or because what the user wants involves too much work. The worst kind of bug is that which eludes classification. Such situations arise from the lack of comments and from obscure code. So, please write comments that document intent whenever the code does not mirror it. Let's look at an example.

  void f(int n, int k) {
    int i;
    int[] a = new int[n]; a[0] = k;
    while (true) {
      // Visit a
      for (i = 0; i < n-1 && a[i] == 0; ++i);
      if (i == n - 1) return;
      a[0] = a[i] - 1; ++a[i+1];
      if (i != 0) a[i] = 0;
    }
  }

I assume the code is pretty cryptic for nearly everyone who sees it for the first time. So let me explain. I want to generate all arrays of length n whose elements are nonnegative and sum up to k. In a program I should not let you wonder what the code is supposed to do. I should let you know my intent in a comment placed such that you are likely to read it before reading the code. Of course, you are wondering now why the hell does it work, if at all. This is another blunder. I should not let the reader of my code in a state of puzzlement. I should either give a convincing argument (aka a proof) or point to such, if it is in a rather standard and available reference. Here I'll leave it as a puzzle since this is a blog post, not a program, and you might enjoy thinking about it. There is however something else that I `forgot' to mention in comments. The function will work as intended only if n is positive and k is nonnegative.

This brings us to the first type of bugs, those that are caused by a programmer failing to keep track of all cases. Humans are good at creative leaps; machines are simply no match. Machines are good at keeping track of many cases and nitty-gritty details; humans are simply no match. That's good enough reason to develop tools that catch this type of bugs. We already have type systems enforced by compilers, tools such as FindBugs, and interesting dynamic checks built in interpretors (such as the check for array out of bounds in Java). We prefer static checks for the same reason we prefer to educate people and prevent crime rather than to capture murderers. Let's simulate an (extended) static checker working on the given code.

The first interesting expression is new int[n]. For this to work we must have n>0. That's OK---we said that the good version of the code would include a comment saying such. Wait! We are a tool and we can't read comments. Earlier we saw that n>0 should be written to help programmers understand the code and now we see we should do it formally if we want tool support.

  void f(int n, int k) requires n > 0 && k >= 0; { /* ... */ }

That's not too bad: It is easy to read by both humans and machines. Let's continue. We give a value to a[0]. That should be OK since a was allocated earlier. Now we look at the for loop. No statement changes a (the pointer) and i ranges from 0 to at most n-1. So accesses a[i] in the condition should be fine. We can do this reasoning automatically because the loop follows a common pattern. When the loop ends we know that i==n-1 or a[i]!=0. If we continue, then we know that i<n-1 and a[i]!=0. It even seems to be feasible for a tool to check that the following statements maintain the invariant sum(a)==k. Again, we need to write this formally if we want tool support.

    while (true) invariant (sum(a)==k) { /* ... */ }

It is not obvious how to automate the line of reasoning we followed. But, hopefully, it is clear that we did not need much creativity. It almost felt like following some simple rules. This gives us hope that it can be done by machines.

The rest is harder. If we want to prove that the while loop terminates and that by that time a represented all length-n indexed partitions of k, then we do need some creativity. The good news is that the help we must provide to a tool to complete the proof roughly corresponds to the help we need to provide to programmers to understand why the code works. Again, we need to transform some of the existing comments that accompany good code into a formal notation so that we can have tool support. Of course, we need to weight the effort of expressing our thoughts in a formal language (rather than English) against the benefits derived from tool support.

This concludes our short exploration of silly mistakes. The other bugs are mismatches between the programmer's intent and the user's desire. Here I use the terms `programmer' and `user' in a broad sense. One may think about the interface between an API writer and an API consumer as an interface between a programmer and a user; another may think of the interface between a computer game and a gamer as an interface between a programmer and a user. The same principles apply.

Oral communication is imprecise and causes false agreements---people hear what they want to hear, not what they are being told. Written communication requires more effort and doesn't solve the problem. Formal written communication is much harder and almost solves the problem. The parties involved need mathematical maturity and, even so, misunderstandings are possible. The problem of capturing the requirements seems hopeless. Yet, my experience says that the quality of the requirements description is the most important factor for the success of a project. So we must try to do it well.

My point of view is the following. For each interface we must chose a language understandable by both parties. If the interface is between an accounting application and an accountant, then we must not use math to specify it because, as sad as this is, accountants don't understand math. On the other hand, if we specify a procedure, then we can reasonably expect that the user will be a programmer and, hence, will have some mathematical maturity. The other important aspect is to be prepared for change. If a misunderstanding surfaces, fix it and update the specification to avoid repeating the situation. This suggests the following rule of thumb: Spend more time specifying aspects that are costly to change.

Conclusion. I hope I convinced my readers that specification---used correctly and in the right amounts---can be an efficient weapon in our daily fight against bugs.

Questions. If you can help me explore this subject then read on. Otherwise, thanks for staying with me up to this point.

  1. Is there a statistical study of how many bugs are `intended', in the sense that programmers know about them?
  2. Another simple property an extended static checker could verify is that a[i]>=0 for all i. Are there other simple properties that I forgot about?
  3. Is there a standard reference for the algorithm used to produce all indexed partitions of a nonnegative integer?
  4. How would you optimize the implementation? In particular, how would you get rid of the condition if(i!=0)?
  5. How many times is the while loop executed?
  6. What parts of the post did you need to reread in order to understand? Sentences? Paragraphs? etc.
  7. Complete the proof that the code follows its intent. Propose a formal annotation that captures succinctly the creative parts of the proof.
  8. What language would you use to specify an accountancy application?

1 comment:

rgrig said...

PS: The reversed generated arrays (by the example code) are lexicographically increasing.

Post a Comment

Note: (1) You need to have third-party cookies enabled in order to comment on Blogger. (2) Better to copy your comment before hitting publish/preview. Blogger sometimes eats comments on the first try, but the second works. Crazy Blogger.