Theory and Practice: Why Do We Fail?

What is there to do if a problem refuses to be knocked down? There's a generic problem solving strategy for that.

Suppose you work on a problem and fail to solve it. You could take a break, step back, and try a different technique. If after repeated attempts the problem fails to fall, then it is time for a change in strategy. Instead of seeing patterns in the problem itself, try to see a pattern in your failed attempts. Ask yourself: Why did all these attempts fail? What do they have in common? There are two things that may happen. One is that you find a criterion that lets you quickly rule out techniques that won't work, thus saving time. Another is that you'll find you cannot solve the problem: all techniques are doomed to fail.

Let's see how this advice applies in three situations: writing correct programs, proving $P \ne NP$, and automatically proving properties of programs. The last is what I've been working on for awhile.

Coding. Designing and coding are similar activities. The difference is in scale, level of detail, and precision. When designing, you erect a tall skeleton. To work quickly, you leave out much detail, but you try to be precise. Nevertheless, designs are less precise than code, because code has to meet the minimal standard of being understood by a compiler. When coding, you flesh out the details, one organ at a time. An ‘organ’ is a functional unit, which you should code in one sitting.

There are two tricks that improve the quality of code considerably. These apply to individual coding sessions, not across coding sessions.

Begin by clarifying to yourself the high-level structure of the code. You should be able to explain why the structure is correct and good. You should also know the time and space complexity. Only once this is done, you can move on to writing the first line of code. There are some caveats to this rule of thumb. First, if it takes more than 15 minutes to clarify the structure, then you are probably trying to code more than you can in one sitting. In such a situation, the best attitude is to scale down your ambitions for the day. Second, the level of detail depends on experience: more experienced coders need less detail. As a rough guide, you should know what functions you will write, and what each of them does.
Once you finish writing the code, move away from the keyboard for 5 minutes. Then come back with a new attitude. Pretend the author is a rookie programmer (not you!), so some bugs must be lurking. Get into the meanest mindset you can — your job is to break the code.

For the second trick, the problem you are trying to solve is this: find an input that causes the program to exhibit a bug. If repeated attempts failed to exhibit a bug, then it's time for a change in strategy. Instead of trying to find a new input that may cause the program to fail, look at the ones that you already tried and make sure you understand why they don't uncover a bug. There are two things that may happen. One is that you find a criterion that lets you quickly rule out classes of inputs. The criterion would usually be a proof that the program is correct on a class of inputs. Another thing that may happen, is that you realize that the program always works. That's when you find a full proof.

In other words, the search for bugs is an instance of the general strategy laid out at the beginning.

P versus NP. Some results in theoretical computer science are known as ‘barriers’. These are results that say a certain proof technique will not work. One example is that small monotone circuits cannot solve the Clique problem.

Let me remind you what the Clique problem is. Think of a graph with $n$ vertices. It can be described by a bitstring of length $\binom{n}{2}$. Each bit says whether its corresponding edge is in the graph. For example, for a graph with vertices 1, 2, 3, and 4, the possible edges are 12, 13, 14, 23, 24, 34. So, the bitstring $110110$ represents the graph that contains edges 12, 13, 23, 24. The Clique problem asks for a family of boolean circuits, one for each number of vertices, that takes such a bitstring and outputs whether the graph contains a clique of size $k$. The constant $k$ is fixed in advance. A clique is a subset of vertices that are all pairwise connected. In the previous example, 123 is a clique, because 12, 13, and 23 are all edges of the graph. Thus, if $k$ would be 3, the answer would be ‘yes’; but, if $k$ would be 4, the answer would be ‘no’.

The result I mentioned says that if you try to build circuits using only AND and OR gates, then the circuits will get big. More precisely, there is some $k$ for which the size of the circuit for $n$ vertices is not polynomial in $n$. The argument is based on approximations of the AND and OR gates, and it is asymmetric in how it handles AND and OR. Once negations are allowed, you can easily simulate an AND with an OR and some NOTs. So, the same argument stops working, because it may not handle AND and OR asymmetrically anymore. We do not know what happens if NOT gates are allowed. We suspect that big circuits are still needed, but don't know for sure. If we would be able to prove that the result still holds in the presence of NOT gates, then we would know that $P \ne NP$. The reason is a theorem ‘whose exact authorship is apparently quite difficult to establish’ [1]:

Theorem. Let $\{f_n\}_{n \in \mathbb{N}}$ be a family of boolean functions, where $f_n$ has $n$ inputs. Let $\mathcal{L}$ be the language whose words of length $n$ are the models of $f_n$. Let $M$ be a Turing machine that recognizes the language $\mathcal{L}$. Then $$ T_M(n) S_M(n) = \Omega(L(f_n))$$ where $T_M$ and $S_M$ are the time and space used by the Turing machine, and $L(f_n)$ is the size of the smallest circuit that computes $f_n$.

Proof sketch: Given a Turing machine and a fixed $n$, you can build a circuit that handles inputs of size $n$ just like the machine. We say that Turing machines describe families of circuits; and we say the families of circuits are uniform, because they have a concise description — the machine. If no family of circuits is smaller than a certain limit, then certainly no uniform family of circuits is smaller than that limit.

In this storyline, several people attempt to prove $P \ne NP$ and fail. One of them steps back and asks why do these attempts fail? The result is a nice theorem about boolean circuits and the Clique problem. This theorem says that any attempt that does not seriously consider negation, one way or another, is doomed to fail to prove $P \ne NP$. So, this storyline is an instance of the general strategy laid out at the beginning.

Parametric Static Analysis. Static analyzers are programs that goggle other programs, looking for bugs. A naive way to analyze a program is to run it and see what it does. The trouble with this approach is that it takes forever, especially if you run the program on all possible inputs. Static analyzers do use the naive approach, but with a twist: they approximate the semantics. As an aside, note that something very similar happened in the Clique barrier: It is difficult to track what the circuit does exactly, but easier to track what the circuit does approximatively. Approximation is a fundamental tool in static analysis, and it is studied systematically in the area of abstract interpretation.

A parameterized static analyzer is one that can try various approximations, tweakable by parameters. Given a program and a potential bug, the question is whether some setting of the analyzer's parameters would succeed in ruling out the bug.

Let's apply the general strategy to this problem.

The first step is to try several parameter settings. If all of them fail to rule out the bug, then it is time to ask why. We analyze the failures and we conclude that some other parameter settings won't rule out the bug either. This is exactly what we did in [Zhang et al., On Abstraction Refinement for Program Analyses in Datalog, 2014] .

Which attempts to analyze? The general strategy involves making some attempts at solving the problem. The hope is then that analyzing the failed attempts helps us solve the problem. But, not all attempts are created equal. If an attempt is too simple and fails for trivial reasons, then analyzing it won't give us too much information. Thus, the attempts should seriously jab at the problem.

In the case of writing code, the test cases should be tough. One option is to cover a corner case. Another option is make them involved enough to exercise most of the code. (Look at A Torture Test for $\TeX$ to see what I mean.)

In the case of $P \ne NP$ you should look at serious proof attempts. In the case of parametric static analysis $\ldots$ I'm still working out what it means. Literally. That's what I'm working on now. In On Abstraction Refinement for Program Analyses in Datalog we focused on cheap attempts. But, that's like trying to break code by throwing at it the smallest test cases: a fairly good strategy, actually, but not quite the best.

Conclusion. Remember: If you work on a problem, first try to solve it in earnest, and record your attempts. If after repeated attempts the problem fails to fall, then it is time for a change in strategy. Instead of seeing patterns in the problem itself, try to see a pattern in your failed attempts. Ask yourself: Why did all these attempts fail? What do they have in common? There are two things that may happen. One is that you find a criterion that lets you quickly rule out techniques that won't work, thus saving time. Another is that you'll find you cannot solve the problem: all techniques are doomed to fail.

Notes. For more unsolicited advice on coding from yours truly, see Advice to Beginner Programmers. (But you should know that when I say ‘rule’ I don't mean ‘law’.) For the Clique barrier, see the expository article [Gowers, Razborov's Method of Approximations, 2009]. The theorem and the quote [1] is from [Razborov, Lower Bounds for Monotone Complexity of Boolean Functions, 1986] . For more on parametric static analyses, you could see a rather long previous post (Datalog and MaxSat: an Unexpected Match) or a rather abstract video (On Abstraction Refinement for Program Analyses in Datalog).

Theory and Practice

28 October 2014

Why Do We Fail?

No comments:

Post a Comment