## 26 August 2014

### Watching Horn

How to use watched literals (or, rather, vertices) to do breadth-first search in Horn hyperdigraphs.

Satisfiability solvers use a technique called ‘watched literals’. I'll present it here in the context of Horn clauses, for two reasons: (1) it is a slightly unusual context, and (2) I actually needed to implement this for my work.

Horn clauses can be difficult to visualize. So, I'll use an equivalent formulation in terms of directed hypergraphs. Unlike their undirected relatives, directed hypergraphs are less famous. A directed (hyper)arc is a pair $(X, Y)$ of vertex sets. If you are familiar with SAT, think of $X$ as the set of negative literals, and $Y$ the set of positive literals. Then, a Horn formula corresponds to a hypergraph with $|Y| \le 1$ for all arcs. (And a Horn clause corresponds to a hypergraph arc.) A definite Horn formula corresponds to the case $|Y|=1$.

We want to perform breadth-first search on definite Horn hyperdigraphs.

Let's first clarify what reachability means. For digraphs, if vertex $x$ is reachable and the digraph has an arc $(x,y)$, then vertex $y$ is reachable. For hyperdigraphs, if all vertices in $X$ are reachable and the hyperdigraph has an arc $(X, Y)$, then some vertices in $Y$ are reachable. Saying that some vertex is reachable sounds a bit weird. I chose this definition so that it corresponds on satisfiability of formulas. And, the weirdness doesn't matter as we look at the definite Horn case $|Y|=1$. (But, I should say there are other definitions of reachability on hyperdigraphs.)

How should we implement this?

For digraphs, we use a flag on each vertex to put them in one of three categories: (1) not seen, (2) seen, (3) done. At every step we pick a seen vertex $x$. Then we mark all its successors that were previously not seen as seen. And then we mark $x$ as done. This schema works for every traversal. For breadth-first search, we must pick vertices in a certain order. One way is to use a queue. But, if you are interested in distances, there is a more convenient way. We keep two sets, the current level and the next level, of seen nodes. The $x$ is picked from the current level, its successors added to the next. Also, $x$ is removed from the current level when marked done. (Not strictly necessary: The invariant that these two levels contain only seen nodes can be relaxed.) When the current level becomes empty, it gets swapped with the next level. The process ends when both levels are empty.

If we try to do the same for hypergraphs, we run into an efficiency problem. Previously, given $x$, we had to find all arcs of the form $(x,y)$. That's just the adjacency list representation of digraphs. But now, given $x$, we must find all arcs of the form $(X,y)$ such that $x\in X$ and all vertices of $X$ are seen (or done). This is a considerably more difficult problem, solved by watching literals.

Consider an arc $(X,y)$. If all nodes in $X$ are done, then $y$ must be seen or done. To ensure this is the case, we need a quick way to find arcs whose all sources are done. The trick is to think of a data structure that enforces the negation: For each arc $(X,y)$ there is at least one vertex in $X$ that is not done. When we fail to maintain the invariant, it is because an arc has the required property.

How do we make sure that every arc has an undone vertex? Well, we designate one particular vertex, which is undone, as being watched. Then, we still use the adjacency list representation, as for digraph. For digraphs, the adjacency list maps a vertex $x$ to the list of arcs of the form $(x,y)$. For Horn hyperdigraphs, the adjacency list maps a vertex $x$ to the list of arcs $(X,y)$ such that $x \in X$ is the designated watched vertex for this arc.

Finally, let's see what happens when we pick a seen vertex $x$ in the traversal algorithm. This time we begin by marking $x$ as done. This is like promising to have done the work before actually doing it. So, we'd better do it. We have to go through the adjacency list of $x$, because $x$ cannot be a watched vertex anymore. For each of the arcs, we must pick a different undone vertex to watch. For each arc $(X,y)$ we simply scan all members of $X$ until we find a suitable replacement. If there isn't any, then all sources are done, so we must mark $y$ as seen if it was not seen.

That's it.

By the way, according to Blogger this is my 100000000th post.