26 August 2014

Watching Horn

How to use watched literals (or, rather, vertices) to do breadth-first search in Horn hyperdigraphs.

Satisfiability solvers use a technique called ‘watched literals’. I'll present it here in the context of Horn clauses, for two reasons: (1) it is a slightly unusual context, and (2) I actually needed to implement this for my work.

Horn clauses can be difficult to visualize. So, I'll use an equivalent formulation in terms of directed hypergraphs. Unlike their undirected relatives, directed hypergraphs are less famous. A directed (hyper)arc is a pair (X,Y)(X,Y) of vertex sets. If you are familiar with SAT, think of XX as the set of negative literals, and YY the set of positive literals. Then, a Horn formula corresponds to a hypergraph with |Y|1|Y|1 for all arcs. (And a Horn clause corresponds to a hypergraph arc.) A definite Horn formula corresponds to the case |Y|=1|Y|=1.

We want to perform breadth-first search on definite Horn hyperdigraphs.

Let's first clarify what reachability means. For digraphs, if vertex xx is reachable and the digraph has an arc (x,y)(x,y), then vertex yy is reachable. For hyperdigraphs, if all vertices in XX are reachable and the hyperdigraph has an arc (X,Y)(X,Y), then some vertices in YY are reachable. Saying that some vertex is reachable sounds a bit weird. I chose this definition so that it corresponds on satisfiability of formulas. And, the weirdness doesn't matter as we look at the definite Horn case |Y|=1|Y|=1. (But, I should say there are other definitions of reachability on hyperdigraphs.)

How should we implement this?

For digraphs, we use a flag on each vertex to put them in one of three categories: (1) not seen, (2) seen, (3) done. At every step we pick a seen vertex xx. Then we mark all its successors that were previously not seen as seen. And then we mark xx as done. This schema works for every traversal. For breadth-first search, we must pick vertices in a certain order. One way is to use a queue. But, if you are interested in distances, there is a more convenient way. We keep two sets, the current level and the next level, of seen nodes. The xx is picked from the current level, its successors added to the next. Also, xx is removed from the current level when marked done. (Not strictly necessary: The invariant that these two levels contain only seen nodes can be relaxed.) When the current level becomes empty, it gets swapped with the next level. The process ends when both levels are empty.

If we try to do the same for hypergraphs, we run into an efficiency problem. Previously, given xx, we had to find all arcs of the form (x,y)(x,y). That's just the adjacency list representation of digraphs. But now, given xx, we must find all arcs of the form (X,y)(X,y) such that xXxX and all vertices of XX are seen (or done). This is a considerably more difficult problem, solved by watching literals.

Consider an arc (X,y)(X,y). If all nodes in XX are done, then yy must be seen or done. To ensure this is the case, we need a quick way to find arcs whose all sources are done. The trick is to think of a data structure that enforces the negation: For each arc (X,y)(X,y) there is at least one vertex in XX that is not done. When we fail to maintain the invariant, it is because an arc has the required property.

How do we make sure that every arc has an undone vertex? Well, we designate one particular vertex, which is undone, as being watched. Then, we still use the adjacency list representation, as for digraph. For digraphs, the adjacency list maps a vertex xx to the list of arcs of the form (x,y)(x,y). For Horn hyperdigraphs, the adjacency list maps a vertex xx to the list of arcs (X,y)(X,y) such that xXxX is the designated watched vertex for this arc.

Finally, let's see what happens when we pick a seen vertex xx in the traversal algorithm. This time we begin by marking xx as done. This is like promising to have done the work before actually doing it. So, we'd better do it. We have to go through the adjacency list of xx, because xx cannot be a watched vertex anymore. For each of the arcs, we must pick a different undone vertex to watch. For each arc (X,y)(X,y) we simply scan all members of XX until we find a suitable replacement. If there isn't any, then all sources are done, so we must mark yy as seen if it was not seen.

That's it.

By the way, according to Blogger this is my 100000000th post.

No comments:

Post a Comment

Note: (1) You need to have third-party cookies enabled in order to comment on Blogger. (2) Better to copy your comment before hitting publish/preview. Blogger sometimes eats comments on the first try, but the second works. Crazy Blogger.