Theory and Practice: Weak but Functional Pigeons

A bird's eye view of [Razborov, Resolution Lower Bounds for the Weak Functional Pigeonhole Principle, 2003].

Resolution refutations are interesting, among other reasons, because they correspond to how SAT solvers work. A (general) resolution refutation $P$ is a sequence $C_1,\ldots,C_L$ of clauses such that (a) each clause is either an axiom or the resolution of two previous clauses, and (b) the last clause is empty. As usual, a clause is a set of literals, which are possibly negated variables. The length $L(P)$ of the refutation $P$ is its number of clauses. For practical SAT solving algorithms, it is true that if the solver says in time $T$ that the axioms are UNSAT then there exists a resolution refutation $P$ of length $L(P) \le O(T)$. Conversely, if there exists a resolution refutation $P$ of length $L(P)$, then you may get lucky and obtain an UNSAT answer from your solver in time $T\le O(L(P))$.

The resolution rule is the following: $$ \frac{A\cup\{l\}\qquad \{\bar{l}\}\cup B}{A \cup B} $$ We erase a literal $l$ and its negation $\bar{l}$, and keep the other literals.

The first lower bound on $L(P)$ was given by Haken, for resolution refutations $P$ that express the pigeonhole principle: for $m+1$ pigeons and $m$ holes, it is impossible that each pigeon goes in at least a hole and each hole gets at most one pigeon. The proof is allegedly complicated. (I should probably read it.) Some simplification of this and other lower bounds came with [Ben-Sasson, Wigderson, Short proofs are narrow, JACM2001], which shows that $$ w(P') \le O(\sqrt{n \lg L(P)}) \tag{BW} $$ where $w(P')$ is the maximum size of a clause in $P'$ (the width) and $n$ is the number of variables. The idea of the proof is as follows — you give me some refutation $P$ and I'll give you back a refutation $P'$ of width $O(\sqrt{n \lg L(P)})$ by iterating the following process:

find the literal $l$ that occurs most often in clauses longer than the axioms
reorganize the proof to get rid of the said occurrences:
1. derive $\bar{l}$
2. by resolving with $\bar{l}$, erase $l$ from axioms
3. replay old proof but with $l$ removed
(these three chunks are concatenated to give a new, thinner refutation)

From $w \le O(\sqrt{n \lg L})$ and $w \ge \Omega(f(n))$, it follows that $L \ge \Omega\bigl(\exp(\alpha \frac{f(n)^2}{n})\bigr)$ for some $\alpha \gt 0$. So, lower bounds on width $w \gt \omega(\sqrt{n})$ imply exponential lower bounds on length $L \ge \Omega\bigl( \exp(\alpha n^\beta)\bigr)$ for some $\alpha,\beta \gt 0$. To finish off the lower bound for pigeons, Ben-Sasson and Wigderson show that indeed $w \ge \Omega(n)$ using some expander stuff that I'm skipping here. On the other hand, if $w \le O(\sqrt{n})$, then the result $w \le O(\sqrt{n \lg L})$ is of no help in deriving lower bounds on $L$.

Now let's get to the subject of this post, the weak and functional pigeonhole principle. Weak means that instead of having $m+1$ pigeons we have $p \gt m$ pigeons, where $m$ is the number of holes. (One may think that if there are lots and lots of pigeons, then you might be able to notice quicker that you can't find homes for all. So, potentially, the weak principle might have shorter proofs.) Functional means that you put each pigeon not in at least one hole but in exactly one hole. (Again, the extra no-pigeon-cloning axioms might make it possible to find shorter refutations.) It turns out that in this case there are indeed refutations of width $O(\sqrt{n})$, so the (BW) lemma doesn't help. That's where Razborov comes in.

The general plan is quite similar, except we replace the width by a pseudowidth:

upper bound the pseudowidth in terms of the length $L$, and
lower bound the pseudowidth

From 1 and 2, we get a lower bound for the length $L$.

So, what's the pseudowidth? Roughly, it is the number of pigeons that are given many options. To say that more precisely, we need some notation. Setting variable $x_{ij}$ means that pigeon $i$ goes in hole $j$. For the functional principle, it turns out that we can use only positive literals: essentially, we systematically replace $\bar{x}_{ij}$ by $(x_{i1}\lor\ldots\lor x_{i(j-1)})\lor(x_{i(j+1)}\lor\ldots\lor x_{im})$. Then, we can count how many options is pigeon $i$ given by clause $C$: $$ d_i(C) \mathrel{:=} |C \cap \{\,x_{ij} : j\in[m]\,\}| $$ We say that a pigeon has many options if this number is over some fixed threshold which depends on the pigeon. Let $\mathbf{d}=(d_1,\ldots,d_p)$ be a vector with such thresholds, and let $w_{\mathbf{d}}(C)$ be the number of pigeons that have many options in clause $C$, according to thresholds $\mathbf{d}$. The pseudowidth of a refutation $P$ is then $w_{\mathbf{d}}(P) \mathrel{:=} \max_{C\in P} w_{\mathbf{d}}(C)$. Razborov shows that one can always find a thresholds $\mathbf{d}$ such that $$ w_{\mathbf{d}}(P') \le O(w_0 + \lg L(P)) $$ More precisely, if we give Razborov a refutation $P$, he can reply with another refutation $P'$ and some thresholds $\mathbf{d}$ such that the above holds. The catch is that he might introduce a few extra axioms. In fact, $P'$ is obtained from $P$ by replacing some clauses with $(w_0,\mathbf{d})$-axioms, so $L(P')=L(P)$. An $(w_0,\mathbf{d})$-axiom mentions exactly $w_0$ pigeons, and each pigeon $i$ that is mentioned is given exactly $d_i$ options. Step 2 of the proof, which lower-bounds the pseudowidth, needs to somehow deal with these extra axioms.

The main idea of the second part of the proof is to map clauses to certain vector spaces, and look at how the dimension of those spaces grows. Let $V(C)$ be the vector space associated to clause $C$. As long as the pseudowidth is below some threshold $O\bigl(\frac{m}{\lg p}\bigr)$, it turns out that new dimensions are seldom added. More precisely, if a resolution step produces $C$ out of $C_0$ and $C_1$, then $V(C) \;\subseteq\; V(C_0) + V(C_1)$. So, $$\dim V(C) \le \sum_{C'}\dim V(C')$$ where $C'$ ranges over the axioms used in deriving $C$. For the original pigeonhole axioms, the dimension is 0. For the extra axioms introduced in the step 1 of the proof, the dimension is some small number. For the empty clause, the dimension is some big number. So, either we reach the big number because we have many extra axioms, which means the refutation is long, or the pseudowidth is $\ge\Omega\bigl(\frac{m}{\lg p}\bigr)$. In the latter case, we have the lower-bound on pseudowidth that we wanted to obtain in step 2.

From $\frac{m}{\lg p} \le w_0 + \lg L$ it looks like $L\ge \exp \Omega(\frac{m}{\lg p})$. The lower bound is in fact only slightly worse: $$ L \ge \exp \Omega \biggl( \frac{m}{(\lg p)^2}\biggr) $$ The reason is the first case, in which many extra axioms are introduced. To see why, you'd have to look in the paper to see what exactly did I meant by ‘small number’ and ‘big number’ in the previous paragraph.

Nitpick. I stop here because the post is already long. Go read the paper: it has some great ideas! (Although, it does have quite a few small/silly mistakes as well. Somewhat unusually for me, I did not find them confusing.)

Edit 20180430: Fixed the dimension claim, by making the sum going over axioms. (Initially, I said that some dimension inequality holds for each resolution step, which is true but too weak for what follows.)

Theory and Practice

30 April 2018

Weak but Functional Pigeons

No comments:

Post a Comment