Polylog parameterizability
In this post I will try to describe what I believe Deolilakar means by "polylog parameterizability", a key concept in his work.
Let [math]\displaystyle{ \mu }[/math] be a probability distribution on the cube [math]\displaystyle{ \{0,1\}^n }[/math]. Then to describe this distribution fully, one needs [math]\displaystyle{ 2^n-1 }[/math] real parameters; one real number [math]\displaystyle{ \mu(\{x\}) }[/math] for each of the [math]\displaystyle{ 2^n }[/math] elements of the cube, minus one for the constraint [math]\displaystyle{ \sum_{x \in \{0,1\}^n} \mu(\{x\}) = 1 }[/math].
Roughly speaking, a probability distribution is "polylog parameterizable" if it requires much fewer than [math]\displaystyle{ 2^n }[/math] real numbers to describe, and more precisely it needs only [math]\displaystyle{ \exp( O( polylog(n) ) ) }[/math] many real numbers. More precisely still, we ask that the distribution function [math]\displaystyle{ \mu(x_1,\ldots,x_n): \{0,1\}^n \to [0,1] }[/math] admit a factorisation into potentials
- [math]\displaystyle{ \mu(x_1,\ldots,x_n) = \prod_{j=1}^n p_j( x_j; x_{pa(j)} ) }[/math]
where for each j, pa(j) is a set of O(polylog(n)) "parents" of j (as given by a directed acyclic graph), and [math]\displaystyle{ p_j(\cdot,x_{pa(j)}) }[/math] is a probability distribution for each assignment of the parent literals [math]\displaystyle{ x_{pa(j)} }[/math]. The point is that each factor [math]\displaystyle{ p_j }[/math] only takes [math]\displaystyle{ \exp(O(polylog(n)) }[/math] real numbers to describe, and so the entire distribution also only requires [math]\displaystyle{ \exp(O(polylog(n)) }[/math] real numbers to describe.
The purpose of the FO(LFP) machinery is (I think) ultimately to obtain a result of the following form:
- Claim. If a solution space in [math]\displaystyle{ \{0,1\}^n }[/math] can be "explored by a P algorithm" in a suitable sense, then the uniform distribution [math]\displaystyle{ \mu }[/math] of that space is the _projection_ of a polylog-parameterisable measure [math]\displaystyle{ \mu' }[/math] on a subset of a polynomially larger space [math]\displaystyle{ \{0,1\}^{n^{O(1)}} }[/math].
However, a key point that does not seem to be sufficiently addressed in the paper is whether the projection of a polylog-parameterisable measure remains polylog-parameterisable, and which I believe could in fact be a serious problem with the proof.
Let me illustrate the above Claim with a simple example, that of a straight line program. Let [math]\displaystyle{ N := n^{100} }[/math], and consider a boolean function [math]\displaystyle{ f: \{0,1\}^n \to \{0,1\} }[/math] obtained by the formula
- [math]\displaystyle{ f(x_1,\ldots,x_n) := x_N }[/math]
where [math]\displaystyle{ x_{n+1},\ldots,x_N }[/math] are obtained from previous literals by formulae of the form
- [math]\displaystyle{ x_i = f_i( x_{a_i}, x_{b_i} ) }[/math] (1)
for [math]\displaystyle{ 1 \leq a_i,b_i \lt i }[/math] and some logic gates [math]\displaystyle{ f_i : \{0,1\}^2 \to \{0,1\} }[/math]. Now let
- [math]\displaystyle{ S := \{ (x_1,\ldots,x_n, f(x_1,\ldots,x_n)): (x_1,\ldots,x_n) \in \{0,1\}^n \} \subset \{0,1\}^{n+1} }[/math]
be the graph of f, and let [math]\displaystyle{ \mu }[/math] be the uniform distribution on S, thus [math]\displaystyle{ \mu }[/math] assigns a probability [math]\displaystyle{ 2^{-n} }[/math] to each element of S:
- [math]\displaystyle{ \mu(x_1,\ldots,x_{n+1}) = 2^{-n} 1_{x_{n+1} = f(x_1,\ldots,x_n)}. }[/math] (2)
As f is fairly complicated (indeed, it is not obviously distinguishable from a random function), it does not seem that [math]\displaystyle{ \mu }[/math] itself is polylog parameterisable. However, [math]\displaystyle{ \mu }[/math] can be lifted to a probability measure [math]\displaystyle{ \mu' }[/math] on the larger space [math]\displaystyle{ \{0,1\}^N }[/math] which _is_ polylog parameterisable. Namely, let [math]\displaystyle{ S' \subset \{0,1\}^N }[/math] be the set of all tuples [math]\displaystyle{ (x_1,\ldots,x_N) \in \{0,1\}^N }[/math] for which (1) holds for all [math]\displaystyle{ i=n+1,\ldots,N }[/math]. Clearly, the map [math]\displaystyle{ (x_1,\ldots,x_N) \mapsto (x_1,\ldots,x_n,x_N) }[/math] from [math]\displaystyle{ \{0,1\}^N }[/math] to [math]\displaystyle{ \{0,1\}^{n+1} }[/math] maps S' to S in a bijective manner, and so the uniform measure [math]\displaystyle{ \mu' }[/math] on S' projects to the uniform measure [math]\displaystyle{ \mu }[/math] on S.
On the other hand, [math]\displaystyle{ \mu' }[/math] is polylog-parameterisable. Indeed, we have the explicit factorisation
- [math]\displaystyle{ \mu'(x_1,\ldots,x_N) := 2^{-n} \prod_{i=n+1}^N 1_{x_i = f_i(x_{a_i},x_{b_i})} }[/math].
Next, consider the problem of parameterising a certain distribution [math]\displaystyle{ \nu }[/math] to be defined shortly on the solution space
- [math]\displaystyle{ T := \{ (x_1,\ldots,x_n): F(x_1,\ldots,x_n) = 1 \} \subset \{0,1\}^n }[/math]
of some function [math]\displaystyle{ F: \{0,1\}^n \to \{0,1\} }[/math], which we assume to be non-empty. We assume that the partial satisfiability problem is solvable here, thus for any initial partial assignment [math]\displaystyle{ x_1,\ldots,x_i }[/math] of literals, one can quickly solve the problem of whether this partial assignment can be completed to a full solution in T. Indeed, let us suppose that there is a polynomial length straight line program [math]\displaystyle{ f_i: \{0,1\}^i \to \{0,1\} }[/math] such that [math]\displaystyle{ f_i(x_1,\ldots,x_i)=1 }[/math] if and only if there exists [math]\displaystyle{ x_{i+1},\ldots,x_n }[/math] for which [math]\displaystyle{ F(x_1,\ldots,x_n)=1 }[/math].
We now define [math]\displaystyle{ \nu }[/math] to be the probability distribution on T defined by selecting random variables [math]\displaystyle{ x_1,\ldots,x_n }[/math] consecutively in the following manner. Assume that [math]\displaystyle{ x_1,\ldots,x_{i-1} }[/math] have already been selected and are extendible (i.e. [math]\displaystyle{ f_{i-1}(x_1,\ldots,x_{i-1})=1 }[/math]). We then compute [math]\displaystyle{ f_i(x_1,\ldots,x_{i-1},0) }[/math] and [math]\displaystyle{ f_i(x_1,\ldots,x_{i-1},1) }[/math]; at least one of these must equal 1. If only the former equals 1, we set [math]\displaystyle{ x_i:=0 }[/math]; if the latter, we set [math]\displaystyle{ x_i:=1 }[/math]; and otherwise, we set [math]\displaystyle{ x_i }[/math] equal to either 0 or 1 with a 1/2 probability of each.
One can then factor [math]\displaystyle{ \nu }[/math] as
- [math]\displaystyle{ \nu(x_1,\ldots,x_n) = \frac{1}{|T|} \prod_{i=1}^n p_i(x_i; f_i( x_1,\ldots,x_{i-1}, 0), f_i( x_1,\ldots,x_{i-1}, 1)) }[/math]
where [math]\displaystyle{ p_i(\cdot; a_0, a_1) }[/math] is the delta distribution [math]\displaystyle{ \delta_0 }[/math] if [math]\displaystyle{ a_0=1, a_1 = 0 }[/math], the delta distribution [math]\displaystyle{ \delta_1 }[/math] if [math]\displaystyle{ a_0=0,a_1=1 }[/math], the uniform distribution [math]\displaystyle{ \frac{1}{2}\delta_0+\frac{1}{2}\delta_1 }[/math] if [math]\displaystyle{ a_0=a_1=1 }[/math], and some arbitrary distribution (e.g. [math]\displaystyle{ \delta_0 }[/math] for concreteness) in the (unattainable) case [math]\displaystyle{ a_0=a_1=1 }[/math].
This is not yet of a polylog parameterisable form (each i has i-1 ancestors rather than polylog ancestors).
However, if one expands all the straight line programs [math]\displaystyle{ f_i }[/math] and adds all the literals required for computing such programs, then we can lift [math]\displaystyle{ \nu }[/math] to a polylog parameterisable measure [math]\displaystyle{ \nu' }[/math] similarly to before. However [math]\displaystyle{ \nu }[/math] itself is not obviously a polylog parameterisable measure, merely the projection of one that involves additional "hidden variables".