Polylog parameterizability


In this post I will try to describe what I believe Deolalikar means by "polylog parameterizability", a key concept in his work. (A more accurate term here would be "polylog recursive factorizability".)

Definition

Let [math]\displaystyle{ \mu }[/math] be a probability distribution on the cube [math]\displaystyle{ \{0,1\}^n }[/math]. Then to describe this distribution fully, one needs [math]\displaystyle{ 2^n-1 }[/math] real parameters; one real number [math]\displaystyle{ \mu(\{x\}) }[/math] for each of the [math]\displaystyle{ 2^n }[/math] elements of the cube, minus one for the constraint [math]\displaystyle{ \sum_{x \in \{0,1\}^n} \mu(\{x\}) = 1 }[/math].

Roughly speaking, a probability distribution is "polylog parameterizable" if it requires far fewer than [math]\displaystyle{ 2^n }[/math] real numbers to describe, and more precisely only [math]\displaystyle{ \exp( O( polylog(n) ) ) }[/math] many real numbers. More precisely still, we ask that the distribution function [math]\displaystyle{ \mu(x_1,\ldots,x_n): \{0,1\}^n \to [0,1] }[/math] admit a factorisation into potentials

[math]\displaystyle{ \mu(x_1,\ldots,x_n) = \prod_{j=1}^n p_j( x_j; x_{pa(j)} ) }[/math]

where for each j, pa(j) is a set of O(polylog(n)) "parents" of j (as given by a directed acyclic graph), and [math]\displaystyle{ p_j(\cdot; x_{pa(j)}) }[/math] is a probability distribution for each assignment of the parent literals [math]\displaystyle{ x_{pa(j)} }[/math]. The point is that each factor [math]\displaystyle{ p_j }[/math] takes only [math]\displaystyle{ \exp(O(polylog(n))) }[/math] real numbers to describe, and so the entire distribution also requires only [math]\displaystyle{ \exp(O(polylog(n))) }[/math] real numbers to describe.
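
To make the parameter count concrete, here is a minimal Python sketch (all names here are illustrative, not from the paper) of such a DAG-factorised distribution: each factor is stored as a table with one Bernoulli parameter per assignment of its parents, so the whole description costs [math]\displaystyle{ \sum_j 2^{|pa(j)|} }[/math] real numbers.

<pre>
from itertools import product

def mu(x, parents, tables):
    """Evaluate mu(x) = prod_j p_j(x_j; x_{pa(j)}).

    parents[j] is the tuple of parent indices of j (taken from a DAG),
    and tables[j] maps each parent assignment to Pr[x_j = 1], so factor
    j is stored with 2^{|pa(j)|} numbers.
    """
    prob = 1.0
    for j, bit in enumerate(x):
        pa = tuple(x[i] for i in parents[j])
        p1 = tables[j][pa]
        prob *= p1 if bit == 1 else 1.0 - p1
    return prob

# Toy example with n = 3: x_0 is a fair coin, x_1 depends on x_0,
# and x_2 depends on x_0 and x_1.
parents = [(), (0,), (0, 1)]
tables = [{(): 0.5},
          {(0,): 0.9, (1,): 0.2},
          {pa: 0.5 for pa in product((0, 1), repeat=2)}]
total = sum(mu(x, parents, tables) for x in product((0, 1), repeat=3))
assert abs(total - 1.0) < 1e-9  # the factors define a genuine distribution
</pre>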

The purpose of the FO(LFP) machinery is (I think) ultimately to obtain a result of the following form:

Claim. If a solution space in [math]\displaystyle{ \{0,1\}^n }[/math] can be "explored by a P algorithm" in a suitable sense, then the uniform distribution [math]\displaystyle{ \mu }[/math] of that space is the _projection_ of a polylog-parameterisable measure [math]\displaystyle{ \mu' }[/math] on a subset of a polynomially larger space [math]\displaystyle{ \{0,1\}^{n^{O(1)}} }[/math].


The following can be easily shown (see proof):

A probability distribution on [math]\displaystyle{ \{0,1\}^n }[/math] that can be generated by a polynomial time run of a probabilistic Turing machine is a projection of a polylog parameterizable distribution.

However, a key point that does not seem to be sufficiently addressed in the paper is whether the projection of a polylog-parameterisable measure remains polylog-parameterisable; I believe this could in fact be a serious problem with the proof. Indeed, this is not the case in general, as shown by the counterexamples below.

Examples

Let me illustrate the above Claim with a simple example, that of a straight line program. Let [math]\displaystyle{ N := n^{100} }[/math], and consider a boolean function [math]\displaystyle{ f: \{0,1\}^n \to \{0,1\} }[/math] obtained by the formula

[math]\displaystyle{ f(x_1,\ldots,x_n) := x_N }[/math]

where [math]\displaystyle{ x_{n+1},\ldots,x_N }[/math] are obtained from previous literals by formulae of the form

[math]\displaystyle{ x_i = f_i( x_{a_i}, x_{b_i} ) }[/math] (1)

for [math]\displaystyle{ 1 \leq a_i,b_i \lt i }[/math] and some logic gates [math]\displaystyle{ f_i : \{0,1\}^2 \to \{0,1\} }[/math]. Now let

[math]\displaystyle{ S := \{ (x_1,\ldots,x_n, f(x_1,\ldots,x_n)): (x_1,\ldots,x_n) \in \{0,1\}^n \} \subset \{0,1\}^{n+1} }[/math]

be the graph of f, and let [math]\displaystyle{ \mu }[/math] be the uniform distribution on S, thus [math]\displaystyle{ \mu }[/math] assigns a probability [math]\displaystyle{ 2^{-n} }[/math] to each element of S:

[math]\displaystyle{ \mu(x_1,\ldots,x_{n+1}) = 2^{-n} 1_{x_{n+1} = f(x_1,\ldots,x_n)}. }[/math] (2)

As f is fairly complicated (indeed, it is not obviously distinguishable from a random function), it does not seem that [math]\displaystyle{ \mu }[/math] itself is polylog parameterisable. However, [math]\displaystyle{ \mu }[/math] can be lifted to a probability measure [math]\displaystyle{ \mu' }[/math] on the larger space [math]\displaystyle{ \{0,1\}^N }[/math] which _is_ polylog parameterisable. Namely, let [math]\displaystyle{ S' \subset \{0,1\}^N }[/math] be the set of all tuples [math]\displaystyle{ (x_1,\ldots,x_N) \in \{0,1\}^N }[/math] for which (1) holds for all [math]\displaystyle{ i=n+1,\ldots,N }[/math]. Clearly, the map [math]\displaystyle{ (x_1,\ldots,x_N) \mapsto (x_1,\ldots,x_n,x_N) }[/math] from [math]\displaystyle{ \{0,1\}^N }[/math] to [math]\displaystyle{ \{0,1\}^{n+1} }[/math] maps S' to S in a bijective manner, and so the uniform measure [math]\displaystyle{ \mu' }[/math] on S' projects to the uniform measure [math]\displaystyle{ \mu }[/math] on S.

On the other hand, [math]\displaystyle{ \mu' }[/math] is polylog-parameterisable. Indeed, we have the explicit factorisation

[math]\displaystyle{ \mu'(x_1,\ldots,x_N) := 2^{-n} \prod_{i=n+1}^N 1_{x_i = f_i(x_{a_i},x_{b_i})} }[/math].
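
As a quick sanity check, the following Python sketch (a toy instance: N is kept small here rather than [math]\displaystyle{ n^{100} }[/math], and all names are illustrative) enumerates the lifted solution set S' for a random straight line program and verifies that the projection [math]\displaystyle{ (x_1,\ldots,x_N) \mapsto (x_1,\ldots,x_n,x_N) }[/math] is injective on S', so that the uniform measure on S' projects to [math]\displaystyle{ \mu }[/math].

<pre>
import random
from itertools import product

random.seed(0)
n, N = 3, 8
AND = lambda a, b: a & b
OR  = lambda a, b: a | b
XOR = lambda a, b: a ^ b
# Gate i computes x_i = f_i(x_{a_i}, x_{b_i}) with a_i, b_i < i
# (0-indexed, so gates occupy positions n, ..., N-1).
gates = [(random.randrange(i), random.randrange(i),
          random.choice([AND, OR, XOR])) for i in range(n, N)]

# S' = all tuples in {0,1}^N satisfying every gate constraint (1).
S_prime = [x for x in product((0, 1), repeat=N)
           if all(x[i] == g(x[a], x[b])
                  for i, (a, b, g) in enumerate(gates, start=n))]
assert len(S_prime) == 2 ** n  # exactly one lifted point per input

# Project onto (x_1, ..., x_n, x_N) and check injectivity on S'.
S = {x[:n] + (x[-1],) for x in S_prime}
assert len(S) == 2 ** n
</pre>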


Next, consider the problem of parameterising a certain distribution [math]\displaystyle{ \nu }[/math] (to be defined shortly) on the solution space

[math]\displaystyle{ T := \{ (x_1,\ldots,x_n): F(x_1,\ldots,x_n) = 1 \} \subset \{0,1\}^n }[/math]

of some function [math]\displaystyle{ F: \{0,1\}^n \to \{0,1\} }[/math], which we assume to be non-empty. We assume that the partial satisfiability problem is efficiently solvable here; thus, for any initial partial assignment [math]\displaystyle{ x_1,\ldots,x_i }[/math] of literals, one can quickly decide whether this partial assignment can be completed to a full solution in T. Indeed, let us suppose that there is a polynomial length straight line program [math]\displaystyle{ f_i: \{0,1\}^i \to \{0,1\} }[/math] such that [math]\displaystyle{ f_i(x_1,\ldots,x_i)=1 }[/math] if and only if there exist [math]\displaystyle{ x_{i+1},\ldots,x_n }[/math] for which [math]\displaystyle{ F(x_1,\ldots,x_n)=1 }[/math].

We now define [math]\displaystyle{ \nu }[/math] to be the probability distribution on T defined by selecting random variables [math]\displaystyle{ x_1,\ldots,x_n }[/math] consecutively in the following manner. Assume that [math]\displaystyle{ x_1,\ldots,x_{i-1} }[/math] have already been selected and are extendible (i.e. [math]\displaystyle{ f_{i-1}(x_1,\ldots,x_{i-1})=1 }[/math]). We then compute [math]\displaystyle{ f_i(x_1,\ldots,x_{i-1},0) }[/math] and [math]\displaystyle{ f_i(x_1,\ldots,x_{i-1},1) }[/math]; at least one of these must equal 1. If only the former equals 1, we set [math]\displaystyle{ x_i:=0 }[/math]; if only the latter does, we set [math]\displaystyle{ x_i:=1 }[/math]; and if both equal 1, we set [math]\displaystyle{ x_i }[/math] equal to 0 or 1 with probability 1/2 each.
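
In code, this sampling procedure might look as follows (a sketch: extendible is a hypothetical stand-in for the oracles [math]\displaystyle{ f_i }[/math], returning True exactly when the given prefix can be completed to a point of T):

<pre>
import random
from itertools import product

def sample_nu(n, extendible):
    """Draw one sample from nu, choosing x_1, ..., x_n in order."""
    x = []
    for _ in range(n):
        can0, can1 = extendible(x + [0]), extendible(x + [1])
        if can0 and can1:
            x.append(random.randint(0, 1))  # both extend: fair coin
        else:
            x.append(0 if can0 else 1)      # forced (one of the two must extend)
    return tuple(x)

# Toy usage with T = EVEN, the even-weight strings of length 4;
# in this case nu is the uniform distribution on T.
T = {x for x in product((0, 1), repeat=4) if sum(x) % 2 == 0}
extendible = lambda p: any(s[:len(p)] == tuple(p) for s in T)
print(sample_nu(4, extendible))
</pre>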

One can then factor [math]\displaystyle{ \nu }[/math] as

[math]\displaystyle{ \nu(x_1,\ldots,x_n) = \frac{1}{|T|} \prod_{i=1}^n p_i(x_i; f_i( x_1,\ldots,x_{i-1}, 0), f_i( x_1,\ldots,x_{i-1}, 1)) }[/math]

where [math]\displaystyle{ p_i(\cdot; a_0, a_1) }[/math] is the delta distribution [math]\displaystyle{ \delta_0 }[/math] if [math]\displaystyle{ a_0=1, a_1 = 0 }[/math], the delta distribution [math]\displaystyle{ \delta_1 }[/math] if [math]\displaystyle{ a_0=0,a_1=1 }[/math], the uniform distribution [math]\displaystyle{ \frac{1}{2}\delta_0+\frac{1}{2}\delta_1 }[/math] if [math]\displaystyle{ a_0=a_1=1 }[/math], and some arbitrary distribution (e.g. [math]\displaystyle{ \delta_0 }[/math] for concreteness) in the (unattainable) case [math]\displaystyle{ a_0=a_1=0 }[/math].

This is not yet of a polylog parameterizable form (each variable [math]\displaystyle{ x_i }[/math] has i-1 parents rather than polylog many).

However, if one expands all the straight line programs [math]\displaystyle{ f_i }[/math] and adds all the literals required to compute those programs, then we can lift [math]\displaystyle{ \nu }[/math] to a polylog parameterizable measure [math]\displaystyle{ \nu' }[/math] much as before. Note, however, that [math]\displaystyle{ \nu }[/math] itself is not obviously a polylog parameterizable measure, merely the projection of one that involves additional "hidden variables".
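
Schematically, if each program [math]\displaystyle{ f_i }[/math] is unrolled into auxiliary gate variables y obeying constraints of the form (1) (a sketch; the indexing here is illustrative), the lifted measure takes the shape

[math]\displaystyle{ \nu'(x,y) = \frac{1}{|T|} \prod_{\text{gates } g} 1_{y_g = f_g(y_{a_g}, y_{b_g})} \prod_{i=1}^n p_i(x_i; y_{g_i^0}, y_{g_i^1}) }[/math]

where the inputs [math]\displaystyle{ y_{a_g}, y_{b_g} }[/math] of each gate are two earlier variables (literals or gates), and [math]\displaystyle{ y_{g_i^0}, y_{g_i^1} }[/math] are the output gates computing [math]\displaystyle{ f_i(x_1,\ldots,x_{i-1},0) }[/math] and [math]\displaystyle{ f_i(x_1,\ldots,x_{i-1},1) }[/math]; every factor now has O(1) parents.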

Another remark: since NP is in EXP, I think that the uniform distribution on the solution space of a k-SAT problem, if that space is non-empty, is the projection of a degree O(1) recursively factorizable measure on exponentially many variables. This already illustrates that the projection operation is powerful enough to convert even the simplest measures into extremely general ones, at least if one is allowed to discard an exponential number of variables.

Counterexamples

The following measures are not polylog parameterisable, though they are projections of polylog parameterisable measures in a higher dimensional space:

  1. uniform (or any positive) measure on [math]\displaystyle{ EVEN:=\{x \in \{0,1\}^n: x_1+\ldots+x_n \equiv 0 \pmod 2\} }[/math]
  2. uniform (or any positive) measure on [math]\displaystyle{ \{0,1\}^n \backslash \{(0,\ldots,0)\} }[/math].

To see that these measures are not polylog parameterisable, we use

Claim: Let [math]\displaystyle{ S \subset \{0,1\}^n }[/math], let [math]\displaystyle{ \chi_S }[/math] be the characteristic function of S, and let [math]\displaystyle{ \mu }[/math] be a probability measure whose support is exactly S (i.e. [math]\displaystyle{ \mu(x) \gt 0 }[/math] if and only if x is in S). Assume that [math]\displaystyle{ \mu }[/math] is polylog parameterizable. Then there is some permutation (reordering) of [n] such that the Boolean function [math]\displaystyle{ \chi_S(x_1,\ldots,x_n) }[/math] can be expressed/computed as follows:
[math]\displaystyle{ \chi_S(x_1,\ldots,x_n) = \prod_{i=1}^n g_i(x_i; x_{pa(i)}) }[/math], where each [math]\displaystyle{ g_i }[/math] is a Boolean function of [math]\displaystyle{ x_i }[/math] and of polylog-many [math]\displaystyle{ x_j }[/math]'s with j < i.

We can see the claim by focusing on whether the measure is zero or nonzero in the definition of polylog parameterizability: one may simply take [math]\displaystyle{ g_i(x_i; x_{pa(i)}) := 1_{p_i(x_i; x_{pa(i)}) \gt 0} }[/math]. Now consider #1. With respect to any ordering (permutation) of [n], membership in EVEN cannot be expressed by such a product as in the claim, because the last factor [math]\displaystyle{ g_n(x_n; x_{pa(n)}) }[/math] depends only on a *fixed* set of polylog many [math]\displaystyle{ x_j }[/math]'s. For the same reason, #2 is not polylog parameterizable. (To express #1 or #2 in such a form, the last factor [math]\displaystyle{ g_n(x_n; x_{pa(n)}) }[/math] would have to depend on all n variables.)

On the other hand, EVEN can be polylog parametrized if we add a number of auxiliary variables, placeholders for partial sums, so that we never add more than polylog many elements at a time (in fact a fixed number suffices), and similarly for #2.
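
For instance (a sketch, with auxiliary variables [math]\displaystyle{ s_i := x_1 + \ldots + x_i \bmod 2 }[/math] holding the partial sums), one such lift for #1 is

[math]\displaystyle{ \mu'(x_1,\ldots,x_n,s_1,\ldots,s_{n-1}) = 2^{-(n-1)}\, 1_{s_1 = x_1} \prod_{i=2}^{n-1} 1_{s_i = s_{i-1} + x_i \bmod 2} \cdot 1_{x_n = s_{n-1}} }[/math]

in which every factor involves at most two earlier variables, and whose projection onto [math]\displaystyle{ (x_1,\ldots,x_n) }[/math] is the uniform measure on EVEN.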