Polylog parameterizability

From Polymath Wiki
Revision as of 10:00, 14 August 2010 by Teorth (talk | contribs)

In this post I will try to describe what I believe Deolalikar means by "polylog parameterizability", a key concept in his work.

Let [math]\displaystyle{ \mu }[/math] be a probability distribution on the cube [math]\displaystyle{ \{0,1\}^n }[/math]. Then to describe this distribution fully, one needs [math]\displaystyle{ 2^n-1 }[/math] real parameters; one real number [math]\displaystyle{ \mu(\{x\}) }[/math] for each of the [math]\displaystyle{ 2^n }[/math] elements of the cube, minus one for the constraint [math]\displaystyle{ \sum_{x \in \{0,1\}^n} \mu(\{x\}) = 1 }[/math].

Roughly speaking, a probability distribution is "polylog parameterizable" if it requires far fewer than [math]\displaystyle{ 2^n }[/math] real numbers to describe; more precisely, it needs only [math]\displaystyle{ \exp( O( polylog(n) ) ) }[/math] many real numbers. More precisely still, we ask that the distribution function [math]\displaystyle{ \mu(x_1,\ldots,x_n): \{0,1\}^n \to [0,1] }[/math] admit a factorisation into potentials

[math]\displaystyle{ \mu(x_1,\ldots,x_n) = \prod_{j=1}^J p_j( x_{C_j} ) }[/math]

where [math]\displaystyle{ J = \exp(O(polylog(n))) }[/math], the [math]\displaystyle{ C_j }[/math] are subsets of [math]\displaystyle{ \{1,\ldots,n\} }[/math] of size O(polylog(n)), the [math]\displaystyle{ x_{C_j} := (x_i)_{i \in C_j} \in \{0,1\}^{C_j} }[/math] are the literals in [math]\displaystyle{ C_j }[/math], and [math]\displaystyle{ p_j: \{0,1\}^{C_j} \to {\mathbf R} }[/math] is a function. The point is that each factor [math]\displaystyle{ p_j }[/math] only takes [math]\displaystyle{ \exp(O(polylog(n))) }[/math] real numbers to describe, and so the entire distribution also only requires [math]\displaystyle{ \exp(O(polylog(n))) }[/math] real numbers to describe. We also ask that the [math]\displaystyle{ C_j }[/math] are organised along a directed acyclic graph in some technical sense which will not be detailed here.
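To make the parameter-counting concrete, here is a toy sketch (not from the paper; all numbers and potentials are hypothetical) of a distribution on the cube factored into potentials over small coordinate sets [math]\displaystyle{ C_j }[/math] — here a simple chain of pairs [math]\displaystyle{ \{j,j+1\} }[/math], with the normalising constant absorbed into the description:

```python
import itertools
import math

# Toy illustration: a distribution on {0,1}^n written as a (normalised)
# product of potentials p_j(x_{C_j}) over small sets C_j -- here the
# pairs {j, j+1} of a chain.  The specific potential is arbitrary.
n = 8
cliques = [(j, j + 1) for j in range(n - 1)]

def potential(bits):
    # arbitrary positive potential favouring agreement of neighbours
    return 0.8 if bits[0] == bits[1] else 0.2

cube = list(itertools.product((0, 1), repeat=n))

def weight(x):
    return math.prod(potential((x[a], x[b])) for (a, b) in cliques)

Z = sum(weight(x) for x in cube)   # normalising constant

def mu(x):
    return weight(x) / Z           # the factored distribution

# Parameter counts: a full probability table versus the factored form.
full_params = 2 ** n - 1                        # 255 for n = 8
factored_params = sum(2 ** 2 for _ in cliques)  # 4 per pair potential: 28
```

Even at this toy size the factored description (28 numbers) is much smaller than the full table (255 numbers), and the gap widens exponentially in n as long as each [math]\displaystyle{ C_j }[/math] stays small.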

The purpose of the FO(LFP) machinery is (I think) ultimately to obtain a result of the following form:

Claim. If a solution space in [math]\displaystyle{ \{0,1\}^n }[/math] can be "explored by a P algorithm" in a suitable sense, then the uniform distribution [math]\displaystyle{ \mu }[/math] of that space is the _projection_ of a polylog-parameterisable measure [math]\displaystyle{ \mu' }[/math] on a subset of a polynomially larger space [math]\displaystyle{ \{0,1\}^{n^{O(1)}} }[/math].

However, a key point that does not seem to be sufficiently addressed in the paper is whether the projection of a polylog-parameterisable measure remains polylog-parameterisable; I believe this could in fact be a serious problem with the proof.

Let me illustrate the above Claim with a simple example, that of a straight line program. Let [math]\displaystyle{ N := n^{100} }[/math], and consider a boolean function [math]\displaystyle{ f: \{0,1\}^n \to \{0,1\} }[/math] obtained by the formula

[math]\displaystyle{ f(x_1,\ldots,x_n) := x_N }[/math]

where [math]\displaystyle{ x_{n+1},\ldots,x_N }[/math] are obtained from previous literals by formulae of the form

[math]\displaystyle{ x_i = f_i( x_{a_i}, x_{b_i} ) }[/math] (1)

for [math]\displaystyle{ 1 \leq a_i,b_i \lt i }[/math] and some logic gates [math]\displaystyle{ f_i : \{0,1\}^2 \to \{0,1\} }[/math]. Now let
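A minimal sketch of such a straight line program, with toy sizes (n = 4 and N = 12 rather than [math]\displaystyle{ N = n^{100} }[/math]) and randomly chosen gates — the specific gates are hypothetical, only the shape of the computation matters:

```python
import operator
import random

# Toy straight line program: each new literal x_i (i > n) is a binary
# gate applied to two earlier literals, and f returns the final literal.
n, N = 4, 12
random.seed(0)

gates = []  # the gate (a_i, b_i, f_i) producing literal i, 0-indexed
for i in range(n, N):
    a, b = random.randrange(i), random.randrange(i)
    op = random.choice([operator.and_, operator.or_, operator.xor])
    gates.append((a, b, op))

def f(xs):
    """Run the program on inputs x_1,...,x_n and return x_N."""
    vals = list(xs)
    for a, b, op in gates:
        vals.append(op(vals[a], vals[b]))
    return vals[-1]
```

The list of intermediate values `vals` is exactly the tuple [math]\displaystyle{ (x_1,\ldots,x_N) }[/math] that the lifting below records as extra coordinates.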

[math]\displaystyle{ S := \{ (x_1,\ldots,x_n, f(x_1,\ldots,x_n)): (x_1,\ldots,x_n) \in \{0,1\}^n \} \subset \{0,1\}^{n+1} }[/math]

be the graph of f, and let [math]\displaystyle{ \mu }[/math] be the uniform distribution on S, thus [math]\displaystyle{ \mu }[/math] assigns a probability [math]\displaystyle{ 2^{-n} }[/math] to each element of S:

[math]\displaystyle{ \mu(x_1,\ldots,x_{n+1}) = 2^{-n} 1_{x_{n+1} = f(x_1,\ldots,x_n)}. }[/math] (2)

As f is fairly complicated (indeed, it is not obviously distinguishable from a random function), it does not seem that [math]\displaystyle{ \mu }[/math] itself is polylog parameterisable. However, [math]\displaystyle{ \mu }[/math] can be lifted to a probability measure [math]\displaystyle{ \mu' }[/math] on the larger space [math]\displaystyle{ \{0,1\}^N }[/math] which _is_ polylog parameterisable. Namely, let [math]\displaystyle{ S' \subset \{0,1\}^N }[/math] be the set of all tuples [math]\displaystyle{ (x_1,\ldots,x_N) \in \{0,1\}^N }[/math] for which (1) holds for all [math]\displaystyle{ i=n+1,\ldots,N }[/math]. Clearly, the map [math]\displaystyle{ (x_1,\ldots,x_N) \mapsto (x_1,\ldots,x_n,x_N) }[/math] from [math]\displaystyle{ \{0,1\}^N }[/math] to [math]\displaystyle{ \{0,1\}^{n+1} }[/math] maps S' to S in a bijective manner, and so the uniform measure [math]\displaystyle{ \mu' }[/math] on S' projects to the uniform measure [math]\displaystyle{ \mu }[/math] on S.

On the other hand, [math]\displaystyle{ \mu' }[/math] is polylog-parameterisable. Indeed, we have the explicit factorisation

[math]\displaystyle{ \mu'(x_1,\ldots,x_N) := 2^{-n} \prod_{i=n+1}^N 1_{x_i = f_i(x_{a_i},x_{b_i})} }[/math].
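One can sanity-check this lifting by brute force at toy sizes (here n = 3, N = 6, with three hypothetical gates): defining [math]\displaystyle{ \mu' }[/math] by the local factorisation and projecting via [math]\displaystyle{ (x_1,\ldots,x_N) \mapsto (x_1,\ldots,x_n,x_N) }[/math] recovers the uniform measure on the graph of f.

```python
import itertools
import operator
from collections import defaultdict

# Toy check of the lifting.  Gate i computes x_{n+1+i} from two earlier
# literals (0-indexed positions); the particular gates are hypothetical.
n, N = 3, 6
gates = [(0, 1, operator.and_), (1, 2, operator.xor), (3, 4, operator.or_)]

def mu_prime(x):
    # 2^{-n} times the product of the local consistency indicators
    ok = all(x[n + i] == op(x[a], x[b]) for i, (a, b, op) in enumerate(gates))
    return 2 ** -n if ok else 0.0

# Project mu' down to (x_1,...,x_n, x_N).
projection = defaultdict(float)
for x in itertools.product((0, 1), repeat=N):
    projection[x[:n] + (x[-1],)] += mu_prime(x)

def f(xs):
    vals = list(xs)
    for a, b, op in gates:
        vals.append(op(vals[a], vals[b]))
    return vals[-1]
```

Since the gates are deterministic, each input prefix has exactly one consistent trace in S', so the projection assigns mass [math]\displaystyle{ 2^{-n} }[/math] to each point of the graph S and zero elsewhere.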


Next, consider the problem of parameterising the uniform distribution [math]\displaystyle{ \nu }[/math] on the solution space

[math]\displaystyle{ T := \{ (x_1,\ldots,x_n): F(x_1,\ldots,x_n) = 1 \} \subset \{0,1\}^n }[/math]

of some function [math]\displaystyle{ F: \{0,1\}^n \to \{0,1\} }[/math], which we assume to be non-empty. We assume that the partial satisfiability problem is solvable here, thus for any initial partial assignment [math]\displaystyle{ x_1,\ldots,x_i }[/math] of literals, one can quickly solve the problem of whether this partial assignment can be completed to a full solution in T. Indeed, let us suppose that there is a polynomial length straight line program [math]\displaystyle{ f_i: \{0,1\}^i \to \{0,1\} }[/math] such that [math]\displaystyle{ f_i(x_1,\ldots,x_i)=1 }[/math] if and only if there exists [math]\displaystyle{ x_{i+1},\ldots,x_n }[/math] for which [math]\displaystyle{ F(x_1,\ldots,x_n)=1 }[/math].

One can then factor [math]\displaystyle{ \nu }[/math] as

[math]\displaystyle{ \nu(x_1,\ldots,x_n) = \frac{1}{|T|} \prod_{i=1}^n (1_{x_i=0} f_i( x_1,\ldots,x_{i-1}, 0) + 1_{x_i=1} f_i( x_1,\ldots,x_{i-1}, 1) ) }[/math]

which has a directed acyclic structure (the i^th factor is such that the distribution of the i^th literal [math]\displaystyle{ x_i }[/math] is determined purely by its ancestors [math]\displaystyle{ x_1,\ldots,x_{i-1} }[/math]) but is not yet of a polylog parameterisable form (the i^th literal has i-1 ancestors rather than polylog many). However, if one expands all the straight line programs [math]\displaystyle{ f_i }[/math] and adds all the literals required for computing such programs, then we can lift [math]\displaystyle{ \nu }[/math] to a polylog parameterisable measure [math]\displaystyle{ \nu' }[/math] similarly to before. But [math]\displaystyle{ \nu }[/math] itself is not obviously a polylog parameterisable measure, merely the projection of one that involves additional "hidden variables".
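The factorisation of [math]\displaystyle{ \nu }[/math] can be checked at toy sizes. In the sketch below the function F (even parity, n = 4) is a hypothetical choice, and the extendability oracles [math]\displaystyle{ f_i }[/math] are brute-forced, whereas the text posits polynomial length straight line programs for them:

```python
import itertools

# Toy check of the factorisation of nu over the solution space T of F.
n = 4
def F(x):
    return int(sum(x) % 2 == 0)   # hypothetical instance: even-parity strings

cube = list(itertools.product((0, 1), repeat=n))
T = [x for x in cube if F(x)]

def f_ext(prefix):
    """1 iff the partial assignment `prefix` completes to a solution in T.
    Brute force here; the text assumes a fast straight line program."""
    k = len(prefix)
    return int(any(F(prefix + rest)
                   for rest in itertools.product((0, 1), repeat=n - k)))

def nu(x):
    prod = 1
    for i in range(n):
        # i-th factor: 1_{x_i=0} f_i(...,0) + 1_{x_i=1} f_i(...,1)
        prod *= ((x[i] == 0) * f_ext(x[:i] + (0,))
                 + (x[i] == 1) * f_ext(x[:i] + (1,)))
    return prod / len(T)
```

For [math]\displaystyle{ x \in T }[/math] every prefix of x is extendable, so the product is 1 and [math]\displaystyle{ \nu(x) = 1/|T| }[/math]; for [math]\displaystyle{ x \notin T }[/math] the final factor vanishes, since the full assignment itself fails F.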