Equal-slices distribution for DHJ(k)


For [math]\displaystyle{ k \in {\mathbb N} }[/math], define [math]\displaystyle{ \nu_k }[/math] to be the uniform distribution on the [math]\displaystyle{ k }[/math]-simplex [math]\displaystyle{ \{(p_1, \dots, p_k) : p_i \geq 0\ \forall i, \sum p_i = 1\} }[/math]. It will be useful to think of [math]\displaystyle{ \nu_k }[/math] as first drawing [math]\displaystyle{ k }[/math] i.i.d. [math]\displaystyle{ \mathrm{Exponential}(1) }[/math] random variables [math]\displaystyle{ Y_1, \dots, Y_k }[/math], and then setting [math]\displaystyle{ p_j = Y_j/(\sum Y_i) }[/math].
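
For concreteness, here is a minimal Python sketch of this sampling procedure (the function name is ours, not part of the article):

```python
import random

def sample_nu(k):
    """One draw from nu_k: normalize k i.i.d. Exponential(1) variables."""
    ys = [random.expovariate(1.0) for _ in range(k)]
    s = sum(ys)
    return [y / s for y in ys]
```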

Some comments:

1. The reason [math]\displaystyle{ \nu_k }[/math] can be thought of this way is that the density function of [math]\displaystyle{ (Y_1, \dots, Y_k) }[/math] is [math]\displaystyle{ \exp(-y_1)\cdots\exp(-y_k) = \exp(-(y_1 + \cdots + y_k)) }[/math], which depends only on [math]\displaystyle{ s = y_1 + \cdots + y_k }[/math]; i.e., it is constant on each simplex [math]\displaystyle{ y_1 + \cdots + y_k = s }[/math].

2. One can also think of the [math]\displaystyle{ Y_j }[/math]'s as interarrival times in a Poisson process. By a well-known fact about the location of events in a Poisson process, it follows that [math]\displaystyle{ \nu_k }[/math] is also equivalent to picking [math]\displaystyle{ k-1 }[/math] independent uniformly random points in [math]\displaystyle{ [0,1] }[/math] and then letting [math]\displaystyle{ p_j }[/math] be the length of the [math]\displaystyle{ j }[/math]th of the [math]\displaystyle{ k }[/math] segments formed.

3. If we draw [math]\displaystyle{ (p_1, \dots, p_k) }[/math] from [math]\displaystyle{ \nu_k }[/math], and then draw a string in [math]\displaystyle{ [k]^n }[/math] according to the product distribution defined by [math]\displaystyle{ (p_1, \dots, p_k) }[/math], then we get an "equal-slices" draw from [math]\displaystyle{ [k]^n }[/math]. (You can take this as the definition of "equal-slices" if you like.)

4. Given that this is how we will use [math]\displaystyle{ \nu_k }[/math] (comment 3), it's not really essential to divide all of the [math]\displaystyle{ Y_i }[/math]'s by their sum. We can be more relaxed and just say that [math]\displaystyle{ \nu_k }[/math] determines [math]\displaystyle{ (Y_1, \dots, Y_k) }[/math], and then we make a draw from [math]\displaystyle{ [k]^n }[/math] according to the product distribution in which [math]\displaystyle{ j \in [k] }[/math] is chosen with probability proportional to [math]\displaystyle{ Y_j }[/math].

5. Summarizing, we can draw from equal-slices on [math]\displaystyle{ [k]^n }[/math] as follows: pick [math]\displaystyle{ (Y_1, \dots, Y_k) }[/math] as i.i.d. [math]\displaystyle{ \mathrm{Exponential}(1) }[/math]'s; then draw from the product distribution "proportional to [math]\displaystyle{ (Y_1, \dots, Y_k) }[/math]". (A short code sketch of this procedure appears after this list.)
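
Here is a minimal Python sketch of the equal-slices draw from comment 5; the function name and the encoding of [math]\displaystyle{ [k] }[/math] as [math]\displaystyle{ \{1, \dots, k\} }[/math] are our choices:

```python
import random

def equal_slices_draw(k, n):
    """Equal-slices draw from [k]^n, with [k] encoded as {1, ..., k}:
    pick i.i.d. Exponential(1) weights, then draw each coordinate from
    the product distribution proportional to those weights."""
    ys = [random.expovariate(1.0) for _ in range(k)]
    # random.choices draws n symbols independently with the given weights
    return random.choices(range(1, k + 1), weights=ys, k=n)
```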


The point of this article is the following:


Goal: To define a distribution on combinatorial lines [math]\displaystyle{ (x^1, \dots, x^k, z) \in [k+1]^n }[/math] in such a way that: (i) [math]\displaystyle{ z }[/math] is distributed according to equal-slices on [math]\displaystyle{ [k+1]^n }[/math]; (ii) each [math]\displaystyle{ x^j }[/math] is distributed according to equal-slices on [math]\displaystyle{ [k]^n }[/math]. Please note that [math]\displaystyle{ x^j }[/math] will not necessarily be the point in the line where the wildcards are [math]\displaystyle{ j }[/math]'s.


Let us record an easy-to-see fact:

Proposition 1 Suppose we first draw a string in [math]\displaystyle{ [k+1]^n }[/math] from the product distribution [math]\displaystyle{ \mu }[/math] proportional to [math]\displaystyle{ (y_1, \dots, y_{k+1}) }[/math], and then we change all the [math]\displaystyle{ (k+1) }[/math]'s in the string to [math]\displaystyle{ j }[/math]'s, where [math]\displaystyle{ j \in [k] }[/math]. Then the resulting string is distributed according to the product distribution [math]\displaystyle{ \mu_j }[/math] on [math]\displaystyle{ [k]^n }[/math] proportional to [math]\displaystyle{ (y_1, \dots, y_{j-1}, y_j + y_{k+1}, y_{j+1}, \dots, y_k) }[/math].
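
The merging operation in Proposition 1 is simple enough to state in code; this sketch (our own illustration) records the claim in its docstring:

```python
def merge_last_symbol(x, k, j):
    """Proposition 1's operation: change every (k+1)-symbol in x to j.
    If x was drawn from the product distribution on [k+1]^n proportional
    to (y_1, ..., y_{k+1}), the output is drawn from the product
    distribution on [k]^n proportional to
    (y_1, ..., y_{j-1}, y_j + y_{k+1}, y_{j+1}, ..., y_k)."""
    return [j if c == k + 1 else c for c in x]
```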


The following proposition achieves the essence of the goal:

Proposition 2 Suppose [math]\displaystyle{ \lambda }[/math] is the distribution on the [math]\displaystyle{ k }[/math]-simplex defined as follows. First we draw i.i.d. Exponentials [math]\displaystyle{ (Y_1, \dots, Y_{k+1}) }[/math] as in [math]\displaystyle{ \nu_{k+1} }[/math]. Then we set [math]\displaystyle{ W^{(j)} = (Y_1, \dots, Y_{j-1}, Y_j + Y_{k+1}, Y_{j+1}, \dots, Y_k) }[/math] (as in Proposition 1). Next, we choose [math]\displaystyle{ J \in [k] }[/math] uniformly at random and define [math]\displaystyle{ \vec{W} = W^{(J)} }[/math]. Finally, we scale [math]\displaystyle{ \vec{W} }[/math] so as to get a point [math]\displaystyle{ (p_1, \dots, p_k) }[/math] on the [math]\displaystyle{ k }[/math]-simplex.
Then [math]\displaystyle{ \lambda }[/math] is identical to [math]\displaystyle{ \nu_k }[/math] (even though the components of [math]\displaystyle{ \vec{W} }[/math] are not i.i.d. Exponentials).

Proof: It suffices to check that the density function [math]\displaystyle{ f(w_1, \dots, w_k) }[/math] of the vector [math]\displaystyle{ \vec{W} }[/math] depends only on [math]\displaystyle{ w_1 + \cdots + w_k }[/math]. Now [math]\displaystyle{ \vec{W} }[/math] has a mixture distribution: a uniform mixture of the [math]\displaystyle{ k }[/math] distributions [math]\displaystyle{ W^{(j)} }[/math]. The density of [math]\displaystyle{ W^{(j)} }[/math] at [math]\displaystyle{ (w_1, \dots, w_k) }[/math] is

[math]\displaystyle{ \displaystyle \exp(-w_1)\cdots\exp(-w_{j-1}) \cdot \left(w_j \exp(-w_j)\right) \cdot \exp(-w_{j+1}) \cdots \exp(-w_k) \qquad }[/math]

[math]\displaystyle{ \displaystyle \qquad = w_j \exp(-(w_1 + \cdots + w_k)). }[/math]

This is because the density of a sum of two independent [math]\displaystyle{ \mathrm{Exponential}(1) }[/math]'s is [math]\displaystyle{ t\exp(-t) }[/math]. Hence the density of [math]\displaystyle{ \vec{W} }[/math] at [math]\displaystyle{ (w_1, \dots, w_k) }[/math] is

[math]\displaystyle{ \displaystyle \frac{w_1 + \cdots + w_k}{k} \exp(-(w_1 + \cdots + w_k)), }[/math]

which indeed depends only on [math]\displaystyle{ w_1 + \cdots + w_k }[/math]. [math]\displaystyle{ \Box }[/math]
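
Proposition 2 can also be sanity-checked numerically. Here is a minimal Monte Carlo sketch (our own, with arbitrary parameter choices) comparing the mean of [math]\displaystyle{ \max_i p_i }[/math] under [math]\displaystyle{ \lambda }[/math] and under [math]\displaystyle{ \nu_k }[/math]:

```python
import random

def sample_lambda(k):
    """One draw from lambda as in Proposition 2 (coordinates 0-indexed)."""
    ys = [random.expovariate(1.0) for _ in range(k + 1)]
    j = random.randrange(k)   # J uniform over the first k coordinates
    w = ys[:k]
    w[j] += ys[k]             # fold Y_{k+1} into coordinate J
    s = sum(w)
    return [wi / s for wi in w]

def sample_nu(k):
    """One draw from nu_k directly."""
    ys = [random.expovariate(1.0) for _ in range(k)]
    s = sum(ys)
    return [y / s for y in ys]

# The two estimates should agree up to Monte Carlo error.
k, trials = 3, 200_000
print(sum(max(sample_lambda(k)) for _ in range(trials)) / trials)
print(sum(max(sample_nu(k)) for _ in range(trials)) / trials)
```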


We can now achieve the goal by combining the previous two propositions:

Achieving the Goal: Draw [math]\displaystyle{ z \in [k+1]^n }[/math] according to equal-slices on [math]\displaystyle{ [k+1]^n }[/math]. Next, let [math]\displaystyle{ v^j \in [k]^n }[/math] be the string formed by changing all the [math]\displaystyle{ (k+1) }[/math]'s in [math]\displaystyle{ z }[/math] to [math]\displaystyle{ j }[/math]'s. Finally, pick a uniformly random permutation [math]\displaystyle{ \pi }[/math] of [math]\displaystyle{ [k] }[/math] and define [math]\displaystyle{ x^{i} = v^{\pi(i)} }[/math].

Proof: We can think of [math]\displaystyle{ z }[/math] as being drawn from the product distribution proportional to [math]\displaystyle{ (Y_1, \dots, Y_{k+1}) }[/math], where [math]\displaystyle{ (Y_1, \dots, Y_{k+1}) }[/math] is drawn as in [math]\displaystyle{ \nu_{k+1} }[/math]. The strings [math]\displaystyle{ (v^1, \dots, v^k, z) }[/math] form a combinatorial line ("in order"), so clearly [math]\displaystyle{ (x^1, \dots, x^k, z) }[/math] form a combinatorial line (possibly "out of order"). By Proposition 1, [math]\displaystyle{ v^j }[/math] is distributed according to the product distribution proportional to [math]\displaystyle{ W^{(j)} }[/math]. Since [math]\displaystyle{ \pi(i) }[/math] is uniform on [math]\displaystyle{ [k] }[/math], the string [math]\displaystyle{ x^i }[/math] is distributed according to a product distribution whose probability vector is drawn from [math]\displaystyle{ \lambda }[/math], which equals [math]\displaystyle{ \nu_k }[/math] by Proposition 2. Hence [math]\displaystyle{ x^i }[/math] is equal-slices-distributed on [math]\displaystyle{ [k]^n }[/math]. [math]\displaystyle{ \Box }[/math]
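
Putting the pieces together, here is a minimal Python sketch of the whole construction (function name and the encoding of symbols as [math]\displaystyle{ 1, \dots, k+1 }[/math] are ours):

```python
import random

def random_combinatorial_line(k, n):
    """Draw (x^1, ..., x^k, z) as in the construction above: z is
    equal-slices on [k+1]^n and each x^i is equal-slices on [k]^n.
    Symbols are encoded as 1, ..., k+1."""
    ys = [random.expovariate(1.0) for _ in range(k + 1)]
    z = random.choices(range(1, k + 2), weights=ys, k=n)
    # v^j: change every (k+1)-symbol of z to j
    vs = [[j if c == k + 1 else c for c in z] for j in range(1, k + 1)]
    pi = random.sample(range(k), k)   # uniformly random permutation of [k]
    xs = [vs[pi[i]] for i in range(k)]
    # Note: if z happens to contain no (k+1)-symbols, the "line" is
    # degenerate (all k+1 strings coincide).
    return xs, z
```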