Equal-slices.tex: Difference between revisions

From Polymath Wiki
Jump to navigationJump to search
New page: \section{Equal-slice distributions} \subsection{Equal-nondegenerate-slices} \begin{definition} Given a vector $C \in \N^k$ whose components sum to $n$, the associated \emph{slice} of $[...
 
No edit summary
Line 52: Line 52:
\end{proof}
\end{proof}


 
It will also be quite useful to observe that equal-slices is a \emph{mixture of product distributions}.
 
\begin{definition}  \label{def:spacings} We say that $\nu$ is a (randomly chosen) \emph{uniform spacing distribution} on $[k]$ if the vector of probabilities $(\nu[1], \dots, \nu[k])$ is formed as follows: $k$ points $r_1, \dots, r_k$ are chosen uniformly and independently on the circle of unit circumference, dividing it into $k$ arcs; then $\nu[i]$ is set to the length of the arc emanating clockwise from $r_i$.
 
XXXStopped revising hereXXX
 
 
 
 
 
 
 
 
\begin{definition}  \label{def:spacings} We say that $\nu$ is a (randomly chosen) \emph{uniform spacing distribution} on $[k]$ if the vector of probabilities $(\nu[1], \dots, \nu[k])$ is formed as follows: $k$ points $q_1, \dots, q_k$ are chosen uniformly and independently on the circle of unit circumference, dividing it into $k$ arcs; then $\nu[i]$ is set to the length of the arc emanating clockwise from $q_i$.
\end{definition}
\end{definition}
\begin{remark}  It is well known (see, e.g.~\cite[Ch.\ 1]{Rei89})\noteryan{I think the traditional citation here is Pyke '65, but we can save a spot in the biblio by citing Reiss again (which is also a text, whereas Pyke is a bit painful to read)} that a uniform spacing distribution $\nu$ can equivalently be defined as choosing the vector of probabilities $(\nu[1], \dots, \nu[k])$ uniformly from the simplex $\{(p_1, \dots, p_k) : \sum p_i = 1, p_i \geq 0\ \forall i\}$. \noteryan{Or, by taking the spacings of $k-1$ random points in a unit segment.  Since we don't use this fact, I think it's fair to not prove it.}  
\begin{remark}  It is well known (see, e.g.~\cite[Ch.\ 1]{Rei89})\noteryan{I think the traditional citation here is Pyke '65, but we can save a spot in the biblio by citing Reiss again (which is also a text, whereas Pyke is a bit painful to read)} that a uniform spacing distribution $\nu$ can equivalently be defined as choosing the vector of probabilities $(\nu[1], \dots, \nu[k])$ uniformly from the simplex $\{(p_1, \dots, p_k) : \sum p_i = 1, p_i \geq 0\ \forall i\}$. \noteryan{Or, by taking the spacings of $k-1$ random points in a unit segment.  Since we don't use this fact, I think it's fair to not prove it.}  
\end{remark}
\end{remark}
 
\begin{proposition} The equal-slices distribution $\eqs{k}^n$ can equivalently be defined as follows: First, draw a uniform spacing distribution $\nu$ on $[k]$; then, draw a string from the product distribution $\nu^{\otimes n}$ on $[k]^{n}$.   
The following distribution on strings is important for our proof.
 
\begin{definition} The \emph{equal-slices} distribution on $[k]^{n}$, denoted $\eqs{k}^{n}$, is defined as follows: First, draw a uniform spacing distribution $\nu$ on $[k]$; second, draw from the product distribution $\nu^{\otimes n}$ on $[k]^{n}$.  More generally we write $\eqs{k}^J$ for the analogous distribution on $[k]^{J}$, and we write simply $\eqs{k}$ when $n$ or $J$ is clear from context.
\end{definition}
 
\begin{remark} \label{rem:eqs} It can be useful to view the second part of the equal-slices draw of $x \sim [k]^n$ as follows: having chosen $q_1, \dots, q_k$ on the unit circle to give $\nu$, we draw $n$ further points $r_1, \dots, r_n$ uniformly and independently on the circle and set $x_j$ to be $i$ iff $r_j$ is in the arc emanating clockwise from $q_i$.\noteryan{Not worth it to point out that there are certain $0$-probability events involved here\dots}. 
\end{remark}
 
The terminology ``equal-slices'' comes from the following:\noteryan{We don't actually need the observations that follow --- could leave the proof as an exercise :)}
\begin{definition}  Given a vector $C \in \N^k$ whose components sum to $n$, the associated \emph{slice} of $[k]^{n}$ is the set
\[
\{x \in [k]^{n} : x \text{ has $C_1$ many $1$'s, $C_2$ many $2$'s, \dots, $C_k$ many $k$'s}\}.
\]
We abuse terminology by also referring to $C$ as the slice.  A slice is \emph{nondegenerate} when $C_i \neq 0$ for all $i$.  There are $\binom{n+k-1}{k-1}$ different slices, of which $\binom{n-1}{k-1}$ are nondegenerate.\noteryan{Justify with bars and dots?}
\end{definition}
\begin{proposition} \label{prop:eqs-equiv} The equal-slices distribution $\eqs{k}^{n}$ is equivalent to first choosing a slice uniformly from the $\binom{n+k-1}{k-1}$ many possibilities, then drawing a string uniformly from that slice. \noteryan{Include proof?  If so, it's by throwing $n+k-1$ points into the interval and looking at the result in two ways.}
\end{proposition}
 
The following lemma is crucial:
\begin{lemma}  Let $x \sim \eqs{k}^n$, and let $i \sim [k-1]$ be uniformly random.  Then $\chg{x}{k}{i}$ is distributed as~$\eqs{k-1}^n$.
\end{lemma}
\begin{proof}
From the definition, the string $\chg{x}{k}{a}$ is distributed as follows: first $\nu$ is taken to be a uniform spacing distribution on $[k]$; then $\wt{\nu}$ is taken to be the distribution on $[k-1]$ with $\wt{\nu}[j] = \nu[j]$ for $j \neq i$ and $\wt{\nu}[i] = \nu[i] + \nu[k]$; then a string is drawn from the product distribution $\wt{\nu}^{\otimes n}$.  Thus it suffices to show that $\wt{\nu}$ is a randomly chosen uniform spacing distribution on $[k-1]$ in the sense of Definition~\ref{def:spacings}.  This is easy to see: having chosen $k$ points on the circle, thus forming $\nu$, we get the correct distribution $\wt{\nu}$ simply by removing the $k$th point. \noteryan{This is not especially well-explained; will try to think how to say it with the maximum combination of concision and correctness.}
\end{proof}
 
Another important lemma first requires a definition:
\begin{definition} Let $x \in [m]^n$, $y \in [k]^m$.  We write $y \circ x$ for the string in $[k]^n$ given by replacing in $x$ the symbol $i$ by the symbol $y_i$, for all $i \in [m]$.\noteryan{This really is function composition, if one things as a string in $\Omega^J$ as a map $J \to \Omega$.}
\end{definition}
 
\begin{lemma} Let $x \sim \eqs{m}^n$ and let $y \sim \eqs{k}^m$.  Then $y \circ x \in [k]^n$ is distributed as $\eqs{k}^n$.
\end{lemma}
\begin{proof}
XXX Hmm --- I don't see how to prove this slickly without getting into a little more detail.  Perhaps we should get into a little more detail.  XXXACTUALLY IT'S FALSE!XXX
\end{proof}
 
 
 
We will also need the following simple bounds:
\begin{proposition}  \label{prop:rand-min} If $\nu$ is a uniform spacing distribution on $[k]$ as in Definition~\ref{def:spacings}, then
\[
\Pr[\min(\nu) \leq \gamma] \leq k(k-1) \gamma.
\]
\end{proposition}
\end{proposition}
\begin{proof}
\begin{proof}
By a union bound and symmetry, $\Pr[\min(\nu) \leq \gamma] \leq k \Pr[\nu[1] \leq \gamma]$.  The event $\nu[1] \leq \gamma$ occurs only if one of $q_2, \dots, q_k$ falls within the arc of length $\gamma$ clockwise of $q_1$.  By another union bound, this has probability at most $(k-1)\gamma$.
In the usual definition of a draw $x \sim \eqs{k}^n$, the string $x$ only depends on the joint ordering of the $n+k$ points $q_i$ and $r_j$ around the circleFurther, it's easy to see that this ordering is just a uniform (circular) permutation of $n+k$ objectsHence we can also obtain it by choosing all $n+k$ points independently from the \emph{continuous} uniform distribution on the circle of unit circumference.\noteryan{Not worth pointing out that we're disregarding some probability-$0$ events here.} Further, in this continuous distribution, we may choose the $r_j$'s first, and then the $q_i$'s next, one-by-one.  The choice of $r_j$'s gives a uniform spacing distribution $\nu$, and then the formation of $x$ by choice of $q_i$'s is indeed equivalent to drawing $x$ from the product distribution $\nu^{\otimes n}$.\noteryan{Not very eloquently put.}
\end{proof}
 
\begin{proposition} \label{prop:degen} Let $x \sim \eqs{k}^n$.  Then for each $i \in [k]$, the probability that $x$ contains no $i$'s is at most $(k-1)/n$.  And, the probability $x$ is in a degenerate slice is at most $k(k-1)/n$.
\end{proposition}
\begin{proof}  It suffices to prove the first statement, as the second follows immediately by a union boundIt also suffices to prove the first statement for $i = 1$, by symmetry.
 
Consider drawing $x \sim \eqs{k}^n$ as suggested in Remark~\ref{rem:eqs}.  Suppose we condition on the \emph{positions} of the $k+n$ points drawn on the circle, but not their identities as $q_1, \dots, q_k, r_1, \dots, r_n$.  By symmetry, we can choose one of these uniformly to be $q_1$.  The event that $x$ contains no $1$'s is then the event that the first point appearing clockwise after this $q_1$ is an $r$-point, rather than a $q$-point.  This occurs with probability precisely $(k-1)/(n+k-1) \leq (k-1)/n$.
\end{proof}
\end{proof}

Revision as of 05:59, 18 May 2009

\section{Equal-slice distributions}

\subsection{Equal-nondegenerate-slices}

\begin{definition} Given a vector $C \in \N^k$ whose components sum to $n$, the associated \emph{slice} of $[k]^{n}$ is the set \[ \{x \in [k]^{n} : x \text{ has $C_1$ many $1$'s, $C_2$ many $2$'s, \dots, $C_k$ many $k$'s}\}. \] We abuse terminology by also referring to $C$ as the slice. A slice is \emph{nondegenerate} when $C_i \neq 0$ for all $i$. There are $\binom{n+k-1}{k-1}$ different slices, of which $\binom{n-1}{k-1}$ are nondegenerate.\noteryan{Justify? Justify later?} \end{definition}


\begin{definition} Let $2 \leq k \leq n$. The \emph{equal-nondegenerate-slices} distribution on $[k]^n$, denoted $\ens{k}^n$, generates a string $x$ as follows: First, points $q_1, \dots, q_n$ are arranged on a circle in a uniformly random order. This also forms $n$ ``gaps between the $q_i$'s. Second, $k$ gaps are chosen, uniformly from the $\binom{n}{k}$ possibilities; points $r_1, \dots, r_k$ are put into the chosen gaps, in random order. \ignore{Second, $k$ points $r_1, \dots, r_k$ are placed one by one into the gaps, each going into a random unoccupied gap.} This divides the $q_i$'s into ``arcs, with the ``$r_j$ arc consists of those $q_i$'s whose nearest $r$-point counterclockwise is $r_j$. Finally, $x$ is defined by setting $x_i = j$ for all $q_i$ in the $r_j$ arc.\noteryan{Diagram would help.} \end{definition} \begin{remark} More generally, when $|J| \geq k$ we write $\eqs{k}^J$ for the analogous distribution on $[k]^{J}$, and we write simply $\eqs{k}$ when $n$ or $J$ is clear from context. \end{remark} \begin{remark} If $x \sim \ens{k}^n$ then $x$ always belongs to a nondegenerate slice. \end{remark} \begin{remark} It is not hard to see\noteryan{Honest, although we can clarify} that drawing $x \sim \ens{k}^n$ is equivalent to first choosing a nondegenerate slice $C$ uniformly from the $\binom{n-1}{k-1}$ possibilities, then choosing $x$ uniformly from this slice. \end{remark}

\begin{definition} Let $x \in [m]^n$, $y \in [k]^m$. We write $y \circ x$ for the string in $[k]^n$ given by replacing in $x$ the symbol $j$ by the symbol $y_j$, for all $j \in [m]$.\noteryan{This really is function composition, if one thinks of a string in $\Omega^J$ as a map $J \to \Omega$.} \end{definition}


The following two lemmas are crucial: \begin{lemma} \label{lem:last-pt} Let $x \sim \ens{k+1}^n$, and let $j \sim [k]$ be uniformly random. Then $\chg{x}{(k+1)}{j}$ is distributed as~$\ens{k}^n$. \end{lemma} \begin{proof} Suppose we are drawing $x \sim \ens{k+1}^n$ and have chosen the arrangement of $q_1, \dots, q_n$ and also the $k+1$ gaps to be filled with $r$-points. To fill these gaps, we may do the following. First, choose one of the $k+1$ gaps at random to contain $r_{k+1}$. Next, take the first gap counterclockwise from $r_{k+1}$ and let it contain $r_j$, where $j \sim [k]$ is uniformly chosen. Finally, fill the remaining $k-1$ gaps according to a uniform ordering of $[k] \setminus \{j\}$. But having chosen $x$ in this way, $\chg{x}{k+1}{j}$ is obtained simply by ``deleting $r_{k+1}$ and interpreting the result as a (correctly distributed) draw from $\ens{k}^n$.\noteryan{not put very eloquently} \end{proof}

\begin{lemma} Let $x \sim \ens{m}^n$ and let $y \sim \ens{k}^m$. Then $y \circ x \in [k]^n$ is distributed as $\ens{k}^n$. \end{lemma} \begin{proof} The effect of forming $y \circ x$ can be thought of as follows: First the points $q_1, \dots, q_n$ and $r_1, \dots, r_m$ are randomly arranged as usual to form $x$. Next, we think of the \emph{arcs of $q_i$'s} as being the $m$ objects circularly permuted in the initial stage of drawing $y$; we then choose $k$ out of the $m$ gaps between the arcs at random and place points $s_1, \dots, s_k$ into them at random order. Then $z = y \circ x$ is given by setting $z_i = j$ for all $q_i$ in the $s_j$ ``super-arc.

On the other hand, since $\ens{m}^n$ is symmetric under permutations of $[m]$, the middle stage in this composite process, permuting the $r_j$ arcs, is superfluous. Eliminating it, we see that the $s_j$'s are ultimately placed by first choosing $m$ out of $n$ gaps at random, and then further choosing $k$ out of these $m$ choices at random. This is equivalent to simply choosing $k$ out of the $n$ gaps at random; hence $z$ is indeed distributed as $\ens{k}^n$.\noteryan{Again, not put very eloquently} \end{proof}

\subsection{Equal-slices} We will also use the following slight variant on the equal-nondegenerate-slices distribution: \begin{definition} Let $k \geq 2$ and let $n$ be a positive integer. The \emph{equal-slices} distribution on $[k]^n$, denoted $\eqs{k}^n$, generates a string $x$ in a manner similar to that of equal-nondegenerate-slices. The only distinction is that the points $r_1, \dots, r_k$ are placed into random gaps one-by-one, and once a point $r_j$ is placed, it is considered to split the gap into two occupiable gaps. Thus when placing $r_j$, there are $n+j-1$ gaps to choose from. As a consequence, some of the resulting $r_j$ arcs may be empty of $q$-points. \end{definition} \begin{remark} It is again not hard to see\noteryan{Same note} that drawing $x \sim \eqs{k}^n$ is equivalent to first choosing a slice $C$ uniformly from the $\binom{n+k-1}{k-1}$ possibilities (nondegenerate and degenerate), then choosing $x$ uniformly from this slice. \end{remark}

The two distributions are very close: \begin{proposition} \label{prop:degen} For $2 \leq k \leq n$ we have $\dtv{\ens{k}^n}{\eqs{k}^n} \leq k(k-1)/n$. \end{proposition} \begin{proof} The distribution $\ens{k}^n$ is precisely $\eqs{k}^n$ conditioned on no $q_j$'s ending up adjacent. So it suffices to show this event occurs with probability at most $k(k-1)/n$. In a draw from $\eqs{k}^n$, the probability that $q_j$ is placed adjacent to one of $q_1, \dots, q_{j-1}$ is exactly $2(j-1)/(n+j-1) \leq 2(j-1)/n$. Hence the probability of having any adjacent $q_j$'s is at most $\sum_{j=1}^k 2(j-1)/n = k(k-1)/n$, as desired. \end{proof}

It will also be quite useful to observe that equal-slices is a \emph{mixture of product distributions}. \begin{definition} \label{def:spacings} We say that $\nu$ is a (randomly chosen) \emph{uniform spacing distribution} on $[k]$ if the vector of probabilities $(\nu[1], \dots, \nu[k])$ is formed as follows: $k$ points $r_1, \dots, r_k$ are chosen uniformly and independently on the circle of unit circumference, dividing it into $k$ arcs; then $\nu[i]$ is set to the length of the arc emanating clockwise from $r_i$. \end{definition} \begin{remark} It is well known (see, e.g.~\cite[Ch.\ 1]{Rei89})\noteryan{I think the traditional citation here is Pyke '65, but we can save a spot in the biblio by citing Reiss again (which is also a text, whereas Pyke is a bit painful to read)} that a uniform spacing distribution $\nu$ can equivalently be defined as choosing the vector of probabilities $(\nu[1], \dots, \nu[k])$ uniformly from the simplex $\{(p_1, \dots, p_k) : \sum p_i = 1, p_i \geq 0\ \forall i\}$. \noteryan{Or, by taking the spacings of $k-1$ random points in a unit segment. Since we don't use this fact, I think it's fair to not prove it.} \end{remark} \begin{proposition} The equal-slices distribution $\eqs{k}^n$ can equivalently be defined as follows: First, draw a uniform spacing distribution $\nu$ on $[k]$; then, draw a string from the product distribution $\nu^{\otimes n}$ on $[k]^{n}$. \end{proposition} \begin{proof} In the usual definition of a draw $x \sim \eqs{k}^n$, the string $x$ only depends on the joint ordering of the $n+k$ points $q_i$ and $r_j$ around the circle. Further, it's easy to see that this ordering is just a uniform (circular) permutation of $n+k$ objects. Hence we can also obtain it by choosing all $n+k$ points independently from the \emph{continuous} uniform distribution on the circle of unit circumference.\noteryan{Not worth pointing out that we're disregarding some probability-$0$ events here.} Further, in this continuous distribution, we may choose the $r_j$'s first, and then the $q_i$'s next, one-by-one. The choice of $r_j$'s gives a uniform spacing distribution $\nu$, and then the formation of $x$ by choice of $q_i$'s is indeed equivalent to drawing $x$ from the product distribution $\nu^{\otimes n}$.\noteryan{Not very eloquently put.} \end{proof}