The complexity of distributions

by Emanuele Viola

Oded's comments

This work is a good opportunity to mention my work with Shafi Goldwasser and Asaf Nussbaum On the Implementation of Huge Random Objects; see Section 2.5 of our work, which explicitly discusses the special case of objects of feasible size. While (in Sec. 2.5) we considered efficient implementions of such objects (i.e., distributions on strings of feasible size), the current work refers to super-efficient implementations (i.e., by polynomial-size constant-depth circuits). Its focus is actually on lower bounds.

From that perspective, I find Thm 1.6 most interesting. It asserts that certain simple distributions cannot be approximate (well) by small decision forests. In contrast, Thm 1.3 refers to simultaneously achieving small locality and relatively low randomness complexity, which is used for deriving lower bounds on data structures.

The paper also present an interesting pseudorandom generator (PRG), which is obtained by combining Thms 8.1 and 8.3. Specifically, for any constant $d$ there exists a constant $c$ such that $m$-bit long sequences generated from a seed of length $k=(\log m)^{c\log\log m}$ (i.e., $k=\exp(c (\log\log m)^2)$) by uniform $m^c$-size circuits of depth 2 fool any depth $d$ circuit of size $m$. That is, the PRG takes a seed of length $k$, and stretches it to a sequence of length $\ell(k)=\exp(2^{\sqrt{(\log k)/c}})$ such that it fools any depth $d$ circuit of size $\ell(k)$ (w.r.t gap $1/\ell(k)$), while each bit of the output is produced by an explicit DNF of size $\ell(k)^c$. Note that Nisan's generator uses a seed of polylogarithmic length (i.e., $k=\poly(\log m)$), but the PRG itself is computed by a circuit of depth $O(d)$.

The original abstract

Complexity theory typically studies the complexity of computing a function $h(x) : {0,1}^m \to {0,1}^n$ of a given input $x$. We advocate the study of the complexity of generating the distribution $h(x)$ for uniform $x$, given random bits. Our main results are:

  1. Any function $f : {0,1}^\ell \to {0,1}^n$ such that (i) each output bit $f_i$ depends on $o(\log n)$ input bits, and (ii) $\ell \le \log_2 \binom{n}{\alpha n} + n^{0.99}$, has output distribution $f(U)$ at statistical distance $\ge 1 - 1/n^{0.49}$ from the uniform distribution over $n$-bit strings of hamming weight $\alpha n$. We also prove lower bounds for generating $(X,b(X))$ for boolean $b$, and in the case in which each bit $f_i$ is a small-depth decision tree. These lower bounds seem to be the first of their kind; the proofs use anti-concentration results for the sum of random variables.
  2. Lower bounds for generating distributions imply succinct data structures lower bounds. As a corollary of (1), we obtain the first lower bound for the membership problem of representing a set $S \subseteq [n]$ of size $\alpha n$, in the case where $1/\alpha$ is a power of $2$: If queries ``$i \in S$?'' are answered by non-adaptively probing $o(\log n)$ bits, then the representation uses $\ge \log_2 \binom{n}{\alpha n} + \Omega(\log n)$ bits.
  3. Upper bounds complementing the bounds in (1) for various settings of parameters.
  4. Uniform randomized AC0 circuits of $\poly(n)$ size and depth $d = O(1)$ with error $\e$ can be simulated by uniform randomized AC0 circuits of $\poly(n)$ size and depth $d+1$ with error $\e + o(1)$ using $\le (\log n)^{O( \log \log n)}$ random bits. Previous derandomizations [Ajtai and Wigderson '85; Nisan '91] increase the depth by a constant factor, or else have poor seed length.


Back to list of Oded's choices.