The issue is explicitly constructing an averaging sampler that uses an optimal number of random bits and an optimal number of samples. That is, we wish to optimize two seemingly conflicting parameters. While a probabilistic argument demonstrates the existence of such samplers, the point is obtaining an explicit one. This was known when dropping the averaging requirement (i.e., that the sampler outputs the average of its sample points), and explicit averaging samplers that optimize one of the two parameters were also known.
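To make the tension concrete, here is a minimal Python sketch of the trivial averaging sampler that draws $t$ independent points. It meets the accuracy goal by a Chernoff bound, but spends $t \cdot m$ random bits, whereas the target is $m + O(\log(1/\delta))$. All names below are illustrative, not taken from the paper.

```python
import random

def naive_averaging_sampler(f, m, t, rng=random.Random(0)):
    """Trivial averaging sampler: t i.i.d. uniform points in {0,1}^m.

    Accuracy: by a Chernoff bound, the output is eps-close to E[f]
    with probability >= 1 - delta once t = O(log(1/delta) / eps^2).
    Randomness cost: t * m bits -- far above the target m + O(log(1/delta)).
    """
    samples = [rng.getrandbits(m) for _ in range(t)]  # uses t*m random bits
    return sum(f(z) for z in samples) / t             # the plain average

# Example: estimate the density of m-bit strings with low bit 1 (true mean 1/2).
est = naive_averaging_sampler(lambda z: z & 1, m=32, t=10_000)
```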
The new construction uses the composition paradigm, in a manner analogous to the first proof of the PCP theorem. That is, a sampler with (almost) optimal randomness complexity (and non-optimal but not terrible sample complexity) is used as the outer component, and a new sampler with almost optimal sample complexity (and non-optimal but not terrible randomness complexity) is used as the inner component.
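One standard way such a composition can be realized (a sketch of the generic subsampling composition from the samplers literature, not necessarily the paper's exact mechanism) is to let the outer sampler produce a long sample sequence using few random bits, and then use the inner sampler over the index set to extract a short subsequence. The functions `outer_sampler` and `inner_sampler` below are placeholders.

```python
def compose_samplers(outer_sampler, inner_sampler, r_out, r_in):
    """Generic sampler composition by subsampling (a sketch, assuming:
    - outer_sampler(seed) -> list of points in {0,1}^m, using r_out seed bits;
    - inner_sampler(seed, n) -> list of indices in range(n), using r_in seed bits).
    """
    def sampler(seed):
        s_out = seed >> r_in             # high bits seed the outer sampler
        s_in = seed & ((1 << r_in) - 1)  # low bits seed the inner sampler
        long_seq = outer_sampler(s_out)           # near-optimal randomness
        idx = inner_sampler(s_in, len(long_seq))  # near-optimal sample count
        return [long_seq[i] for i in idx]
    return sampler
```

The randomness costs add up to $r_{\mathrm{out}} + r_{\mathrm{in}}$, while the output length is the inner sampler's sample complexity, which is what allows each component to be optimized for one of the two parameters.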
We present the first efficient averaging sampler that achieves asymptotically optimal randomness complexity and near-optimal sample complexity for natural parameter choices. Specifically, for any constant $\alpha > 0$ and any $\delta > 2^{-\mathrm{poly}(1 / \varepsilon)}$, it uses $m + O(\log (1 / \delta))$ random bits to output $t = O(\log(1 / \delta) / \varepsilon^{2 + \alpha})$ samples $Z_1, \dots, Z_t \in \{0, 1\}^m$ such that for any function $f: \{0, 1\}^m \to [0, 1]$,
\[\Pr\left[\left|\frac{1}{t}\sum_{i=1}^tf(Z_i) - \mathbb{E} \, f\right|
\le \varepsilon \right] \ge 1 - \delta. \]
The sample complexity is optimal up to a factor of $O(1 / \varepsilon^{\alpha})$.
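The displayed guarantee can be read operationally: over the sampler's coin tosses, the empirical average of $f$ deviates from $\mathbb{E}\,f$ by more than $\varepsilon$ with probability at most $\delta$. A quick Monte Carlo check of this reading, run against the i.i.d. baseline rather than the paper's construction (parameters below are arbitrary), looks as follows.

```python
import random

def failure_rate(f, m, t, eps, true_mean, trials=2000, rng=random.Random(1)):
    """Estimate Pr[|avg - E f| > eps] over the sampler's own randomness."""
    fails = 0
    for _ in range(trials):
        avg = sum(f(rng.getrandbits(m)) for _ in range(t)) / t
        fails += abs(avg - true_mean) > eps
    return fails / trials

# f(z) = low bit of z, so E[f] = 1/2; with t on the order of
# log(1/delta)/eps^2 the observed failure rate should fall below delta.
print(failure_rate(lambda z: z & 1, m=32, t=5000, eps=0.05, true_mean=0.5))
```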
We use known connections with randomness extractors and list-decodable codes to derive applications to these objects.
See ECCC TR24-097.