A Computational Perspective on Sampling (survey)

Abstract

We consider the problem of estimating the average of a huge set of values. That is, given oracle access to an arbitrary function $f:\{0,1\}^n\to[0,1]$, we wish to estimate $2^{-n} \sum_{x\in\{0,1\}^n} f(x)$ up to an additive error of $\epsilon$. We are allowed to employ a randomized algorithm that may err with probability at most $\delta$.

We survey known algorithms for this problem and focus on the ideas underlying their construction. In particular, we present an algorithm that makes $O(\epsilon^{-2} \cdot \log(1/\delta))$ queries and uses $n+O(\log(1/\epsilon))+O(\log(1/\delta))$ coin tosses, both complexities being very close to the corresponding lower bounds.
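The query bound stated above is achieved, up to constant factors, by the Median-of-Averages Sampler discussed in the survey: partition the queries into $O(\log(1/\delta))$ groups of $O(\epsilon^{-2})$ samples each, and output the median of the group averages. The following is a minimal Python sketch using fully independent samples; it matches the query complexity but not the stated coin-toss bound, which requires derandomization techniques (e.g., pairwise-independent sampling within groups and expander walks across groups). The function name and the constant in the group count are illustrative choices, not fixed by the survey.

```python
import math
import random
import statistics

def median_of_averages(f, n, eps, delta):
    """Estimate 2^{-n} * sum_x f(x) within +-eps, failing w.p. <= delta.

    f maps an n-bit integer to a value in [0, 1].
    Sketch with fully independent samples (not coin-efficient).
    """
    # Chebyshev: an average of m >= 1/eps^2 samples deviates by more
    # than eps with probability at most 1/4 (since Var[f(x)] <= 1/4).
    m = max(1, math.ceil(1.0 / eps**2))
    # Chernoff: the median of t = O(log(1/delta)) such averages
    # deviates by more than eps with probability at most delta.
    t = max(1, math.ceil(4 * math.log(1.0 / delta)))
    averages = []
    for _ in range(t):
        total = sum(f(random.getrandbits(n)) for _ in range(m))
        averages.append(total / m)
    return statistics.median(averages)
```

The total number of queries is $t \cdot m = O(\epsilon^{-2} \cdot \log(1/\delta))$; the point of the median step is that each group average need only succeed with constant probability, so the expensive $\epsilon^{-2}$ factor is not multiplied into the amplification.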

Preface

The idea of writing this survey occurred to me when I found out that a brilliant, young researcher who has worked in very related areas was unaware of the Median-of-Averages Sampler. It then occurred to me that many of the results surveyed here have appeared in papers devoted to other subjects (indeed, the Median-of-Averages Sampler is an excellent example), and have thus escaped the attention of a wider community, which might have cared to know about them. I thus decided to write a survey that focuses on these very basics.

Material available on-line

• First version posted: May 1997.
• Revisions: July 2010. This revision includes a quantitative improvement in Theorem 6.1, which is obtained by the subsequent work of [RVW]. Also included (as a new Section 7) is an alternative perspective advocated by Ronen Shaltiel and Amnon Ta-Shma.
