Oded's choices (Nr. 377)

Interactive Proofs for General Distribution Properties

by Tal Herman and Guy Rothblum

Oded's comments

When introducing MAPs (MA Proofs of Proximity), Gur and (Ron) Rothblum pointed out that a generic system may consists of the prover sending a full description of the input string and the verifier testing equality between this message and the actual input (via a trivial tester that makes $O(1/\eps)$ queries). They concluded that the study of MAPs (and IPPs) should focus on systems that offer sublinear query and communication complexity. A similar observation applies to the study of IPPs for distributions, initiated by Chiesa and Gur, although in that case testing equality between a fixed distribution and an input distribution involves a non-trivial tester (which uses $O({sqrt n}/\eps^2)$ samples).

The current work presents such interactove proofs of proximity (IPPs) for a very general class of properties (of distributions), which typically extend beyond the label-invariant properties, which were considered in previous works of the same authors (see HERE and HERE). The class is defined in terms of the circuit complexity of deciding whether a (full) description of a distribution satisfies the property.

The original abstract

Suppose Alice has collected a small number of samples from an unknown distribution, and would like to learn about the distribution. Bob, an untrusted data analyst, claims that he ran a sophisticated data analysis on the distribution, and makes assertions about its properties. Can Alice efficiently verify Bob's claims using fewer resources (say in terms of samples and computation) than would be needed to run the analysis herself?

We construct an interactive proof system for any distribution property that can be decided by uniform polynomial-size circuits of bounded depth: the circuit gets a complete description of the distribution and decides whether it has the property. Taking $N$ to be an upper bound on the size of the distribution's support, the verifier's sample complexity, running time, and the communication complexity are all sublinear in $N$: they are bounded by $\widetilde{O}( N^{1-\alpha} + D)$ for a constant $\alpha > 0$, where $D$ is a bound on the depth of the circuits that decide the property. The honest prover runs in $\text{poly}(N)$ time and has quasi-linear sample complexity. Moreover, the proof system is tolerant: it can be used to approximate the distribution's distance from the property. We show similar results for any distribution property that can be decided by a bounded-space Turing machine (that gets as input a complete description of the distribution). We remark that even for simple properties, deciding the property without a prover requires quasi-linear sample complexity and running time. Prior work [Herman and Rothblum, FOCS 2023] demonstrated sublinear interactive proof systems, but only for the much more restricted class of label-invariant distribution properties.

See ECCC TR24-094.

Back to list of Oded's choices.