Oded's choices (Nr. 380)

Some highlights of the Oberwolfach complexity workshop

This my report on the complexity workshop that took place at Oberwolfach in June 2024. This report refers only to some of the presentations, and it does not cover the contents of these presentations but rather some take-home ideas that I extracted. (Omissions, errors and misunderstandings are obviously my fault.) Once the official report will be available, I will post it HERE.

Raghu Meka - Structure vs Randomness Redux: New Frontiers and Applications

Taking the structure vs randomness approach to its extreme, one may define structure by a test that distinguishes a given object from a random one. E.g., the failure of the uniform distribution on a set $S$ to pass a small-bias test yields an affine subspace in which the density of $S$ is larger than it is in the original space. Using tests that arise from the specific application, we infer that it suffices to handle either random objects or ones that are very dense in the corresponding domain.

(An afterthought (14-June-24): Actually, one may think of the proof of the regularity lemma in this way. Failure of many pairs to be regular, which is a sense is an indication of non-randomness, yields a refinement of the partition, which is so closer to being regular (and so less non-random).)

Amir Abboud - Recent Developments in Fine-Grained Complexity

Advocating a natural notion (such as `combinatorial constructions') does not require providing a definition. Instead, one may list various desired features (of the construction) and indicate which is satisfied by the alternative construction (or algorithm) although it is not satisfied by the standard one.

Avi Wigderson - Fast Parallel Algorithm for integer GCD

Recall that all natural algebraic functions can be computed in NC2 (let alone that DET, which is in NC2, is a complete problem for the arithmetic analogue of P). Many of these functions are known to be in ``arithmetic AC0'', but many are not (per the breakthrough of LST21 that seperates the arithmetic versions of AC0 and NC1). the current result shows that GCD is in AC0.

Kuikui Lui - Spectral Independence and Applications to Analysis of Markov chains

While most `sampling via MC' result seem problem-specific, the surveyed approach aims at providing conditions regarding the desired distribution that allows for a simple MC to generate it fast (i.e., small mixing-time). Specifically, the main meta-theorem states that if the desired distribution has `limited correlation' then a simple MC mixes rapidly.

Avi Wigderson - Reading Turing's Papers

While all know of Turing's 1936 paper, it is amazing to note that many of the notions that are central to ToC were discussed in that paper and in other papers of Turing (which we are mostly unaware of).

Starting with the 1936 paper, we stress that this paper not only introduced a model (and proved the undecidability result wrt it), but rather also provided a detailed discussion that justifies the model as well as considers potential critiques of it. This feature (i.e., a careful articulation of suggested definitions) is extremely typical of Turing's works. We also stress that the 1936 paper discussed the passibility of defining complexities in this model. It also discussed non-determinism (or choices, leading to c-machines) and viewing them as a proof system. Additional highlights include

In his phd thesis (1939), Turing considered oracle machines (which he termed o-machines).
In his unpublished works in statistics (credited by Good in the early 1950s), Turing actually defined a worst-case notion related to KL-divergence.
In the early 1950s, Turing discussed randomized algorithms, the notion of pseudorandomness, machine learning, and the famous notion of a `test of intelligence' (viewed as an `imitation game').

Pranjal Dutta - Recent Advances in Polynomial Identity Testing

The survey covered both conditional results and unconditional ones. Highlights of the conditional results include

A construction paradigm that starts with a hypothesis regarding the existence of an explicit $O(1)$-variate polynomial of degree $d$ that requires large size in terms of $d$, and derives a hitting set of related size for $n$-variate polynomials.
A new generator (of GKSS22) that does not follow the NW framework, but rather uses numerous different sums of partial derivatives. Hence, unlike prior works, this capitalizes on the arithmetic nature of the relevant tests (i.e., polynomials).
A bootstrapping step that takes a hitting set of size that is only one unit less than trivial (i.e., size $(d+1)^k-1$ for $k$-variate polynomials of degree $d$) and yields a hitting set of size $\poly(s)$ for polynomials computed by size $s$ circuits.

In the unconditional case, the breakthrough result of LST21, which establishes an $n^{\Omega((\log n)^{f(d)})}$ lower bound for the size of depth $d$ Arithmetic circuits computing $o(log n)$-iterated matrix multiplication, demonstrates the misleading nature of the common wisdom by which lower bounds for depth four imply lower bounds for larger depth (where the point is that implication losses a factor that is exponential in square root of $n$).

Venkatesan Guruswami - 3-query LCC

(LCC = locally correctable codes.)

Recalling that $q$-query LCCs of length $\exp(k^{1/(q-1)})$ exists, the focus is on trying to show that this is optimal for $q=3$ at least in some cases (i.e., under some restrictions). The surveyed result gets very close for the case of linear codes (while recalling that the foregoing 3-query LCC is linear). It shows a lower bound that is exponential in $\tildeOmega(n^{1/2})$. Unlike prior work (e.g., see Pravesh Kothari's presentation (below)) that transform potential 3-query LCC to longer 2-query LCC, the current approach is more direct. See AG24.

Ron Rothblum - Batch Verification

Batch verification is a process that allows to verify $k$ claims at a cost that is significantly smaller than running the $k$ verifications independently of one another. Known results include

A doubly-efficient batch verification for any set in NP that has unique witnesses (i.e., a set in UP).
If a set has a doubly-efficient batch verification, then it is in `weak SWI' (where weak refers to having a non-uniform poly-sized prover and noticeable distinguishing gap).
Note that SWI is trivial for UP, and that it is also trivial for a computationally unbounded prover.
Any set in NISZK has batch verification that is honest-verifier SZK (i.e., hvSZK).
Hence a natural subclass of (hv)SZK has (hv)SZK batch verification.

Pravesh Kothari - Spectral Refutation for Semirandom CSPs and Applications to Local Codes

(LCC = locally correctable codes; 3-LCC = 3-query LCCs.)

The general paradigm is to reduce an extremal combinatorial problem (and one can view 3-LCCs as such) to the analysis of a distribution of (linear) CNP instances. Specifically, to prove that an object does not exist, we reduce its non-existence to the claim that the corresponding instance is unsatisfiable with positive probability. (Beware, however, that the distribution has less randomness than the number of variables.)

The results surveyed are a sub-exponential lower bound for linear 3-LCC, superseded by the result presented by Venkat (see above), and a super-polynomial lower bound for general 3-LCCs. I will make some comments about the second result, because its conceptual nuances are not that clear in KM24.

Specifically, while Thm2 is stated for smooth codes, I view it as a technical lemma, whereas the real conceptual focus should be on local correction per se (see Def 3.1). However, while both notions are traditionally defined with respect to high levels of decoding error (equiv., low but positive advantage over a random guess), in the current paper they are defined wrt a low level of error (i.e., a positive $\eps$ smaller than (say) 1/6). The lower bounds increase when this error decreases; that is, they are of the form $\Omega(k^{0.5/\eps})$. This is not a real conceptual problem, because we may expect to have such low level of decoding error when the corruption rate is adequately small. Indeed, a strong notion of LCCs (akin POTs in the context of property testing) may require that the decoding error is linear in the corruption rate.

Another issue at hand is that the results do hold for a general decoder, which may be adaptive. In the context of large decoding error below 1/2, it is easy to move from adaptive decoders to non-adaptive ones, at the cost of reducing the advantage (over 1/2) by a factor of $2^3=8$; but we cannot afford this transformation the current context because we wish to have non-adaptive decoders of small errors. The solution is to avoid this transformation, which introduces additionally difficulties in the proof that handles the general case (i.e., adaptive decoders (let alone for non-linear codes)).

Shuichi Hirahara - One-Way Functions and Zero Knowledge

Letting MCSP denote the minimum circuit size problem (in which the input is the $2^n$-bit long truth table of a Boolean function on $n$-bit strings), it is clear than MCSP is in NP but it is unknown whether it is NP-hard. Likewise, it is clear that MCSP can be solved in time $2^{N+o(n)}$, where $N$ is the length of the input, but one can actually improve the upper bound to $2^{0.8N+o(n)}$.

In contrast to the case of SAT, it is known that average case hardness of MCSP (wrt the uniform distribution) is equivalent to (randomized) worst-case of GapMCSP. These and other results regarding MCSP were used in order to prove that if NP is in ZK in a meaningful way (i.e., NP is not in ioP/poly), then one-way function exist, but, as illustrated by Liu, Mazor, and Pass, an alternative presentation may avoid MCSP and other `meta-complexity' problem altogether.

Yael Kalai - Recent Developments in Constructing SNARGs for NP

SMARG are succinct non-interactive computationally-sound proofs (aka, arguments). The non-interactive model includes a common reference string (CRS, which may be either random or structured), and adaptivity refers to whether the cheating prover selects a false claim based on the CRS. It is known that SNARGs can be constructed in time that is closely related to the complexity of deciding membership, and the challenge is to construct SNARGs in polynomial-time when given an NP-witness (for an input in an NP-set).

One result that was mentioned is that (interactive) batch arguments for NP (see Ron Rothblum's presentation (above)) can be converted into SNARGs for NP.

Venkatesan Guruswami - The Parameterized Inapproximability Hypothesis

Parameterized problems refer to pairs of the form $(k,x)$ such that $k$ is a parameter that restricts possible solutions and $x$ is the actual input; e.g., in case of VC, on input $(k,G)$, one seeks a $k$-vertex cover in $G$ (or asks if such exists). Recall that FPT is the class of such problems that can be solved in time $f(k)\cdot\poly(|x|)$, where $f$ is an arbitrary function (and indeed VC is in FPT via a truncated search on all possible sets). In contrast, under ETH, Clique is not in FPT (and is complete for a class denoted W[1] (i.e., the reduction is in FPT and the new parameter depends only on the original parameter)).

The Parameterized Inapproximability Hypothesis (PIH) asserts that the constraint satisfaction problem in which the instances have $k$ variables that range an alphabet of size $n$ and (2-variable) constraints that are specified in the input are not in FPT. (Note that unlike in CSP, the set of possible constraints is not fixed for the problem; instead, they are an integral part of the input.)

It was known that gapETH implies PIH, and the focus here is in proving that ETH itself implies PIH. See GLRSW24.

Shafi Goldwasser -- Verification of ML

My take-home is a non-ML version of what Shafi talked about.

Recall that a checker for a search problem R is a probabilistic oracle machine, denoted $C$, that is required to satisfy the following two conditions:

For every correct program $P$ (i.e, $P(x) \in R(x)$ for every $x$ that has a solution and $P(x)=\bot$ otherwise), and for every input $x$, it holds that the probability that $C^P(x)$ accepts is at least $2/3$ (alternatively, one may insist on probability 1).
(This may be viewed as a completeness requirement.)
For every program $P$ and every input $x$, if $P$ outputs a wrong answer on $x$ (i.e., $P(x)\not\in R(x)$ when $R(x)$ is non-empty, and $P(x)\neq\bot$ otherwise), then the probability that $C^P(x)$ rejects is at least $2/3$.
(This may be viewed as a soundness requirement.)

Note that when $P$ outputs a correct answer on $x$, although $P$ is not correct on all inputs, we are allowed to reject although $P$ may be incorrect on very few inputs. Hence, any problem that ca be solved in polynomial-time has a trivial checker which always rejects (assuming we limit the scope to polynomial-time programs). This unappealing phenomenon does not arise if we strengthen the first condition as follows

Strong completeness (wrt the uniform input distribution)

Note that combining such a strong checker with a program that is correct on $1-\eps$ fraction of the inputs yields a program that yields a correct answer on $1-O(\eps)$ fraction of the inputs, and outputs a special failure symbol on the other inputs. This effect can be met by any self-corrector for $R$, but the self-correction requirement seems stronger; that is, the latter requires obtaining the correct output for every input when providing with a program that is correct on at least $1-\eps$ of the inputs, for some specified $\eps$. In other words, combining a self-corrector with a self-tester, we do get a strong checker, but this may be an overkill. Indeed, can we obtain strong checkers in cases that we know of no (efficient) self-correctors and testers (let alone when we know that such cannot exist)?

The definition can be generalized to arbitrary input distributions (and the requirement may be made for any distribution in a specified class).

Xin Li - Recent Developments in the Theory of Randomness Extractors

A useful observation that unifiers many (non-explicit) results regarding restricted classes of random sources asserts that for any class of $2^{2^k}$ sources of min-entropy $k$, there exists a deterministic (seedless) extractor. However, we seek explicit and efficient constructions, and the actual focus of the presentation was on two-source extractors (i.e., the source may be viewed as partitioned into two parts yielding two stochastically independent sources).

Turning to two-source extractors, one may note that the inner-product (mod 2) extractor breaks at min-entropy rate 1/2 because the two sources may reside in orthogonal affine subspaces. The solution is to encode each source before applying inner-product (e.g., encode the $n$-bit $x$ by the $n^2$ outer product of $x$ with itself).

The focus of the talk was on constructing a two-source extractor for asymptotically optimal min-entropy (i.e., logarithmic min-entropy). See my comments and the original paper.

Let me seize the opportunity and comment that it is important to consider relaxations of the independence (of the two sources) requirement. An initial study of this direct is reported in BGM19.

Oded Goldreich - On the Size of Depth Three Boolean Circuits for Computing Low Degree Polynomials

(I deviate from my own policy of not reporting here of my own work. Here are notes on a presentation I gave at the very last session of the workshop, which was devoted to free discussions.)

We consider (block) multi-linear functions, which are sums of monimials where each monomial has a single variable from each of the $t$ block, which are each of length $n$. We focus on small $t$ (e.g., $t=1,2,3,...\log n)$). For $t=1$ (Parity), it is known that depth three circuits are of size $\exp({\sqrt n})$, and the goal is to obtain higher lower bounds for explicit multi-linear functions. But we don't even know whether this is true existentially.

Question: For $t\geq 2$, do there exists $t$-linear functions that require depth three circuits of size $\exp(n^{t/(t+1)})$? A restricted class of such circuits arises from a model of Arithmetic circuits (over GF(2)) with general (multi-linear) gates of bounded arity, which is one of the two complexity measures. Specifically, the AN-complexity of such circuit is the maximum between the arity of its gates and the number of gates. Hence, AN(f) is the minimum AN-complexity of circuits computing f, and AN2(f) is the minimum AN-complexity of depth two circuits computing f. The model as well as the following three results are taken from GW13.

The size of depth-three Boolean ciruits is at most exponential in their AN-complexity: for every $t$ and $t$-linear $f$, there exists a depth three Boolean circuit of size $\exp(AN(f))$ that computes $f$.
This is easy to show for AN2, whereas the general case uses the Cook/Valiant trick.
A generic upper bound: for every $t$ and $t$-linear $f$, it holds that $AN(f) \leq AN2(f) = $O((tn)^{t/(t+1)})$.
A non-explicit lower bound: for every $t$ and lmost all $t$-linear $f$, it holds that $AN2(f) \geq AN(f) = $\Omega((tn)^{t/(t+1)})$.

So the challenge is to meet the (tight) lower bound by an explicit construction. Progress towards this goal has been made:

The explicit 3-linear function $f(x,y,z)=\sum_{i.j}x_i y_j z_{i+j}$ satisfies $AN2(f) = $\tildeOmega(n^{2/3)})$ as well as $AN(f) = $\tildeOmega(n^{3/5})$.
There exists an explicit 4-linear function $f$ that satisfies $AN(f) = $\tildeOmega(n^{2/3)})$.
For exery $\eps > 0$, when using $t=\ceil{4/\eps^2}$, the following explicit $t$-linear function $f$ that satisfies $AN2(f) = $Omega(n^{1-\eps})$. (N.B.: This is only for AN2.)
$f(x^{(1)},...,x^{(t)}) = \sum_{i_1,...,i_{t-1}} (\prod_{j\in[t-1]} x_{i_j}) x^{(t)}_{i_1+...+i_{t-1}}$

The first two items are from GT15 and the last one is from G19.

Avishay Tal - TreeEval is in almost log space (by Cook and Mertz)

(Amnon Ta-Shma told me about this presentation at the airport. I did recommend this work in Choice Nr 362, but wish to add a comment now.)

Rather than discussing the catalytic space model, I prefer to use the model of global-tape oracle machines presented in Definition 5.8 of my computational complexity book. (The catalytic model is a special case.) The crux of the current result is that any Boolean function over $\{0,1\}^\ell$ can be computed by this model using global space $O(\ell\log\ell)$ and local space $O(\log\ell)$. Furthermore, this holds also if the input bits are obtained by oracle calls. Hence, using Lemma 5.10, TreeEval for depth $d$ and $k$-bit long inputs at the leaves can be computed in (the standrad model in) space $O((d+k)\log k)$. As for the crux itself, it is proved by taking low-degree extensions of the original functions and manipulating them within a field that has more that $\ell$ elements.

Added on 26-June-24: my overview and digest.

Back to list of Oded's choices.