I guess this was prepared as an interview talk, an abusive requirement imposed on defenseless candidates. Still, Jonathan demonstrated how to transform such an imposition into an educational opportunity. Specifically, I was very happy to learn about the impressive results presented in his work on classical learning theory as well as his work on interactive proofs for verifying ML.
While it is infeasible to detail the corresponding results in the abstract, and I'm a bit too busy to risk recalling them right now, let me direct you to the relevant papers: [NeurIPS23, NeurIPS25] and [COLT23] (for the classical ML aspect), and [ITCS21] (for the interactive proofs).
Let me also mention that talking about unlabeled data in the context of on-line learning is a (good) metaphor; what it actually means is that the learner is given all data points upfront and is then asked to predict their labels in an on-line manner. The fact that this can reduce the fraction of learning errors (a.k.a. ``loss'') stands in contrast to the PAC context.
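To make the transductive setting concrete, here is a small Python sketch of the protocol just described. This is my own illustration, not code from the cited papers; the threshold class and the halving-style predictor are arbitrary choices, and the learner below does not even exploit the upfront points, so the sketch only captures the rules of the game, not the quadratic saving.

    import random

    DOMAIN_SIZE = 32

    def threshold(t):
        # h_t(x) = 1 iff x >= t; the class of thresholds over {0, ..., DOMAIN_SIZE-1}
        return lambda x: int(x >= t)

    HYPOTHESES = [threshold(t) for t in range(DOMAIN_SIZE + 1)]

    def transductive_online_game(points, true_labels):
        # points: the full unlabeled sequence, handed to the learner upfront.
        # true_labels: each label is revealed only after the learner commits to a prediction.
        version_space = list(HYPOTHESES)   # hypotheses consistent with the labels seen so far
        mistakes = 0
        for x, y in zip(points, true_labels):
            votes = sum(h(x) for h in version_space)
            prediction = int(2 * votes >= len(version_space))  # majority vote over the version space
            mistakes += int(prediction != y)                   # y is revealed only at this point
            version_space = [h for h in version_space if h(x) == y]
        return mistakes / len(points)      # fraction of mistakes, a.k.a. the loss

    if __name__ == "__main__":
        random.seed(0)
        points = random.sample(range(DOMAIN_SIZE), DOMAIN_SIZE)  # announced to the learner upfront
        target = threshold(DOMAIN_SIZE // 2)                     # ground truth, unknown to the learner
        labels = [target(x) for x in points]
        print("fraction of mistakes:", transductive_online_game(points, labels))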
Ensuring that AI systems behave as intended is a central challenge in contemporary AI. This talk offers an exposition of provable mathematical guarantees for learning and security in AI systems.
Starting with a classic learning-theoretic perspective on generalization guarantees, we present two results quantifying the amount of training data that is provably necessary and sufficient for learning: (1) In online learning, we show that access to unlabeled data can reduce the number of prediction mistakes quadratically, but no more than quadratically [NeurIPS23, NeurIPS25 Best Paper Runner-Up]. (2) In statistical learning, we discuss how much labeled data is actually necessary for learning—resolving a long-standing gap left open by the celebrated VC theorem [COLT23].

Provable guarantees are especially valuable in settings that require security in the face of malicious adversaries. The main part of the talk adopts a cryptographic perspective, showing how to: (1) Utilize interactive proof systems to delegate data collection and AI training tasks to an untrusted party [ITCS21, COLT23, NeurIPS25]. (2) Leverage random self-reducibility to provably remove backdoors from AI models, even when those backdoors are themselves provably undetectable [STOC25].

The talk concludes with an exploration of future directions concerning generalization in generative models, and AI alignment against malicious and deceptive AI.
See arXiv 2504.07006.
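For readers unfamiliar with random self-reducibility, the following toy Python sketch shows the classical Blum-Luby-Rubinfeld-style self-correction of a linear function modulo a prime. It is meant only to give a feel for why random self-reducibility can neutralize a backdoor that affects few inputs; it is not the actual construction of [STOC25], and all constants below are arbitrary choices.

    import random
    from collections import Counter

    P = 1_000_003          # a prime modulus (arbitrary choice)
    A = 424_242            # the "honest" linear function is f(x) = A*x mod P
    BACKDOOR = {7}         # inputs on which the backdoored oracle lies (a tiny fraction of Z_P)

    def backdoored_oracle(x):
        # Computes f correctly on all inputs except the backdoored ones.
        return (A * x + 1) % P if x in BACKDOOR else (A * x) % P

    def self_corrected(oracle, x, repetitions=9):
        # Evaluate f(x) using only uniformly random queries to the oracle:
        # f(x) = f(x+r) - f(r) for any r, and a random r makes both queries
        # land outside the backdoored set with high probability.
        answers = []
        for _ in range(repetitions):
            r = random.randrange(P)
            answers.append((oracle((x + r) % P) - oracle(r)) % P)
        return Counter(answers).most_common(1)[0][0]   # majority vote over the repetitions

    if __name__ == "__main__":
        random.seed(0)
        x = 7                                          # a backdoored input
        print("direct query  :", backdoored_oracle(x))             # wrong answer
        print("self-corrected:", self_corrected(backdoored_oracle, x))
        print("ground truth  :", (A * x) % P)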