Why I Am Excited About This Research Direction: A Tribute to Shafi Goldwasser

Talks by Madhu Sudan, Huijia (Rachel) Lin, Maria-Florina (Nina) Balcan, and Ronitt Rubinfeld

Preface

One of the many distinctive features of Shafi Goldwasser's research is excitement: about the topics, about the directions, and above all about the research process itself, not to mention the celebrated breakthroughs that have ensued.

Inspired by Shafi Goldwasser's approach and contributions, the theme of this event was to highlight the excitement of the process of doing research. Four speakers presented research they are particularly excited about through this lens. What follows is my take on their presentations.

Context in Communication, by Madhu Sudan

When focusing on the complexity of communication, the model of communication complexity seems adequate. But the focus here will not be on using it to derive lower bounds that shed light on other computational models, but rather on using it as a framework for proving upper bounds. The aim is to capture and reflect the fact that communication is made feasible by a huge amount of common context, although this context is never spelled out and the parties do not actually have the same view of what this context is.

In any real-life human situation, it is this common context that makes coordination of action, let alone a feeling of mutual understanding, possible via brief communication. The communication is brief in comparison to the parties' inputs, let alone in comparison to the vast context that they are aware of. A technical talk at a conference such as FOCS presumes the use of a common natural language, with several layers of technical knowledge built on top of it (e.g., high-school math, knowledge of certain parts of TOC at large, and specific knowledge of the specific area).

The actual studies surveyed in this talk concern toy examples in which the context takes the form of shared randomness, knowledge of the joint distribution from which the inputs are picked, or knowledge of the function that needs to be computed. The point is that each party may have a different view of these key components; the views are highly correlated but not identical.
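
To illustrate the flavor of such toy examples (this is merely my own illustration, not one of the protocols actually surveyed in the talk), consider randomized equality testing when the two parties' copies of the "shared" random string are correlated but not identical. In the following minimal Python sketch, all names are hypothetical, and eps denotes the probability that a bit in Bob's copy disagrees with the corresponding bit in Alice's copy (eps = 0 recovers perfectly shared randomness).

    import random

    def parity_sketch(x, r):
        # one-bit sketch of the bit-string x: inner product with r (mod 2)
        return sum(xi & ri for xi, ri in zip(x, r)) % 2

    def equality_protocol(x, y, eps, k=32):
        # Alice and Bob compare k one-bit sketches of their inputs;
        # Bob's copy of each "shared" random bit is flipped w.p. eps
        n = len(x)
        agreements = 0
        for _ in range(k):
            r_alice = [random.randrange(2) for _ in range(n)]
            r_bob = [b ^ (random.random() < eps) for b in r_alice]
            if parity_sketch(x, r_alice) == parity_sketch(y, r_bob):
                agreements += 1
        # with perfectly shared randomness, equal inputs always agree;
        # with merely correlated randomness, even equal inputs may disagree
        return agreements >= 0.9 * k

    x = [random.randrange(2) for _ in range(64)]
    print(equality_protocol(x, x, eps=0.0))  # perfectly shared: True
    print(equality_protocol(x, x, eps=0.2))  # imperfectly shared: may err

The point this toy sketch tries to convey is that a protocol designed for perfectly shared randomness may fail when the parties' views of the context are merely correlated, which is what calls for protocols that are robust to such discrepancies.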

Program Obfuscation, by Huijia (Rachel) Lin

The initial motivating application of obfuscation is to obfuscate programs (or software) so that copies can be used without being reverse-engineered (i.e., without gaining any knowledge beyond what is disclosed by using the program as a black box). But obfuscation has a vast number of applications in Cryptography, yielding almost all known primitives (possibly with the help of some very basic primitives) and even very fancy ones such as FHE (Fully Homomorphic Encryption) and Functional Encryption. Furthermore, it also implies functionalities that are not known to be achievable otherwise. All this is easy to see for the virtual black-box (VBB) notion of obfuscation, which is known to be unimplementable in general, but it was also shown for a seemingly weak notion called indistinguishability obfuscation (iO), which coincides with the best one can do (with respect to obfuscation).
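
For concreteness, let me recall the standard definition of iO (stated informally; this is textbook material rather than part of the talk's contents):

    % iO, informally: O is a PPT algorithm mapping circuits to circuits
    % such that (1) functionality is preserved:
    \forall C \,\forall x :\; \mathcal{O}(C)(x) = C(x)
    % and (2) obfuscations of equivalent circuits are indistinguishable:
    \bigl[\,\forall x : C_0(x) = C_1(x)\,\bigr] \wedge |C_0| = |C_1|
        \;\Longrightarrow\; \mathcal{O}(C_0) \approx_c \mathcal{O}(C_1)
    % where \approx_c denotes computational indistinguishability.

(If memory serves, the assertion that iO coincides with the best one can do refers to the "best-possible obfuscation" notion of Goldwasser and Rothblum.)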

The second generation of candidate constructions is based on combining "bilinear" and "trilinear" maps with strong notions of pseudorandom generators (i.e., ones that are highly local, having degree two or three when written as polynomial maps).

Machine Learning for Algorithm Design, by Maria-Florina (Nina) Balcan

Machine Learning usually deals with concept classes containing simple functions. In some cases, these functions may be viewed as computing devices (e.g., Neural Networks), but this perspective is typically not taken, let alone emphasized or highlighted. The talk focuses on research that explicitly takes this perspective; that is, it focuses on concept classes that are families of complex and parameterized algorithms. The parameters are typically thresholds that determine choices among natural cost measures, distance measures, or algorithmic methods. In practice, these thresholds are typically determined by gut feeling and ad hoc experiments. What can be more natural than to systematically employ machine learning methods to determine these thresholds?

Of course, this is easier said than done. It turns out that devising learning algorithms for this setting requires dealing with new challenges and phenomena (e.g., the performance of the parameterized algorithms is typically discontinuous in the parameters, a phenomenon that does not occur in standard machine learning settings). So this research pushes forward both the algorithms used in applications and the algorithmic methods of machine learning itself.
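
As a concrete illustration (the parameterized greedy knapsack family is a canonical toy example in this line of work, but the sketch below is my own, and its grid search is only a crude stand-in for the structural analyses in the actual research), consider scoring items by value/size^rho, where rho interpolates between ordering by value (rho = 0) and by density (rho = 1). The value achieved by the greedy algorithm is a piecewise-constant, hence discontinuous, function of rho.

    import random

    def greedy_knapsack(values, sizes, capacity, rho):
        # greedy knapsack from a parameterized family: take items in
        # decreasing order of value / size**rho; the achieved value is a
        # piecewise-constant (i.e., discontinuous) function of rho
        order = sorted(range(len(values)),
                       key=lambda i: values[i] / sizes[i] ** rho,
                       reverse=True)
        total, used = 0.0, 0.0
        for i in order:
            if used + sizes[i] <= capacity:
                used += sizes[i]
                total += values[i]
        return total

    def tune_rho(instances, grid):
        # data-driven tuning: pick the rho with the best average value
        # over a sample of instances from the application's distribution
        def avg(rho):
            return sum(greedy_knapsack(v, s, c, rho)
                       for v, s, c in instances) / len(instances)
        return max(grid, key=avg)

    random.seed(0)
    instances = [([random.uniform(1, 10) for _ in range(20)],
                  [random.uniform(1, 10) for _ in range(20)], 25.0)
                 for _ in range(50)]
    print(tune_rho(instances, grid=[i / 10 for i in range(31)]))

Note the source of the discontinuity: an infinitesimal change in rho can swap the order of two items and thereby change the algorithm's output abruptly, which is exactly the phenomenon mentioned above.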

Local Algorithms, by Ronitt Rubinfeld

Property testing and its immediate extensions to approximation (of values) deal with huge inputs but seek (relatively) tiny outputs. In contrast, local computation algorithms (LCAs) deal with huge inputs and huge outputs, akin to locally decodable codes. This requires a definition of what it means to compute an output; essentially, the definition says that one can retrieve any desired part of the output by a super-fast computation that probes few locations in the input. Hence, we have queries to the output that are served by making probes to the input.

A few comments are in order. First, we are talking about centralized algorithms, which are not restricted in their access to the input (unlike distributed algorithms in the so-called LOCAL model, where, in each round, each processor can obtain information only from its neighbors). Second, several copies of the centralized algorithm may run on the same input, without coordinating their actions (via communication) beyond sharing some initial global randomness. Using randomization seems essential (challenge: formulate and prove this assertion), and different random choices may yield different outputs, but each choice of randomness must determine a unique output. In particular, the output is not allowed to depend on the sequence of queries. The latter requirement is natural per se, but it also implies that there is no need for inter-query storage, and that all copies of the algorithm (which may be used by different entities) obtain the same output (provided that they start with the same randomness).
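
To make these requirements concrete, here is a minimal Python sketch (mine, not from the talk) of a well-known LCA for Maximal Independent Set on bounded-degree graphs, which locally simulates the greedy algorithm with respect to a random ranking of the vertices (a la Nguyen and Onak). The shared randomness is a seed that determines the ranking, probes to the input are calls to the neighborhood oracle, and the answer to each query is fully determined by the seed, regardless of the order in which queries arrive.

    import hashlib

    def rank(v, seed):
        # pseudorandom rank of vertex v, fully determined by the shared seed
        return hashlib.sha256(f"{seed}:{v}".encode()).hexdigest()

    def in_mis(v, neighbors, seed):
        # query "is v in the MIS?": v joins the greedy MIS (w.r.t. the rank
        # order) iff no lower-ranked neighbor joins; the recursion probes
        # only a small neighborhood of v (in expectation, for bounded degree)
        for u in neighbors(v):
            if rank(u, seed) < rank(v, seed) and in_mis(u, neighbors, seed):
                return False
        return True

    # toy usage: the adjacency dict below plays the role of the probe oracle
    graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
    mis = [v for v in graph if in_mis(v, graph.__getitem__, seed="r42")]
    print(mis)  # same seed => same output, whatever the query order

Note that different copies of this algorithm, run by different entities, answer all queries consistently as long as they share the seed, and that no storage is needed between queries.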

In spite of the differences, there is a tight connection between the study of LCAs and the study of the LOCAL model of distributed computing. In particular, for problems on bounded-degree graphs, the probe complexity of an LCA is at most exponential in the round complexity of a distributed algorithm solving the same problem.

