Introduction to Complexity Theory

(Lecture Notes - Summaries)

Oded Goldreich

Consider looking at the preface.

First Semester

Lecture 1: The P vs NP Question. We review the fundamental question of computer science, known as the P versus NP question: Given a problem whose solution can be verified efficiently (i.e., in polynomial time), is there necessarily an efficient method to actually find such a solution? Loosely speaking, the first condition (i.e., efficient verification) is captured in the definition of NP, and the second in that of P. The actual correspondence relies on the notion of self-reducibility, which relates the complexity of determining whether a solution exists to the complexity of actually finding one.

Lecture 2: NP-completeness and Self Reducibility. We prove that any relation defining an NP-complete language is self-reducible. This will be done using the SAT self-reducibility (proved in Lecture 1), and the fact that SAT is NP-Hard under Levin Reductions. The latter are Karp Reductions augmented by efficient transformations of NP-witnesses from the original instance to the reduced one, and vice versa. Along the way, we give a simple proof of the existence of NP-Complete languages (by proving that Bounded Halting is NP-Complete). NP-completeness.

Lecture 3: More on NP and some on DTIME. In the first part of this lecture we discuss two properties of the complexity classes P, NP and NPC: the first property is that NP contains problems which are neither NP-complete nor in P (provided NP not equal P), and the second one is that NP-relations have optimal search algorithms. In the second part we define new complexity classes based on exact time bounds, and consider some relations between them. We point out the sensitivity of these classes to the specific model of computation (e.g., one-tape versus two-tape Turing machines).

Lecture 4: Space Complexity. We define ``nice'' complexity bounds; these are bounds which can be computed within the resources they supposedly bound (e.g., we focus on time-constructible and space-constructible bounds). We define space complexity using an adequate model of computation in which one is not allowed to use the area occupied by the input for computation. Before dismissing sub-logarithmic space, we present two results regarding it (contrasting sub-loglog space with loglog space). We show that for ``nice'' complexity bounds, there is a hierarchy of complexity classes - the more resources one has the more tasks one can perform. One the other hand, we mention that this increase in power may not happen if the complexity bounds are not ``nice''.

Lecture 5: Non-Deterministic Space. We recall two basic facts about deterministic space complexity, and then define non-deterministic space complexity. Three alternative models for measuring non-deterministic space complexity are introduced: the standard non-deterministic model, the online model, and the offline model. The equivalence between the standard and online models and their exponential relation to the offline model are proved. We then turn to investigate the relation between the non-deterministic and deterministic space complexity (i.e., Savitch's Theorem).

Lecture 6: Non-Deterministic Logarithmic Space. We further discuss composition lemmas underlying previous lectures. Then we study the complexity class NL (the set of languages decidable within Non-Deterministic Logarithmic Space): We show that directed graph connectivity is complete for NL. Finally, we prove that NL = coNL (i.e., NL is closed under complementation).

Lecture 7: Randomized Computations. We extend the notion of efficient computation by allowing algorithms (Turing machines) to toss coins. We study the classes of languages that arise from various natural definitions of acceptance by such machines. We focus on probabilistic polynomial-time machines with one-sided, two-sided and zero error probability (defining the classes RP (and coRP), BPP and ZPP). We also consider probabilistic machines that uses logarithmic spaces (i.e., the class RL).

Lecture 8: Non-Uniform Polynomial Time (P/poly). We introduce the notion of non-uniform polynomial-time and the corresponding complexity class P/poly. In this (somewhat fictitious) computational model, Turing machines are provided an external advice string to aid them in their computation (on strings of certain length). The non-uniformity is expressed in the fact that an arbitrary advice string may be defined for every different length of input. We show that P/poly ``upper bounds'' the notion of efficient computation (as BPP} subseteq P/poly), yet this upper bound is not tight (as P/poly contains non-recursive languages). The effect of introducing uniformity is discussed, and shown to collapse P/poly to P. Finally, we relate the P/poly versus NP question to the question of whether NP-completeness via Cook-reductions is more powerful that NP-completeness via Karp-reductions. This is done by showing, on one hand, that NP is Cook-reducible to a sparse set iff NP subset P/\poly, and on the other hand that NP is Karp-reducible to a sparse set iff NP=P.

Lecture 9: The Polynomial Hierarchy (PH). We define a hierarchy of complexity classes extending NP and contained in PSPACE. This is done in two ways, shown equivalent: The first by generalizing the notion of Cook reductions, and the second by generalizing the definition of NP. We then relate this hierarchy to complexity classes discussed in previous lectures such as BPP and P/poly: We show that BPP is in PH, and that if NP subseteq P/poly then PH collapses to is second level.

Lecture 10: The counting class #P and Approximating it. The class NP captures the difficulty of determining whether a given input has a solution with respect to some (tractable) relation. A potentially harder question, captured by the class #P, refers to determining the number of such solutions. We first define the complexity class #P, and classify it with respect to other complexity classes. We then prove the existence of #P-complete problems, and mention some natural ones. Then we try to study the relation between #P and NP more exactly, by showing we can probabilistically approximate #P using an oracle in NP. Finally, we refine this result by restricting the oracle to a weak form of SAT (called uniqueSAT).

Lecture 11: Interactive Proof Systems. We introduce the notion of interactive proof systems and the complexity class IP, emphasizing the role of randomness and interaction in this model. The concept is demonstrated by giving an interactive proof system for Graph Non-Isomorphism. We discuss the power of the class IP, and prove that coNP subseteq IP. We discuss issues regarding the number of rounds in a proof system, and variants of the model such as public-coin systems (a.k.a.\/ Arthur-Merlin games).

Lecture 12: Probabilistically Checkable Proof (PCP). We introduce the notion of Probabilistically Checkable Proof (PCP) systems. We discuss some complexity measures involved, and describe the class of languages captured by corresponding PCP systems. We then demonstrate the alternative view of NP emerging from the PCP Characterization Theorem, and use it in order to prove non-approximability results for the problems max3SAT and maxCLIQUE.

Lecture 13: Pseudorandom Generators. Pseudorandom generators are defined as efficient deterministic algorithms which stretch short random seeds into longer pseudorandom sequences. The latter are indistiguishable from truely random sequences by any efficient observer. We show that, for efficiently sampleable distributions, computational indistiguishability is preserved under multiple samples. We related pseudorandom generators and one-way functions, and show how to increase the stretching of pseudorandom generators. The notes are augmented by an essay of Oded.

Second Semester

Lecture 14: Pseudorandomness and Computational Difficulty. We continue our discussion of pseudorandomness and show a connection between pseudorandomness and computational difficulty. Specifically, we show how the difficulty of inverting one-way functions may be utilized to obtain a pseudorandom generator. Finally, we state and prove that a hard-to-predict bit (called a hard-core) may be extracted from any one-way function. The hard-core is fundamental in our construction of a generator.

Lecture 15: Derandomization of BPP (The NW-generator). We present an efficient deterministic simulation of randomized algorithms. This process, called derandomization, introduce new notions of pseudorandom generators. We extend the definition of pseudorandom generators and show how to construct a generator that can be used for derandomization. The new construction differ from the generator that constructed in the previous lecture in it's running time (it will run slower, but fast enough for the simulation). The benefit is that it is relying on a seemingly weaker assumption.

Lecture 16: Derandomizing Space-Bounded Computations. We consider derandomization of space-bounded computations. We show that BPL subseteq DSPACE(\log^2 n), namely, any bounded-probability Logspace algorithm can be deterministically emulated in O(\log^2 n) space. We further show that BPL subseteq SC, namely, any such algorithm can be deterministically emulated in O(\log^2 n) space and (simultaneously) in polynomial time.

Lecture 17: Zero-Knowledge Proof Systems. We introduce the notion of zero-knowledge interactive proof system, and consider an example of such a system (Graph Isomorphism). We define perfect, statistical and computational zero-knowledge, and present a method for constructing zero-knowledge proofs for NP languages, which makes essential use of bit commitment schemes. We mention that zero-knowledge is preserved under sequential composition, but is not preserved under the parallel repetition.

Lecture 18: NP in PCP[poly,O(1)]. The main result in this lecture is NP subseteq PCP(poly,O(1)). In the course of the proof we introduce an NPC language ``Quadratic Equations'', and show it to be in PCP(poly,O(1)). The argument proceeds in two stages: First assuming properties of the proof (oracle), and then testing these properties. An intermediate result that of independent interest is an efficient probabilistic algorithm that distinguishes between linear and far-from-linear functions.

Lecture 19: Dtime(t) contained in Dspace(t/log t). We prove that Dtime(t(\cdot)) subseteq Dspace(t(\cdot)/\log t(\cdot)). That is, we show how to simulate any given deterministic multi-tape Turing Machine (TM) of time complexity $t$, using a deterministic TM of space complexity $t/\log t$. A main ingrediant in the simulation is the analysis of a pebble game on directed bounded-degree graphs.

Lecture 20: Circuit Depth and Space Complexity. We study some of the relations between Boolean circuits and Turing machines. We define the complexity classes NC and AC, compare their computational power, and point out the possible connection between uniform-NC and ``efficient'' parallel computation. We conclude the discussion by establishing a strong connection between space complexity and depth of circuits with bounded fan-in.

Lecture 21: Communication Complexity. We consider Communication Complexity - the analysis of the amount of information that needs to be communicated betwen two parties that wish to reach a common computational goal. We start with some basic definitions, considering both deterministic and probabilistic models for the problem, and annotating our discussion with a few examples. Next we present a couple of tools for proving lower bounds on the complexity of communication problems. We conclude by proving a linear lower bound on the communication complexity of probabilistic protocols for computing the inner product of two vectors, where initially each party holds one vector.

Lecture 22: Circuit Depth and Communication Complexity. The main result presented in this lecture is a (tight) nontrivial lower bound on the monotone circuit depth of s-t-Connectivity. This is proved via a series of reductions, the first of which is of significant importance: A connection between circuit depth and communication complexity. We then get a communication game and proceed to reduce it to other such games, until reaching a game called FORK. We conclude that a lower bound on the communication complexity of FORK, to be given in the next lecture, will yield an analogous lower bound on the monotone circuit depth of s-t-Connectivity.

Lecture 23: Communication Complexity of the FORK Game. We analyze the FORK game, introduced in the previous lecture. We give tight lower and upper bounds on the communication needed in a protocol solving FORK. This completes the proof of the lower bound on the depth of monotone circuits computing the function st-Connectivity.

Lecture 24: Average-Case Complexity. We introduce a theory of average-case complexity which refers to computational problems coupled with probability distributions. We start by defining and discussing the classes of P-computable and P-samplable distributions. We then define the class DistNP (which consists of NP problems coupled with P-computable distributions), and discuss the notion of average polynomial-time (which is unfortunately more subtle than it may seem). Finally, we define and discuss reductions between distributional problems. We conclude by proving the existence of a complete problem for DistNP.

Lecture 25: Computational Learning Theory. We define a model of automoatic learning called probably approximately correct (PAC) learning. We define efficient PAC learning, and present several efficient PAC learning algorithms. We prove the Occam's Razor Theorem, which reduces the PAC learning problem to the problem of finding a succinct representation for the values of a large number of given labeled examples.

Lecture 26: Relativization. In this lecture we deal with relativization of complexity classes. In particular, we discuss the role of relativization with respect to the P vs NP question; that is, we shall see that for some oracle A, $P^A = NP^A$, whereas for another $A$ (actually for almost all other $A$'s) $P^A \neq NP^A$. However, it also holds that $IP^A \neq PSPACE^A$ for a random $A$, whereas $IP = PSPACE$

Back to Oded Goldreich's homepage or to main page of these notes.

Copyright (C symbol) 1999 by Oded Goldreich. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that new copies bear this notice and the full citation on the first page. Abstracting with credit is permitted.

This work may be published or be a basis for publication in the future. Copyright may be transferred without further notice and this version may no longer be accessible.