Notes for a computational complexity course
Oded Goldreich
Quick links: The notes on
P, NP, and NPC,
P/poly and PH,
space complexity,
randomness and counting,
hardness amplification,
pseudorandomness,
probabilistic proof systems,
and average-case complexity.
GENERAL NOTES ABOUT THE COURSE
This guided reading course is based on my book
Computational Complexity:
A Conceptual Perspective.
(All sections and exercises refer to this book.)
An e-copy of the book will be sent to you upon request.
In addition, I have a few hard copies that I can lend out.
The conceptual perspective.
This guided reading course (as the book it uses) is not aimed
at merely teaching technical material,
but rather also aims at addressing the question of why are
certain questions asked.
The answers are rooted in natural (intuitive) concepts and concerns,
and we shall pay attention to how these intuitive concepts (resp., concerns)
are formalized in definitions (resp., rigorous questions).
Thus, even if you know some material from other studies, please
read the assigned material (and, especially, the high level discussion).
In general, don't "fast-forward" over the "easy" conceptual discussions,
but rather read them carefuly, think them over, and even criticize
(but not for the sake of critique or posing clever -- we have no
time for poses...).
Overview versus details.
Sometimes, details are crucial, but the overview comes first.
Don't get too entangled in the details, always keep in touch
with the overview. If you fail to understand some details,
make sure first that you understand the overview.
Then, try to "locate" your difficulty.
The homework assignments.
The "homework assignments" (about 4 per semester) are meant to
provide a framework for the course. They are meant to be light load,
so don't ever spend too much time on them (i.e., more than a few hours).
If you cannot do the homework assignment in a couple of hours,
then it may be the case that you got entangled in some low-level details,
which is not my intention. In such a case, please consult
other students, and if necessary consult me.
But do so before wasting too much time on it!
The reading assignments.
The required reading for the next meeting will be called
reading assignment (and this is what you should not skip,
even if you learned the material before). In addition,
I will offer optional reading (which is not mandatory).
Errors and additions to the book.
As any text, the book contains some unfortunate errors.
In case of doubts, please consult the
list of currently detectable
errors, gaps and missing clarifications
in the webpage of the book.
Needless to say, I'd be grateful for pointing out add'l errors,
which I'm sure exist, unfortunately.
NOTES FOR 1ST MEETING
Topic: P-versus-NP.
Indeed, the P-vs-NP question will be our first topic.
I do expect that almost all of you heard these terms before,
but I guess that most did not get an adequate perspective on it.
Therefore, I insist that you read the relevant Section 2.1,
and test yourself with the quiz provided below.
(Specifically, this section contains a formulation of
the P-vs-NP question/conjecture in terms of search problems
as well as a definition of NP in terms of proof systems.)
See handwritten preparation notes (partially in Hebrew):
Sheets 1 and 2.
BACKGROUND BEING ASSUMED: Familarity with the notion of an algorithm,
and the basic formulations of computability theory. Take a look at
Section 1.2.1-1.2.3 of the book and read it if you don't know it.
[This is not a reading assignment, but rather the "prerequisite".]
(A more detailed description of the background, with some exercises,
is provided in chapter 1 of my Basic Complexity book
available at ~oded/bc-book.html.)
READING ASSIGNMENT:
- A short section on Universal Algorithms (Sec 1.2.3.4)
- A short section on Time Complexity (Sec 1.2.3.5)
- The P-vs-NP Problem in its two formulations
(Sec 2.1.0-2.1.4 and Sec 2.1.6, where Sec 2.1.0 means
the part of Sec 2.1 that comes before Sec 2.1.1).
A revision of Sec 2.1 appears as Chapter 2 of my Basic Complexity book
(available HERE);
the pace there is a bit slower and more details are given.
SUGGESTED OPTIONAL READING
- The "traditional definition of NP" and how it fits in (Sec 2.1.5).
- Meditations, only if you are philosophically inclinded! (Sec 2.1.7).
SUGGESTED QUIZ FOR THE BACKGROUND MATERIAL (not to be submitted):
- What is the default representation of integers (in complexity theory)?
- What are search and decision problems?
- What is the motivation for considering the model of Turing machines?
- What does the Church-Turing Thesis assert?
- What is a universal algorithm?
- What does undecidability mean?
- What is the time complexity of an algorithm?
- What does the Cobham-Edmonds Thesis assert?
- What are Boolean circuits and formulae?
SUGGESTED QUIZ FOR SEC 2.1 (not to be submitted):
- What are the justifications for associating
efficient computation with polynomial-time algorithms?
- What are the classes PF and PC?
- What are the classes P and NP?
- List a few computational problems in PF (resp., P).
- Going beyond the list of the previous question,
list a few problems in PC (resp., NP).
- What does "PC not contained in PF" mean in intuitive terms?
- What does "P \neq NP" mean in intuitive terms?
- Is it the case that "PC not contained in PF" if and only if "P \neq NP"?
- What are the justifications for believing that "P \neq NP"?
SUGGESTED EXERCISES FOR SEC 2.1 (not to be submitted):
- Exer 2.1-2.3 are all that is offered.
They are simple but test understanding of the basic defs.
NOTES FOR 2ND MEETING
Topic: Polynomial-time reductions.
The topics of the reading assignment (for next meeting)
refer to poly-time reductions. As stressed in prior meeting,
I wish to stress that there is nothing fundamental in
using polynomials as bounds on efficient computation.
This is merely a natural and convenient convention;
the phenomena that we discuss (e.g., finding vs checking,
deciding vs verifying based on proofs, reductions, etc)
are independent of this convention, but are best captured by using it.
I wish to highlight the general notion of a reduction
(and poly-time reduction), which is fundamental beyond its role
in the theory of NPC. The general notion of an oracle machine
[Sec 1.2.3.6] is the basis of the general definition of a reduction,
sometimes called Turing-reductions, and we focus on poly-time reductions,
which are called Cook-reductions. Note that Karp-reductions
(i.e., reduction by mapoping/translation/encoding) as used
in the theory of NPC are a restricted/special case of Cook-reductions.
The notion of Cook-reductions is used in discussing the reduction
of optimization problems to search problems, and the reduction
of search problems to decision problems. Each search problem $R$
is reducible to $S'_R = \{\pair{x,y'} \exists y'' (x,y'y'')\in R\}$
(which is a decision problem...), but in some cases $R$ is reducible
to $S_R = \{x: R(x)\neq\emptyset\}$. In the latter case, we say that $R$
is self-reducible.
READING ASSIGNMENT
- A short section on oracle machines (Sec 1.2.3.6)
- Polynomial-time reductions (Sec 2.2.0-2.2.3).
Section 2.2.3.2 assumes you know the definition of NP-completeness;
if you don't please read Sec 2.3.1 before.
SUGGESTED OPTIONAL READING
- Digest and general perspectives re poly-time reductions (Sec 2.2.4).
SUGGESTED QUIZ FOR SEC 2.2 (not to be submitted):
- What are Cook-reductions?
- What are Karp-reductions and Levin-reductions?
- What is the motivation for defining all these types of reductions?
- Can any problem in PC be reduced to some problem in NP?
- What is self-reducibility and how does it relate to the previous question?
- List five search problems that are self-reducible.
SUGGESTED EXERCISES FOR SEC 2.2 (not to be submitted):
- Exer 2.5 and 2.6: Basic properties of reductions
- Exer 2.8: Search vs Optimization
- Exer 2.11-2.12: Self-reducibility of popular search problems.
NOTES FOR 3RD MEETING
Topic: NP-completeness (NPC).
Yes, I know most of you heard about NP-completeness,
but I wish to highlight a few things.
Firstly, the very existence of NP-complete problems,
which is non-obvious a priori!
Secondly, it is easier to establish the existence of NPC
than to prove that natiral problems such as SAT are NPC.
(Indeed, SAT and other famous NPC problems are very natural
computational problems, more natural than "bounded halting",
although the latter is not as unnatural as usually perceived...)
Thirdly, when proving that SAT is NPC,
I advocate decoupling the reduction of NP to SAT into two reductions:
The 1st reduction is from NP to CSAT, and the 2nd from CSAT to SAT.
The 1st reduction relies on the fact that circuits can emulate
the computation of Turning machins (TMs) on inputs of fixed length,
while the 2nd reduction shows that CNFs with auxiliary variables
can "emulate" the computation of arbitrary circuits.
These are important "lessons to take home".
Note that assuming that $P\neq NP$, there exist problems in $NP-P$
that are not NPC. Make a picture of the "world" under this assumption,
and ask what does the "world" look if $P=NP$.
READING ASSIGNMENT
- NP-completeness (Sec 2.3.0-2.3.3.1);
this covers the definitions, existence of NPC, and NPC of CSAT/SAT.
- When reading Sec 2.3.3.1, read also Sec 1.2.4.0-1.2.4.1 (for perspective).
- Read the first page of Sec 2.3.4 (i.e., Thm 2.28).
Reading the proof sketch is not required.
I want you to understand Thm 2.28 and its meaning,
and I don't care if you have no clue re how to prove it.
SUGGESTED OPTIONAL READING
- If you did not see other NP-completeness proofs before,
then you may want to read Sec 2.3.3.2.
- Read and think (but don't overdo nor submit) Exer 2.28, 2.29, and 2.30,
which discuss add'l features of some (standard/modified) Karp-reductions.
- Meditations (Sec 2.2.4 and 2.3.5).
- The proof sketch of Thm 2.28 may be hard to get.
It is better to read it after reading simpler proofs "by diagonalization"
such as the proof of Thm 4.3 (which you may read now...).
In any case, if you decide to look at the proof sketch of Thm 2.28,
be prepared to only get a vague feeling of how it is proved
(see first page of this sketch). That is, try to get the basic idea.
It is likely that if you want to get the proof in all its details,
this may require much time (possibly consulting other sources...),
and I do not recommend spending so much time on it.
SUGGESTED QUIZ FOR SEC 2.3 (not to be submitted):
- What are NP-complete (search and decision) problems?
- Is it likely that the problem of finding a perfect matching
in a given graph is NP-complete?
- Prove the existence of NP-complete problems.
- How does the complexity of solving one NP-complete problem
effect the complexity of solving any problem in NP (resp., PC)?
That is, assuming that some NP-complete problem can be solved in time $t$,
upper-bound the complexity of solving any problem in NP (resp., PC).
- List five NP-complete problems.
- Why does the fact that SAT is Karp-reducible to "Set Cover"
imply that "Set Cover" is NP-complete?
- Are there NP-problems that are neither in P nor are NP-complete?
SUGGESTED EXERCISES FOR SEC 2.3 (not to be submitted):
- Exer 2.21: Details re reducing SAT to 3SAT.
- Exer 2.23-2.25: NP-hardness of some algebraic problems.
- Exer 2.28-2.29: Improtant additional properties of some reductions.
NOTES FOR 4TH MEETING
Topics: promise problems, optimal search algorithms, and coNP.
This time you get a homework assignment, to be submitted by DATE1.
(This is more time than you need, so please do not ask for extensions.)
Also bear in mind my warning regarding not spending too much time
on homework and/or exercises.
The following reading assignment consists of three parts.
The FIRST part (Sec 2.4.1 on "promise problems") is mainly conceptual,
and it elaborate on an important issue that were mentioned already
(in prior sections of the book).
Specifically, the point is that natural computational problems are
actually promise problems (with an efficiently recognizable promise),
although they are computationally equivalent to computational problems
with a trivial promise (i.e, standard decision and search problems).
This issue will be ignoring in the rest of the course.
You should understand what are they issues involved
and why (and when) we can ignore them.
Furthermore, the formalism of promise problems
is essential in some places (see an example next).
One special type of interest is the notion of a "candid search problem",
which is a promise problem consisting of finding solutions when promised
that such exist. Note that the promise may not be poly-time decidable.
The SECOND part (sec 2.4.2) is an intriguing claim asserting that
we do know the best (i.e., optimal in a natural sense) algorithm
for any candid search problem in PC, we just don't know its running time.
But you may be diappointed when seeing what this algorithm does.
The theorem states that for every $R\in\PC$,
there exists a polynomial $p$ and an algorithm $A$
that solves the candid search problem of $R$ such that
for every $A'$ that solves the candid search problem of $R$
it holds that $t_A(x) = \tildeO(t_{A'}(x)+p(|x|))$,
where $t_B(x)$ denotes the number of steps of $B(x)$
and $\tildeO(n) = n\cdot \poly(\log n)$.
(The $\tildeO$ expression in the THM depends on $A'$.)
The LAST part (Sec 2.4.3) introduces the class coNP, and sheds light
on the difference between Karp and Cook reduction in the context
of reductions among decision problems. In particular, it is proved that
(i) NP-complete sets cannot be in coNP,
where completeness is under Karp-reductions;
and (ii) that NP-complete sets cannot be Cook-reduced to
a set in NP intersection coNP.
(Note that (i) talks about being in some class,
whereas (ii) talks about being Cook-reducible to some class.)
READING ASSIGNMENT:
Basically it is to read Sec 2.4 (i.e., "Three Advanced Topics").
Please read Exercise 2.36 (without the guideline and Part 3).
HOMEWORK ASSIGNMENT (to be submitted by DATE1):
- Exercises 2.10 and 2.11, which I discussed shortly in the Nov 7th meeting.
- Exercise 2.13 and Part (1) of Exercise 2.14.
See add'l guidelines in the correction list at
the book's webpage
- Bonus (not required!): Exercise 2.36.
SUGGESTED *OPTIONAL* READING -- REPEATING THE LAST ONE:
- Digest and general perspectives re poly-time reductions (Sec 2.2.4).
- If you did not see other NP-completeness proofs before,
then you may want to read Sec 2.3.3.2.
- Read and think (but don't overdo nor submit) Exer 2.28, 2.29, and 2.30,
which discuss add'l features of some (standard/modified) Karp-reductions.
- Meditations, only if you are philosophically inclided! (Sec 2.1.7).
- The proof sketch of Thm 2.28 may be hard to get. It may be better
to read it after reading simpler proofs "by diagonalization"
such as the proof of Thm 4.3 (which you may read now...).
In any case, if you decide to look at the proof sketch of Thm 2.28,
be prepared to only get a vague feeling of how it is proved
(see first page of this sketch). That is, try to get the basic idea.
It is likely that if you want to get the proof in all its details,
this may require much time (possiblly consulting other sources...),
and I do not recommend spending so much time on it.
SUGGESTED QUIZ FOR SEC 2.4 (not to be submitted):
- What are promise problems?
- What is the justification for ignoring the promise (in a promise problem)
whenever it is polynomial-time recognizable?
- What is a candid search problem?
- Is it the case that the P-vs-NP Question boils down to determining
the time complexity of a single (known) algorithm?
- What is the class coNP?
- How does NP relate to the class of decision problems that
are Cook-reducible to NP?
- How does NP relate to the class of decision problems that
are Karp-reducible to NP?
NOTES FOR 5TH MEETING
Topic: P/poly and the Polynomial-time Hierarchy (PH).
The following reading assignment consists of three small parts.
The FIRST part (Sec. 3.1) presents two alternative definition of the
class P/poly (via poly-size circuits and via "machines that take advice"),
proves their equivalence (Thm 3.6),
and shows that P/poly (and in fact even P/1)
contains P as well as some undeciable sets (Thm 3.7).
The formalism of "machines that take advice" offer
a way of measuring the amount of non-uniformity.
The SECOND part (Sec. 3.2.1) presents a definition of PH
(the "Polynomial-time Hierarchy" and its various levels).
The main definition is in terms of "quantifier alternation",
but an alternative using "non-deterministic poly-time oracle machines"
is also presented (in optional reading -- see Sec 3.2.2).
The LAST part (Sec 3.2.3) links the two first parts.
Specifically, Thm 3.12 (known as "Karp-Lipton Thm"),
states that if NP is in P/poly then PH collpases to its 2nd level.
The proof has inspired many follow-ups and is important to internalize.
The initial observation is that it suffices to prove that Pi_2=Sigma_2
(and in fact this is the formulation used in Thm 3.12).
Thus, starting with $S\in\Pi_2$ and using the hypothesis
(i.e., "NP in P/poly"), we shall show that $S\in\Sigma_2$.
The second observation is that $S=\{x:\forall y, (x,y)\in S'\}$,
where $S'$ is an NP-set (and $|y|=\poly(|x|)$).
Now comes the main step of the proof:
It is that $x$ is in $S$ if and only if there exists
a polynomial-size circuit $C$ that "decides $S'$ correctly
on *all* inputs of length $m=\poly(|x|)$" and $C(x,y)=1$
for *all* strings $y$ of length $m-|x|$. Note that the condition
has the form "there exists such that for all" (i.e., a Sigma_2-form),
but the issue is whether the 1st condition (i.e., the one in quotes)
is poly-time decidable. The traditional proof appears in Exer 3.11,
but we prefer the following alternative.
Since NP is in P/poly, it follows that PC is in the serach version of P/poly
(because PC is Cook-reducible to NP).
Now, letting $R'$ be the witness relation of $S'$
(i.e., $S' = \{(x,y):\exists z ((x,y),z)\in R'$),
we have (schematically) $x\in S$ iff $\forall y \exists z ((x,y),z)\in R'$.
Now, *if* the circuit $C$ solves the search problem $R'$,
then we have $x\in S$ iff $\forall y ((x,y),C(x,y))\in R'$
(because $(x,y)\in S'$ iff $((x,y),C(x,y)) \in R'$).
Furthermore, and this is a crucial observation, if $x\not\in S$,
then no matter what $C'$ you take, there exists $y$ s.t. $C(x,y)$
is not a solution to $(x,y)$ wrt $R'$ (because no solutions exist).
Hence (ignoring details (*)), $x\in S$
iff $\exists C\forall y ((x,y),C(x,y)) \in R'$
(which can be written as $\exists C\forall y \Phi(x,C,y)$,
where $\Phi$ is a condition that can be evaluated in polynmial-time).
*) ignoring details such as the fact that we are quantifying
over strings of adequate length.
READING ASSIGNMENT:
Read almost all of Chapter 3 -- on the classes P/poly and PH.
You may skip Sec 3.2.2 (PH defined by ND poly-time oracle machines).
SUGGESTED EXERCISES FOR Chap 3 (not to be submitted):
- Exer 3.2: sparse sets and P/poly,
helps verify taht you understand the def of P/poly.
- Exer 3.3: "advice hierarchy"
- Exer 3.4: Sigma2 contains all that is Cook-reducible to NP.
See also Exer 3.5.
- Exer 3.6 and 3.9: easy quiz re PH.
All these exercises are supposed to be easy.
SUGGESTED *OPTIONAL* READING:
- Read also Sec 3.2.2. Note the warning on the "non-generic" meaning
of the notation ${C_1}^{C_2}$ (bottom of page 118).
- Exercise 3.11.
SUGGESTED QUIZ (which is actually a digest) FOR CHAPTER 3
Since the answers are not explicit in the text,
they are provided HERE.
- Assuming that $C$ is a complexity class and $S$ is a set of strings,
when does the notation $C^S$ makes sense?
- What features of NP and PC are used to show that
if NP is in P/poly, then PC has polynomial-size circuits?
- What feature of the ternary relation $R$ is used when showing
that for every $x$ it holds that
for all y exists z s.t. $(x,y,z)\in R$ if and only if
there exists a polynomial-size circuit $C$ such that
for all y it holds that $C(x,y)=1$?
(Note: the quantifications over y and z refer to
all strings of $\poly(|x|)$-length.)
NOTES FOR 6TH MEETING
Topic: Do more resources yield more power?
In the context of non-uniform computation
the answer is a definite YES:
Sec 4.1 asserts that $P/(\ell+\omega(1))$ strictly contains $P/\ell$,
provided that $\log_2 \ell(n) \leq n-\omega(n)$.
[Actually, a stronger statement holds -- see corrections on the
web-site of the book (or text):
It holds that $P/(\ell+1)$ strictly contains $P/\ell$,
provided that $\log_2 \ell(n) \leq n$.]
In the context of uniform computation,
the answer is yes *provided* that these resources
are "self measureable" (i.e., measured by an algorithm that does not
use more than the measured amount of resources).
Intuitively, if a resource is not self-measurable,
then it cannot be used "safely" (i.e., you cannot use the specified
amount of resources without risking exceeding it).
Def (for any reasonable and general model of computation):
we denote by $DTIME(t)$
the class of sets that can de decided within time complexity $t$
(in this model, which "should" appear as subscript in DTIME,
if we are pedantic).
Recall (the Cobham-Edmonds thesis): Upto a polynomial composition,
the class $DTIME(t)$ is the same for all reasonable models.
Note that the exact class (without the aforementioned slackness)
are not model independent...
Def (time constructability): $f:\N\to\N$ is *time constructible* (in the model)
if there exists an algorithm that on input $n$ computes $t(n)$
within $t(n)$ steps.
EXER: For every algorithm $A$, the function f(n) that equals the number of
steps of A on input $1^n$ is time constructible
(by running $A$ itself + mapping $n$ to $1^n$,
where the latter overhead can be compensated by linear speed-up).
Cor 4.4: For every reasonable and general model of computation,
there exists a polynomial $p$ such that
for every time constructible $t_1$ (that is at least linear)
it holds that DTIME(t_1) is strictly contained in $DTIME(p\circ t_1)$.
PROOF: The idea is that in such a model, $t_1$ steps of any machine $M$
can be emulated by less than $p \circ t_1 /2$ steps of an emulating machine.
Fixing $t_1$, for each machine $M$ we define a "truncated emulation" $M'$
such that $M'(x)=1$ iff $M(x)$ halts within $t_1(|x|)$ steps with output 1.
Otherwise, $M'(x)=0$ (e.g., if $M(x)$ does not halt within $t_1(|x|)$ steps).
By the above, $M'$ has running time smaller than $p\circ t_1/2$.
For any (simple) mapping of machines $M$ to inputs $x_M$,
define $f(x_M) = 1-M'(x_M)$. Then $f$ is in DTIME(p \circ t_1 /2)$,
but cannot be in DTIME(t_1)$ as otherwise we derive a contradiction
by considering the machine $M_f$ that computes $f$ in time $t_1$
(and, specifically, on what this machine does on the corresponding input $x_{M_f}$).
QED
Re the terminoloy: Hierarchy Theorems refer to the "normal" case
in which more resources provide more power (or more ability;
i.e., ability to perform more tasks = solve more problems).
In contrast, Gap Theorems assert that under some weird circumstances,
resources does not provide more power; this is viewed as a "gap"
in the (computing) power as a function of resources
(where within the "gapped" region, power does not increase).
READING ASSIGNMENT:
Read almost all of Chapter 4, skipping Sec 4.2.1.3.
- Sec 4.1 (non-uniform hierarchy) is a very easy read.
- Sec 4.2.0-4.2.1.2 (Time hierarchy) is the core of the current assignment.
- Sec 4.2.2 (Gaps and Speedups) is more important for the stated results,
not their proofs. Make sure you understand what Thms 4.7 and 4.8 say.
- Sec 4.2.3 merely discusses analogues for space complexity.
Note that emulation is "easier" wrt *space* (compared to *time*).
SUGGESTED OPTIONAL READING
- Sec 4.2.1.3 (Thm 4.6: NTIME hierarchy) is recommended
as a good warm-up towards the proof of Thm 2.28 ("NPI").
- Now, one can try to read the proof of Thm 2.28
(i.e., on NP-sets that are neither in P nor NP-complete).
Although it does *not* build on the proof of Thm 4.6,
it uses the idea of "delayed diagonalization" which
is easier to grasp in the context of the proof of Thm 4.6.
I'm not sure about suggesting to look for the proof of Thm 4.8
in other sources.
SUGGESTED EXERCISES FOR Chap 4 (not to be submitted):
- Exer 4.1 and 4.2 refer to the non-uniform model.
- Exer 4.7 (re "an efficient counter") may be of general interest.
- Exer 4.8 and 4.9: A set in $E - P$, and EXP-completenss.
SUGGESTED QUIZ (which is actually a digest) FOR CHAPTER 4
Since the answers are not explicit in the text,
they are provided HERE.
- Fixing a specific model of computation
and a definition of time complexity for it,
what enables showing that DTIME(t1) is strictly contained in DTIME(t2)?
That is, what should be the relation between t1 and t2
and aspects of the model, for such a results to hold?
- Similarly, what enables showing that DTIME(t1) = DTIME(t2),
when t1 << t2?
- Ditto for space.
NOTES FOR 7TH MEETING
Topic: Space complexity - Part 1 (basics).
Space complexity is of intrinsic interest,
but is also interesting as offering a perspective
(and sometimes a contrast) to issues w.r.t time complexity.
When studying space complexity,
we wish to account for the storage ("space") used for the actual computation
as opposed to the storage taken by the input and output.
This is crucial since we are often interested in sub-linear
space complexity, and conventions should be introduced in order to
distinguish space used for ("active") computation from space used
for ("passive") storage of the input and output.
This is done by using separate storage for each of the three roles.
That is, we use a read only input device,
a write only output device,
and a storage device to the actual computation,
called work space device.
Only the work space device is charged for space consumed by computation,
and the read-only and write-only conventions prevent abuse of the other
(i.e., input and output) devices for conducting computation.
While one may conjecture that the minimal amount of (super-constant)
storage that can be useful for computation is logarithmic
(since it seems that one needs to "compute" the input length
in order to allocate the space consumption),
this turns out wrong
(i.e., double-logarithmic is the correct lower bound
for computations that cannot be performed in constant amount of space).
Nevertheless, we shall only consider space complexity that is at least
logarithmic (in the input length).
Issue to discuss: The instantaneous configuration
as consisting of a fixed part and a variable part,
where the partition is application-dependent.
E.g., in the proof of Thm 5.3, the input is fixed,
and the loaction of the head on it as well as the
contents of the work space (and the location on it)
are the variable parts.
READING ASSIGNMENT:
- Sec 5.1 ("space complexity -- general issues"),
but *skip* most of Sec 5.1.3.3 ("subtleties re space-bounded reductions")
from the Teaching Note on p. 150 (i.e., start skipping there).
Do read Sec 5.1.4 and also the laconic sections 5.1.3.4-5.1.3.6.
- Sec 5.2.0-5.2.3 (see, in particular, Thm 5.4 and 5.5).
SUGGESTED *OPTIONAL* READING:
- Look at Exer 5.3 and 5.4 (on double-logarithmic space complexity),
but do *not* exhaust yourself with attempts to provide all
low-level details for these exercises.
- Read all of Sec 5.1.3.3
(i.e., including the part skipped in the main assignment).
SUGGESTED EXERCISES (not to be submitted):
- Exer 5.1, 5.2 and 5.5: on some "w.l.o.g" -- simple modifications
- Exer 5.6: some computations in log-space
- Exer 5.11 and 5.12: using the "emulative composition lemma" (Lem 5.2)
SUGGESTED QUIZ (for Sec 5.1)
- What are the most important conventions regarding space complexity?
- Suppose that a graph theoretic problem is solved in space $s$,
when the graph is presented by its adjacency matrix.
What is the space complexity of solving it
when the graph is presented by incidence lists?
- Assuming that a problem can be solved in space $s$,
what can be said of its time complexity?
NOTES FOR 8TH MEETING
Topic: Space complexity - Part 2 (UCONN in L).
[I was debating with myself whether to assign you first the NL=coNL
(incl reading about NSPACE) or to start with the log-space algorithm
for deciding undeirected connectivity of graphs (UCONN).
The latter algorithm (especially its details) is far more complex,
but the setting (unlike NSPACE) sounds more familar and less problematic
(from a conceptual point of view).
So I decided to start with UCONN...]
Warning: This reading assignment is significantly more difficult
than any of the material covered so far;
it is focused on a rather sophisticated algorithm and analysis.
The reading sequence that I propose starts with the overview
of the UCONN algorithm (Sec 5.2.4.0). They key observation
is that if the input graph were of (1) bounded degree
and (2) logarithmic diameter, then a simple log-space
algorithm can solve the problem by "enumerating" all
possible paths. But, of course, the graph may not satisfy
these conditions and so the idea is to transform it to
a graph that does satisfy these condition while preserving
the relevant connectivity property.
As a warm-up, note that we can easily transform the graph
into one that supports one of the properties but not the other.
(1) We can make it 3-regular by replacing
each vertex $v$ of degree $d(v)$ by a $d(v)$-cycle
and connect edges between "representatives".
(2) We can reduce its diamater by "raising it to a power".
Sec 5.2.4.1 present this transformation on an abstract level
(i.e., avoiding the question of effecting it in log-space).
You will need to read first about expander graphs
and about the zig-zag product (Apdx E.2.1.1 and E.2.2.2 will do!).
Once you read these, read Sec 5.2.4.1 (again or for the 1st time).
Basically, what is done is interleaving steps of reducing the diameter
(or actually increasing expansion) with steps of decreasing the degree.
The expansion is increased by raising the graph to a constant power,
and the degree is reduced by the zig-zag product.
We take a logarithmic number of such iterations,
to increase expansion from (the trivial) 1/poly to a constant.
Surely, implementing the above sequence of transformations
in the straighforward way will not do
(as it will yield a space bound of log-square).
The key ideas presented in Sec 5.2.4.2 are
(i) implementing an "oracle access to the current graph"
rather then actually printing it.
(ii) using a "global" accounting for the space used by recursive calls
of a subroutine (see the notion of "global-tape oracle machine").
The relevant computation in that model are illustrated
HERE.
ADD'L INTUITION
Recall that the algorithm is motivatied by considering the diameter
of the graph, but the actual analysis cannot be carried out in these terms.
This is the case because the diameter reduction effect of the powering
is undone (as far as a local analysis goes) by the diameter increase
caused by the degree reduction.
However, it is common in mathematics (incl in TCS), to use an invariance
and/or a parameter that is stronger than the one that we actually want.
Hence, instead of diameter, we may consider the "mixing time" (of the graph),
defined as the minimum number of random steps you need to take
(when starting from any fixed vertex) till you get to a distribution
that is close to the uniform (say get to a distribution in which
each vertex is reached with probability $(1\pm0.1)/n$).
So "mixing time" can be viewed as a strengthening of diameter;
in particular, the diameter is upper bounded by the mixing time...
Also note that in a few weeks we shall see than any $n$-vertex graph
has mixing time that is at most $n^3$ (or so).
So it would be great if we could directly prove that the cover time shrinks
by a constant factor in each iteration of the foregoing procedure.
But, the mixing time is closely related to the "spectral gap" (or "gap"),
which is the gap between the 1st eigenvalue (i.e. 1) and the 2nd eigenvalue
(which is "bounded away" from 1). Roughly speaking, the 2nd eigenvalue
(i.e., 1 minus the gap) equals the rate of convergence of a random walk
(to the staionary distribution),
and the mixing time is inversely proportional to the gap,
where the factor of proportion is logarithmic
(i.e., the mixing time is $O(log n)/gap$).
Hence, by showing that the gap increases by a factor of two in each iteration,
we would be showing that mixing time shrinks in that manner....
I will try to provide some intuition regarding expander graphs,
and in particular their algebriac definition.
The key point is to note the following:
- For convenience, for a $d$-regular $n$-vertex graph $G$,
consider the *normalized* adjacency matrix of $G$
(i.e., the adjacency matrix divided by $d$), denoted $A_G$.
This is a stochastic matrix; that is, each row sums to 1.
- The uniform vector $(1,....,1)$ (or $(1/n,....,1/n)$)
is an eigenvector of $A_G$ with eigenvalue 1.
- Each vector orthogonal to $e_1=(1/n,....,1/n)$ has sum 0.
- Writing each vector as a linear combination of the eigenvectors
(or an orthogonal basis of these), each probability vector
has the form $e_1$ plus a vector in the orthogonal space.
Thus, applying $A_G$ to a probability vector yields a probability
vector that is closer to $e_1$, because the component in direction $e_1$
remains intact whereas the orthogonal component shrinks by a factor
of $\lambda_G$ (= the 2nd eigenvalue of $A_G$ in absolute value).
Thus, if you start from any distribution over $V(G)$ and take a random
step, then you end up in a distribution closer to uniform over $V(G)$.
In particular, if you start from a uniform distribution over a set $A$
(of less than half $V(G)$), then you get to a distribution that has
a significantly larger support. This is the "expansion" property (*).
Also, if you take $O(\log_{1/\lambda_G} n) = O(\log n)/(1-\lambda_G)$
random steps from any start vertex,
then you reach almost uniform distribution (over $V(G)$).
*) The fact that eigenvalue gap implies expansion is closely related
to the Expander Mixing Lemma (i.e., Lemma E.8, page 558).
Note that in the special case of $B=V-A$,
we can use $\sqrt{|A|\cdot|B|} \leq N/2$,
and so the number of edges going out
of $A$ is at least $((1-\rho(A))\cdot d - \lambda/2)\cdot|A|$.
It follows that for $|A| \leq N/2$,
it holds that $|\Gamma(A)-A| \geq ((1/2)-(\lambda/2d))\cdot|A|$.
READING ASSIGNMENT:
- Sec 5.2.4.0 (overview of the UCONN algorithm).
- Apdx E.2.1.1 (two equivalent defs of expanders).
- Apdx E.2.2.2 (the ZigZag product).
- Sec 5.2.4.1 (the transformation into an expander)
- Sec 5.2.4.2 (a log-space implementation)
HOMEWORK ASSIGNMENT (to be submitted by DATE2):
- Exercises 5.13, 5.15 and 5.16.
Please do not read the guidelines (which are hints)
unless you feel you need them.
- Optional (not for submission): Exer 5.17.
SUGGESTED *OPTIONAL* READING:
- Read the entire section on expander graphs: All of Apdx E.2.
- Think of Exer 5.17.
SUGGESTED QUIZ (for UCONN in L)
It seems quite hard to design a meaningful quiz for this meeting.
So I'll present a pseudo-quiz, which is a kind of digest.
- Describe a log-space algorithm for counting the number of
connected components in bounded-degree graphs of logarithmic diameter.
- Given a log-space transformation of any graph into
a graph of bounded-degree graphs of logarithmic diameter
such that the transformation preserves the number of connected
components in the graph, describe a log-space algorithm for
counting the number of connected components in any graph.
- Describe a transformation as in the previous item,
with the exception of using a polynomial-time algorithm
that uses a poly-logarithmic amount of space.
How much space did you use?
- Describe, in high level, the nature of the observations
and conventions that allow to implement the foregoing
transformation in logarithmic space?
NOTES FOR 9TH MEETING
Topic: Space complexity - Part 3 (NL).
The reading assignment has two parts: (1) a mainly conceptual part
that discusses on-line vs off-line non-determinism,
and (2) a technical part about NL and Directed Connectivity (st-CONN).
Re (1). As explained in Sec 5.3.1, these two non-deterministic models
are significantly different in the context of space complexity,
since (intuitively) an off-line non-deterministic device can be used
as an unaccounted work space. For this reason, the on-line model
is the model of choice (but results for the off-line model follow).
Re (2). In a sense, st-CONN "encodes" all of NL;
it is definitely NL-complete under log-space reductions.
It is easy to see that st-CONN can be solve in space log-squared
(and in general $NSPACE(s)$ is contained in $DSPACE(s^2)$).
Finally, there is the celebarated theorem NL=coNL
(or, equivalently, directed st-non-connectivity is in NL).
The key idea is an NL machine for counting the number of vertices
in a given graph that are at a given distance from a given vertex.
Let $R_G(v,k)$ denote the set of vertices that are at distance
at most $k$ from $v$, then it holds that $R_G(v,k)$ equals the
set of vertices that are either in $R_G(v,k-1)$ or are adjecent
to a vertex in $R_G(v,k-1)$. Thus, if you can non-deterministically
enumerate $R_G(v,k-1)$ (which means that (i) a correct output is
verified as such, and (ii) a correct output is sometimes produced),
then you can enumerate $R_G(v,k)$. This is done by going over all $V(G)$,
enumerating $R_G(v,k-1)$ a new for each target vertex,
and testing whether the target vertex should be taken per $R_G(v,k-1)$.
READING ASSIGNMENT: Basically read all of Sec 5.3.
- Sec 5.3.1 two model of non-determinstic space complexity.
- Sec 5.3.2 NL and Directed Connectivity (denoted st-CONN),
including the completeness of st-CONN, NSPACE vs DSPACE,
and NL=coNL (or solving st-non-connectivity in NL).
- Sec 5.3.3 a "philosophical" discussion.
SUGGESTED *OPTIONAL* READING AND THINKING:
Read and think of Exer 5.18 relating the on-line and off-line models.
SUGGESTED *OPTIONAL* EXERCISES (do not submit)
- Exer 5.19-5.21: about NL and directed connectivity
- Exer 5.25: On the notion of non-deterministic computation of functions.
- Exer 5.26: On the algorithmic graph theoretic meaning of NL=coNL.
SUGGESTED QUIZ (which is actually a digest for NL=coNL)
- What does it mean for a non-deterministic machine to compute a function?
In the case the function is Boolean,
how is this notion different from the notion of
non-deterministically accepting a set?
-
Suppose that $R(x) \subseteq\{0,1\}^\ell$, where $\ell=O(\log|x|)$, and
that $R=\{(x,y):y\in R(x)\}$.
- Show that if $R\in\NL$, then there exists a non-deterministic log-space
machine that on input $x$ and $|R(x)|$, outputs $R(x)$ itself.
- Suppose that $f$ is a length-preserving function
that is computable by a non-deterministic log-space machine,
and that there exists a non-deterministic log-space machine
that given $x$ and $R(x)$ computes $R(f(x))$.
Assuming that $R\in\NL$ and using Item 1, show that there exists
non-deterministic log-space machine that
given $x$ and $R(x)$ computes $R(f^{|x|}(x))$.
Ditto for $f^{p(|x|)}$ for any other fixed polynomial $p$.
Why is the use of Item 1 essential here?
Repeat both steps when replacing non-determinism by determinism
(and NL by L).
-
Show that coNL is in NL by using the framework
of the previous item.
(Hint: Let $x=(G,s,i)$, where $G$ is a directed graph,
$s$ is a vertex in $G$, and $i$ is an integer.
Consider the problem of finding the set of vertices
that are reachable from $s$ in at most $i$ steps.)
NOTES FOR 10TH MEETING
Topic: Space complexity (last part)
and randomized complexity classes (introduction).
The next reading assignment consists of two unrelated topics.
The 1st topic is the last topic we shall learn regarding space complexity,
and the 2nd topic is a set of issues regarding randomized complexity classes.
The 1st topic is showing that
QBF is complete for PSPACE under Karp reductions.
We shall show how the acceptance condition of any polynomial-space computation
can be encoded by a Quantified Boolean Formula of polynomial length.
The underlying idea resembles/mimics the proof that NSPACE(s)
is contained in DSPACE(s^2); essentially, we shall "re-use" formulas
in a way analogous to the re-usage of space.
The 2nd topic enters the study of the
complexity of randomized algorithms.
We start with introducing the notion of randomized algorithms
and various issues and conventions related to them.
We will discuss the equivalence on-line and off-line formulations
(which are useful for thinking about randomized algorithms)
as well as the type and magnitude
of the failure probability in such algorithms.
We then move to discuss the class BPP capturing two-sided error PPT
(probabilistic polynomial time) decision procedures.
We shall prove that BPP is (strictly) contained in P/poly,
and postpone to later showing that BPP is in PH (see Sec 6.1.3).
In addition, please read Appendix D.1 for probabilistic preliminaries
that we need for this chapter (on randomized algorithms).
Sec D.1.2.4 is not important at this point,
but it is very important in many other settings,
so I strongly recommend reading it too.
READING ASSIGNMENT
- Sec 5.4 (on PSPACE).
- Sec 6.1.0-6.1.2 ("Introduction to PPT" and BPP).
Sec 6.1.2.2 is an illustration -- please don't get entangled in it,
and do take the two stated facts as given
(i.e., don't worry about proving them, although this is not a "big deal").
- Apdx D.1 -- probabilistic preliminaries.
SUGGESTED EXERCISES (not to be submitted):
- Exer 5.30: The QBF game (for Sec 5.4)
- Exer 6.2: about randomized reductions, very easy
- Exer 6.4: about error reduction, easy but very important
SUGGESTED QUIZ FOR THE BASICS OF PROBABILISTIC POLYNOMIAL TIME
- What is the difference between BPP, RP, and ZPP.
- When viewing these classes as natural classes of PPT algorithms,
why is it important to consider failure probability
that is bounded-away from the trivial upper bound
(rather than only strictly smaller than it)?
What is the trivial upper bound in each of the cases
(i.e., of BPP, RP, and ZPP)?
What does bounded away mean,
and why is this specific definition used?
- Similarly, why is it important to have a failure probability bound
that is significantly larger than zero?
NOTES FOR 11TH MEETING
Topic: randomized complexity classes.
The next reading assignment consists of several issues relating to PPT.
The highlights include a proof that BPP is reducible via
a one-sided error randomized Karp reduction to coRP
and a discussion of RL (randomized log-space).
The aforementioned randomized reduction is typically use
to establish a weaker result, stating that BPP is in PH
(or, more specifically, that BPP is in Sigma2).
In our meeting, I may present the ideas in these weaker terms.
Actaully, I currently tend to present them in terms of MA2 vs MA1,
where MA denotes a generic randomized versuion of NP,
and MA2 (resp., MA1) are two-sided error (resp., one sided-error)
version of it.
Specifically, $S\in MA2$ (resp., $S\in MA1$) if there exists
a probabilistic polynomial-time machine $V$ such that
- If $x\in S$, then exists $y$ of $\poly(|x|)$ length
such that $\Prob[V(x,y)=1] \geq 2/3$ (resp., equals 1);
- If $x\not\in S$, then for every $y$
it holds that $\Prob[V(x,y)=1] \leq 1/3$ (resp., $\leq 1/2$).
(Note that, like for BPP and RP, error-reduction holds here too.)
Obviously, both BPP and MA1 are contained in MA2;
and recall that it is not know whether BPP is in NP.
The proof to be presented shows that MA2 is contained in MA1;
that is, we can eliminate the error in the completeness condition.
The proof, due to Lautemann, is based on providing some choice
for the verification process (i.e., its randomness may be shifted
in one of polynomially many options provided as part of the proof).
When discussing RL, I will stress the importance of restricting
computation to polynomial-time (which was the case "automatically"
in the cases of L and NL). Without these restriction, the "bad RL"
conatins all of NL (and the proof of this fact contains an interesting
idea -- the idea of an "approximate randomized counter", which allows
counting $Omega(T)$ steps at the cost of $\log\log T$ space).
READING ASSIGNMENT:
The rest of Section 6.1. Specifically:
- Sec 6.1.3: One-Sided Error and the proof that BPP is reducible
with a one-sided error Karp reduction to coRP.
- Sec 6.1.4: Zero-error PPT (ZPP) and "ZPP = RP intersection coRP".
- Sec 6.1.5: RL, issues in its definition, and UCONN in RL.
SUGGESTED EXERCISES (not to be submitted):
- Exer 6.4-6.6: about error reduction, easy but very important.
(The point is merely defining the random variables on which
the probabilistic inequalities are to be applied. You should
be in a state in which you consider such an exercises trivial.)
- Exer 6.7: about an alternative definition of ZPP, very important!
- Exercises 6.10 and 6.11: Add'l details for the illustrative paragraph
in the proof of Thm 6.9. Exec 6.10 is good to know in general!
- Exer 6.14-6.15: Exer 6.14 is easy, but Exer 6.15 is tedious.
They show completeness and hierarchies for promise problem version of BPP.
SUGGESTED QUIZ
- Define log-space analogues of BPP and ZPP,
to be denoted BPL and ZPL respectively.
- Show that ZPL equals RL intersection coRL.
- Can the ideas that underly the proof that
BPP is randomly reducible to RP by a one-sided error reduction
be applied to BPL?
NOTES FOR 12TH MEETING
Topic: Counting classes - Part 1 (exact and approximate).
The issues that I'd like to highlight include
- The notion of a counting problem and the counting class #P:
For every $R\in\PC$, the (function) class $\#P$ contains
the function $F_R$ defined by $f_R(x) = |R(x)|$.
- A decision problem that is computationally equivalent to
a counting problem (stressing the use of LEQ/GEQ rather than EQ).
I.e., $\{(x,N): f(x)\geq N\}$.
- One type of #P-completeness results is obtained by use
of parsimonious Karp-reductions (from PC to some R in PC).
But this type of reduction, which yields is itself a Karp-reduction,
implies PC-completeness of the corresponding search problem,
and thus is unlikely to be applibacle to problems in PF
(assuming PF is different from PC).
Still, other reductions (i.e., non-parsimonious ones) yield
another type of #P-completeness results refers to problems in PF;
one case is #DNF, another is #PerfectMatching,
and one can also show artificial cases (see Exer 6.22).
- On additive and relative notions of approximate counting.
For any $R\in\PC$ with $R \subseteq \cup_{n}\{0,1\}^{n+m(n)}$,
it is easy to approximate $f_R=#R$ up to an additive error
of $2^{m(n)}/\poly(n)$ (i.e., this can be done in PPT).
However, relative factor approximation is at least as hard
as deciding $S_R$ (since it allows to distinguish zero from positive).
- Warning: relative approximation of the number of valid solutions
may not be computationally equivalent to relative approximation
of number of invalid solutions. In other words, relative approximation
of $Q\in[0,1]$ is not necessarily computationally equivalent to
relative approximation of $1-Q$.
(In contrast, an additive approx of $Q$ is computationally equivalent
to an additive approx of $1-Q$.)
- A (polynomial-time deterministic) reduction
of "relative approximation of #DNF" to "additive approximation of #DNF".
For $\phi = \vee_{i\in[m]}\phi_i$ over $n$ variables, use
$$\Prob_x[\phi(x)=1]
= \sum_{i=1}^m \Prob_x[\phi_i(x)=1 \wedge (\vee_{j\ls i}\phi_j(x)=0)]
= \sum_{i=1}^m \Prob_x[\phi_i(x)=1]
\cdot\Prob_x[\vee_{j\ls i}\phi_j(x)=0 | \phi_i(x)=1]$$
while noting that
- (i) $q_i = \Prob_x[\phi_i(x)=1]$ is easily computable; and
- (ii) $p_i = \Prob_x[\vee_{j\ls i}\phi_j(x)=0 | \phi_i(x)=1]$
can be efficiently approximated up to an additive deviation of $\eps$
(w.p. $1-\exp(-t)$, by $O(t/\eps^2)$ random samples in $\phi_i^{-1}(1)$).
Note that $\sum_{i=1}^m (p_i\pm \e)\cdot q_i$
is $\sum_{i=1}^m p_i q_i \pm m\e\cdot Q$,
where $Q=\max_{i}\{q_i\} \leq \Prob_x[\phi(x)=1]$
(since $\Prob_x[\phi(x)=1] \geq \Prob_x[\phi_i(x)=1]$ for any $i$).
READING ASSIGNMENT
- Pages 202-205, Exact Counting
(not including the proofs of Props 6.21 and 6.22).
- Sec 6.2.2.0: The notion of Approximate counting.
- Sec 6.2.2.1: Approximate counting for DNF.
PREPARING FOR THE NEXT READING ASSIGNMENT
(You do not need this now, but you'll need it for next time;
so if you have time, then you may start looking at the following):
Read Appendix D.2 on Hashing Functions.
- When reading Apdx D.2.2,
you may either take the claims on "faith"
or read Sec 8.5.1.1 now.
- When reading Apdx D.2.3,
take a look (again) at the discussion in Apdx D.1.2.4
ADD'L READING (NOT MANDATORY):
I do recommend reading the proofs of Props 6.21 and 6.22 (pages 206-211),
see more details below. These proofs are independent of one another,
and so you may read one but not the other.
I'd give priority to the proof of Prop 6.22.
- The proof of Prop 6.21 (which spans 4 pages and 3 exercises)
is via an ingenious gadget reduction;
the main lesson from it is that sometimes one just needs ingenuity.
Actually, this refers to the original ("structured") proof;
the alternative proof (via Fig 6.3) was found retrospectively.
- The (1 page) proof of Prop 6.22 is via a couple of reductions,
which offer some concrete technical lessons.
SUGGESTED QUIZ (for Sec 6.2.1-6.2.2.1)
- What is a parsimonious reduction?
- Under what notions of (exact and approximate) counting
is the complexity of counting valid solutions related to
the complexity of counting invalid solutions?
- How is the complexity of exact and approximate counting problems
related to the complexity of solving decision problems?
What is the role of the promise in some of these formulations?
Take home mesasage from the Hashing Lemma (Lem D.4):
For a family of pairwise independent hashing functions
mapping $n$-bit strings to $m$-bit strings,
almost all hashing functions map a sufficienttly large subset
almost uniformly on their range,
where sufficiently large means somewhat larger than $2^m$.
NOTES FOR 13TH MEETING
Topic: Counting classes - Part 2 (approximation reduces to NP).
The issues that I'd like to highlight include
- The Hashing Lemma (Lem D.4) refers to a family $H_n^m$ of pairwise
independent hashing functions mapping $n$-bit strings to $m$-bit strings
and to a subset $S$ of $\{0,1\}^n$ of size greater than $2^m$
(it holds also for smaller sets, in some cases vacuously...).
Note that for $|S| > 2^m$, it holds that $\mu = |S|/2^m > 1$,
where $\mu$ is the expected number of $S$-elements hashed
(by a random $h\in H_n^m$) to a fixed image (i.e., $z\in \{0,1\}^m$).
For every $z$ and any $\eps>0$, the lemma states that all but
an $1/(\eps^2 \mu)$ fraction of $h\in H_N^m$ it holds that
$|\{w\in S: h(w)=z\}| = (1\pm\eps)\cdot\mu$.
[Unfortunately, Apdx D.2 and Sec 6.2 use different notations.
In Apdx D.2, $w$ is replaced by $x$ (and $z$ by $y$),
whereas in Sec 6.2 $w$ is replaced by $y$.]
Take home mesasage:
Almost all hashing functions map a set $S$
almost uniformly on their range,
provided that $S$ is sufficiently larger than their range.
- THM 6.27 states that, for every $R\in\PC$,
a relative approximation of $\#R$ can be obtained
by a PPT reduction to NP
(i.e., a PPT orcale machine with oracle to a set in NP).
The proof uses "random sieving" (of $R(x)$ via hashing),
at varying densities, when $x$ is the given input.
When the density is too small (i.e., smaller than $1/10|R(x)|$),
no element of $R(x)$ passes (whp); but when the density is large enough
(i.e., bigger than $10/|R(x)|$), some elements of $S$ pass.
The question whether an element of $S$ passes the sieve defined
by a hashing function $h$ (i.e., $w$ passes iff $h(w)=0^i$)
is an NP-type of question (see $S_{R,H}$ which includes $(x,i,h)$
if and only if there exists $y$ s.t. $(x,y)\in R$ and $h(y)=0^i$).
This gives an approximation of $\log_2|R(x)$ to within $\pm4$
(i.e., our estimate is in the interval $[|R(x)|/16, 16|R(x)|]$).
To get a better approximation ratio, amplify via Cartesian product
(i.e., $R^t(x)=\{(y_1,...,y_t):(\forall i) y_i\in R(x)\}$).
- The notion of (promise) problems with unique solutions:
The search and decision versions are obtained by restricting
the original problems to instances that have at most one solution.
- THM 6.29 states that, for every $R\in\PC$,
the search problem of $R$ is redicible to a unique search problem in PC.
Furthermore, if $R$ is PC-complete via a parsimonious reduction,
then the search problem of $R$ is reducible to the unique search
problem of $R$.
The proof uses the same idea of random sieves, but here on input $x$
we use a sieve density that is around $1/10|R(x)|$ such that,
with constant probability, a single solution $w\in R(x)$ passes.
READING ASSIGNMENT
- Appendix D.2 on Hashing Functions.
- When reading Apdx D.2.2,
you may either take the claims on "faith"
or read Sec 8.5.1.1 now.
- When reading Apdx D.2.3, you may skip Thm D.5 and its two proofs,
but take a look (again) at the discussion in Apdx D.1.2.4
- Sec 6.2.2.2: Approximating #P via a PPT reduction to NP.
- Sec 6.2.3: Unique solution problems.
ADD'L READING (NOT MANDATORY):
- Thm D.5 (aka "Leftover Hashing Lemma") is often used in other places,
so it is a good opportunity to learn it now, especially since
it follows quite easily from Lem D.4 (see the main/first proof).
The alternative/second proof is also instructive.
NOTES FOR 14TH MEETING
Topic: Counting classes - Part 3 (uniform generation).
This is the last meeting of the current semester.
However, if you wish to set up another meeting to discuss
the current reading assignment or anything else, just let me know.
The issues that I'd like to highlight this time include
-
The notion of uniform generation (of solutions wrt $R\in\PC$).
Actually, there are two definitions and/or notions:
- Exact uniform generation:
This stronger notion requires that, on input $x$,
the generation process outputs a solution w.p. at least 1/2,
and conditioned on that event (i.e., of generating an output),
all solutions should be output with equal probability.
- Approximate uniform generation:
This weaker notion requires that, for a proximity parameter $\eps$,
the output distribution be $\eps$-close to the uniform distribution
over all valid solutions.
Note that (1) implies (2) with an exponentially vanishing $\eps$.
-
Approximate counting is computationally equivalent to approximate
uniform generation with $\eps=1/\poly$ for a sufficiently large polynomial.
The equivalence holds for each $R\in\PC$ provided it is $\PC$-complete
under a "strongly parsimonious" reduction, where "strongly" means
that there is an efficiently computable 1-1 mapping of solutions
to the target instance to solutions to the original instance.
What is actually shown is:
- For every $\R\in\PC$, uniform generation for $R$ with
exponentially vanishing deviation is PPT reducible to
approximately (up to some specific eps=1/poly) counting
the number of solutions wrt a ("related") $R'\in\PC$.
That is, using a sufficient good approximate counter for $\#R'$
allows to (approximately) uniformly generate solutions for $R$.
- For every $\R\in\PC$, approximate counting for $R$
(up to any eps=1/poly) is PPT reducible to uniform generation
(with error eps'=eps/poly) for a ("related") $R'\in\PC$.
That is, using a sufficient good approx-uniform generator for $R'$
allows to (approximately) count solutions for $R$.
In both directions, on input $x$, we consider the tree of
prefices of all possible solutions for $x$ wrt $R$.
In (1), the approximate counter is used for determining the probability
of extending the current prefix by each of the two possible bits,
where a "twist" is made to get exponenially vanishing deviation
(and the error is only due to the error of the approximate counter).
(Note: The deviation of the approximate counter only effects the
running time of the uniform generation.)
In (2), uniform generation is used to approximate the number of prefices
that extend the current prefix with each of the two possible bits,
but it is important to continue with the bigger set (since this
guarantees that an additive approximation is also a relative one).
-
A direct uniform generation (in the stronger sense) for
any $R\in\PC$ by a PPT machine with oracle to some NP set.
On input $x\in S_R$, we estimate $m_x=\log_2|R(x)|$,
and set $m=\max(0,m_x-\log 40\ell)$, where $R(x)\subseteq\{0,1\}^\ell$.
(1) Select an $\ell$-wise independent hashing function $h\in H_\ell^m$,
which wvhp partitions $R(x)$ into $2^m$ cells of size approx $40\ell$,
where for $\alpha\in\{0,1\}^m$ the $\alpha$-cell
is the set of all $y\in R(x)$ s.t. $h(y)=\alpha$.
If exists $\alpha$ s.t the $\alpha$-cell is bigger than $200\ell$,
then halt with no output. Note that this is an NP-query!
(2) Select a cell $\alpha$ at random.
Using the NP-orcale, find all its elements,
and output each of them w.p., $1/200\ell$
(which mean that no output is produced with probability $1-(s/200\ell)$
where $s$ is the number of elements in the $\alpha$-cell).
READING ASSIGNMENT: Section 6.2.4 (uniform generation).
Note that you'll need Lemma D.6 (at the end of Apdx D.2).
HOMEWORK ASSIGNMENT:
- Exercise 6.39.
Don't provide all details for part (1), just give an outline.
- Either Exercise 6.38 or Exercise 6.41 at your choice.
Try to do all without looking at the guidelines.
Note: Student who did not submit some prior task should do both!
NOTES FOR the 15TH MEETING
Topic: Hardness amplification - part 1 (One-Way Function)
The material we'll cover in this semester has a significant overlap
with the material taught in 1st semester of the "Foundations of Crypto"
course (see current sections 7.1, 8.2 and 9.1-9.2); however, the
perspective (even on this overlap) will be different.
We start with Chapter 7, titled "the bright side of hardness";
although this hints to the applications of computational hardness,
the actual focus would be on "hardness amplification" -- that is,
turning moderate levels of hardness into stronger levels which
in turn are the ones useful (e.g., for pseudorandomness and Crypto).
Regarding the current reading assignment (sec 7.1.0-7.1.2),
I will discuss three topics.
- Moving from P-vs-NP (and NP-vs-BPP), which are worst-case assumptions,
to "instances that we can put our hands on" (i.e., average-case),
and to "instances generated with solutions" (which "we can use"),
and how these lead to the definition of OWF (= One-Way Function).
- The actual definition of OWF. Issues include probability of inverting,
the need to provide the preimage-length in unary, etc.
- Hardness amplification from "weak OWF" to (strong/normal) OWF.
We cannot assume a "direct product" attacker/inverter,
and so the proof is more complex than the Information Theoretic analogue.
We use a "reducibility argument" (reduction w. an average-case analysis),
and (unlike the IT analogue) it has a cost (see the running-time analysis)!
READING ASSIGNMENT: Section 7.1.0--7.1.2.
I am/was tempted to suggest all of Sec 7.1 for this reading,
which is not too much in terms of pages (i.e., about a dozen pages).
But, in 2nd thought, I feel that Sec 7.1.2 and 7.1.3 may contain too much
of a "new mindset" that is good to digest slowly and discuss.
Anyhow, those who are bored can try reading Sec 7.1.3 too.
SUGGESTED EXERCISES (not to be submitted):
- Exer 7.1 and 7.3 should be very easy.
- Exer 7.4 may sound surprising, but is actually not hard.
All these are good both for "practice" and for the facts they communicated.
NOTES FOR the 16TH MEETING
Topic: Hardness amplification - part 2 (Hard-core predicates)
[LATER NOTE: It may be better to split this material
and do each in a separate meeting....]
The current assignment consists of two parts:
- Hard-core predicates (Sec 7.1.3),
- Starting to discuss hardness in E (Sec 7.2.0-7.2.1.1).
But I doubt I'll manage to discuss both in the meeting.
Regarding hard-core predicates, I plan to
- Define and discuss these constructs.
- Present the ideas underlying the proof of Thm 7.7, which asserts
that (*essentially*) any one-way function has a hard-core predicate.
The proof starts with the case that the predicate can be approximated
with success probability $0.76$, distills the problem of "error doubling",
presents a dream in which it can be solved, and "materialize the dream".
I will probably be "pulled" to showing the entire proof sketch.
But it will be good if time is left also for the next topic...
(One may present the dream as a model in which
we obtain a random sample of labelled examples.
Recall that the problem can be cast as computing $b_x:\{0,1\}^n\to\{0,1\}$,
where $b_x(r)=b(x,r)=\sum_ix_ir_i$,
when giving an oracle access to $B_x$ that agrees with $b_x$
on a $0.5+\eps$ fraction of the possible instances.
The dream refers to a model in which we are given a sample $R$
of random points $r$'s along with the correct values $b_x(r)$'s.
Furthermore, when solving the problem in this dream-model,
we note that our solution holds also when the sample $R$ consists
of pairwise independent points that are uniformly
distributed in $\{0,1\}^n$. This observation will allow
to implement the dream-model in the standard model.
(On the other hand, a dream that asserts totally independent $r$'s
is too good to be implementable -- since it allows to find $x$
without querying $B_x$ at all).)
Regarding hardness amplification in E,
I will focus on the amazing result that asserts
that worst-case hardness implies mild average-case hardness.
The key steps are (a) considering a low-degree extension $f'$
of the original "worst-case hard" function $f\in E$,
and (b) using "self correction" (a randomized reduction of $f'$ to itself)
we show that if $f'$ is easy on "much" of its domain then it is easy
on each point in its domain.
QnA (by Ori Sberlo):
Can one not establish average-case hardness unconditionally,
by diagonalization?
In fact one can (see Goldmann, Grape, and Hastad, IPL 1994),
but this will only hold with respect to uniform classes,
whereas here we seek hardness wrt non-uniform circuits.
READING ASSIGNMENT:
Read and go to detailed exercises of the following.
- Hard-core predicates (Sec 7.1.3).
See Exer 7.5 regarding the underlying pairwise independent construction.
- Preliminaries re hard functions in E (Sec 7.2.0).
- Hardness Amplification from Worst-case to mild average-case (Sec 7.2.1.1).
See Exer 7.11 ("self correction" of low degree polynomials)
and Exer 7.12 ("low degree extension"). These are basic tools,
often used in very different contexts of theory of computation!
If you feel this was too much and/or that you want to discuss some
issue(s) in our next meeting (of 1/4), do let me know before next time!
SUGGESTED EXERCISES (not to be submitted):
- First priority: The three exercises mentioned above.
- Simple variants on Thm 7.7: Exer 7.6 and 7.7.
- Exer 7.9 and 7.13 should also be easy.
NOTES FOR the 17TH MEETING
Topic: Hardness amplification - part 3 (the XOR lemma)
[WARNING: This reading may be the most difficult in the course]
The current reading assignment refers to hardness amplification (in E);
specifically, the amplification is from mild average-case hardness
to very strong level of average-case hardness.
In addition, there is a homework assignment and optional reading.
My plan for the meeting would include:
- Discussing the XOR Lemma and the Direct Product Lemma (DPL).
- Reducing the XOR Lemma to the DPL. This includes
- Reduction to a "Selective XOR Lemma" (a la Thm 7.7).
- Reducing the Selecting XOR Lemma to the (standard) XOR Lemma.
(Although the Selecting XOR Lemma would have been "good enough")
- Proving DPL by induction on the number of inputs,
while relying on a *quantitative* two-input version.
Warn against "unquantitative induction (for non-constant #iterations)".
Add'l ideas to highlight include the observation that
any distribution (on strings of feasible length)
can be "generated" by small (non-uniform) circuits.
READING ASSIGNMENT:
Read Section 7.2.1.2 ("Yao's XOR Lemma").
SUGGESTED ADDITIONAL READING (not mandatory):
- Hardness amplification via "list decoding" (Sec 7.2.1.3);
this goes directly from worst-case to strong average-case
(rather than going via the mild level as we've done).
The 1st priority is the more concrete treatment on pages 266-7
(the abstraction on pages 268-9 is less important and hard to follow).
- Tighter amplification (wrt exponential size circuits): Sec 7.2.2.
The amplification in prior sections moved from input-length $n$
to input-length $m=\poly(n)$, while the circuit sizes only decaresed
(and so hardness $s(n)$ is now hardness $s'(m)=s(m^{1/O(1)})$.
The 1st priority is the notion of "hard region" (Sec 7.2.2.1).
The 2nd priority is its use for amplification (Sec 7.2.2.2).
Note that Sec 7.2.2.2 is intentionally sketchy (especially Lem 7.23).
I think it is good to have a vague idea of what is going on here,
even if one does not follow all details
(which would indeed require reading beyond this text).
HOMEWORK ASSIGNMENT:
SUGGESTED EXERCISES (not to be submitted):
- Exer 7.14 (variant on Thm 7.7: "hardcore predicate")
- Exer 7.15 (reversing Exer 7.14).
- Exer 7.16 (very simple -- averaging argument applied to circuits).
NOTES FOR the 18TH MEETING
Topic: Pseudorandom generator - part 1 (the general paradigm and more)
Your last reading assignment was by
far the most difficulty and technical one for this course.
So no need to fear anything that may be coming in the next months.
The current reading assignment starts a series on pseudorandom generartors.
I suggest to view "pseudorandom generators" (PRGs) as a general paradigm
with many incarnations (or as a family name with many members).
I am a bit unsure about the organization and partition of the
material for the next two meeting. So I'll provide what is planned
for both as well as my recommendation and hesitations.
In the meeting, I will focus on three issues.
- The general paradigm of PRGs, which combines (1) a deterministic stretch
with (2) a notion of computational (i.e., restricted) indistinguishability,
which is weaker than statistic indistinguishability.
[Indeed, PRGs stretch/extend randomness, not "generate" it from scratch.]
On top of it, we consider (3) the complexity of generation.
Without item (3), we can easily get strong PRGs -- See Exer 8.1.
- In general purpose PRGs, item (2) refers to *any* PPT,
whereas item (3) refers to a fixed poly-time computation
(and thus stretch is restricted to that level).
The point is that a single PRGs is good wrt all efficient applications,
even those that require more resources that the generator's evaluation.
- A focus on Computational Indistinguishability (CI) highlighting:
(i) CI is a strict relaxation of statistical indistinguishability.
(ii) Showing that (i) holds for PPT-computable distribution ensembles
requires PRGs (see Exer 8.9).
(iii) Computational indistinguishability wrt multiple samples
and when is it implied by CI wrt a single sample (see Exer 8.10).
(iv) The hybrid method.
READING ASSIGNMENT (for the next two meeting):
[the default suggestion is to read the following five items for the 1st meeting]
- Sec 8.0 -- a general introduction (a conceptual discussion)
- Sec 8.1 -- the general paradigm (a conceptual discussion)
- Sec 8.2.1 -- General-Purpose PRGs, their definition.
- Sec 8.2.2 -- The archetypical application of General-Purpose PRGs.
- Sec 8.2.3 -- Computational Indistinguishability (CI), in general.
[and the rest for the second]
- Sec 8.2.4 -- Stretch amplification (for General-Purpose PRGs)
- Sec 8.2.5 -- Construction of General-Purpose PRGs.
- Sec 8.2.6 -- A non-uniform version of General-Purpose PRGs.
- Sec 8.2.7.2 -- reflections (a conceptual discussion)
- Sec 8.3.1 -- Derandomization via canonical derandomizers
Indeed, my basic suggestion is to read Sec 8.0-8.2.3 for the 1st meeting,
and the rest for the next (i.e., 2nd) meeting; that is, proceed by
the actual order of the sections, and break after Sec 8.2.3.
But a good alternative is to take Secs 8.2.2 and 8.2.6 out
of the order and use the following order
(i) Read Sec 8.0-8.2.1 and 8.2.3 first,
(ii) Read Sec 8.2.4, 8.2.5, and 8.2.7.2 next,
and only then (iii) read Sec 8.2.6, Sec 8.2.2, and 8.3.1.
The reason for the alternative is that Sec 8.2.6 demonstrates
the point of Sec 8.2.2 in a non-uniform context,
while using a much simpler proof.
In any case, pay close attention
to Sec 8.2.3.3 (see the "hybrid method")
and to Sec 8.2.5.2 (see "the next bit test" -- Exer 8.12).
SUGGESTED EXERCISES (not to be submitted):
- Exer 8.1 (existence of inefficient PRGs)
- Exer 8.7 (PRGs imply non-trivial cases of comput' indist.)
- Exer 8.9 (on the necessity of assumptions for Exer 8.7).
- Exer 8.10 (when single sample CI does not imply multiple sample CI)
- Exer 8.12 (unpredictability versus indistinguishability)
NOTES FOR the 19TH MEETING
Topic: Pseudorandom generator - part 2 (general-purpose PRGs)
The current reading assignment is the 2nd part of the double assigment
that I gave last time -- see below. We shall finish dealing with
general purpose PRGs and turn to "canonical derandomizers".
In the meeting, I will focus on four issues.
- Amplifying the stretch:
This relies on the distinguishers being more powerful than the PRG
(and in particular being able to invoke the PRG (on long seeds)).
There are two alternative constructions
(which are easier to explain with pictures).
Both are analyzed via a hybrid argumnent
(and the corresponding hybrids are easy to see in pictures).
- Construction of a PRG based on any one-way *permutation*
and stating the equivalence "PRG iff OWF".
- Next bit unpredictability as equivalent to pseudorandomness.
For "NB unpredictability implying pseudorandomness"
we use a hybrid argument (+ look at this stmt for 1 bit).
- The derandomizations performed using general purpose PRGs
reveal an "overkill": We don't need the PRG to work in poly-time.
It suffices that it works in exp-time.
Also, it is OK if the PRG works more time than the distinguisher.
Hence, for a hard predicate $h$,
the mapping $s \mapsto (s,h(s))$ may be a PRG....
READING ASSIGNMENT
The second half of whatever you did not read so far in the following.
[the following repeats the text of last time]
[the default suggestion is to read the following five items for the 1st meeting]
- Sec 8.0 -- a general introduction (a conceptual discussion)
- Sec 8.1 -- the general paradigm (a conceptual discussion)
- Sec 8.2.1 -- General-Purpose PRGs, their definition.
- Sec 8.2.2 -- The archetypical application of General-Purpose PRGs.
- Sec 8.2.3 -- Computational Indistinguishability (CI), in general.
[and the rest for 2nd meeting]
- Sec 8.2.4 -- Stretch amplification (for General-Purpose PRGs)
- Sec 8.2.5 -- Construction of General-Purpose PRGs.
- Sec 8.2.6 -- A non-uniform version of General-Purpose PRGs.
- Sec 8.2.7.2 -- reflections (a conceptual discussion)
- Sec 8.3.1 -- Derandomization via canonical derandomizers
Indeed, my basic suggestion is to read Sec 8.0-8.2.3 for 29/4,
and the rest for the next meeting (of 6/5); that is, proceed by
the actual order of the sections, and break after Sec 8.2.3.
But a good alternative is to take Secs 8.2.2 and 8.2.6 out
of the order are use the following order
(i) Read Sec 8.0-8.2.1 and 8.2.3 for 29/4,
(ii) Read Sec 8.2.4, 8.2.5, and 8.2.7.2 next,
and only then read Sec 8.2.6, Sec 8.2.2, and 8.3.1.
The reason for the alternative is that Sec 8.2.6 demonstrates
the point of Sec 8.2.2 in a non-uniform context, while using
a much simpler proof.
In any case, pay close attention
to Sec 8.2.3.3 (see the "hybrid method")
and to Sec 8.2.5.2 (see "the next bit test" -- Exer 8.12).
SUGGESTED EXERCISES (not to be submitted):
- Exer 8.1 (existence of inefficient PRGs)
- Exer 8.7 (PRGs imply non-trivial cases of comput' indist.)
- Exer 8.9 (on the necessity of assumptions for Exer 8.7).
- Exer 8.10 (when single sample CI does not imply multiple sample CI)
- Exer 8.12 (unpredictability versus indistinguishability)
NOTES FOR the 20TH MEETING
Topic: Pseudorandom generator - part 3 (canonical derandomizers)
The current reading assignment has two parts,
the main one being the construction of canonical derandomizers,
and the other is background material for the week that follows.
In the meeting, I will focus on the first part.
I will describe the construction
(known as the Nisan-Wigderson construction),
its analysis, and maybe (i.e., if time and energy suffices)
also how these ideas extend to other settings.
The construction consists of applying a hard to approximate function,
which is computable in E but hard to approximate by smaller circuits,
to many different projections of the seed.
We cannot afford pairwise disjoint sets
(since this violates the stretching requirement),
whereas identical sets are obviously bad.
So we need a family of sets with small pairwise intersections,
and we need to construct it in time exponentail in the seed.
The analysus shows that this yields a PRG, provided that these
pairwise intersections are smaller than the logarithm of the size
bound that we allow in the inapproximability hypothesis.
READING ASSIGNMENT -- these are three unrelated items:
- The construction of canonical derandomizers: Sec 8.3.2.
This is the main part of the current reading assignment.
[The next two items are technical background that you'll need
for the next reading assignment (i.e., Lecture 21).]
- Take another look at Apdx D.2.3,
and focus on the mixing case.
See also Item 1 in the homework assignment.
- Introduction to Randomness Extractors: Apdx D.4.0-D.4.1.1.
[We shall use (2) and (3) in our next reading assignment
(of Lecture 21, on space bounded distinguishers),
but it is good to read this material now since the next reading
will be more "dense".]
HOMEWORK ASSIGNMENT (to be submitted):
- Prove the Mixing Lemma in the middle of page 530.
See the paragraph after the proof of Lemma D.4.
- Exer 8.24 (canonical derandomizers imply hardness).
Try to do it without looking at the guidelines.
SUGGESTED ADDITIONAL READING (not mandatory):
[The following four items will not be used in this course,
but they are very useful to know!]
- Sec 8.3.3.1 (the NW-construction as a general paradigm).
This will lead into the following item (2)
- Using the NW-construction paradigm for randomness extractors: Apdx D.4.2.
[The following two exercises are unrelated to the NW-construction...]
- Randomness extraction and randomness-efficient error-reduction:
Apdx D.4.1.3
- Randomness extraction and Sampling:
Start with Apdx D.3 and then read D.4.1.2
SUGGESTED EXERCISES (not to be submitted):
- Exer 8.12 and 8.20 (unpredictability vs indistinguishability).
- Exer 8.19 (constructing a set system as required by Thm 8.18).
[The above two are most useful.]
- Exer 8.22 and 8.23 (generalizing Thm 8.18 and its application):
Be warned that this is quite tedious, but can be useful for
those who believe that they learn better "by doing".
NOTES FOR the 21TH MEETING
Topic: Pseudorandom generator - part 4 (the space-bounded case)
The current reading assignment is PRGs fooling space-bounded distinguishers.
The topics include:
- definitional issues
(which are inherited from the model of (randomized) space complexity);
- two different (unconditional) results geared to different settings.
- a construction geared towards small space (logarithmic in time),
which uses a seed that is quadratic in the space;
- a construction geared towards larger space (i.e. time is polynomial
in space), which uses a seed that is linear in the space.
- a derandomization of BPL (the log-space version of BPP)
in polylogarithmic space and polynomial time, using (2i).
READING ASSIGNMENT -- basically, entire Section 8.4.
- Sec 8.4.1 deals with the definitional issues,
- Sec 8.4.2.0 states the two results re PRGs,
- Sec 8.4.2.1 sketches both constructions and their analysis,
and Sec 8.4.2.2 explains how the 1st construction leads to
a derandomization of BPL (in polylogarithmic space and polynomial time).
SUGGESTED ADDITIONAL READING (not mandatory):
[Very recommended. All these are extremely useful to learn!]
Sec 8.5 -- three special purpose PRGs.
They are used a lot, and you already encountered the 1st
(i.e., limited independence PRG) and maybe also the 3rd
(i.e., expander walks).
SUGGESTED EXERCISES (not to be submitted):
- Exer 8.25 on the limitations of PRGs fooling space-bounded distinguishers.
- Exer 8.27 multiple samples in the context of space-bounded distinguishers.
Both are intended to promote understanding of the model;
the second may be better for that.
NOTES FOR the 22nd MEETING
Topic: Probabilistic proof systems - part 1 (interactive proofs)
We step into a new chapter, one on probabilistic proof systems.
The common theme is introducing randomization
and even (bounded) error probability in order to achieve some features,
which are impossible (or very unlikely) for deterministic systems.
Hence, randomness does help wrt proof verification!
A common convention: all complexity measures are stated in terms
of the length of the main input (i.e., the "assertion being verified").
We start with interactive proof systems (ips, and the class IP).
The definition highlights verification/proofs as a dynamic process
(rather than a static object), and the prover becomes explicit.
Soundness (and sometimes completeness) are given a probabilistic
interpretation. Still, verification is (probabilistic) polynomial-time.
IP extends NP by both randomization and interaction,
but interaction is "useless" without randomness.
Actually, the soundness error is essential.
Prop: If $S$ has an ips with zero error, then $S\in\NP$.
Pf: (1) zero-error --> deterministic,
(2) deterministic --> fixed transcript (which can be sent).
On the power of IP -- in three steps: GNI, coNP, and PSPACE.
READING ASSIGNMENT
- Sec 9.0: a general introduction to PPS.
- Sec 9.1.0-9.1.3: The basic material for IP.
- Sec 9.1.4: an overview of variants that are good to know
(i.e., AM/public-coin, two-sided error, and constant speed-up of rounds).
SUGGESTED ADDITIONAL READING (not mandatory):
Sec 9.1.5: Overview on the complexity of proving.
SUGGESTED EXERCISES (not to be submitted):
- Exer 9.2: reductions to IP (understanding the definition).
- Exer 9.4: a round reduction for "coNP in IP".
- Exer 9.5: adapting "coNP in IP" to "#P in IP".
- Exer 9.6: on the QBF format that fits our needs (for PSPACE in IP)
- Exer 9.7: reducing modolu a random prime -- a trick that is good to know!
NOTES FOR the 23RD MEETING
Topic: Probabilistic proof systems - part 2 (zero-knowledge proofs)
The material from this point on (if not for a couple of weeks now)
is presented in the form of high-level overviews.
You may view them as introductions to some research directions/sub-areas.
The presentation clarifies the definitions and issues,
but typically only sketches the constructions and/or proofs.
The meeting will be devoted to zero-knowledge proofs.
I will discuss the definition and present some simple constructions.
READING ASSIGNMENT:
On zero-knowledge (ZK) proofs (Sec 9.2.0--9.2.2, pages 368-377).
- Sec 9.2.0: high-level introduction to the topic
- Sec 9.2.1: definitional treatment
- Sec 9.2.2: outline of two constructions.
SUGGESTED ADDITIONAL READING (not mandatory):
Sec 9.2.3 (p. 378-9): Introduction to "proofs of knowledge".
This is really unrelated to the rest of the course;
read it only if you are truly interested in this puzzling notion.
SUGGESTED EXERCISES (not to be submitted):
- Exer 9.13 and 9.14: necessary contitions for non-triviality of ZK.
Good for getting used to the definitions.
NOTES FOR the 24th MEETING
Topic: Probabilistic proof systems - part 3 (PCP)
[LATER NOTE: It may be better to split this material
and do (0)+(1) in one meeting and (2)+(3) in another.]
I will present the definition of PCP and the PCP Theorem (0),
and then try to give a taste of the proof.
At best I will do two of the following three things
- Show that NP is in PCP[poly,O(1)], which is already amazing.
- Describe the "proof/PCP composition" paradigm.
- Outline Irit's gap amplification approach.
I think I will present (1) slightly differently from in the book.
After reducing to QE (satisfiability of Quadratic Equations over GF(2)),
I will suggest to use as a proof-oracle the sequence of
the evaluations of all ($2^{n^2}$) quadratic expressions
on some satisfying assignment $x$ to the $n$ variables.
Firstly, rather than checking $m$ equations,
we shall check a random linear combination of these equations.
Secondly, we shall not just fetch the corresponding value from
the adequate point in the proof, but rather use self-correction,
and this will be done in addition to testing that the proof-oracle
is (close to) a valid encoding (which is a Hadamard encoding
of the outer-product of $x$ with itself).
Hence, testing and self-correcting the Hadamard code will be our first task,
and from there we shall proceed to testing and self-correcting the sub-code
corresponding of a Hadamard code of an $n^2$-bit long string obtained
by an outer-product of an $n$-bit message with itself.
READING ASSIGNMENT:
- The PCP model and the PCP theorem (statement only).
This includes Sec 9.3.0-9.3.1 and 9.3.2.0 (i.e., pages 380-4).
- The PCP Theorem -- proof outlines: Sections 9.3.2.
Note that these are overviews.
Only Section 9.3.2.1 provides enough hints towards
a full reconstruction of the proof.
The other two subsections (of Sec 9.3.2) only give a taste.
Also, you may skip pages 392-3 if you find it too hard to follow.
(Still, I recommend you try them...)
On the other hand, reading Sec 9.3.3 before Sec 9.3.2.3
may be a good idea, mainly for sake of wider perspective on "gaps".
If you don't read Sec 9.3.3 (re "PCP, gaps, and approximation") now,
then you'll read it in next assignment.
It is a relatively easy reading anyhow (and only 2 pages).
SUGGESTED ADDITIONAL READING (not mandatory):
Sec 9.3.4 ("more on PCP itself").
HOMEWORK ASSIGNMENT (to be submitted):
- Exer 9.17 (e.g., PCP[log,poly] is contained in NP).
- Exer 9.24 (note a correction on the book's webpage) and 9.26;
they deal with the two directions of PCP vs gapCSP.
SUGGESTED EXERCISES (not to be submitted):
- 9.15 and 9.16: On the relation between randomness complexity
and proof length in the context of PCPs.
- 9.20 a partial and easier analysis of the linearity test.
- 9.21 an alternative (and full) analysis of the linearity test.
NOTES FOR the 25TH MEETING
Topic: Probabilistic proof systems - part 4 (PCP, cont)
In light of your request,
the previous reading assignment was "cut shorter" to include only
(1) the PCP model and the PCP theorem (statement only)
and (2) the proof that NP is contained in PCP[poly,O(1)];
that is, Sec 9.3.0-9.3.2.1.
The current reading assignment will consist of the overviews
of the two different proofs of the PCP Theorem (i.e., NP in PCP[log,O(1)]),
which are briefly reviewed next. (Recall that PCP[log,poly] is in NP.)
Recall that we have seen (in Sec 9.3.2.1) that NP is in PCP[poly,O(1)],
which is already amazing. This result will be an ingredient in both
alternative proof of the PCP Theorem.
The 1st proof of the PCP Theorem is based on two additional ingredients.
- The proof composition paradigm which composes a PCP[r',q'] system
with a PCP[r'',q''] system to obtain a PCP[r,q] system
such that $r(n) = r'(n) + r''(q(n))$ and $q(n) = q''(q'(n))$.
The composition relies on additional features of the PCPs
such as (i) non-adaptiveness, (ii) small decision circuits,
(iii) being a "proof of proximity", and (iv) "robustness".
[Of course, each of these notions needs to be defined and obtained...]
- Showing that NP is in PCP[log,polylog].
Now, composing (2) with itself you get NP in PCP[r,q].
where $r(n) = O(\log n) + O(\log \polylog n) = O(\log n)$
and $q(n) = \polylog(\polylog n) = \poly(\log\log n)$.
Next, we compose this system with the PCP[poly,O(1)] system
and conclude that NP is in PCP[log,O(1)], since we obtain
randomness $O(\log n) + \poly(\polyloglog n) = O(\log n)$
and query complexity $O(1)$.
Note that the above proof is obtained by composing two highly
non-trivial PCP systems (i.e., PCP[poly,O(1)] and PCP[log,polylo])
using a sophisticated composition theorem (see the add'l features...).
In contrast, the 2nd proof of the PCP Theorem
evolves via (logarithmically many) repeated applications of
an intuitive "gap amplification" step, which is sketched next.
Let $\PCP_{1,s}[r,q]$ denote a PCP system with perfect completeness
and soundness error $s(n)$, where so far we focused on $s(n)=1/2$.
Clearly, NP is in $\PCP_{1,1-(1/n)}[log,3]$ (a trivial PCP)
and the basic (gap) amplification step (which amplifies $1-s$)
is captured by $\PCP_{1,1-\e(n)}[r,6]$ in $\PCP_{1,1-2\e(n)}[r+O(1),6]$,
provided that $\e(n)0$.
(N.B.: The query complexity is preserved,
whereas the randomness complexity only increased by an additive constant.)
Hence, starting from the trivial PCP and applying logarithmically many steps,
we obtain NP in $\PCP_{1,1-c}[log,6]$ (which yields the PCP Theorem).
The point, of course, is proving the amplification step.
Here, for clarity, we move to a gapCSP (for two variables over $[7]$),
and will also use "NP in PCP[poly,O(1)]".
READING ASSIGNMENT:
- The PCP Theorem -- two proof outlines:
- The original proof (Sec 9.3.2.2) - focus on "proof composition";
- Irit's proof (Sec 9.3.2.3) - focus on "gradual gap amplification".
These overviews are very high level and only give a taste
of what is going on.
Also, you may skip pages 392-3 (showing that NP in PCP[log,polylog])
if you find it too hard to follow; still, I recommend you try reading.
- Sec 9.3.3 -- PCP, gap problems, and Inapproximability
(This is a relatively easy reading, and is only 2 pages).
Actually, reading Sec 9.3.3 before Sec 9.3.2.3 may be a good idea,
mainly for sake of wider perspective on "gaps".
SUGGESTED ADDITIONAL READING (not mandatory):
Sec 9.3.4 ("more on PCP itself").
HOMEWORK ASSIGNMENT (to be submitted, repeated/reminder):
- Exer 9.17 (e.g., PCP[log,poly] is contained in NP).
- Exer 9.24 and 9.26 (PCP vs gapCSP, two directions)
NOTES FOR the 26TH MEETING
Topic: Notions of approximation
If you have questions re the reading assignment on PCP constructions,
please let me know ASAP.
Otherwise, I will probably focus on PCP, gap problems,
and Inapproximability (i.e., item 1 on the reading assignment).
Using gapCSP, I will also provide a more detailed overview
of Irit's proof of the PCP Thm (i.e., Sec 9.3.2.3).
READING ASSIGNMENT (for July 1st):
- Sec 9.3.3 -- PCP, gap problems, and Inapproximability
- Sec 10.1.1 -- Approximate Search (aka approximate optimization)
- Sec 10.1.2 -- Approximate Decision (aka Property Testing)
Section 10.1 as a whole is technically light, since most results
are only stated but not proved. I suggest to compensate for this
by doing the suggested exercises.
SUGGESTED ADDITIONAL READING (not mandatory):
Sec 9.3.4 ("more on PCP itself"). If you did not read it already.
SUGGESTED EXERCISES (not to be submitted):
- Exer 9.18: PCP, gapClique, and inapproximability (the FGLSS reduction).
- Exer 10.3: A stronger result than in Exer 9.18, but weaker than best known...
- Exer 10.4: More of the same taste, this one for vertex cover.
- Exer 10.6: Something completely different -- inapproximability w.o. PCP.
- Exer 10.7: Important technique (expanders and approx-preserving reductions)
- Exer 10.8-10.9 (easy): MAJ viewed from the perspective of PT (Sec 10.1.2).
- Exer 10.10 (easy): clarifying PT of graphs in the adj-matrix model.
NOTES FOR the 27TH MEETING
Topic: Average-case complexity
Various notions of approximation constitute one type of relaxation
(intended to reduce the complexity of computational tasks).
Another is average-case analysis of algorithms
(indeed: the relaxation is in the analysis not in the problem).
As with approximation, the issue of meaningfulness arises,
and here it is more acute (see definitional issues below).
I will discuss several definitional issues:
- Why "average-case" (as naively understood) is inadequate
(from a conceptual perspective), and what "typical case" means.
- What are natural problems (i.e., distributional version of NP)?
Avoiding the extremes (of all distrubutions or a single distribution).
Simple (P-computable) distributions vs P-sampleable ones.
- A notion of reduction that preserves typical ease;
the role of *domination* (a notion to be defined).
If we want to see "reduction that preserves typical ease",
then we better look for new ones...
Indeed, we shall present such reductions when proving
the following results:
THM 1 (easy): completeness for (NP,P-sampleable).
THM 2 (main material -- Sec 10.2.1.2): completeness for (NP,simple).
THM 3 (advanced material -- Sec 10.2.2.2):
reducing (NP,P-sample) to (NP,simple).
Combining THM 2 and 3, we have a problem in a restricted class ("simple")
that is complete for the larger class ("sampleable").
READING ASSIGNMENT (last one):
- Sec 10.2.0 -- motivation, please don't skip it!
- Sec 10.2.1 -- the basic theory.
Pay attention to first two paragraphs of Sec 10.2.1.2 (bot p. 435)
and the two "Reflection" paragaphs on pages 439-440.
- Page 451 (reflections and relation to OWF).
SUGGESTED ADDITIONAL READING (not mandatory):
- Sec 10.2.2.1 -- search vs decision
- Sec 10.2.2.2 -- P-sampleable vs simple, skipping the proof of Thm 10.26.
- Advanced (be warned!): The proof of Thm 10.26 (pages 447-450).
SUGGESTED EXERCISES (not to be submitted):
- Exer 10.14--10.17: relations between some definitions (re Sec 10.2.1).
- Exer 10.25--10.28: relations between some definitions (re Sec 10.2.2).
Back to the Computation Complexity (book).