A few talks at China Theory Week
As usual, the partial selection is based and restricted
both by my attendence and my interests.
In particular, I only attended two days of the week.
I will comment shortly on the following six talks:
1. Chenggang Wu: Testing Surface Area
2. Huy L. Nguyen:
Approximate Near Neighbor Search Beyond Locality Sensitive Hashing
3. Reut Levi:
A Quasi-Polynomial Time Partition Oracle for Graphs with an Excluded
Minor
4. Sangxia Huang:
Improved Hardness of Approximating Chromatic Number
5. Avishay Tal:
Properties and Applications of Boolean Function Composition
6. Zeyu Guo: Randomness-Efficient Curve Samplers
My comments on all talks appear first,
and the abstracts follow after all comments.
Oded's comments
The China Theory Week is an annual event aimed at providing a platform
for top CS theory students from institutions worldwide to meet and
present their research. In 2007 and a few years following,
the event was held in Tsinghua University (Beijing),
but in the last couple of years it was held in Aarhus.
See the
official website.
1. Chenggang Wu: Testing Surface Area
This work studies a natural double-approximation problem;
as noted in the abstract, the standard (for volume appriximation)
assumptions about the body (i.e., convexity) are replaced by
a distance relaxation (i.e., the body is closed in a natural measure
(e.g., Lebesque measure) to a body with a certain volume).
2. Huy L. Nguyen:
Approximate Near Neighbor Search Beyond Locality Sensitive Hashing
The point here is out-performing the guarantees obtained via the
use of locality sensitive hashing (i.e., a hashing scheme that
maps closed points to close values). The new scheme uses
a sophisticated but natural combination of several hashing schemes,
and analyzes the effect on close and distance pairs separately.
3. Reut Levi: A Quasi-Polynomial Time Partition Oracle for Graphs
with an Excluded Minor
The problem is of implementing an oracle that on input a vertex $v$
in a graph $G$ with excluded minors returns all vertices
in a connected component of a subgraph of $G$ such that
(1) the size of each connected component is small (i.e., smaller than $k$),
and (2) the subgraph covers almost all edges of $G$
(i.e., it misses at most an $\eps$ fraction).
The complexity measure is the number of (incidence) queries made to $G$
per each query to the oracle. While prior work had a query
complexity exponential in $k/\eps$, the current work provides
quasi-polynomial dependence, and leaves open the challlenge of
obtaining a polynomial dependence.
While prior work constructed iteratively certain contractions of
the graph (and possibly "broken" some of them at the end),
the current work interatively updates connected components
of the graph by an alternate process of contraction and
breaking-up. This alternation guarantees that the size
of the connected components remains small throughout the process,
which yields the (exponential) improvement in the query complexity.
4. Sangxia Huang:
Improved Hardness of Approximating Chromatic Number
Using Chan's recent optimal inapproximability
result for CSPs, this work shows that it is NP-hard to
color $K$-colorable graphs using $exp(K^{1/3)$ colors.
This improves over a prior quasi-polynomial bound.
The use of Chan's result requires to revisit the connection
between CSPs and Chromatic Number for the specific instances
that emerge in the reduction.
5. Avishay Tal:
Properties and Applications of Boolean Function Composition
This work present a unified and general frmaework for proving
separations and collapses in decision tree complexity.
The framework, which was implicit in some prior works,
uses the behavior (e.g., multiplicativity) of various complexity
measures under functional composition.
The explicit presentation streamlines known results
as well as allows for obtaining new ones.
6. Zeyu Guo: Randomness-Efficient Curve Samplers
The goal is to construct good samplers such that the function
describing the sample points, per each choice of random seed,
is a low-degree (multi-variant) curve (i.e., sequence of polynomials).
This guarantees that the value of any given (multi-variant) polynomial
on the sample points can be described by a low degree polynomial.
In fact, the work obtains good explicit samplers such that
that are themselves curves (i.e., as a function of both
the seed and the index of the sample point).
The original abstracts
1. Chenggang Wu: Testing Surface Area
We consider the problem of estimating the surface area of an unknown
n-dimensional set F given membership oracle access. In contrast to
previous work, we do not assume that F is convex, and in fact make no
assumptions at all about F. By necessity this means that we work in the
property testing model; we seek an algorithm which, given parameters $A$
and $\eps$, satisfies:
(1) [Completeness] If the surface area of F is at most $A$, then the
algorithm accepts (whp);
(2) [Soundness] If F is not $\eps$-close to some set $G$ with surface area
at most $\kappa A$, then the algorithm rejects (whp).
We call $\kappa \geq 1$ the *approximation factor* of the testing algorithm.
The $n = 1$ case (in which surface area is the number of endpoints) was
introduced by Kearns and Ron, who solved the problem with $\kappa = 1/\eps$
and $O(1/\eps)$ oracle queries. Later, Balcan et al. [BBBY12]
solved it with $\kappa =3D 1$ and $O(1/\eps^4)$ queries.
We give the first result for higher dimensions n. Perhaps surprisingly, our
algorithm completely evades the "curse of dimensionality":
for any n and
any $\kappa > \frac{4}{\pi} \approx 1.27$, we give a test that uses
$O(1/\eps)$ queries. For small $n$ we have improved bounds. For n=1 we
can achieve $\kappa =1$ with $O(1/\eps^{3.5})$ queries
(slightly improving [BBBY12]),
or any $\kappa > 1$ with $O(1/\eps)$ queries (improving [KR98]).
For n = 2, 3 we obtain $\kappa \approx 1.08, 1.125$ respectively, with
$O(1/\eps)$ queries. Getting an arbitrary $\kappa > 1$ for n > 1 remains an
open problem.
Finally, motivated by the learning results from [KOS08], we extend our
techniques to obtain a similar tester for Gaussian surface area for any n,
with query complexity $O(1/\eps)$ and any approximation factor $\kappa>
\frac{4}{\pi} \approx 1.27$.
Joint work with Pravesh Kothari, Amir Nayyeri and Ryan O'Donnell.
2. Huy L. Nguyen:
Approximate Near Neighbor Search Beyond Locality Sensitive Hashing
The approximate near neighbor search problem is defined as follows: given a
set P of n points in a d-dimensional space, build a data structure that,
given a query point q, if there exists a point within distance r from q,
then it reports a point within distance cr from q. Here c is the
approximation factor of the algorithm.
We present a new data structure for the c=E2=80=93approximate near neighbor=
problem
(ANN) in the Euclidean space. For n points in R^d, our algorithm achieves
O(dn^=CF=81) query time and O(n^{1+?=CF=81} + nd) space, where =CF=81 =E2=
=89=A4 7/(8c^2) +
O(1/c^3) + o(1). This is the first improvement over the result by Andoni
and Indyk (FOCS 2006) and the first data structure that bypasses a
locality=E2=80=93sensitive hashing lower bound proved by O=E2=80=99Donnell,=
Wu and Zhou
(ITCS 2011). By a standard reduction we obtain a data structure for the
Hamming space and =E2=84=931 norm with =CF=81 =E2=89=A4 7/(8c) + O(1/c^{3/2=
}) + o(1), which is
the first improvement over the result of Indyk and Motwani (STOC 1998).
The talk is based on joint work with Alexandr Andoni, Piotr Indyk, and Ilya
Razenshteyn.
3. Reut Levi: A Quasi-Polynomial Time Partition Oracle for Graphs
with an Excluded Minor
Motivated by the problem of testing planarity and related properties, we
study the problem of designing efficient {\em partition oracles\/}.
A {\em partition oracle\/} is a procedure that, given access to the
incidence lists representation of a bounded-degree graph $G=3D (V,E)$ and a
parameter $\eps$, when queried on a vertex $v\in V$, returns the part
(subset of vertices) which $v$ belongs to in a partition of all graph
vertices. The partition should be such that all parts are small, each part
is connected, and if the graph has certain properties, the total number
of edges between parts is at most $\eps |V|$.
In this work we give a partition oracle for graphs with excluded minors
whose query complexity is quasi-polynomial in $1/\eps$, thus improving on
the result of Hassidim et al. ({\em Proceedings of FOCS 2009}) who gave a
partition oracle with query complexity exponential in $1/\eps$. This
improvement implies corresponding improvements in the complexity of testing
planarity and other properties that are characterized by excluded minors as
well as sublinear-time approximation algorithms that work under the
promise that the graph has an excluded minor.
4. Sangxia Huang:
Improved Hardness of Approximating Chromatic Number
A vertex coloring of a graph is an assignment of colors to its vertices
such that no two adjacent vertices receive the same color. The minimum
number of colors needed for such a coloring is called the chromatic number
of G. We mainly consider the approximation of the chromatic number, that is
given a K-colorable graph, we would like to color it with as few colors as
possible in polynomial time. As a classical combinatorial optimization
problem, vertex coloring is closely related to other problems such as
finding maximum independent sets, Max CSPs, and probabilistically checkable
proofs (PCPs) with certain special properties. In addition to being an
important theoretical challenge, graph coloring also has a number of
applications such as scheduling and register allocation.
Building on recent breakthrough in the construction of efficient PCPs, we
show that it is NP-hard to distinguish a K-colorable graph from a graph
that has chromatic number at least 2^Omega(K^{1/3}) for sufficiently large
K. This improves the previous best known lower-bound of 2^{Omega((log K)^2)}.
5. Avishay Tal:
Properties and Applications of Boolean Function Composition
This talk focuses on a simple but useful tool for separating complexity
measures: Boolean function composition. For Boolean functions f, g defined
on n, m input variables respectively, the composition of f and g is the
value of f on n inputs, each of them is the calculation of g on a distinct
set of m Boolean variables. Previous works have used this tool to achieve
some of the best separations between complexity measures of Boolean
functions such as sensitivity, block sensitivity, certificate complexity,
degree, decision tree complexity and quantum query complexity. This was due
to the multiplicative nature of these measures with respect to composition.
We use this multiplicative behavior to establish several new applications.
First, we give a counter-example using composition for a conjecture made by
Adam Kalai in 2004, which if true would imply a better algorithm for the
learning junta problem.
Second, we show an unusual behaviour of block sensitivity with respect to
composition which was not noticed before. We define a new complexity
measure called "fractional block sensitivity" which better
explains this behaviour.
Last, we show how composition can tighten relations between complexity
measures, by improving a result by Nisan and Szegedy about the relation
between block sensitivity and degree as a polynomial.
6. Zeyu Guo: Randomness-Efficient Curve Samplers
Curve samplers are sampling algorithms that proceed by viewing the domain
as a vector space over a finite field, and randomly picking a low-degree
curve in it as the sample. Curve samplers exhibit a nice property besides
the sampling property: the restriction of low-degree polynomials over the
domain to the sampled curve is still low-degree. This property is often
used in combination with the sampling property and has found many
applications, including PCP constructions, local decoding of codes, and
algebraic PRG constructions.
The randomness complexity of curve samplers is a crucial parameter for its
applications. It is known that (non-explicit) curve samplers using $O(\log
N+\log(1/\delta))$ random bits exist, where $N$ is the domain size and
$\delta$ is the confidence error. The question of explicitly constructing
randomness-efficient curve samplers was first raised in Ta-Shma and Umans
(FOCS 2006) where they obtained curve samplers with near-optimal randomness
complexity.
We present an explicit construction of low-degree curve samplers with
optimal randomness complexity (up to a constant factor), sampling curves of
degree $\lb m\log_q(1/\delta)\rb^{O(1)}$ in $\mathbb{F}_q^m$. Our
construction is a delicate combination of several components, including
extractor machinery, limited independence, iterated sampling, and
list-recoverable codes.
Note: See
ECCC TR13-120.
Back to
list of Oded's choices.