**
This is not the webpage of
Oded's forthcoming book on property testing,
titled Introduction to Property Testing.
Ditto regarding
Oded's lecture notes on property testing,
which serve as a basis for the
aforementioned book. **

Appeared (in 2010) as a LNCS (Vol 6390), in the LNCS series State of the Art Survey.

LNCS 6390 is now available online. You can find information about it or access the online version.

**
Below is the book's tentative preface
and table of contents (incl. drafts of most papers).
**

Tentative image for cover [regular format, high resolution format] and final cover.

Property Testing is the study of super-fast (randomized) algorithms for approximate decision making. These algorithms are given direct access to items of a huge data set, and determine whether this data set has some predetermined (global) property or is far from having this property. Remarkably, this approximate decision is made by accessing a small portion of the data set.

Property Testing has been a subject of intensive research in the last couple of decades (see, e.g., recent surveys [R1,R2]), with hundreds of studies conducted in it and in closely related areas. Indeed, Property Testing is closely related to Probabilistically Checkable Proofs (PCPs), and is related to Coding Theory, Combinatorics, Statistics, Computational Learning Theory, Computational Geometry, and more.

The mini-workshop, hosted by the Institute for Computer Science (ITCS) at Tsinghua University (Beijing), brought together a couple of dozen of leading international researchers in Property Testing and related areas. At the end of this mini-workshop it was decided to compile a collection of extended abstracts and surveys that reflect the program of the mini-workshop. The result is the current volume.

An asymptotic analysis is enabled by considering an infinite sequence of domains, functions, and properties. That is, for any $n$, we consider functions from $D_n$ to $R_n$, where $|D_n|=n$. (Often, one just assumes that $D_n=[n]\eqdef\{1,2,...,n\}$.) Thus, in addition to the input oracle, representing a function $f:D_n\to R_n$, the tester is explicitly given two parameters: a size parameter, denoted $n$, and a proximity parameter, denoted $\eps$.

This definition does not specify the query complexity of the tester, and indeed an oracle machine that queries the entire domain of the function qualifies as a tester (with zero error probability...). Needless to say, we are interested in testers that have significantly lower query complexity.Definition:Let $\Pi=\bigcup_{n\in\N}\Pi_n$, where $\Pi_n$ contains functions defined over the domain $D_n$. Atester for a property $\Pi$is a probabilistic oracle machine $T$ that satisfies the following two conditions:If the tester accepts every function in $\Pi$ with probability 1, then we say that it has

- The tester accepts each $f\in\Pi$ with probability at least $2/3$; that is, for every $n\in\N$ and $f\in\Pi_n$ (and every $\eps>0$), it holds that $\prob[T^f(n,\eps)\!=\!1]\geq2/3$.
- Given $\eps>0$ and oracle access to any $f$ that is $\eps$-far from $\Pi$, the tester rejects with probability at least $2/3$; that is, for every $\eps>0$ and $n\in\N$, if $f:D_n\to R_n$ is $\eps$-far from $\Pi_n$, then $\prob[T^f(n,\eps)\!=\!0]\geq2/3$, where $f$ is
$\eps$-far from $\Pi_n$if, for every $g\in\Pi_n$, it holds that $|\{e\in D_n:f(e)\neq g(e)\}|>\eps\cdot n$.one-sided error; that is, $T$ has one-sided error if for every $f\in\Pi$ and every $\eps>0$, it holds that $\prob[T^f(n,\eps)\!=\!1]=1$. A tester is callednon-adaptiveif it determines all its queries based solely on its internal coin tosses (and the parameters $n$ and $\eps$); otherwise it is calledadaptive.

Research in property testing is often categorized according to the type of functions and properties being considered. In particular, algebraic property testing focuses on the case that the domain and range are associated with some algebraic structures (e.g., groups, fields, and vector spaces) and studies algebraic properties such as being a polynomial of low degree (see, e.g., [BLR,RS]). In the context of testing graph properties (see, e.g., [GGR]), the functions represent graphs or rather allow certain queries to such graphs (e.g., in the adjacency matrix model, graphs are represented by their adjacency relation and queries correspond to pairs of vertices where the answers indicate whether or not the two vertices are adjacent in the graph).

A somewhat related model is one in which the tester obtains random pairs $(e,f(e))$, where each sample $e$ is drawn (independently) from the aforementioned distribution. Such random ($f$-labeled) example can be either obtained on top of the queries to $f$ or instead of them. This is also the context of testing distributions, where the examples are actually unlabeled and the aim is testing properties of the underlying distribution (rather than properties of the labeling which is null here).

A third ramification refers to the related notions
of *tolerant testing* and *distance approximation*
(cf. [PRR]).
In the latter, the algorithm is required to estimate the distance
of the input (i.e., $f$) from the predetermined set of instances
having the property (i.e., $\Pi$). Tolerant testing usually
means only a crude distance approximation that guarantees that
inputs close to $\Pi$ (rather than only inputs in $\Pi$) are accepted
while inputs that are far from $\Pi$ are rejected (as usual).

The importance of representation is forcefully demonstrated in the gap between the complexity of testing numerous natural graph properties in two natural representations: the adjacency matrix representation (cf. [GGR]) and the incidence lists representation (cf. [GR1]).

Things get to the extreme in the study of locally testable codes, which may be viewed as evolving around testing whether the input is ``well formed'' with respect to some fixed error correcting code. Interestingly, the general study of locally testable codes seeks an arbitrary succinct representation (i.e., a code of good rate) such that well-formed inputs (i.e., codewords) are far apart and testing well-formness is easy (i.e., there exists a low complexity codeword test).

While [BLR,RS] focused on algebraic properties, the focus of [GGR] was on graph properties. From this perspective the choice of representation became less obvious, and oracle access was viewed as allowing local inspection of the graph rather than being the graph itself. The distinction between objects and their representations became more clear when an alternative representation of graphs was studied in [GR1,GR2]. At this point, query complexity that is polynomially related to the size of the object (e.g., its square root) was no longer considered inhibiting. This shift in scale is discussed next.

Recall that initially property testing was viewed as referring to functions that are implicitly defined by some succinct programs (as in the context of program checking) or by ``transcendental'' entities (as in the context of PAC learning). From this perspective the yardstick for efficiency is being polynomial in the length of the query, which means being polylogarithmic in the size of the object. However, when viewing property testing as being applied to (huge) objects that may exist in explicit form in reality, it is evident that any sub-linear complexity may be beneficial.

The realization that property testing may mean any algorithm that does not inspect its entire input seems crucial to the study of testing distributions, which emerged with [BFRSW]. In general, property testing became identified as a study of a special type of sublinear-time algorithms.

Another consequence of the aforementioned shift in scale is the decoupling of the representation from the query types. In the context of graph properties, this culminated in the model of [KKR].

Nevertheless, the study of testing properties within query complexity that only depends on the proximity parameter (and is thus totally independent of the size of the object) remains an appealing and natural direction. A remarkable result in this direction is the characterization of graph properties that are testable within such complexity in the adjacency matrix model [AFNS].

- [AFNS] N. Alon, E. Fischer, I. Newman, and A. Shapira. A Combinatorial Characterization of the Testable Graph Properties: It's All About Regularity. In 38th STOC, pages 251-260, 2006.
- [BFRSW] T. Batu, L. Fortnow, R. Rubinfeld, W.D. Smith and P. White. Testing that Distributions are Close. In 41st FOCS, pages 259-269, 2000.
- [BLR] M. Blum, M. Luby and R. Rubinfeld. Self-Testing/Correcting with Applications to Numerical Problems. JCSS, Vol. 47, No. 3, pages 549-595, 1993. Extended abstract in 22nd STOC, 1990.
- [GGR] O. Goldreich, S. Goldwasser, and D. Ron. Property testing and its connection to learning and approximation. JACM, pages 653-750, July 1998. Extended abstract in 37th FOCS, 1996.
- [GR1] O. Goldreich and D. Ron. Property Testing in Bounded Degree Graphs. Algorithmica, Vol. 32 (2), pages 302-343, 2002. Extended abstract in 29th STOC, 1997.
- [GR2] O. Goldreich and D. Ron. A Sublinear Bipartitness Tester for Bounded Degree Graphs. Combinatorica, Vol. 19 (3), pages 335-373, 1999. Extended abstract in 30th STOC, 1998.
- [KKR] T. Kaufman, M. Krivelevich, and D. Ron. Tight Bounds for Testing Bipartiteness in General Graphs. In RANDOM'03, pages 341-353, 2003.
- [PRR] M. Parnas, D. Ron, and R. Rubinfeld: Tolerant Property Testing and Distance Approximation. JCSS, Vol. 72 (6), pages 1012-1042, 2006. Preliminary version in ECCC, 2004.
- [R1] D. Ron. Property Testing: A Learning Theory Perspective. Foundations and Trends in Machine Learning, Vol. 1 (3), pages 307-402, 2008.
- [R2] D. Ron. Algorithmic and Analysis Techniques in Property Testing. Foundations and Trends in TCS, to appear.
- [RS] R. Rubinfeld and M. Sudan. Robust Characterization of Polynomials with Applications to Program Testing. SICOMP, Vol. 25 (2), pages 252-271, 1996.

- Editorial material: Preface (including table of contents), a brief introduction to property testing, and comments on the workshop's program.
- Surveys
- Eli Ben-Sasson: Limiting the rate of locally testable codes
- Eric Blais: Testing juntas
- Artur Czumaj and Christian Sohler: Sublinear-time algorithms
- Oded Goldreich: Short Locally Testable Codes and Proofs - A Survey in Two Parts
- Oded Goldreich: Introduction to Testing Graph Properties
- Ilan Newman: a survey on property testing of massively parametrized problems
- Krzysztof Onak: Sublinear Graph Approximation Algorithms
- Sofya Raskhodnikova: Transitive-Closure Spanners with Applications to Monotonicity Testing
- Rocco Servedio: Testing by Implicit Learning
- Madhu Sudan: Invariance in Property testing

- Extended abstracts
- Noga Alon: On constant time approximation of invariants of bounded degree graphs
- Victor Chen: Testing Linear-Invariant Non-Linear Properties
- Victor Chen: A Hypergraph Dictatorship Test with Perfect Completeness
- Artur Czumaj: Testing monotone continuous distributions on high-dimensional real cubes
- Oded Goldreich: Algorithmic Aspects of Property Testing in the Dense Graphs Model.
- Prahladh Harsha: Composition of low-error 2-query PCPs using decodable PCPs
- Tali Kaufman: Symmetric LDPC codes and local testing
- Swastik Kopparty: Optimal Testing of Reed-Muller Codes
- Michael Krivelevich: Hierarchy Theorems for Property Testing.
- Michael Krivelevich: Comparing the strength of query types in property testing
- Kevin Matulef: Testing (subclasses of) Linear Threshold Functions
- Krzysztof Onak: External Sampling
- Krzysztof Onak: The Query Complexity of Edit Distance
- Ronitt Rubinfeld: Maintaining a large matching or a small vertex cover
- Michael Saks: Local Monotonicity Reconstruction
- Shubhangi Saraf: Some recent results on testing of sparse linear codes
- Asaf Shapira: Testing Linear Invariant Properties
- Christian Sohler: Testing Euclidean Spanners

- D. Ron, three tutorials. The two later tutorials provide different perspectives (and superseed the earlier one).

©Copyright 2010 by Oded Goldreich.

Permission to make copies of part or all of this work for personal
or classroom use is granted without fee provided that copies
are not made or distributed for profit or commercial
advantage and that new copies bear this notice and the full
citation on the first page. Abstracting with credit is permitted.

Back to Oded Goldreich's homepage.