For properties of functions that ranges over the reals, it very natural to consider distance between functions as the $L_1$-distance (rather than the $L_0$/Hamming distance). Indeed, for Boolean functions both measures collide, and for non-structured ranges Hamming distance may be more natural, but for ranges equipts with a natural distance measure the current direction is indeed begging. The focus of the paper is the $L_1$-distance, and other $L_p$-distances are studied as a generalization.

The focus of the paper is on functions from the hypergrid to the reals, and on properties such as Monotonicity and Lipschitz. In the case of Monotonicity it is shown (see Lem. 2.2) that any nonadaptive one-sided error tester for Monotonicity of Boolean functions performs as well as a $L_1$-tester for Monotonicity of functions that range in the interval $[0,1]$. For Lipschitz, it is shown that the $L_1$-distance is easier than the Hamming distance in the following sense (see Table 1.2): There exists an $L_1$-tester of functions that range in the interval $[0,r]$, where $r>1$, that perform better than any corresponding tester that works with the Hamming distance.

We initiate a systematic study of sublinear algorithms for approximately testing properties of real-valued data with respect to $L_p$ distances for $p = 1, 2$. Such algorithms distinguish datasets which either have (or are close to having) a certain property from datasets which are far from having it with respect to $L_p$ distance. For applications involving noisy real-valued data, using $L_p$ distances allows algorithms to withstand noise of bounded $L_p$ norm. While the classical property testing framework developed with respect to Hamming distance has been studied extensively, testing with respect to $L_p$ distances has received little attention.

We use our framework to design simple and fast algorithms for classic problems, such as testing monotonicity, convexity and the Lipschitz property, and also distance approximation to monotonicity. In particular, for functions over the hypergrid domains $[n]^d$, the complexity of our algorithms for all these properties does not depend on the linear dimension $n$. This is impossible in the standard model. Most of our algorithms require minimal assumptions on the choice of sampled data: either uniform or easily samplable random queries suffice. We also show connections between the $L_p$-testing model and the standard framework of property testing with respect to Hamming distance. Some of our results improve existing bounds for Hamming distance.

Appeared in STOC 2014, pages 164-173. See also full version.

Back to list of Oded's choices.