Triangular Rank and Dimension Reduction for $L_1$ Metrics

by Ilan Newman and Yuri Rabinovich

Oded's comments

In the abstract below, it may be more "in line" with the VC-literature to think of $F$ as a class of Boolean functions over $U$. We say that a distribution $P$ is faithfully represented by a distribution $P'$ if for every $f\in F$ the probability that $f$ evaluates to 1 under $P'$ is close to the probability this happens under $P$, where close may be interpreted as having small additive error or having small multiplicative error. While prior work focuses on additive error appropixation, the current one focuses on multiplicative error approximation.

The question is how large should be the support size for such a $P'$, and in the case of additive error this is closely related to the VC-dimension of $F$, denoted $d(F)$; size $s=\tildeO(1/\epsilon^2)\cdot d(F)$ suffices (for additive approximation of $\epsilon$). In fact, $P'$ may be uniform over a set of size $s$, and a sample of $s$ elements selected according to $P$ will do (with high probability over the selection of this sample).

The current work identifies a parameter, denoted $r(F)$, which is closely related to the support size that is required for a multiplicative error approximation (by factor of $1\pm\epsilon$): size $\tildeO(1/\epsilon^2)\cdot r(F)^2$ suffices, whereas $r(F)$ is a definite lower bound. (While $r(F) \geq d(F)$, an example is shown where $d(F)=1$ and $r(F)=|U|$.)

It seems that the *exact* dependence of the sample size (i.e., support size) on the proximity parameter (i.e., $\epsilon$) is not known also in the case of additive error, but in the current work also the relation to the "size" parameter (i.e., $r(F)$) is not known.

The original abstract

It is well known that if a family $ F$ of sets over a universe $U$ has a $VC$-dimension $d$, then for every probability distribution $P$ over $U$ there exists a sample of $U$ of size $d \cdot {\rm poly}(1/\epsilon)$ that faithfully represents $P$ on $F$ up to an $\epsilon$ additive error.

We ask whether there exists a property of $F$ analogous to $VC$-dimension which would ensure the existence of a small sample faithfully representing $P$ on $F$ up to $(1+ \epsilon)$ {\bf multiplicative} error. The answer turns out to be positive, and the key parameter is the {\bf triangular rank} of $F$. In particular, we show that if the triangular rank of $F$ is $r$, then there exists a sample of size $\min\{ r \log|F|, r^2\} \cdot {\rm poly}(1/\epsilon)$ that faithfully represents $P$ on $F$ up to $(1+ \epsilon)$ multiplicative factor.

To demonstrate the usefulness of such approximations, we consider the well-studied problem of dimension-reduction for $L_1$ metrics, and show that any $L_1$ metric on $n$ points can be $ (1+ \epsilon)$ approximated by a sum of $n \cdot {\rm poly}(1/\epsilon)$ cut-metrics.

Back to list of Oded's choices.