ReLU networks and polytopes

by Amir Yehudayoff (based on joint works with Bakaev, Brunck, Hertrich and Stade)

Oded's comments

Being far from the area, I was happy to be educated about its complexity-theoretic aspects.

We consider neural networks (NNs) with linear and ReLU gates mapping $\R^n$ to $\R$, where $\relu:\R\to\R$ is defined as $\relu(z) = \max(z,0)$ (and the linear gates compute linear functions mapping $\R^m$ to $\R$, for arbitrary $m$). (One often considers affine rather than linear gates, but the results hold in either case.)
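
For concreteness, here is a minimal Python sketch of this model (an illustration, not taken from the talk): a network alternates linear maps with coordinatewise ReLU and ends with a linear output gate. The example uses one hidden layer and only linear (bias-free) gates, and computes $f(x_1,x_2) = |x_1-x_2|$ exactly via $|x_1-x_2| = \relu(x_1-x_2) + \relu(x_2-x_1)$.

    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)

    def eval_relu_net(layers, x):
        """Evaluate a ReLU network given as a list of weight matrices:
        each hidden layer is a linear map followed by coordinatewise ReLU,
        and the last matrix is a purely linear output gate."""
        for W in layers[:-1]:
            x = relu(W @ x)
        return layers[-1] @ x

    # a one-hidden-layer network computing f(x1,x2) = |x1 - x2|,
    # using only linear (bias-free) gates:  |x1-x2| = relu(x1-x2) + relu(x2-x1)
    W1 = np.array([[1.0, -1.0],
                   [-1.0, 1.0]])
    W2 = np.array([[1.0, 1.0]])

    for x in np.random.default_rng(0).normal(size=(5, 2)):
        assert np.isclose(eval_relu_net([W1, W2], x)[0], abs(x[0] - x[1]))
    print("|x1 - x2| is computed exactly by a one-hidden-layer ReLU network")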

Let $C(n)$ denote the class of $n$-variate functions computable by such NNs, let $D(f)$ denote the minimum depth of a NN computing $f$, and let $D(n)$ denote the maximum value of $D(f)$ taken over all functions $f$ in $C(n)$. Prior known results include the general upper bound $D(n) \leq \lceil\log_2(n+1)\rceil$.
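
The special case of the maximum function indicates where the $\log_2$ comes from: the two-input gadget $\max(a,b) = \relu(a-b) + \relu(b) - \relu(-b)$ uses a single hidden layer, and a binary tournament of such gadgets computes the maximum of $n$ numbers with one hidden layer per round, that is, roughly $\log_2 n$ hidden layers. The following Python sketch (an illustration only; the general upper bound for all of $C(n)$ requires more work) counts the rounds.

    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)

    def max2(a, b):
        # exact one-hidden-layer gadget: max(a,b) = relu(a-b) + relu(b) - relu(-b)
        return relu(a - b) + relu(b) - relu(-b)

    def tournament_max(values):
        """Maximum of a list via a binary tournament of max2 gadgets.
        Each round is realizable as one hidden ReLU layer (an unpaired value can be
        passed through a ReLU layer as relu(v) - relu(-v)), so the number of rounds
        bounds the number of hidden layers used."""
        vals, rounds = list(values), 0
        while len(vals) > 1:
            nxt = [max2(vals[i], vals[i + 1]) for i in range(0, len(vals) - 1, 2)]
            if len(vals) % 2 == 1:
                nxt.append(vals[-1])
            vals, rounds = nxt, rounds + 1
        return vals[0], rounds

    x = np.random.default_rng(1).normal(size=13)
    val, rounds = tournament_max(x)
    assert np.isclose(val, x.max()) and rounds == int(np.ceil(np.log2(len(x))))
    print("max of", len(x), "numbers via", rounds, "rounds of pairwise maxima")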

It was conjectured that this upper bound is tight in general. The new work shows that this conjecture is false; in fact, $D(n) \leq \log_3(n+1)$.

Interestingly, the new upper bound is achieved by NNs whose weights are decimal fractions (i.e., rationals whose denominator is a power of ten), whereas it is known that if the NN uses only such weights, then $\log_3 n$ is a lower bound; hence, the new upper bound is essentially tight for such NNs.

In contrast, it is known that any function in $C(n)$ can be approximated, to within any desired accuracy, by a depth-two NN.
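
As a numerical illustration of this (not the construction behind the theorem), one can fix a one-hidden-layer network with randomly chosen hidden units and fit only its linear output gate to a target such as $\max(x_1,x_2,x_3)$; the width, sample sizes, and target in the Python sketch below are arbitrary choices made for the illustration.

    import numpy as np

    rng = np.random.default_rng(2)
    n, width, m = 3, 800, 4000

    # random one-hidden-layer features relu(w.x + b)
    # (affine hidden units, i.e., with biases, as in the parenthetical remark above)
    W = rng.normal(size=(width, n))
    b = rng.uniform(-1.0, 1.0, size=width)

    X = rng.uniform(-1.0, 1.0, size=(m, n))
    y = X.max(axis=1)                          # target: max(x1, x2, x3)

    H = np.maximum(X @ W.T + b, 0.0)           # hidden activations, shape (m, width)
    c, *_ = np.linalg.lstsq(H, y, rcond=None)  # fit the linear output gate

    X_test = rng.uniform(-1.0, 1.0, size=(2000, n))
    H_test = np.maximum(X_test @ W.T + b, 0.0)
    err = np.abs(H_test @ c - X_test.max(axis=1))
    print("max abs error of the depth-two approximation:", err.max())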

Original abstract

ReLU networks are a standard example of neural networks. We will discuss the expressivity of neural networks, focusing on their depth. The plan is to start with a general introduction and a survey of basic results, including the duality between ReLU networks and a model for constructing polytopes. We will then discuss polytopes and how their properties can help us understand the depth of neural networks.
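
Presumably the duality in question is the classical correspondence between polytopes and their support functions, under which positively homogeneous convex piecewise-linear functions correspond to polytopes, addition of functions corresponds to Minkowski sum, and taking a maximum (in particular, applying ReLU, i.e., a maximum with 0) corresponds to the convex hull of a union. The following Python sketch (an added illustration, with arbitrary point sets) checks these identities numerically.

    import numpy as np

    def support(V, x):
        """Support function h_P(x) = max_{v in P} <v, x> of the polytope P = conv(V),
        where V is an array whose rows span the candidate vertices."""
        return (V @ x).max()

    def minkowski_sum(V, W):
        # vertex candidates of the Minkowski sum: all pairwise sums
        return np.array([v + w for v in V for w in W])

    rng = np.random.default_rng(3)
    P = rng.normal(size=(4, 2))   # some points in the plane
    Q = rng.normal(size=(5, 2))
    O = np.zeros((1, 2))          # the origin, as a one-point "polytope"

    for _ in range(1000):
        x = rng.normal(size=2)
        # addition of support functions  <->  Minkowski sum of polytopes
        assert np.isclose(support(P, x) + support(Q, x),
                          support(minkowski_sum(P, Q), x))
        # max with 0 (i.e., ReLU)  <->  convex hull of the union with the origin
        assert np.isclose(max(support(P, x), 0.0), support(np.vstack([P, O]), x))
        # e.g., max(0, x1, x2) is the support function of the triangle conv{0, e1, e2}
        assert np.isclose(max(0.0, x[0], x[1]), support(np.vstack([O, np.eye(2)]), x))
    print("support-function identities verified on random directions")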


Back to list of Oded's choices.