The universal approximation theorem just says that you can construct a large enough single-hidden-layer NN to approximate any (continuous) function, essentially by making it a giant lookup table. It says nothing about fitting functions efficiently, i.e. generalizing from little data or using fewer parameters.
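To make the lookup-table reading concrete, here's a rough sketch (function names and data points are made up for illustration, not from any particular paper): one hidden ReLU unit per data point, with the weights hard-wired rather than trained, exactly reproduces any finite set of 1-D points.

```python
import numpy as np

def build_memorizer(xs, ys):
    """One-hidden-layer ReLU net, hard-wired (no training) to pass exactly
    through the points (xs, ys) via piecewise-linear interpolation."""
    order = np.argsort(xs)
    xs, ys = np.asarray(xs, float)[order], np.asarray(ys, float)[order]
    slopes = np.diff(ys) / np.diff(xs)       # slope of each segment
    coeffs = np.diff(slopes, prepend=0.0)    # change in slope at each knot
    def net(x):
        # hidden layer: one ReLU per data point; output layer: fixed linear weights
        hidden = np.maximum(0.0, np.subtract.outer(x, xs[:-1]))
        return ys[0] + hidden @ coeffs
    return net

xs = [0.0, 1.0, 2.5, 4.0]
ys = [1.0, -2.0, 0.5, 3.0]
net = build_memorizer(xs, ys)
print(net(np.array(xs)))  # reproduces ys exactly: the net is a lookup table
```

It hits every training point perfectly, but that's memorization, not generalization, and the hidden layer grows with the size of the dataset.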
In order to fit functions efficiently, you do need to use multiple layers. And the same is true for digital circuits, which NNs basically are. I'm sure there is mathematical theory and literature on the representational power of digital circuits.
There is a limit to what you can compute with only one layer of circuits, and you can compute more functions, more efficiently, with more layers. That is, taking the results of some operations, then doing more operations on those results: composing functions, as opposed to just memorizing a lookup table, which is inefficient.
That's why multiple layers work better. It isn't some strange mystery.
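A toy illustration of the composition point, using n-bit parity (the standard example from circuit complexity, not something claimed in this thread): composed layer by layer it needs about n two-input XORs, while a flat two-level AND/OR circuit (a lookup table over the odd-weight inputs) needs roughly 2^(n-1) AND terms. Function names below are just for the sketch.

```python
from functools import reduce
from itertools import product
from operator import xor

def parity_composed(bits):
    # "deep" version: a chain of two-input XORs, one more gate per extra input
    return reduce(xor, bits)

def parity_flat(bits):
    # "shallow" version: OR over one AND-term per odd-weight input pattern
    n = len(bits)
    odd_patterns = [p for p in product([0, 1], repeat=n) if sum(p) % 2 == 1]
    return int(any(all(b == q for b, q in zip(bits, p)) for p in odd_patterns))

for bits in product([0, 1], repeat=4):
    assert parity_composed(bits) == parity_flat(bits)

# The flat circuit already has 2**(4-1) = 8 AND terms at n = 4 and doubles
# with every extra input; the composed one just adds one more XOR.
```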
>an infinite number of layers and convolutions makes no sense - I think you meant to use the word arbitrary
A better way to word it would be "as it approaches infinity" or "in the limit" or something. That is, the accuracy of the neural net should only ever increase as you add layers and units (provided you have proper regularization/priors), since bigger models can emulate smaller models, but not vice versa.
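A quick sketch of the "bigger can emulate smaller" claim (random weights, purely illustrative): insert an extra hidden layer wired as the identity into a ReLU net, and the deeper net computes exactly the same function, so added capacity never has to hurt the best achievable fit.

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(0.0, z)

W1, b1 = rng.normal(size=(8, 3)), rng.normal(size=8)  # small net: 3 -> 8
W2, b2 = rng.normal(size=(2, 8)), rng.normal(size=2)  # small net: 8 -> 2

def small_net(x):
    return W2 @ relu(W1 @ x + b1) + b2

def deeper_net(x):
    h = relu(W1 @ x + b1)
    # extra layer: identity weights, zero bias; relu(h) == h because h >= 0
    h = relu(np.eye(8) @ h)
    return W2 @ h + b2

x = rng.normal(size=3)
print(np.allclose(small_net(x), deeper_net(x)))  # True: same function, one more layer
```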
Yes, in order to generalize better you need deeper nets. That was my whole point. But how deep? And what hyperparameters for each layer? Grad students just pull those numbers from intuition. And it goes without saying that an infinitely deep net (whatever that means) would not generalize on little data, and would only get harder to train the deeper it gets. If it means what I think it means, you're basically claiming that recurrent neural nets can easily represent anything, but RNNs exist today, and they don't do the magic you're claiming they do.
The forward pass of a net is not theoretically interesting. It's the training of the net that has no theory. The training has nothing to do with digital circuits.
You've handwaved some (perfectly fine) ideas about composing functions and such, and then claimed "it isn't some strange mystery." That's my point: you've argued those ideas from intuition. There is little theoretical rigor around this, however.