2 Comments
Yexi:

Great read! The idea that simplicity scales better than complexity applies deeply to machine learning. Classical ML relied on strict theoretical guarantees, but deep learning thrives by embracing emergent simplicity—often defying traditional assumptions:

* PAC Learnability: Classical generalization bounds are often too loose to say anything useful about deep learning, yet large overparameterized networks generalize well, showing that rigid theory doesn't always scale (a back-of-the-envelope check follows this list).

* Curse of Dimensionality: Classical ML feared high-dimensional inputs, but deep networks such as CNNs and transformers exploit the structure in the data rather than suffering from its dimensionality.

* One Hidden Layer vs. Depth: A network with a single hidden layer can approximate any continuous function, but for some targets it needs exponentially many neurons. Depth enables efficient, compositional representations that scale far better than a shallow but massive model (see the sketch after this list).
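
On the first point, here is a back-of-the-envelope check (my own illustration, with rough assumed numbers) of why a classic uniform-convergence bound of the form gap ≲ sqrt(d/n) becomes vacuous at modern scale: once the capacity proxy d (here, crudely, the parameter count) exceeds the training-set size n, the bound exceeds 1, which says nothing about a loss that lives in [0, 1].

```python
import math

# Back-of-the-envelope check of a classic uniform-convergence bound,
# gap <= sqrt(d / n) up to log factors, with illustrative (assumed) numbers.
d = 25_000_000   # rough parameter count of a ResNet-50-class model,
                 # used here as a crude capacity proxy
n = 1_280_000    # approximate ImageNet-1k training set size

bound = math.sqrt(d / n)
print(f"generalization-gap bound ~ {bound:.1f}")  # ~4.4 >> 1: vacuous for a [0, 1] loss
```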
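
And to make the depth-vs-width point concrete, below is a minimal NumPy sketch (my own illustration; the function names are invented for this example) of the well-known tent-map construction: each layer of two ReLU units doubles the number of linear pieces, so k layers produce 2^k pieces with only 2k units, while a single hidden layer needs roughly 2^k units, since each hidden unit adds at most one breakpoint.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def triangle(x):
    # One "layer": two ReLU units compute the tent map on [0, 1].
    # 2*relu(x) - 4*relu(x - 0.5)  ->  2x on [0, 0.5], 2 - 2x on [0.5, 1]
    return 2 * relu(x) - 4 * relu(x - 0.5)

def deep_sawtooth(x, depth):
    # Composing the tent map `depth` times uses only 2*depth ReLU units,
    # yet yields a piecewise-linear function with 2**depth linear pieces.
    for _ in range(depth):
        x = triangle(x)
    return x

def count_linear_pieces(f, n=200_001):
    # Rough numerical count of linear pieces via sign changes in the slope.
    xs = np.linspace(0.0, 1.0, n)
    slopes = np.diff(f(xs)) / np.diff(xs)
    return 1 + int(np.sum(np.abs(np.diff(np.sign(slopes))) > 0))

for depth in range(1, 7):
    pieces = count_linear_pieces(lambda x: deep_sawtooth(x, depth))
    # A one-hidden-layer ReLU net needs at least (pieces - 1) hidden units
    # to match, because each unit contributes at most one breakpoint.
    print(f"depth={depth}: {2*depth:2d} ReLU units, {pieces:3d} linear pieces "
          f"(a shallow net would need >= {pieces - 1} hidden units)")
```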

Deep learning succeeds not by eliminating complexity but by harnessing it efficiently, much like how simple yet scalable systems thrive in other domains. Would love to hear your thoughts on this parallel!

Forest:

I think the cultural shift from theoretical and empirical science to experimental science (Lesson 1) definitely helped.

But to be more precise, I think it is the right simplicity that works, and discovering the right simplicity required a lot of human insight and creativity. DL wouldn't have taken off without the convolutional architectures and non-sigmoid activations that humans discovered. (Lesson 3)

And yes, other fields, especially mathematics, offer plenty of such examples too.
