What are Deep Learning Methods Really Learning?

It is not exactly clear what deep learning methods are really learning. Sure, they are highly effective and are learning something, but I’m still trying to get my head around exactly what they are learning.

Consider your run-of-the-mill deep neural network. “Learning” is nothing more than an optimization procedure. We are trying to produce an optimized mathematical formula that takes in a set of training examples and then can, as accurately as possible, map the inputs (i.e. attributes, features, etc.) of those examples to the outputs (i.e. class, target variable, etc.). We then use this formula to classify a new set of examples.

gradient_descent_png — Gradient Descent. Is this really all there is to learning?

At its core, deep learning is about input-process-output. It is not true learning in the sense of the word (the way we humans do). True learning entails understanding, and understanding is nonexistent during deep learning.

You can memorize a book, chapter-by-chapter, word-for-word; but that doesn’t mean you are learning. You still would not understand the plot. Similarly, in deep learning there is no understanding. Deep learning “memorizes” a mapping between inputs and outputs without any real understanding of the why behind those relationships. And in my view, the why is a huge part of learning. True learning (in the human sense of the word) without understanding is not learning. Perhaps then we should call deep learning something different? Deep optimization perhaps??? Guess that didn’t sound as marketable and sexy as deep learning.

If you look out in nature — the human brain or the brain of any living organism — nothing out there learns in a way that even remotely resembles backpropagation. Neural networks are about classification error, but real learning — the way humans learn — is deeper than that (pun intended).

A neural network, for example, has a completely different concept of what it is to be a dog. That concept could involve where certain groups of pixels are placed and may have nothing to do with the actual structure of the animal. Where a human would see legs, arms, torso, etc., a deep learning algorithm may abstract a completely different set of things. This has led a rise in adversarial attacks, where an attacker is able to determine what representation provides the highest probability for an image to be classified as anything and is able to insert noise that causes things to be misclassified.

Another point to consider is that neural networks generate something. It may be a relationship that we did not previously understand, but it may also just be nonsense that happens to work. The abstractions may result in some representational form that is at its most basic just complete nonsense. If there is no real understanding of what the abstractions that the algorithm makes, then it is hard to confirm that it’s actually doing anything.

The significance of the lack of understanding of what deep learning is, is yet to be seen; but here are just a few of the consequences if the hype gets unchecked:

Wasted resources as venture capitalists throw money at anything that has to do with deep learning.
Wasted resources as non-expert government agencies fund any research project that has the term “deep learning” in it.
A boat load of computer science graduates around the world that, all of a sudden, have found their “passion” in deep learning.
Disappointed companies as deep learning does not have the expected impact on their bottom line.
Another AI winter.

Remember, there is the marketing element in there too. Using anthropomorphic terms like machine “learning” and deep “learning” is a much better sell to a general audience than machine mathematical optimization or deep optimization. Researchers gotta sell their ideas too!

Bottom Line: Artificial intelligence is not yet intelligent, and deep learning is not yet deep (yay! we still have work to do!)…nor is it learning in the true sense of the word. Deep learning certainly will continue to have an enormous impact on the world, but there needs to be more awareness and discussion of not just the enormous potential of deep learning but also its limitations so non-technical stakeholders can make more informed decisions.