The Limitations of Deep Learning

Despite the explosion in the popularity of deep learning, we are still a long way from the kind of artificial general intelligence that is the long-term goal of computer science. Let us take a look at the limitations of deep learning. I’m going to pull from a paper on this topic written by Professor Gary Marcus of New York University, entitled Deep Learning: A Critical Appraisal.

Deep Learning Methods Require Lots and Lots of Data

[Image: a child playing with a horseshoe crab]

Humans do not need hundreds, thousands, or even millions of examples of horseshoe crabs in order to learn what a horseshoe crab is. In fact, a three-year-old child could look at one example of a horseshoe crab, be told it is a horseshoe crab, and immediately identify other horseshoe crabs, even if those new horseshoe crabs do not look exactly like that first example.

Deep learning models, on the other hand, need truckloads of examples of horseshoe crabs in order to distinguish horseshoe crabs from, say, spiders or other similar-looking creatures.

Deep Learning Is Not Deep

[Image: shallow water]

The word “deep” in deep learning just refers to the structure of the mathematical model built during the training phase of the algorithm. The learning is not really deep, in the true sense of the word. A deep learning algorithm does not understand the why behind its output. 

Deep Learning Does Not Quickly Adapt When Things Change

[Image: a blue school bus]

Deep learning works well when test data looks like training data. However, it cannot easily cope with novel situations, such as when the domain changes from the one that it was trained on.

For example, if a deep neural network learns that school buses are always yellow but, all of a sudden, school buses become blue, deep learning models will need to be retrained. A five-year-old would have no problem recognizing the vehicle as a blue school bus.
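To make this concrete, here is a deliberately tiny sketch (all data invented by me for illustration, not taken from the paper) of a nearest-centroid color classifier that has only ever seen yellow school buses and red fire trucks. Shown something outside its training distribution, a blue school bus, it confidently picks the wrong label:

```python
import numpy as np

# Hypothetical mean RGB colors for the training examples (made-up numbers).
yellow_buses = np.array([[230, 200, 40], [240, 210, 50], [225, 195, 45]])
fire_trucks = np.array([[200, 30, 25], [210, 40, 30], [195, 35, 20]])

centroids = {
    "school_bus": yellow_buses.mean(axis=0),
    "fire_truck": fire_trucks.mean(axis=0),
}

def classify(rgb):
    # Pick whichever training centroid is closest in color space.
    return min(centroids, key=lambda label: np.linalg.norm(rgb - centroids[label]))

blue_bus = np.array([40, 60, 220])  # a blue school bus: outside the training data
print(classify(blue_bus))           # prints "fire_truck" -- color alone misleads it
```

A real deep network is vastly more sophisticated than this toy, but the failure mode is the same: it can only interpolate within the statistics of the data it was trained on.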

Deep Learning Does Not Do Well With Inference


Deep learning algorithms cannot easily tell the difference between sentences like “John promised Mary to leave” and “John promised to leave Mary.”
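There is a simple way to see part of the problem. The two sentences contain exactly the same words, so any model that ignores word order (a bag-of-words representation, to pick one common example; this illustration is mine, not Marcus’s) literally cannot tell them apart:

```python
from collections import Counter

# Identical word counts, opposite meanings: an order-blind representation
# sees no difference at all between the two sentences.
a = Counter("john promised mary to leave".split())
b = Counter("john promised to leave mary".split())
print(a == b)  # True
```

Sequence models do see word order, of course, but mapping that order onto who-did-what-to-whom is inference, and that is exactly where deep learning still struggles.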

Deep Learning Is Not Transparent

[Image: a black box]

Part of the confusion surrounding deep learning is that it is difficult for the general public to understand. For example, building the neural network that forms the learning model of a deep learning program involves many formulas and repeated recalculations.

These neural networks could also be biased depending on how the programmer sets up the formulas, for example, by weighting some features more than others instead of considering the collaborative effect of all the features.

It also does not help that the layers of the neural network are called “hidden layers,” which makes deep learning and its methodologies sound even more mysterious.

Also, the mathematics... ahh, the mathematics. Try explaining the whole sigmoid and gradient descent thing to a child. Better yet, imagine a doctor explaining to a patient suffering from cancer that the cancer diagnosis was determined based on the output of an artificial feedforward neural network trained with backpropagation. How would a patient respond to this? How would a health insurance company respond?

Neural networks are effectively black boxes that look like magic to the untrained eye, containing thousands, if not millions, of parameters. In fields like medicine and finance, humans want to know exactly (and simply) why a particular decision was made. You cannot just say, “because my deep neural network said so.”
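To get a feel for the scale of the problem, here is a short sketch (the architecture is arbitrary, chosen only for illustration) that counts the parameters of a small fully connected network in PyTorch:

```python
import torch.nn as nn

# A modest three-layer network for 784-dimensional inputs (e.g., 28x28 images).
model = nn.Sequential(
    nn.Linear(784, 512), nn.ReLU(),
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 10),
)

# Over half a million learned weights, none of which has an individually
# human-readable meaning.
print(sum(p.numel() for p in model.parameters()))  # 535818
```

Explaining a decision in terms of half a million interacting numbers is a very different task from explaining a rule.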

Deep Learning Cannot Take Full Advantage of the Five Basic Senses

[Image: The Five Senses: Look, Sound, Smell, Taste, Feel]

Humans have five basic senses: vision, hearing, smell, taste, and touch. When I learn what a rooster is, I can see it, hear it, smell it, taste it, and touch it. Deep learning at this stage focuses mainly on the vision part and lacks the other four senses. Those other senses can be important.

For example, imagine a driverless car. A human could hear a train coming long before seeing it, and the driver would then stop. A driverless car, on the other hand, would need to see the train first before making the decision to stop.

Humans use all five senses to learn, and these senses play an important role in building a bigger picture of an object: understanding what makes a rooster a rooster, or a train, a train.

Deep Learning Is Not Yet Able to Know Something With 100% Certainty

[Image: a trash can and a recycle bin]

I could look at a trash can inside my kitchen, know it is a trash can, and tell you with 100% certainty that it is a trash can. A deep neural network, on the other hand, no matter how many parameters it has or how much data it has been trained on, works in the world of probabilities and numbers. It will never have 100% confidence that the trash can is a trash can.

A trained deep neural network, for example, might be 96.3% sure it is a trash can but not 100% sure. There is always that minuscule probability it could be something else, like, say, a tree stump or a recycle bin.

I’m in front of my laptop now. It is 100% my laptop. A deep learning algorithm, on the other hand, might say it is 99.3% sure that the object currently in front of me is my laptop. It would need a human to validate that the object is, in fact, a laptop.
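The reason is mechanical. A typical classifier ends in a softmax layer, which turns raw scores into probabilities; because the exponential function is always positive, every class always gets some nonzero share, so the top class can approach, but never reach, 100%. A minimal sketch (the scores are made up):

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax: shift, exponentiate, normalize.
    z = np.exp(logits - logits.max())
    return z / z.sum()

# Hypothetical raw scores for three classes.
scores = np.array([8.2, 3.1, 1.4])  # trash can, recycle bin, tree stump
print(softmax(scores))              # ~[0.993, 0.006, 0.001] -- never exactly 1.0
```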

This limitation has to do with the way a neural network algorithm “learns.”

The reality is that a program does not look at a dog and intuitively know that it is a dog the way a human does. Instead, deep learning is more like a statistical analysis of patterns observed in the sample data points. So a deep learning program might identify an image as a dog based only on the shapes in the image and the fact that, statistically, those shapes usually mean the image is a dog.

But the fact is that the deep learning program cannot be 100% sure an image is a dog; there could be, for example, a weird picture of a fox or a bear that looks a lot like a dog. Due to this uncertainty (the fact that the deep learning algorithm cannot be 100% sure), it is difficult for humans to trust deep learning, especially when it is applied to critical, possibly life-threatening applications.

For example, if a deep learning algorithm cannot always correctly identify that an object is an obstacle when processing the images from a driverless car, then it would be hard for people to trust this program to drive for us.

Furthermore, there may be an inherent bias in how the deep learning algorithm identifies patterns. For example, if a programmer decided that a road is definable only by the lines separating its lanes, then the algorithm may be biased toward using lines to identify roads. This would mean the algorithm fails to recognize roads that don’t have clear lane markings, or even dirt roads. Such a bias means that a deep learning program could miss important identifying features/patterns.

[Image: a dirt road]
Is this a road or a walking path to a beach? Humans would easily be able to recognize this as a road due to the tire tracks. Computers in driverless cars would need to have seen something similar to this in order to recognize it as a road.

In short, it is difficult for a deep learning algorithm to account for every possible example and every variation, which means that it would not be 100% correct.     

Significance of the Misunderstanding of Deep Learning

In my previous post on deep learning, I wrote about some of the potential fallout that could occur due to the misunderstanding of what deep learning is really learning. I mentioned some possible significant consequences:

  • Wasted resources as venture capitalists throw money at anything that has to do with deep learning. 
  • Wasted resources as non-expert government agencies fund any research project that has the term “deep learning” in it.
  • A boatload of computer science graduates around the world that, all of a sudden, have found their “passion” in deep learning.
  • Disappointed companies as deep learning does not have the expected impact on their bottom line.
  • Another AI winter.

Let’s look at the last bullet point. Rodney Brooks, former MIT professor and co-founder of iRobot and Rethink Robotics, predicts that we will enter a new AI winter in 2020.

An AI winter is a period of reduced funding and interest in artificial intelligence research that comes at the tail end of an AI hype cycle. Each AI hype cycle begins with some major breakthrough. Then, for the next 5-10 years after that breakthrough, all sorts of papers get written on AI, companies doing “X + [insert some hot new AI technology]” get funded, computer science students around the world change their career paths, and the media goes into a feeding frenzy about how the new breakthrough will change the world.

Executives at big companies around the world then shout out quotations like this: “AI is more profound than … electricity or fire” – Sundar Pichai (CEO of Google).

Experts in AI then chime in, “This time is different!” 

When you hear comments like this, ask yourself, “What are they selling?”

Deep learning is a tool like any other tool… like a wrench for a car mechanic or a serrated knife for a master chef. It can currently solve some specific problems really well, and others not so well.

Machine learning, the field that encompasses deep learning, is about automating the process of finding relationships based on empirical data. It is a powerful tool that has an enormous amount of potential, but it is not a panacea and is still a long way away from replacing the human brain. 

I do agree with Mr. Pichai that when we have true artificial general intelligence, such a breakthrough will be as profound as electricity or fire. We are not there yet. Much more work needs to be done (and that is a great thing for us scientists and engineers). The future is bright.

How Can We Help Others Gain a Better Understanding of What These Models Are Learning?

The example that is often marketed to explain deep learning is a neural network that first takes the inputs and learns lines, curves, and other shapes. Each successive layer abstracts and combines the data more and more until we see letters and fully formed images. This style of explanation seems like a good starting point for understanding what exactly these algorithms learn.

I think that one reason people might not trust deep learning is that they don’t understand how it works, and even when they do, we cannot see inside the hidden layers of the neural network. When we look at the neural network behind a deep learning model, we see multiple layers, including the hidden layers of neurons that sit between the input and the output.

[Image: Neural Network]

With deep learning, we allow the program to learn and distinguish the key features from our sample data set. The problem is that we may not easily be able to understand which features it has distinguished, or how they relate to each other, as defined by the hidden layers in the algorithm’s learning model.

I think that one strategy for improving, or at least understanding, what deep learning is doing is to unpack these abstracted layers in order to hand-tune the results into something that is more relevant.

What are Deep Learning Methods Really Learning?

It is not exactly clear what deep learning methods are really learning. Sure, they are highly effective and are learning something, but I’m still trying to get my head around exactly what they are learning.

Consider your run-of-the-mill deep neural network. “Learning” is nothing more than an optimization procedure. We are trying to produce an optimized mathematical formula that takes in a set of training examples and then can, as accurately as possible, map the inputs (i.e. attributes, features, etc.) of those examples to the outputs (i.e. class, target variable, etc.). We then use this formula to classify a new set of examples.
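Here is what that optimization procedure looks like when stripped to its bare bones: a minimal sketch (with synthetic data) that “learns” a linear mapping from inputs to outputs by gradient descent on the squared error:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))        # 100 training examples, 3 input features
true_w = np.array([2.0, -1.0, 0.5])  # the mapping we want the model to recover
y = X @ true_w                       # outputs (targets)

w = np.zeros(3)                      # initial guess at the weights
for _ in range(200):
    grad = 2 * X.T @ (X @ w - y) / len(X)  # gradient of mean squared error
    w -= 0.1 * grad                        # step downhill; this IS the "learning"

print(w)  # converges toward [2.0, -1.0, 0.5]
```

Everything a deep network does during training is a scaled-up version of that loop: adjust numbers to reduce an error measure. Nowhere in the loop is there a step marked “understand.”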

[Image: Gradient Descent. Is this really all there is to learning?]

At its core, deep learning is about input-process-output. It is not learning in the true sense of the word (the way we humans learn). True learning entails understanding, and understanding is nonexistent in deep learning.

You can memorize a book, chapter by chapter, word for word; but that doesn’t mean you are learning. You still would not understand the plot. Similarly, in deep learning there is no understanding. Deep learning “memorizes” a mapping between inputs and outputs without any real understanding of the why behind those relationships. And in my view, the why is a huge part of learning. Learning without understanding (in the human sense of the word) is not learning. Perhaps then we should call deep learning something different? Deep optimization, perhaps? Guess that doesn’t sound as marketable and sexy as deep learning.

If you look out in nature, at the human brain or the brain of any living organism, nothing learns in a way that even remotely resembles backpropagation. Neural networks are about minimizing classification error, but real learning, the way humans learn, is deeper than that (pun intended).

A neural network, for example, has a completely different concept of what it is to be a dog. That concept may involve where certain groups of pixels are placed and may have nothing to do with the actual structure of the animal. Where a human would see legs, arms, and a torso, a deep learning algorithm may abstract a completely different set of things. This has led to a rise in adversarial attacks, in which an attacker works out which input representation yields the highest probability of being classified as a chosen label and then inserts noise that causes the image to be misclassified.
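The fast gradient sign method (FGSM) is the classic example of such an attack. Here is a hedged PyTorch sketch; `model` stands for any differentiable classifier and `image` for any correctly classified, batched input, both placeholders of mine:

```python
import torch.nn.functional as F

def fgsm_perturb(model, image, true_label, epsilon=0.01):
    # Nudge every pixel slightly in the direction that *increases* the loss.
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    return (image + epsilon * image.grad.sign()).detach()

# The perturbed image often looks identical to a human but is confidently
# misclassified, because the model's "concept" of the class lives in pixel
# statistics rather than in the structure of the object.
```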

Another point to consider is that neural networks generate something. It may be a relationship that we did not previously understand, but it may also be nonsense that happens to work. The abstractions may result in a representational form that is, at bottom, complete nonsense. If there is no real understanding of the abstractions the algorithm makes, then it is hard to confirm that it is actually doing anything meaningful.

[Image: A Basic Neural Network]

The significance of this lack of understanding is yet to be seen, but if the hype goes unchecked, the consequences are the same ones I listed earlier: wasted resources, bandwagon career changes, disappointed companies, and another AI winter.

Remember, there is the marketing element in there too. Using anthropomorphic terms like machine “learning” and deep “learning” is a much better sell to a general audience than machine mathematical optimization or deep optimization. Researchers gotta sell their ideas too!

Bottom Line: Artificial intelligence is not yet intelligent, and deep learning is not yet deep (yay! we still have work to do!)… nor is it learning in the true sense of the word. Deep learning certainly will continue to have an enormous impact on the world, but there needs to be more awareness and discussion of not just its enormous potential but also its limitations, so that non-technical stakeholders can make more informed decisions.