The Difference Between Generative and Discriminative Classifiers

In this post, I will explain the difference between generative classifiers and discriminative classifiers.

Suppose we want to predict a class H (the hypothesis) from a set of attributes E (the evidence). The goal of classification is to build a model from training examples of E and H that can predict the class H for a new, unseen set of attributes E. The two classifier types, generative and discriminative, go about this process differently.

Generative Classifiers

Classification algorithms such as Naïve Bayes are known as generative classifiers. Generative classifiers take in training data and create probability estimates. Specifically, they estimate the following:

P(H): The probability of the hypothesis (e.g. spam or not spam). This value is the class prior probability (e.g. probability an e-mail is spam before taking any evidence into account).
P(E|H): The probability of the evidence given the hypothesis (e.g. probability an e-mail contains the phrase “Buy Now” given that an e-mail is spam). This value is known as the likelihood.

Once the probability estimates above have been computed, the model uses Bayes' Rule to make predictions, choosing the class that maximizes the expression P(E|H) * P(H). This product is proportional to the posterior P(H|E), because the denominator of Bayes' Rule, P(E), is the same for every class.
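To make the P(E|H) * P(H) rule concrete, here is a minimal sketch in Python. The priors and likelihoods are made-up numbers chosen for illustration, not estimates from a real e-mail dataset, and the names score and classify are hypothetical helpers.

```python
# Hypothetical toy probabilities; none of these numbers come from real data.

# P(H): class priors
priors = {"spam": 0.4, "not spam": 0.6}

# P(E_i | H): probability that each piece of evidence appears, given the class
likelihoods = {
    "spam": {"buy now": 0.30, "meeting": 0.02},
    "not spam": {"buy now": 0.01, "meeting": 0.20},
}

def score(evidence, label):
    """Return P(E|H) * P(H), treating evidence items as independent given H."""
    result = priors[label]
    for item in evidence:
        result *= likelihoods[label][item]
    return result

def classify(evidence):
    """Pick the class with the largest P(E|H) * P(H)."""
    return max(priors, key=lambda label: score(evidence, label))

print(classify(["buy now"]))
print(classify(["meeting"]))
```

Multiplying the per-item likelihoods together is the "naïve" independence assumption that gives Naïve Bayes its name.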

Discriminative Classifiers

Rather than estimate likelihoods, discriminative classifiers like Logistic Regression estimate P(H|E) directly. They learn a decision boundary, a dividing line/plane between instances of one class and instances of another class. New, unseen instances are classified based on which side of the line/plane they fall on. In this way, the model learns a direct mapping from attributes E to class labels H.
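As a rough illustration, the sketch below fits a tiny Logistic Regression model with plain batch gradient descent. The data points, the learning rate, and the number of epochs are all assumptions picked to keep the example small; in practice a library implementation would normally be used instead.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights w and bias b by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # error = predicted P(H=1|E) minus the true label
            error = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += error * xi[j]
            grad_b += error
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict_proba(x, w, b):
    """Estimate P(H = 1 | E = x) directly, with no model of P(E|H)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical two-attribute data: class 0 clusters low, class 1 clusters high
X = [[0.0, 0.2], [0.3, 0.1], [0.2, 0.4], [0.9, 0.8], [0.7, 1.0], [1.0, 0.9]]
y = [0, 0, 0, 1, 1, 1]

w, b = train_logistic_regression(X, y)
print(predict_proba([0.1, 0.1], w, b))  # below 0.5: falls on the class 0 side
print(predict_proba([0.9, 0.9], w, b))  # above 0.5: falls on the class 1 side
```

The decision boundary is the set of points where w·x + b = 0, which is exactly where the predicted probability equals 0.5.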

An Example Using an Analogy

Here is an analogy that demonstrates the difference between generative and discriminative classifiers. Suppose we live in a world in which there are only two classes of animals, cats and rabbits. We want to build a robot that can automatically classify a new animal as either a cat or a rabbit. How would we train this robot using a discriminative algorithm like Logistic Regression?

With a discriminative algorithm, we would feed the model a set of training data containing instances of cats and instances of rabbits. The discriminative algorithm would try to find a straight line/plane (a decision boundary) that separates instances of cats from instances of rabbits. This boundary would be found by examining the differences in the attributes (e.g. herbivore vs. carnivore, long oval ears vs. small triangular ears, hopping vs. walking, etc.).

Once the training step is complete, the discriminative algorithm is ready to classify new, unseen animals. It looks at each new animal and checks which side of the decision boundary the animal falls on. The animal is classified according to that side.

In contrast, a generative learning algorithm like Naïve Bayes will take in training data and develop a model of what a cat and rabbit should look like. Once trained, a new, unseen animal is compared to the model of a cat and the model of a rabbit. It is then classified based on whether it looks more like the cat instances the model was trained on or the rabbit instances the model was trained on.

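Past research has shown that discriminative classifiers like Logistic Regression generally reach a lower asymptotic error on classification tasks than generative classifiers like Naïve Bayes. However, Naïve Bayes approaches its own (higher) asymptotic error with fewer training examples, so it can perform better when training data is scarce (Ng & Jordan, 2001).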

As a final note, generative classifiers are called generative because we can use the probabilistic information they learn about the data to generate more instances. In other words, given a class H, you can generate a plausible set of attributes E by sampling from P(E|H).
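Here is a small sketch of what "generating" an instance could look like for the cat and rabbit analogy. The attribute names and probabilities are invented for illustration, and each attribute is sampled independently, as a Naïve Bayes model would assume.

```python
import random

# Hypothetical values of P(attribute is present | class)
feature_probs = {
    "cat": {"long ears": 0.05, "hops": 0.02, "eats meat": 0.95},
    "rabbit": {"long ears": 0.90, "hops": 0.95, "eats meat": 0.01},
}

def generate_instance(label, rng):
    """Sample each attribute independently from P(attribute | label)."""
    return {name: rng.random() < p for name, p in feature_probs[label].items()}

rng = random.Random(0)  # fixed seed so repeated runs give the same samples
for _ in range(3):
    print(generate_instance("rabbit", rng))
```

A discriminative model has no equivalent of feature_probs, so it cannot be sampled in this way.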

References

Ng, A. Y., & Jordan, M. I. (2001). On Discriminative vs. Generative Classifiers: A Comparison of Logistic Regression and Naive Bayes. NIPS’01 Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and Synthetic, 841-848.

How to Set Up Anaconda for Windows 10

In this post, I will show you how to set up Anaconda. Anaconda is a free, open-source distribution of Python (and R). The goal of Anaconda is to be a free “one-stop-shop” for all your Python data science and machine learning needs. It contains the key packages you need to build cool machine learning projects.

Requirements

Here are the requirements:

Set up Anaconda.
Set up Jupyter Notebook.
Install important libraries.
Learn basic Anaconda commands.

Directions

Install Anaconda

Go to the Anaconda website and click “Download.”

Choose the latest version of Python. In my case, that is Python 3.7. Click “Download” to download the Anaconda installer for your operating system type (i.e. Windows, macOS, or Linux).

Follow the instructions to install the program:

Installing on Windows (this is what I’m using)
Installing on macOS
Installing on Linux

Verify Anaconda is installed by searching for “Anaconda Navigator” on your computer.

Open Anaconda Navigator.

Follow the instructions here for creating a “Hello World” program. You can use Spyder, Jupyter Notebooks, or the Anaconda Prompt (terminal). If you use Jupyter Notebooks, you will need to open the notebooks in Firefox, Google Chrome or another web browser.

Check to make sure that you have IPython installed. Use the following command (in an Anaconda Prompt window) to check:

where ipython

Make sure that you have pip installed. Pip is the package management system for Python.

where pip

Make sure that you have conda installed. Conda is Anaconda’s package management system.

where conda
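If you prefer to run these three checks from Python rather than with where, a sketch along the following lines should work. It relies on shutil.which from the standard library; find_tools is a hypothetical helper name. Note that shutil.which returns only the first match on PATH, whereas where lists every match.

```python
import shutil

def find_tools(names):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}

for name, path in find_tools(["ipython", "pip", "conda"]).items():
    print(f"{name}: {path or 'not found on PATH'}")
```

Run it from the Anaconda Prompt so that it sees the same PATH as the where commands.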

Install Some Libraries

Install OpenCV

To install OpenCV, use the following command in the Anaconda Prompt:

pip install opencv-contrib-python

Type the following command to get a list of packages and make sure opencv-contrib-python is installed.

conda list

Install dlib

Install cmake.

pip install cmake

Install the C++ library called dlib.

pip install dlib

Type the following command and take a look at the list to see if dlib is successfully installed:

conda list

Install Tesseract

Go to Tesseract at UB Mannheim.

Download the Tesseract for your system.

Set it up by following the prompts.

Once Tesseract OCR is downloaded, find it on your system.

Copy the path of the folder it is located in. In my case, that is:

C:\Program Files\Tesseract-OCR

Search for “Environment Variables” on your computer.

Under “System Variables,” click “Path,” and then click Edit.

Add the path: C:\Program Files\Tesseract-OCR

Click OK a few times to close all windows.

Open up the Anaconda Prompt.

Type this command to see if tesseract is installed on your system.

where tesseract

Now, install the Python bindings for Tesseract using the following commands:

pip install tesseract

pip install pytesseract

Install TensorFlow

Type the following command in the Anaconda Prompt:

pip install tensorflow

Install TensorFlow hub using this command:

pip install tensorflow-hub

Now install tflearn.

pip install tflearn

Now, install the Keras neural network library.

pip install keras

Install the Rest of the Libraries

Type the following commands to install the rest of the libraries:

pip install pillow

pip install SimpleITK
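To confirm in one pass that the libraries installed so far can be found by Python, you could run a short script like the sketch below. It only checks whether each module can be located and does not import it. The list uses import names, which sometimes differ from the pip package names (for example, opencv-contrib-python is imported as cv2 and pillow as PIL); check_modules is a hypothetical helper name.

```python
import importlib.util

# Import names for the packages installed above
MODULES = ["cv2", "dlib", "pytesseract", "tensorflow", "tensorflow_hub",
           "tflearn", "keras", "PIL", "SimpleITK"]

def check_modules(names):
    """Return {module_name: True/False} for whether Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

for name, found in check_modules(MODULES).items():
    print(f"{name:15} {'OK' if found else 'MISSING'}")
```

Any module reported as MISSING can be reinstalled with the matching pip command from the sections above.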

Learn Basic Anaconda Commands

Changing Directories

If you ever want to change directories to the D drive instead of the C drive, open Anaconda Prompt on your computer and type the following commands, in order:

D:

cd D:\XXXX\XXXX\XXXX\XXXX

where D:\XXXX\XXXX\XXXX\XXXX is the file path.

Listing Directory Contents

Type the dir command to list the contents of a directory.

dir

Creating a Jupyter Notebook

Open Anaconda Prompt on your computer and type the following command to launch Jupyter Notebook in your web browser, where you can create a new notebook:

jupyter notebook

Converting a Jupyter Notebook into Python Files

If you want to convert a Jupyter Notebook into Python files, change to that directory and type the following command:

jupyter nbconvert --to script *.ipynb
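For intuition about what this conversion involves, the sketch below pulls the code cells out of a notebook using only the standard library. It is a simplified illustration, not how nbconvert is implemented: it assumes the nbformat 4 layout (a top-level "cells" list), skips Markdown cells entirely, and does nothing special with IPython magics. The function names are hypothetical.

```python
import json

def notebook_to_script(notebook):
    """Join the source of every code cell in a parsed .ipynb dictionary."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            source = cell.get("source", "")
            # The source may be stored as one string or as a list of lines
            if isinstance(source, list):
                source = "".join(source)
            chunks.append(source)
    return "\n\n".join(chunks) + "\n"

def convert_file(ipynb_path, py_path):
    """Read a notebook file and write its code cells to a .py file."""
    with open(ipynb_path, encoding="utf-8") as f:
        notebook = json.load(f)
    with open(py_path, "w", encoding="utf-8") as f:
        f.write(notebook_to_script(notebook))

# Hypothetical in-memory notebook used to demonstrate the conversion
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["x = 1\n", "print(x)"]},
        {"cell_type": "code", "source": "y = x + 1"},
    ]
}
print(notebook_to_script(example))
```

For real work, the nbconvert command above is the better choice, since it also handles details this sketch ignores.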

Congratulations if you made it this far! You have all the libraries installed that you need to do fundamental image processing and computer vision work in Python.

Artificial General Intelligence and Deep Learning’s Long-Run Viability

Learning and true intelligence are more than classification and regression (which is what deep learning is really good at). Deep learning is deficient in perhaps one of the most important types of intelligence, emotional intelligence. Deep learning cannot negotiate trade agreements between countries, resolve conflict between a couple going through a divorce, or craft legislation to reduce student debt.

The long-run viability of deep learning, as well as progress towards more intelligent machines, will depend both on improvements in computing power and on our ability to understand and subsequently quantify general intelligence.

Concerning the quantification of general intelligence, this really is the inherent goal not just of artificial intelligence, but of computer science as a whole. We want a machine that can automate tasks that humans do and be as independent as possible while doing it. We want to teach a computer to think the way a human thinks, to understand the way a human understands, and to infer the way a human infers. At present, the state of the art is a long way from achieving this goal.

Deep learning is only as good as the data fed to it: garbage in, garbage out. Deep learning has a lot of limitations and is a long way from the kind of artificial general intelligence envisioned in science fiction movies.

At this point, the state of the art in deep learning is identifying images, detecting fraud, processing audio, and targeting advertising. The only thing that we really know for sure about deep learning at this point is that it performs some tasks at a very high level, and we aren’t 100% sure why. We understand the probability and the algorithm behind the representations that we are making, but we don’t understand what the network is actually learning in the way we would understand a classification tree or a run of Naive Bayes.

Understanding what abstractions the network is learning is essential to moving forward in the field. If at some point the theory is proven to be invalid, then all the emerging techniques will be rendered useless.

In any case, even if some theoretical basis is proven, it may or may not show a strong relationship that warrants continued development in these fields.

Deep Learning Needs Generalists Not Just Specialists

[Image: The Renaissance Man, Leonardo da Vinci]

My prediction is that the real breakthroughs on the route to artificial general intelligence will come from the T-shaped generalists, those professionals who have the breadth and depth to see the unseen interconnectedness.

In order for deep learning and AI to be viable over the long run, researchers in this field will need to be able to think outside the box and be well-versed in the fundamentals of multiple disciplines, not just their own silo. Mathematicians need to learn a little bit about computer science, neuroscience, and psychology. Computer scientists need to learn a little bit about neuroscience and psychology. Neuroscientists need to cross over and learn a little bit about computer science as well as mathematics.

While learning about other fields is no guarantee that a researcher will make a breakthrough, staying in a narrow area and never venturing out is a guarantee of not making one. As they say, to a hammer, everything looks like a nail.

As a final note, I’m keeping an eye on quantum computing, and, more importantly, quantum machine learning and quantum neural networks. Quantum machine learning entails executing machine learning algorithms on a quantum computer. It will be interesting to see how getting away from the 1s and 0s of a standard computer will impact the field, and whether it will usher in a new era of artificial intelligence. Only time will tell.