Real-Time Object Recognition Using a Webcam and Deep Learning

*** This tutorial is two years old and may no longer work properly. You can find an updated tutorial for object recognition at this link***

In this tutorial, we will develop a program that can recognize objects in a real-time video stream on a built-in laptop webcam using deep learning.

object-detection-recognition-video-demo

Object recognition involves two main tasks:

  1. Object Detection (Where are the objects?): Locate objects in a photo or video frame
  2. Image Classification (What are the objects?): Predict the type of each object in a photo or video frame

Humans can do both tasks effortlessly, but computers cannot.

Computers require a lot of processing power to take full advantage of the state-of-the-art algorithms that enable object recognition in real time. However, in recent years, the technology has matured, and real-time object recognition is now possible with only a laptop computer and a webcam.

Real-time object recognition systems are currently being used in a number of real-world applications, including the following:

  • Self-driving cars: detection of pedestrians, cars, traffic lights, bicycles, motorcycles, trees, sidewalks, etc.
  • Surveillance: catching thieves, counting people, identifying suspicious behavior, child detection.
  • Traffic monitoring: identifying traffic jams, catching drivers that are breaking the speed limit.
  • Security: face detection, identity verification on a smartphone.
  • Robotics: robotic surgery, agriculture, household chores, warehouses, autonomous delivery.
  • Sports: ball tracking in baseball, golf, and football.
  • Agriculture: disease detection in fruits.
  • Food: food identification.

There are a lot of steps in this tutorial. Have fun, be patient, and be persistent. Don’t give up! If something doesn’t work the first time around, try again. You will learn a lot more by fighting through to the end of this project. Stay relentless!

By the end of this tutorial, you will have the rock-solid confidence to detect and recognize objects in real time on your laptop’s GPU (Graphics Processing Unit) using deep learning.

Let’s get started!

Table of Contents

You Will Need

Install TensorFlow CPU

We need to get all the required software set up on our computer. I will be following this really helpful tutorial.

Open an Anaconda command prompt terminal.

1-open-command-promptJPG

Type the command below to create a virtual environment named tensorflow_cpu that has Python 3.6 installed. 

conda create -n tensorflow_cpu pip python=3.6

Press y and then ENTER.

A virtual environment is like an independent Python workspace which has its own set of libraries and Python version installed. For example, you might have a project that needs to run using an older version of Python, like Python 2.7. You might have another project that requires Python 3.7. You can create separate virtual environments for these projects.

Now, let’s activate the virtual environment by using this command:

conda activate tensorflow_cpu
2-activate-virtual-environmetJPG

Type the following command to install TensorFlow CPU.

pip install --ignore-installed --upgrade tensorflow==1.9

Wait for Tensorflow CPU to finish installing. Once it is finished installing, launch Python by typing the following command:

python
3-launch-pythonJPG

Type:

import tensorflow as tf

Here is what my screen looks like now:

4-import-tensorflowJPG

Now type the following:

hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()

You should see a message that says: “Your CPU supports instructions that this TensorFlow binary….”. Just ignore that. Your TensorFlow will still run fine.

Now run this command to complete the test of the installation:

print(sess.run(hello))
6-test-installationJPG

Press CTRL+Z. Then press ENTER to exit.

Type:

exit

That’s it for TensorFlow CPU. Now let’s install TensorFlow GPU.

Return to Table of Contents

Install TensorFlow GPU

Your system must have the following requirements:

  • Nvidia GPU (GTX 650 or newer…I’ll show you later how to find out what Nvidia GPU version is in your computer)
  • CUDA Toolkit v9.0 (we will install this later in this tutorial)
  • CuDNN v7.0.5 (we will install this later in this tutorial)
  • Anaconda with Python 3.7+

Here is a good tutorial that walks through the installation, but I’ll outline all the steps below.

Install CUDA Toolkit v9.0

The first thing we need to do is to install the CUDA Toolkit v9.0. Go to this link.

Select your operating system. In my case, I will select Windows, x86_64, Version 10, and exe (local).

7-select-target-platformJPG

Download the Base Installer as well as all the patches. I downloaded all these files to my Desktop. It will take a while to download, so just wait while your computer downloads everything.

8-base-installerJPG
9-patchesJPG

Open the folder where the downloads were saved to.

10-downloaded-to-desktopJPG

Double-click on the Base Installer program, the largest of the files that you downloaded from the website.

Click Yes to allow the program to make changes to your device.

Click OK to extract the files to your computer.

11-click-okJPG
12-extract-filesJPG

I saw this error window. Just click Continue.

13-error-message-continueJPG

Click Agree and Continue.

14-agree-and-continueJPG

If you saw that error window earlier… “…you may not be able to run CUDA applications with this driver…,” select the Custom (Advanced) install option and click Next. Otherwise, do the Express installation and follow all the prompts.

15-custom-advancedJPG

Uncheck the Driver components, PhysX, and Visual Studio Integration options. Then click Next.

Click Next.

16-installation-location-click-nextJPG

Wait for everything to install.

17-prepare-for-installationJPG

Click Close.

18-installer-finishedJPG

Delete  C:\Program Files\NVIDIA Corporation\Installer2.

19-delete-installer-2JPG

Double-click on Patch 1.

Click Yes to allow changes to your computer.

Click OK.

20-click-okJPG

Click Agree and Continue.

21-agree-and-continueJPG

Go to Custom (Advanced) and click Next.

22-custom-advancedJPG

Click Next.

23-click-nextJPG

Click Close.

The process is the same for Patch 2. Double-click on Patch 2 now.

Click Yes to allow changes to your computer.

Click OK.

Click Agree and Continue.

Go to Custom (Advanced) and click Next.

Click Next.

24-click-nextJPG

Click Close.

25-click-closeJPG

The process is the same for Patch 3. Double-click on Patch 3 now.

Click Yes to allow changes to your computer.

Click OK.

Click Agree and Continue.

Go to Custom (Advanced) and click Next.

Click Next.

Click Close.

The process is the same for Patch 4. Double-click on Patch 4 now.

Click Yes to allow changes to your computer.

Click OK.

Click Agree and Continue.

Go to Custom (Advanced) and click Next.

Click Next.

After you’ve installed Patch 4, your screen should look like this:

26-installed-patch-4JPG

Click Close.

To verify your CUDA installation, go to the command terminal on your computer, and type:

nvcc --version
26-verify-cuda-versionJPG

Return to Table of Contents

Install the NVIDIA CUDA Deep Neural Network library (cuDNN)

Now that we installed the CUDA 9.0 base installer and its four patches, we need to install the NVIDIA CUDA Deep Neural Network library (cuDNN). Official instructions for installing are on this page, but I’ll walk you through the process below.

Go to https://developer.nvidia.com/rdp/cudnn-download

Create a user profile if needed and log in.

27-become-a-memberJPG

Go to this page: https://developer.nvidia.com/rdp/cudnn-download

Agree to the terms of the cuDNN Software License Agreement.

28-agree-to-termsJPG

We have CUDA 9.0, so we need to click cuDNN v7.6.4 (September 27, 2019), for CUDA 9.0.

29-download-cudnnJPG

I have Windows 10, so I will download cuDNN Library for Windows 10.

30-cudnn-windows10JPG

In my case, the zip file downloaded to my Desktop. I will unzip that zip file now, which will create a new folder of the same name…just without the .zip part. These are your cuDNN files. We’ll come back to these in a second.

31-unzipJPG

Before we get going, let’s double check what GPU we have. If you are on a Windows machine, search for the “Device Manager.”

32-my-gpuJPG

Once you have the Device Manager open, you should see an option near the top for “Display Adapters.” Click the drop-down arrow next to that, and you should see the name of your GPU. Mine is NVIDIA GeForce GTX 1060.

33-nvidia-control-panelJPG

If you are on Windows, you can also check what NVIDIA graphics driver you have by right-clicking on your Desktop and clicking the NVIDIA Control Panel. My version is 430.86. This version fits the requirements for cuDNN.

Ok, now that we have verified that our system meets the requirements, lets navigate to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0, your CUDA Toolkit directory.

34-navigate-to-cuda-toolkit-directoryJPG

Now go to your cuDNN files, that new folder that was created when you did the unzipping. Inside that folder, you should see a folder named cuda. Click on it.

35-named-cudaJPG

Click bin.

36-click-binJPG

Copy cudnn64_7.dll to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\bin. Your computer might ask you to allow Administrative Privileges. Just click Continue when you see that prompt.

Now go back to your cuDNN files. Inside the cuda folder, click on include. You should see a file named cudnn.h.

37-click-includeJPG
38-cudnnhJPG

Copy that file to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\include. Your computer might ask you to allow Administrative Privileges. Just click Continue when you see that prompt.

Now go back to your cuDNN files. Inside the cuda folder, click on lib -> x64. You should see a file named cudnn.lib. 

39-cudnn-libJPG

Copy that file to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\lib\x64. Your computer might ask you to allow Administrative Privileges. Just click Continue when you see that prompt.

If you are using Windows, do a search on your computer for Environment Variables. An option should pop up to allow you to edit the Environment Variables on your computer.

Click on Environment Variables.

40-environment-variablesJPG

Make sure you CUDA_PATH variable is set to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0.

41-cuda-path-setJPG

I recommend restarting your computer now.

Return to Table of Contents

Install TensorFlow GPU

Now we need to install TensorFlow GPU. Open a new Anaconda terminal window. 

42-anaconda-terminal-windowJPG

Create a new Conda virtual environment named tensorflow_gpu by typing this command:

conda create -n tensorflow_gpu pip python=3.6

Type y and press Enter.

43-conda-virtual-environmentJPG

Activate the virtual environment.

conda activate tensorflow_gpu
44-activate-virtual-envJPG

Install TensorFlow GPU for Python.

pip install --ignore-installed --upgrade tensorflow-gpu==1.9

Wait for TensorFlow GPU to install.

Now let’s test the installation. Launch the Python interpreter.

python

Type this command.

import tensorflow as tf

If you don’t see an error, TensorFlow GPU is successfully installed.

45-test-installation-1JPG

Now type this:

hello = tf.constant('Hello, TensorFlow!')
46-now-type-thisJPG

And run this command. It might take a few minutes to run, so just wait until it finishes:

sess = tf.Session()
47-all-finishedJPG

Now type this command to complete the test of the installation:

print(sess.run(hello))
48-complete-the-testJPG

You can further confirm whether TensorFlow can access the GPU, by typing the following into the Python interpreter (just copy and paste into the terminal window while the Python interpreter is running).

tf.test.is_gpu_available(
    cuda_only=True,
    min_cuda_compute_capability=None
)
49-further-testJPG

To exit the Python interpreter, type:

exit()
50-exit-the-editorJPG

And press Enter.

Return to Table of Contents

Install TensorFlow Models

Now that we have everything setup, let’s install some useful libraries. I will show you the steps for doing this in my TensorFlow GPU virtual environment, but the steps are the same for the TensorFlow CPU virtual environment.

Open a new Anaconda terminal window. Let’s take a look at the list of virtual environments that we can activate.

conda env list
51-conda-env-listJPG

I’m going to activate the TensorFlow GPU virtual environment.

conda activate tensorflow_gpu

Install the libraries. Type this command:

conda install pillow lxml jupyter matplotlib opencv cython

Press y to proceed.

Once that is finished, you need to create a folder somewhere that has the TensorFlow Models  (e.g. C:\Users\addis\Documents\TensorFlow). If you have a D drive, you can also save it there as well.

In your Anaconda terminal window, move to the TensorFlow directory you just created. You will use the cd command to change to that directory. For example:

cd C:\Users\addis\Documents\TensorFlow

Go to the TensorFlow models page on GitHub: https://github.com/tensorflow/models.

Click the button to download the zip file of the repository. It is a large file, so it will take a while to download.

52-download-zipJPG

Move the zip folder to the TensorFlow directory you created earlier and extract the contents.

Rename the extracted folder to models instead of models-master. Your TensorFlow directory hierarchy should look like this:

TensorFlow

  • models
    • official
    • research
    • samples
    • tutorials

Return to Table of Contents

Install Protobuf

Now we need to install Protobuf, which is used by the TensorFlow Object Detection API to configure the training and model parameters.

Go to this page: https://github.com/protocolbuffers/protobuf/releases

Download the latest *-win32.zip release (assuming you are on a Windows machine).

53-download-latestJPG

Create a folder in C:\Program Files named it Google Protobuf.

Extract the contents of the downloaded *-win32.zip, inside C:\Program Files\Google Protobuf

54-extract-contentsJPG

Search for Environment Variables on your system. A window should pop up that says System Properties.

55-system-propertiesJPG

Click Environment Variables.

Go down to the Path variable and click Edit.

56-path-variable-editJPG

Click New.

57-click-newJPG

Add C:\Program Files\Google Protobuf\bin

You can also add it the Path System variable.

Click OK a few times to close out all the windows.

Open a new Anaconda terminal window.

I’m going to activate the TensorFlow GPU virtual environment.

conda activate tensorflow_gpu

cd into your \TensorFlow\models\research\ directory and run the following command:

for /f %i in ('dir /b object_detection\protos\*.proto') do protoc object_detection\protos\%i --python_out=.

Now go back to the Environment Variables on your system. Create a New Environment Variable named PYTHONPATH (if you don’t have one already). Replace C:\Python27amd64 if you don’t have Python installed there. Also, replace <your_path> with the path to your TensorFlow folder.

C:\Python27amd64;C:\<your_path>\TensorFlow\models\research\object_detection
58-pythonpathJPG

For example:

C:\Python27amd64;C:\Users\addis\Documents\TensorFlow

Now add these two paths to your PYTHONPATH environment variable:

C:\<your_path>\TensorFlow\models\research\
C:\<your_path>\TensorFlow\models\research\slim

Return to Table of Contents

Install COCO API

Now, we are going to install the COCO API. You don’t need to worry about what this is at this stage. I’ll explain it later.

Download the Visual Studios Build Tools here: Visual C++ 2015 build tools from here: https://go.microsoft.com/fwlink/?LinkId=691126

Choose the default installation.

59-visual-studioJPG

After it has installed, restart your computer.

60-setup-completedJPG

Open a new Anaconda terminal window.

I’m going to activate the TensorFlow GPU virtual environment.

conda activate tensorflow_gpu

cd into your \TensorFlow\models\research\ directory and run the following command to install pycocotools (everything below goes on one line):

pip install git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI
61-pycocotools-installedJPG

If it doesn’t work, install git: https://git-scm.com/download/win

Follow all the default settings for installing Git. You will have to click Next several times.

Once you have finished installing Git, run this command (everything goes on one line):

pip install git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI

Return to Table of Contents

Test the Installation

Open a new Anaconda terminal window.

I’m going to activate the TensorFlow GPU virtual environment.

conda activate tensorflow_gpu

cd into your \TensorFlow\models\research\object_detection\builders directory and run the following command to test your installation.

python model_builder_test.py

You should see an OK message.

62-test-your-installationJPG

Return to Table of Contents

Install LabelImg

Now we will install LabelImg, a graphical image annotation tool for labeling object bounding boxes in images.

Open a new Anaconda/Command Prompt window.

Create a new virtual environment named labelImg by typing the following command:

conda create -n labelImg

Activate the virtual environment.

conda activate labelImg

Install pyqt.

conda install pyqt=5

Click y to proceed.

Go to your TensorFlow folder, and create a new folder named addons.

63-new-folder-named-addonsJPG

Change to that directory using the cd command.

Type the following command to clone the repository:

git clone https://github.com/tzutalin/labelImg.git

Wait while labelImg downloads.

You should now have a folder named addons\labelImg under your TensorFlow folder.

Type exit to exit the terminal.

Open a new terminal window.

Activate the TensorFlow GPU virtual environment.

conda activate tensorflow_gpu

cd into your TensorFlow\addons\labelImg directory.

Type the following commands, one right after the other.

conda install pyqt=5
conda install lxml
pyrcc5 -o libs/resources.py resources.qrc
exit

Test the LabelImg Installation

Open a new terminal window.

Activate the TensorFlow GPU virtual environment.

conda activate tensorflow_gpu

cd into your TensorFlow\addons\labelImg directory.

Type the following commands:

python labelImg.py

If you see this window, you have successfully installed LabelImg. Here is a tutorial on how to label your own images. Congratulations!

64-label-imgJPG

Return to Table of Contents

Recognize Objects Using Your WebCam

Approach

Note: This section gets really technical. If you know the basics of computer vision and deep learning, it will make sense. Otherwise, it will not. You can skip this section and head straight to the Implementation section if you are not interested in what is going on under the hood of the object recognition application we are developing.

In this project, we use OpenCV and TensorFlow to create a system capable of automatically recognizing objects in a webcam. Each detected object is outlined with a bounding box labeled with the predicted object type as well as a detection score.

The detection score is the probability that a bounding box contains the object of a particular type (e.g. the confidence a model has that an object identified as a “backpack” is actually a backpack).

The particular SSD with Inception v2 model used in this project is the ssd_inception_v2_coco model. The ssd_inception_v2_coco model uses the Single Shot MultiBox Detector (SSD) for its architecture and the Inception v2 framework for feature extraction.

Single Shot MultiBox Detector (SSD)

Most state-of-the-art object detection methods involve the following stages:

  1. Hypothesize bounding boxes 
  2. Resample pixels or features for each box
  3. Apply a classifier

The Single Shot MultiBox Detector (SSD) eliminates the multi-stage process above and performs all object detection computations using just a single deep neural network.

Inception v2

Most state-of-the-art object detection methods based on convolutional neural networks at the time of the invention of Inception v2 added increasingly more convolution layers or neurons per layer in order to achieve greater accuracy. The problem with this approach is that it is computationally expensive and prone to overfitting. The Inception v2 architecture (as well as the Inception v3 architecture) was proposed in order to address these shortcomings.

Rather than stacking multiple kernel filter sizes sequentially within a convolutional neural network, the approach of the inception-based model is to perform a convolution on an input with multiple kernels all operating at the same layer of the network. By factorizing convolutions and using aggressive regularization, the authors were able to improve computational efficiency. Inception v2 factorizes the traditional 7 x 7 convolution into 3 x 3 convolutions.

Szegedy, Vanhoucke, Ioffe, Shlens, & Wojna, (2015) conducted an empirically-based demonstration in their landmark Inception v2 paper, which showed that factorizing convolutions and using aggressive dimensionality reduction can substantially lower computational cost while maintaining accuracy.

Data Set

The ssd_inception_v2_coco model used in this project is pretrained on the Common Objects in Context (COCO) data set (COCO data set), a large-scale data set that contains 1.5 million object instances and more than 200,000 labeled images. The COCO data required 70,000 crowd worker hours to gather, annotate, and organize images of objects in natural environments.

Software Dependencies

The following libraries form the object recognition backbone of the application implemented in this project:

  • OpenCV, a library of programming functions for computer vision.
  • Pillow, a library for manipulating images.
  • Numpy, a library for scientific computing.
  • Matplotlib, a library for creating graphs and visualizations.
  • TensorFlow Object Detection API, an open source framework developed by Google that enables the development, training, and deployment of pre-trained object detection models.

Return to Table of Contents

Implementation

Now to the fun part, we will now recognize objects using our computer webcam.

Copy the following program, and save it to your TensorFlow\models\research\object_detection directory as object_detection_test.py .

# Import all the key libraries
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
import cv2

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
from utils import label_map_util
from utils import visualization_utils as vis_util

# Define the video stream
cap = cv2.VideoCapture(0)  

# Which model are we downloading?
# The models are listed here: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
MODEL_NAME = 'ssd_inception_v2_coco_2018_01_28'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'

# Path to the frozen detection graph. 
# This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'

# List of the strings that is used to add the correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'mscoco_label_map.pbtxt')

# Number of classes to detect
NUM_CLASSES = 90

# Download Model
opener = urllib.request.URLopener()
opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
tar_file = tarfile.open(MODEL_FILE)
for file in tar_file.getmembers():
    file_name = os.path.basename(file.name)
    if 'frozen_inference_graph.pb' in file_name:
        tar_file.extract(file, os.getcwd())

# Load a (frozen) Tensorflow model into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

# Loading label map
# Label maps map indices to category names, so that when our convolution network 
# predicts `5`, we know that this corresponds to `airplane`.  Here we use internal 
# utility functions, but anything that returns a dictionary mapping integers to 
# appropriate string labels would be fine
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)


# Helper code
def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)
	
# Detection
with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        while True:

            # Read frame from camera
            ret, image_np = cap.read()
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            # Extract image tensor
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            # Extract detection boxes
            boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
            # Extract detection scores
            scores = detection_graph.get_tensor_by_name('detection_scores:0')
            # Extract detection classes
            classes = detection_graph.get_tensor_by_name('detection_classes:0')
            # Extract number of detectionsd
            num_detections = detection_graph.get_tensor_by_name(
                'num_detections:0')
            # Actual detection.
            (boxes, scores, classes, num_detections) = sess.run(
                [boxes, scores, classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})
            # Visualization of the results of a detection.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np,
                np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores),
                category_index,
                use_normalized_coordinates=True,
                line_thickness=8)

            # Display output
            cv2.imshow('object detection', cv2.resize(image_np, (800, 600)))

            if cv2.waitKey(25) & 0xFF == ord('q'):
                cv2.destroyAllWindows()
                break


print("We are finished! That was fun!")

Open a new terminal window.

Activate the TensorFlow GPU virtual environment.

conda activate tensorflow_gpu

cd into your TensorFlow\models\research\object_detection directory.

At the time of this writing, we need to use Numpy version 1.16.4. Type the following command to see what version of Numpy you have on your system.

pip show numpy

If it is not 1.16.4, execute the following commands:

pip uninstall numpy
pip install numpy==1.16.4

Now run, your program:

python object_detection_test.py

In about 30 to 90 seconds, you should see your webcam power up and object recognition take action. That’s it! Congratulations for making it to the end of this tutorial!

object_detection_resultsJPG

Keep building!

Return to Table of Contents

How to Install TensorFlow 2 on Windows 10

In this post, I will show you how to install TensorFlow 2 on Windows 10. TensorFlow2 is a free software library used for machine learning applications. It comes integrated with Keras, a neural-network library written in Python. If you want to work with neural networks and deep learning, TensorFlow 2 should be your software of choice because of its popularity both in academia and in industry. Let’s get started!

Table of Contents

You Will Need 

Directions

Install TensorFlow 2

Here are the official instructions for downloading TensorFlow 2, but I will walk you through the process step-by-step.

Open an Anaconda command prompt terminal.

1-open-promptJPG

Type the command below to create a virtual environment named tf_2 with the latest version of Python installed. A virtual environment is like an independent Python workspace which has its own set of libraries and Python version installed. For example, you might have a project that needs to run using an older version of Python, like Python 2.7. You might have another project that requires Python 3.7. You can create separate virtual environments for these projects.

conda create -n tf_2 python

Press y and then ENTER.

2-type-yJPG

Wait for the software to download.

3-activate-tensorflow-2JPG

Once the download is finished, activate the virtual environment using this command:

conda activate tf_2

Check which version of Python you have installed on your system. I have Python 3.8.0.

python --version
4-python-versionJPG

Choose a TensorFlow package. I’ll install TensorFlow CPU. Let’s type the following command:

5-choose-a-packageJPG
pip install --upgrade tensorflow

You might see this error:

ERROR: Could not find a version that satisfies the requirement tensorflow (from versions: none)

ERROR: No matching distribution found for tensorflow

If you do, you need to downgrade your version of Python. TensorFlow is not yet compatible with your newest version of Python.

conda install python=3.6

Press y and then ENTER.

Check which version of Python you have installed on your system. I have Python 3.6.9 now.

python --version
6-downgrade-pythonJPG

Now install TensorFlow 2.

pip install --upgrade tensorflow

Wait for Tensorflow CPU to finish installing. Once it is finished installing, verify the installation by typing:

python -c "import tensorflow as tf; x = [[2.]]; print('tensorflow version', tf.__version__); print('hello, {}'.format(tf.matmul(x, x)))"

Here is the output:

9-voilaJPG

You should see your TensorFlow version in the output.

You might see this message:

“I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2”

Don’t worry, TensorFlow is working just fine. To get rid of that message, you can set the environment variables inside the virtual environment. Type the following command:

set TF_CPP_MIN_LOG_LEVEL=2

Now run this command:

python -c "import tensorflow as tf; x = [[2.]]; print('tensorflow version', tf.__version__); print('hello, {}'.format(tf.matmul(x, x)))"

Voila! Message gone. 

9-voilaJPG-1

Return to Table of Contents

Create a Basic Neural Network Using TensorFlow 2

To really see what TensorFlow 2 can do, let’s do the following:

  • Build a neural network that classifies images of clothing.
  • Train this neural network.
  • And, finally, evaluate the accuracy of the model.

We are going to roughly follow the TensorFlow beginner tutorial.

First, install the Matplotlib library.

pip install matplotlib

I’m now going to open up a text editor and type a Python program. I will save it to my D drive as fashion_mnist.py. Here is the code:

from __future__ import absolute_import, division, print_function, unicode_literals

# Import the key libraries
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np

# Rename tf.keras.layers
layers = tf.keras.layers

# Print the TensorFlow version
print(tf.__version__)

# Load and prepare the MNIST dataset. 
# Convert the samples from integers to floating-point numbers:
mnist = tf.keras.datasets.fashion_mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Let's plot the data so we can see it
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal',
 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
plt.figure(figsize=(10,10))

for i in range(25):
 plt.subplot(5,5,i+1)
 plt.xticks([])
 plt.yticks([])
 plt.grid(False)
 plt.imshow(x_train[i], cmap=plt.cm.binary)
 plt.xlabel(class_names[y_train[i]])
plt.show()

Within your virtual environment in the Anaconda terminal, navigate to where you saved your code. I will type.

D:

Then:

cd D:\<YOUR_PATH>\install_tensorflow2

Type dir to see if the Python (.py) file is in that directory.

Now run the code:

python fashion_mnist.py

You should see this graphic pop up.

10-fashion-datasetJPG

In the terminal window, press CTRL+C on your keyboard to stop the code from running.

Let’s add to our code. Open up the Python file again in the text editor and type the following code. If you are new to neural networks, don’t worry what everything means at this stage.

from __future__ import absolute_import, division, print_function, unicode_literals

# Import the key libraries
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np

# Rename tf.keras.layers
layers = tf.keras.layers

# Print the TensorFlow version
print(tf.__version__)

# Load and prepare the MNIST dataset. 
# Convert the samples from integers to floating-point numbers:
mnist = tf.keras.datasets.fashion_mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Let's plot the data so we can see it
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal',
 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
plt.figure(figsize=(10,10))

for i in range(25):
 plt.subplot(5,5,i+1)
 plt.xticks([])
 plt.yticks([])
 plt.grid(False)
 plt.imshow(x_train[i], cmap=plt.cm.binary)
 plt.xlabel(class_names[y_train[i]])
plt.show()

# Build the neural network layer-by-layer
model = tf.keras.Sequential()
model.add(layers.Flatten()) # Make the input layer one-dimensional
model.add(layers.Dense(64, activation='relu')) # Layer has 64 nodes; Uses ReLU
model.add(layers.Dense(64, activation='relu')) # Layer has 64 nodes; Uses ReLU
model.add(layers.Dense(10, activation='softmax')) # Layer has 64 nodes; Uses Softmax

# Choose an optimizer and loss function for training:
model.compile(optimizer='adam',
 loss='sparse_categorical_crossentropy',
 metrics=['accuracy'])
 
# Train and evaluate the model's accuracy
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test,  y_test, verbose=2)

Run the code:

python fashion_mnist.py

When you see the plot of the clothes appear, just close that window so that the neural network build and run.

Here is the output.

11-accuracyJPG

The accuracy of classifying the clothing items was 87.5%. Pretty cool huh! Congratulations! You’ve built and run your first neural network on TensorFlow 2.

To deactivate the virtual environment, type:

conda deactivate

Then to exit the terminal, type:

exit

At this stage, I encourage you to go through the TensorFlow tutorials to get more practice using this really powerful tool.

Return to Table of Contents

Difference Between Supervised and Unsupervised Learning

In this post, I will explain the difference between supervised and unsupervised learning.

Table of Contents

What is Supervised Learning?

new_home_for_sale_4

Imagine you have a computer. The computer is really good at doing math and making complex calculations. You want to “train” your computer to predict the price for any home in the United States. So you search around the Internet and find a dataset. The dataset contains the following information  for 100,000 houses that sold during the last 30 days in various cities across the United States:

  1. Square footage
  2. Number of bathrooms
  3. Number of bedrooms
  4. Number of garages
  5. Year the house was constructed
  6. Average size of the house’s windows
  7. Sale price

Variables 1 through 6 above are the dataset’s features (also known as input variables, attributes, independent variables, etc.). Variable 7, the house sale price, is the output variable (or target variable) that we want our computer to get good at predicting. 

So the question is…how do we train our computer to be a good house price predictor?  We need to write a software program. The program needs to take as input the 100,000-house dataset that I mentioned earlier. This program needs to then find a mathematical relationship between variables 1-6 (features) and variable 7 (output variable). Once the program has found a relationship between the features (input) and the output, it can predict the sale price of a house it has never seen before.

1-ipo

Let’s take a look at an analogy. Supervised learning is like baking a cake. 

Suppose you had a cake machine that was able to cook many different types of cake from the same set of ingredients. All you have to do as the cake chef is to just throw the ingredients into the machine, and the cake machine will automatically make the cake.

The “features,” the inputs to the cake machine, are the following ingredients:

  1. Butter
  2. Sugar
  3. Vanilla Extract
  4. Flour
  5. Chocolate
  6. Eggs
  7. Salt

The output is the type of cake:

  1. Vanilla Cake
  2. Pound Cake
  3. Chocolate Cake
  4. Dark Chocolate Cake
cake_cakes_sweet_bake

Different amounts of each ingredient will produce different types of cake. How does the cake machine know what type of cake to produce given a set of ingredients? 

Fortunately, a machine learning engineer has written a software program (containing a supervised learning algorithm) that is running inside the cake machine. The program was pre-trained on a dataset containing 1 million cakes. Each entry (i.e. example or training instance) in this gigantic dataset contained two key pieces of data: 

  1. How much of each ingredient was used in the making of that given cake
  2. The type of cake that was produced

During the training phase of the program, the program found a fairly accurate mathematical relationship between the amount of each ingredient and the cake type. And now, when the cake machine is provided with a new set of ingredients by the chef, it automatically “knows” what type of cake to produce. Pretty cool huh!

2-ipo-2

What I described above is called supervised learning. It is called supervised learning because the input data (which the supervised learning algorithm used to train) is already labeled with the “correct” answers (e.g. the type of cake in our example above; or the sale price values for those 100,000 homes from the earlier example I presented in this post.). We supervised the learning algorithm by “telling” it what the output (cake type) should be for 1 million different sets of input values (ingredients). 

The fundamental idea of a supervised learning algorithm is to learn a mathematical relationship between inputs and outputs so that it can predict the output value given an entirely new set of input values. 

Let’s take a look at a common supervised learning algorithm: linear regression. The goal of linear regression is to find a line that best fits the relationship between input and output. For example, the learning algorithm for linear regression could be trained on square footage and sale price data for 100,000 homes. It would learn the mathematical relationship (e.g. a straight line in the form y = mx + b) between square footage and the sale price of a home. 

3-square-footage

With this relationship (i.e. line of best fit) in hand, the algorithm can now easily predict the sale price of any home just by being provided with the home’s square footage value. 

For example, let’s say we wanted to find the price of a house that is 2000 ft2. We feed 2,000 ft2 into the algorithm. The algorithm predicts a sale price of $500,000.

4-price-square-footage

As you can imagine, before we make any predictions using a supervised learning algorithm, it is imperative to train it on a lot of data. Lots and lots of data. The more the merrier.

In the example above, to get that best fit line, we want to feed it with as many examples of houses as possible during training. The more data it has, the better its predictions will be when given new data. This, my friends, is supervised learning, and it is incredibly powerful. In fact, supervised learning is the bread and butter of most of the state-of-the-art machine learning techniques today, such as deep learning.

Now, let’s look at unsupervised learning.

Return to Table of Contents

What is Unsupervised Learning?

Let’s suppose you have the following dataset for a group of 13 people. For each person, we have the following features:

  • Height (in inches)
  • Hair Length (in inches)

Let’s plot the data to see what it looks like:

5-height

In contrast to supervised learning, in this case there is no output value that we are trying to predict (e.g. house sale price, cake type, etc.). All we have are features (inputs) with no corresponding output variables. In machine learning jargon, we say that the data points are unlabeled

So instead of trying to force the dataset to fit a straight line or some sort of predetermined mathematical model, we let an unsupervised learning algorithm find a pattern all on its own. And here is what we get:

6-hair-length-vs-height

Aha! It looks like the algorithm found some sort of pattern in the data. The data is clustered into two distinct groups. What these clusters mean, we do not know because the data points are unlabeled. However, what we do suspect given our prior knowledge of this dataset, is that the blue dots are males, and the red dots are females given the attributes are height and hair length. 

What I have described above is known as unsupervised learning. It is called unsupervised because the input dataset is unlabeled. There is no output variable we are trying to predict. There is no prior mathematical model we are trying to fit the data to. All we want to do is let the algorithm find some sort of structure or pattern in the data. We let the data speak for itself. 

Any time you are given a dataset and want to group similar data points into clusters, you’re going to want to use an unsupervised learning algorithm.

Return to Table of Contents