How to Annotate Images Using OpenCV

In this project, we will learn how to annotate images using OpenCV — a popular and powerful open source library for image processing and computer vision. OpenCV is a cross-platform library, designed for real-time image processing, with wrappers for Python, Ruby, C#, JavaScript, and other languages. OpenCV has methods for image I/O, filtering, motion tracking, segmentation, and 3D reconstruction, as well as machine learning techniques such as boosting, support vector machines, and deep learning.

Requirements

  • Design a software application using Python and OpenCV that allows users to click in an image, annotate a number of points within an image, and export the annotated points into a CSV file.
    • Code must be implemented in Python and using OpenCV
    • Students should NOT use Jupyter Notebooks for this project
    • The input image and output CSV files will be provided as parameters.
      • Example: python annotate_images.py cat_dog.jpg cat_dog.csv

You Will Need 

  • Python 3.7

Input Images

baby
cat_dog
prague

Directions

To run the program, open up an Anaconda Prompt terminal.

Navigate to the directory that contains annotate_images.py and the input image.

Type python annotate_images.py cat_dog.jpg cat_dog.csv to run the program.

Here is the code:

import cv2 # Import the OpenCV library
import numpy as np # Import Numpy library
import pandas as pd # Import Pandas library
import sys # Enables the passing of arguments

# Project: Annotate Images Using OpenCV
# Author: Addison Sears-Collins
# Date created: 9/11/2019
# Python version: 3.7
# Description: This program allows users to click in an image, annotate a 
#   number of points within an image, and export the annotated points into
#   a CSV file.

# Define the file name of the image
INPUT_IMAGE = sys.argv[1] # "cat_dog.jpg"
IMAGE_NAME = INPUT_IMAGE[:INPUT_IMAGE.index(".")]
OUTPUT_IMAGE = IMAGE_NAME + "_annotated.jpg"
output_csv_file = sys.argv[2]

# Load the image and store into a variable
# -1 means load unchanged
image = cv2.imread(INPUT_IMAGE, -1)

# Create lists to store all x, y, and annotation values
x_vals = []
y_vals = []
annotation_vals = []

# Dictionary containing some colors
colors = {'blue': (255, 0, 0), 'green': (0, 255, 0), 'red': (0, 0, 255), 
          'yellow': (0, 255, 255),'magenta': (255, 0, 255), 
          'cyan': (255, 255, 0), 'white': (255, 255, 255), 'black': (0, 0, 0), 
          'gray': (125, 125, 125), 
          'rand': np.random.randint(0, high=256, size=(3,)).tolist(), 
          'dark_gray': (50, 50, 50), 'light_gray': (220, 220, 220)}

def draw_circle(event, x, y, flags, param):
    """
    Draws dots on double clicking of the left mouse button
    """
    # Store the height and width of the image
    height = image.shape[0]
    width = image.shape[1]

    if event == cv2.EVENT_LBUTTONDBLCLK:
        # Draw the dot
        cv2.circle(image, (x, y), 5, colors['magenta'], -1)

        # Annotate the image
        txt = input("Describe this pixel using one word (e.g. dog) and press ENTER: ")

        # Append values to the list
        x_vals.append(x)
        y_vals.append(y)
        annotation_vals.append(txt)

        # Print the coordinates and the annotation to the console
        print("x = " + str(x) + "  y = " + str(y) + "  Annotation = " + txt + "\n")

        # Set the position of the text part of the annotation
        text_x_pos = None
        text_y_pos = y

        if x < (width/2):
            text_x_pos = int(x + (width * 0.075))
        else:
            text_x_pos = int(x - (width * 0.075))
 
        # Write text on the image
        cv2.putText(image, txt, (text_x_pos,text_y_pos), cv2.FONT_HERSHEY_SIMPLEX, 1, colors['magenta'], 2)

        cv2.imwrite(OUTPUT_IMAGE, image)

        # Prompt user for another annotation
        print("Double click another pixel or press 'q' to quit...\n")

print("Welcome to the Image Annotation Program!\n")
print("Double click anywhere inside the image to annotate that point...\n")

# We create a named window where the mouse callback will be established
cv2.namedWindow('Image mouse')

# We set the mouse callback function to 'draw_circle':
cv2.setMouseCallback('Image mouse', draw_circle)

while True:
    # Show image 'Image mouse':
    cv2.imshow('Image mouse', image)

    # Continue until 'q' is pressed:
    if cv2.waitKey(20) & 0xFF == ord('q'):
        break

# Create a dictionary using lists
data = {'X':x_vals,'Y':y_vals,'Annotation':annotation_vals}

# Create the Pandas DataFrame
df = pd.DataFrame(data)
print()
print(df)
print()

# Export the dataframe to a csv file
df.to_csv(path_or_buf = output_csv_file, index = None, header=True) 

# Destroy all generated windows:
cv2.destroyAllWindows()

Output Images

baby_annotated
cat_dog_annotated
prague_annotated

CSV Output

Here is the CSV output for the baby photo above:

baby-csv

How to Create an Image Histogram Using OpenCV

Given an image as input, how do we get the corresponding histogram using OpenCV? First, let us take a look at what a histogram is, then let us take a look at how to create one given an image. 

What is a Histogram?

A histogram is another way of looking at an image. It is a graph with pixel brightness values on the x-axis (e.g. 0 [black] to 255 [white] for a grayscale image) and the number of pixels (i.e. the frequency) at each brightness value on the y-axis.
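
As a quick illustration (a tiny, made-up 2 x 3 grayscale patch, counted with NumPy rather than OpenCV), a histogram is nothing more than a count of how many pixels have each brightness value:

import numpy as np

# A tiny, made-up 2x3 grayscale patch (brightness values 0-255)
patch = np.array([[0, 0, 255],
                  [128, 255, 255]], dtype=np.uint8)

# Count how many pixels fall into each of the 256 brightness bins
counts, bin_edges = np.histogram(patch, bins=256, range=(0, 256))

print(counts[0])    # 2 pixels with brightness 0
print(counts[128])  # 1 pixel with brightness 128
print(counts[255])  # 3 pixels with brightness 255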

How to Create an Image Histogram Using OpenCV

There are two links I particularly like that show how to create the image histogram given an input image.

  1. Geeks for Geeks
  2. OpenCV Python Tutorials

I like these tutorials because they lead the reader through all the essentials of how to find and analyze image histograms, step-by-step. This process boils down to the following code:

# Import the required libraries
import cv2  # OpenCV
from matplotlib import pyplot as plt  # Matplotlib for plotting
 
# Read the input image
img = cv2.imread('example.jpg',0) 
 
# Calculate the frequency of pixels in the brightness range 0 - 255
histr = cv2.calcHist([img],[0],None,[256],[0,256]) 
 
# Plot the histogram and display
plt.plot(histr) 
plt.show() 
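
The snippet above reads the image as grayscale (that is what the 0 passed to cv2.imread does). For a color image, you can compute one histogram per channel by changing the channel index passed to cv2.calcHist. Here is a minimal sketch of that idea; it assumes a color image named example.jpg sits in the working directory:

import cv2
from matplotlib import pyplot as plt

# Read the image in color (OpenCV stores channels in BGR order)
img = cv2.imread('example.jpg')

# Compute and plot one histogram per color channel
for i, col in enumerate(('b', 'g', 'r')):
    histr = cv2.calcHist([img], [i], None, [256], [0, 256])
    plt.plot(histr, color=col)

plt.xlim([0, 256])
plt.show()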

Why Use CMOS Instead of CCD Sensors in Mobile Phones

If I were designing a new state-of-the-art mobile phone, I would choose a complementary metal-oxide-semiconductor (CMOS) sensor. CMOS has several advantages over a charge-coupled device (CCD), which I will explain below.

Processing Speed

In a CCD, the photosites are passive, whereas in a CMOS sensor they are active. The passive design leads to slower processing and information transfer in CCDs.

A photosite corresponds to a single pixel in a CCD or CMOS sensor. In a CCD sensor, light is captured and converted into a charge. The charge accumulates in the photosites, is transferred to a voltage converter, and is then amplified. This whole process happens one row at a time.

However, with a CMOS sensor, the charge-to-voltage conversion and the amplification of the voltage occur inside each photosite. Because the work on an image happens locally, processing and information transfer are faster than with a CCD sensor.

Space Requirements

CMOS enables integration of timers and analog-to-digital converters, which conserves space.

CCD is an older technology, and it is not possible to integrate peripheral components like analog-to-digital converters and timers on a single chip. CMOS does enable integration of these components onto a single chip, which conserves space. 

For a mobile phone that needs to be limited to a certain size, space must be conserved, which gives CMOS an advantage for use in mobile phones.

Power Consumption

A CCD sensor consumes more power than a CMOS sensor. It needs a variety of power supplies for its timing clocks and requires a voltage of 7 V to 10 V.

A CMOS sensor needs just one power supply and a voltage of 3.3 V to 5 V, roughly 50% less than a CCD sensor. This lower power consumption translates into extended battery life.

CMOS Prevents Blooming

In a CCD sensor, when an image is overexposed, electrons pile up in the areas of the brightest part of the image and overflow to other photosites, which creates unwanted light streaks. The structure of CMOS sensors prevents this problem.

Cost

CMOS chips can be produced on virtually any standard silicon production line, whereas this is not the case for CCD chips. As a result, production cost is lower for CMOS chips. These cost savings result in better profit margins for mobile phone companies.

How to Draw the Letter ‘E’ on an Image Using Scikit-Image

Requirements

Develop a program in Python to draw an E at the center of an input image.

  • Program must be developed using Python 3.x.
  • Program must use the scikit-image library — a simple and popular open source library for image processing in Python.
  • The input image must be a color image.
  • The letter must be at the center of the image and must be created by updating pixels, not by using any of the drawing functions.
  • The final output must be a side-by-side image created using matplotlib.
  • Must test the same code with two different images or two different sizes.

You Will Need 

Directions

Find any two images/photos. 

Create a new Jupyter Notebook. 

Here are the critical reference points for the letter E. These points mark the corners of the four rectangles that make up the letter E.

letter_e

Here is the pdf of my Jupyter notebook.

Here is the raw code for the program in Python:

#!/usr/bin/env python
# coding: utf-8

# # Project 1 – Introduction to Python scikit-image
# 
# ## Author
# Addison Sears-Collins
# ## Date Created
# 9/4/2019
# ## Python Version
# 3.7
# ## Description
# This program draws an E at the center of an input image.
# ## Purpose
# The purpose of this assignment is to introduce the basic functions of the Python scikit-image
# library -- a simple and popular open source library for image processing in Python. The scikit-image
# library extends scipy.ndimage to provide a set of image processing routines including I/O, color
# and geometric transformations, segmentation, and other basic features.
# ## File Path

# In[1]:


# Move to the directory where the input images are located
get_ipython().run_line_magic('cd', 'D:\\Dropbox\\work')

# List the files in that directory
get_ipython().run_line_magic('ls', '')


# ## Code

# In[2]:


# Import scikit-image
import skimage

# Import module to read and write images in various formats
from skimage import io

# Import matplotlib functionality
import matplotlib.pyplot as plt

# Import numpy
import numpy as np

# Set the color of the E
# [red, green, blue]
COLOR_OF_E = [255, 0, 0]


# In[3]:


# Show the critical points of E
from IPython.display import Image
Image(filename = "e_critical_points.PNG", width = 200, height = 200)


# In[4]:


def e_generator(y_dim, x_dim):
    """
    Generates the coordinates of the E
    :param y_dim int: The y dimensions of the input image
    :param x_dim int: The x dimensions of the input image
    :return: The critical coordinates
    :rtype: list
    """
    # Set all the critical points
    A =  [int(0.407 * y_dim), int(0.423 *  x_dim)]
    B =  [int(0.407 * y_dim), int(0.589 *  x_dim)]
    C =  [int(0.488 * y_dim), int(0.423 *  x_dim)]
    D =  [int(0.488 * y_dim), int(0.589 *  x_dim)]
    E =  [int(0.572 * y_dim), int(0.423 *  x_dim)]
    F =  [int(0.572 * y_dim), int(0.581 *  x_dim)]
    G =  [int(0.657 * y_dim), int(0.423 *  x_dim)]
    H =  [int(0.657 * y_dim), int(0.581 *  x_dim)]
    I =  [int(0.735 * y_dim), int(0.423 *  x_dim)]
    J =  [int(0.735 * y_dim), int(0.589 *  x_dim)]
    K =  [int(0.819 * y_dim), int(0.423 *  x_dim)]
    L =  [int(0.819 * y_dim), int(0.589 *  x_dim)]
    M =  [int(0.407 * y_dim), int(0.47 *  x_dim)]
    N =  [int(0.819 * y_dim), int(0.47 *  x_dim)]
    
    return A,B,C,D,E,F,G,H,I,J,K,L,M,N


# In[5]:


def plot_image_with_e(image, A, B, C, D, E, F, G, H, I, J, K, L, M, N):
    """
    Plots an E on an input image
    :param image: The input image
    :param A, B, etc. list: The coordinates of the critical points
    :return: image_with_e
    :rtype: image
    """
    # Copy the image
    image_with_e = np.copy(image)

    # Top horizontal rectangle
    image_with_e[A[0]:C[0], A[1]:B[1], :] = COLOR_OF_E 

    # Middle horizontal rectangle
    image_with_e[E[0]:G[0], E[1]:F[1], :] = COLOR_OF_E

    # Bottom horizontal rectangle
    image_with_e[I[0]:K[0], I[1]:J[1], :] = COLOR_OF_E

    # Vertical connector rectangle
    image_with_e[A[0]:K[0], A[1]:M[1], :] = COLOR_OF_E

    # Display image
    plt.imshow(image_with_e);

    return image_with_e


# In[6]:


def print_image_details(image):
    """
    Prints the details of an input image
    :param image: The input image
    """
    print("Size: ", image.size)
    print("Shape: ", image.shape)
    print("Type: ", image.dtype)
    print("Max: ", image.max())
    print("Min: ", image.min())


# In[7]:


def compare(original_image, annotated_image):
    """
    Compare two images side-by-side
    :param original_image: The original input image
    :param annotated_image: The annotated-version of the original input image
    """
    # Compare the two images side-by-side
    f, (ax0, ax1) = plt.subplots(1, 2, figsize=(20,10))

    ax0.imshow(original_image)
    ax0.set_title('Original', fontsize = 18)
    ax0.axis('off')

    ax1.imshow(annotated_image)
    ax1.set_title('Annotated', fontsize = 18)
    ax1.axis('off')


# In[8]:


# Load the test image
image = io.imread("test_image.jpg")

# Store the y and x dimensions of the input image
y_dimensions = image.shape[0]
x_dimensions = image.shape[1]

# Print the image details
print_image_details(image)

# Display the image
plt.imshow(image);


# In[9]:


# Set all the critical points of the image
A,B,C,D,E,F,G,H,I,J,K,L,M,N = e_generator(y_dimensions, x_dimensions)

# Plot the image with E and store it
image_with_e = plot_image_with_e(image, A, B, C, D, E, F, G, H, I, J, K, L, M, N)

# Save the output image
plt.imsave('test_image_annotated.jpg', image_with_e)


# In[10]:


compare(image, image_with_e)


# In[11]:


# Load the first image
image = io.imread("architecture_roof_buildings_baked.jpg")

# Store the y and x dimensions of the input image
y_dimensions = image.shape[0]
x_dimensions = image.shape[1]

# Print the image details
print_image_details(image)

# Display the image
plt.imshow(image);


# In[12]:


# Set all the critical points of the image
A,B,C,D,E,F,G,H,I,J,K,L,M,N = e_generator(y_dimensions, x_dimensions)

# Plot the image with E and store it
image_with_e = plot_image_with_e(image, A, B, C, D, E, F, G, H, I, J, K, L, M, N)

# Save the output image
plt.imsave('architecture_roof_buildings_baked_annotated.jpg', image_with_e)


# In[13]:


compare(image, image_with_e)


# In[14]:


# Load the second image
image = io.imread("statue.jpg")

# Store the y and x dimensions of the input image
y_dimensions = image.shape[0]
x_dimensions = image.shape[1]

# Print the image details
print_image_details(image)

# Display the image
plt.imshow(image);


# In[15]:


# Set all the critical points of the image
A,B,C,D,E,F,G,H,I,J,K,L,M,N = e_generator(y_dimensions, x_dimensions)

# Plot the image with E and store it
image_with_e = plot_image_with_e(image, A, B, C, D, E, F, G, H, I, J, K, L, M, N)

# Save the output image
plt.imsave('statue_annotated.jpg', image_with_e)


# In[16]:


compare(image, image_with_e)


# In[ ]:

Example

Before

statue

After

statue_annotated

Biometric Fingerprint Scanner | Image Processing Applications

In this post, I will discuss an application that relies heavily on image processing.

Biometric Fingerprint Scanner

Description

I currently live in an apartment complex that has no doorman. Instead, in order to enter the property, you have to scan your fingerprint on the fingerprint scanner next to the entry gate on the main walkway that enters the complex.

How It Works

In order to use the fingerprint scanner, I had to go to the administration office for the apartment complex and have all of my fingerprints scanned. The receptionist took my hand and individually scanned each finger on both hands until clear digital images of all my fingers were produced. There were a number of times when the receptionist tried to scan one of my fingers and the scanner failed to read the fingerprint. In these situations, I had to rotate my finger to the left and to the right until the scanner beeped, indicating that it had registered a clear digital image of my fingerprint. This whole process took about 20 minutes.

After registering all of my fingerprints, I went to the front entry door to test if I was able to enter using my finger. I typically use my thumb to enter since it provides the biggest fingerprint image and is easier for the machine to read.

Once everything was set up, I was able to enter the building freely, using only my finger.

Strengths

The main strength of the fingerprint scanning system is that entry is totally keyless. I do not need to carry multiple keys in order to enter the building, the swimming pool, and the gym. Traditionally, I had to have a separate key for each of the doors that led into common areas of the community. Now, with the biometric fingerprint scanner, all I need to do is scan my fingerprint at any of the doors, and I can access any common area in the complex.

Keyless entry also comes in handy because I often lose my keys, or I forget my keys inside the house. Your fingers, fortunately, go wherever you go.

Another strength of using a biometric fingerprint scanner is that it is more environmentally friendly. For the creation of a key, metal needs to be extracted from the Earth somewhere. 

One final strength of the biometric fingerprint scanner is that it makes hosting out-of-town guests easy. I do not need to create a spare key or give them a copy of my key. All I need to do is take them down to the administration office and have their fingerprints registered.

Weaknesses

One of the main weaknesses of this keyless entry is that it is not sanitary. Not everybody has the cleanest hands. When everybody in the entire complex is touching the fingerprint scanner, bacteria can really build up. Facial recognition would be a good alternative to solving this problem because I wouldn’t actually have to touch anything upon entry. 

I’m also not sure how secure the fingerprint scanner is. Imagine somebody who has been evicted from their apartment for failure to pay rent. There has to be an ironclad management process to make sure that as soon as somebody is evicted, his or her fingerprints are automatically removed from the system.

Another weakness is that the fingerprint scanner is not flawless. Often I have to try five or even sometimes six times using different angles of my thumb and forefinger to register a successful reading to enter the door. The fingerprint scanner is highly sensitive to the way in which you put your finger on the scanner. A slight twist to the left or right might not register a successful reading. 

Also, for some odd reason, when I return from a long vacation, the fingerprint scanner never reads my fingerprints accurately. This happens because the administration likes to reset the fingerprint scanner every so often. When this happens, I have to go to the administration to re-register my fingerprints.

While I like the biometric fingerprint scanner, other techniques are a lot more foolproof. For example, typing in a PIN code will work virtually 100% of the time I try to open the door, whereas putting my fingerprint on the scanner opens the door at most 80 to 90% of the time on the first try.

How to Set Up Anaconda for Windows 10

In this post, I will show you how to set up Anaconda. Anaconda is a free, open-source distribution of Python (and R). The goal of Anaconda is to be a free “one-stop-shop” for all your Python data science and machine learning needs. It contains the key packages you need to build cool machine learning projects.

Requirements

Here are the requirements:

  • Set up Anaconda.
  • Set up Jupyter Notebook.
  • Install important libraries.
  • Learn basic Anaconda commands.

Directions

Install Anaconda

Go to the Anaconda website and click “Download.”

setting-up-anaconda-1

Choose the latest version of Python. In my case, that is Python 3.7. Click “Download” to download the Anaconda installer for your operating system type (i.e. Windows, macOS, or Linux). 

setting-up-anaconda-2

Follow the instructions to install the program:

setting-up-anaconda-3
setting-up-anaconda-4
setting-up-anaconda-5
setting-up-anaconda-6

Verify Anaconda is installed by searching for “Anaconda Navigator” on your computer.

Open Anaconda Navigator.

setting-up-anaconda-7

Follow the instructions here for creating a “Hello World” program. You can use Spyder, Jupyter Notebooks, or the Anaconda Prompt (terminal). If you use Jupyter Notebooks, you will need to open the notebooks in Firefox, Google Chrome or another web browser.

Check to make sure that you have IPython installed. Use the following command (in an Anaconda Prompt window) to check:

where ipython
setting-up-anaconda-8

Make sure that you have pip installed. Pip is the package management system for Python.

where pip
setting-up-anaconda-9

Make sure that you have conda installed. Conda is Anaconda’s package management system.

where conda
setting-up-anaconda-10

Install Some Libraries

Install OpenCV

To install OpenCV, use the following command in the Anaconda Prompt:

pip install opencv-contrib-python

Type the following command to get a list of packages and make sure opencv-contrib-python is installed.

conda list
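
You can also verify the install from a Python session. This quick check (a minimal sketch) just imports the library and prints its version; the exact number you see will depend on which release pip installed:

import cv2  # OpenCV

# Print the installed OpenCV version
print(cv2.__version__)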

Install dlib

Install cmake.

pip install cmake

Install the C++ library called dlib

pip install dlib
setting-up-anaconda-11

Type the following command and take a look at the list to see if dlib is successfully installed:

conda list

Install Tesseract

Go to Tesseract at UB Mannheim.

Download the Tesseract installer for your system.

Set it up by following the prompts.

setting-up-anaconda-12

Once Tesseract OCR is downloaded, find it on your system.

Copy the path of the folder it is installed in. In my case, that is:

C:\Program Files\Tesseract-OCR

Search for “Environment Variables” on your computer.

setting-up-anaconda-13

Under “System Variables,” click “Path,” and then click Edit.

Add the path: C:\Program Files\Tesseract-OCR

setting-up-anaconda-14

Click OK a few times to close all windows.

Open up the Anaconda Prompt.

Type this command to see if tesseract is installed on your system.

where tesseract

Now, install the Python bindings for Tesseract using the following commands:

pip install tesseract
pip install pytesseract
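
To confirm that the binding can see the Tesseract executable, you can run a short test script. This is a minimal sketch: text_sample.png is a placeholder for any image with readable text, and the commented line shows how to point pytesseract at the executable explicitly if it is not on your PATH (Pillow, used to open the image, is installed a little later in this post):

import pytesseract
from PIL import Image  # Pillow, installed below

# Uncomment and adjust if tesseract.exe is not on your PATH:
# pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Run OCR on a sample image and print the recognized text
text = pytesseract.image_to_string(Image.open("text_sample.png"))
print(text)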

Install TensorFlow

Type the following command in the Anaconda Prompt:

pip install tensorflow

Install TensorFlow Hub using this command:

pip install tensorflow-hub

Now install tflearn.

pip install tflearn

Now, install the Keras neural network library.

pip install keras
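
As a quick, optional sanity check (a minimal sketch; the version numbers you see will depend on which releases pip installed), you can confirm from a Python session that TensorFlow and Keras import correctly:

import tensorflow as tf
import keras

# Print the installed versions to confirm the imports worked
print("TensorFlow version:", tf.__version__)
print("Keras version:", keras.__version__)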

Install the Rest of the Libraries

Type the following commands to install the rest of the libraries:

pip install pillow
pip install SimpleITK

Learn Basic Anaconda Commands

Changing Directories

If you ever want to change directories to the D drive instead of the C drive, open Anaconda Prompt on your computer and type the following commands, in order:

D:
cd D:\XXXX\XXXX\XXXX\XXXX

where D:\XXXX\XXXX\XXXX\XXXX is the file path.

Listing Directory Contents

Type the dir command to list the contents of a directory.

dir

Creating a Jupyter Notebook

Open Anaconda Prompt on your computer and type the following command:

jupyter notebook

Converting a Jupyter Notebook into Python Files

If you want to convert a Jupyter Notebook into Python files, change to that directory and type the following command:

jupyter nbconvert --to script *.ipynb

Congratulations if you made it this far! You have all the libraries installed that you need to do fundamental image processing and computer vision work in Python.

K-Nearest Neighbors Algorithm From Scratch | Machine Learning

In this post, I will walk you through the k-nearest neighbors algorithm (k-NN classification and k-NN regression), step-by-step. We will develop the code for the algorithm from scratch using Python. We will then run the algorithm on a real-world data set, the image segmentation data set from the UCI Machine Learning Repository. The goal is to classify each instance into one of seven outdoor image classes (brickface, sky, foliage, cement, window, path, and grass) based on pixel data. This data set gives a taste of computer vision mixed with machine learning.

Without further ado, let’s get started!

Table of Contents

  • What is the K-Nearest Neighbors Algorithm?
  • K-Nearest Neighbors Algorithm Design
  • K-Nearest Neighbors Algorithm in Python, Coded From Scratch
  • Output Statistics of the K-Nearest Neighbors Algorithm on the Image Segmentation Data Set
  • Condensed K-Nearest Neighbors Algorithm

What is the K-Nearest Neighbors Algorithm?

The K-Nearest Neighbors Algorithm is a supervised learning algorithm (the target variable we want to predict is available) that is used for both classification and regression problems. Let’s see how it works for both types of problems.

K-Nearest Neighbors Classification

For the k-NN classification algorithm, we have N training instances. Let us suppose that each of these training instances falls into one of two classes, A and B.

1-knn

We want to classify a new instance C. We do this by identifying the k nearest neighbors of C and assigning C to the class that occurs most frequently among those k neighbors.

For example, let k = 3. The three nearest neighbors of C are A, A, and B. Since class A occurs most frequently among the neighbors, C belongs to class A.

How do we select k?

A common rule of thumb for choosing k is to take the square root of N, the number of training instances, and subtract 1 if that number is even (so that k is odd). However, in this project, we will tune the value of k and record the results to see which value of k achieves optimal performance as measured by either classification accuracy or mean squared error.
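
For example, here is a small sketch of that rule of thumb in Python (N = 210 is the number of instances in the image segmentation data set described later in this post):

import math

N = 210  # number of training instances

k = int(math.sqrt(N))  # floor of the square root of N
if k % 2 == 0:
    k -= 1  # make k odd to help avoid ties

print(k)  # prints 13 for N = 210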

How do we determine the k nearest neighbors?

A common method for measuring distance is to use the Euclidean distance formula. That is, given points p =(p1,p2,…,pn) and q = (q1,q2,…,qn), the distance d between p and q is:

d(p, q) = √((q1 – p1)² + (q2 – p2)² + … + (qn – pn)²)

This is a version of the Pythagorean Theorem.

In short, the algorithm for k-NN classification is as follows (a minimal code sketch appears right after this list). For each test instance, we:

  1. Compute the distance to every training instance
  2. Select the k closest instances and their class
  3. Output the class that occurs most frequently among the k closest instances
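
Here is a minimal sketch of those three steps on a toy, made-up data set (NumPy only; the full, cross-validated implementation appears later in this post):

import numpy as np

# Toy training data: each row is (attribute 1, attribute 2), with a class label
train_X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [5.0, 5.0], [5.2, 4.9]])
train_y = np.array([0, 0, 0, 1, 1])

def knn_classify(test_point, train_X, train_y, k=3):
    # 1. Compute the Euclidean distance to every training instance
    distances = np.linalg.norm(train_X - test_point, axis=1)
    # 2. Select the k closest instances and their classes
    nearest_classes = train_y[np.argsort(distances)[:k]]
    # 3. Output the class that occurs most frequently among those k instances
    return np.bincount(nearest_classes).argmax()

print(knn_classify(np.array([1.1, 0.9]), train_X, train_y, k=3))  # prints 0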

Special notes on k-NN classification:

  • Select an odd value for k for two-class problems in order to avoid ties.
  • Complexity of the algorithm depends on finding the nearest neighbor for each training instance.
  • k-NN needs labeled data sets.

k-NN is known as a lazy learning algorithm because a model is not built until the time a test instance arrives. Thus, there is no prior training on the training data as would be the case with an eager learning algorithm.

Also, k-NN is a non-parametric algorithm because it does not assume any particular form for the data set (unlike algorithms such as linear regression, logistic regression, etc.). The only assumption is that a distance (here, the Euclidean distance) can be consistently calculated between points.

K-Nearest Neighbors Regression

k-NN regression is a minor modification of k-NN classification. In k-NN regression, we are trying to predict a real number instead of a class. Instead of classifying a test instance based on the most frequently occurring class among the k nearest neighbors, we take the average of the target variable of the k nearest neighbors.

For example, let’s suppose we wanted to predict the age of someone given their weight. We have the following data:

2-knn

The question is: how old is someone given they weigh 175 pounds?

Suppose k = 3. We find the three nearest neighbors to 175. They are:

  • (170,50)
  • (180,52)
  • (156,43)

We take the average of the target variable:

Average = (50 + 52 + 43) / 3 ≈ 48.3

This is our answer.

In short, the algorithm for k-NN regression is as follows (see the sketch after this list). For each test instance, we:

  1. Compute the distance to every training instance
  2. Select the k closest instances and the values of their target variables
  3. Output the mean of the values of the target variables
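
Here is a minimal sketch of those three steps applied to the weight/age example above. The first three (weight, age) pairs are the neighbors listed above; the last two rows are made-up extras so the sketch has something to ignore:

import numpy as np

# Training data as (weight, age) pairs. The first three rows are the
# neighbors from the example above; the last two are made-up extras.
weights = np.array([170.0, 180.0, 156.0, 100.0, 220.0])
ages    = np.array([ 50.0,  52.0,  43.0,  20.0,  70.0])

def knn_regress(test_weight, weights, ages, k=3):
    # 1. Compute the distance to every training instance
    distances = np.abs(weights - test_weight)
    # 2. Select the k closest instances and their target values
    nearest_ages = ages[np.argsort(distances)[:k]]
    # 3. Output the mean of those target values
    return nearest_ages.mean()

print(knn_regress(175.0, weights, ages, k=3))  # prints 48.333...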

Return to Table of Contents

K-Nearest Neighbors Algorithm Design

The first thing we need to do is preprocess the image segmentation data set so that it meets the input requirements of both algorithms. Below is the required data format. I downloaded the data into Microsoft Excel and then made sure I had the following columns:

Columns (0 through N)

  • 0: Instance ID
  • 1: Attribute 1
  • 2: Attribute 2
  • 3: Attribute 3
  •  …
  • N: Actual Class

Modification of Attribute Values

The image segmentation data set contains 19 attributes, 210 instances, and 7 classes. This data set was created from a database of seven outdoor images. Below are some of the modifications I made.

Classes were made numerical:

  • BRICKFACE = 1.0
  • SKY = 2.0
  • FOLIAGE = 3.0
  • CEMENT = 4.0
  • WINDOW = 5.0
  • PATH = 6.0
  • GRASS = 7.0

Region-pixel-count was removed since it was 9 for all instances.

I also normalized the attributes so that they are all between 0 and 1.
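
Here is a minimal sketch of that normalization step using pandas. The file name segmentation.csv and the column layout (instance ID first, actual class last) are assumptions that match the required format described above:

import pandas as pd

# Load the preprocessed data set (instance ID, attributes, actual class)
df = pd.read_csv("segmentation.csv")

# Min-max normalize every attribute column to the range [0, 1],
# leaving the instance ID and actual class columns untouched
attribute_cols = df.columns[1:-1]
df[attribute_cols] = (df[attribute_cols] - df[attribute_cols].min()) / (
    df[attribute_cols].max() - df[attribute_cols].min())

df.to_csv("segmentation_normalized.csv", index=False)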

Here is what the first few columns of your data set should look like after all that preprocessing:

dataset-image-segment

Save the file as a csv file (comma-delimited), and load it into the program below (Python).

Here is a link to the final data set I used.

Return to Table of Contents

K-Nearest Neighbors Algorithm in Python, Coded From Scratch

Here is the full code for the k-nearest neighbors algorithm (Note that I used five-fold stratified cross-validation to produce the final classification accuracy statistics). You might want to copy and paste it into a document since it is pretty large and hard to see on a single web page:

import numpy as np # Import Numpy library

# File name: knn.py
# Author: Addison Sears-Collins
# Date created: 6/20/2019
# Python version: 3.7
# Description: Implementation of the k-nearest neighbors algorithm (k-NN)
# from scratch

# Required Data Set Format for Classification Problems:
# Must be all numerical
# Columns (0 through N)
# 0: Instance ID
# 1: Attribute 1 
# 2: Attribute 2
# 3: Attribute 3 
# ...
# N: Actual Class

# Required Data Set Format for Regression Problems:
# Must be all numerical
# Columns (0 through N)
# 0: Instance ID
# 1: Attribute 1 
# 2: Attribute 2
# 3: Attribute 3 
# ...
# N: Actual Class
# N + 1: Stratification Bin

class Knn:

    # Constructor
    #   Parameters:
    #     k: k value for k-NN
    #     problem_type: ('r' for regression or 'c' for classification)
    def __init__(self, k, problem_type):
        self.__k = k
        self.__problem_type = problem_type

    # Parameters:
    #   training_set: The folds used for training (2d numpy array)
    #   test_instance: The test instance we need to find the neighbors for 
    #                  (numpy array)
    # Returns: 
    #   The k most similar neighbors from the training set for a given 
    #   test instance. It will be a 2D numpy array where the first column 
    #   will hold the Actual class value of the training instance and the 
    #   second column will store the distance to the test instance...
    #   (actual class value, distance). 
    def get_neighbors(self, training_set, test_instance):

        # Record the number of training instances in the training set
        no_of_training_instances = np.size(training_set,0)

        # Record the number of columns in the training set
        no_of_training_columns = np.size(training_set,1)

        # Record the column index of the actual class of the training_set
        actual_class_column = None
        # If classification problem
        if self.__problem_type == "c":
            actual_class_column = no_of_training_columns - 1
        # If regression problem
        else:
            actual_class_column = no_of_training_columns - 2

        # Create an empty 2D array called actual_class_and_distance. This 
        # array should be the same length as the number of training instances. 
        # The first column will hold the Actual Class value of the training 
        # instance, and the second column will store the distance to the 
        # test instance...(actual class value, distance). 
        actual_class_and_distance = np.zeros((no_of_training_instances, 2))

        neighbors = None

        # For each row (training instance) in the training set
        for row in range(0, no_of_training_instances):

            # Record the actual class value in the 
            # actual_class_and_distance array (column 0)
            actual_class_and_distance[row,0] = training_set[
                row,actual_class_column]

            # Initialize a temporary training instance copied from this 
            # training instance
            temp_training_instance = np.copy(training_set[row,:])

            # Initialize a temporary test instance copied from this 
            # test_instance
            temp_test_instance = np.copy(test_instance)

            # If this is a classification problem
            if self.__problem_type == "c":
            
                # Update temporary training instance with Instance ID 
                # and the Actual Class pieces removed
                temp_training_instance = np.delete(temp_training_instance,[
                    0,actual_class_column])

                # Update temporary test instance with the Instance ID 
                # and the Actual Class pieces removed
                temp_test_instance = np.delete(temp_test_instance,[
                    0,actual_class_column])

            # If this is a regression problem
            else:

                # Update temporary training instance with Instance ID, Actual 
                # Class, and Stratification Bin pieces removed
                temp_training_instance = np.delete(temp_training_instance,[
                    0,actual_class_column,actual_class_column+1])

                # Update temporary test instance with the Instance ID, Actual
                # Class, and Stratification Bin pieces removed
                temp_test_instance = np.delete(temp_test_instance,[
                    0,actual_class_column,actual_class_column+1])

            # Calculate the euclidean distance from the temporary test
            # instance to the temporary training instance
            distance = np.linalg.norm(
                temp_test_instance - temp_training_instance)

            # Record the distance in the actual_class_and_distance 
            # array (column 1)
            actual_class_and_distance[row,1] = distance

        # Sort the actual_class_and_distance 2D array by the 
        # distance column (column 1) in ascending order.
        actual_class_and_distance = actual_class_and_distance[
                actual_class_and_distance[:,1].argsort()]

        k = self.__k

        # Extract the first k rows of the actual_class_and_distance array
        neighbors = actual_class_and_distance[:k,:]

        return neighbors

    # Generate a prediction based on the most frequent class or averaged 
    # target variable value of the neighbors
    # Parameters: 
    #   neighbors - 1D array (actual_class_value)
    # Returns:
    #   predicted_class_value
    def make_prediction(self, neighbors):
    
        prediction = None

        # If this is a classification problem
        if self.__problem_type == "c":
            
            #  Prediction is the most frequent value in column 0 of
            #  the neighbors array
            neighborsint = neighbors.astype(int)
            prediction = np.bincount(neighborsint).argmax()
            
        # If this is a regression problem
        else:

            # Prediction is the average of the neighbors array
            prediction = np.mean(neighbors)

        return prediction

    # Parameters: 
    #   actual_class_array
    #   predicted_class_array
    # Returns: 
    #   accuracy: Either classification accuracy (for 
    #   classification problems) or 
    #   mean squared error (for regression problems)
    def get_accuracy(self, actual_class_array, predicted_class_array):

        # Initialize accuracy variable
        accuracy = None

        # Initialize decision variable
        decision = None

        counter = None

        actual_class_array_size = actual_class_array.size

        # If this is a classification problem
        if self.__problem_type == "c":
            
            counter = 0

            # For each element in the actual class array
            for row in range(0,actual_class_array_size):

                # If actual class value is equal to value in predicted
                # class array
                if actual_class_array[row] == predicted_class_array[row]:
                    decision = "correct"
                    counter += 1
                else:
                    decision = "incorrect"

            classification_accuracy = counter / (actual_class_array_size)

            accuracy = classification_accuracy

        # If this is a regression problem
        else:

            # Initialize an empty squared error array. 
            # Needs to be same length as number of test instances
            squared_error_array = np.empty(actual_class_array_size)

            squared_error = None

            # For each element in the actual class array
            for row in range(0,actual_class_array_size):

                # Calculate the squared error
                squared_error = (abs((actual_class_array[
                        row] - predicted_class_array[row]))) 
                squared_error *= squared_error
                squared_error_array[row] = squared_error

            mean_squared_error = np.mean(squared_error_array)
            
            accuracy = mean_squared_error
        
        return accuracy

Here is the driver program that calls and executes the code above:

import pandas as pd # Import Pandas library 
import numpy as np # Import Numpy library
from five_fold_stratified_cv import FiveFoldStratCv
from knn import Knn

# File name: knn_driver.py
# Author: Addison Sears-Collins
# Date created: 6/20/2019
# Python version: 3.7
# Description: Driver for the knn.py program 
# (K-Nearest Neighbors)

# Required Data Set Format for Classification Problems:
# Must be all numerical
# Columns (0 through N)
# 0: Instance ID
# 1: Attribute 1 
# 2: Attribute 2
# 3: Attribute 3 
# ...
# N: Actual Class

# Required Data Set Format for Regression Problems:
# Must be all numerical
# Columns (0 through N)
# 0: Instance ID
# 1: Attribute 1 
# 2: Attribute 2
# 3: Attribute 3 
# ...
# N: Actual Class
# N + 1: Stratification Bin

################# INPUT YOUR OWN VALUES IN THIS SECTION ######################
ALGORITHM_NAME = "K-Nearest Neighbor"
SEPARATOR = ","  # Separator for the data set (e.g. "\t" for tab data)
##############################################################################

def main():

    print("Welcome to the " +  ALGORITHM_NAME + " program!")
    print()
    print("Running " + ALGORITHM_NAME + ". Please wait...")
    print()

    # k value that will be tuned
    k = eval(input("Enter a value for k: ") )

    # "c" for classification or "r" for regression
    problem = input("Press  \"c\" for classification or \"r\" for regression: ") 

    # Directory where data set is located
    data_path = input("Enter the path to your input file: ") 

    # Read the full text file and store records in a Pandas dataframe
    pd_data_set = pd.read_csv(data_path, sep=SEPARATOR)

    # Convert dataframe into a Numpy array
    np_data_set = pd_data_set.to_numpy(copy=True)

    # Show functioning of the program
    trace_runs_file = input("Enter the name of your trace runs file: ") 

    # Open a new file to save trace runs
    outfile_tr = open(trace_runs_file,"w") 

    # Testing statistics
    test_stats_file = input("Enter the name of your test statistics file: ") 

    # Open a test_stats_file 
    outfile_ts = open(test_stats_file,"w")

    # Create an object of class FiveFoldStratCv
    fivefolds1 = FiveFoldStratCv(np_data_set,problem)

    # The number of folds in the cross-validation
    NO_OF_FOLDS = 5 

    # Create an object of class Knn
    knn1 = Knn(k,problem) 

    # Generate the five stratified folds
    fold0, fold1, fold2, fold3, fold4 = fivefolds1.get_five_folds()

    training_dataset = None
    test_dataset = None

    # Create an empty array of length 5 to store the accuracy_statistics 
    # (classification accuracy for classification problems or mean squared
    # error for regression problems)
    accuracy_statistics = np.zeros(NO_OF_FOLDS)

    # Run k-NN the designated number of times as indicated by the number of folds
    for experiment in range(0, NO_OF_FOLDS):

        print()
        print("Running Experiment " + str(experiment + 1) + " ...")
        print()
        outfile_tr.write("Running Experiment " + str(experiment + 1) + " ...\n")
        outfile_tr.write("\n")

        # Each fold will have a chance to be the test data set
        if experiment == 0:
            test_dataset = fold0
            training_dataset = np.concatenate((
                fold1, fold2, fold3, fold4), axis=0)
        elif experiment == 1:
            test_dataset = fold1
            training_dataset = np.concatenate((
                fold0, fold2, fold3, fold4), axis=0)
        elif experiment == 2:
            test_dataset = fold2
            training_dataset = np.concatenate((
                fold0, fold1, fold3, fold4), axis=0)
        elif experiment == 3:
            test_dataset = fold3
            training_dataset = np.concatenate((
                fold0, fold1, fold2, fold4), axis=0)
        else:
            test_dataset = fold4
            training_dataset = np.concatenate((
                fold0, fold1, fold2, fold3), axis=0)

        # Actual class column index of the test dataset
        actual_class_column = None           
     
        # If classification problem
        if problem == "c":
            actual_class_column = np.size(test_dataset,1) - 1
        # If regression problem
        else:
            actual_class_column = np.size(test_dataset,1) - 2

        # Create an array of the actual_class_values of the test instances
        actual_class_values = test_dataset[:,actual_class_column]

        no_of_test_instances = np.size(test_dataset,0)

        # Make an empty array called predicted_class_values which will 
        # store the predicted class values. It should be the same length 
        # as the number of test instances
        predicted_class_values = np.zeros(no_of_test_instances)

        # For each row in the test data set
        for row in range(0, no_of_test_instances):
  
            # Neighbor array is a 2D array containing the neighbors 
            # (actual class value, distance) 
            # Get the k nearest neighbors for each test instance
            this_instance = test_dataset[row,:]
            neighbor_array = knn1.get_neighbors(training_dataset,this_instance)

            # Extract the actual class values
            neighbors_arr = neighbor_array[:,0]
  
            # Predicted class value stored in the variable prediction
            prediction = knn1.make_prediction(neighbors_arr)
  
            # Record the prediction in the predicted_class_values array
            predicted_class_values[row] = prediction

        # Calculate the classification accuracy of the predictions 
        # (k-NN classification) or the mean squared error (k-NN regression)
        accuracy = knn1.get_accuracy(actual_class_values,predicted_class_values)

        # Store the accuracy in the accuracy_statistics array
        accuracy_statistics[experiment] = accuracy

        # If classification problem
        if problem == "c":
            temp_acc = accuracy * 100
            outfile_tr.write("Classification Accuracy: " + str(temp_acc) + "%\n")
            outfile_tr.write("\n")
            print("Classification Accuracy: " + str(temp_acc) + "%\n")
        # If regression problem
        else:
            outfile_tr.write("Mean Squared Error: " + str(accuracy) + "\n")
            outfile_tr.write("\n")
            print("Mean Squared Error: " + str(accuracy) + "\n")

    outfile_tr.write("Experiments Completed.\n")
    print("Experiments Completed.")
    print()

    # Write to a file
    outfile_ts.write("----------------------------------------------------------\n")
    outfile_ts.write(ALGORITHM_NAME + " Summary Statistics\n")
    outfile_ts.write("----------------------------------------------------------\n")
    outfile_ts.write("Data Set : " + data_path + "\n")
    outfile_ts.write("\n")
    outfile_ts.write("Accuracy Statistics for All 5 Experiments:")
    outfile_ts.write(np.array2string(
        accuracy_statistics, precision=2, separator=',',
        suppress_small=True))
    outfile_ts.write("\n")

    # Write the relevant stats to a file
    outfile_ts.write("\n")

    if problem == "c":
        outfile_ts.write("Problem Type : Classification" + "\n")
    else:
        outfile_ts.write("Problem Type : Regression" + "\n")

    outfile_ts.write("\n")
    outfile_ts.write("Value for k : " + str(k) + "\n")
    outfile_ts.write("\n")

    accuracy = np.mean(accuracy_statistics)

    if problem == "c":
        accuracy *= 100
        outfile_ts.write("Classification Accuracy : " + str(accuracy) + "%\n")
    else: 
        outfile_ts.write("Mean Squared Error : " + str(accuracy) + "\n")

    # Print to the console
    print()
    print("----------------------------------------------------------")
    print(ALGORITHM_NAME + " Summary Statistics")
    print("----------------------------------------------------------")
    print("Data Set : " + data_path)
    print()
    print()
    print("Accuracy Statistics for All 5 Experiments:")
    print(accuracy_statistics)
    print()
    # Write the relevant stats to a file
    print()

    if problem == "c":
        print("Problem Type : Classification")
    else:
        print("Problem Type : Regression")

    print()
    print("Value for k : " + str(k))
    print()
    if problem == "c":
        print("Classification Accuracy : " + str(accuracy) + "%")
    else: 
        print("Mean Squared Error : " + str(accuracy))

    print()

    # Close the files
    outfile_tr.close()
    outfile_ts.close()

main()

Return to Table of Contents

Output Statistics of the K-Nearest Neighbors Algorithm on the Image Segmentation Data Set

This section shows the results for the runs of the k-nearest neighbors algorithm on the image segmentation data set. I used a k-value of 4, but feel free to change this and see what accuracy value you get.

Test Statistics

knn_stats

Trace Runs

Here are the trace runs of the algorithm:

trace-runs-k-nearest-neighbors

Return to Table of Contents

Condensed K-Nearest Neighbors Algorithm

The time and space complexity of the regular k-nearest neighbors algorithm described above is directly proportional to the number of instances in the training set. This could potentially present a problem with massively large data sets. This is where the condensed k-nearest neighbors algorithm (ck-NN) comes in handy.

The idea behind ck-NN is to reduce the size of the training set by selecting the smallest subset of the training data set that results in no loss of classification accuracy. By systematically removing redundant instances (those that the retained instances can already classify correctly), we reduce the computation time as well as the storage requirement.

Here is how condensed k-NN works:

We start with an empty bin called STORE.

1. Place the first instance in STORE.

2. Check whether the second instance can be classified correctly by its 1-nearest neighbor, using the instances in STORE as the reference. If it is misclassified, transfer it to STORE; otherwise, leave it in the data set.

3. Repeat step 2 for all other instances in the data set.

4. Keep repeating step 2, making continuous passes over the remaining data set, until either a complete pass is made with no instances transferred to STORE, OR there are no instances left in the data set.

5. Use the instances in STORE as the input for the k-NN classification algorithm.

Here is the code in Python for ck-NN:

import numpy as np # Import Numpy library
from knn import Knn

# File name: cknn.py
# Author: Addison Sears-Collins
# Date created: 6/21/2019
# Python version: 3.7
# Description: Implementation of the condensed k-nearest neighbors 
# algorithm (k-NN) from scratch

# Required Data Set Format for Classification Problems:
# Must be all numerical
# Columns (0 through N)
# 0: Instance ID
# 1: Attribute 1 
# 2: Attribute 2
# 3: Attribute 3 
# ...
# N: Actual Class

class CondensedKnn:

    # Constructor
    #   Parameters:
    #     training_set: The training set that we need to prune
    #     problem_type: ('c' for classification)
    def __init__(self, training_set, problem_type="c"):
        self.__training_set = training_set
        self.__problem_type = problem_type

    # Parameters:
    #   None.
    # Returns: 
    #   A training set that has irrelevant instances removed 
    def get_trainingset(self):
        
        # Initialize a condensed training set. Copy it from the actual 
        # training set
        condensed_training_set = np.copy(self.__training_set)

        # Record the number of instances in the condensed training_set
        no_of_training_instances = np.size(condensed_training_set,0)

        # Record the number of columns in the condensed training set
        no_of_training_columns = np.size(condensed_training_set,1)

        # Print the initial number of instances in the condensed training set
        print("\nBefore condensing: " + str(
            no_of_training_instances) + " training instances\n")

        # Record the column index of the actual class of the training_set
        actual_class_column = no_of_training_columns - 1

        # Initialize an array named store with the first instance of the 
        # condensed training_set
        store = np.copy(condensed_training_set[0,:])

        # Create a 2D array
        new_row = np.copy(condensed_training_set[0,:])
        store = np.vstack([store,new_row])

        # For the second instance to the last instance in the condensed 
        # training_set
        row = 1
        while row < no_of_training_instances:
            
            # Record the actual class value
            actual_class_value = condensed_training_set[
                row,actual_class_column]

            # Create an object of class Knn
            knn1 = Knn(1,self.__problem_type) 

            # Neighbor array is a 2D array containing the neighbors 
            # (actual class value, distance) 
            # Get the nearest neighbor for each instance
            this_instance = condensed_training_set[row,:]
            neighbor_array = knn1.get_neighbors(store,this_instance)

            # Extract the actual class values
            neighbors_arr = neighbor_array[:,0]
  
            # Predicted class value stored in the variable prediction
            prediction = knn1.make_prediction(neighbors_arr)

            # If actual class value is not equal to the prediction
            # Append that instance to the store array
            # Remove this instance from the condensed training_set
            if actual_class_value != prediction:
                new_row = np.copy(this_instance)
                store = np.vstack([store,new_row])

                condensed_training_set = np.delete(
                    condensed_training_set, row, 0)
                no_of_training_instances -= 1
            
            row += 1

        # Declare the stopping criteria. We stop when either one complete
        # pass is made through the condensed training set with no more 
        # transfers of instances to store or there are no more instances 
        # remaining in the condensed training data set
  
        no_more_transfers_to_store = False
        no_more_instances_left = None
  
        # Update the number of instances in the condensed training_set
        no_of_training_instances = np.size(condensed_training_set,0)
        
        if no_of_training_instances > 0:
            no_more_instances_left = False
        else:
            no_more_instances_left = True

        while not(no_more_transfers_to_store) and not(no_more_instances_left):
            # Reset the number of transfers_made to 0
            transfers_made = 0

            # For the second instance to the last instance in the condensed 
            # training_set
            row = 0
            while row < no_of_training_instances:

                # Record the actual class value
                actual_class_value = condensed_training_set[
                    row,actual_class_column]

                # Create an object of class Knn
                knn1 = Knn(1,self.__problem_type) 

                # Neighbor array is a 2D array containing the neighbors 
                # (actual class value, distance) 
                # Get the nearest neighbor for each instance
                this_instance = condensed_training_set[row,:]
                neighbor_array = knn1.get_neighbors(store,this_instance)

                # Extract the actual class values
                neighbors_arr = neighbor_array[:,0]
  
                # Predicted class value stored in the variable prediction
                prediction = knn1.make_prediction(neighbors_arr)

                # If actual class value is not equal to the prediction
                # Append that instance to the store array
                # Remove this instance from the condensed training_set
                if actual_class_value != prediction:
                    new_row = np.copy(this_instance)
                    store = np.vstack([store,new_row])

                    condensed_training_set = np.delete(
                        condensed_training_set, row, 0)
                    no_of_training_instances -= 1
                    transfers_made += 1
            
                row += 1
        
            # Update the number of instances in the condensed training_set
            no_of_training_instances = np.size(condensed_training_set,0)

            if no_of_training_instances > 0:
                no_more_instances_left = False
            else:
                no_more_instances_left = True

            if transfers_made > 0:
                no_more_transfers_to_store = False
            else: 
                no_more_transfers_to_store = True

        # Delete row 0 from the store
        store = np.delete(store,0,0)

        # Print the final number of instances in the store
        print("After condensing: " + str(
            np.size(store,0)) + " training instances\n")
        return store

Here is the code (Python) for the driver that executes the program above. It uses the regular k-NN code I presented earlier in this post as well as the five-fold stratified cross-validation code:

import pandas as pd # Import Pandas library 
import numpy as np # Import Numpy library
from five_fold_stratified_cv import FiveFoldStratCv
from knn import Knn
from cknn import CondensedKnn

# File name: cknn_driver.py
# Author: Addison Sears-Collins
# Date created: 6/21/2019
# Python version: 3.7
# Description: Driver for the cknn.py program 
# (Condensed K-Nearest Neighbors)

# Required Data Set Format for Classification Problems:
# Must be all numerical
# Columns (0 through N)
# 0: Instance ID
# 1: Attribute 1 
# 2: Attribute 2
# 3: Attribute 3 
# ...
# N: Actual Class

################# INPUT YOUR OWN VALUES IN THIS SECTION ######################
ALGORITHM_NAME = "Condensed K-Nearest Neighbor"
SEPARATOR = ","  # Separator for the data set (e.g. "\t" for tab data)
##############################################################################

def main():

    print("Welcome to the " +  ALGORITHM_NAME + " program!")
    print()
    print("Running " + ALGORITHM_NAME + ". Please wait...")
    print()

    # k value that will be tuned
    k = eval(input("Enter a value for k: ") )

    # "c" for classification
    problem = input("Press  \"c\" for classification: ") 

    # Directory where data set is located
    data_path = input("Enter the path to your input file: ") 

    # Read the full text file and store records in a Pandas dataframe
    pd_data_set = pd.read_csv(data_path, sep=SEPARATOR)

    # Convert dataframe into a Numpy array
    np_data_set = pd_data_set.to_numpy(copy=True)

    # Show functioning of the program
    trace_runs_file = input("Enter the name of your trace runs file: ") 

    # Open a new file to save trace runs
    outfile_tr = open(trace_runs_file,"w") 

    # Testing statistics
    test_stats_file = input("Enter the name of your test statistics file: ") 

    # Open a test_stats_file 
    outfile_ts = open(test_stats_file,"w")

    # Create an object of class FiveFoldStratCv
    fivefolds1 = FiveFoldStratCv(np_data_set,problem)

    # The number of folds in the cross-validation
    NO_OF_FOLDS = 5 

    # Create an object of class Knn
    knn1 = Knn(k,problem) 

    # Generate the five stratified folds
    fold0, fold1, fold2, fold3, fold4 = fivefolds1.get_five_folds()

    training_dataset = None
    test_dataset = None

    # Create an empty array of length 5 to store the accuracy_statistics 
    # (classification accuracy for classification problems or mean squared
    # error for regression problems)
    accuracy_statistics = np.zeros(NO_OF_FOLDS)

    # Run k-NN the designated number of times as indicated by the number of folds
    for experiment in range(0, NO_OF_FOLDS):

        print()
        print("Running Experiment " + str(experiment + 1) + " ...")
        print()
        outfile_tr.write("Running Experiment " + str(experiment + 1) + " ...\n")
        outfile_tr.write("\n")

        # Each fold will have a chance to be the test data set
        if experiment == 0:
            test_dataset = fold0
            training_dataset = np.concatenate((
                fold1, fold2, fold3, fold4), axis=0)
        elif experiment == 1:
            test_dataset = fold1
            training_dataset = np.concatenate((
                fold0, fold2, fold3, fold4), axis=0)
        elif experiment == 2:
            test_dataset = fold2
            training_dataset = np.concatenate((
                fold0, fold1, fold3, fold4), axis=0)
        elif experiment == 3:
            test_dataset = fold3
            training_dataset = np.concatenate((
                fold0, fold1, fold2, fold4), axis=0)
        else:
            test_dataset = fold4
            training_dataset = np.concatenate((
                fold0, fold1, fold2, fold3), axis=0)
        
        # Create an object of class CondensedKnn
        cknn1 = CondensedKnn(training_dataset,problem)

        # Get a new, smaller training set with fewer instances
        training_dataset = cknn1.get_trainingset()

        # Actual class column index of the test dataset
        actual_class_column = np.size(test_dataset,1) - 1

        # Create an array of the actual_class_values of the test instances
        actual_class_values = test_dataset[:,actual_class_column]

        no_of_test_instances = np.size(test_dataset,0)

        # Make an empty array called predicted_class_values which will 
        # store the predicted class values. It should be the same length 
        # as the number of test instances
        predicted_class_values = np.zeros(no_of_test_instances)

        # For each row in the test data set
        for row in range(0, no_of_test_instances):
  
            # Neighbor array is a 2D array containing the neighbors 
            # (actual class value, distance) 
            # Get the k nearest neighbors for each test instance
            this_instance = test_dataset[row,:]
            neighbor_array = knn1.get_neighbors(training_dataset,this_instance)

            # Extract the actual class values
            neighbors_arr = neighbor_array[:,0]
  
            # Predicted class value stored in the variable prediction
            prediction = knn1.make_prediction(neighbors_arr)
  
            # Record the prediction in the predicted_class_values array
            predicted_class_values[row] = prediction

        # Calculate the classification accuracy of the predictions 
        # (k-NN classification) or the mean squared error (k-NN regression)
        accuracy = knn1.get_accuracy(actual_class_values,predicted_class_values)

        # Store the accuracy in the accuracy_statistics array
        accuracy_statistics[experiment] = accuracy

        # Stats
        temp_acc = accuracy * 100
        outfile_tr.write("Classification Accuracy: " + str(temp_acc) + "%\n")
        outfile_tr.write("\n")
        print("Classification Accuracy: " + str(temp_acc) + "%\n")


    outfile_tr.write("Experiments Completed.\n")
    print("Experiments Completed.")
    print()

    # Write to a file
    outfile_ts.write("----------------------------------------------------------\n")
    outfile_ts.write(ALGORITHM_NAME + " Summary Statistics\n")
    outfile_ts.write("----------------------------------------------------------\n")
    outfile_ts.write("Data Set : " + data_path + "\n")
    outfile_ts.write("\n")
    outfile_ts.write("Accuracy Statistics for All 5 Experiments:")
    outfile_ts.write(np.array2string(
        accuracy_statistics, precision=2, separator=',',
        suppress_small=True))
    outfile_ts.write("\n")

    # Write the relevant stats to a file
    outfile_ts.write("\n")

    outfile_ts.write("Problem Type : Classification" + "\n")

    outfile_ts.write("\n")
    outfile_ts.write("Value for k : " + str(k) + "\n")
    outfile_ts.write("\n")

    accuracy = np.mean(accuracy_statistics)

    accuracy *= 100
    outfile_ts.write("Classification Accuracy : " + str(accuracy) + "%\n")
   
    # Print to the console
    print()
    print("----------------------------------------------------------")
    print(ALGORITHM_NAME + " Summary Statistics")
    print("----------------------------------------------------------")
    print("Data Set : " + data_path)
    print()
    print()
    print("Accuracy Statistics for All 5 Experiments:")
    print(accuracy_statistics)
    print()
    # Write the relevant stats to a file
    print()

    print("Problem Type : Classification")
    print()
    print("Value for k : " + str(k))
    print()
    print("Classification Accuracy : " + str(accuracy) + "%")
    print()

    # Close the files
    outfile_tr.close()
    outfile_ts.close()

main()
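
To run the driver, save it as cknn_driver.py in the same directory as knn.py, cknn.py, and five_fold_stratified_cv.py, then launch it from a terminal. The session below is only an illustration; the value of k and the file names are example inputs, not requirements:

python cknn_driver.py

Welcome to the Condensed K-Nearest Neighbor program!

Running Condensed K-Nearest Neighbor. Please wait...

Enter a value for k: 3
Press "c" for classification: c
Enter the path to your input file: my_dataset.txt
Enter the name of your trace runs file: cknn_trace_runs.txt
Enter the name of your test statistics file: cknn_test_stats.txt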

Here are the results for those runs as well as a comparison to regular k-NN:

[Figure: ck-NN per-fold accuracy statistics]
[Figure: comparison of k-NN and ck-NN results]

Classification accuracy was greater than 80% for both the k-NN and ck-NN runs. The classification accuracy for the k-NN runs was about 4 percentage points greater than ck-NN. However, the variability of the classification accuracy for the five experiments in the ck-NN runs was greater than for the k-NN runs.

In a real-world classification setting, I would select the k-NN algorithm over ck-NN because its performance is more consistent. In addition, while ck-NN can speed up prediction by searching a smaller training set, that gain comes at the cost of the upfront work needed to prune redundant instances from the training set.
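
To put a number on that variability, you can compute the mean and the sample standard deviation of the five fold accuracies for each algorithm. The per-fold values below are placeholders for illustration only; substitute the numbers from your own accuracy_statistics arrays:

import numpy as np

# Placeholder per-fold accuracies (as fractions); replace with your own results
knn_acc = np.array([0.86, 0.84, 0.85, 0.83, 0.87])
cknn_acc = np.array([0.84, 0.78, 0.86, 0.76, 0.82])

for name, acc in (("k-NN", knn_acc), ("ck-NN", cknn_acc)):
    mean = acc.mean() * 100
    std = acc.std(ddof=1) * 100  # sample standard deviation
    print(name + " mean accuracy: " + str(round(mean, 2)) +
          "%, std dev: " + str(round(std, 2)) + "%")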


How to Make an Object Tracking Robot Using Raspberry Pi

In this post, I will show you how to give your wheeled robot the ability to follow a colored ball. You will get your first taste of computer vision and image processing.

Requirements

Here are the requirements:

  • Build a wheeled robot powered by Raspberry Pi that must identify and follow a yellow rubber ball using OpenCV, a library of programming functions for real-time computer vision and image processing.

You Will Need

You will need the wheeled robot built in the earlier posts, a Raspberry Pi Camera Module, and a yellow rubber ball.

Directions

Connecting the Raspberry Pi Camera Module

Make sure the Raspberry Pi is turned OFF.

Open the Camera Serial Interface on the Raspberry Pi. It is located next to the 3.5mm audio jack. Pull it upwards delicately from either side.

Insert the ribbon of the camera module into the Camera Serial Interface. Make sure the silver contacts face away from the 3.5mm audio jack.


Hold the ribbon in place while pushing down on the Camera Serial Interface port. Make sure it is closed.


Mount the camera to the front of the robot.


Power up Raspberry Pi.

Open up a configuration window:

sudo raspi-config

Interfacing Options -> ENTER -> Camera -> ENTER -> Yes

The camera is enabled.

Now, we need to set the resolution.

Advanced Options -> Resolution -> DMT Mode 82 1920×1080 60Hz 16:9 -> ENTER -> Finish

Restart the Raspberry Pi by typing the following in a terminal window.

sudo reboot

Testing the Raspberry Pi Camera Module

We need to take a test photo with our newly installed camera module.

Open a terminal window. Type the following command:

raspistill -o test_photo.jpg

Go to your home directory to see if the test photo is there. Here is the photo that mine took (back of my head).

[Photo: test photo captured with raspistill]

Setting Up Object Tracking

Now, we need to configure our system so the robot can track a yellow rubber ball.

Download the dependencies for OpenCV.

Type the following commands into a terminal window:

sudo apt-get update
sudo apt-get install libblas-dev libatlas-base-dev libjasper-dev libqtgui4 libqt4-test

When prompted, type Y and press ENTER.

Wait a few moments while everything installs.

Install OpenCV using pip.

sudo pip3 install opencv-python

Install the PiCamera library.

sudo apt-get install python3-picamera
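
Before moving on, you can optionally confirm that both libraries import correctly. This quick sanity check is my own addition rather than part of the original directions, and the file name is arbitrary:

# check_install.py: confirm that OpenCV and picamera are available
import cv2
import picamera

print("OpenCV version: " + cv2.__version__)
print("picamera imported successfully")

Run it with python3 check_install.py. If both lines print without errors, you are ready to continue.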

Determining the HSV Value of the Yellow Ball

We need to select an appropriate HSV (hue, saturation, value) value for the yellow ball. HSV is an alternative color representation that is frequently used instead of the RGB (Red Green Blue) color model I covered in my light and sound wheeled robot post.

Here is a diagram of the HSV color space.

[Figure: HSV color cylinder]

Since the ball is yellow, I’ll choose 60 as my starting number.
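
If you would rather compute a starting estimate than guess, OpenCV can convert a single yellow pixel to HSV for you. Note that OpenCV stores hue on a 0-179 scale, so pure yellow lands near 30, which is in line with the value of 29 that ends up working later in this section. This snippet is just an optional helper I am adding for reference:

import cv2
import numpy as np

# A single pure-yellow pixel in BGR order (blue=0, green=255, red=255)
yellow_bgr = np.uint8([[[0, 255, 255]]])

# Convert to HSV; OpenCV represents hue on a 0-179 scale
yellow_hsv = cv2.cvtColor(yellow_bgr, cv2.COLOR_BGR2HSV)
print(yellow_hsv)  # prints [[[ 30 255 255]]], so the hue is about 30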

Open IDLE in your Raspberry Pi, and create a new file in your robot directory. Name it color_tester.py.

Here is the code for the program:

# import the necessary packages
from picamera.array import PiRGBArray
from picamera import PiCamera
import time
import cv2
import numpy as np


# initialize the camera and grab a reference to the raw camera capture
camera = PiCamera()
camera.resolution = (640, 480)
camera.framerate = 32
rawCapture = PiRGBArray(camera, size=(640, 480))

while True:
	while True:
		try:
			hue_value = int(input("Hue value between 10 and 245: "))
			if (hue_value < 10) or (hue_value > 245):
				raise ValueError
		except ValueError:
			print("That isn't an integer between 10 and 245, try again")
		else:
			break

	# Lower and upper HSV bounds around the chosen hue
	# (despite the _red names, these track whatever hue value you entered)
	lower_red = np.array([hue_value-10,100,100])
	upper_red = np.array([hue_value+10, 255, 255])

	for frame in camera.capture_continuous(rawCapture, format="bgr", use_video_port=True):
		image = frame.array

		hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

		color_mask = cv2.inRange(hsv, lower_red, upper_red)

		result = cv2.bitwise_and(image, image, mask= color_mask)

		cv2.imshow("Camera Output", image)
		cv2.imshow("HSV", hsv)
		cv2.imshow("Color Mask", color_mask)
		cv2.imshow("Final Result", result)

		rawCapture.truncate(0)

		k = cv2.waitKey(5) #& 0xFF
		if "q" == chr(k & 255):
			break

Place your ball about a yard in front of the camera.


Run the newly created program.

python3 color_tester.py

When prompted for a hue value, enter 60.

You will see four windows.

  • Window 1: The raw camera frame
  • Window 2: The HSV representation of the frame
  • Window 3: The color mask, showing which pixels match the chosen hue value
  • Window 4: The final result, i.e. the frame with everything that does not match the chosen hue blacked out

[Screenshot: the four output windows with the yellow ball]

To try a different hue value, click on any of the four windows and press Q to halt the video output.

Go back to the terminal window and enter a new hue value. I'll try 29 this time. It worked!

Keep trying different numbers until Window 4 shows mostly your ball and nothing else. Be patient; you may need to try a lot of values.


Write down the hue value you ended up with on a sheet of paper.

Press CTRL-C in the terminal window to stop running color_tester.py.

Coding the Ball-Following Program

Open IDLE. Create a new file in your robot directory named:

ball_following_yellow.py

Here is the code (credit to Matt Timmons-Brown, author of an excellent book on Raspberry Pi robotics, Learn Robotics with Raspberry Pi):

from picamera.array import PiRGBArray
from picamera import PiCamera
import cv2
import numpy as np
import gpiozero

camera = PiCamera()
image_width = 640
image_height = 480
camera.resolution = (image_width, image_height)
camera.framerate = 32
rawCapture = PiRGBArray(camera, size=(image_width, image_height))
center_image_x = image_width / 2
center_image_y = image_height / 2
minimum_area = 250
maximum_area = 100000

robot = gpiozero.Robot(left=(22,27), right=(17,18))
forward_speed = 1.0
turn_speed = 0.8

HUE_VAL = 29

lower_color = np.array([HUE_VAL-10,100,100])
upper_color = np.array([HUE_VAL+10, 255, 255])

for frame in camera.capture_continuous(rawCapture, format="bgr", use_video_port=True):
	image = frame.array

	hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

	color_mask = cv2.inRange(hsv, lower_color, upper_color)

	# Find the contours of the masked regions
	# (this three-value return is the OpenCV 3.x signature; OpenCV 4.x returns
	# only contours and hierarchy)
	image2, contours, hierarchy = cv2.findContours(color_mask, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

	object_area = 0
	object_x = 0
	object_y = 0

	for contour in contours:
		x, y, width, height = cv2.boundingRect(contour)
		found_area = width * height
		center_x = x + (width / 2)
		center_y = y + (height / 2)
		if object_area < found_area:
			object_area = found_area
			object_x = center_x
			object_y = center_y
	if object_area > 0:
		ball_location = [object_area, object_x, object_y]
	else:
		ball_location = None

	if ball_location:
		if (ball_location[0] > minimum_area) and (ball_location[0] < maximum_area):
			if ball_location[1] > (center_image_x + (image_width/3)):
				robot.right(turn_speed)
				print("Turning right")
			elif ball_location[1] < (center_image_x - (image_width/3)):
				robot.left(turn_speed)
				print("Turning left")
			else:
				robot.forward(forward_speed)
				print("Forward")
		elif (ball_location[0] < minimum_area):
			robot.left(turn_speed)
			print("Target isn't large enough, searching")
		else:
			robot.stop()
			print("Target large enough, stopping")
	else:
		robot.left(turn_speed)
		print("Target not found, searching")

	rawCapture.truncate(0)

Running the Ball-Following Program

Place your robot in an open space on the floor with the yellow rubber ball.

Run the program.

python3 ball_following_yellow.py

Whenever you want to stop the program, press CTRL-C.
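
If you interrupt the program with CTRL-C, Python prints a KeyboardInterrupt traceback. A tidier way to exit (my own addition, not part of the original program) is to catch the interrupt and stop the robot explicitly:

import gpiozero

robot = gpiozero.Robot(left=(22, 27), right=(17, 18))

try:
    # ... the ball-following loop from ball_following_yellow.py goes here ...
    pass
except KeyboardInterrupt:
    print("Stopping")
finally:
    robot.stop()  # explicitly stop the motors before the program exits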

Video