Advantages of the K-Nearest Neighbors Algorithm

For the k-nearest neighbors algorithm (k-NN), we store a set of instances and then classify each new instance based on the most common class among that instance’s k neighboring instances, where k is some positive integer.
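This procedure can be sketched in a few lines of Java. The `Knn` class below is a minimal, hypothetical implementation for 2-D points with Euclidean distance and a simple majority vote (ties broken arbitrarily), not code from any particular library:

```java
import java.util.*;

public class Knn {

    // Classify a query point by majority vote among its k nearest
    // stored points, using Euclidean distance.
    static int classify(double[][] points, int[] labels, double[] query, int k) {
        Integer[] idx = new Integer[points.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        // Sort stored instance indices by distance to the query
        Arrays.sort(idx, Comparator.comparingDouble(i -> distance(points[i], query)));
        // Count the class votes of the k nearest neighbors
        Map<Integer, Integer> votes = new HashMap<>();
        for (int i = 0; i < k; i++)
            votes.merge(labels[idx[i]], 1, Integer::sum);
        // Return the most common class among the neighbors
        return Collections.max(votes.entrySet(), Map.Entry.comparingByValue()).getKey();
    }

    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++)
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {0, 1}, {5, 5}, {6, 5}};
        int[] labels      = { 0,     0,     1,     1    };
        // The query sits in the class-1 cluster, so its 3 nearest
        // neighbors vote 1, 1, 0 and the majority class is 1
        System.out.println(classify(points, labels, new double[]{5.5, 5.0}, 3)); // prints 1
    }
}
```

Note that "storing the instances" really is all the model is: classification cost is paid at query time, by scanning the stored data.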

Here are some reasons why we would consider using k-nearest neighbors instead of another model type:

Simplicity

k-NN is straightforward and easy to understand. If you need an algorithm that you can easily explain to a non-technical boss or project/product manager, k-NN is a good starting point.

Non-parametric

Instead of making a set of assumptions about the data (e.g. linearity, conditional independence), k-NN lets the data speak for itself rather than fitting a distribution to it. The only parameter you need to set is k, the number of neighbors to consider when classifying a new, unseen instance. The only assumption you make is that neighboring instances tend to be similar to each other.

Noisy Data

With a sufficiently large value of k, the k-NN algorithm can be robust to noisy data: a few mislabeled or outlier instances get outvoted by the majority of their neighbors.

No Training Needed

k-NN is a lazy algorithm in that there is no explicit training step. Because there is nothing to train, you can classify new instances right from the start; the computational work is simply deferred to prediction time.

Flexibility

If you need an algorithm that can do both regression and classification, k-NN can do that for you.

Multi-class Problems

If you need to classify data in which there are multiple classes instead of just two classes, k-NN can handle that. 

Continuous Improvement

To update k-NN, all you need to do is add the new training instance to the stored data set. Since there is no training step, k-NN does not need to rebuild a model every time an instance is added. The more instances you store, the better k-NN can classify unseen instances (at the cost of more memory and slower predictions).

Linear Separability and the XOR Problem

Classification algorithms like Logistic Regression and Naive Bayes produce linear decision boundaries (in their standard forms), so they only work well when the classes are linearly separable.

What Does Linearly Separable Mean?

Consider a data set with two attributes x1 and x2 and two classes 0 and 1. Let class 0 = o and class 1 = x.

[Figure: linearly separable data — a straight line separates the x's from the o's]

A straight line (or, in higher dimensions, a plane or hyperplane) can be used to separate the two classes (i.e. the x's from the o's). In other words, a single linear decision surface can serve as the boundary between the two classes.
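A linear decision surface amounts to checking which side of a line a point falls on. The sketch below uses made-up weights w1, w2 and intercept b (the hypothetical line x1 + x2 = 3) to classify points on either side:

```java
public class LinearBoundary {

    // Classify as 1 if the point lies on the positive side of the
    // line w1*x1 + w2*x2 + b = 0, otherwise 0. The weights here are
    // hypothetical values chosen for illustration.
    static int classify(double x1, double x2) {
        double w1 = 1.0, w2 = 1.0, b = -3.0; // the line x1 + x2 = 3
        return (w1 * x1 + w2 * x2 + b > 0) ? 1 : 0;
    }

    public static void main(String[] args) {
        System.out.println(classify(0.5, 1.0)); // below the line: prints 0
        System.out.println(classify(3.0, 2.0)); // above the line: prints 1
    }
}
```

Models like Logistic Regression learn the values of w1, w2, and b from data, but the decision rule has exactly this shape.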

When the data is nonlinearly separable, we cannot use linear models. We must instead use models that can handle nonlinearly separable data.

What Does Nonlinearly Separable Mean?

Here is an example. Consider the XOR function acting on training data that has two attributes x1 and x2 and a class (0 or 1).

The function x1 XOR x2 has the following truth table:

    x1 | x2 | x1 XOR x2
    ---+----+----------
     0 |  0 |     0
     0 |  1 |     1
     1 |  0 |     1
     1 |  1 |     0
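You can reproduce this truth table directly with Java's `^` operator, which is bitwise XOR on integers:

```java
public class XorTruthTable {

    // XOR of two bits via Java's bitwise XOR operator
    static int xor(int a, int b) {
        return a ^ b;
    }

    public static void main(String[] args) {
        int[][] inputs = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
        for (int[] row : inputs)
            System.out.println(row[0] + " XOR " + row[1] + " = " + xor(row[0], row[1]));
    }
}
```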

Now let us graph this function. For the class, black circles = 1 and white circles = 0.

[Figure: the XOR function plotted — black circles = 1, white circles = 0; no straight line separates the two classes]

We could try all day to place a line on the graph that separates the classes, and we still would not succeed. This is known as the XOR problem. It is a clear example of a case where linear classifiers like Naive Bayes and Logistic Regression fail, and where we would need to consider other types of machine learning models that can handle nonlinearly separable data.
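We can even convince ourselves of this by brute force. The sketch below grid-searches thousands of candidate lines w1*x1 + w2*x2 + b = 0 and checks whether any of them puts the two XOR classes on opposite sides; none does (the grid is an illustrative choice, but the impossibility holds for every line):

```java
public class XorNotSeparable {

    // The four XOR points and their classes
    static final double[][] X = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
    static final int[] Y = {0, 1, 1, 0};

    // True if the line w1*x1 + w2*x2 + b = 0 puts every class-1 point
    // on the positive side and every class-0 point on the other side
    static boolean separates(double w1, double w2, double b) {
        for (int i = 0; i < 4; i++) {
            double s = w1 * X[i][0] + w2 * X[i][1] + b;
            if (Y[i] == 1 && s <= 0) return false;
            if (Y[i] == 0 && s > 0) return false;
        }
        return true;
    }

    // Grid-search many candidate lines; none separates XOR
    static boolean anyLineSeparates() {
        for (double w1 = -5; w1 <= 5; w1 += 0.25)
            for (double w2 = -5; w2 <= 5; w2 += 0.25)
                for (double b = -5; b <= 5; b += 0.25)
                    if (separates(w1, w2, b)) return true;
        return false;
    }

    public static void main(String[] args) {
        System.out.println(anyLineSeparates()); // prints false
    }
}
```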

Difference Between Recursion and Iteration

If you are learning algorithms or data structures, you have probably come across the terms recursion and iteration. The difference between the two can be confusing, so I will explain both terms using a real-world example.

Eating My Morning Bowl of Oatmeal


Every day before I go to work, I take a shower, get dressed, and then head over to the kitchen to fix myself a big bowl of oatmeal. Breakfast is my favorite meal of the day and the most important. Many days I feel like skipping breakfast so that I can get on with my day. However, on those rare occasions when I do skip breakfast, I usually regret it: I spend the morning hungry and cannot fully concentrate on the tasks at hand.

The problem of eating my morning bowl of oatmeal can be solved both iteratively and recursively. Iteration is the act of repeating a process. So, in the case of oatmeal, an iterative way to solve the problem of eating my bowl of oatmeal would be to do the following:

  1. Take one bite of the oatmeal
  2. Repeat Step 1 until the bowl is empty.

Recursion, on the other hand, is when a method calls itself. It involves breaking a problem into smaller versions of the same problem until you reach a trivial stopping case.

Let’s consider the problem of eating a bowl of oatmeal. I can solve the problem of eating the entire bowl of oatmeal recursively by breaking it into two subproblems as follows:

  • Take one bite of oatmeal
  • Eat the rest of the bowl of oatmeal

Each step of this recursive procedure gets smaller and smaller until all I am left with is one more spoonful of oatmeal. Once I finish that last spoonful, I am done, and I can go on about my day.

Here is how I would implement the iterative and recursive procedures for eating a bowl of oatmeal in Java pseudocode:

Iterative Implementation

public static void main(String[] args) {

    Bowl bowl = new Bowl(); // Assume we start with a full bowl

    while (!bowl.empty()) {
        bowl.takeOneBite(); 
        // Keep taking bites over and over 
        // again until the bowl of oatmeal 
        // is empty
    }
}

Recursive Implementation

public static void eatOatmeal(Bowl bowl) {

    if (bowl.empty()) {  
    // This is our stopping case. 
    // We stop taking bites once 
    // the bowl is empty

        return;  
    }
    else {
        bowl.takeOneBite(); 
        eatOatmeal(bowl); // Recursive call
    }
}