Detect the Corners of Objects Using Harris Corner Detector

In this tutorial, we will write a program that detects corners on arbitrary objects.

Our goal is to build an early prototype of a vision system that can make it easier and faster for robots to identify potential grasp points on unknown objects. 

Real-World Applications

The first real-world application that comes to mind is from Dr. Pieter Abbeel (a well-known robotics professor at UC Berkeley) and his laundry-folding robot.

Dr. Abbeel used the Harris Corner Detector algorithm as a baseline for evaluating how well the robot was able to identify potential grasp points on a variety of laundry items. You can check out his paper at this link.

laundry-folding-robot
Source: YouTube (Credit to Berkeley AI Research lab)

If the robot can identify where to grasp a towel, for example, inverse kinematics can then be calculated so the robot can move its arm so that the gripper (i.e. the end effector) grasps the corner of the towel to begin the folding process.

Prerequisites

  • Python 3 installed, along with the OpenCV (cv2) and NumPy libraries. The program below also uses Python's built-in math module.

Requirements

Here are the requirements:

  • Create a program that can detect corners on an unknown object using the Harris Corner Detector algorithm.

Detecting Corners on Jeans

jeans

We’ll start with this pair of jeans above. Save that image to some folder on your computer.

Now, in the same folder where you saved the image above (we’ll call the file jeans.jpg), open up a new Python program.

Name the program harris_corner_detection.py.

Write the following code:

# Project: Detect the Corners of Objects Using Harris Corner Detector
# Author: Addison Sears-Collins
# Date created: October 7, 2020
# Reference: https://stackoverflow.com/questions/7263621/how-to-find-corners-on-a-image-using-opencv/50556944

import cv2 # OpenCV library
import numpy as np # NumPy scientific computing library
import math # Mathematical functions

# The file name of your image goes here
fileName = 'jeans.jpg'

# Read the image file
img = cv2.imread(fileName)

# Convert the image to grayscale 
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)   

######## The code in this block is optional #########
## Turn the image into a black and white image and remove noise
## using opening and closing

#gray = cv2.threshold(gray, 75, 255, cv2.THRESH_BINARY)[1]

#kernel = np.ones((5,5),np.uint8)

#gray = cv2.morphologyEx(gray, cv2.MORPH_OPEN, kernel)
#gray = cv2.morphologyEx(gray, cv2.MORPH_CLOSE, kernel)

## To create a black and white image, it is also possible to use OpenCV's
## background subtraction methods to locate the object in a real-time video stream 
## and remove shadows.
## See the following links for examples 
## (e.g. Absolute Difference method, BackgroundSubtractorMOG2, etc.):
## https://automaticaddison.com/real-time-object-tracking-using-opencv-and-a-webcam/
## https://automaticaddison.com/motion-detection-using-opencv-on-raspberry-pi-4/

############### End optional block ##################

# Apply a bilateral filter. 
# This filter smooths the image and reduces noise while preserving the edges
bi = cv2.bilateralFilter(gray, 5, 75, 75)

# Apply Harris Corner detection.
# The four parameters are:
#   The input image
#   The size of the neighborhood considered for corner detection
#   Aperture parameter of the Sobel derivative used.
#   Harris detector free parameter 
#   --You can tweak this parameter to get better results 
#   --0.02 for tshirt, 0.04 for washcloth, 0.02 for jeans, 0.05 for contour_thresh_jeans
#   Source: https://docs.opencv.org/3.4/dc/d0d/tutorial_py_features_harris.html
dst = cv2.cornerHarris(bi, 2, 3, 0.02)

# Dilate the result to mark the corners
dst = cv2.dilate(dst,None)

# Create a mask to identify corners
mask = np.zeros_like(gray)

# All pixels above a certain threshold are converted to white         
mask[dst>0.01*dst.max()] = 255

# Convert corners from white to red.
#img[dst > 0.01 * dst.max()] = [0, 0, 255]

# Create an array that lists all the pixels that are corners
coordinates = np.argwhere(mask)

# Convert array of arrays to lists of lists
coordinates_list = [l.tolist() for l in list(coordinates)]

# Convert list to tuples
coordinates_tuples = [tuple(l) for l in coordinates_list]

# Create a distance threshold
thresh = 50

# Compute the Euclidean distance between two points
def distance(pt1, pt2):
    (x1, y1), (x2, y2) = pt1, pt2
    dist = math.sqrt( (x2 - x1)**2 + (y2 - y1)**2 )
    return dist

# Keep only corners that are at least `thresh` pixels apart.
# Build a new list rather than removing elements from a list
# while iterating over it, which can skip elements.
filtered_coordinates = []
for pt in coordinates_tuples:
    if all(distance(pt, kept) >= thresh for kept in filtered_coordinates):
        filtered_coordinates.append(pt)
coordinates_tuples = filtered_coordinates

# Place the corners on a copy of the original image
img2 = img.copy()
for pt in coordinates_tuples:
    print(tuple(reversed(pt))) # Print corners to the screen
    cv2.circle(img2, tuple(reversed(pt)), 10, (0, 0, 255), -1)
cv2.imshow('Image with detected corners', img2) 
cv2.imwrite('harris_corners_jeans.jpg', img2) 

# Exit OpenCV
if cv2.waitKey(0) & 0xff == 27:
    cv2.destroyAllWindows()

Run the code. Here is what you should see:

harris_corners_jeans

Now, uncomment the optional block in the code. That block converts the grayscale image into a binary black and white image of the jeans before corner detection. The purpose of the optional block of code is to remove some of the noise that is present in the input image.
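Note that the script as written only displays the final result. If you want to see the intermediate binary image in its own window, you can add a couple of lines right after the optional block, for example:

# Display the intermediate binary image (press any key to continue)
cv2.imshow('Binary image', gray)
cv2.waitKey(0)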

harris_corners_contour_thresh_jeans

Notice that when we remove the noise, we get a lot more potential grasp points (i.e. corners).

Detecting Corners on a T-Shirt

To detect corners on a t-shirt, you’ll need to tweak the fourth parameter (the Harris detector free parameter) on this line of the code. I typically use 0.02, but you can try another value like 0.04:

# 0.02 for tshirt, 0.04 for washcloth, 0.02 for jeans, 0.05 for contour_thresh_jeans
#   Source: https://docs.opencv.org/3.4/dc/d0d/tutorial_py_features_harris.html
dst = cv2.cornerHarris(bi, 2, 3, 0.02)

Let’s detect the corners on an ordinary t-shirt.

Here is the input image (tshirt.jpg):

tshirt

Change the fileName variable in your code so that it is assigned the name of the image (‘tshirt.jpg’).
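That is, the line near the top of the program becomes:

fileName = 'tshirt.jpg'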

Here is the output image:

harris_corners_tshirt

Detecting Corners on a Washcloth

Input Image (washcloth.jpg):

washcloth

Output image:

harris_corners_washcloth

That’s it. Keep building!

How To Multiply Two Quaternions Together Using Python

Here is how we multiply two quaternions together using Python. Multiplying two quaternions has the effect of composing the two rotations: performing one rotation about an axis and then performing a second rotation about another axis.

import numpy as np

def quaternion_multiply(Q0,Q1):
    """
    Multiplies two quaternions.

    Input
    :param Q0: A 4 element array containing the first quaternion (q01,q11,q21,q31) 
    :param Q1: A 4 element array containing the second quaternion (q02,q12,q22,q32) 

    Output
    :return: A 4 element array containing the final quaternion (q03,q13,q23,q33) 

    """
    # Extract the values from Q0
    w0 = Q0[0]
    x0 = Q0[1]
    y0 = Q0[2]
    z0 = Q0[3]
	
    # Extract the values from Q1
    w1 = Q1[0]
    x1 = Q1[1]
    y1 = Q1[2]
    z1 = Q1[3]
	
    # Compute the product of the two quaternions, term by term
    Q0Q1_w = w0 * w1 - x0 * x1 - y0 * y1 - z0 * z1
    Q0Q1_x = w0 * x1 + x0 * w1 + y0 * z1 - z0 * y1
    Q0Q1_y = w0 * y1 - x0 * z1 + y0 * w1 + z0 * x1
    Q0Q1_z = w0 * z1 + x0 * y1 - y0 * x1 + z0 * w1
	
    # Create a 4 element array containing the final quaternion
    final_quaternion = np.array([Q0Q1_w, Q0Q1_x, Q0Q1_y, Q0Q1_z])
	
    # Return a 4 element array containing the final quaternion (q03,q13,q23,q33)
    return final_quaternion
    
Q0 = np.random.rand(4) # First quaternion
Q1 = np.random.rand(4) # Second quaternion
Q = quaternion_multiply(Q0, Q1)
print("{0} x {1} = {2}".format(Q0, Q1, Q))

How to Convert a Quaternion to a Rotation Matrix

In this tutorial, I’ll show you how to convert a quaternion to a three-dimensional rotation matrix. At the end of this post, I have provided the Python code to perform the conversion.

What is a Quaternion?

A quaternion is one of several mathematical ways to represent the orientation and rotation of an object in three dimensions. Another way is to use Euler angle-based rotation matrices (i.e. roll, pitch, and yaw), as I did in this post and this post.

Quaternions are often used instead of Euler angle rotation matrices because “compared to rotation matrices they are more compact, more numerically stable, and more efficient” (Source: Wikipedia).

Note that a quaternion describes just the rotation of a coordinate frame (i.e. some object in 3D space) about an arbitrary axis, but it doesn’t tell you anything about that object’s position.

The Use of Quaternions in Robotics

Quaternions are the default method of representing orientations and rotations in ROS, the most popular platform for robotics software development.

In robotics, we are always trying to rotate stuff. For example, we might observe an object in a camera. In order to get a robotic arm to grab the object, we need to transform the object’s coordinates from the camera reference frame to the robot reference frame so that the robot “knows” the location of the object in its own coordinate frame.

Once the rotation from camera pixel coordinates to robot base frame coordinates is complete, the robotic arm can then move its motors to the appropriate angles to pick up the object.

How to Represent Quaternions

Quaternions are an extension of complex numbers. However, instead of two values (e.g. a + bi or x + yi, which are the same thing) that represent a point (or vector), we have four values (a, b, c, d):

q = a + bi + cj + dk

complex_numbers
Visualizing a point (a, b) as a complex number on a two-dimensional Argand diagram. Source: Wikipedia

The four values in a quaternion consist of one scalar part and one 3-element vector part.

Instead of a, b, c, and d, you will commonly see:

q = w + xi + yj + zk or q = q0 + q1i + q2j + q3k

  • q0 is a scalar value related to the angle of rotation θ (for a unit quaternion, q0 = cos(θ/2)), as shown in the sketch just below
  • q1, q2, and q3 encode the axis of rotation about which the rotation is performed (for a unit quaternion, they are the components of the axis unit vector scaled by sin(θ/2))
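To make that relationship concrete, here is a minimal sketch (the helper name and the example axis and angle are my own, for illustration) that builds a unit quaternion from an axis-angle pair:

import numpy as np

def quaternion_from_axis_angle(axis, angle):
    # Normalize the rotation axis to a unit vector
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    # q0 = cos(theta/2); (q1, q2, q3) = axis * sin(theta/2)
    q0 = np.cos(angle / 2.0)
    q1, q2, q3 = axis * np.sin(angle / 2.0)
    return np.array([q0, q1, q2, q3])

# A 90-degree rotation about the z-axis
print(quaternion_from_axis_angle([0, 0, 1], np.pi / 2))
# Output: [0.70710678 0.         0.         0.70710678]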

Other ways you can write a quaternion are as follows:

  • q = (q0, q1, q2, q3)
  • q = (q0, q) = q0 + q

The cool thing about quaternions is they work just like complex numbers. In two dimensions, you can rotate a vector using complex number multiplication. You can do the same with quaternions. The math is more complicated with four terms instead of two, but the principle is the same.

Let’s take a look at a two-dimensional example of complex number multiplication so that you can understand the concept of multiplying imaginary (complex) numbers to rotate a vector. Quaternions add a couple more variables to extend this concept to represent rotations in 3D space.

2D Example

Suppose we have a vector on a 2D plane with the following specifications: (x = 3, y = 1).

This vector can be represented in complex numbers as:

3 + i (using the x + yi form of complex numbers)

Let’s rotate this vector 45 degrees (which is π/4 in radians).

To rotate by 45 degrees, we multiply the number by the unit complex number:

cos(π/4) + sin(π/4)i (i.e. e^(iπ/4), by Euler’s formula)

So we have (where sqrt means “take the square root of”):

(1/sqrt(2) + i/sqrt(2)) * (3 + i) = sqrt(2) + 2sqrt(2)i

And since:

sqrt(2) ≈ 1.414

our new vector is:

(x ≈ 1.414, y ≈ 2.828)
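We can verify this result with Python’s built-in complex number support:

import cmath

v = 3 + 1j                               # The vector (x = 3, y = 1) as a complex number
rotation = cmath.exp(1j * cmath.pi / 4)  # Unit complex number for a 45-degree rotation
print(v * rotation)                      # (1.4142...+2.8284...j)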

As I mentioned earlier, the math for multiplying real quaternions together is more complex than this, but the principle is the same. Multiply an orientation (represented as a quaternion) by a rotation (represented as a quaternion) to get the new orientation.
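In code, using the quaternion_multiply function from the previous section, this looks like the sketch below (note that which side the rotation multiplies on depends on whether it is expressed in the global or the local frame; the values here are my own, for illustration):

import numpy as np

# Assumes the quaternion_multiply function defined in the previous section
q_orientation = np.array([1.0, 0.0, 0.0, 0.0])           # Current orientation (identity)
q_rotation = np.array([0.7071068, 0.0, 0.0, 0.7071068])  # 90-degree rotation about the z-axis
q_new = quaternion_multiply(q_rotation, q_orientation)   # The new orientation
print(q_new)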

Convert a Quaternion to a Rotation Matrix

Given a quaternion, you can find the corresponding three-dimensional rotation matrix using the following formula.

quaternion-to-rotation-matrix
Source: Quaternions and Rotation Sequences: A Primer with Applications to Orbits, Aerospace and Virtual Reality by J. B. Kuipers (Chapter 5,  Section 5.14 “Quaternions to Matrices”, pg. 125)
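For reference, here is the matrix that the Python code below implements, written out for a unit quaternion q = (q0, q1, q2, q3):

R = | 2(q0² + q1²) - 1    2(q1q2 - q0q3)      2(q1q3 + q0q2)   |
    | 2(q1q2 + q0q3)      2(q0² + q2²) - 1    2(q2q3 - q0q1)   |
    | 2(q1q3 - q0q2)      2(q2q3 + q0q1)      2(q0² + q3²) - 1 |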

Python Code

In Python code, we have:

import numpy as np

def quaternion_rotation_matrix(Q):
    """
    Convert a quaternion into a full three-dimensional rotation matrix.

    Input
    :param Q: A 4 element array representing the quaternion (q0,q1,q2,q3) 

    Output
    :return: A 3x3 element matrix representing the full 3D rotation matrix. 
             This rotation matrix converts a point in the local reference 
             frame to a point in the global reference frame.
    """
    # Extract the values from Q
    q0 = Q[0]
    q1 = Q[1]
    q2 = Q[2]
    q3 = Q[3]
	
    # First row of the rotation matrix
    r00 = 2 * (q0 * q0 + q1 * q1) - 1
    r01 = 2 * (q1 * q2 - q0 * q3)
    r02 = 2 * (q1 * q3 + q0 * q2)
	
    # Second row of the rotation matrix
    r10 = 2 * (q1 * q2 + q0 * q3)
    r11 = 2 * (q0 * q0 + q2 * q2) - 1
    r12 = 2 * (q2 * q3 - q0 * q1)
	
    # Third row of the rotation matrix
    r20 = 2 * (q1 * q3 - q0 * q2)
    r21 = 2 * (q2 * q3 + q0 * q1)
    r22 = 2 * (q0 * q0 + q3 * q3) - 1
	
    # 3x3 rotation matrix
    rot_matrix = np.array([[r00, r01, r02],
                           [r10, r11, r12],
                           [r20, r21, r22]])
						   
    return rot_matrix
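As a quick usage example (the quaternion value here is my own choice, for illustration), a 90-degree rotation about the z-axis corresponds to q = (cos(45°), 0, 0, sin(45°)):

Q = np.array([0.7071068, 0.0, 0.0, 0.7071068]) # 90-degree rotation about the z-axis
print(quaternion_rotation_matrix(Q))
# Expected output (approximately):
# [[ 0. -1.  0.]
#  [ 1.  0.  0.]
#  [ 0.  0.  1.]]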