How the Canny Edge Detector Works

In this post, I will explain how the Canny Edge Detector works. The Canny Edge Detector is a popular edge detection algorithm developed by John F. Canny in 1986. The goal of the Canny Edge Detector is to:

  • Minimize Error: Edges detected by the algorithm should be real edges, not noise, and real edges should not be missed.
  • Good Localization: Minimize the distance between detected edge pixels and real edge pixels.
  • Single Response to Each Edge: Each real edge should be marked only once, and areas of the image that are not edges should produce no response.

How the Canny Edge Detector Works

The Canny Edge Detector Process is as follows:

  1. Gaussian Filter: Smooth the input image with a Gaussian filter to remove noise (using a discrete Gaussian kernel).
  2. Calculate Intensity Gradients: Identify the areas in the image with the strongest intensity gradients (using a Sobel, Prewitt, or Roberts kernel).
  3. Non-maximum Suppression: Apply non-maximum suppression to thin out the edges. We want to remove unwanted pixels that might not be part of an edge.
  4. Thresholding with Hysteresis: Hysteresis (also called double thresholding) involves:
    • Accepting a pixel as an edge if its gradient magnitude exceeds an upper threshold.
    • Rejecting a pixel as an edge if its gradient magnitude is below a lower threshold.
    • If a pixel falls between the two thresholds, accepting it only if it is adjacent to a strong edge pixel (one above the upper threshold).
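As a sketch of step 4, here is a minimal NumPy implementation of double thresholding with hysteresis. The function name, array layout, and threshold values are illustrative; a full implementation would also propagate connectivity iteratively rather than checking direct neighbors in a single pass:

```python
import numpy as np

def hysteresis_threshold(grad_mag, low, high):
    """Classify pixels of a gradient-magnitude image as edges.

    Pixels above `high` are strong edges; pixels between `low` and
    `high` are weak and kept only if an 8-neighbor is a strong edge.
    """
    strong = grad_mag >= high
    weak = (grad_mag >= low) & ~strong
    out = strong.copy()
    rows, cols = grad_mag.shape
    for r in range(rows):
        for c in range(cols):
            if weak[r, c]:
                # Look at the 3x3 neighborhood, clipped at the borders.
                r0, r1 = max(r - 1, 0), min(r + 2, rows)
                c0, c1 = max(c - 1, 0), min(c + 2, cols)
                if strong[r0:r1, c0:c1].any():
                    out[r, c] = True
    return out
```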

Mathematical Formulation of the Canny Edge Detector

More formally, in step 1 of the Canny Edge Detector, we smooth an image by convolving the image with a Gaussian kernel. An example calculation showing the convolution operation is given in the Sobel Operator discussion. Below is an example 5×5 Gaussian kernel that can be used.

[Figure: example 5×5 Gaussian kernel]

We must go through each 5×5 region in the image and apply the convolution operation between a 5×5 portion of the input image (with the pixel of interest as the center cell, or anchor) and the 5×5 kernel above. The element-wise products are then summed to give us the new intensity value for that pixel.
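As a sketch (assuming NumPy), the smoothing step might look like the following. The 5×5 kernel here is the standard binomial approximation of a Gaussian; the kernel in the figure above may differ:

```python
import numpy as np

# 5x5 binomial approximation of a Gaussian kernel; entries sum to 1.
row = np.array([1, 4, 6, 4, 1], dtype=float)
GAUSS_5x5 = np.outer(row, row) / 256.0

def smooth(image, kernel=GAUSS_5x5):
    """Slide the kernel over the image; at each position, take the
    element-wise product with the 5x5 window and sum it (valid region
    only, so the output is smaller than the input)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out
```

Because the Gaussian kernel is symmetric, skipping the kernel flip that distinguishes true convolution from cross-correlation changes nothing here.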

After smoothing the image using the Gaussian kernel, we then calculate the intensity gradients. A common method is to use the Sobel Operator.

Here are the two kernels used in the Sobel algorithm:

x-direction kernel:      y-direction kernel:

-1   0   1               -1  -2  -1
-2   0   2                0   0   0
-1   0   1                1   2   1

The gradient approximations at pixel (x,y) given a 3×3 portion of the source image Ii are calculated as follows:

Gx = x-direction kernel * (3×3 portion of image Ii with (x,y) as the center cell)
Gy = y-direction kernel * (3×3 portion of image Ii with (x,y) as the center cell)

Here, * is not normal matrix multiplication; it denotes the convolution operation.

We then combine the values above to calculate the magnitude of the gradient:

magnitude(G) = square_root(Gx² + Gy²)

The direction of the gradient θ is:

θ = atan(Gy / Gx)

where atan is the arctangent operator.
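In code (assuming NumPy), these two formulas are usually computed with `np.hypot` and `np.arctan2`; `arctan2` handles Gx = 0 and picks the correct quadrant, which the plain atan(Gy / Gx) form does not:

```python
import numpy as np

def gradient_polar(gx, gy):
    """Return (magnitude, direction) of the gradient (Gx, Gy)."""
    return np.hypot(gx, gy), np.arctan2(gy, gx)
```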

Once we have the gradient magnitude and direction, we perform non-maximum suppression by scanning the entire image to get rid of pixels that might not be part of an edge. Non-maximum suppression works by finding pixels that are local maxima in the direction of the gradient (gradient direction is perpendicular to edges).

For example, suppose we have three adjacent pixels a, b, and c, where a and c lie along the gradient direction of b. If pixel b has a greater intensity than both a and c, it is a local maximum and is kept as an edge pixel. Otherwise, if pixel b is not a local maximum, it is suppressed (set to 0, i.e. black) and is not an edge pixel.

a ——> b <edge> ——> c
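A one-dimensional sketch of this check, with hypothetical magnitudes and NumPy assumed:

```python
import numpy as np

def nms_1d(mag):
    """Keep only values that are local maxima among their two
    neighbors; everything else is suppressed to 0."""
    out = np.zeros_like(mag)
    for i in range(1, len(mag) - 1):
        a, b, c = mag[i - 1], mag[i], mag[i + 1]
        if b > a and b > c:
            out[i] = b  # b is a local maximum along the gradient direction
    return out
```

In the full 2-D algorithm, a and c are the neighbors along the gradient direction of b, not simply the pixels to its left and right.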

Non-maximum suppression is not perfect because some of the surviving edges may actually be noise rather than real edges. To solve this, the Canny Edge Detector goes one step further and applies thresholding to remove the weakest edges and keep the strongest ones. Borderline edge pixels are considered strong only if they are connected to strong edge pixels.

Canny Edge Detector Code

This tutorial has the Python code for the Canny Edge Detector.

Conclusion

In this discussion, we covered the Canny Edge Detector. The Canny Edge Detector is just one of many edge detection algorithms.

The most common edge detection algorithms fall into the following categories:

  • Gradient Operators
    • Roberts Cross Operator
    • Sobel Operator
    • Prewitt Operator
  • Canny Edge Detector
  • Laplacian of Gaussian
  • Haralick Operator

Which edge detection algorithm you choose depends on what you are trying to achieve with your application.

Keep building!

How the Laplacian of Gaussian Filter Works

In this post, I will explain how the Laplacian of Gaussian (LoG) filter works. Laplacian of Gaussian is a popular edge detection algorithm.

Edge detection is an important part of image processing and computer vision applications. It is used to detect objects, locate boundaries, and extract features. Edge detection is about identifying sudden, local changes in the intensity values of the pixels in an image.


How LoG Works

Edge detection algorithms like the Sobel Operator work on the first derivative of an image. In other words, if we have a graph of the intensity values for each pixel in an image, the Sobel Operator takes a look at where the slope of the graph of the intensity reaches a peak, and that peak is marked as an edge.

For our 10×1 pixel image, the blue curve below is a plot of the intensity values, and the orange curve is the plot of the first derivative of the blue curve. In layman’s terms, the orange curve is a plot of the slope.

[Figure: the 10×1 pixel image and its intensity plot (blue) with the first-derivative, or slope, curve (orange)]

The orange curve peaks in the middle, so we know that is likely an edge. When we look at the original source image, we confirm that yes, it is an edge.

One limitation with the approach above is that the first derivative of an image might be subject to a lot of noise. Local peaks in the slope of the intensity values might be due to shadows or tiny color changes that are not edges at all.

An alternative to using the first derivative of an image is to use the second derivative, which is the slope of the first derivative curve (i.e. that orange curve above). Such a curve looks something like this (see the gray curve below):

[Figure: the intensity plot with the second-derivative curve (gray)]

An edge occurs where the graph of the second derivative crosses zero. This second derivative-based method is called the Laplacian algorithm.

The Laplacian algorithm is also subject to noise. For example, consider a photo of a cat.

A cat hair or whisker might register as an edge because it is an area of a sharp change in intensity. However, it is not an edge. It is just noise. To solve this problem, a Gaussian smoothing filter is commonly applied to an image to reduce noise before the Laplacian is applied. This method is called the Laplacian of Gaussian (LoG).

We also set a threshold value to distinguish noise from edges. If the second derivative magnitude at a pixel exceeds this threshold, the pixel is part of an edge.
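A one-dimensional sketch of this idea (assuming NumPy; the signal, threshold, and the exact strength test are illustrative):

```python
import numpy as np

def edge_locations_1d(intensity, threshold):
    """Edges are zero crossings of the second derivative whose
    magnitude on at least one side exceeds the threshold."""
    second = np.diff(intensity, n=2)  # discrete second derivative
    edges = []
    for i in range(len(second) - 1):
        if second[i] * second[i + 1] < 0:  # sign change: zero crossing
            if max(abs(second[i]), abs(second[i + 1])) > threshold:
                edges.append(i)  # strong crossing: likely a real edge
    return edges
```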

Mathematical Formulation of LoG

More formally, given a pixel (x, y), the Laplacian L(x,y) of an image with intensity values Ii can be written mathematically as follows:

L(x,y) = ∂²I/∂x² + ∂²I/∂y²

Just like in the case of the Sobel Operator, we cannot calculate the second derivative directly because pixels in an image are discrete. We need to approximate it using the convolution operator. The two most common kernels are:

 0   1   0                1   1   1
 1  -4   1                1  -8   1
 0   1   0                1   1   1

Calculating just the Laplacian will result in a lot of noise, so we need to convolve a Gaussian smoothing filter with the Laplacian filter to reduce noise prior to computing the second derivatives. The equation that combines both of these filters is called the Laplacian of Gaussian and is as follows:

LoG(x,y) = -(1/(πσ⁴)) · (1 − (x² + y²)/(2σ²)) · e^(−(x² + y²)/(2σ²))

The above equation is continuous, so we need to discretize it so that we can use it on discrete pixels in an image.

Here is an example of a LoG approximation kernel where σ = 1.4. This is just an example of one convolution kernel that can be used. There are others that would work as well.

[Figure: example discrete LoG approximation kernel with σ = 1.4]

This LoG kernel is convolved with a grayscale input image to detect the zero crossings of the second derivative. We set a threshold for these zero crossings and retain only those that exceed the threshold. Strong zero crossings are ones that have a big difference between the positive maximum and the negative minimum on either side of the zero crossing. Weak zero crossings are most likely noise, so the thresholding we apply ignores them.
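One way to build such a kernel is to sample the continuous LoG equation on a grid, as in this NumPy sketch (the size, σ, and the zero-sum correction are illustrative choices; note also that some references negate the sign convention):

```python
import numpy as np

def log_kernel(size, sigma):
    """Sample LoG(x, y) on a size x size grid centered at the origin."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    r2 = x**2 + y**2
    s2 = sigma**2
    k = (-1.0 / (np.pi * s2**2)) * (1.0 - r2 / (2.0 * s2)) * np.exp(-r2 / (2.0 * s2))
    return k - k.mean()  # force a zero sum so flat regions give 0
```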

Laplacian of Gaussian Code

This tutorial has the Python code for the Laplacian of Gaussian.

How the Sobel Operator Works

In this post, I will explain how the Sobel Operator edge detection algorithm works.

[Photo: the original photo; the cover photo of this post shows it after the Sobel Operator has been applied.]

In order to understand how the Sobel Operator detects edges, it is important to first understand how to model an edge. Let’s take a look at that now before we examine how the Sobel Operator works.

How to Model Edges Mathematically

Consider a grayscale image that is 10×1 pixels. Such an image would look something like the image below:

[Figure: 10×1 grayscale image with an abrupt intensity change in the middle]

One can clearly see in the image that there is a sudden change in the intensity in the middle of the image. This sudden change is an edge.

Now, let us draw the corresponding intensity graph of each pixel, going from left to right. We will assume that black has an intensity of -1 (the lowest intensity), and white has an intensity of 1 (the highest intensity).

[Figure: plot of the pixel intensity values from left to right]

Mathematically, we can detect an abrupt change in the curve above by graphing the slope of the curve (i.e. taking the first derivative at each point along the curve).

The orange curve below is the graph of the slope at each point along the pixel intensity curve. Note how the curve reaches a maximum in the middle. This peak corresponds to an edge in the original image. It is an area where the intensity (brightness) of the image changes abruptly.

[Figure: the intensity curve with its slope (first derivative) curve in orange]

However, we have run into a problem: I have drawn a continuous curve, but pixel intensity values are discrete. Therefore, in image processing we cannot calculate the first derivatives of an image (known as “spatial derivatives”) directly using standard calculus methods; instead, we estimate them using a mathematical operation known as convolution.

Professor Andrew Ng of Stanford University has a great video on how to perform the convolution operation given an input image.

I will explain the process below.

How to Perform the Convolution Operation Given an Image as Input

To do the convolution operation, we need to use a mathematical tool called a kernel (or filter). A kernel is just a fancy name for a small matrix.

Below is an example of a kernel. This small matrix is 3×3 (3 rows and 3 columns). It happens to be the kernel used in the Sobel algorithm to calculate estimates of the derivatives in the vertical direction of an image. I will explain the Sobel algorithm later in this section.

[Figure: example 3×3 kernel]

The middle cell in a kernel is known as an anchor. In this case, the anchor is 0.

The Sobel Operator, a popular edge detection algorithm, involves estimating the first derivative of an image by doing a convolution between an image (i.e. the input) and two special kernels, one to detect vertical edges and one to detect horizontal edges.

Let’s see how to estimate the first derivative of an image by doing an example convolution calculation…

Suppose we have a 6×6 pixel color image.

[Figure: 6×6 pixel color image]

A color image has 3 channels (red, green and blue). To perform the convolution operation to detect edges, we only need one channel, so we first convert the image to grayscale. Converting the color image to grayscale is the first step of most of the edge detection algorithms, including the Sobel Operator algorithm.

Here are what the intensity values might look like after the grayscale conversion.

[Figure: grayscale intensity values of the 6×6 image (matrix A)]

Now, consider the kernel below (this happens to be the Sobel Vertical Kernel used to detect vertical edges).

[Figure: the Sobel vertical kernel (matrix B)]

To convolve matrix A with matrix B, we first overlay matrix B on top of the upper-left portion of matrix A and then take the element-wise product:

[Figure: matrix B overlaid on the upper-left 3×3 portion of matrix A]

After completing the 9 multiplications, that upper-left portion becomes…

[Figure: the element-wise products for the upper-left portion]

To complete the convolution operation, we add up all the numbers in the matrix above to get:

-150 + -300 + -150 + 0 + 0 + 0 + 150 + 510 + 255 = 315

The output matrix for convolving matrix A with matrix B will be a 4×4 matrix (a 3×3 kernel can be centered on 4×4 = 16 positions inside a 6×6 image). The value above, 315, becomes the first entry in this output matrix. You can see that 315 is a relatively large number.

[Figure: the 4×4 output matrix]

We keep shifting the 3×3 kernel over the image, pixel by pixel, until we have filled up the output matrix above. That is the convolution operation.
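The procedure above can be sketched in a few lines of NumPy. Matrix B is assumed here to be the kernel with rows (-1, -2, -1), (0, 0, 0), (1, 2, 1), which is consistent with the products shown above; the middle row of the example patch is hypothetical, since it is multiplied by the kernel's row of zeros:

```python
import numpy as np

def convolve(image, kernel):
    """Overlay the kernel on each position, take the element-wise
    product, and sum. (Strictly this is cross-correlation; true
    convolution also flips the kernel first.)"""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

kernel_b = np.array([[-1., -2., -1.],
                     [ 0.,  0.,  0.],
                     [ 1.,  2.,  1.]])
patch = np.array([[150., 150., 150.],
                  [ 42.,  42.,  42.],   # hypothetical middle row
                  [150., 255., 255.]])
print(convolve(patch, kernel_b))  # [[315.]]
```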

How the Sobel Operator Works

The Sobel Operator uses kernels and the convolution operation (described above) to detect edges in an image. The algorithm works with two kernels:

  1. A kernel to approximate the intensity change at a pixel in the x-direction (horizontal).
  2. A kernel to approximate the intensity change at a pixel in the y-direction (vertical).

In mathematical jargon, “approximating the intensity change” means approximating the gradient of the intensity values at a pixel. The gradient is the multivariable generalization of the derivative or slope mentioned earlier in this discussion.

Here are the two special kernels used in the Sobel algorithm:

x-direction kernel:      y-direction kernel:

-1   0   1               -1  -2  -1
-2   0   2                0   0   0
-1   0   1                1   2   1

The two kernels above are convolved with each pixel in the original image to identify the regions where the change (gradient) is maximized in magnitude in the x and y directions.

Mathematical Formulation of the Sobel Operator

More formally,

  • Let gradient approximations in the x-direction be denoted as Gx
  • Let gradient approximations in the y-direction be denoted as Gy.

Suppose we have a source image Ii.

We want to calculate the magnitude of the gradient at a pixel (x, y), so we extract from image Ii a 3×3 matrix with pixel (x,y) as the center pixel (i.e. anchor).

The gradient approximations at pixel (x,y) given a 3×3 portion of the source image Ii are calculated as follows:

Gx = x-direction kernel * (3×3 portion of image Ii with (x,y) as the center cell)

Gy = y-direction kernel * (3×3 portion of image Ii with (x,y) as the center cell)

Here, * is not normal matrix multiplication; it denotes the convolution operation I described earlier in this discussion.

We then combine the values above to calculate the magnitude of the gradient at pixel (x,y):

magnitude(G) = square_root(Gx² + Gy²)

The direction of the gradient θ at pixel (x, y) is:

θ = atan(Gy / Gx)

where atan is the arctangent operator.

So, in short, the algorithm goes through each pixel in an image and defines 3×3 matrices that have that pixel as the center cell (anchor pixel). Convolving these matrices with both the x and y-direction kernels yields two values, gradient approximations in the x-direction (Gx) and y-direction (Gy). We square both of these values, add them, and then take the square root of the sum to calculate the magnitude of the gradient at that pixel. Pixels that have gradients of large magnitude are likely part of an edge in an image.
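The whole procedure fits in a short NumPy sketch (the function and variable names are my own):

```python
import numpy as np

KX = np.array([[-1., 0., 1.],
               [-2., 0., 2.],
               [-1., 0., 1.]])   # x-direction kernel
KY = np.array([[-1., -2., -1.],
               [ 0.,  0.,  0.],
               [ 1.,  2.,  1.]])  # y-direction kernel

def sobel_magnitude(image):
    """Gradient magnitude sqrt(Gx^2 + Gy^2) at every interior pixel."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for r in range(h - 2):
        for c in range(w - 2):
            window = image[r:r + 3, c:c + 3]  # (r+1, c+1) is the anchor
            gx = np.sum(window * KX)
            gy = np.sum(window * KY)
            out[r, c] = np.hypot(gx, gy)
    return out
```

Applied to an image with a sharp vertical boundary, the largest magnitudes line up along that boundary.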

Example Calculation

Here is an example calculation showing how to calculate the gradient approximation at a single pixel in the Sobel algorithm. We have already converted the input image into grayscale:

[Figure: the grayscale source image]

We start the calculation of the gradient in the y-direction (Gy) at the pixel in the second column and second row (marked in yellow).

[Figure: the 3×3 selection from the image alongside the y-direction kernel]

First, we convolve the selection in the original image with the y-direction kernel. The result is as follows:

[Figure: the element-wise multiplication of the selection and the y-direction kernel]

Which becomes…

[Figure: the resulting matrix of products]

Then we add up all the numbers in the matrix above:

Gy for the pixel in row 2, column 2 is 315.

We now do a similar calculation to find Gx at that same pixel. The only thing that changes is the kernel we convolve our selection with.

[Figure: the grayscale source image with the pixel of interest marked in yellow]

We start the calculation of the gradient in the x-direction (Gx) at the pixel in the second column and second row (marked in yellow).

[Figure: the 3×3 selection from the image alongside the x-direction kernel]

We convolve the selection in the original image with the x-direction kernel. The result is:

[Figure: the element-wise multiplication of the selection and the x-direction kernel]

Which becomes…

[Figure: the resulting matrix of products]

Then we add up all the numbers in the matrix above:

Gx for the pixel in row 2, column 2 is 315.

Thus, at the pixel at row 2, column 2…

Gx = 315

Gy = 315

We then combine the values above to calculate the magnitude of the gradient at pixel (x,y):

magnitude(G) = square_root(Gx² + Gy²)

magnitude(G) = square_root(315² + 315²)

magnitude(G) ≈ 445

This magnitude above is just our calculation for a single 3×3 region. We have to do the same thing for the other 3×3 regions in the image. And once we have done those calculations, we know that the pixels with the highest magnitude(G) are likely part of an edge.
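The arithmetic above can be double-checked in two lines of Python:

```python
import math

magnitude = math.hypot(315, 315)  # sqrt(315**2 + 315**2)
print(round(magnitude))  # 445
```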

This tutorial on GitHub has a really nice animated demonstration of the Sobel Edge Detection algorithm I explained above.

Sobel Operator Code

This tutorial on image gradients has the Python code for the Sobel Operator.