Difference Between Histogram Equalization and Histogram Matching

In this post, I will explain the difference between histogram equalization and histogram matching. If you are in a hurry, here is the short answer: while the goal of histogram equalization is to produce an output image that has a flattened histogram, the goal of histogram matching is to take an input image and generate an output image that is based upon the shape of a specific (or reference) histogram.

Let’s take a look at the long answer by first examining the definition of a histogram.


What is a Histogram?

In image processing, a histogram shows the number of pixels (or voxels in the case of a 3D image) for each intensity value in a given image.

1-histogram
Image Source: Wikimedia Commons

A histogram is a statistical representation of an image. It doesn’t show any information about where the pixels are located in the image. Therefore, two different images can have equivalent histograms. For example, the two images below are different but have identical histograms because both are 50% white (grayscale value of 255) and 50% black (grayscale value of 0).

2-equivalent-images
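
You can verify this yourself with a few lines of NumPy. The snippet below (a quick sketch, not taken from the original example images) builds two tiny arrays that contain the same pixels arranged differently and shows that their histograms are identical.

import numpy as np # Scientific computing library

# Two tiny "images" that are 50% black and 50% white, arranged differently
left_right = np.array([[0, 255], [0, 255]], dtype=np.uint8) # black | white columns
top_bottom = np.array([[0, 0], [255, 255]], dtype=np.uint8) # black rows over white rows

hist_1, _ = np.histogram(left_right, bins=256, range=(0, 256))
hist_2, _ = np.histogram(top_bottom, bins=256, range=(0, 256))
print(np.array_equal(hist_1, hist_2)) # True: different images, identical histograms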


Histogram Equalization

3-histogram-equalization-before-afterJPG
Image Source: Wikimedia Commons

In histogram equalization (also known as histogram flattening), the goal is to improve contrast in images that are washed out or that have a background and foreground that are either both bright or both dark. Histogram equalization helps sharpen an image.

Low contrast images typically have histograms that are concentrated within a tight range of values. Histogram equalization can improve the contrast in these images by spreading out the histogram so that the intensity values are distributed uniformly over a larger intensity range. Ideally, the histogram of the output image will be perfectly flat.

The two images below are two examples of what the histogram for an input image might look like before and after it goes through histogram equalization.

4-histogram-tranform
Image Source: Wikimedia Commons
5-histogram-transform-2
Image Source: Wikimedia Commons

Histogram equalization is useful in a number of real-world use cases, such as X-rays, thermal imagery, and satellite photos.

Here is some Python code you can use to perform histogram equalization:

# Author: Addison Sears-Collins
# https://automaticaddison.com
# Description: Sharpen an image (i.e. increase contrast) 
# using histogram equalization

import cv2 # Computer vision library
import numpy as np # Scientific computing library

# Read the image
img = cv2.imread('before.jpg',0) # Flag 0 loads the image in grayscale

# Perform histogram equalization
equ = cv2.equalizeHist(img)

# Stack images side-by-side
after = np.hstack((img,equ)) 

# Save the output image
cv2.imwrite('after.jpg',after)

Here is the input:

before

Here is the output generated by the program:

after
Original image (left), Enhanced image (right)


How Histogram Equalization Works

The process for histogram equalization is as follows:

Step 1: Obtain the histogram.

For example, if the image is grayscale with 256 distinct intensity levels i (where i = 0 [black], 1, 2, …, 253, 254, 255 [white]), the probability that a pixel chosen at random will have an intensity level i is as follows:

6-histogram-equation
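
Written out (this is the standard normalized histogram, which is what the equation above expresses):

h(i) = n_i / n

where n_i is the number of pixels with intensity level i and n is the total number of pixels in the image.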

Step 2: Obtain the cumulative distribution function CDF.

The cumulative distribution function H(j) is defined as the probability H of a randomly selected pixel taking one of the intensity values from 0 through j (inclusive). Therefore, given our normalized histogram h(i) from above, we have the following formula:

7-cdf-equation
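
Written out, with h(i) as defined in Step 1:

H(j) = h(0) + h(1) + … + h(j)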

The sum of all the components in the normalized histogram is equal to 1. Therefore,

8-equation-hist
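
In other words, h(0) + h(1) + … + h(255) = 1, so the cumulative distribution function always reaches 1 at the highest intensity level.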

Step 3: Calculate the transformation T to map the old intensity values to new intensity values.

Let K represent the total number of possible intensity values (e.g. 256). j is the old intensity value, and T(j) is the new intensity value.

9-transform
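
Written out (this is the same FLOOR((K – 1) * CDF) form used later in this post):

T(j) = FLOOR((K – 1) * H(j))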

Step 4: Given the new mappings of intensity values, we can use a lookup table to transform each pixel in the input image to a new intensity.

The result of this transformation is a new histogram which corresponds to a new output image.

Special note on transformation functions:

The formula I used for histogram equalization is a common one, but other transformation functions are possible. Different transformation functions will yield different output histograms.
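
If you would like to see the four steps in code, here is a minimal NumPy sketch of the procedure described above (an illustration only, not the OpenCV-based program shown earlier; it assumes the image is an 8-bit grayscale image stored as a 2D NumPy array):

import numpy as np # Scientific computing library

def equalize_histogram(img, K=256):
  # Step 1: Obtain the normalized histogram h(i)
  counts, _ = np.histogram(img, bins=K, range=(0, K))
  h = counts / img.size

  # Step 2: Obtain the cumulative distribution function H(j)
  H = np.cumsum(h)

  # Step 3: Calculate the transformation T(j) = FLOOR((K - 1) * H(j))
  T = np.floor((K - 1) * H).astype(np.uint8)

  # Step 4: Use T as a lookup table to map each pixel to its new intensity
  return T[img]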


Example of Histogram Equalization

Let us suppose we have a 3-bit, 8 x 8 grayscale image. The grayscale range is 2³ = 8 intensity values (i.e. gray levels) because the image is 3 bits. We label these intensity values 0 through 7. Below is the histogram of this image.

10-histogram-equalization-table
11-original-histogram-imageJPG

Now, we calculate the cumulative distribution function and perform the transformation.

12-histogram-tableJPG

The two yellow columns above are our lookup table. We use these two columns to generate the output image. For example, we map all pixels that had a gray level of 3 to 1. We map all pixels that had a gray level of 6 to 5, etc. The resulting histogram looks like this:

13-histogram-table
15-histogram
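
As a hypothetical illustration of how such a mapping is computed (the numbers here are made up, not read from the table above): with K = 8, a gray level whose cumulative value H(j) is 0.18 maps to FLOOR(7 * 0.18) = 1, while a gray level with H(j) = 0.80 maps to FLOOR(7 * 0.80) = 5.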


Histogram Matching

While the goal of histogram equalization is to produce an output image that has a flattened histogram, the goal of histogram matching is to take an input image and generate an output image that is based upon the shape of a specific (or reference) histogram. Histogram matching is also known as histogram specification. You can consider histogram equalization as a special case of histogram matching in which we want to force an image to have a uniform histogram (rather than just any shape as is the case for histogram matching).

Let us suppose we have two images, an input image and a specified image. We want to use histogram matching to force the input image to have a histogram that is the shape of the histogram of the specified image. The first few steps are similar to histogram equalization, except we perform them on two images (the input image and the specified image).

Step 1: Obtain the histogram for both the input image and the specified image (same method as in histogram equalization).

For example, if both images are grayscale with 256 distinct intensity levels i (where i = 0 [black], 1, 2, …, 253, 254, 255 [white]), the probability that a pixel chosen at random will have an intensity level i is as follows:

16-histogram-matching
17-histogram-matching

Step 2: Obtain the cumulative distribution function CDF for both the input image and the specified image (same method as in histogram equalization).

The cumulative distribution function H(j) is defined as the probability H of a randomly selected pixel taking one of the intensity values from 0 through j (inclusive). Therefore, given our normalized histograms h(i) from above, we have the following formula:

7-cdf-equation-1

Step 3: Calculate the transformation T to map the old intensity values to new intensity values for both the input image and specified image (same method as in histogram equalization).

Let K represent the total number of possible intensity values (e.g. 256). j is the old intensity value, and T(j) is the new intensity value.

tinputj
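
Written out, this is the same transformation as before, applied to each image separately:

T_input(j) = FLOOR((K – 1) * H_input(j))

T_specified(j) = FLOOR((K – 1) * H_specified(j))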

Step 4: Use the transformed intensity values for both the input image and the specified image to map the intensity values of the input image to new values.

We go through each available intensity value j one at a time, doing the following steps:

  • See what the transformed intensity value is for the input image given the intensity value j. Let us call this Tinput(j).
  • We then find the intensity level in the specified image whose transformed value Tspecified is closest to Tinput(j), and we make a note of that level. For example, if j = 4:
20-histogram-matching

we map all intensity values of 4 in the input image to 1.

Here is another example. Let us suppose that:

21-histogram-matching

Therefore, we map all intensity values of 5 in the input image to 2.

After we have gone through all available intensity values and performed all the mappings, we have our output image which has a histogram that will approximately match the shape of the unequalized specified histogram.
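
Here is a minimal NumPy sketch of the matching procedure just described (an illustration only, assuming both images are 8-bit grayscale 2D NumPy arrays; it is not code from the worked example below):

import numpy as np # Scientific computing library

def match_histogram(input_img, specified_img, K=256):
  # Steps 1-3: normalized histogram, CDF, and transformation for each image
  counts_in, _ = np.histogram(input_img, bins=K, range=(0, K))
  counts_sp, _ = np.histogram(specified_img, bins=K, range=(0, K))
  T_input = np.floor((K - 1) * np.cumsum(counts_in) / input_img.size)
  T_specified = np.floor((K - 1) * np.cumsum(counts_sp) / specified_img.size)

  # Step 4: for each level j, find the specified level whose transformed value
  # is closest to T_input(j) (on ties, argmin picks the lowest level)
  mapping = np.array([np.argmin(np.abs(T_specified - t)) for t in T_input],
                     dtype=np.uint8)

  # Apply the mapping as a lookup table to produce the output image
  return mapping[input_img]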


Example of Histogram Matching

Let us take a look at an example. For convenience, I am reposting the unequalized and equalized histogram from the histogram equalization example.

Here is the histogram of the original image.

22-original-image-histogram-1

Now, we equalize the original input image to get the following table and histogram.

23-histogram-tableJPG
24-histogram-graph

Now, let us suppose we have the following specified histogram. We want to get the original image to have a histogram that is shaped like the specified histogram.

25-histogram-graph

We equalize the specified histogram, yielding the following table.

26-histogram-tableJPG

Using the two yellow columns above to map the old intensity values for the pixels to new intensity values, we get the following histogram after equalization:

27-histogram-table
28-equalized-specified-histogram

Now, we need to use the transformed intensity values for both the input image and specified image to map the intensity values of the input image to new values. To do that, all we need are the FLOOR((K – 1) * CDF) values for both the original image and the specified image.

29-histogram-table

We go through each available intensity value j one at a time, doing the following steps:

  • See what the transformed intensity value is for the input image given the intensity value j. Call this Tinput(j).
  • We then find the intensity level in the specified image whose transformed value Tspecified is closest to Tinput(j) and make a note of that level.

For example, when the gray level is 4, the transformed value for the original image is 2. In the specified image, a transformed value of 2 corresponds to a gray level of 1. Therefore, we map 4 to 1.

When the gray level is 5, the transformed value for the original image is 3. The specified image has no transformed value of 3, so we take the closest one, 2 (by convention, we go to the next lowest level), which corresponds to a gray level of 1. Therefore, we map 5 to 1.

Here is the final mapping.

30-histogram-table

To finish the histogram matching process, we have to replace the values in the original image with the map values. The final matched histogram is shown below:

31-matched-histogramJPG

Therefore, the histogram matching process got us from the original image histogram below to that matched histogram above. Notice the matched histogram has a similar shape to the original specified histogram.

32-matched-histogram


How to Do Multiple Object Tracking Using OpenCV

In this tutorial, we will learn how to track multiple objects in a video using OpenCV, the computer vision library for Python. By the end of this tutorial, you will be able to generate the following output:

object_tracking_gif

Real-World Applications

Object tracking has a number of real-world use cases. Here are a few:

  • Drone Surveillance 
  • Aerial Cinematography
  • Tracking Customers to Understand In-Store Consumer Behavior
  • Any Kind of Robot That You Want to Follow You as You Move Around

Let’s get started!

Prerequisites

Installation and Setup

We first need to make sure we have all the software packages installed. Check to see if you have OpenCV installed on your machine. If you are using Anaconda, you can type:

conda install -c conda-forge opencv

Alternatively, you can type:

pip install opencv-python

Find Video Files

Find a video file that contains objects you would like to detect. I suggest finding a file that is 1920 x 1080 pixels in dimensions and is in mp4 format. You can find some good videos at sites like Pixabay.com and Pexels.com. 

Save the video file in a folder somewhere on your computer.

Write the Code

Navigate to the directory where you saved your video, and open a new Python program named multi_object_tracking.py.

Here is the full code for the program. You can copy and paste this code directly into your program. The only line you will need to change is the one that sets the file_prefix variable. The video file I’m using is named fish.mp4, so I will write ‘fish’ as the file prefix. 

If your video is not 1920 x 1080 in dimensions, you will also need to modify the file_size line accordingly.

# Project: How to Do Multiple Object Tracking Using OpenCV
# Author: Addison Sears-Collins
# Date created: March 2, 2021
# Description: Tracking multiple objects in a video using OpenCV

import cv2 # Computer vision library
from random import randint # Handles the creation of random integers

# Make sure the video file is in the same directory as your code
file_prefix = 'fish'
filename = file_prefix + '.mp4'
file_size = (1920,1080) # Assumes 1920x1080 mp4

# We want to save the output to a video file
output_filename = file_prefix + '_object_tracking.mp4'
output_frames_per_second = 20.0 

# OpenCV has a bunch of object tracking algorithms. We list them here.
type_of_trackers = ['BOOSTING', 'MIL', 'KCF','TLD', 'MEDIANFLOW', 'GOTURN', 
                     'MOSSE', 'CSRT']

# CSRT is accurate but slow. You can try others and see what results you get.			 
desired_tracker = 'CSRT'

# Generate a MultiTracker object	
multi_tracker = cv2.MultiTracker_create()

# Set bounding box drawing parameters
from_center = False # Draw bounding box from upper left
show_cross_hair = False # Don't show the cross hair
										 
def generate_tracker(type_of_tracker):
  """
  Create object tracker.
	
  :param type_of_tracker string: OpenCV tracking algorithm 
  """
  if type_of_tracker == type_of_trackers[0]:
    tracker = cv2.TrackerBoosting_create()
  elif type_of_tracker == type_of_trackers[1]:
    tracker = cv2.TrackerMIL_create()
  elif type_of_tracker == type_of_trackers[2]:
    tracker = cv2.TrackerKCF_create()
  elif type_of_tracker == type_of_trackers[3]:
    tracker = cv2.TrackerTLD_create()
  elif type_of_tracker == type_of_trackers[4]:
    tracker = cv2.TrackerMedianFlow_create()
  elif type_of_tracker == type_of_trackers[5]:
    tracker = cv2.TrackerGOTURN_create()
  elif type_of_tracker == type_of_trackers[6]:
    tracker = cv2.TrackerMOSSE_create()
  elif type_of_tracker == type_of_trackers[7]:
    tracker = cv2.TrackerCSRT_create()
  else:
    tracker = None
    print('The name of the tracker is incorrect')
    print('Here are the possible trackers:')
    for track_type in type_of_trackers:
      print(track_type)
  return tracker

def main():

  # Load a video
  cap = cv2.VideoCapture(filename)

  # Create a VideoWriter object so we can save the video output
  fourcc = cv2.VideoWriter_fourcc(*'mp4v')
  result = cv2.VideoWriter(output_filename,  
                           fourcc, 
                           output_frames_per_second, 
                           file_size) 

  # Capture the first video frame
  success, frame = cap.read() 

  bounding_box_list = []
  color_list = []	

  # Do we have a video frame? If true, proceed.
  if success:

    while True:
    
      # Draw a bounding box over all the objects that you want to track.
      # Press ENTER or SPACE after you've drawn the bounding box
      bounding_box = cv2.selectROI('Multi-Object Tracker', frame, from_center, 
        show_cross_hair) 

      # Add a bounding box
      bounding_box_list.append(bounding_box)
			
      # Select the bounding box color (hard-coded here; uncomment the randint
      # calls if you want a random color for each object)
      blue = 255 # randint(127, 255)
      green = 0 # randint(127, 255)
      red = 255 # randint(127, 255)
      color_list.append((blue, green, red))

      # Press 'q' (make sure you click on the video frame so that it is the
      # active window) to start object tracking. You can press another key
      # if you want to draw another bounding box.			
      print("\nPress q to begin tracking objects or press " + 
        "another key to draw the next bounding box\n")

      # Wait for keypress
      k = cv2.waitKey() & 0xFF

      # Start tracking objects if 'q' is pressed			
      if k == ord('q'):
        break

    cv2.destroyAllWindows()
		
    print("\nTracking objects. Please wait...")
		
    # Set the tracker
    type_of_tracker = desired_tracker	
			
    for bbox in bounding_box_list:
		
      # Add tracker to the multi-object tracker
      multi_tracker.add(generate_tracker(type_of_tracker), frame, bbox)
      
    # Process the video
    while cap.isOpened():
		
      # Capture one frame at a time
      success, frame = cap.read() 
		
      # Do we have a video frame? If true, proceed.
      if success:

        # Update the location of the bounding boxes
        success, bboxes = multi_tracker.update(frame)

        # Draw the bounding boxes on the video frame
        for i, bbox in enumerate(bboxes):
          point_1 = (int(bbox[0]), int(bbox[1]))
          point_2 = (int(bbox[0] + bbox[2]), 
            int(bbox[1] + bbox[3]))
          cv2.rectangle(frame, point_1, point_2, color_list[i], 5)
				
        # Write the frame to the output video file
        result.write(frame)

      # No more video frames left
      else:
        break
		
  # Stop when the video is finished
  cap.release()
	
  # Release the video recording
  result.release()

main()

Run the Code

Launch the program.

python multi_object_tracking.py

The first frame of the video will pop up. With this frame selected, grab your mouse, and draw a rectangle around the object you would like to track. 

When you’re done drawing the rectangle, press Enter or Space.

If you just want to track this object, press ‘q’ to run the program. Otherwise, if you want to track more objects, press any other key to draw more rectangles around the other objects you want to track.

After you press ‘q’, the program will run. Once the program is finished doing its job, you will have a new video file. It will be named something like fish_object_tracking.mp4. This file is your final video output.

console-instructions

Video Output

Here is my final video:

How the Code Works

OpenCV has a number of object trackers: ‘BOOSTING’, ‘MIL’, ‘KCF’,’TLD’, ‘MEDIANFLOW’, ‘GOTURN’, ‘MOSSE’, ‘CSRT’. In our implementation, we used CSRT which is slow but accurate.
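
One caveat: the constructor names used in this program depend on your OpenCV version. In OpenCV 4.5.1 and later, several of these trackers (and the MultiTracker class) were moved into the cv2.legacy namespace, and you may also need the contrib build (pip install opencv-contrib-python). If the program raises an AttributeError, a change along these lines may help (a version-dependent workaround, not part of the original program):

# For newer OpenCV versions (4.5.1+ with opencv-contrib-python), the legacy
# namespace may be required:
multi_tracker = cv2.legacy.MultiTracker_create()
tracker = cv2.legacy.TrackerCSRT_create()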

When we run the program, the first video frame is captured. We have to identify the object(s) we want to track by drawing a rectangle around it. After we do that, the algorithm tracks the object through each frame of the video.

Once the program has finished processing all video frames, the annotated video is saved to your computer.

That’s it. Keep building!

How to Detect Objects in Video Using MobileNet SSD in OpenCV

In this tutorial, we will go through how to detect objects in a video stream using OpenCV. We will use MobileNet SSD, a special type of convolutional neural network architecture.

Our output will look like this:

Real-World Applications

  • Object Detection
  • Object Tracking
  • Object Classification
  • Autonomous Vehicles
  • Self-Driving Cars

Let’s get started!

Prerequisites

Installation and Setup

We now need to make sure we have all the software packages installed. Check to see if you have OpenCV installed on your machine. If you are using Anaconda, you can type:

conda install -c conda-forge opencv

Alternatively, you can type:

pip install opencv-python

Download the Required Files

Download all the video files and other neural network-related files at this link. Place the files inside a directory on your computer.

Code

In the same folder where your image file is, open a new Python file called object_detection_mobile_ssd.py.

Here is the full code for the system. The only things you’ll need to change in this code are the name of your desired input video file (the filename variable) and the name of your desired output file (the output_filename variable).

# Project: How to Detect Objects in Video Using MobileNet SSD in OpenCV
# Author: Addison Sears-Collins
# Date created: March 1, 2021
# Description: Object detection using OpenCV

import cv2 # Computer vision library
import numpy as np # Scientific computing library 

# Make sure the video file is in the same directory as your code
filename = 'edmonton_canada.mp4'
file_size = (1920,1080) # Assumes 1920x1080 mp4

# We want to save the output to a video file
output_filename = 'edmonton_canada_obj_detect_mobssd.mp4'
output_frames_per_second = 20.0 

RESIZED_DIMENSIONS = (300, 300) # Dimensions that SSD was trained on. 
IMG_NORM_RATIO = 0.007843 # Scale factor for the pixel values (roughly 1/127.5)

# Load the pre-trained neural network
neural_network = cv2.dnn.readNetFromCaffe('MobileNetSSD_deploy.prototxt.txt', 
        'MobileNetSSD_deploy.caffemodel')

# List of categories and classes
categories = { 0: 'background', 1: 'aeroplane', 2: 'bicycle', 3: 'bird', 
               4: 'boat', 5: 'bottle', 6: 'bus', 7: 'car', 8: 'cat', 
               9: 'chair', 10: 'cow', 11: 'diningtable', 12: 'dog', 
              13: 'horse', 14: 'motorbike', 15: 'person', 
              16: 'pottedplant', 17: 'sheep', 18: 'sofa', 
              19: 'train', 20: 'tvmonitor'}

classes =  ["background", "aeroplane", "bicycle", "bird", "boat", "bottle", 
            "bus", "car", "cat", "chair", "cow", 
           "diningtable",  "dog", "horse", "motorbike", "person", 
           "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
					 
# Create the bounding boxes
bbox_colors = np.random.uniform(255, 0, size=(len(categories), 3))
	
def main():

  # Load a video
  cap = cv2.VideoCapture(filename)

  # Create a VideoWriter object so we can save the video output
  fourcc = cv2.VideoWriter_fourcc(*'mp4v')
  result = cv2.VideoWriter(output_filename,  
                           fourcc, 
                           output_frames_per_second, 
                           file_size) 
	
  # Process the video
  while cap.isOpened():
		
    # Capture one frame at a time
    success, frame = cap.read() 

    # Do we have a video frame? If true, proceed.
    if success:
		
      # Capture the frame's height and width
      (h, w) = frame.shape[:2]

      # Create a blob: resize the frame to 300x300, scale the pixel values, and
      # subtract the mean to prepare it for deep learning classification
      frame_blob = cv2.dnn.blobFromImage(cv2.resize(frame, RESIZED_DIMENSIONS), 
                     IMG_NORM_RATIO, RESIZED_DIMENSIONS, 127.5)
	
      # Set the input for the neural network
      neural_network.setInput(frame_blob)

      # Predict the objects in the image
      neural_network_output = neural_network.forward()

      # Put the bounding boxes around the detected objects
      for i in np.arange(0, neural_network_output.shape[2]):
			
        confidence = neural_network_output[0, 0, i, 2]
    
        # Confidence must be at least 30%		
        if confidence > 0.30:
				
          idx = int(neural_network_output[0, 0, i, 1])

          bounding_box = neural_network_output[0, 0, i, 3:7] * np.array(
            [w, h, w, h])

          (startX, startY, endX, endY) = bounding_box.astype("int")

          label = "{}: {:.2f}%".format(classes[idx], confidence * 100) 
        
          cv2.rectangle(frame, (startX, startY), (
            endX, endY), bbox_colors[idx], 2)     
						
          y = startY - 15 if startY - 15 > 15 else startY + 15     

          cv2.putText(frame, label, (startX, y),cv2.FONT_HERSHEY_SIMPLEX, 
            0.5, bbox_colors[idx], 2)
		
      # We now need to resize the frame so its dimensions
      # are equivalent to the dimensions of the original frame
      frame = cv2.resize(frame, file_size, interpolation=cv2.INTER_NEAREST)

      # Write the frame to the output video file
      result.write(frame)
		
    # No more video frames left
    else:
      break
			
  # Stop when the video is finished
  cap.release()
	
  # Release the video recording
  result.release()

main()

To run the code, type the following command:

python object_detection_mobile_ssd.py

You will see the video output that is at the top of this tutorial.
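
If you see too many spurious boxes, or the detector misses objects you care about, you can tune the 30% confidence cutoff inside the detection loop. For each detection, the network's output array holds a class id (index 1), a confidence score (index 2), and a bounding box expressed as fractions of the frame size (indices 3 through 6), which is why the code multiplies the box by np.array([w, h, w, h]). For example, to keep only higher-confidence detections (an illustrative tweak to the loop above, not part of the original code):

# Inside the detection loop, raise the cutoff from 0.30 to 0.50
# to keep only higher-confidence detections
if confidence > 0.50: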

That’s it. Keep building!