How to Detect Pedestrians in Images and Video Using OpenCV

In this tutorial, we will write a program to detect pedestrians in a photo and a video using a technique called the Histogram of Oriented Gradients (HOG). We will use the OpenCV computer vision library, which has a built-in pedestrian detection method that is based on the original research paper on HOG.

I won’t go into the details and the math behind HOG (you don’t need to know the math in order to implement the algorithm), but if you’re interested in learning about what goes on under the hood, check out that research paper.
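If you would like a rough feel for the idea without reading the paper, here is a minimal sketch of the core step: building a histogram of gradient orientations for one small patch ("cell") of a grayscale image. This is a simplification and not OpenCV's implementation. The function name cell_histogram and the list-of-rows image format are assumptions made for illustration, and the real method also interpolates between bins and normalizes over blocks of cells.

```python
import math

def cell_histogram(cell, num_bins=9):
    """Build an orientation histogram for one grayscale cell (a list of rows)."""
    rows, cols = len(cell), len(cell[0])
    bin_width = 180.0 / num_bins
    hist = [0.0] * num_bins

    # Skip the border so every pixel has a neighbor on each side
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences approximate the image gradient
            gx = cell[r][c + 1] - cell[r][c - 1]
            gy = cell[r + 1][c] - cell[r - 1][c]
            magnitude = math.hypot(gx, gy)

            # Unsigned orientation: fold the angle into the range [0, 180)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0

            # Each pixel votes for its orientation bin, weighted by magnitude
            hist[int(angle // bin_width) % num_bins] += magnitude
    return hist

# A vertical edge (dark on the left, bright on the right)
vertical_edge = [[0, 0, 255, 255] for _ in range(4)]
print(cell_histogram(vertical_edge))
```

With the vertical edge above, all of the gradient energy lands in the first bin, because the brightness only changes in the x direction. A full HOG descriptor concatenates many such histograms, and a classifier then decides whether the window contains a person.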

pedestrians_2_detect

Real-World Applications

  • Self-Driving Cars

Prerequisites

Install OpenCV

The first thing you need to do is to make sure you have OpenCV installed on your machine. If you’re using Anaconda, you can type this command into the terminal:

conda install -c conda-forge opencv

Alternatively, you can type:

pip install opencv-python

Detect Pedestrians in an Image

Get an image that contains pedestrians and put it inside a directory somewhere on your computer. 

Here are a couple of images that I will use as test cases:

pedestrians_1
pedestrians_2

Inside that same directory, write the following code. I will save my file as detect_pedestrians_hog.py:

import cv2 # Import the OpenCV library to enable computer vision

# Author: Addison Sears-Collins
# https://automaticaddison.com
# Description: Detect pedestrians in an image using the 
#   Histogram of Oriented Gradients (HOG) method

# Make sure the image file is in the same directory as your code
filename = 'pedestrians_1.jpg'

def main():

  # Create a HOGDescriptor object
  hog = cv2.HOGDescriptor()
	
  # Initialize the People Detector
  hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
	
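  # Load an image (cv2.imread returns None if the file cannot be read)
  image = cv2.imread(filename)
  if image is None:
    raise FileNotFoundError(filename)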
		
  # Detect people
  # image: Source image
  # winStride: step size in x and y direction of the sliding window
  # padding: no. of pixels in x and y direction for padding of sliding window
  # scale: Detection window size increase coefficient	
  # bounding_boxes: Location of detected people
  # weights: Weight scores of detected people
  (bounding_boxes, weights) = hog.detectMultiScale(image, 
                                                   winStride=(4, 4),
                                                   padding=(8, 8), 
                                                   scale=1.05)

  # Draw bounding boxes on the image
  for (x, y, w, h) in bounding_boxes: 
    cv2.rectangle(image, 
                  (x, y),  
                  (x + w, y + h),  
                  (0, 0, 255), 
                   4)
					
  # Create the output file name by removing the '.jpg' part
  size = len(filename)
  new_filename = filename[:size - 4]
  new_filename = new_filename + '_detect.jpg'
	
  # Save the new image in the working directory
  cv2.imwrite(new_filename, image)
	
  # Display the image 
  cv2.imshow("Image", image) 
	
  # Display the window until any key is pressed
  cv2.waitKey(0) 
	
  # Close all windows
  cv2.destroyAllWindows() 

main()

Run the code.

python detect_pedestrians_hog.py

Image Results

Here is the output for the first image:

pedestrians_1_detect-1

Here is the output for the second image:

pedestrians_2_detect-1
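If your output contains false positives, one option is to use the weights returned by detectMultiScale to drop low-scoring boxes before drawing them. Below is a minimal sketch of that idea. The helper name filter_detections and the 0.5 cutoff are assumptions for illustration, so pick a cutoff by experimenting on your own images. Depending on your OpenCV version, weights may come back as a column array, in which case flatten it before passing it in.

```python
def filter_detections(bounding_boxes, weights, min_weight=0.5):
    """Keep only the boxes whose detection weight is at least min_weight.

    bounding_boxes: iterable of (x, y, w, h) boxes
    weights: iterable of scores, one per box (flattened)
    min_weight: hypothetical cutoff, tune it for your images
    """
    kept = []
    for box, weight in zip(bounding_boxes, weights):
        if float(weight) >= min_weight:
            kept.append(tuple(box))
    return kept

# Made-up detections: the middle box has a low score and is dropped
boxes = [(10, 20, 64, 128), (200, 40, 64, 128), (300, 50, 64, 128)]
scores = [0.9, 0.3, 0.5]
print(filter_detections(boxes, scores))
```

In the script above, you would call this helper right after hog.detectMultiScale and loop over the returned list when drawing rectangles.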

Detect Pedestrians in a Video

Now that we know how to detect pedestrians in an image, let’s detect pedestrians in a video.

Find a video file that has pedestrians in it. You can check free video sites like Pixabay if you don’t have any videos.

The code below assumes a video with dimensions of 1920 x 1080. If your video is another size, change the file_size variable in the code to match, and expect to tune the detection parameters as well.

Place the video inside a directory.
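If you are not sure what size your video is, you can ask OpenCV instead of hard-coding it. Here is a minimal sketch. The helper names read_frame_size and scaled_frame_size are assumptions for illustration, and cv2 is imported inside the first function only so that the second one can be tried without OpenCV installed.

```python
def read_frame_size(video_path):
    """Return (width, height) of a video file as reported by OpenCV."""
    import cv2  # Imported here so the helper below works without OpenCV

    cap = cv2.VideoCapture(video_path)
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    cap.release()
    return (width, height)

def scaled_frame_size(width, height, scale_ratio=1.0):
    """Return the (width, height) that frames will have after resizing."""
    return (int(width * scale_ratio), int(height * scale_ratio))

# Example: a 1920 x 1080 video scaled to half size
print(scaled_frame_size(1920, 1080, 0.5))
```

The size you pass to cv2.VideoWriter should match the size of the frames you write. When the two differ, the output file typically ends up empty or unplayable, without any error message.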

Now let’s install imutils, a library that provides a technique called non-maximum suppression. It takes multiple overlapping bounding boxes and collapses them into just one bounding box.

pip install --upgrade imutils
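To show what non-maximum suppression does, here is a minimal sketch of one common variant: keep the highest-scoring box, discard any box that overlaps it too much, and repeat. This is a simplified illustration, not the imutils implementation, which may measure overlap differently, so the results will not necessarily match. The function names iou and simple_nms are assumptions, and boxes are given as (x1, y1, x2, y2) corners.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def simple_nms(boxes, scores, overlap_thresh=0.45):
    """Greedy non-maximum suppression; returns the boxes that survive."""
    # Visit boxes from highest score to lowest
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Keep a box only if it does not overlap a kept box too much
        if all(iou(boxes[i], boxes[k]) <= overlap_thresh for k in kept):
            kept.append(i)
    return [boxes[i] for i in kept]

# Two heavily overlapping boxes around one person, plus a separate box
boxes = [(0, 0, 100, 200), (10, 10, 110, 210), (300, 0, 400, 200)]
scores = [0.9, 0.8, 0.7]
print(simple_nms(boxes, scores))
```

In this example the second box overlaps the first by about 75 percent, so it is dropped and two boxes remain.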

Also, make sure you have NumPy, a scientific computing library for Python, installed.

If you’re using Anaconda, you can type:

conda install numpy

Alternatively, you can type:

pip install numpy

Inside that same directory, write the following code. I will save my file as detect_pedestrians_video_hog.py. We will save the output as an .mp4 video file:

# Author: Addison Sears-Collins
# https://automaticaddison.com
# Description: Detect pedestrians in a video using the 
#   Histogram of Oriented Gradients (HOG) method

import cv2 # Import the OpenCV library to enable computer vision
import numpy as np # Import the NumPy scientific computing library
from imutils.object_detection import non_max_suppression # Handle overlapping

# Make sure the video file is in the same directory as your code
filename = 'pedestrians_on_street_1.mp4'
file_size = (1920,1080) # Assumes 1920x1080 mp4
scale_ratio = 1 # Option to scale to fraction of original size. 

# We want to save the output to a video file
output_filename = 'pedestrians_on_street.mp4'
output_frames_per_second = 20.0 

def main():

  # Create a HOGDescriptor object
  hog = cv2.HOGDescriptor()
	
  # Initialize the People Detector
  hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
	
  # Load a video
  cap = cv2.VideoCapture(filename)

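  # Create a VideoWriter object so we can save the video output.
  # The output size must match the (possibly rescaled) frames we write.
  fourcc = cv2.VideoWriter_fourcc(*'mp4v')
  output_size = (int(file_size[0] * scale_ratio),
                 int(file_size[1] * scale_ratio))
  result = cv2.VideoWriter(output_filename,  
                           fourcc, 
                           output_frames_per_second, 
                           output_size) 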
	
  # Process the video
  while cap.isOpened():
		
    # Capture one frame at a time
    success, frame = cap.read() 
		
    # Do we have a video frame? If true, proceed.
    if success:
		
      # Resize the frame
      width = int(frame.shape[1] * scale_ratio)
      height = int(frame.shape[0] * scale_ratio)
      frame = cv2.resize(frame, (width, height))
			
      # Store the original frame
      orig_frame = frame.copy()
			
      # Detect people
      # image: a single frame from the video
      # winStride: step size in x and y direction of the sliding window
      # padding: no. of pixels in x and y direction for padding of 
      #	sliding window
      # scale: Detection window size increase coefficient	
      # bounding_boxes: Location of detected people
      # weights: Weight scores of detected people
      # Tweak these parameters for better results
      (bounding_boxes, weights) = hog.detectMultiScale(frame, 
                                                       winStride=(16, 16),
                                                       padding=(4, 4), 
                                                       scale=1.05)

      # Draw the raw (pre-suppression) bounding boxes on the copy of the frame
      for (x, y, w, h) in bounding_boxes: 
        cv2.rectangle(orig_frame, 
                      (x, y),  
                      (x + w, y + h),  
                      (0, 0, 255), 
                      2)
						
      # Get rid of overlapping bounding boxes
      # You can tweak the overlapThresh value for better results
      bounding_boxes = np.array([[x, y, x + w, y + h] for (
                                x, y, w, h) in bounding_boxes])
			
      selection = non_max_suppression(bounding_boxes, 
                                      probs=None, 
                                      overlapThresh=0.45)
		
      # Draw the final bounding boxes
      for (x1, y1, x2, y2) in selection:
        cv2.rectangle(frame, 
                     (x1, y1), 
                     (x2, y2), 
                     (0, 255, 0), 
                      4)
		
      # Write the frame to the output video file
      result.write(frame)
			
      # Display the frame 
      cv2.imshow("Frame", frame) 	

      # Display frame for X milliseconds and check if q key is pressed
      # q == quit
      if cv2.waitKey(25) & 0xFF == ord('q'):
        break
		
    # No more video frames left
    else:
      break
			
  # Stop when the video is finished
  cap.release()
	
  # Release the video recording
  result.release()
	
  # Close all windows
  cv2.destroyAllWindows() 

main()

Run the code.

python detect_pedestrians_video_hog.py

Video Results

Here is a video of the output:

What is Deep Learning?

In previous posts, I’ve talked a lot about deep learning.

However, I have never actually explained, in a concise way, what deep learning is, so here we go.

Deep learning is a technique for teaching a computer how to make predictions based on a set of inputs.

Input Data --> Deep Learning Algorithm (i.e. Process) --> Output Data

To make predictions (i.e. the “Process” part of the line above), deep learning uses deep neural networks. A deep neural network is a computer-based, simplified representation of neurons in the brain. It is computer science’s attempt to get a computer to process information just like real neurons in our brains do.

neural-network
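To make the "Process" step a little more concrete, here is a minimal sketch of how a small neural network turns inputs into an output. Each layer computes a weighted sum of its inputs for every neuron and passes it through an activation function. The function names and the weight values are made up for illustration. A real network learns its weights from data during training and is usually built with a dedicated library.

```python
import math

def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron, then sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
            for neuron_w, b in zip(weights, biases)]

def predict(inputs, layers):
    """Feed the inputs through each layer in turn (a forward pass)."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations

# A made-up network: 2 inputs -> 3 hidden neurons -> 1 output
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]
print(predict([1.0, 2.0], layers))
```

The network is called "deep" when there are many such layers stacked between the input and the output.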

Deep neural networks are well suited for complex applications like computer vision, natural language processing, and machine translation where you want to draw useful information from nonlinear and unstructured data such as images, audio, or text.

How To Draw Contours Around Objects Using OpenCV

In this tutorial, you will learn how to draw a contour around an object.

Prerequisites

Draw a Contour Around a T-Shirt

tshirt-1

We’ll start with this t-shirt above. Save that image to some folder on your computer.

Now, in the same folder you saved that image above (we’ll call the file tshirt.jpg), open up a new Python program.

Name the program draw_contour.py.

Write the following code:

# Project: How To Draw Contours Around Objects Using OpenCV
# Author: Addison Sears-Collins
# Date created: October 7, 2020
# Reference: https://stackoverflow.com/questions/58405171/how-to-find-the-extreme-corner-point-in-image

import cv2 # OpenCV library
import numpy as np # NumPy scientific computing library

# Read the image
image = cv2.imread("tshirt.jpg")

# Convert the image to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Convert the image to black and white.
# Modify the threshold (e.g. 75 for tshirt.jpg) depending on how the output looks.
# If you have a dark item on a light background, use cv2.THRESH_BINARY_INV and consider 
# changing the lower color threshold to 115.
thresh = cv2.threshold(gray, 75, 255, cv2.THRESH_BINARY)[1]
#thresh = cv2.threshold(gray, 115, 255, cv2.THRESH_BINARY_INV)[1]

# Create a kernel (i.e. a small matrix)
kernel = np.ones((5,5),np.uint8)

# Use the kernel to perform morphological opening
thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)

# If you have a dark item on a light background, uncomment this line.
#thresh = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)

# Find the contours
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

# Create a blank (black) image with the same size as the input image
blank_image = np.zeros_like(image)

# Set the minimum area for a contour
min_area = 5000

# Draw the contours on the original image and the blank image
for c in cnts:
    area = cv2.contourArea(c)
    if area > min_area:
        cv2.drawContours(image,[c], 0, (36,255,12), 2)
        cv2.drawContours(blank_image,[c], 0, (255,255,255), 2)

# Convert the blank image to grayscale for corner detection
gray = cv2.cvtColor(blank_image, cv2.COLOR_BGR2GRAY)

# Detect corners using the contours
corners = cv2.goodFeaturesToTrack(image=gray,maxCorners=25,qualityLevel=0.20,minDistance=50) # Determines strong corners on an image

# Draw the corners on the original image
# (goodFeaturesToTrack returns float coordinates, but cv2.circle needs integers)
for corner in corners:
    x, y = corner.ravel()
    cv2.circle(image, (int(x), int(y)), 10, (0, 0, 255), -1)

# Display the images
image_copy = cv2.imread("tshirt.jpg")
cv2.imshow('original image', image_copy)
cv2.imshow('image with contours and corners', image)
cv2.imshow('blank_image with contours', blank_image)

# Save the image that has the contours and corners
cv2.imwrite('contour_tshirt.jpg', image)

# Save the image that has just the contours
cv2.imwrite('contour_tshirt_blank_image.jpg', blank_image)

# Wait for a key press; if it is the Esc key (code 27), close the windows
if cv2.waitKey(0) & 0xff == 27:
    cv2.destroyAllWindows()

Run the code. Here is what you should see:

contour_tshirt_blank_image
Just the contour.
contour_tshirt
The contour with corner points.

Detecting Corners on Jeans

To detect corners on jeans, you’ll need to make the changes mentioned in the code comments: use cv2.THRESH_BINARY_INV with a threshold of 115, and uncomment the morphological closing line. This is because the jeans are a dark object on a light background (in contrast to a light object on a dark background in the case of the t-shirt).
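To see why the threshold type matters, here is a minimal sketch that mimics the two threshold modes on a tiny image stored as a list of rows. The function name binary_threshold and the pixel values are assumptions for illustration. In the real program, cv2.threshold does this work for you.

```python
def binary_threshold(gray, thresh, max_value=255, invert=False):
    """Mimic cv2.THRESH_BINARY (or THRESH_BINARY_INV when invert=True)."""
    out = []
    for row in gray:
        new_row = []
        for pixel in row:
            # THRESH_BINARY turns pixels above the threshold white
            above = pixel > thresh
            if invert:
                # THRESH_BINARY_INV turns them black instead
                above = not above
            new_row.append(max_value if above else 0)
        out.append(new_row)
    return out

# Light t-shirt pixels (200, 220) on a dark background (20, 30)
print(binary_threshold([[20, 200], [30, 220]], 75))

# Dark jeans pixel (40) on a light background (240), inverted
print(binary_threshold([[240, 40]], 115, invert=True))
```

In both cases the object ends up white and the background black. That matters because cv2.findContours treats the white (nonzero) pixels as the object to trace.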

Let’s draw a contour around the pair of jeans.

Here is the input image (jeans.jpg):

jeans-1

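Change the file name passed to cv2.imread in your code (it appears twice) from ‘tshirt.jpg’ to ‘jeans.jpg’. You may also want to rename the output files so that the t-shirt results are not overwritten.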

Here is the output image:

contour_jeans_blank_image
contour_jeans

That’s it. Keep building!