In this tutorial, we will create a program that tracks a moving object in real time using the built-in webcam of a laptop computer. We will use Python and the OpenCV computer vision library for the code.
A real-world application of this is in robotics. Imagine you have a robot arm that needs to continuously pick up moving items from a conveyor belt inside a warehouse. To pick up an object, the robot needs to know the object’s exact coordinates. The program we will create below gives you the basic building block to do just that: it locates the coordinates of the center of the moving object (often called the “centroid”) in real time using an ordinary webcam.
Let’s get started!
Prerequisites
- Python 3.7 or higher
Requirements
Using real-time streaming video from your built-in webcam, create a program that:
- Draws a bounding box around a moving object
- Calculates the coordinates of the centroid of the object
- Tracks the centroid of the object
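To make the word "centroid" concrete, here is a minimal sketch of two common ways to define it for a detected region: the center of the region's bounding box (the definition this tutorial's program uses) and the mean position of the region's pixels. The function names and the tiny hand-made mask are hypothetical, chosen only for illustration, and plain Python lists stand in for an image.

```python
def foreground_points(mask):
    """List the (x, y) positions of all nonzero pixels in a 2D mask."""
    return [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]


def bounding_box(mask):
    """Return (x, y, w, h) of the smallest box containing all nonzero pixels."""
    points = foreground_points(mask)
    if not points:
        return None
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x, y = min(xs), min(ys)
    return x, y, max(xs) - x + 1, max(ys) - y + 1


def box_center(box):
    """Center of a bounding box, rounded down to whole pixels."""
    x, y, w, h = box
    return x + w // 2, y + h // 2


def pixel_centroid(mask):
    """Mean position of the nonzero pixels (the geometric centroid)."""
    points = foreground_points(mask)
    if not points:
        return None
    n = len(points)
    return sum(p[0] for p in points) / n, sum(p[1] for p in points) / n


# An L-shaped blob: the two definitions give different answers.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

print(box_center(bounding_box(mask)))  # (3, 2)
print(pixel_centroid(mask))            # (2.0, 1.5)
```

For compact, roughly convex objects the two values are close, which is why the simpler bounding-box center is often good enough.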
Directions
Open up your favorite IDE or code editor.
Make sure you have the OpenCV and NumPy libraries installed. There are a number of ways to install both libraries. The most common is pip, the standard package manager for Python.
pip install opencv-python
pip install numpy
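Before moving on, it can help to confirm that both packages are importable. The following is a small sketch that uses only the standard library; the helper name `check_packages` is made up for this example. Note that the pip package `opencv-python` is imported under the module name `cv2`.

```python
import importlib.util


def check_packages(names=("cv2", "numpy")):
    """Map each module name to True if Python can locate it, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages().items():
        print(f"{name}: {'installed' if found else 'MISSING'}")
```

If either package is reported as missing, rerun the matching pip command in the same Python environment that your editor uses.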
Copy and paste the code below. This is all you need to run the program.
I put detailed comments inside the code so that you know what is going on. The technique used here is background subtraction, one of the most common ways to detect moving objects in a video stream:
#!/usr/bin/env python

'''
Welcome to the Object Tracking Program!

Using real-time streaming video from your built-in webcam, this program:
  - Creates a bounding box around a moving object
  - Calculates the coordinates of the centroid of the object
  - Tracks the centroid of the object

Author:
  - Addison Sears-Collins
'''

from __future__ import print_function # Python 2/3 compatibility
import cv2 # Import the OpenCV library
import numpy as np # Import NumPy library

# Project: Object Tracking
# Author: Addison Sears-Collins
# Website: https://automaticaddison.com
# Date created: 06/13/2020
# Python version: 3.7

def main():
    """
    Main method of the program.
    """

    # Create a VideoCapture object (0 selects the default camera)
    cap = cv2.VideoCapture(0)

    # Create the background subtractor object
    # Use the last 700 video frames to build the background
    back_sub = cv2.createBackgroundSubtractorMOG2(history=700,
        varThreshold=25, detectShadows=True)

    # Create kernel for morphological operation
    # You can tweak the dimensions of the kernel
    # e.g. instead of 20,20 you can try 30,30.
    kernel = np.ones((20, 20), np.uint8)

    while True:

        # Capture frame-by-frame
        # This method returns True/False as well
        # as the video frame.
        ret, frame = cap.read()

        # Stop if no frame could be read (e.g. the camera is unavailable)
        if not ret:
            break

        # Use every frame to calculate the foreground mask and update
        # the background
        fg_mask = back_sub.apply(frame)

        # Close dark gaps in foreground object using closing
        fg_mask = cv2.morphologyEx(fg_mask, cv2.MORPH_CLOSE, kernel)

        # Remove salt and pepper noise with a median filter
        fg_mask = cv2.medianBlur(fg_mask, 5)

        # Threshold the image to make it either black or white
        # (shadow pixels, which MOG2 marks as 127, become black)
        _, fg_mask = cv2.threshold(fg_mask, 127, 255, cv2.THRESH_BINARY)

        # Find the index of the largest contour and draw bounding box
        # ([-2:] keeps this working across OpenCV versions that return
        # either two or three values from findContours)
        fg_mask_bb = fg_mask
        contours, hierarchy = cv2.findContours(fg_mask_bb, cv2.RETR_TREE,
            cv2.CHAIN_APPROX_SIMPLE)[-2:]
        areas = [cv2.contourArea(c) for c in contours]

        # If there are no contours
        if len(areas) < 1:

            # Display the resulting frame
            cv2.imshow('frame', frame)

            # If "q" is pressed on the keyboard,
            # exit this loop
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break

            # Go to the top of the while loop
            continue

        else:
            # Find the largest moving object in the image
            max_index = np.argmax(areas)

        # Draw the bounding box
        cnt = contours[max_index]
        x, y, w, h = cv2.boundingRect(cnt)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 3)

        # Draw circle in the center of the bounding box
        x2 = x + int(w / 2)
        y2 = y + int(h / 2)
        cv2.circle(frame, (x2, y2), 4, (0, 255, 0), -1)

        # Print the centroid coordinates (we'll use the center of the
        # bounding box) on the image
        text = "x: " + str(x2) + ", y: " + str(y2)
        cv2.putText(frame, text, (x2 - 10, y2 - 10),
            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

        # Display the resulting frame
        cv2.imshow('frame', frame)

        # If "q" is pressed on the keyboard,
        # exit this loop
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    # Close down the video stream
    cap.release()
    cv2.destroyAllWindows()

if __name__ == '__main__':
    print(__doc__)
    main()
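To build intuition for what the background subtractor in the program is doing, here is a deliberately simplified sketch. It is not the Gaussian-mixture algorithm behind `cv2.createBackgroundSubtractorMOG2`. It swaps in a plain running average as the background model and flags any pixel that differs from that model by more than a threshold. The class name and the `alpha` and `threshold` values are assumptions made for illustration, and small nested lists stand in for grayscale frames.

```python
class RunningAverageSubtractor:
    """Toy background model: a per-pixel exponential running average."""

    def __init__(self, alpha=0.05, threshold=25):
        self.alpha = alpha          # how quickly the background adapts
        self.threshold = threshold  # minimum difference to count as foreground
        self.background = None      # seeded from the first frame

    def apply(self, frame):
        """Return a 0/255 foreground mask, then update the background."""
        if self.background is None:
            self.background = [[float(v) for v in row] for row in frame]

        mask = []
        for y, row in enumerate(frame):
            mask_row = []
            for x, value in enumerate(row):
                bg = self.background[y][x]
                # Foreground if the pixel differs enough from the background
                mask_row.append(255 if abs(value - bg) > self.threshold else 0)
                # Blend the new pixel into the background estimate
                self.background[y][x] = (1 - self.alpha) * bg + self.alpha * value
            mask.append(mask_row)
        return mask


sub = RunningAverageSubtractor()
empty_scene = [[10, 10, 10], [10, 10, 10]]
with_object = [[10, 200, 10], [10, 10, 10]]

sub.apply(empty_scene)          # the first frame seeds the background
mask = sub.apply(with_object)   # the bright pixel at x=1, y=0 is flagged
print(mask)                     # [[0, 255, 0], [0, 0, 0]]
```

In this toy version, a smaller `alpha` makes the background adapt more slowly, which is loosely analogous to using a longer `history` in the real subtractor. An object that stops moving will eventually be absorbed into the background in both cases.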