How to Write a Python Program for NVIDIA Jetson Nano

In this tutorial, we will write a basic Python program for NVIDIA Jetson Nano.

Prerequisites

You have already set up your NVIDIA Jetson Nano.
This tutorial will be similar to my Python fundamentals for robotics tutorial.

Install Python

To install Python, open a new terminal window and type:

sudo apt-get install python python3

To find out where the Python interpreter is located, type this command.

which python

You should see:

/usr/bin/python

Install Gedit

Install gedit, a text editor that will enable us to write code in Python.

sudo apt-get install gedit

Install Pip

Let’s begin by installing pip. Pip is a tool that will help us manage software packages for Python.

Software packages are bundles of code written by someone else that are designed to solve a specific problem. Why write code to solve a specific problem from scratch, when someone else has already written code to solve that exact same problem? That is where software packages come into play. They prevent you from having to reinvent the wheel.

Open up a fresh Linux terminal window.

Type the following command to update the list of available packages that can be installed on your system.

sudo apt-get update

Type your password.

Upgrade all the packages. The -y flag in the following command is used to confirm to our computer that we want to upgrade all the packages.

sudo apt-get -y upgrade

Type the following command to check the version of Python you have installed.

python3 --version

My version is 3.6.9. Your version might be different. That’s fine.

Now, let’s install pip.

sudo apt-get install -y python3-pip

If at any point in the future you want to install a Python-related package using pip, you can use the following command:

pip3 install package_name

Create a Virtual Environment

In this section, we will set up a virtual environment. You can think of a virtual environment as an independent workspace with its own set of libraries, settings, packages, and programming language versions installed.

For example, you might have a project that needs to run using an older version of Python, like Python 2.7. You might have another project that requires Python 3.8. Setting up separate virtual environments for each project will make sure that the projects stay isolated from one another.

Let’s install the virtual environment package.

sudo apt-get install -y python3-venv

With the software installed, we can now create the virtual environment using the following command. The dot(.) in front of py3venv makes the directory a hidden directory (the dot is optional):

python3 -m venv .py3venv

Type the following command to get a list of all the directories. You should see the .py3venv folder there.

ls -a

List all the contents inside the .py3venv folder.

ls .py3venv/

Now that the virtual environment has been created, we can activate it using the following command:

source ~/.py3venv/bin/activate

Look what happened. There is a prefix on the current line that has the name of the virtual environment we created. This prefix means that the .py3venv virtual environment is currently active.

When a virtual environment is active that means that when we create software programs here in Python, these programs will use the settings and packages of just this virtual environment.

Keep your terminal window open. We’re not ready to close it just yet. Move on to the next section so that we can write our first program in Python.

Write a “Hello World” Program

Let’s write a program that does nothing but print “Hello Automatic Addison” (i.e. my version of a “Hello World” program) to the screen.

Create a new folder.

mkdir py_basics

Move to that folder.

cd py_basics

Open a new Python program.

gedit hello_automaticaddison.py

Type the following code in there:

#!/usr/bin/env python 
print("Hello Automatic Addison!")

Save the file, and close it.

See if your file is in there.

ls

Run the program.

python hello_automaticaddison.py

Deactivate the virtual environment.

deactivate

That’s it. Keep building!

How to Determine the Orientation of an Object Using OpenCV

In this tutorial, we will build a program that can determine the orientation of an object (i.e. rotation angle in degrees) using the popular computer vision library OpenCV.

Real-World Applications

One of the most common real-world use cases of the program we will develop in this tutorial is when you want to develop a pick and place system for robotic arms. Determining the orientation of an object on a conveyor belt is key to determining the appropriate way to grasp the object, pick it up, and place it in another location.

Let’s get started!

Prerequisites

Python 3.7 or higher

Installation and Setup

Before we get started, let’s make sure we have all the software packages installed. Check to see if you have OpenCV installed on your machine. If you are using Anaconda, you can type:

conda install -c conda-forge opencv

Alternatively, you can type:

pip install opencv-python

Install Numpy, the scientific computing library.

pip install numpy

Find an Image File

Find an image. My input image is 1200 pixels in width and 900 pixels in height. The filename of my input image is input_img.jpg.

Write the Code

Here is the code. It accepts an image named input_img.jpg and outputs an annotated image named output_img.jpg. Pieces of the code pull from the official OpenCV implementation.

import cv2 as cv
from math import atan2, cos, sin, sqrt, pi
import numpy as np

def drawAxis(img, p_, q_, color, scale):
  p = list(p_)
  q = list(q_)

  ## [visualization1]
  angle = atan2(p[1] - q[1], p[0] - q[0]) # angle in radians
  hypotenuse = sqrt((p[1] - q[1]) * (p[1] - q[1]) + (p[0] - q[0]) * (p[0] - q[0]))

  # Here we lengthen the arrow by a factor of scale
  q[0] = p[0] - scale * hypotenuse * cos(angle)
  q[1] = p[1] - scale * hypotenuse * sin(angle)
  cv.line(img, (int(p[0]), int(p[1])), (int(q[0]), int(q[1])), color, 3, cv.LINE_AA)

  # create the arrow hooks
  p[0] = q[0] + 9 * cos(angle + pi / 4)
  p[1] = q[1] + 9 * sin(angle + pi / 4)
  cv.line(img, (int(p[0]), int(p[1])), (int(q[0]), int(q[1])), color, 3, cv.LINE_AA)

  p[0] = q[0] + 9 * cos(angle - pi / 4)
  p[1] = q[1] + 9 * sin(angle - pi / 4)
  cv.line(img, (int(p[0]), int(p[1])), (int(q[0]), int(q[1])), color, 3, cv.LINE_AA)
  ## [visualization1]

def getOrientation(pts, img):
  ## [pca]
  # Construct a buffer used by the pca analysis
  sz = len(pts)
  data_pts = np.empty((sz, 2), dtype=np.float64)
  for i in range(data_pts.shape[0]):
    data_pts[i,0] = pts[i,0,0]
    data_pts[i,1] = pts[i,0,1]

  # Perform PCA analysis
  mean = np.empty((0))
  mean, eigenvectors, eigenvalues = cv.PCACompute2(data_pts, mean)

  # Store the center of the object
  cntr = (int(mean[0,0]), int(mean[0,1]))
  ## [pca]

  ## [visualization]
  # Draw the principal components
  cv.circle(img, cntr, 3, (255, 0, 255), 2)
  p1 = (cntr[0] + 0.02 * eigenvectors[0,0] * eigenvalues[0,0], cntr[1] + 0.02 * eigenvectors[0,1] * eigenvalues[0,0])
  p2 = (cntr[0] - 0.02 * eigenvectors[1,0] * eigenvalues[1,0], cntr[1] - 0.02 * eigenvectors[1,1] * eigenvalues[1,0])
  drawAxis(img, cntr, p1, (255, 255, 0), 1)
  drawAxis(img, cntr, p2, (0, 0, 255), 5)

  angle = atan2(eigenvectors[0,1], eigenvectors[0,0]) # orientation in radians
  ## [visualization]

  # Label with the rotation angle
  label = "  Rotation Angle: " + str(-int(np.rad2deg(angle)) - 90) + " degrees"
  textbox = cv.rectangle(img, (cntr[0], cntr[1]-25), (cntr[0] + 250, cntr[1] + 10), (255,255,255), -1)
  cv.putText(img, label, (cntr[0], cntr[1]), cv.FONT_HERSHEY_SIMPLEX, 0.5, (0,0,0), 1, cv.LINE_AA)

  return angle

# Load the image
img = cv.imread("input_img.jpg")

# Was the image there?
if img is None:
  print("Error: File not found")
  exit(0)

cv.imshow('Input Image', img)

# Convert image to grayscale
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

# Convert image to binary
_, bw = cv.threshold(gray, 50, 255, cv.THRESH_BINARY | cv.THRESH_OTSU)

# Find all the contours in the thresholded image
contours, _ = cv.findContours(bw, cv.RETR_LIST, cv.CHAIN_APPROX_NONE)

for i, c in enumerate(contours):

  # Calculate the area of each contour
  area = cv.contourArea(c)

  # Ignore contours that are too small or too large
  if area < 3700 or 100000 < area:
    continue

  # Draw each contour only for visualisation purposes
  cv.drawContours(img, contours, i, (0, 0, 255), 2)

  # Find the orientation of each shape
  getOrientation(c, img)

cv.imshow('Output Image', img)
cv.waitKey(0)
cv.destroyAllWindows()
 
# Save the output image to the current directory
cv.imwrite("output_img.jpg", img)

Output Image

Here is the result:

Understanding the Rotation Axes

The positive x-axis of each object is the red line. The positive y-axis of each object is the blue line.

The global positive x-axis goes from left to right horizontally across the image. The global positive z-axis points out of this page. The global positive y-axis points from the bottom of the image to the top of the image vertically.

Using the right-hand rule to measure rotation, you stick your four fingers out straight (index finger to pinky finger) in the direction of the global positive x-axis.

You then rotate your four fingers 90 degrees counterclockwise. Your fingertips point towards the positive y-axis, and your thumb points out of this page towards the positive z-axis.

Calculate an Orientation Between 0 and 180 Degrees

If we want to calculate the orientation of an object and make sure that the result is always between 0 and 180 degrees, we can use this code:

# This programs calculates the orientation of an object.
# The input is an image, and the output is an annotated image
# with the angle of otientation for each object (0 to 180 degrees)

import cv2 as cv
from math import atan2, cos, sin, sqrt, pi
import numpy as np

# Load the image
img = cv.imread("input_img.jpg")

# Was the image there?
if img is None:
  print("Error: File not found")
  exit(0)

cv.imshow('Input Image', img)

# Convert image to grayscale
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

# Convert image to binary
_, bw = cv.threshold(gray, 50, 255, cv.THRESH_BINARY | cv.THRESH_OTSU)

# Find all the contours in the thresholded image
contours, _ = cv.findContours(bw, cv.RETR_LIST, cv.CHAIN_APPROX_NONE)

for i, c in enumerate(contours):

  # Calculate the area of each contour
  area = cv.contourArea(c)

  # Ignore contours that are too small or too large
  if area < 3700 or 100000 < area:
    continue

  # cv.minAreaRect returns:
  # (center(x, y), (width, height), angle of rotation) = cv2.minAreaRect(c)
  rect = cv.minAreaRect(c)
  box = cv.boxPoints(rect)
  box = np.int0(box)

  # Retrieve the key parameters of the rotated bounding box
  center = (int(rect[0][0]),int(rect[0][1])) 
  width = int(rect[1][0])
  height = int(rect[1][1])
  angle = int(rect[2])

  	
  if width < height:
    angle = 90 - angle
  else:
    angle = -angle
		
  label = "  Rotation Angle: " + str(angle) + " degrees"
  textbox = cv.rectangle(img, (center[0]-35, center[1]-25), 
    (center[0] + 295, center[1] + 10), (255,255,255), -1)
  cv.putText(img, label, (center[0]-50, center[1]), 
    cv.FONT_HERSHEY_SIMPLEX, 0.7, (0,0,0), 1, cv.LINE_AA)
  cv.drawContours(img,[box],0,(0,0,255),2)

cv.imshow('Output Image', img)
cv.waitKey(0)
cv.destroyAllWindows()
 
# Save the output image to the current directory
cv.imwrite("min_area_rec_output.jpg", img)

Here is the output:

That’s it. Keep building!

Human Pose Estimation Using Deep Learning in OpenCV

In this tutorial, we will implement human pose estimation. Pose estimation means estimating the position and orientation of objects (in this case humans) relative to the camera. By the end of this tutorial, you will be able to generate the following output:

Real-World Applications

Human pose estimation has a number of real-world applications:

Robotic task learning: Enabling robots to acquire new skills by imitating the actions of a human teacher
Virtual reality applications
Augmented reality applications (overlaying graphics on top of physical objects)
Sign language understanding
Recognizing human poses (sitting, standing, running, etc.)

Let’s get started!

Prerequisites

Python 3.7 or higher

Installation and Setup

We need to make sure we have all the software packages installed. Check to see if you have OpenCV installed on your machine. If you are using Anaconda, you can type:

conda install -c conda-forge opencv

Alternatively, you can type:

pip install opencv-python

Make sure you have NumPy installed, a scientific computing library for Python.

If you’re using Anaconda, you can type:

conda install numpy

Alternatively, you can type:

pip install numpy

Find Some Videos

The first thing we need to do is find some videos to serve as our test cases.

We want to download videos that contain humans. The video files should be in mp4 format and 1920 x 1080 in dimensions.

I found some good candidates on Pixabay.com and Dreamstime.com.

Take your videos and put them inside a directory on your computer.

Download the Protobuf File

Inside the same directory as your videos, download the protobuf file on this page. It is named graph_opt.pb. This file contains the weights of the neural network. The neural network is what we will use to determine the human’s position and orientation (i.e. pose).

Brief Description of OpenPose

We will use the OpenPose application along with OpenCV to do what we need to do in this project. OpenPose is an open source real-time 2D pose estimation application for people in video and images. It was developed by students and faculty members at Carnegie Mellon University.

You can learn the theory and details of how OpenPose works in this paper and at GeeksforGeeks.

Write the Code

Here is the code. Make sure you put the code in the same directory on your computer where you put the other files.

The only lines you need to change are:

Line 14 (name of the input file in mp4 format)
Line 15 (input file size)
Line 18 (output file name)

# Project: Human Pose Estimation Using Deep Learning in OpenCV
# Author: Addison Sears-Collins
# Date created: February 25, 2021
# Description: A program that takes a video with a human as input and outputs
# an annotated version of the video with the human's position and orientation..

# Reference: https://github.com/quanhua92/human-pose-estimation-opencv

# Import the important libraries
import cv2 as cv # Computer vision library
import numpy as np # Scientific computing library

# Make sure the video file is in the same directory as your code
filename = 'dancing32.mp4'
file_size = (1920,1080) # Assumes 1920x1080 mp4 as the input video file

# We want to save the output to a video file
output_filename = 'dancing32_output.mp4'
output_frames_per_second = 20.0 

BODY_PARTS = { "Nose": 0, "Neck": 1, "RShoulder": 2, "RElbow": 3, "RWrist": 4,
               "LShoulder": 5, "LElbow": 6, "LWrist": 7, "RHip": 8, "RKnee": 9,
               "RAnkle": 10, "LHip": 11, "LKnee": 12, "LAnkle": 13, "REye": 14,
               "LEye": 15, "REar": 16, "LEar": 17, "Background": 18 }

POSE_PAIRS = [ ["Neck", "RShoulder"], ["Neck", "LShoulder"], ["RShoulder", "RElbow"],
               ["RElbow", "RWrist"], ["LShoulder", "LElbow"], ["LElbow", "LWrist"],
               ["Neck", "RHip"], ["RHip", "RKnee"], ["RKnee", "RAnkle"], ["Neck", "LHip"],
               ["LHip", "LKnee"], ["LKnee", "LAnkle"], ["Neck", "Nose"], ["Nose", "REye"],
               ["REye", "REar"], ["Nose", "LEye"], ["LEye", "LEar"] ]

# Width and height of training set
inWidth = 368
inHeight = 368

net = cv.dnn.readNetFromTensorflow("graph_opt.pb")

cap = cv.VideoCapture(filename)

# Create a VideoWriter object so we can save the video output
fourcc = cv.VideoWriter_fourcc(*'mp4v')
result = cv.VideoWriter(output_filename,  
                         fourcc, 
                         output_frames_per_second, 
                         file_size) 
# Process the video
while cap.isOpened():
    hasFrame, frame = cap.read()
    if not hasFrame:
        cv.waitKey()
        break

    frameWidth = frame.shape[1]
    frameHeight = frame.shape[0]
    
    net.setInput(cv.dnn.blobFromImage(frame, 1.0, (inWidth, inHeight), (127.5, 127.5, 127.5), swapRB=True, crop=False))
    out = net.forward()
    out = out[:, :19, :, :]  # MobileNet output [1, 57, -1, -1], we only need the first 19 elements

    assert(len(BODY_PARTS) == out.shape[1])

    points = []
    for i in range(len(BODY_PARTS)):
        # Slice heatmap of corresponging body's part.
        heatMap = out[0, i, :, :]

        # Originally, we try to find all the local maximums. To simplify a sample
        # we just find a global one. However only a single pose at the same time
        # could be detected this way.
        _, conf, _, point = cv.minMaxLoc(heatMap)
        x = (frameWidth * point[0]) / out.shape[3]
        y = (frameHeight * point[1]) / out.shape[2]
        # Add a point if it's confidence is higher than threshold.
        # Feel free to adjust this confidence value.  
        points.append((int(x), int(y)) if conf > 0.2 else None)

    for pair in POSE_PAIRS:
        partFrom = pair[0]
        partTo = pair[1]
        assert(partFrom in BODY_PARTS)
        assert(partTo in BODY_PARTS)

        idFrom = BODY_PARTS[partFrom]
        idTo = BODY_PARTS[partTo]

        if points[idFrom] and points[idTo]:
            cv.line(frame, points[idFrom], points[idTo], (0, 255, 0), 3)
            cv.ellipse(frame, points[idFrom], (3, 3), 0, 0, 360, (255, 0, 0), cv.FILLED)
            cv.ellipse(frame, points[idTo], (3, 3), 0, 0, 360, (255, 0, 0), cv.FILLED)

    t, _ = net.getPerfProfile()
    freq = cv.getTickFrequency() / 1000
    cv.putText(frame, '%.2fms' % (t / freq), (10, 20), cv.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0))

    # Write the frame to the output video file
    result.write(frame)
		
# Stop when the video is finished
cap.release()
	
# Release the video recording
result.release()

Run the Code

To run the code, type:

python openpose.py

Video Output

Here is the output I got:

Further Work

If you would like to do a deep dive into pose estimation, check out the official GitHub for the OpenPose project here.

That’s it. Keep building!