In this tutorial, we will write a program to detect pedestrians in a photo and a video using a technique called the Histogram of Oriented Gradients (HOG). We will use the OpenCV computer vision library, which has a built-in pedestrian detection method that is based on the original research paper on HOG.
I won’t go into the details and the math behind HOG (you don’t need to know the math in order to implement the algorithm), but if you’re interested in learning about what goes on under the hood, check out the original paper, “Histograms of Oriented Gradients for Human Detection” by Dalal and Triggs (2005).
The first thing you need to do is to make sure you have OpenCV installed on your machine. If you’re using Anaconda, you can type this command into the terminal:
conda install -c conda-forge opencv
Alternatively, you can type:
pip install opencv-python
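To confirm the installation worked, you can print the OpenCV version from Python:

import cv2 # Import the OpenCV library to enable computer vision

# Any recent OpenCV version will work for this tutorial
print(cv2.__version__)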
Detect Pedestrians in an Image
Get an image that contains pedestrians and put it inside a directory somewhere on your computer.
Here are a couple images that I will use as test cases:
Inside that same directory, write the following code. I will save my file as detect_pedestrians_hog.py:
import cv2 # Import the OpenCV library to enable computer vision

# Author: Addison Sears-Collins
# https://automaticaddison.com
# Description: Detect pedestrians in an image using the
#   Histogram of Oriented Gradients (HOG) method

# Make sure the image file is in the same directory as your code
filename = 'pedestrians_1.jpg'

def main():

    # Create a HOGDescriptor object
    hog = cv2.HOGDescriptor()

    # Initialize the People Detector
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

    # Load an image
    image = cv2.imread(filename)

    # Detect people
    # image: Source image
    # winStride: step size in x and y direction of the sliding window
    # padding: no. of pixels in x and y direction for padding of sliding window
    # scale: Detection window size increase coefficient
    # bounding_boxes: Location of detected people
    # weights: Weight scores of detected people
    (bounding_boxes, weights) = hog.detectMultiScale(image,
                                                     winStride=(4, 4),
                                                     padding=(8, 8),
                                                     scale=1.05)

    # Draw bounding boxes on the image
    for (x, y, w, h) in bounding_boxes:
        cv2.rectangle(image,
                      (x, y),
                      (x + w, y + h),
                      (0, 0, 255),
                      4)

    # Create the output file name by removing the '.jpg' part
    size = len(filename)
    new_filename = filename[:size - 4]
    new_filename = new_filename + '_detect.jpg'

    # Save the new image in the working directory
    cv2.imwrite(new_filename, image)

    # Display the image
    cv2.imshow("Image", image)

    # Display the window until any key is pressed
    cv2.waitKey(0)

    # Close all windows
    cv2.destroyAllWindows()

main()
Run the code.
python detect_pedestrians_hog.py
Image Results
Here is the output for the first image:
Here is the output for the second image:
Detect Pedestrians in a Video
Now that we know how to detect pedestrians in an image, let’s detect pedestrians in a video.
Find a video file that has pedestrians in it. You can check free video sites like Pixabay if you don’t have any videos.
The video should have dimensions of 1920 x 1080 for this implementation. If you have a video of another size, you will have to tweak the parameters in the code.
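If you’re not sure what dimensions your video is, you can query them with OpenCV before running the detection code. This is just a quick check; replace the file name with your own video:

import cv2 # Import the OpenCV library to enable computer vision

cap = cv2.VideoCapture('pedestrians_on_street_1.mp4') # Open the video file
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)) # Frame width in pixels
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)) # Frame height in pixels
print(f'{width} x {height}') # e.g. 1920 x 1080
cap.release() # Release the video file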
Place the video inside a directory.
Now let’s install a library that will apply a fancy mathematical technique called non-maxima suppression to take multiple overlapping bounding boxes and compress them into just one bounding box.
pip install --upgrade imutils
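To get a feel for what non-maxima suppression does before we apply it to real detections, here is a minimal sketch using made-up boxes (each box is x1, y1, x2, y2). The three overlapping boxes collapse into one:

import numpy as np # Import the NumPy scientific computing library
from imutils.object_detection import non_max_suppression # Handle overlapping boxes

# Three heavily overlapping boxes around one person, plus one box far away
boxes = np.array([[10, 10, 110, 210],
                  [14, 12, 112, 208],
                  [12, 15, 108, 212],
                  [300, 40, 380, 220]])

# Boxes that overlap more than overlapThresh are compressed into one box
picked = non_max_suppression(boxes, probs=None, overlapThresh=0.45)
print(picked) # Two boxes remain: one per person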
Also, make sure you have NumPy installed, a scientific computing library for Python.
If you’re using Anaconda, you can type:
conda install numpy
Alternatively, you can type:
pip install numpy
Inside that same directory, write the following code. I will save my file as detect_pedestrians_video_hog.py. We will save the output as an .mp4 video file:
# Author: Addison Sears-Collins
# https://automaticaddison.com
# Description: Detect pedestrians in a video using the
#   Histogram of Oriented Gradients (HOG) method

import cv2 # Import the OpenCV library to enable computer vision
import numpy as np # Import the NumPy scientific computing library
from imutils.object_detection import non_max_suppression # Handle overlapping boxes

# Make sure the video file is in the same directory as your code
filename = 'pedestrians_on_street_1.mp4'
file_size = (1920, 1080) # Assumes 1920x1080 mp4
scale_ratio = 1 # Option to scale to fraction of original size.

# We want to save the output to a video file
output_filename = 'pedestrians_on_street.mp4'
output_frames_per_second = 20.0

def main():

    # Create a HOGDescriptor object
    hog = cv2.HOGDescriptor()

    # Initialize the People Detector
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

    # Load a video
    cap = cv2.VideoCapture(filename)

    # Create a VideoWriter object so we can save the video output
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    result = cv2.VideoWriter(output_filename,
                             fourcc,
                             output_frames_per_second,
                             file_size)

    # Process the video
    while cap.isOpened():

        # Capture one frame at a time
        success, frame = cap.read()

        # Do we have a video frame? If true, proceed.
        if success:

            # Resize the frame
            width = int(frame.shape[1] * scale_ratio)
            height = int(frame.shape[0] * scale_ratio)
            frame = cv2.resize(frame, (width, height))

            # Store the original frame
            orig_frame = frame.copy()

            # Detect people
            # image: a single frame from the video
            # winStride: step size in x and y direction of the sliding window
            # padding: no. of pixels in x and y direction for padding of
            #   sliding window
            # scale: Detection window size increase coefficient
            # bounding_boxes: Location of detected people
            # weights: Weight scores of detected people
            # Tweak these parameters for better results
            (bounding_boxes, weights) = hog.detectMultiScale(frame,
                                                             winStride=(16, 16),
                                                             padding=(4, 4),
                                                             scale=1.05)

            # Draw bounding boxes on the frame
            for (x, y, w, h) in bounding_boxes:
                cv2.rectangle(orig_frame,
                              (x, y),
                              (x + w, y + h),
                              (0, 0, 255),
                              2)

            # Get rid of overlapping bounding boxes.
            # You can tweak the overlapThresh value for better results.
            bounding_boxes = np.array([[x, y, x + w, y + h] for (
                                      x, y, w, h) in bounding_boxes])
            selection = non_max_suppression(bounding_boxes,
                                            probs=None,
                                            overlapThresh=0.45)

            # Draw the final bounding boxes
            for (x1, y1, x2, y2) in selection:
                cv2.rectangle(frame,
                              (x1, y1),
                              (x2, y2),
                              (0, 255, 0),
                              4)

            # Write the frame to the output video file
            result.write(frame)

            # Display the frame
            cv2.imshow("Frame", frame)

            # Display frame for X milliseconds and check if q key is pressed
            # q == quit
            if cv2.waitKey(25) & 0xFF == ord('q'):
                break

        # No more video frames left
        else:
            break

    # Stop when the video is finished
    cap.release()

    # Release the video recording
    result.release()

    # Close all windows
    cv2.destroyAllWindows()

main()
Predict Vehicle Fuel Economy Using a Deep Neural Network
In this tutorial, we will use TensorFlow 2.0 with Keras to build a deep neural network that will enable us to predict a vehicle’s fuel economy (in miles per gallon) from the attributes recorded for each vehicle:
Cylinders
Displacement
Horsepower
Weight
Acceleration
Model year
Origin
Car name (this field is dropped when the data is loaded, so it is not used as an input to the model)
We will use the Auto MPG Data Set at the UCI Machine Learning Repository.
If you want to use GPU support for your TensorFlow installation, you will need to set up the NVIDIA CUDA toolkit and cuDNN first. The exact installation steps change quite frequently, but the overall process remains relatively the same.
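Once TensorFlow is installed, you can quickly check whether it detects your GPU. This uses the standard device-listing call in recent TensorFlow 2 releases; an empty list just means TensorFlow will fall back to the CPU:

import tensorflow as tf # Machine learning library

# List the GPUs TensorFlow can see; an empty list means CPU-only
print(tf.config.list_physical_devices('GPU'))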
Directions
Open up a new Python program (in your favorite text editor or Python IDE) and write the following code. I’m going to name the program vehicle_fuel_economy.py. I’ll explain the code later in this tutorial.
# Project: Predict Vehicle Fuel Economy Using a Deep Neural Network
# Author: Addison Sears-Collins
# Date created: November 3, 2020

import pandas as pd # Used for data analysis
import pathlib # An object-oriented interface to the filesystem
import matplotlib.pyplot as plt # Handles the creation of plots
import seaborn as sns # Data visualization library
import tensorflow as tf # Machine learning library
from tensorflow import keras # Library for neural networks
from tensorflow.keras import layers # Handles the layers of the neural network

def main():

    # Set the data path for the Auto-Mpg data set from the UCI Machine Learning Repository
    datasetPath = keras.utils.get_file("auto-mpg.data", "https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data")

    # Set the column names for the data set
    columnNames = ['MPG', 'Cylinders', 'Displacement', 'Horsepower', 'Weight',
                   'Acceleration', 'Model Year', 'Origin']

    # Import the data set
    originalData = pd.read_csv(datasetPath, names=columnNames, na_values="?",
                               comment='\t', sep=" ", skipinitialspace=True)

    # Check the data set
    # print("Original Data Set Excerpt")
    # print(originalData.head())
    # print()

    # Generate a copy of the data set
    data = originalData.copy()

    # Count how many NAs each data attribute has
    # print("Number of NAs in the data set")
    # print(data.isna().sum())
    # print()

    # Now, let's remove the NAs from the data set
    data = data.dropna()

    # Perform one-hot encoding on the Origin attribute
    # since it is a categorical variable
    origin = data.pop('Origin') # Return item and drop from frame
    data['USA'] = (origin == 1) * 1.0
    data['Europe'] = (origin == 2) * 1.0
    data['Japan'] = (origin == 3) * 1.0

    # Generate a training data set (80% of the data) and a testing set (20% of the data)
    trainingData = data.sample(frac=0.8, random_state=0)

    # Generate a testing data set
    testingData = data.drop(trainingData.index)

    # Separate the attributes from the label in both the testing
    # and training data. The label is the thing we are trying
    # to predict (i.e. miles per gallon 'MPG')
    trainingLabelData = trainingData.pop('MPG')
    testingLabelData = testingData.pop('MPG')

    # Normalize the data
    normalizedTrainingData = normalize(trainingData)
    normalizedTestingData = normalize(testingData)
    # print(normalizedTrainingData.head())

    # Generate the neural network
    neuralNet = generateNeuralNetwork(trainingData)

    # See a summary of the neural network
    # The first layer has 640 parameters
    #   (9 input values * 64 neurons) + 64 bias values
    # The second layer has 4160 parameters
    #   (64 input values * 64 neurons) + 64 bias values
    # The output layer has 65 parameters
    #   (64 input values * 1 neuron) + 1 bias value
    # print(neuralNet.summary())

    EPOCHS = 1000

    # Train the model for a fixed number of epochs
    # history.history attribute is returned from the fit() function.
    # history.history is a record of training loss values and
    # metrics values at successive epochs, as well as validation
    # loss values and validation metrics values.
    history = neuralNet.fit(
        x=normalizedTrainingData,
        y=trainingLabelData,
        epochs=EPOCHS,
        validation_split=0.2,
        verbose=0,
        callbacks=[PrintDot()]
    )

    # Plot the neural network metrics (training error and validation error)
    # Training error is the error when the trained neural network is
    # run on the training data.
    # Validation error is used to minimize overfitting. It indicates how
    # well the model performs on data it hasn't been trained on.
    # plotNeuralNetMetrics(history)

    # Generate another neural network so that we can use early stopping
    neuralNet2 = generateNeuralNetwork(trainingData)

    # We want to stop training the model when the
    # validation error stops improving.
    # monitor indicates the quantity we want to monitor.
    # patience indicates the number of epochs with no improvement after which
    # training will terminate.
    earlyStopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10)

    history2 = neuralNet2.fit(
        x=normalizedTrainingData,
        y=trainingLabelData,
        epochs=EPOCHS,
        validation_split=0.2,
        verbose=0,
        callbacks=[earlyStopping, PrintDot()]
    )

    # Plot metrics
    # plotNeuralNetMetrics(history2)

    # Return the loss value and metrics values for the model in test mode
    # The mean absolute error for the predictions should
    # stabilize around 2 miles per gallon
    loss, meanAbsoluteError, meanSquaredError = neuralNet2.evaluate(
        x=normalizedTestingData,
        y=testingLabelData,
        verbose=0
    )
    # print(f'\nMean Absolute Error on Test Data Set = {meanAbsoluteError} miles per gallon')

    # Make fuel economy predictions by deploying the trained neural network on the
    # test data set (data that is brand new for the trained neural network).
    testingDataPredictions = neuralNet2.predict(normalizedTestingData).flatten()

    # Plot the predicted MPG vs. the true MPG
    # testingLabelData are the true MPG values
    # testingDataPredictions are the predicted MPG values
    # plotTestingDataPredictions(testingLabelData, testingDataPredictions)

    # Plot the prediction error distribution
    # plotPredictionError(testingLabelData, testingDataPredictions)

    # Save the neural network in Hierarchical Data Format version 5 (HDF5) format
    neuralNet2.save('fuel_economy_prediction_nnet.h5')

    # Import the saved model
    neuralNet3 = keras.models.load_model('fuel_economy_prediction_nnet.h5')
    print("\n\nNeural network has loaded successfully...\n")

    # Show neural network parameters
    print(neuralNet3.summary())

    # Make a prediction using the saved model we just imported
    print("\nMaking predictions...")
    testingDataPredictionsNN3 = neuralNet3.predict(normalizedTestingData).flatten()

    # Show Predicted MPG vs. Actual MPG
    plotTestingDataPredictions(testingLabelData, testingDataPredictionsNN3)

# Generate the neural network
def generateNeuralNetwork(trainingData):
    # A Sequential model is a stack of layers where each layer is
    # single-input, single-output.
    # The network below has 3 layers.
    neuralNet = keras.Sequential([
        # Each neuron in a layer receives input from all the
        # neurons in the previous layer (densely connected).
        # Use the ReLU activation function. This function transforms the
        # summed weighted input into the node's output.
        # The first layer needs to know the number of attributes (keys) in the data set.
        # The first and second layers have 64 nodes each.
        layers.Dense(64, activation=tf.nn.relu, input_shape=[len(trainingData.keys())]),
        layers.Dense(64, activation=tf.nn.relu),
        layers.Dense(1) # The output layer is a single, continuous value (i.e. miles per gallon)
    ])

    # Penalize the update of the neural network parameters that are causing
    # the cost function to have large oscillations by using a moving average
    # of the square of the gradients and dividing the gradient by the root of this
    # average. Reduces the step size for large gradients and increases
    # the step size for small gradients.
    # The input into this function is the learning rate.
    optimizer = keras.optimizers.RMSprop(0.001)

    # Set the configurations for the model to get it ready for training
    neuralNet.compile(loss='mean_squared_error',
                      optimizer=optimizer,
                      metrics=['mean_absolute_error', 'mean_squared_error'])
    return neuralNet

# Normalize the data set using the mean and standard deviation
def normalize(data):
    statistics = data.describe()
    statistics = statistics.transpose()
    return (data - statistics['mean']) / statistics['std']

# Plot metrics for the neural network
def plotNeuralNetMetrics(history):
    neuralNetMetrics = pd.DataFrame(history.history)
    neuralNetMetrics['epoch'] = history.epoch

    plt.figure()
    plt.xlabel('Epoch')
    plt.ylabel('Mean Abs Error [MPG]')
    plt.plot(neuralNetMetrics['epoch'],
             neuralNetMetrics['mean_absolute_error'],
             label='Train Error')
    plt.plot(neuralNetMetrics['epoch'],
             neuralNetMetrics['val_mean_absolute_error'],
             label='Val Error')
    plt.ylim([0, 5])
    plt.legend()

    plt.figure()
    plt.xlabel('Epoch')
    plt.ylabel('Mean Square Error [$MPG^2$]')
    plt.plot(neuralNetMetrics['epoch'],
             neuralNetMetrics['mean_squared_error'],
             label='Train Error')
    plt.plot(neuralNetMetrics['epoch'],
             neuralNetMetrics['val_mean_squared_error'],
             label='Val Error')
    plt.ylim([0, 20])
    plt.legend()
    plt.show()

# Plot prediction error
def plotPredictionError(testingLabelData, testingDataPredictions):
    # Error = Predicted - Actual
    error = testingDataPredictions - testingLabelData

    plt.hist(error, bins=50)
    plt.xlim([-10, 10])
    plt.xlabel("Predicted MPG - Actual MPG")
    _ = plt.ylabel("Count")
    plt.show()

# Plot predictions vs. true values
def plotTestingDataPredictions(testingLabelData, testingDataPredictions):
    # Plot the data points (x, y)
    plt.scatter(testingLabelData, testingDataPredictions)

    # Label the axes
    plt.xlabel('True Values (Miles per gallon)')
    plt.ylabel('Predicted Values (Miles per gallon)')

    # Plot a line between (0,0) and (50,50)
    point1 = [0, 0]
    point2 = [50, 50]
    xValues = [point1[0], point2[0]]
    yValues = [point1[1], point2[1]]
    plt.plot(xValues, yValues)

    # Set the x and y axes limits
    plt.xlim(0, 50)
    plt.ylim(0, 50)

    # x and y axes are equal in displayed dimensions
    plt.gca().set_aspect('equal', adjustable='box')

    # Show the plot
    plt.show()

# Show the training process by printing a period for each epoch that completes
class PrintDot(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs):
        if epoch % 100 == 0: print('')
        print('.', end='')

main()
Save the Python program.
If you run your Python programs using Anaconda, open the Anaconda prompt.
If you like to run your programs in a virtual environment, activate the virtual environment. I have a virtual environment named tf_2.
conda activate tf_2
Navigate to the folder where you saved the Python program.
cd [path to folder]
For example,
cd C:\MyFiles
Install any libraries that you need. I didn’t have some of the libraries in the “import” section of my code installed, so I’ll install them now.
pip install pandas
pip install seaborn
To run the code, type:
python vehicle_fuel_economy.py
If you’re using a GPU with TensorFlow and you get error messages about missing libraries, search the Internet for the missing .dll files, download them, and place them in C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin.
Code Output
In this section, I will pull out snippets of the code and show you the resulting output when you uncomment those lines.
# Check the data set
print("Original Data Set Excerpt")
print(originalData.head())
print()
# Count how many NAs each data attribute has
print("Number of NAs in the data set")
print(data.isna().sum())
print()
# See a summary of the neural network
# The first layer has 640 parameters
#(9 input values * 64 neurons) + 64 bias values
# The second layer has 4160 parameters
#(64 input values * 64 neurons) + 64 bias values
# The output layer has 65 parameters
#(64 input values * 1 neuron) + 1 bias value
print(neuralNet.summary())
# Plot the neural network metrics (Training error and validation error)
# Training error is the error when the trained neural network is
# run on the training data.
# Validation error is used to minimize overfitting. It indicates how
# well the data fits on data it hasn't been trained on.
plotNeuralNetMetrics(history)
# Plot metrics
plotNeuralNetMetrics(history2)
print(f'\nMean Absolute Error on Test Data Set = {meanAbsoluteError} miles per gallon')
# Plot the predicted MPG vs. the true MPG
# testingLabelData are the true MPG values
# testingDataPredictions are the predicted MPG values
plotTestingDataPredictions(testingLabelData, testingDataPredictions)
# Plot the prediction error distribution
plotPredictionError(testingLabelData, testingDataPredictions)
# Save the neural network in Hierarchical Data Format version 5 (HDF5) format
neuralNet2.save('fuel_economy_prediction_nnet.h5')
# Import the saved model
neuralNet3 = keras.models.load_model('fuel_economy_prediction_nnet.h5')
References
Quinlan, R. (1993). Combining Instance-Based and Model-Based Learning. In Proceedings of the Tenth International Conference on Machine Learning, 236-243, University of Massachusetts, Amherst. Morgan Kaufmann.
Inverse Kinematics for a Robotic Arm
Inverse kinematics is about calculating the angles of joints (i.e. angles of the servo motors on a robotic arm) that will cause the end effector of a robotic arm (e.g. robotic gripper, hand, vacuum suction cup, etc.) to reach some desired position (x, y, z) in 3D space.
In this tutorial, we’ll take a look at two approaches: an analytical approach and a numerical approach. We’ll then code these up in Python so that you can see how the calculations are done in actual code.
Real-World Applications
This post contains a list of many of the applications of six degree of freedom robotic arms.
Analytical Approach vs Numerical Approach to Inverse Kinematics
The analytical approach to inverse kinematics involves a lot of matrix algebra and trigonometry.
The advantage of this approach is that once you’ve drawn the kinematic diagram and derived the equations, computation is fast (compared to the numerical approach, which is iterative). You don’t have to make initial guesses for the joint angles like you do in the numerical approach (we’ll look at this approach later in this tutorial).
The disadvantage of the analytical approach is that the kinematic diagram and trigonometric equations are tedious to derive. Also, the solutions from one robotic arm don’t generalize to other robotic arms. You have to derive new equations for each new robotic arm you work with that has a different kinematic structure.
Analytical Approach to Inverse Kinematics
In this section, we’ll explore one analytical approach to inverse kinematics (i.e. there are many approaches).
Assumptions
In this approach, we start off by making two assumptions:
The first three joints of the robotic arm are the only ones that determine the position of the end effector.
The last three joints (and any other joints after that) determine the orientation of the end effector.
Overall Steps
Here are the steps for calculating inverse kinematics for a six degree of freedom robotic arm.
Step 1: Draw the kinematic diagram of just the first three joints, and perform inverse kinematics using the graphical approach.
Step 2: Compute the forward kinematics on the first three joints to get the rotation of joint 3 relative to the global (i.e. base) coordinate frame. The outcome of this step will yield a matrix rot_mat_0_3, which means the rotation of frame 3 relative to frame 0 (i.e. the global coordinate frame).
Step 3: Calculate the inverse of rot_mat_0_3.
Step 4: Compute the forward kinematics on the last three joints, and extract the part that governs the rotation. This rotation matrix will be denoted as rot_mat_3_6.
Step 5: Determine the rotation matrix from frame 0 to 6 (i.e. rot_mat_0_6).
Step 6: Taking our desired x, y, and z coordinates as input, use the inverse kinematics equations from Step 1 to calculate the angles for the first three joints.
Step 7: Given the joint angles from Step 6, use the rotation matrix to calculate the values for the last three joints of the robotic arm.
Let’s run through an example.
Draw the Kinematic Diagram of Just the First Three Joints and Do Inverse Kinematics
First, let’s draw the kinematic diagram for our entire robot. If you need a refresher on how to draw kinematic diagrams, check out this tutorial.
Our robotic arm will have a cylindrical-style base (i.e. range of motion resembles a cylinder) and a spherical wrist (i.e. range of motion resembles a sphere). Here is its kinematic diagram:
Here is the kinematic diagram of just the first three joints:
Let’s do inverse kinematics for the diagram above using the graphical approach to inverse kinematics. We first need to draw the aerial view of the diagram above.
From the aerial view, we can derive two equations:
θ2 = tan⁻¹(y/x)
d3 = √(x² + y²) − a3 − a4
Now, let’s draw the side view of the robotic arm (sorry for the tiny print).
We can see from the above image that:
d1 = z − a1 − a2
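As a quick sanity check, here is what those three equations look like in Python. The link lengths and target position below are made-up numbers, purely to illustrate the calculation:

import numpy as np # Scientific computing library

# Hypothetical link lengths (replace these with your own measurements)
a1, a2, a3, a4 = 1.0, 1.0, 1.0, 1.0

# Hypothetical target position (x, y, z) for the end effector
x, y, z = 2.0, 1.0, 3.0

theta_2 = np.arctan2(y, x) # theta2 = tan^-1(y/x)
d3 = np.sqrt(x**2 + y**2) - a3 - a4 # d3 = sqrt(x^2 + y^2) - a3 - a4
d1 = z - a1 - a2 # d1 = z - a1 - a2

print(theta_2, d3, d1)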
Calculate rot_mat_0_3
For this step, you can either use this method or the Denavit-Hartenberg method to calculate the rotation matrix of frame 3 relative to frame 0. Either of those methods will yield the following rotation matrix (it is the same matrix you will see defined in the code later in this tutorial):

rot_mat_0_3 =
[ −sin θ2   0   cos θ2 ]
[  cos θ2   0   sin θ2 ]
[    0      1     0    ]
Calculate the Inverse of rot_mat_0_3
Now that we know rot_mat_0_3 (which we defined in the previous section), we need to take its inverse. The reason comes from the following relationship:

rot_mat_0_6 = rot_mat_0_3 × rot_mat_3_6

We can left-multiply both sides by the inverse of the rotation matrix we found in the previous section to solve for rot_mat_3_6:

rot_mat_3_6 = inverse(rot_mat_0_3) × rot_mat_0_6
Calculate rot_mat_3_6
To calculate the rotation of frame 6 relative to frame 3, we need to go back to the kinematic diagram we drew earlier.
Using either the rotation matrix method or Denavit-Hartenberg, here is the rotation matrix you get when you consider just the frames from 3 to 6 (writing s4 for sin θ4, c4 for cos θ4, and so on):

rot_mat_3_6 =
[ −s4·c5·c6 − c4·s6    s4·c5·s6 − c4·c6   −s4·s5 ]
[  c4·c5·c6 − s4·s6   −c4·c5·s6 − s4·c6    c4·s5 ]
[     −s5·c6               s5·s6             c5  ]
Determine rot_mat_0_6
Now, let’s determine the rotation matrix of frame 6 relative to frame 0, the global coordinate frame.
In this part, we need to determine what we want the orientation of the end effector to be so that we can define the rotation matrix of frame 6 relative to frame 0. We can choose any rotation matrix as long as it is a valid rotation matrix (I’ll explain what “valid” means below).
Imagine you want the end effector to point upwards towards the sky. In this case, z6 will point in the same direction as z0. Accordingly, if you can imagine z6 pointing straight upwards, x6 would be oriented in the opposite direction as x0, and y6 would be oriented in the opposite direction as y0. Our rotation matrix is thus:

rot_mat_0_6 =
[ −1    0   0 ]
[  0   −1   0 ]
[  0    0   1 ]
The rotation matrix above is valid. The reason it is valid is that the length (also known as the “norm” or “magnitude”) of each row and each column is 1. The length of a column or row is the square root of the sum of the squared elements. For example, looking at column 1, we have:

√((−1)² + 0² + 0²) = √1 = 1
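If you’d rather let NumPy do this check, here is a small sketch. It verifies the unit-length condition described above, and also the stricter orthonormality condition (a rotation matrix times its transpose equals the identity matrix):

import numpy as np # Scientific computing library

# The rotation matrix we chose for the end effector's orientation
R = np.array([[-1.0, 0.0, 0.0],
              [0.0, -1.0, 0.0],
              [0.0, 0.0, 1.0]])

print(np.linalg.norm(R, axis=0)) # Length of each column: [1. 1. 1.]
print(np.linalg.norm(R, axis=1)) # Length of each row: [1. 1. 1.]
print(np.allclose(R @ R.T, np.eye(3))) # Orthonormality check: True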
Write Python Code
Open up your favorite Python IDE or wherever you like to write Python code.
Create a new Python script. Call it inverse_kinematics_6dof_v1.py.
We want to set a desired position and orientation (relative to the base frame) for the end effector of the robotic arm and then have the program calculate the servo angles necessary to move the end effector to that position and orientation.
Write the following code. The expressions and matrices we derived earlier will be useful to you as you go through the code. Don’t be intimidated by how long the code is. Just copy and paste it into your favorite Python IDE or text editor, and read through it one line at a time.
#######################################################################################
# Program: Inverse Kinematics for a 6DOF Robotic Arm Using an Analytical Approach
# Description: Given a desired end position (x, y, z) of the end effector of a robot,
#              and a desired orientation of the end effector (relative to the base frame),
#              calculate the joint angles (i.e. angles for the servo motors).
# Author: Addison Sears-Collins
# Website: https://automaticaddison.com
# Date: October 19, 2020
# Reference: Sodemann, Dr. Angela 2020, RoboGrok, accessed 14 October 2020, <http://robogrok.com/>
#######################################################################################

import numpy as np # Scientific computing library

# Define the desired position of the end effector
# This is the target (goal) location.
x = 4.0
y = 2.0

# Calculate the angle of the second joint
theta_2 = np.arctan2(y, x)
print(f'Theta 2 = {theta_2} radians\n')

# Define the desired orientation of the end effector relative to the base frame
# (i.e. global frame)
# This is the target orientation.
# The 3x3 rotation matrix of frame 6 relative to frame 0
rot_mat_0_6 = np.array([[-1.0, 0.0, 0.0],
                        [0.0, -1.0, 0.0],
                        [0.0, 0.0, 1.0]])

# The 3x3 rotation matrix of frame 3 relative to frame 0
rot_mat_0_3 = np.array([[-np.sin(theta_2), 0.0, np.cos(theta_2)],
                        [np.cos(theta_2), 0.0, np.sin(theta_2)],
                        [0.0, 1.0, 0.0]])

# Calculate the inverse rotation matrix
inv_rot_mat_0_3 = np.linalg.inv(rot_mat_0_3)

# Calculate the 3x3 rotation matrix of frame 6 relative to frame 3
rot_mat_3_6 = inv_rot_mat_0_3 @ rot_mat_0_6
print(f'rot_mat_3_6 = {rot_mat_3_6}')

# We know the equation for rot_mat_3_6 from our pencil and paper
# analysis. The simplest term in that matrix is in the third column,
# third row. The value there in variable terms is cos(theta_5).
# From the printing above, we know the value there. Therefore,
# cos(theta_5) = value in the third row, third column of rot_mat_3_6, which means...
# theta_5 = arccosine(value in the third row, third column of rot_mat_3_6)
theta_5 = np.arccos(rot_mat_3_6[2, 2])
print(f'\nTheta 5 = {theta_5} radians')

# Calculate the value for theta_6
# We'll use the expression in the third row, first column of rot_mat_3_6.
# -sin(theta_5)cos(theta_6) = rot_mat_3_6[2,0]
# Solving for theta_6...
# rot_mat_3_6[2,0]/-sin(theta_5) = cos(theta_6)
# arccosine(rot_mat_3_6[2,0]/-sin(theta_5)) = theta_6
theta_6 = np.arccos(rot_mat_3_6[2, 0] / -np.sin(theta_5))
print(f'\nTheta 6 = {theta_6} radians')

# Calculate the value for theta_4 using one of the other
# cells in rot_mat_3_6. We'll use the second row, third column.
# cos(theta_4)sin(theta_5) = rot_mat_3_6[1,2]
# cos(theta_4) = rot_mat_3_6[1,2] / sin(theta_5)
# theta_4 = arccosine(rot_mat_3_6[1,2] / sin(theta_5))
theta_4 = np.arccos(rot_mat_3_6[1, 2] / np.sin(theta_5))
print(f'\nTheta 4 = {theta_4} radians')

# Check that the angles we calculated result in a valid rotation matrix
r11 = -np.sin(theta_4) * np.cos(theta_5) * np.cos(theta_6) - np.cos(theta_4) * np.sin(theta_6)
r12 = np.sin(theta_4) * np.cos(theta_5) * np.sin(theta_6) - np.cos(theta_4) * np.cos(theta_6)
r13 = -np.sin(theta_4) * np.sin(theta_5)
r21 = np.cos(theta_4) * np.cos(theta_5) * np.cos(theta_6) - np.sin(theta_4) * np.sin(theta_6)
r22 = -np.cos(theta_4) * np.cos(theta_5) * np.sin(theta_6) - np.sin(theta_4) * np.cos(theta_6)
r23 = np.cos(theta_4) * np.sin(theta_5)
r31 = -np.sin(theta_5) * np.cos(theta_6)
r32 = np.sin(theta_5) * np.sin(theta_6)
r33 = np.cos(theta_5)

check_rot_mat_3_6 = np.array([[r11, r12, r13],
                              [r21, r22, r23],
                              [r31, r32, r33]])

# Original matrix
print(f'\nrot_mat_3_6 = {rot_mat_3_6}')

# Check matrix
print(f'\ncheck_rot_mat_3_6 = {check_rot_mat_3_6}\n')

# Check if Original Matrix == Check Matrix
rot_minus_check_3_6 = rot_mat_3_6.round() - check_rot_mat_3_6.round()
zero_matrix = np.array([[0, 0, 0],
                        [0, 0, 0],
                        [0, 0, 0]])
matrices_are_equal = np.array_equal(rot_minus_check_3_6, zero_matrix)

# Determine if the solution is valid or not
# If the solution is valid, that means the end effector of the robotic
# arm can reach that target location.
if (matrices_are_equal):
    valid_matrix = "Yes"
else:
    valid_matrix = "No"
print(f'Is this solution valid?\n{valid_matrix}\n')
Here is the output:
References
Sodemann, Dr. Angela 2020, RoboGrok, accessed 14 October 2020, <http://robogrok.com/>
Inverse Kinematics Using the Pseudoinverse Jacobian Method (Numerical Approach)
Let’s take a look at a numerical approach to inverse kinematics. There are a number of approaches, but in this section we will explore one that is quite popular in industry and academia: the Pseudoinverse Jacobian Method.
This method is called “numerical” because it involves iteration to calculate the joint angles from the desired end effector position.
We’ll take a look at a SCARA-like robotic arm below that has two degrees of freedom (two joints/motors and four links).
To extend this example to a 6DOF robotic arm, just add 4 more joints and the appropriate number of links. Make sure before you do this step, that you’ve drawn the kinematic diagram. On this page, I have several examples of kinematic diagrams for 6DOF robotic arms.
Full Code in Python
Here is the full code. I named the program inv_kinematics_using_pseudo_jacobian.py. I’ll explain each piece step-by-step below.
#######################################################################################
# Program: Inverse Kinematics for a Robotic Arm Using the Pseudoinverse of the Jacobian
# Description: Given a desired end position (x, y, z) of the end effector of a robot,
#              calculate the joint angles (i.e. angles for the servo motors).
# Author: Addison Sears-Collins
# Website: https://automaticaddison.com
# Date: October 15, 2020
#######################################################################################

import numpy as np # Scientific computing library

def axis_angle_rot_matrix(k, q):
    """
    Creates a 3x3 rotation matrix in 3D space from an axis and an angle.

    Input
    :param k: A 3 element array containing the unit axis to rotate around (kx, ky, kz)
    :param q: The angle (in radians) to rotate by

    Output
    :return: A 3x3 matrix containing the rotation matrix
    """
    c_theta = np.cos(q)
    s_theta = np.sin(q)
    v_theta = 1 - np.cos(q)
    kx = k[0]
    ky = k[1]
    kz = k[2]

    # First row of the rotation matrix
    r00 = kx * kx * v_theta + c_theta
    r01 = kx * ky * v_theta - kz * s_theta
    r02 = kx * kz * v_theta + ky * s_theta

    # Second row of the rotation matrix
    r10 = kx * ky * v_theta + kz * s_theta
    r11 = ky * ky * v_theta + c_theta
    r12 = ky * kz * v_theta - kx * s_theta

    # Third row of the rotation matrix
    r20 = kx * kz * v_theta - ky * s_theta
    r21 = ky * kz * v_theta + kx * s_theta
    r22 = kz * kz * v_theta + c_theta

    # 3x3 rotation matrix
    rot_matrix = np.array([[r00, r01, r02],
                           [r10, r11, r12],
                           [r20, r21, r22]])
    return rot_matrix

def hr_matrix(k, t, q):
    '''
    Create the Homogeneous Representation matrix that transforms a point from Frame B to Frame A,
    using the axis-angle representation.

    Input
    :param k: A 3 element array containing the unit axis to rotate around (kx, ky, kz)
    :param t: The translation from the current frame (e.g. Frame A) to the next frame (e.g. Frame B)
    :param q: The rotation angle (i.e. joint angle)

    Output
    :return: A 4x4 Homogeneous representation matrix
    '''
    # Calculate the rotation matrix (angle-axis representation)
    rot_matrix_A_B = axis_angle_rot_matrix(k, q)

    # Store the translation vector t
    translation_vec_A_B = t

    # Convert to a 2D matrix
    t0 = translation_vec_A_B[0]
    t1 = translation_vec_A_B[1]
    t2 = translation_vec_A_B[2]
    translation_vec_A_B = np.array([[t0],
                                    [t1],
                                    [t2]])

    # Create the homogeneous transformation matrix
    homgen_mat = np.concatenate((rot_matrix_A_B, translation_vec_A_B), axis=1) # side by side

    # Row vector for bottom of homogeneous transformation matrix
    extra_row_homgen = np.array([[0, 0, 0, 1]])

    # Add extra row to homogeneous transformation matrix
    homgen_mat = np.concatenate((homgen_mat, extra_row_homgen), axis=0) # one above the other
    return homgen_mat

class RoboticArm:
    def __init__(self, k_arm, t_arm):
        '''
        Creates a robotic arm class for computing position and velocity.

        Input
        :param k_arm: A 2D array that lists the different axes of rotation (rows) for each joint.
        :param t_arm: A 2D array that lists the translations from the previous joint to the current joint.
                      The first translation is from the global (base) frame to joint 1 (which is often
                      equal to the global frame). The second translation is from joint 1 to joint 2, etc.
        '''
        self.k = np.array(k_arm)
        self.t = np.array(t_arm)
        assert k_arm.shape == t_arm.shape, 'Warning! Improper definition of rotation axes and translations'
        self.N_joints = k_arm.shape[0]

    def position(self, Q, index=-1, p_i=[0, 0, 0]):
        '''
        Compute the position in the global (base) frame of a point given in a joint frame
        (default values will assume the input position vector is in the frame of the last joint).

        Input
        :param p_i: A 3 element vector containing a position in the frame of the index joint
        :param index: The index of the joint frame being converted from (first joint is 0, the last joint is N_joints - 1)

        Output
        :return: A 3 element vector containing the new position with respect to the global (base) frame
        '''
        # The position of this joint described by the index
        p_i_x = p_i[0]
        p_i_y = p_i[1]
        p_i_z = p_i[2]
        this_joint_position = np.array([[p_i_x],
                                        [p_i_y],
                                        [p_i_z],
                                        [1]])

        # End effector joint
        if (index == -1):
            index = self.N_joints - 1

        # Store the original index of this joint
        orig_joint_index = index

        # Store the result of matrix multiplication
        running_multiplication = None

        # Start from the index of this joint and work backwards to index 0
        while (index >= 0):
            # If we are at the original joint index
            if (index == orig_joint_index):
                running_multiplication = hr_matrix(self.k[index], self.t[index], Q[index]) @ this_joint_position
            # If we are not at the original joint index
            else:
                running_multiplication = hr_matrix(self.k[index], self.t[index], Q[index]) @ running_multiplication
            index = index - 1

        # Extract the coordinates
        px = running_multiplication[0][0]
        py = running_multiplication[1][0]
        pz = running_multiplication[2][0]

        position_global_frame = np.array([px, py, pz])
        return position_global_frame

    def pseudo_inverse(self, theta_start, p_eff_N, goal_position, max_steps=np.inf):
        '''
        Performs the inverse kinematics using the pseudoinverse of the Jacobian.

        :param theta_start: An N element array containing the current joint angles in radians (e.g. np.array([np.pi/8, np.pi/4, np.pi/6]))
        :param p_eff_N: A 3 element vector containing the translation from the last joint to the end effector in the last joint's frame of reference
        :param goal_position: A 3 element vector containing the desired end position for the end effector in the global (base) frame
        :param max_steps: (Optional) Maximum number of iterations to compute

        Output
        :return: An N element vector containing the joint angles that result in the end effector reaching the goal
        '''
        v_step_size = 0.05
        theta_max_step = 0.2
        Q_j = theta_start # Array containing the starting joint angles
        p_end = np.array([goal_position[0], goal_position[1], goal_position[2]]) # Desired x, y, z coordinate of the end effector in the base frame
        p_j = self.position(Q_j, p_i=p_eff_N) # x, y, z coordinate of the position of the end effector in the global reference frame
        delta_p = p_end - p_j # delta_x, delta_y, delta_z between start position and desired final position of end effector
        j = 0 # Initialize the counter variable

        # While the magnitude of the delta_p vector is greater than 0.01
        # and we are less than the max number of steps
        while np.linalg.norm(delta_p) > 0.01 and j < max_steps:
            # Print the current joint angles and position of the end effector in the global frame
            print(f'j{j}: Q[{Q_j}] , P[{p_j}]')

            # Reduce the 3-element delta_p vector by some scaling factor.
            # delta_p represents the distance between where the end effector is now and our goal position.
            v_p = delta_p * v_step_size / np.linalg.norm(delta_p)

            # Get the Jacobian matrix given the current joint angles
            J_j = self.jacobian(Q_j, p_eff_N)

            # Calculate the pseudoinverse of the Jacobian matrix
            J_invj = np.linalg.pinv(J_j)

            # Multiply the two matrices together
            v_Q = np.matmul(J_invj, v_p)

            # Move the joints to new angles.
            # We use the np.clip method here so that the joint doesn't move too much. We
            # just want the joints to move a tiny amount at each time step because
            # the full motion of the end effector is nonlinear, and we're approximating the
            # big nonlinear motion of the end effector as a bunch of tiny linear motions.
            Q_j = Q_j + np.clip(v_Q, -1 * theta_max_step, theta_max_step)

            # Get the current position of the end effector in the global frame
            p_j = self.position(Q_j, p_i=p_eff_N)

            # Increment the time step
            j = j + 1

            # Determine the difference between the new position and the desired end position
            delta_p = p_end - p_j

        # Return the final angles for each joint
        return Q_j

    def jacobian(self, Q, p_eff_N=[0, 0, 0]):
        '''
        Computes the Jacobian (just the position part, not the orientation part).

        :param Q: An N element array containing the current joint angles in radians
        :param p_eff_N: A 3 element vector containing the translation from the last joint to the end effector in the last joint's frame of reference

        Output
        :return: A 3xN 2D matrix containing the Jacobian matrix
        '''
        # Position of the end effector in the global frame
        p_eff = self.position(Q, -1, p_eff_N)

        first_iter = True
        jacobian_matrix = None

        for i in range(0, self.N_joints):
            # Difference between the end effector's position in the global frame
            # and this joint's position in the global frame
            p_eff_minus_this_p = p_eff - self.position(Q, index=i)

            # Axis of rotation for this joint
            kx = self.k[i][0]
            ky = self.k[i][1]
            kz = self.k[i][2]
            k = np.array([kx, ky, kz])

            px = p_eff_minus_this_p[0]
            py = p_eff_minus_this_p[1]
            pz = p_eff_minus_this_p[2]
            p_eff_minus_this_p = np.array([px, py, pz])

            # This column of the Jacobian is the cross product of the
            # rotation axis and the position difference
            this_jacobian = np.cross(k, p_eff_minus_this_p)

            # Convert to a 2D matrix (a single column)
            j0 = this_jacobian[0]
            j1 = this_jacobian[1]
            j2 = this_jacobian[2]
            this_jacobian = np.array([[j0],
                                      [j1],
                                      [j2]])

            # Append this column to the Jacobian matrix
            if first_iter:
                jacobian_matrix = this_jacobian
                first_iter = False
            else:
                jacobian_matrix = np.concatenate((jacobian_matrix, this_jacobian), axis=1) # side by side

        return jacobian_matrix

def main():
    '''
    Given a two degree of freedom robotic arm and a desired end position of the end effector,
    calculate the two joint angles (i.e. servo angles).
    '''
    # A 2D array that lists the different axes of rotation (rows) for each joint.
    # Here I assume our robotic arm has two joints, but you can add more if you like.
    # k = kx, ky, kz
    k = np.array([[0, 0, 1], [0, 0, 1]])

    # A 2D array that lists the translations from the previous joint to the current joint.
    # The first translation is from the base frame to joint 1 (which is equal to the base frame).
    # The second translation is from joint 1 to joint 2.
    # t = tx, ty, tz
    # These values are measured with a ruler based on the kinematic diagram.
    # This tutorial teaches you how to draw kinematic diagrams:
    # https://automaticaddison.com/how-to-assign-denavit-hartenberg-frames-to-robotic-arms/
    a1 = 4.7
    a2 = 5.9
    a3 = 5.4
    a4 = 6.0
    t = np.array([[0, 0, 0], [a2, 0, a1]])

    # Position of the end effector in the joint 2 (i.e. the last joint) frame
    p_eff_2 = [a4, 0, a3]

    # Create an object of the RoboticArm class
    k_c = RoboticArm(k, t)

    # Starting joint angles in radians (joint 1, joint 2)
    q_0 = np.array([0, 0])

    # Desired end position for the end effector with respect to the base frame of the robotic arm
    endeffector_goal_position = np.array([4.0, 10.0, a1 + a4])

    # Display the starting position of each joint in the global frame
    for i in np.arange(0, k_c.N_joints):
        print(f'joint {i} position = {k_c.position(q_0, index=i)}')

    print(f'end_effector = {k_c.position(q_0, index=-1, p_i=p_eff_2)}')
    print(f'goal = {endeffector_goal_position}')

    # Return joint angles that result in the end effector reaching endeffector_goal_position
    final_q = k_c.pseudo_inverse(q_0, p_eff_N=p_eff_2, goal_position=endeffector_goal_position, max_steps=500)

    # Final joint angles in degrees
    print('\n\nFinal Joint Angles in Degrees')
    print(f'Joint 1: {np.degrees(final_q[0])} , Joint 2: {np.degrees(final_q[1])}')

if __name__ == '__main__':
    main()
Here is the output:
axis_angle_rot_matrix(k,q)
There are multiple ways to describe rotation in a robotic system, and I have covered some of those methods in previous tutorials.
Another way to describe rotation in a robotic system is to use the Axis-Angle representation. With this representation, any orientation of a robotic system in three dimensions is equivalent to a rotation about a fixed axis k through angle θ.
The next frame (i.e. the next joint or servo motor) would be Frame 1 (Joint 2). Frame 1 of the robotic arm would be rotated around axis k by angle θ1.
In most of my robotics work, I’ve assumed that rotation at any joint is around the z axis. So, in this code, I’ve assumed that axis k is the following vector.
k = (kx, ky, kz) = (0, 0, 1)
With the rotation around that axis being θ (e.g. θ1, θ2, θ3, etc.).
Given the axis-angle representation, the equivalent 3×3 rotation matrix Rk(θ) is as follows (writing c for cos θ, s for sin θ, and v for 1 − cos θ):

Rk(θ) =
[ kx·kx·v + c      kx·ky·v − kz·s   kx·kz·v + ky·s ]
[ kx·ky·v + kz·s   ky·ky·v + c      ky·kz·v − kx·s ]
[ kx·kz·v − ky·s   ky·kz·v + kx·s   kz·kz·v + c    ]

This equation is what you see implemented in axis_angle_rot_matrix in the code.
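As a quick sanity check of the implementation, rotating about the z axis (k = (0, 0, 1)) by 90 degrees should produce the familiar z-rotation matrix:

import numpy as np # Scientific computing library

# axis_angle_rot_matrix is the function defined in the program above
R = axis_angle_rot_matrix(np.array([0, 0, 1]), np.pi / 2)
print(R.round(3))
# [[ 0. -1.  0.]
#  [ 1.  0.  0.]
#  [ 0.  0.  1.]]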
class RoboticArm
This class represents the robotic arm that we want to control.
__init__(self,k_arm,t_arm)
This piece of code is the constructor for the robotic arm class. This constructor initializes the data fields of the robotic arm of interest. These data fields include the number of joints, the rotation axes for each joint, and the measured lengths (i.e. translations) between each joint.
position(self,Q,index=-1,p_i=[0,0,0])
This piece of code uses hr_matrix(k,t,q) to calculate the position in the global (base) frame of a point given in a joint frame.
pseudo_inverse(self,theta_start,p_eff_N,goal_position,max_steps=np.inf)
This piece of code performs inverse kinematics using the pseudoinverse of the Jacobian. What does that mean? Let me explain how this process works.
You start off with the joints of the robotic arm at arbitrary angles (by convention I typically set them to 0 degrees).
The end effector of the robotic arm is located at some arbitrary position in 3D space.
You then iterate. At each step of the iteration, you
1. Calculate the distance between the current end effector position and the desired goal end effector position. This value (which is a vector…a list of numbers) is often called “delta p”.
2. Reduce the delta p vector by some scaling factor so that it is really small. We only want the end effector to move by a small amount at each time step because, for small motions, we can approximate the displacement of the end effector as linear. Doing this makes the math easier, enabling us to use the Jacobian matrix. The Jacobian matrix is, at its core, a matrix of partial derivatives.
Remember, forward kinematics (i.e. the motion of revolute joints like servo motors) is nonlinear and typically involves sines and cosines (like we saw in the analytical IK method), but in this case, we make linear approximations to that nonlinear motion.
3. Calculate the Jacobian matrix J.
4. Calculate the pseudoinverse of the Jacobian matrix. The pseudoinverse of the Jacobian matrix is calculated because the regular inverse (i.e. J⁻¹, which we looked at in a previous tutorial) fails if a matrix is not square (a square matrix is a matrix with the same number of columns and rows). The pseudoinverse can invert a non-square matrix.
5. Multiply the pseudoinverse of the Jacobian matrix and the scaled delta p vector together to calculate the desired tiny change in the joint values.
6. Update each joint angle by the amount of that tiny change.
7. The resulting motion from updating the joint values should bring the end effector just a little bit closer to your desired goal position.
Keep repeating the steps above again and again until the end effector has reached the desired goal position (or rather “close enough”).
In a nutshell, at each iteration, we have the end effector take tiny steps towards the goal. We only stop when the end effector reaches the goal position.
I like to set the maximum number of iterations around 500, but you’ll often get the end effector to reach the desired goal position long before that many iterations.
Once the end effector is close enough, return the final joint angles. For a real robotic arm, these are the angles that each servo motor would need to be for the end effector of the arm to be at the goal position.
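Written compactly, each pass through the loop in pseudo_inverse performs the following update, where δ = 0.05 is the step size (v_step_size in the code) and the clip keeps any joint from moving more than 0.2 radians per step (theta_max_step):

Δp = p_goal − p_eff(q)
v = δ · Δp / ‖Δp‖
q ← q + clip(J⁺(q) · v, −0.2, 0.2)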
jacobian(self,Q,p_eff_N=[0,0,0])
In this piece of code, we calculate the Jacobian matrix (position part on the top half…not the orientation part on the bottom half).
d0n is the displacement from the origin of the global coordinate frame to the end of the end effector.
For a revolute joint (like a servo motor), the change in the linear velocity of the end effector is the cross product of the axis of revolution k (which is made up of 3 elements, kx, ky, and kz) and a 3-element position vector from the joint to the end effector.
For example, given a robotic arm with two joints (i.e. two servo motors), the Jacobian is calculated as follows:
Jv = [Jv1, Jv2]
Where:
Jv1 is the first column of the Jacobian matrix.
Jv2 is the second column of the Jacobian matrix.
k1 is the 3-element axis for the first joint (e.g. (0, 0, 1)).
k2 is the 3-element axis for the second joint (e.g. (0, 0, 1)).
peff is a 3-element vector that represents the position of the end effector in the base frame of the robotic arm.
pB1 is the position of joint 1 relative to the base frame.
pB2 is the position of joint 2 relative to the base frame.
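Putting those pieces together, each column of the Jacobian is a cross product (this is exactly what the jacobian() method in the code computes):

Jv1 = k1 × (peff − pB1)
Jv2 = k2 × (peff − pB2)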
main()
Given a two degree of freedom robotic arm and a desired end position of the end effector, we calculate the two joint angles (i.e. servo angles).
If you look at main() in the code, you’ll see that I want the robotic arm to go to x = 4.0, y = 10.0.
I solved this same problem in a previous post using an Arduino, where I also set the goal destination to x = 4.0 and y = 10.0. If you uncomment and run the Arduino code in that tutorial, you’ll see that the Serial Monitor outputs the final joint angles (the ones that get the end effector of the robotic arm to the goal destination) as:
θ1 = 42.8
θ2 = 50.3
This is pretty much equal to what we get with the Pseudoinverse Jacobian approach from our Python code earlier in this section.
Conclusion
In this tutorial, I’ve shown you two popular methods for calculating the inverse kinematics of a robotic arm. There are other methods out there, such as Cyclic Coordinate Descent, the Damped Least Squares method, and the Jacobian Transpose method. Which method you choose for your project depends on your personal preference and what you’re trying to achieve.