How To Derive the Observation Model for a Mobile Robot

In this tutorial, we will learn how to derive the observation model for a two-wheeled mobile robot (I’ll explain what an observation model is below). Specifically, we’ll assume that our mobile robot is a differential drive robot like the one on the cover photo. A differential drive robot is one in which each wheel is driven independently from the other. 

Below is a representation of a differential drive robot on a 3D coordinate plane with x, y, and z axes. The yaw angle γ represents the amount of rotation around the z axis as measured from the x axis. 

state-space-model-1JPG

Here is the aerial view:

state-space-model-2JPG
state-space-model-5JPG

Real World Applications

  • The most common application of deriving the observation model is when you want to do Kalman Filtering to smooth out noisy sensor measurements and generate a better estimate of the current state of a robotic system. This process is known as sensor fusion. I’ll cover it in a future post.

Prerequisites

  • It is helpful if you have gone through my tutorial on state space modeling. If you already understand the concept of state space modeling, jump right into this tutorial.

What is an Observation Model?

Typically, when a mobile robot is moving around in the world, we gather measurements from sensors to create an estimate of the state (i.e. where the robot is located in its environment and how it is oriented relative to some axis). 

An observation model does the opposite. Instead of converting sensor measurements to estimate the state, we use the state (or predicted state at the next timestep) to estimate (or predict) sensor measurements. The observation model is the mathematical equation that does this calculation.

You can understand why doing something like this is useful. Imagine you have a state space model of your mobile robot. You want the robot to move from a starting location to a goal location. In order to reach the goal location, the robot needs to know where it is located and which direction it is moving.

Suppose the robot’s sensors are really noisy, and you don’t trust its observations entirely to give you an accurate estimate of the current state. Or perhaps, all of a sudden, the sensor breaks, or generates erroneous data.

The observation model can help us in these scenarios. It enables you to predict what the sensor measurements would be at some timestep in the future. 

The observation model works by first using the state space model to predict the state of the robot at the next timestep, and then using that state prediction to infer what the sensor measurement would be at that point in time.

You can then compute a weighted average of the predicted sensor measurements and the actual sensor observations at that timestep to create a better estimate of your state. 

This weighted averaging is exactly what you do when you do Kalman Filtering. You combine noisy sensor observations with predicted sensor measurements (based on your state space model) to create a near-optimal estimate of the state of a mobile robot at some time t.
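To make that intuition concrete, here is a minimal sketch in Python (my own simplification; the numbers and the blending weight K below are made up for illustration, not a properly computed Kalman gain):

import numpy as np

# Hypothetical numbers for illustration only
predicted_measurement = np.array([5.0, 3.0, 1.57]) # from the observation model
actual_measurement = np.array([5.3, 2.9, 1.60])    # from the real (noisy) sensor

# A made-up blending weight between 0 and 1. A real Kalman Filter
# computes this "gain" from the noise statistics instead of hard-coding it.
K = 0.7

# Weighted average: move from the prediction toward the actual measurement
blended_estimate = predicted_measurement + K * (actual_measurement - predicted_measurement)

print(blended_estimate)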

An observation model (also known as a measurement model or sensor model) is a mathematical equation that represents a vector of predicted sensor measurements y at time t as a function of the state of a robotic system x at time t, plus some sensor noise (because sensors aren’t 100% perfect) at time t, denoted by the vector w_t.

Here is the formal equation:

observation-model-4JPG
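In plain text (my rendering of the equation in the image above), this is:

y_t = H_t · x_t + w_t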

Because our mobile robot moving around on a flat plane has three states [x, y, yaw angle], in vector format, the observation model above becomes:

observation-model-5JPG

where:

  • t = current time
  • y vector (at left) = n predicted sensor observations at time t
  • n = number of sensor observations (i.e. measurements) at time t 
  • H = measurement matrix (used to convert the state at time t into predicted sensor observations at time t) that has n rows and 3 columns (i.e. a column for each state variable).
  • w = the noise of each of the n sensor observations (you often get this sensor error information from the datasheet that comes when you purchase a sensor)

Sensor observations can be anything from distance to a landmark, to GPS readings, to ultrasonic distance sensor readings, to odometry readings….pretty much any number you can read from a robotic sensor that can help give you an estimate of the robot’s state (position and orientation) in the world.

The reason I say predicted measurements for the y vector is that they are not actual sensor measurements. Rather, you use the state of the mobile robot (i.e. the current location and orientation of the robot) at time t-1 to predict the state at time t, and then you use that predicted state at time t to infer what the corresponding sensor measurements would be at that point in time. 

How to Calculate the H Matrix

Let’s examine how to calculate H. The measurement matrix H is used to convert the predicted state estimate at time t into predicted sensor measurements at time t.

Imagine you had a sensor that could measure the x position, y position, and yaw angle (i.e. rotation angle around the z-axis) directly. What would H be? 

In this case, H will be the identity matrix since the estimated state maps directly to sensor measurements [x, y, yaw]. 

H has the same number of rows as sensor measurements and the same number of columns as states. Here is how the matrix would look in this scenario.

observation-model-6JPG

What if, for example, you had a landmark in your environment? How does the current state of the robot enable you to calculate the distance (i.e. range) to the landmark r and the bearing angle b to the landmark?

observation-model-7JPG

Using the Pythagorean Theorem and trigonometry, we get the following equations for the range r and bearing b:

observation-model-8JPG

Let’s put this in matrix form.

observation-model-9JPG

The formulas in the vector above are nonlinear (note the arctan2 function). They enable us to take the current state of the robot at time t and infer the corresponding sensor observations r and b at time t. 
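As a rough sketch of those two equations in Python (assuming a hypothetical landmark at known global coordinates (x_l, y_l); the variable names here are my own):

import numpy as np

def landmark_observation(x, y, yaw, x_l, y_l):
    """
    Predict the range r and bearing b to a landmark at (x_l, y_l),
    given the robot state (x, y, yaw) in the global frame.
    """
    dx = x_l - x
    dy = y_l - y
    r = np.sqrt(dx**2 + dy**2)    # Pythagorean theorem
    b = np.arctan2(dy, dx) - yaw  # bearing relative to the robot's heading
    return np.array([r, b])

# Example: robot at (5.2, 2.8) facing 90 degrees, landmark at (7.0, 6.0)
print(landmark_observation(5.2, 2.8, 1.5708, 7.0, 6.0))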

Let’s linearize the model to create a linear observation model of the following form:

observation-model-10JPG

We have to calculate the Jacobian, just as we did when we calculated the A and B matrices in the state space modeling tutorial.

The formula for Ht is as follows:

observation-model-11JPG

So we need to calculate 6 different partial derivatives. Here is what you should get:

observation-model-12JPG
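Here is a small sketch that evaluates those partial derivatives in Python so you can check your own derivation (again assuming a hypothetical landmark at (x_l, y_l); this is my implementation of the standard result, so double-check it against yours):

import numpy as np

def landmark_H(x, y, yaw, x_l, y_l):
    """
    Jacobian of [r, b] with respect to the state [x, y, yaw],
    evaluated at the current state. Returns a 2x3 matrix.
    """
    dx = x_l - x
    dy = y_l - y
    q = dx**2 + dy**2   # squared range
    r = np.sqrt(q)      # range
    return np.array([[-dx / r, -dy / r,  0.0],
                     [ dy / q, -dx / q, -1.0]])

print(landmark_H(5.2, 2.8, 1.5708, 7.0, 6.0))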

Putting It All Together

The final observation model is as follows:

observation-model-13JPG

Equivalently, in some literature, you might see that all the stuff above is equal to:

observation-model-14JPG

In fact, in the Wikipedia Extended Kalman Filter article, they replace time t with time k (not sure why because t is a lot more intuitive, but it is what it is).

Python Code Example for the Observation Model

Let’s create an observation model in code. As a reminder, here is our robot model (as seen from above).

observation-model-15JPG

We’ll assume we have a sensor on our mobile robot that is able to give us exact measurements of the x_global position in meters, the y_global position in meters, and the yaw angle γ of the robot in radians at each timestep.

Therefore, as mentioned earlier, 

observation-model-16JPG

We’ll also assume that the corresponding noise (error) for our sensor readings is +/-0.07 m for the x position, +/-0.07 m for the y position, and +/-0.04 radians for the yaw angle. Therefore, here is the sensor noise vector:

observation-model-17JPG

So, here is our equation:

observation-model-18JPG

Suppose that the state of the robot at time t is [x=5.2, y=2.8, yaw angle=1.5708]. Because the yaw angle is 90 degrees (1.5708 radians), we know that the robot is heading due north relative to the global reference frame (i.e. the world or environment the robot is in).

What is the corresponding estimated sensor observation at time t given the current state of the robot at time t?

Here is the code:

import numpy as np

# Author: Addison Sears-Collins
# https://automaticaddison.com
# Description: An observation model for a differential drive mobile robot

# Measurement matrix H_t
# Used to convert the predicted state estimate at time t
# into predicted sensor measurements at time t.
# In this case, H will be the identity matrix since the 
# estimated state maps directly to state measurements from the 
# odometry data [x, y, yaw]
# H has the same number of rows as sensor measurements
# and same number of columns as states.
H_t = np.array([[1.0,   0,   0],
                [  0, 1.0,   0],
                [  0,   0, 1.0]])

# Sensor noise. This is a vector with the
# number of elements equal to the number of sensor measurements.
sensor_noise_w_t = np.array([0.07,0.07,0.04])
						
# The estimated state vector at time t in the global 
# reference frame
# [x_t, y_t, yaw_t]
# [meters, meters, radians]
state_estimate_t = np.array([5.2,2.8,1.5708])

def main():

  estimated_sensor_observation_y_t = H_t @ state_estimate_t + sensor_noise_w_t

  print(f'State at time t: {state_estimate_t}')
  print(f'Estimated sensor observations at time t: {estimated_sensor_observation_y_t}')

main()

Here is the output:

code-output-3
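For reference, the arithmetic is easy to verify by hand: because H is the identity matrix, the predicted observation is just the state plus the noise vector, i.e. approximately [5.27, 2.87, 1.6108].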

And that’s it folks. That linear observation model above enables you to convert the state at time t to estimated sensor observations at time t.

In a real robot application, we would use an Extended Kalman Filter to compute a weighted average of the estimated sensor reading using the observation model and the actual sensor reading that is mounted on your robot. In my next tutorial, I’ll show you how to implement an Extended Kalman Filter from scratch, building on the content we covered in this tutorial.

Keep building!

How To Derive the State Space Model for a Mobile Robot

In this tutorial, we will learn how to create a state space model for a mobile robot. Specifically, we’ll take a look at a type of mobile robot called a differential drive robot. A differential drive robot is a robot like the one on the cover photo of this post, as well as this one below. It is called differential drive because each wheel is driven independently from the other. 

obstacle_avoiding_robot

Prerequisites

  • Basic linear algebra
    • You understand what a matrix and a vector are.
    • You know how to multiply two matrices together.
  • You know what a partial derivative is.
  • Optional: Python 3.7 or higher (just needed for the last part of this tutorial)

There are a lot of tutorials on YouTube if you don’t have the prerequisites above (e.g. Khan Academy).

What is a State Space Model?

A state space model (often called the state transition model) is a mathematical equation that represents the motion of a robotic system from one timestep to the next. It shows how the current position (e.g. X, Y coordinate) and orientation (yaw (heading) angle γ) of the robot in the world are impacted by changes to the control inputs into the robot (e.g. velocity in meters per second…often represented by the variable v).

Note: In a real-world scenario, we would need to represent the world the robot is in (e.g. a room in your house, etc.) as an x,y,z coordinate grid.

yaw_pitch_rollJPG
mobile_robot_in_3d

Deriving the Equations

Consider the mobile robot below. It is moving around on the flat x-y coordinate plane.

state-space-model-1JPG

Let’s look at it from an aerial view.

state-space-model-2JPG

What is the position and orientation of the robot in the world (i.e. global reference frame)? The position and orientation of a robot make up what we call the state vector.

state-space-model-5JPG

Note that x and y as well as the yaw angle γ are in the global reference frame.

The yaw angle γ describes the rotation around the z-axis (coming out of the page in the image above) in the counterclockwise direction (from the x axis). The units for the yaw angle are typically radians.

ω is the rotation rate (i.e. angular velocity or yaw rate) around the global z axis. The units for this variable are typically radians per second.

The velocity of the robot v is in terms of the robot reference frame (labeled x_robot and y_robot). 

In terms of the global reference frame we can break v down into two components: 

  • velocity in the x direction v_x
  • velocity in the y direction v_y

In most of my applications, I express the velocity in meters per second.

state-space-model-4JPG

Using trigonometry, we know that:

  • cos(γ) = v_x/v 
  • sin(γ) = v_y/v 

Therefore, with respect to the global reference frame, the robot’s motion equations are as follows:

linear velocity in the x direction = v_x = v cos(γ)

linear velocity in the y direction = v_y = v sin(γ)

angular velocity around the z axis = ω

Now let’s suppose at time t-1, the robot has the current state (x position, y position, yaw angle γ):

state-space-model-5JPG-1

We move forward one timestep dt. What is the state of the robot at time t? In other words, where is the robot location, and how is the robot oriented? 

Remember that distance = velocity * time

Therefore,

state-space-model-6JPG
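Written out in plain text (my reconstruction, consistent with the Python code at the end of this tutorial), those update equations are:

x_t = x_{t-1} + v_{t-1} · cos(γ_{t-1}) · dt
y_t = y_{t-1} + v_{t-1} · sin(γ_{t-1}) · dt
γ_t = γ_{t-1} + ω_{t-1} · dt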

Converting the Equations to a State Space Model

We now know how to describe the motion of the robot mathematically. However, the equation above is nonlinear. For some applications, like using the Kalman Filter (I’ll cover Kalman Filters in a future post), we need to make our state space equations linear.

How do we take this equation that is nonlinear (note the cosines and sines)…

state-space-model-7JPG

…and convert it into a form that looks like the equation of a line? In mathematical jargon, we want to make the equation above a linear discrete-time state space model of the following form.

state-space-model-8JPG

where:

x_t is our entire current state vector [x_t, y_t, γ_t] (note that x_t in bold is the entire state vector…don’t get that confused with x_t, which is just the x coordinate of the robot at time t)

x_{t-1} is the state of the mobile robot at the previous timestep: [x_{t-1}, y_{t-1}, γ_{t-1}]

u_{t-1} represents the control input vector at the previous timestep [v_{t-1}, ω_{t-1}] = [forward velocity, angular velocity]. 

I’ll get to what the A and B matrices represent later in this post.

Let’s write out the full form of the linear state space model equation:

state-space-model-9JPG
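In plain text (my rendering of the equation in the image, treating each bracketed group as a column vector), the full form is:

[x_t, y_t, γ_t] = A_{t-1} · [x_{t-1}, y_{t-1}, γ_{t-1}] + B_{t-1} · [v_{t-1}, ω_{t-1}]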

Let’s go through this equation term by term to make sure we understand everything so we can convert this equation below into the form above:

state-space-model-10JPG

How to Calculate the A Matrix

A is a matrix. The number of rows in the A matrix is equal to the number of states, and the number of columns in the A matrix is equal to the number of states. In this mobile robot example, we have three states.

The A matrix expresses how the state of the system [x position,y position,yaw angle γ] changes from t-1 to t when no control command is executed (i.e. when we don’t give any speed (velocity) commands to the robot). 

Typically a robot on wheels only drives when the wheels are commanded to turn. Therefore, for this case, A is the identity matrix (Note that A is sometimes F in the literature).

state-space-model-11JPG

In other applications, A might not be the identity matrix. An example is a helicopter. The state of a helicopter flying around in the air changes even if you give it no velocity command. Gravity will pull the helicopter downward to Earth regardless of what you do. It will eventually crash if you give it no velocity commands.

helicopter_rc_model_helicopter_2

Formally, to calculate A, you start with the equation for each state vector variable. In our mobile robot example, that is this stuff below.

state-space-model-12JPG

Our system is expressed as a nonlinear system now because the state is a function of cosines and sines (which are nonlinear trigonometric operations).

To get our state space model in this form…

state-space-model-13JPG

…we need to “linearize” the nonlinear equations. To do this, we need to calculate the Jacobian, which is nothing more than a fancy name for “matrix of partial derivatives.” 

Do you remember the equation for a straight line from grade school: y=mx+b? m is the slope. It is “change in y / change in x”.

The Jacobian is nothing more than a multivariable form of the slope m. It contains entries like “change in y1 / change in x1,” “change in y1 / change in x2,” and so on.

You can consider the Jacobian a slope on steroids. It represents how fast a group of variables (as opposed to just one variable, e.g. m = rise/run = change in y/change in x) is changing with respect to another group of variables.
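If partial derivatives feel abstract, here is a small optional sketch (my own addition, not part of the original derivation) that approximates a Jacobian numerically with finite differences, using the motion equations from earlier in this tutorial:

import numpy as np

def motion_model(state, v, omega, dt):
    """Nonlinear motion model: state is [x, y, yaw]."""
    x, y, yaw = state
    return np.array([x + v * np.cos(yaw) * dt,
                     y + v * np.sin(yaw) * dt,
                     yaw + omega * dt])

def numerical_jacobian(state, v, omega, dt, eps=1e-6):
    """Approximate d(motion_model)/d(state), one column at a time."""
    n = len(state)
    J = np.zeros((n, n))
    for i in range(n):
        bumped = np.array(state, dtype=float)
        bumped[i] += eps
        J[:, i] = (motion_model(bumped, v, omega, dt) -
                   motion_model(state, v, omega, dt)) / eps
    return J

# With zero control inputs, the Jacobian should be (close to) the identity matrix.
print(numerical_jacobian([0.0, 0.0, 0.0], v=0.0, omega=0.0, dt=1.0))

Now, back to deriving the A matrix by hand.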

You start with this here:

state-space-model-14JPG

You then calculate the partial derivative of the state vector at time t with respect to the states at time t-1. Don’t be scared at all the funny symbols inside this matrix A. When you’re calculating the Jacobian matrix, you calculate each partial derivative, one at a time.

state-space-model-15JPG

You will have to calculate 9 partial derivatives in total for this example. 

Again, if you are a bit rusty on calculating partial derivatives, there are some good tutorials online. We’ll do an example calculation now so you can see how this works.

Let’s calculate the first partial derivative in the upper left corner of the matrix.

state-space-model-16JPG

and

state-space-model-17JPG

So, 1 goes in the upper-left corner of the matrix.

If you calculate all 9 partial derivatives above, you will get the following for A_{t-1}:

partial_derivatives

Since we assume the control inputs (v_{t-1} and ω_{t-1}) are 0 when calculating the A matrix, the velocity-dependent terms drop out, and A becomes the identity matrix.

state-space-model-18JPG

The assumption that the control inputs (v_{t-1} and ω_{t-1}) are zero when calculating the A matrix is based on the physical characteristics of the differential drive mobile robot and the purpose of the A matrix in the state space model.

Physical characteristics of the robot:

A differential drive mobile robot typically only moves when its wheels are commanded to turn. If no control input is given (i.e., the robot is not commanded to move), the robot will remain stationary, and its position and orientation will not change from one time step to the next.

Purpose of the A matrix:

In the state space model, the A matrix is used to express how the state of the system [x position, y position, yaw angle γ] changes from time step t-1 to t when no control command is executed. It represents the natural evolution of the system state without any external influence from the control inputs.

By setting v_{t-1} and ω_{t-1} to zero when calculating the A matrix, we are essentially modeling the behavior of the robot when it is not being actively controlled. This is a valid assumption because we are interested in understanding how the system state evolves on its own, without any control commands.

Linearization at a Specific Point

  • Linearizing a nonlinear system involves finding the Jacobian matrix at a specific operating point.
  • The chosen operating point here is where the robot is stationary (i.e., no control inputs).

Use Case: Kalman Filtering

The assumption that the A matrix is an identity matrix when no control inputs are given is particularly useful in the context of Kalman filtering for a few reasons:

  1. Simplification of the prediction step: In the Kalman filter, the prediction step is responsible for estimating the robot’s state at the next time step based on its current state and the control inputs. By assuming that the A matrix is an identity matrix, the prediction step becomes simpler. The predicted state is essentially the same as the current state if no control inputs are given, which aligns with the physical behavior of the robot.
  2. Focus on the effect of control inputs: With the A matrix being an identity matrix, the Kalman filter can focus more on the effect of the control inputs on the robot’s state. The B matrix captures how the control inputs (linear and angular velocities) influence the change in the robot’s position and orientation. This allows the Kalman filter to effectively incorporate the control inputs into the state estimation process. We’ll calculate the B matrix in the next section.
  3. Modeling the uncertainty: The Kalman filter is designed to handle uncertainty in the robot’s motion and measurements. By assuming that the A matrix is an identity matrix, the uncertainty in the robot’s motion is primarily attributed to the control inputs and the process noise. The process noise can be modeled to account for factors like wheel slippage, uneven surfaces, or other disturbances that affect the robot’s motion. The Kalman filter uses this uncertainty information to estimate the robot’s state more accurately.
  4. Computational efficiency: Using an identity matrix for the A matrix can lead to more efficient computations in the Kalman filter. Since the identity matrix has a simple structure (1s on the diagonal and 0s elsewhere), it can simplify the matrix calculations involved in the prediction and update steps of the filter. This can be beneficial when running the Kalman filter on resource-constrained systems or when real-time performance is critical.

It’s important to note that the assumption of an identity matrix for A is specific to the case when no control inputs are given. In the overall Kalman filter algorithm, the actual control inputs (linear and angular velocities) are still used in the prediction step to estimate the robot’s state at the next time step.

By leveraging this assumption, the Kalman filter can more effectively estimate the robot’s state, account for uncertainty, and incorporate the effect of control inputs, all while maintaining computational efficiency. This makes it a powerful tool for state estimation and localization in mobile robot applications.

How to Calculate the B Matrix

The B matrix in our running example of a mobile robot is a 3×2 matrix. 

The B matrix has the same number of rows as the number of states and the same number of columns as the number of control inputs.

The control inputs in this example are the linear velocity (v) and the angular velocity around the z axis, ω (also known as the “yaw rate”).

The B matrix expresses how the state of the system (i.e. [x, y, γ]) changes from t-1 to t due to the control commands (i.e. control inputs v and ω).

Since we’re dealing with a robotic car here, we know that if we apply forward and angular velocity commands to the car, the car will move.

The equation for B is as follows. We need to calculate yet another Jacobian matrix. However, unlike our calculation for the A matrix, we need to compute the partial derivative of the state vector at time t with respect to the control inputs at time t-1.

state-space-model-19JPG

Remember the equations for f1, f2, and f3:

state-space-model-20JPG

If you calculate the 6 partial derivatives, you’ll get this for the B matrix below:

state-space-model-21JPG
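In plain text (consistent with the getB function in the Python code at the end of this tutorial), that B matrix is:

B_{t-1} = [[cos(γ_{t-1})·dt,  0],
           [sin(γ_{t-1})·dt,  0],
           [0,               dt]]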

Putting It All Together

Ok, so how do we get from this:

state-space-model-22JPG

Into this form:

state-space-model-23JPG

We now know the A and B matrices, so we can plug those in. The final state space model for our differential drive robot is as follows:

state-space-model-24JPG
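In plain text (my rendering, consistent with the Python code below), the final model is:

[x_t]   [1  0  0]   [x_{t-1}]   [cos(γ_{t-1})·dt    0]
[y_t] = [0  1  0] · [y_{t-1}] + [sin(γ_{t-1})·dt    0] · [v_{t-1}]
[γ_t]   [0  0  1]   [γ_{t-1}]   [0                 dt]   [ω_{t-1}]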

where v_{t-1} is the linear velocity of the robot in the robot’s reference frame, and ω_{t-1} is the angular velocity in the robot’s reference frame.

And there you have it. If you know the current position of the robot (x,y), the orientation of the robot (yaw angle γ), the linear velocity of the robot, the angular velocity of the robot, and the change in time from one timestep to the next, you can calculate the state of the robot at the next timestep.

Adding Process Noise

The world isn’t perfect. Sometimes the robot might not act the way you would expect when you want it to execute a specific velocity command. 

It is common to add a noise term to the state space model to account for randomness in the world. The equation would then be:

state-space-model-25JPG
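In plain text, this is the same model with a noise vector added at each timestep (the code below calls this vector process_noise_v_t_minus_1):

x_t = A_{t-1} · x_{t-1} + B_{t-1} · u_{t-1} + noise_{t-1}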

Python Code Example for the State Space Model

In this section, I’ll show you code in Python for the state space model we have developed in this tutorial. We will assume:

  • The robot begins at the origin at a yaw angle of 0 radians. 
  • We then apply a forward velocity of 4.5 meters per second at time t-1 and an angular velocity of 0.05 radians per second.

The output will show the state of the robot at the next timestep, time t. I’ll name the file state_space_model.py.

Make sure you have NumPy installed before you run the code. NumPy is a scientific computing library for Python.

If you’re using Anaconda, you can type:

conda install numpy

Alternatively, you can type:

pip install numpy

Here is the code. You can copy and paste this code into your favorite IDE and then run it.

import numpy as np

# Author: Addison Sears-Collins
# https://automaticaddison.com
# Description: A state space model for a differential drive mobile robot

# A matrix
# 3x3 matrix -> number of states x number of states matrix
# Expresses how the state of the system [x,y,yaw] changes 
# from t-1 to t when no control command is executed.
# Typically a robot on wheels only drives when the wheels are commanded
# to turn.
# For this case, A is the identity matrix.
# A is sometimes F in the literature.
A_t_minus_1 = np.array([[1.0,  0,   0],
                        [  0,1.0,   0],
                        [  0,  0, 1.0]])
						
# The estimated state vector at time t-1 in the global 
# reference frame
# [x_t_minus_1, y_t_minus_1, yaw_t_minus_1]
# [meters, meters, radians]
state_estimate_t_minus_1 = np.array([0.0,0.0,0.0])

# The control input vector at time t-1 in the global 
# reference frame
# [v, yaw_rate]
# [meters/second, radians/second]
# In the literature, this is commonly u.
control_vector_t_minus_1 = np.array([4.5, 0.05])

# Noise applied to the forward kinematics (calculation
# of the estimated state at time t from the state
# transition model of the mobile robot). This is a vector
# with the number of elements equal to the number of states
process_noise_v_t_minus_1 = np.array([0.01,0.01,0.003])

yaw_angle = 0.0 # radians
delta_t = 1.0 # seconds

def getB(yaw,dt):
  """
  Calculates and returns the B matrix
  3x2 matrix -> number of states x number of control inputs
  The control inputs are the forward speed and the
  rotation rate around the z axis from the x-axis in the 
  counterclockwise direction.
  [v, yaw_rate]
  Expresses how the state of the system [x,y,yaw] changes
  from t-1 to t due to the control commands (i.e. control
  input).
  :param yaw: The yaw (rotation angle around the z axis) in rad 
  :param dt: The change in time from time step t-1 to t in sec
	"""
  B = np.array([[np.cos(yaw)*dt, 0],
                [np.sin(yaw)*dt, 0],
                [0, dt]])
  return B

def main():
  state_estimate_t = (A_t_minus_1 @ state_estimate_t_minus_1 +
                      getB(yaw_angle, delta_t) @ control_vector_t_minus_1 +
                      process_noise_v_t_minus_1)

  print(f'State at time t-1: {state_estimate_t_minus_1}')
  print(f'Control input at time t-1: {control_vector_t_minus_1}')
  print(f'State at time t: {state_estimate_t}') # State after delta_t seconds

main()

Run the code:

python state_space_model.py

Here is the output:

99-state-space-model-output-python
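For reference, the arithmetic is straightforward to check by hand: x_t = 0 + 4.5·cos(0)·1.0 + 0.01 = 4.51 m, y_t = 0 + 4.5·sin(0)·1.0 + 0.01 = 0.01 m, and γ_t = 0 + 0.05·1.0 + 0.003 = 0.053 rad.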

And that’s it for the fundamentals of state space modeling. Once you know the physics of how a robotic system moves from one timestep to the next, you can create a mathematical model in state space form. 

Keep building!

How to Perform Camera Calibration Using OpenCV

In this tutorial, we will write a program to correct camera lens distortion using OpenCV. This process is known as camera calibration.

7_chessboard_input1_drawn_corners

Real-World Applications

Prerequisites

Install OpenCV

The first thing you need to do is to make sure you have OpenCV installed on your machine. If you are using Anaconda, you can type:

conda install -c conda-forge opencv

Alternatively, you can type:

pip install opencv-python

What Is Camera Calibration and Why Is It Important?

A camera’s job is to convert light that hits its image sensor into an image. An image is made up of picture elements known as pixels.

In a perfect world, these pixels (i.e. what the camera sees) would look exactly like what you see in the real world. However, no camera is perfect. They all contain physical flaws that can, in some cases, cause significant distortion.

For example, look at an extreme example of image distortion below. 

1_before_distorted_houseJPG

Here is the same image after calibrating the camera and correcting for the distortion.

2_after_distorted_houseJPG

Do you see how this could lead to problems…especially in robotics, where accuracy is key?

Imagine you have a robotic arm that is working inside a warehouse. Its job is to pick up items off a conveyor belt and place those items into a bin. 

scara-robot-packaging-cookies

To help the robot determine the exact position of the items, there is a camera overhead. To get an accurate translation between camera pixel coordinates and real world coordinates, you need to calibrate the camera to remove any distortion.

The two most important forms of distortion are radial distortion and tangential distortion.

Radial distortion occurs when straight lines appear curved. Here is an example below of a type of radial distortion known as barrel distortion. Notice the bulging in the image that is making straight lines (in the real world) appear curved.

600px-Barrel_distortion

Tangential distortion occurs when the camera lens is not exactly aligned parallel to the camera sensor. Tangential distortion can make the real world look stretched or tilted. It can also make items appear closer than they are in real life.
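For reference, OpenCV bundles both kinds of distortion into a single distortion coefficient vector (the dist value that cv2.calibrateCamera returns in the code later in this tutorial). In the common 5-element case it is typically ordered (k1, k2, p1, p2, k3), where the k terms model radial distortion and the p terms model tangential distortion.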

To get an accurate representation of the real world on a camera, we have to calibrate the camera. We want to know that when we see an object of a certain size in the real world, this size translates to a known size in camera pixel coordinates. 

By calibrating a camera, we can enable a robotic arm, for example, to have better “hand-eye” coordination. You also enable a self-driving car to know the location of pedestrians, dogs, and other objects relative to the car. 

Calibrating the camera is the process of using a known real-world pattern (e.g. a chessboard) to estimate the extrinsic parameters (rotation and translation vectors) and intrinsic parameters (e.g. focal length, optical center, etc.) of a camera’s lens and image sensor to reduce distortion error caused by the camera’s imperfections. 

These parameters include:

  • Focal length 
  • Image (i.e. optical) center (it is usually not exactly at (width/2, height/2))
  • Scaling factors for the pixels along the rows and columns
  • Skew factor
  • Lens distortion
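Several of these parameters are captured in the 3×3 camera (intrinsic) matrix that the calibration code later in this tutorial prints out. It typically has the form:

[[f_x,   0, c_x],
 [  0, f_y, c_y],
 [  0,   0,   1]]

where f_x and f_y are the focal lengths expressed in pixel units, and (c_x, c_y) is the optical center. (OpenCV’s pinhole model assumes zero skew, so the skew factor does not appear as a separate entry, and lens distortion is handled by the separate vector of distortion coefficients.)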

Why Use a Chessboard?

3-chessboard

Chessboard calibration is a standard technique for performing camera calibration and estimating the values of the unknown parameters I mentioned in the previous section. 

OpenCV has a chessboard calibration library that attempts to map points in 3D on a real-world chessboard to 2D camera coordinates. This information is then used to correct distortion. 

Note that any object could have been used (a book, a laptop computer, a car, etc.), but a chessboard has unique characteristics that make it well-suited for the job of correcting camera distortions:

  • It is flat, so you don’t need to deal with the z-axis (z=0), only the x and y axes. All the points on the chessboard lie on the same plane.
  • There are clear corners and points, making it easy to map points in the 3D real world coordinate system to points on the camera’s 2D pixel coordinate system.
  • The points and corners all occur on straight lines.

Perform Camera Calibration Using OpenCV

The official tutorial from OpenCV is here on their website, but let’s go through the process of camera calibration slowly, step by step. 

Print a Chessboard

The first step is to get a chessboard and print it out on regular A4 size paper. You can download this pdf, which is the official chessboard from the OpenCV documentation, and just print that out. 

Measure Square Length

Measure the length of the side of one of the squares. In my case, I measured 2.3 cm (0.023 m).

Take Photos of the Chessboard From Different Distances and Orientations

We need to take at least 10 photos of the chessboard from different distances and orientations. I’ll take 19 photos so that my algorithm will have a lot of input images from which to perform camera calibration.

Tape the chessboard to a flat, solid object.

Take the photos, and then move them to a directory on your computer.

Here are examples of some of the photos I took: 

6-chessboard-calibration-angles

Draw the Corners

The first thing we need to do is to find and then draw the corners on the image.

Make sure you have NumPy installed, a scientific computing library for Python.

If you’re using Anaconda, you can type:

conda install numpy

Alternatively, you can type:

pip install numpy

Write the following code in Python. You can copy and paste this code into your favorite IDE. Pick one of the chessboard images as a test case. I’ll name the file draw_corners.py. This code will draw the corners on an input chessboard image and then save the drawing to an image file.

import cv2 # Import the OpenCV library to enable computer vision
import numpy as np # Import the NumPy scientific computing library

# Author: Addison Sears-Collins
# https://automaticaddison.com
# Description: Detect corners on a chessboard

filename = 'chessboard_input1.jpg'

# Chessboard dimensions
number_of_squares_X = 10 # Number of chessboard squares along the x-axis
number_of_squares_Y = 7  # Number of chessboard squares along the y-axis
nX = number_of_squares_X - 1 # Number of interior corners along x-axis
nY = number_of_squares_Y - 1 # Number of interior corners along y-axis

def main():
	
  # Load an image
  image = cv2.imread(filename)

  # Convert the image to grayscale
  gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  

  # Find the corners on the chessboard
  success, corners = cv2.findChessboardCorners(gray, (nY, nX), None)
	
  # If the corners are found by the algorithm, draw them
  if success == True:

    # Draw the corners
    cv2.drawChessboardCorners(image, (nY, nX), corners, success)

    # Create the output file name by removing the '.jpg' part
    size = len(filename)
    new_filename = filename[:size - 4]
    new_filename = new_filename + '_drawn_corners.jpg'		
    
    # Save the new image in the working directory
    cv2.imwrite(new_filename, image)

    # Display the image 
    cv2.imshow("Image", image) 
	
    # Display the window until any key is pressed
    cv2.waitKey(0) 
	
    # Close all windows
    cv2.destroyAllWindows() 
	
main()

Run the code.

Here is the output:

7_chessboard_input1_drawn_corners-1

Write the Python Code for Camera Calibration

Now that we know how to identify corners on a chessboard, let’s write the code to perform camera calibration.

Here is the code. I put my distorted image inside a folder named ‘distorted’ inside the working directory. This folder is also where the undistorted output image is saved.

Don’t be scared at how long the code is. I put a lot of comments in here in order to make each line easier for you to understand. Just copy and paste the code into your favorite text editor or IDE. I named the file camera_calibration.py.

# Author: Addison Sears-Collins
# https://automaticaddison.com
# Description: Perform camera calibration using a chessboard.

import cv2 # Import the OpenCV library to enable computer vision
import numpy as np # Import the NumPy scientific computing library
import glob # Used to get retrieve files that have a specified pattern

# Path to the image that you want to undistort
distorted_img_filename = 'distorted/chessboard_input12.jpg'

# Chessboard dimensions
number_of_squares_X = 10 # Number of chessboard squares along the x-axis
number_of_squares_Y = 7  # Number of chessboard squares along the y-axis
nX = number_of_squares_X - 1 # Number of interior corners along x-axis
nY = number_of_squares_Y - 1 # Number of interior corners along y-axis
square_size = 0.023 # Length of the side of a square in meters

# Store vectors of 3D points for all chessboard images (world coordinate frame)
object_points = []

# Store vectors of 2D points for all chessboard images (camera coordinate frame)
image_points = []

# Set termination criteria. We stop either when an accuracy is reached or when
# we have finished a certain number of iterations.
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)

# Define real world coordinates for points in the 3D coordinate frame
# Object points are (0,0,0), (1,0,0), (2,0,0) ...., (5,8,0)
object_points_3D = np.zeros((nX * nY, 3), np.float32) 		

# These are the x and y coordinates												 
object_points_3D[:,:2] = np.mgrid[0:nY, 0:nX].T.reshape(-1, 2) 

object_points_3D = object_points_3D * square_size

def main():
	
  # Get the file path for images in the current directory
  images = glob.glob('*.jpg')
	
  # Go through each chessboard image, one by one
  for image_file in images:
 
    # Load the image
    image = cv2.imread(image_file)  

    # Convert the image to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  

    # Find the corners on the chessboard
    success, corners = cv2.findChessboardCorners(gray, (nY, nX), None)
	
    # If the corners are found by the algorithm, draw them
    if success == True:

      # Append object points
      object_points.append(object_points_3D)

      # Find more exact corner pixels		
      corners_2 = cv2.cornerSubPix(gray, corners, (11,11), (-1,-1), criteria)		
      
      # Append image points
      image_points.append(corners_2)

      # Draw the corners
      cv2.drawChessboardCorners(image, (nY, nX), corners_2, success)

      # Display the image. Used for testing.
      #cv2.imshow("Image", image) 
	
      # Display the window for a short period. Used for testing.
      #cv2.waitKey(200) 
																													
  # Now take a distorted image and undistort it 
  distorted_image = cv2.imread(distorted_img_filename)

  # Perform camera calibration to return the camera matrix, distortion coefficients, rotation and translation vectors etc 
  ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(object_points, 
                                                    image_points, 
                                                    gray.shape[::-1], 
                                                    None, 
                                                    None)

  # Get the dimensions of the image	
  height, width = distorted_image.shape[:2]
	
  # Refine camera matrix
  # Returns optimal camera matrix and a rectangular region of interest
  optimal_camera_matrix, roi = cv2.getOptimalNewCameraMatrix(mtx, dist, 
                                                            (width,height), 
                                                            1, 
                                                            (width,height))

  # Undistort the image
  undistorted_image = cv2.undistort(distorted_image, mtx, dist, None, 
                                    optimal_camera_matrix)
  
  # Crop the image. Uncomment these two lines to remove black lines
  # on the edge of the undistorted image.
  #x, y, w, h = roi
  #undistorted_image = undistorted_image[y:y+h, x:x+w]
	
  # Display key parameter outputs of the camera calibration process
  print("Optimal Camera matrix:") 
  print(optimal_camera_matrix) 

  print("\n Distortion coefficient:") 
  print(dist) 
  
  print("\n Rotation Vectors:") 
  print(rvecs) 
  
  print("\n Translation Vectors:") 
  print(tvecs) 

  # Create the output file name by removing the '.jpg' part
  size = len(distorted_img_filename)
  new_filename = distorted_img_filename[:size - 4]
  new_filename = new_filename + '_undistorted.jpg'
	
  # Save the undistorted image
  cv2.imwrite(new_filename, undistorted_image)

  # Close all windows
  cv2.destroyAllWindows() 
	
main()

Output

Here is the original distorted image.

chessboard_input12

Here is the undistorted image, which is the output. Note that the distorted image is the same dimensions as the undistorted image. Both images are 600 x 450 pixels.

chessboard_input12_undistorted

The correction is quite subtle, so it might be hard to see (my camera must be pretty good!). It is more noticeable when I flip back and forth between both images as you see in the gif below.

8-distorted-vs-undistorted

Saving Parameters Using Pickle

If you want to save the camera calibration parameters to a file, you can use a package like Pickle to do it. It will serialize these parameters into a file that you can later load into another program.

The three key parameters you want to make sure to save are mtx, dist, and optimal_camera_matrix.

There are a number of good tutorials online on how to use Pickle.

Assuming you have import pickle at the top of your program, the Python code for saving the parameters to a pickle file would be as follows:

# Save the camera calibration results.
calib_result_pickle = {}
calib_result_pickle["mtx"] = mtx
calib_result_pickle["optimal_camera_matrix"] = optimal_camera_matrix
calib_result_pickle["dist"] = dist
calib_result_pickle["rvecs"] = rvecs
calib_result_pickle["tvecs"] = tvecs
pickle.dump(calib_result_pickle, open("camera_calib_pickle.p", "wb" )) 

Then, if at a later time you wanted to load the parameters into a new program, you would use this code:

calib_result_pickle = pickle.load(open("camera_calib_pickle.p", "rb" ))
mtx = calib_result_pickle["mtx"]
optimal_camera_matrix = calib_result_pickle["optimal_camera_matrix"]
dist = calib_result_pickle["dist"]

Then, given an input image or video frame (i.e. distorted_image), we can undistort it using the following lines of code:

undistorted_image = cv2.undistort(distorted_image, mtx, dist, None, 
                                    optimal_camera_matrix)

That’s it for this tutorial. Hope you enjoyed it. Now you know how to calibrate a camera using OpenCV. 

Keep building!