How To Write a ROS2 Publisher and Subscriber (Python) – Foxy

In this tutorial, we will learn how to create a publisher and a subscriber node in ROS2 Foxy Fitzroy from scratch.

Requirements

We’ll create three separate programs:

  1. A node that publishes the coordinates of an object detected by a fictitious camera (in reality, we’ll just publish random (x,y) coordinates of an object to a ROS2 topic).
  2. A helper program (not a node) that converts the coordinates of the object from the camera reference frame to the (fictitious) robotic arm base frame.
  3. A node that subscribes to those camera-frame coordinates and publishes the coordinates of the object with respect to the robotic arm base frame.

I thought this example would be more fun and realistic than the “hello world” example commonly used to teach people how to write ROS subscriber and publisher nodes.

Real-World Applications

2-use-some-tape
Two degree of freedom robotic arm powered by Arduino with an overhead Raspberry Pi camera. With some additions and modifications, this system could be integrated with ROS2 using the code we’ll develop in this tutorial.

A real-world application of what we’ll accomplish in this tutorial would be a pick and place task where a robotic arm needs to pick up an object from one location and place it in another (e.g. factory, warehouse, etc.).

With some modifications to the code in this tutorial, you can create a complete ROS2-powered robotic system that can detect an object, convert its coordinates to the robotic arm base frame, and pick the object up and place it somewhere else.

To complete such a project, you would need to write one more node (i.e. a single Arduino sketch) for your Arduino to control the motors of the robotic arm. This node would need to do the following (a rough sketch follows the list below):

  1. Subscribe to the coordinates of the object.
  2. Convert those coordinates into servo motor angles using inverse kinematics.
  3. Move the robotic arm to the location of the object to pick it up.
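
Here is a rough sketch of what that last node could look like if you wrote it in Python (for example, on a Raspberry Pi that drives the servos) instead of as an Arduino sketch. The /pos_in_robot_base_frame topic matches what we build later in this tutorial; the link lengths, the two-link inverse kinematics, and the placeholder for actually driving the servos are assumptions for illustration only.

import math # Python math library (for the inverse kinematics)
import rclpy # Import the ROS client library for Python
from rclpy.node import Node # Enables the use of rclpy's Node class
from std_msgs.msg import Float64MultiArray # Message type carrying the coordinates

# Assumed link lengths of the two degree of freedom arm, in centimeters (made up for this sketch)
L1 = 10.0
L2 = 10.0

class ArmController(Node):

  def __init__(self):
    super().__init__('arm_controller')
    # 1. Subscribe to the coordinates of the object (robot base frame, in centimeters)
    self.subscription = self.create_subscription(
      Float64MultiArray, '/pos_in_robot_base_frame', self.pos_received, 10)

  def pos_received(self, msg):
    x, y = msg.data[0], msg.data[1]
    # 2. Convert the coordinates into joint angles using standard two-link
    #    planar inverse kinematics
    cos_theta2 = (x**2 + y**2 - L1**2 - L2**2) / (2 * L1 * L2)
    if abs(cos_theta2) > 1.0:
      self.get_logger().warn('Object is out of reach')
      return
    theta2 = math.acos(cos_theta2)
    theta1 = math.atan2(y, x) - math.atan2(L2 * math.sin(theta2), L1 + L2 * math.cos(theta2))
    # 3. Move the robotic arm: send the angles to the servos (placeholder - here we just log them)
    self.get_logger().info(
      f'Servo angles (degrees): {math.degrees(theta1):.1f}, {math.degrees(theta2):.1f}')

def main(args=None):
  rclpy.init(args=args)
  arm_controller = ArmController()
  rclpy.spin(arm_controller)
  arm_controller.destroy_node()
  rclpy.shutdown()

if __name__ == '__main__':
  main()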

Prerequisites

Source Your ROS2 Installation

Open a new terminal window.

Open your .bashrc file.

gedit ~/.bashrc

If you don’t have gedit installed, install it first:

sudo apt-get install gedit

Add this line of code to the very bottom of the file (Note: The line might already be there. If so, please ignore these steps).

source /opt/ros/foxy/setup.bash

You now don’t have to source your ROS2 installation every time you open a new terminal window.

Save the file, and close it.

Let’s create a workspace directory.

mkdir -p ~/camera_to_robot_base_frame_ws/src

Open your .bashrc file.

gedit ~/.bashrc

Add this line of code to the very bottom of the .bashrc file.

source ~/camera_to_robot_base_frame_ws/install/setup.bash

1_add_bashrcJPG

Save the file and close it.

Create a Package

Open a new terminal window, and type:

cd ~/camera_to_robot_base_frame_ws/src

Now let’s create a ROS package.

Type this command:

ros2 pkg create --build-type ament_python my_package

Your package named my_package has now been created.

2-package-createdJPG

Build Your Package

Return to the root of your workspace:

cd ~/camera_to_robot_base_frame_ws/

Build all packages in the workspace.

colcon build
3-build-packageJPG

Write Node(s)

Open a new terminal window.

Move to the camera_to_robot_base_frame_ws/src/my_package/my_package folder.

cd camera_to_robot_base_frame_ws/src/my_package/my_package

Write this program, and save it in the folder you’re currently in as camera_publisher.py.

This code generates random object coordinates (i.e. x and y location). In a real-world scenario, you would be detecting an object with a camera and then publishing those coordinates to a ROS2 topic.

gedit camera_publisher.py
''' ####################
    Detect an object in a video stream and publish its coordinates
	from the perspective of the camera (i.e. camera reference frame)
	Note: You don't need a camera to run this node. This node just demonstrates how to create
	a publisher node in ROS2 (i.e. how to publish data to a topic in ROS2).
    -------
	Publish the coordinates of the centroid of an object to a topic:
	  /pos_in_cam_frame – The position of the center of an object in centimeter coordinates
    ==================================
    Author: Addison Sears-Collins
    Date: September 28, 2020
    #################### '''

import rclpy # Import the ROS client library for Python
from rclpy.node import Node # Enables the use of rclpy's Node class
from std_msgs.msg import Float64MultiArray # Enable use of the std_msgs/Float64MultiArray message type
import numpy as np # NumPy Python library
import random # Python library to generate random numbers

class CameraPublisher(Node):
  """
  Create a CameraPublisher class, which is a subclass of the Node class.
  The class publishes the position of an object every 3 seconds.
  The position of the object consists of its x and y coordinates with respect to 
  the camera frame.
  """
 
  def __init__(self):
    """
	Class constructor to set up the node
    """
  
    # Initiate the Node class's constructor and give it a name
    super().__init__('camera_publisher')
	
    # Create publisher(s)  
	
    # This node publishes the position of an object every 3 seconds.
    # Maximum queue size of 10. 
    self.publisher_position_cam_frame = self.create_publisher(Float64MultiArray, '/pos_in_cam_frame', 10)
	
    # 3 seconds
    timer_period = 3.0	
    self.timer = self.create_timer(timer_period, self.get_coordinates_of_object)
    self.i = 0	# Initialize a counter variable
	
    # Centimeter to pixel conversion factor
    # Assume we measure 36.0 cm across the width of the field of view of the camera.
    # Assume camera is 640 pixels in width and 480 pixels in height
    self.CM_TO_PIXEL = 36.0 / 640
  
  def get_coordinates_of_object(self):
    """
    Callback function.
    This function gets called every 3 seconds.
	We locate an object using the camera and then publish its coordinates to ROS2 topics.
    """	
    # Center of the bounding box that encloses the detected object.
    # This is in pixel coordinates.
    # Since we don't have an actual camera and an object to detect, 
    # we generate random pixel locations.
    # Assume x (width) can go from 0 to 640 pixels, and y (height) can go from 0 to 480 pixels
    x = random.randint(250,450)   # Generate a random integer from 250 to 450 (inclusive)
    y = random.randint(250,450)   # Generate a random integer from 250 to 450 (inclusive)
    
    # Calculate the center of the object in centimeter coordinates
    # instead of pixel coordinates
    x_cm = x * self.CM_TO_PIXEL
    y_cm = y * self.CM_TO_PIXEL
	
    # Store the position of the object in a list
    object_position = [x_cm, y_cm]
	
    # Publish the coordinates to the topic
    self.publish_coordinates(object_position)
	
    # Increment counter variable
    self.i += 1
   	
  def publish_coordinates(self,position):
    """
    Publish the coordinates of the object to ROS2 topics
    :param: The position of the object in centimeter coordinates [x , y] 
    """
    msg = Float64MultiArray() # Create a message of this type 
    msg.data = position # Store the x and y coordinates of the object
    self.publisher_position_cam_frame.publish(msg) # Publish the position to the topic 	  

def main(args=None):

  # Initialize the rclpy library
  rclpy.init(args=args)

  # Create the node
  camera_publisher = CameraPublisher()

  # Spin the node so the callback function is called.
  # Publish any pending messages to the topics.
  rclpy.spin(camera_publisher)

  # Destroy the node explicitly
  # (optional - otherwise it will be done automatically
  # when the garbage collector destroys the node object)
  camera_publisher.destroy_node()

  # Shutdown the ROS client library for Python
  rclpy.shutdown()

if __name__ == '__main__':
  main()

Save the file and close it.

Now change the permissions on the file.

chmod +x camera_publisher.py

Let’s add a program (this isn’t a ROS2 node; it’s just a helper module that performs calculations) that will be responsible for converting the object’s coordinates from the camera reference frame to the robot base frame. We’ll call it coordinate_transform.py.

gedit coordinate_transform.py
''' ####################
    Convert camera coordinates to robot base frame coordinates
    ==================================
    Author: Addison Sears-Collins
    Date: September 28, 2020
    #################### '''

import numpy as np
import random # Python library to generate random numbers

class CoordinateConversion(object):
  """
  Parent class for coordinate conversions
  All child classes must implement the convert function.
  Every class in Python is descended from the object class
  class CoordinateConversion == class CoordinateConversion(object)
  This class is a superclass, a general class from which 
  more specialized classes (e.g. CameraToRobotBaseConversion) can be defined.
  """
  def convert(self, frame_coordinates):
    """
    Convert between coordinate frames

    Input
      :param frame_coordinates: Coordinates of the object in a reference frame (x, y, z, 1)

    Output
      :return: new_frame_coordinates: Coordinates of the object in a new reference frame (x, y, z, 1)

    """
    # Any subclasses that inherit this superclass CoordinateConversion must implement this method.	  
    raise NotImplementedError


class CameraToRobotBaseConversion(CoordinateConversion):
  """
  Convert camera coordinates to robot base frame coordinates
  This class is a subclass that inherits the methods from the CoordinateConversion class.
  """

  def __init__(self,rot_angle,x_disp,y_disp,z_disp):
    """
    Constructor for the CameraToRobotBaseConversion class. Sets the properties.

    Input
      :param rot_angle: Angle between axes in degrees
      :param x_disp: Displacement between coordinate frames in the x direction in centimeters
      :param y_disp: Displacement between coordinate frames in the y direction in centimeters
      :param z_disp: Displacement between coordinate frames in the z direction in centimeters

    """
    self.angle = np.deg2rad(rot_angle) # Convert degrees to radians
    self.X = x_disp
    self.Y = y_disp
    self.Z = z_disp

  def convert(self, frame_coordinates):
    """
    Convert camera coordinates to robot base frame coordinates

    Input
      :param frame_coordinates: Coordinates of the object in the camera reference frame (x, y, z, 1) in centimeters

    Output
      :return: new_frame_coordinates: Coordinates of the object in the robot base reference frame (x, y, z, 1) in centimeters

    """
    # Define the rotation matrix from the robotic base frame (frame 0)
    # to the camera frame (frame c).
    rot_mat_0_c = np.array([[1, 0, 0],
                            [0, np.cos(self.angle), -np.sin(self.angle)],
                            [0, np.sin(self.angle), np.cos(self.angle)]])

    # Define the displacement vector from frame 0 to frame c
    disp_vec_0_c = np.array([[self.X],
                             [self.Y], 
                             [self.Z]])

    # Row vector for bottom of homogeneous transformation matrix
    extra_row_homgen = np.array([[0, 0, 0, 1]])

    # Create the homogeneous transformation matrix from frame 0 to frame c
    homgen_0_c = np.concatenate((rot_mat_0_c, disp_vec_0_c), axis=1) # side by side
    homgen_0_c = np.concatenate((homgen_0_c, extra_row_homgen), axis=0) # one above the other
	
    # Coordinates of the object in base reference frame
    new_frame_coordinates = homgen_0_c @ frame_coordinates
	
    return new_frame_coordinates


def main():
  """
  This code is used to test the methods implemented above.
  """

  # Define the displacement from frame base frame of robot to camera frame in centimeters
  x_disp = -17.8
  y_disp = 24.4
  z_disp = 0.0
  rot_angle = 180 # angle between axes in degrees
  
  # Create a CameraToRobotBaseConversion object
  cam_to_robo = CameraToRobotBaseConversion(rot_angle, x_disp, y_disp, z_disp)
  
  # Centimeter to pixel conversion factor
  # I measured 36.0 cm across the width of the field of view of the camera.
  CM_TO_PIXEL = 36.0 / 640

  print(f'Detecting an object for 3 seconds')
  dt = 0.1 # Time interval
  t = 0    # Set starting time
  while t<3:
    t = t + dt
  
    # This is in pixel coordinates.
	# Since we don't have an actual camera and an object to detect, 
	# we generate random pixel locations.
	# Assume x (width) can go from 0 to 640 pixels, and y (height) can go from 0 to 480 pixels
    x = random.randint(250,450)   # Generate a random integer from 250 to 450 (inclusive)
    y = random.randint(250,450)   # Generate a random integer from 250 to 450 (inclusive)

    # Calculate the center of the object in centimeter coordinates
    # instead of pixel coordinates
    x_cm = x * CM_TO_PIXEL
    y_cm = y * CM_TO_PIXEL	

    # Coordinates of the object in the camera reference frame
    cam_ref_coord = np.array([[x_cm],
                              [y_cm],
                              [0.0],
                              [1]])
    
    robot_base_frame_coord = cam_to_robo.convert(cam_ref_coord) # Return robot base frame coordinates 
   
    text = "x: " + str(robot_base_frame_coord[0][0]) + ", y: " + str(robot_base_frame_coord[1][0])
    print(f'{t}:{text}') # Display time stamp and coordinates   

if __name__ == '__main__':
  main()

Save the file and close it.

Now change the permissions on the file.

chmod +x coordinate_transform.py

Now we’ll add one more node. This node will be responsible for receiving the coordinates of an object in the camera reference frame (these coordinates are published to the ‘/pos_in_cam_frame’ topic by camera_publisher.py), using coordinate_transform.py to convert those coordinates to the robotic arm base frame, and finally publishing these transformed coordinates to the ‘/pos_in_robot_base_frame’ topic.

Name this node robotic_arm_publishing_subscriber.py.

gedit robotic_arm_publishing_subscriber.py
''' ####################
    Receive (i.e. subscribe) coordinates of an object in the camera reference frame, and
	publish those coordinates in the robotic arm base frame.
    -------
	Publish the coordinates of the centroid of an object to topic:
	  /pos_in_robot_base_frame – The x and y position of the center of an object in centimeter coordinates
    ==================================
    Author: Addison Sears-Collins
    Date: September 28, 2020
    #################### '''

import rclpy # Import the ROS client library for Python
from rclpy.node import Node # Enables the use of rclpy's Node class
from std_msgs.msg import Float64MultiArray # Enable use of the std_msgs/Float64MultiArray message type
import numpy as np # NumPy Python library
from .coordinate_transform import CameraToRobotBaseConversion

class PublishingSubscriber(Node):
  """
  Create a PublishingSubscriber class, which is a subclass of the Node class.
  This class subscribes to the x and y coordinates of an object in the camera reference frame.
  The class then publishes the x and y coordinates of the object in the robot base frame.  
  """
 
  def __init__(self):
    """
	Class constructor to set up the node
    """
  
    # Initiate the Node class's constructor and give it a name
    super().__init__('publishing_subscriber')
	
    # Create subscriber(s)    
	
    # The node subscribes to messages of type std_msgs/Float64MultiArray, over a topic named:
    #   /pos_in_cam_frame
    # The callback function is called as soon as a message is received.
    # The maximum number of queued messages is 10.
    self.subscription_1 = self.create_subscription(
      Float64MultiArray,
      '/pos_in_cam_frame',
      self.pos_received,
      10)
    self.subscription_1  # prevent unused variable warning
		
    # Create publisher(s)  
	
    # This node publishes the position in robot frame coordinates.
    # Maximum queue size of 10. 
    self.publisher_pos_robot_frame = self.create_publisher(Float64MultiArray, '/pos_in_robot_base_frame', 10)

    # Define the displacement from frame base frame of robot to camera frame in centimeters
    x_disp = -17.8
    y_disp = 24.4
    z_disp = 0.0
    rot_angle = 180 # angle between axes in degrees
	
    # Create a CameraToRobotBaseConversion object
    self.cam_to_robo = CameraToRobotBaseConversion(rot_angle, x_disp, y_disp, z_disp)
  
  def pos_received(self, msg):
    """
    Callback function.
    This function gets called as soon as the position of the object is received.
    :param: msg is of type std_msgs/Float64MultiArray 
    """
    object_position = msg.data
    
    # Coordinates of the object in the camera reference frame in centimeters
    cam_ref_coord = np.array([[object_position[0]],
                              [object_position[1]],
                              [0.0],
                              [1]])
    
    robot_base_frame_coord = self.cam_to_robo.convert(cam_ref_coord) # Return robot base frame coordinates 
	
    # Capture the object's desired position (x, y)
    object_position = [robot_base_frame_coord[0][0], robot_base_frame_coord[1][0]]
	  
    # Publish the coordinates to the topics
    self.publish_position(object_position)
   	
  def publish_position(self,object_position):
    """
    Publish the coordinates of the object with respect to the robot base frame.
    :param: object position [x, y] 
    """
    msg = Float64MultiArray() # Create a message of this type 
    msg.data = object_position # Store the object's position
    self.publisher_pos_robot_frame.publish(msg) # Publish the position to the topic 
	  
def main(args=None):

  # Initialize the rclpy library
  rclpy.init(args=args)

  # Create the node
  publishing_subscriber = PublishingSubscriber()

  # Spin the node so the callback function is called.
  # Pull messages from any topics this node is subscribed to.
  # Publish any pending messages to the topics.
  rclpy.spin(publishing_subscriber)

  # Destroy the node explicitly
  # (optional - otherwise it will be done automatically
  # when the garbage collector destroys the node object)
  publishing_subscriber.destroy_node()

  # Shutdown the ROS client library for Python
  rclpy.shutdown()

if __name__ == '__main__':
  main()

Save the file and close it.

Now change the permissions on the file.

chmod +x robotic_arm_publishing_subscriber.py

Add Dependencies

Navigate one level back.

cd ..

You are now in the camera_to_robot_base_frame_ws/src/my_package/ directory. 

Type 

ls

You should see the setup.py, setup.cfg, and package.xml files.

Open package.xml with your text editor.

gedit package.xml

Fill in the <description>, <maintainer>, and <license> tags.

  <name>my_package</name>
  <version>0.0.0</version>
  <description>Converts camera coordinates to robotic arm base frame coordinates</description>
  <maintainer email="example@automaticaddison.com">Addison Sears-Collins</maintainer>
  <license>All Rights Reserved.</license>

Add a new line after the ament_python build_type dependency, and add the following dependencies, which correspond to the other packages this package needs (these match your nodes’ import statements):

<exec_depend>rclpy</exec_depend>
<exec_depend>std_msgs</exec_depend>

This lets the system know that this package needs the rclpy and std_msgs packages when its code is executed. 

Save the file and close it.

Add an Entry Point

Now we need to add the entry points for the node(s) by opening the setup.py file.

gedit setup.py

Make sure to modify the entry_points block of code so that it looks like this:

entry_points={
        'console_scripts': [
                'camera_publisher = my_package.camera_publisher:main',
                'robotic_arm_publishing_subscriber = my_package.robotic_arm_publishing_subscriber:main',
        ],
},

Save setup.py and close it.

Check for Missing Dependencies

Check for any missing dependencies before you build the package.

Move to your workspace folder.

cd ~/camera_to_robot_base_frame_ws

Run the following command:

rosdep install -i --from-path src --rosdistro foxy -y

I get a message that says:

“#All required rosdeps installed successfully”

Build and Run

Now, build your package from the root of the workspace.

cd ~/camera_to_robot_base_frame_ws

Then build the package.

colcon build

You should get a message that says:

“Summary: 1 package finished [<time it took in seconds>]”

If you get errors, go back to your Python programs and check for mistakes. Python is sensitive to indentation (tabs vs. spaces), so make sure the spacing of the code is consistent (go through the code one line at a time).

The programs are located in this folder.

cd camera_to_robot_base_frame_ws/src/my_package/my_package

After you make changes, build the package again.

cd ~/camera_to_robot_base_frame_ws
colcon build

You should see this message when everything builds successfully:

4-build-successfulJPG

Open a new terminal tab.

Run the nodes:

ros2 run my_package camera_publisher

In another terminal tab:

ros2 run my_package robotic_arm_publishing_subscriber

Open a new terminal tab, and move to the root directory of the workspace.

cd ~/camera_to_robot_base_frame_ws

List the active topics.

ros2 topic list -t
5-ros2-topic-listJPG

To listen to any topic, type:

ros2 topic echo /topic_name

Where you replace “/topic_name” with the name of the topic. 

For example:

ros2 topic echo /pos_in_cam_frame

You should see the (x, y) coordinates of the object (in the camera reference frame) printed to the screen.

In another terminal, you can type:

ros2 topic echo /pos_in_robot_base_frame

You should see the (x, y) coordinates of the object (in the robot base frame) printed to the screen.
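
You can also confirm that the camera publisher is running at the expected rate (one message every 3 seconds, i.e. roughly 0.33 Hz) with:

ros2 topic hz /pos_in_cam_frame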

Here is how my terminal windows look. The camera coordinates in centimeters are on top, and the robot base frame coordinates are on the bottom.

7-camera-frame-above-robot-base-frame-belowJPG

When you’ve had enough, type Ctrl+C in each terminal tab to stop the nodes from spinning.

Zip the Workspace for Distribution

If you ever want to zip the whole workspace and send it to someone, open a new terminal window.

Move to the directory containing your workspace.

cd ~/camera_to_robot_base_frame_ws
cd ..
zip -r camera_to_robot_base_frame_ws.zip camera_to_robot_base_frame_ws

The syntax is:

zip -r <filename.zip> <foldername>

How To Set up a ROS2 Project for Python – Foxy Fitzroy

In this tutorial, we will learn how to set up a ROS2 project from scratch.

Prerequisites

Source Your ROS2 Installation

We first need to create a new workspace. A workspace is a folder that will contain all your ROS packages.

Here is the official tutorial, but I’ll walk through all the steps below.

Open a new terminal window.

To find out which version of ROS2 you have installed, type the following command:

printenv ROS_DISTRO

I’m using the foxy distribution of ROS2.

Now, let’s source the ROS2 environment. Type the following command:

source /opt/ros/foxy/setup.bash

Create a Workspace

Let’s create a workspace directory. “-p” is short for “--parents”. This flag makes sure the mkdir (“make directory”) command makes the parent /dev_ws/ folder as well as the /src/ folder within it (Feel free to check out my Linux commands tutorial).

mkdir -p ~/dev_ws/src
cd ~/dev_ws/src

The syntax is:

mkdir -p ~/<yourworkspace>_ws/src

cd ~/<yourworkspace>_ws/src

Create a Package

Now let’s create a ROS package.

We will use the following syntax.

ros2 pkg create --build-type ament_python <package_name>

Type this command:

ros2 pkg create --build-type ament_python my_package

Your package named my_package has now been created.

Build Your Package

Return to the root of your workspace:

cd ~/dev_ws

Build all packages in the workspace.

colcon build

If you only want to build my_package, you can type

colcon build --packages-select my_package

Source the Setup File

Open a new terminal window.

Source your ROS2 installation.

source /opt/ros/foxy/setup.bash

Move to your workspace folder.

cd ~/dev_ws

Add the files you just created to the path.

source install/setup.bash

If you don’t have gedit installed, install it using this command:

sudo apt-get install gedit

Open your .bashrc file.

gedit ~/.bashrc

Add this line of code to the very bottom of the file.

source /opt/ros/foxy/setup.bash

You now don’t have to source your ROS2 installation every time you open a new terminal window.

While you’re at it, add this line of code to the very bottom of the .bashrc file.

source ~/dev_ws/install/setup.bash

The syntax is

source ~/<your_workspace>/install/setup.bash

Save the file, and close it.

Your system now knows where to find your ROS2 packages.

Write Node(s)

Move to the dev_ws/src/my_package/my_package folder.

cd dev_ws/src/my_package/my_package

Write a Python program (i.e. node), and add it to this folder you’re currently in. This post has some sample code you can use. Copy the Publisher Node code, and paste it into a file named my_python_script.py.

gedit my_python_script.py
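
If you don’t have that sample code handy, here is a minimal publisher node you could paste in instead. This is just a sketch: it publishes a std_msgs/String message to a hypothetical topic named /my_topic once per second (if you use it, also add <exec_depend>std_msgs</exec_depend> to package.xml in the next step).

import rclpy # Import the ROS client library for Python
from rclpy.node import Node # Enables the use of rclpy's Node class
from std_msgs.msg import String # Simple string message type

class MyPublisher(Node):

  def __init__(self):
    super().__init__('my_publisher')
    # Publish a std_msgs/String message to /my_topic once per second
    self.publisher_ = self.create_publisher(String, '/my_topic', 10)
    self.timer = self.create_timer(1.0, self.timer_callback)
    self.i = 0 # Counter for the messages

  def timer_callback(self):
    msg = String()
    msg.data = 'Hello ROS2: %d' % self.i
    self.publisher_.publish(msg)
    self.i += 1

def main(args=None):
  rclpy.init(args=args)
  my_publisher = MyPublisher()
  rclpy.spin(my_publisher)
  my_publisher.destroy_node()
  rclpy.shutdown()

if __name__ == '__main__':
  main()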

Save the file and close it.

Now change the permissions on the file.

chmod +x my_python_script.py

Make sure that the names of your Python nodes are not the same as the name of the package. Otherwise, you will get ImportErrors.

You can add as many Python programs as you like. For example, you might have one node that defines a class, and then you have another node that imports that class and uses it to make calculations, which are then published to a ROS2 topic as some data type (e.g. Float64, geometry_msgs/Twist, etc.).

Add Dependencies

Navigate one level back.

cd ..

You are now in the dev_ws/src/my_package/ directory. 

Type 

ls

You should see the setup.py, setup.cfg, and package.xml files.

Open package.xml with your text editor.

gedit package.xml

Fill in the <description>, <maintainer>, and <license> tags.

Add a new line after the ament_python build_type dependency, and add the following dependencies, which correspond to the other packages this package needs (these match your nodes’ import statements):

<exec_depend>rclpy</exec_depend>
<exec_depend>geometry_msgs</exec_depend>

This lets the system know that this package needs the rclpy and geometry_msgs packages when its code is executed. 

The code doesn’t need the geometry_msgs package, but I’m having you add it anyway just to show you that you can add all sorts of message types in this package.xml file.

Save the file.

Add an Entry Point

Now we need to add the entry points for the node(s) by opening the setup.py file.

gedit setup.py

Make sure this code is in there:

entry_points={
        'console_scripts': [
                'my_python_script = my_package.my_python_script:main',
        ],
},

The syntax is:

‘my_python_script = my_package.my_python_script:main’

my_python_script will be the name of the executable.

‘my_package.my_python_script:main’ tells the system to execute the main() function inside my_python_script.py. Therefore, this will be the entry point when you run your node.

The executable script will go to:

~/dev_ws/install/my_package/lib/my_package/

Save setup.py and close it.

Check for Missing Dependencies

Check for any missing dependencies before you build the package.

Move to your workspace folder.

cd ~/dev_ws

Run the following command:

rosdep install -i --from-path src --rosdistro <distro> -y

For example:

rosdep install -i --from-path src --rosdistro foxy -y

I get a message that says:

“#All required rosdeps installed successfully”

Build and Run

Now, build your package from the root of the workspace.

cd ~/dev_ws

Then build the package.

colcon build --packages-select my_package

You should get a message that says:

“Summary: 1 package finished [<time it took in seconds>]”

Open a new terminal tab.

cd ~/dev_ws

Run the node:

ros2 run my_package my_python_script

The syntax is:

ros2 run <package_name> <node name without the .py>

Open a new terminal tab, and move to the root directory of the workspace.

cd ~/dev_ws

List the active topics.

ros2 topic list -t

To listen to any topic, type:

ros2 topic echo /topic_name

Where you replace “/topic_name” with the name of the topic. 

When you’ve had enough, type Ctrl+C in each terminal tab to stop the nodes from spinning.

Zip the Workspace for Distribution

If you ever want to zip the whole workspace and send it to someone, open a new terminal window.

Move to the directory containing your workspace.

cd ~/dev_ws
cd ..
zip -r myros2workspace.zip dev_ws

The syntax is:

zip -r <filename.zip> <foldername>

How to Convert Camera Pixels to Robot Base Frame Coordinates

In this tutorial, we’ll learn how to convert camera pixel coordinates to coordinates that are relative to the base frame for a robotic arm.

Prerequisites

This tutorial will make a lot more sense if you’ve gone through the following tutorials first. Otherwise, if you are already familiar with terms like “homogeneous transformation matrix,” jump right ahead into this tutorial.

Case 1: Camera Lens is Parallel to the Surface

Label the Axes

Here is a two degree of freedom robotic arm that I built.

22-beam-mount-pointing-right

Here is the diagram of this robotic arm.

13-add-y-axesJPG

What we need to do is to draw the frames on the dry-erase board.

The origin of the base frame of the robotic arm is located here.

1-origin-is-located-here

The x0 axis is this line here.

2-x0-axis-line-is-here

The y0 axis is this line here.

3-y0-axis-is-here

Open up your Raspberry Pi and turn on the video stream by running the following code:

# Credit: Adrian Rosebrock
# https://www.pyimagesearch.com/2015/03/30/accessing-the-raspberry-pi-camera-with-opencv-and-python/

# import the necessary packages
from picamera.array import PiRGBArray # Generates a 3D RGB array
from picamera import PiCamera # Provides a Python interface for the RPi Camera Module
import time # Provides time-related functions
import cv2 # OpenCV library

# Initialize the camera
camera = PiCamera()

# Set the camera resolution
camera.resolution = (640, 480)

# Set the number of frames per second
camera.framerate = 32

# Generates a 3D RGB array and stores it in rawCapture
raw_capture = PiRGBArray(camera, size=(640, 480))

# Wait a certain number of seconds to allow the camera time to warmup
time.sleep(0.1)

# Capture frames continuously from the camera
for frame in camera.capture_continuous(raw_capture, format="bgr", use_video_port=True):
    
    # Grab the raw NumPy array representing the image
    image = frame.array
    
    # Display the frame using OpenCV
    cv2.imshow("Frame", image)
    
    # Wait for keyPress for 1 millisecond
    key = cv2.waitKey(1) & 0xFF
    
    # Clear the stream in preparation for the next frame
    raw_capture.truncate(0)
    
    # If the `q` key was pressed, break from the loop
    if key == ord("q"):
        break

Let’s label the origin for the camera reference frame. The origin (x = 0, y = 0) for the camera reference frame is located in the far upper-left corner of the screen. I’ve put a screw driver head on the table to mark the location of the origin of the camera frame.

4b-label-origin
4-label-the-originJPG

The x axis for the camera frame runs to the right from the origin along the top of the field of view. I’ll label this axis xc.

The y axis for the camera frame runs downward from the origin along the left side of the field of view. I’ll label this axis yc.

5-label-x-y-axes
6-label-x-y-axes

Press q (with the video window selected) or CTRL + C in the terminal to stop the code from running.

Run the Object Detector Code

Run this code (absolute_difference_method_cm.py):

# Author: Addison Sears-Collins
# Description: This algorithm detects objects in a video stream
#   using the Absolute Difference Method. The idea behind this 
#   algorithm is that we first take a snapshot of the background.
#   We then identify changes by taking the absolute difference 
#   between the current video frame and that original 
#   snapshot of the background (i.e. first frame). 

# import the necessary packages
from picamera.array import PiRGBArray # Generates a 3D RGB array
from picamera import PiCamera # Provides a Python interface for the RPi Camera Module
import time # Provides time-related functions
import cv2 # OpenCV library
import numpy as np # Import NumPy library

# Initialize the camera
camera = PiCamera()

# Set the camera resolution
camera.resolution = (640, 480)

# Set the number of frames per second
camera.framerate = 30

# Generates a 3D RGB array and stores it in rawCapture
raw_capture = PiRGBArray(camera, size=(640, 480))

# Wait a certain number of seconds to allow the camera time to warmup
time.sleep(0.1)

# Initialize the first frame of the video stream
first_frame = None

# Create kernel for morphological operation. You can tweak
# the dimensions of the kernel.
# e.g. instead of 20, 20, you can try 30, 30
kernel = np.ones((20,20),np.uint8)

# Centimeter to pixel conversion factor
# I measured 32.0 cm across the width of the field of view of the camera.
CM_TO_PIXEL = 32.0 / 640

# Capture frames continuously from the camera
for frame in camera.capture_continuous(raw_capture, format="bgr", use_video_port=True):
    
    # Grab the raw NumPy array representing the image
    image = frame.array

    # Convert the image to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    
    # Close gaps using closing
    gray = cv2.morphologyEx(gray,cv2.MORPH_CLOSE,kernel)
      
    # Remove salt and pepper noise with a median filter
    gray = cv2.medianBlur(gray,5)
    
    # If first frame, we need to initialize it.
    if first_frame is None:
        
      first_frame = gray
      
      # Clear the stream in preparation for the next frame
      raw_capture.truncate(0)
      
      # Go to top of for loop
      continue
      
    # Calculate the absolute difference between the current frame
    # and the first frame
    absolute_difference = cv2.absdiff(first_frame, gray)

    # If a pixel is less than ##, it is considered black (background). 
    # Otherwise, it is white (foreground). 255 is upper limit.
    # Modify the number after absolute_difference as you see fit.
    _, absolute_difference = cv2.threshold(absolute_difference, 50, 255, cv2.THRESH_BINARY)

    # Find the contours of the object inside the binary image
    contours, hierarchy = cv2.findContours(absolute_difference,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)[-2:]
    areas = [cv2.contourArea(c) for c in contours]
 
    # If there are no contours
    if len(areas) < 1:
 
      # Display the resulting frame
      cv2.imshow('Frame',image)
 
      # Wait for keyPress for 1 millisecond
      key = cv2.waitKey(1) & 0xFF
 
      # Clear the stream in preparation for the next frame
      raw_capture.truncate(0)
    
      # If "q" is pressed on the keyboard, 
      # exit this loop
      if key == ord("q"):
        break
    
      # Go to the top of the for loop
      continue
 
    else:
        
      # Find the largest moving object in the image
      max_index = np.argmax(areas)
      
    # Draw the bounding box
    cnt = contours[max_index]
    x,y,w,h = cv2.boundingRect(cnt)
    cv2.rectangle(image,(x,y),(x+w,y+h),(0,255,0),3)
 
    # Draw circle in the center of the bounding box
    x2 = x + int(w/2)
    y2 = y + int(h/2)
    cv2.circle(image,(x2,y2),4,(0,255,0),-1)
	
    # Calculate the center of the bounding box in centimeter coordinates
    # instead of pixel coordinates
    x2_cm = x2 * CM_TO_PIXEL
    y2_cm = y2 * CM_TO_PIXEL
 
    # Print the centroid coordinates (we'll use the center of the
    # bounding box) on the image
    text = "x: " + str(x2_cm) + ", y: " + str(y2_cm)
    cv2.putText(image, text, (x2 - 10, y2 - 10),
      cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
         
    # Display the resulting frame
    cv2.imshow("Frame",image)
    
    # Wait for keyPress for 1 millisecond
    key = cv2.waitKey(1) & 0xFF
 
    # Clear the stream in preparation for the next frame
    raw_capture.truncate(0)
    
    # If "q" is pressed on the keyboard, 
    # exit this loop
    if key == ord("q"):
      break

# Close down windows
cv2.destroyAllWindows()

Now, grab an object. I’ll grab a quarter and place it in the field of view.

10-quarter-in-field-of-viewJPG

You can see the position of the quarter in the camera reference frame is:

  • xc in centimeters = 19.5 cm
  • yc in centimeters = 13.75 cm
  • zc in centimeters = 0.0 cm

Finding the Homogeneous Transformation Matrix

This is cool, but what I really want to know are the coordinates of the quarter relative to the base frame of my two degree of freedom robotic arm. 

If I know the coordinates of the object relative to the base frame of my two degree of freedom robotic arm, I can then use inverse kinematics to command the robot to move the end effector to that location.

A tool that can help us do that is known as the homogeneous transformation matrix.

1-takes-the-following-formJPG

A homogeneous transformation matrix is a 4×4 matrix (i.e. 4 rows and 4 columns) that enables us to find the position of a point in reference frame m (e.g. the robotic arm base frame) given the position of a point in reference frame n (e.g. the camera reference frame).

In other words:

11-in-other-wordsJPG

More specifically:

12-more-specificallyJPG

Which is the same thing as…

13-matrixJPG

So you can see that if we know the vector:

14-we-know-vectorJPG

If we can then determine the 4×4 matrix…

15-determine-4by4JPG

…and multiply the two terms together, we can calculate…

16-multiply-two-termsJPG

the position of the object in the robotic arm base reference frame.

homgen_0_c has two pieces we need to find: the rotation matrix portion (i.e. the 3×3 matrix on the left), and the displacement vector portion (the rightmost column).

17-homgen-0-cJPG
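
In equation form (a reconstruction of the relationship the images above are showing; it matches the homgen_0_c @ frame_coordinates line in the code later in this tutorial):

\begin{bmatrix} x_0 \\ y_0 \\ z_0 \\ 1 \end{bmatrix} = H_{0c} \begin{bmatrix} x_c \\ y_c \\ z_c \\ 1 \end{bmatrix}, \qquad H_{0c} = \begin{bmatrix} R_{0c} & d_{0c} \\ 0_{1 \times 3} & 1 \end{bmatrix}

where R_{0c} is the 3×3 rotation matrix, d_{0c} is the 3×1 displacement vector, and H_{0c} converts a point expressed in the camera frame (c) into the same point expressed in the robot base frame (0).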

Calculate the Rotation Matrix That Converts the Camera Reference Frame to the Base Frame

Let’s start by finding the rotation matrix portion. I’m going to draw the camera and robotic base reference frames below. We first need to look at how we can rotate the base frame to match up with the camera frame of the robotic arm.

18-axesJPG

Which way is z0 pointing? Using the right hand rule, take your four fingers and point them towards the x0 axis. Your palm faces towards y0. Your thumb points towards z0, which is upwards out of the page (or upwards out of the dry erase board). I’ll mark this with a dark circle.

19-axesJPG

Which way is zc pointing? Using the right hand rule, take your four fingers and point them towards the xc axis. Your palm faces towards yc. Your thumb points towards zc, which is downwards into the page (or downward into the dry erase board). I’m marking this with an X.

20-axesJPG

Now, stick your thumb in the direction of x0. Your thumb points along the axis of rotation, while your four fingers indicate the direction of positive rotation. You can see that we would need to rotate frame 0 by +180 degrees around the x0 axis in order to get frame 0 to match up with frame c (the camera reference frame).

The standard rotation matrix for rotation around the x-axis is…

21-standard-x-axisJPG

Substituting 180 degrees for α, we get:

22-substitute-180JPG
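
For reference, the standard rotation matrix about the x axis, and its value after substituting α = 180° (cos 180° = -1, sin 180° = 0), is:

R_x(\alpha) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{bmatrix} \quad\Rightarrow\quad R_x(180^\circ) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix}

This is the same matrix the rot_mat_0_c code below evaluates to.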

Calculate the Displacement Vector

Now that we have the rotation matrix portion of the homogeneous transformation matrix, we need to calculate the displacement vector from the origin of frame 0 to the origin of frame c.

23-axesJPG

Displacement Along x0

Grab your ruler and measure the distance from the origin of frame 0 to the origin of frame c along the x0 axis. 

I measured -17.8 cm (because the displacement is in the negative x0 direction).

displacement-along-x0

Displacement Along y0

Grab your ruler and measure the distance from the origin of frame 0 to the origin of frame c along the y0 axis. 

I measured 23.0 cm.

displacement-along-y0
displacement-along-y02

Displacement Along z0

Both reference frames are in the same z plane, so there is no displacement along z0 from frame 0 to frame c.

Therefore, we have 0.0 cm.

The full displacement vector is:

24-displacementJPG

Putting It All Together

Now that we have our rotation matrix…

25-putting-all-togetherJPG

and our displacement vector…

26-displacementJPG

we put them both together to get the following homogeneous transformation matrix.

27-cam-ref-frameJPG

We can now convert a point in the camera reference frame to a point in the base frame of the robotic arm.
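
Here is a quick standalone Python sketch of that conversion (not the ROS2 code from earlier, just a check). It uses the 180-degree rotation and the displacements measured above (-17.8 cm, 23.0 cm, 0.0 cm), and the quarter’s camera-frame coordinates from earlier (19.5 cm, 13.75 cm):

import numpy as np

# Rotation from the robot base frame (frame 0) to the camera frame (frame c):
# 180 degrees about the x axis
angle = np.deg2rad(180)
rot_mat_0_c = np.array([[1, 0, 0],
                        [0, np.cos(angle), -np.sin(angle)],
                        [0, np.sin(angle), np.cos(angle)]])

# Displacement from frame 0 to frame c, in centimeters (measured above)
disp_vec_0_c = np.array([[-17.8],
                         [23.0],
                         [0.0]])

# Assemble the 4x4 homogeneous transformation matrix
homgen_0_c = np.vstack((np.hstack((rot_mat_0_c, disp_vec_0_c)),
                        np.array([[0, 0, 0, 1]])))

# Position of the quarter in the camera frame, in centimeters (measured earlier)
cam_ref_coord = np.array([[19.5], [13.75], [0.0], [1]])

# Position of the quarter in the robot base frame
coord_base_frame = homgen_0_c @ cam_ref_coord
print(coord_base_frame[0][0], coord_base_frame[1][0]) # roughly 1.7, 9.25

This is the same computation the display_base_frame_coordinates.py program below performs on every video frame (there, the y displacement is later adjusted to 24.4 cm).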

Print the Base Frame Coordinates of an Object

Now, let’s modify our absolute_difference_method_cm.py program so that we display an object’s x and y position in base frame coordinates rather than camera frame coordinates. I remeasured the width of the field of view in centimeters and calculated it as 36.0 cm.

I’ll name the program display_base_frame_coordinates.py.

Here is the code:

# Author: Addison Sears-Collins
# Description: This algorithm detects objects in a video stream
#   using the Absolute Difference Method. The idea behind this 
#   algorithm is that we first take a snapshot of the background.
#   We then identify changes by taking the absolute difference 
#   between the current video frame and that original 
#   snapshot of the background (i.e. first frame).
#   The coordinates of an object are displayed relative to
#   the base frame of a two degree of freedom robotic arm.

# import the necessary packages
from picamera.array import PiRGBArray # Generates a 3D RGB array
from picamera import PiCamera # Provides a Python interface for the RPi Camera Module
import time # Provides time-related functions
import cv2 # OpenCV library
import numpy as np # Import NumPy library

# Initialize the camera
camera = PiCamera()

# Set the camera resolution
camera.resolution = (640, 480)

# Set the number of frames per second
camera.framerate = 30

# Generates a 3D RGB array and stores it in rawCapture
raw_capture = PiRGBArray(camera, size=(640, 480))

# Wait a certain number of seconds to allow the camera time to warmup
time.sleep(0.1)

# Initialize the first frame of the video stream
first_frame = None

# Create kernel for morphological operation. You can tweak
# the dimensions of the kernel.
# e.g. instead of 20, 20, you can try 30, 30
kernel = np.ones((20,20),np.uint8)

# Centimeter to pixel conversion factor
# I measured 36.0 cm across the width of the field of view of the camera.
CM_TO_PIXEL = 36.0 / 640

# Define the rotation matrix from the robotic base frame (frame 0)
# to the camera frame (frame c).
rot_angle = 180 # angle between axes in degrees
rot_angle = np.deg2rad(rot_angle)
rot_mat_0_c = np.array([[1, 0, 0],
                        [0, np.cos(rot_angle), -np.sin(rot_angle)],
                        [0, np.sin(rot_angle), np.cos(rot_angle)]])

# Define the displacement vector from frame 0 to frame c
disp_vec_0_c = np.array([[-17.8],
                         [24.4], # This was originally 23.0 but I modified it for accuracy
                         [0.0]])

# Row vector for bottom of homogeneous transformation matrix
extra_row_homgen = np.array([[0, 0, 0, 1]])

# Create the homogeneous transformation matrix from frame 0 to frame c
homgen_0_c = np.concatenate((rot_mat_0_c, disp_vec_0_c), axis=1) # side by side
homgen_0_c = np.concatenate((homgen_0_c, extra_row_homgen), axis=0) # one above the other

# Initialize coordinates in the robotic base frame
coord_base_frame = np.array([[0.0],
                             [0.0],
                             [0.0],
                             [1]])

# Capture frames continuously from the camera
for frame in camera.capture_continuous(raw_capture, format="bgr", use_video_port=True):
    
    # Grab the raw NumPy array representing the image
    image = frame.array

    # Convert the image to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    
    # Close gaps using closing
    gray = cv2.morphologyEx(gray,cv2.MORPH_CLOSE,kernel)
      
    # Remove salt and pepper noise with a median filter
    gray = cv2.medianBlur(gray,5)
    
    # If first frame, we need to initialize it.
    if first_frame is None:
        
      first_frame = gray
      
      # Clear the stream in preparation for the next frame
      raw_capture.truncate(0)
      
      # Go to top of for loop
      continue
      
    # Calculate the absolute difference between the current frame
    # and the first frame
    absolute_difference = cv2.absdiff(first_frame, gray)

    # If a pixel is less than ##, it is considered black (background). 
    # Otherwise, it is white (foreground). 255 is upper limit.
    # Modify the number after absolute_difference as you see fit.
    _, absolute_difference = cv2.threshold(absolute_difference, 95, 255, cv2.THRESH_BINARY)

    # Find the contours of the object inside the binary image
    contours, hierarchy = cv2.findContours(absolute_difference,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)[-2:]
    areas = [cv2.contourArea(c) for c in contours]
 
    # If there are no contours
    if len(areas) < 1:
 
      # Display the resulting frame
      cv2.imshow('Frame',image)
 
      # Wait for keyPress for 1 millisecond
      key = cv2.waitKey(1) & 0xFF
 
      # Clear the stream in preparation for the next frame
      raw_capture.truncate(0)
    
      # If "q" is pressed on the keyboard, 
      # exit this loop
      if key == ord("q"):
        break
    
      # Go to the top of the for loop
      continue
 
    else:
        
      # Find the largest moving object in the image
      max_index = np.argmax(areas)
      
    # Draw the bounding box
    cnt = contours[max_index]
    x,y,w,h = cv2.boundingRect(cnt)
    cv2.rectangle(image,(x,y),(x+w,y+h),(0,255,0),3)
 
    # Draw circle in the center of the bounding box
    x2 = x + int(w/2)
    y2 = y + int(h/2)
    cv2.circle(image,(x2,y2),4,(0,255,0),-1)
    
    # Calculate the center of the bounding box in centimeter coordinates
    # instead of pixel coordinates
    x2_cm = x2 * CM_TO_PIXEL
    y2_cm = y2 * CM_TO_PIXEL
    
    # Coordinates of the object in the camera reference frame
    cam_ref_coord = np.array([[x2_cm],
                              [y2_cm],
                              [0.0],
                              [1]])
    
    # Coordinates of the object in base reference frame
    coord_base_frame = homgen_0_c @ cam_ref_coord
 
    # Print the centroid coordinates (we'll use the center of the
    # bounding box) on the image
    text = "x: " + str(coord_base_frame[0][0]) + ", y: " + str(coord_base_frame[1][0])
    cv2.putText(image, text, (x2 - 10, y2 - 10),
      cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
         
    # Display the resulting frame
    cv2.imshow("Frame",image)
    
    # Wait for keyPress for 1 millisecond
    key = cv2.waitKey(1) & 0xFF
 
    # Clear the stream in preparation for the next frame
    raw_capture.truncate(0)
    
    # If "q" is pressed on the keyboard, 
    # exit this loop
    if key == ord("q"):
      break

# Close down windows
cv2.destroyAllWindows()

Run the code.

Then place a quarter on the dry erase board. You should see that the x and y coordinates of the quarter relative to the base frame are printed to the screen.

I am placing the quarter at x = 4 cm and y = 9 cm (in the robot base frame). The text on the screen should now display the coordinates of the quarter in base frame coordinates.

quarter-base-frame-coordinates
quarter-base-frame-coordinates-2

The first time I ran this code, my x coordinate was accurate, but my y coordinate was not. I modified the y displacement value until I got something in the ballpark of x = 4 and y = 9 for the coordinates printed to the screen. In the end, my y displacement value was 24.4 cm instead of the 23.0 cm I had originally measured.

35-make-modifications-to-displacement-vectorJPG

Thus, here was my final equation for converting camera coordinates to robotic base frame coordinates:

28-all-togetherJPG-1
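
Written out in matrix form (matching the values used in display_base_frame_coordinates.py above):

\begin{bmatrix} x_0 \\ y_0 \\ z_0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & -17.8 \\ 0 & -1 & 0 & 24.4 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_c \\ y_c \\ z_c \\ 1 \end{bmatrix}

so x0 = xc - 17.8, y0 = 24.4 - yc, and z0 = -zc.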

Where z0 = zc = 0 cm.

Pretty cool!

Case 2: Camera Lens Is Not Parallel To the Surface

Up until now, we have assumed that the camera lens is always parallel to the underlying surface like this.

29-camera-lens-not-parallelJPG

What happens if it isn’t, and instead the camera lens is looking down on the surface at some angle? You can imagine that in the real world, this situation is quite common. 

36-tilt-looking-at-angle

Consider this humanoid robot, for example. It might have some sort of camera inside its helmet. Any surface in front of its body will likely be viewed at an angle.

robonaut

Also, in real-world use cases, a physical grid is usually not present. Instead, the structured light technique is used to project a known grid pattern on a surface in order to help the robot determine the coordinates of an object.

You can see an example of structured light developed at the NASA Jet Propulsion Laboratory on the left side of this image.

structured-light-nasa-jpl

A robot that looks at structured light would also see it from an angle. So, what changes in this case?

Previously in this tutorial, we assumed that the pixel-to-centimeter conversion factor will be the same in the x and y directions. This assumption only holds true, however, if the camera lens is directly overhead and parallel to the grid surface. This all changes when the camera is looking at a surface from an angle.

One way we can solve this problem is to derive (based on observation) a function that takes as input the pixel coordinates of an object (as seen by the camera) and outputs the (x0,y0) real-world coordinates in centimeters (i.e. the robotic arm base frame).

If you look at the image below, you can see that the pixel-to-centimeter conversion factor in the y (and x) direction is larger at the bottom of the image than it is at the top. And since the squares on the dry-erase board look more like rectangles, you know that the pixel-to-centimeter conversion factor is different in the x (horizontal) and y (vertical) directions.

Example

Let’s do a real example, so you can see how to do this.

Create an Equation to Map Camera Pixel Y Coordinates to Global Reference Frame Y Coordinates in Centimeters

Open a snapshot from the tilted camera frame in a photo editing program like Microsoft Paint. My image is 640 pixels wide and 480 pixels in height. 

37-open-ms-paintJPG

You can see in the bottom-left of Microsoft Paint how the program spits out the pixel location of my cursor. Make sure that whatever photo editing program you’re using gives you the ability to see pixel locations.

38-pixel-locationJPG

Open a blank spreadsheet. I’m using Microsoft Excel.

Create four columns:

39-create-four-columnsJPG
  • x in Column A (Global Reference Frame coordinates in cm)
  • y in Column B (Global Reference Frame coordinates in cm)
  • x in Column C (Camera Pixel coordinates)
  • y in Column D (Camera Pixel coordinates)

We will start by making x=0 and then finding the corresponding y.

40-all-0sJPG

Let’s fill in y with some evenly spaced values.

41-fill-in-yJPG

Take your cursor and place it on top of each global reference frame coordinate, and record the corresponding pixel value.

42-fill-in-corresponding-coordinatesJPG

Now we want to find an equation that can convert camera pixel y coordinates to global reference frame y coordinates.

Plot the data. Here is how it looks.

44-polynomial-order-2

I will right-click on the data points and add a trendline. I will place a checkmark near the “Display Equation on chart” and “Display R-squared value” options. I want to have a good R-squared value. This value shows how well my trendline fits the data. The closer it is to 1, the better.

A Polynomial best-fit line of Order 2 gives me an R-squared value of around 0.9999, so let’s go with this equation.

(Global Reference Frame Y Coordinate in cm) = 0.00006*(Camera Pixel Y Coordinate)^2 - 0.0833*(Camera Pixel Y Coordinate) + 25.848

Now, let’s enter this formula into our spreadsheet.

43-excel-equation-bJPG
43-excel-equationJPG

I’m going to test the equation by choosing a random y pixel value and seeing what the y coordinate would be in centimeters.

Go back to MS Paint. I’m going to put the cursor over (x=0 cm, y=7 cm) and then record what the y pixel value is.

I get a y pixel value of 285 when I put my cursor over this location. Let’s see what the equation spits out:

44-what-equation-spits-outJPG

I calculated 6.981 cm. I expected to get a y value of 7 cm, so this is right around what I expected.

Create an Equation to Map Camera Pixel Y and X Coordinates to Global Reference Frame X Coordinates in Centimeters

Now let’s see how we can get the x position in centimeters. To do this, we need to do an interim step. Since the pixel-to-centimeter conversion factor in the x-direction varies along the y-axis, we need to find out the pixel-to-centimeter conversion factor in the x-direction for our original list of global reference frame y centimeter values.

Go to the spreadsheet and insert a column to the right of the Y values for the global reference frame. We will label this column Pixels per Centimeter.

45-pixels-per-centimeterJPG

Let’s start at y = 0 cm. The x pixel value there is 287. Now, we go 1 cm over to the right (i.e. each square is 1 cm in length) and find an x pixel value of 316. Therefore, for y = 0 cm, the pixel-to-centimeter conversion factor in the x direction is:

(316 pixels - 287 pixels) / 1 cm = 29 pixels/cm  

I’ll write that in the spreadsheet.

46-write-in-spreadsheetJPG

Now, do the same thing for all values of y in your spreadsheet.

Here is what my table now looks like:

47-pixel-to-cm-filled-outJPG

Now what we want to do is create an equation that takes as input the global reference frame y value in centimeters and outputs the pixel-to-centimeter conversion factor along the x-direction.

Here is the plot of the data:

48-pixel-to-cm-conversion-graph

I will add a polynomial trendline of Order 2 and the R-squared value to see how well this line fits the data.

49-polynomial-trendline-order-2

Our equation is:

Pixel/cm Conversion Factor in the x-direction = 0.0625 * (Global Reference Frame Y Coordinate in cm)^2 - 1.6393 * (Global Reference Frame Y Coordinate in cm) + 29.071

Let’s add this to our spreadsheet.

48-add-to-our-spreadsheetJPG
49-add-to-our-spreadsheet-2JPG-1

Now, to find the x position in centimeters of an object in the camera frame, we take the x position in pixels, subtract the x pixel coordinate of the centerline (≈ 292 pixels, which is the average of the highest and lowest x pixel coordinates of the centerline), and then divide by the pixel-to-centimeter conversion factor in the x-direction for the corresponding y position in centimeters. In other words:

Global Reference Frame X Coordinate in cm = ((Camera Pixel X Coordinate) - (X Pixel Coordinate of the Centerline)) / (Pixel/cm Conversion Factor in the x-direction)

I will add this formula to the spreadsheet.

50-x-pixel-coordinateJPG
51-x-cm-coordinateJPG

Putting It All Together

Let’s test our math out to see what we get.

Go over to MS Paint or whatever photo editor you’re using. Select a random point on the grid. I’ll select the point (x = 6 cm, y = 8 cm). MS Paint tells me that the pixel value at this location is: (x = 417 px, y = 265 px)

I’ll plug 417 and 265 into the spreadsheet as inputs.

52-coordinates-given-pixel-inputsJPG

You can see that our formula spit out:

  • Global Reference Frame X Coordinate in cm: 6.26
  • Global Reference Frame Y Coordinate in cm: 7.99

This result is pretty close to what we expected.
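
If you’d rather do this check in code than in a spreadsheet, here is a small Python sketch of the same math. The polynomial coefficients and the 292-pixel centerline are the values derived above; your own numbers will differ.

def pixel_y_to_cm(y_px):
  # Trendline: camera pixel y coordinate -> global reference frame y coordinate in cm
  return 0.00006 * y_px**2 - 0.0833 * y_px + 25.848

def pixels_per_cm_x(y_cm):
  # Trendline: global reference frame y in cm -> pixel/cm conversion factor in the x direction
  return 0.0625 * y_cm**2 - 1.6393 * y_cm + 29.071

def pixel_to_cm(x_px, y_px, centerline_x_px=292):
  # Convert a camera pixel coordinate to global reference frame coordinates in cm
  y_cm = pixel_y_to_cm(y_px)
  x_cm = (x_px - centerline_x_px) / pixels_per_cm_x(y_cm)
  return x_cm, y_cm

print(pixel_to_cm(417, 265)) # roughly (6.26, 7.99), matching the spreadsheet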

Congratulations! You now know what to do if you have a camera lens that is tilted relative to the plane of a gridded surface. No need to memorize this process. When you face this problem in the future, just come back to the steps I’ve outlined in this tutorial, and go through them one-by-one.

That’s it for this tutorial. Keep building!

References

Credit to Professor Angela Sodemann for teaching me this stuff. Dr. Sodemann is an excellent teacher (she runs a course on RoboGrok.com).