How to Simulate a Robot Using Gazebo and ROS 2

In this tutorial, we will create an autonomous mobile robot from scratch using Gazebo. We will build an environment for the robot to move around in, and we will integrate ROS 2 and Gazebo so that we can control the robot by sending it velocity commands. By the end of this tutorial, you will have a simulated robot that patrols a warehouse on its own.

The type of robot we will create is an autonomous differential drive mobile warehouse robot. We will build the entire SDF (Simulation Description Format) file from scratch. Our simulated robot will be similar to the one below, created by Fetch Robotics, a mobile robotics company based in Silicon Valley, California.

[Image: A mobile warehouse robot from Fetch Robotics. Credit: Fetch Robotics]

Real-World Applications

Mobile warehouse robots are used extensively in the industry. Amazon uses these robots to transport shelves around their fulfillment centers to make sure customers receive their goods as quickly as possible.

Roboticists like to simulate robots before building them in order to test out different algorithms. You can imagine that the cost of making mistakes with a physical robot can be high (e.g., crashing a mobile robot into a wall at high speed means lost money).

Prerequisites

Build the Warehouse Robot

Create Model.config

Create a folder for the model.

mkdir -p ~/.gazebo/models/mobile_warehouse_robot

Create a model config file. This file will contain a description of the model.

gedit ~/.gazebo/models/mobile_warehouse_robot/model.config

Add your description to this model.config file. The file contains fields for the name of the robot, the version, the author (that’s you), your e-mail address, and a description of the robot.

[Image: The contents of the model.config file]

Save the file, and then close it.

Download the Mesh Files

Mesh files help make your robots look more realistic than just using basic shapes.

Download the warehouse robot mesh file, which is hosted at models.gazebosim.org. Open a new terminal window, and type the following commands:

cd ~/.gazebo/models
wget -q -R *index.html*,*.tar.gz --no-parent -r -x -nH http://models.gazebosim.org/warehouse_robot/

You can see on the website that the robot mesh file is named robot.dae. You will see this filename later in our SDF file.

[Image: The robot.dae mesh file listed on the model website]

Now download the Hokuyo Laser Range Finder mesh file.

cd ~/.gazebo/models
wget -q -R *index.html*,*.tar.gz --no-parent -r -x -nH http://models.gazebosim.org/hokuyo/

You can see that the dae file we need for the mesh is hokuyo.dae. You will see this filename inside the sdf file we’ll create in the next section.

[Image: The hokuyo.dae mesh file listed on the model website]

Create Model.sdf

Now, let’s create an SDF (Simulation Description Format) file. This file will contain the tags that are needed to create an instance of the mobile_warehouse_robot model. Our robot will have three wheels: two wheels in the front and a caster wheel in the back. On top of the robot, we will mount the laser range finder, which will sweep its beams across a 180-degree arc in order to detect obstacles.

gedit ~/.gazebo/models/mobile_warehouse_robot/model.sdf

Here is my sdf file. You can copy and paste those lines inside your sdf file.

Save the file and close it to return to the terminal.

Test Your Robot

Now let’s run Gazebo so that we can see our model. Type the following command:

gazebo

On the left-hand side, click the “Insert” tab.

On the left panel, click “Mobile Warehouse Robot”. You should see a warehouse robot. You can place it wherever you want by clicking inside the environment.

Click on your model to select it.

[Image: The Mobile Warehouse Robot model inserted into the Gazebo environment]

Go back to the terminal window, and type CTRL + C to close Gazebo.

Integrate ROS 2 and Gazebo

Install gazebo_ros_pkgs

Open a new terminal window, and install the packages that will enable you to use ROS 2 to interface with Gazebo. This installs a number of plugins, including the differential drive plugin that will enable us to control the speed of our robot using ROS 2 commands.

sudo apt install ros-foxy-gazebo-ros-pkgs

Test Your ROS 2 and Gazebo Integration

Open a new terminal window.

Install some more tools.

sudo apt install ros-foxy-ros-core ros-foxy-geometry2

Open a new terminal window.

Load a demo robot. This is all one command.

gazebo --verbose /opt/ros/foxy/share/gazebo_plugins/worlds/gazebo_ros_diff_drive_demo.world
[Image: The differential drive demo vehicle in Gazebo]

Let’s see what commands are available to us. Open up the sdf file.

gedit /opt/ros/foxy/share/gazebo_plugins/worlds/gazebo_ros_diff_drive_demo.world
[Image: The available ROS 2 command topics listed inside the world file]

Open up a new terminal window, and type the following command to make the robot move forward at a speed of 1.0 meters per second:

ros2 topic pub /demo/cmd_demo geometry_msgs/Twist '{linear: {x: 1.0}}' -1

The robot will begin moving forward.

Close all the active programs by pressing CTRL + C in all terminals.

Now launch Gazebo again.

gazebo

Insert your “Mobile Warehouse Robot” model into the environment.

In the terminal window, you should see the following output from ROS 2.

[Image: ROS 2 output in the terminal after inserting the robot]

Let’s see the active topics.

ros2 topic list -t
[Image: The list of active ROS 2 topics]

The /demo/cmd_vel topic is where you can give your robot velocity commands. Let’s make the robot drive forward at 0.05 meters per second.

ros2 topic pub /demo/cmd_vel geometry_msgs/Twist '{linear: {x: 0.05}}' -1

You will see the robot bob up and down for a bit and then stabilize as it moves forward. The bobbing action is due to the uneven weight caused by the beacon and the laser range finder on top of the robot. 

Feel free to tweak the parameters (radii, positioning of the wheels, etc.) to see if you get smoother performance.
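If you would rather send velocity commands from a Python node instead of the command line, here is a minimal rclpy sketch (this is not part of the original tutorial; it assumes the /demo/cmd_vel topic defined in the robot’s SDF file):

# Minimal sketch: publish a forward-velocity command to /demo/cmd_vel.
# Assumes the differential drive plugin in the SDF is listening on that topic.
import rclpy
from geometry_msgs.msg import Twist

def main():
    rclpy.init()
    node = rclpy.create_node('simple_velocity_commander')
    publisher = node.create_publisher(Twist, '/demo/cmd_vel', 10)

    msg = Twist()
    msg.linear.x = 0.05  # forward speed in meters per second

    # Publish at roughly 10 Hz for two seconds so the command is not missed
    for _ in range(20):
        publisher.publish(msg)
        rclpy.spin_once(node, timeout_sec=0.1)

    node.destroy_node()
    rclpy.shutdown()

if __name__ == '__main__':
    main()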

Build a Warehouse

Now we are going to build a warehouse for our robot to move around in.

Create Model.config

Create a folder for the model.

mkdir -p ~/.gazebo/models/small_warehouse

Create a model config file. This file will contain a description of the model.

gedit ~/.gazebo/models/small_warehouse/model.config

Add these lines to the file. This file contains fields for the name of the model, the version, the author (that’s you), your email address, and a description of the model.

Save the file, and then close it.

Create Model.sdf

Now, let’s create an SDF (Simulation Description Format) file. This file will contain the tags that are needed to create an instance of the small_warehouse model. 

gedit ~/.gazebo/models/small_warehouse/model.sdf

Write these lines inside the sdf file. You can see how many lines of code there are. It takes a lot of code to create even the simplest of models in Gazebo.

Save the file and close it to return to the terminal.

Note that you can also create your own warehouse using Gazebo’s drag-and-drop interface. Gazebo enables you to add pieces to the world and then save the world as an sdf file.

Test Your Warehouse

Now let’s run Gazebo so that we can see our model. Type the following command:

gazebo

On the left-hand side, click the “Insert” tab.

On the left panel, click “Small Warehouse”. You should see a small warehouse appear. You can place it wherever you want by clicking inside the environment.

Click on your model to select it. I have also added our mobile warehouse robot to the scene.

[Image: The mobile warehouse robot inside the small warehouse in Gazebo]

Go back to the terminal window, and type CTRL + C to close Gazebo.

Launch Your Robot and Warehouse Using ROS 2

Now that we have created our robot and our warehouse, let’s see how we can launch these pieces using ROS 2.

Create a Package

Let’s create a package.

Open a new terminal window, and navigate to the src directory of your workspace (the name of my workspace is dev_ws, but your workspace might have a different name):

cd ~/dev_ws/src

Now let’s create a package named warehouse_robot_spawner_pkg.

ros2 pkg create --build-type ament_python warehouse_robot_spawner_pkg

Package Setup

Now that we have our package, we need to make sure some key files are in there.

Move inside the folder, and see what you have so far.

cd ~/dev_ws/src/warehouse_robot_spawner_pkg
dir
[Image: The contents of the warehouse_robot_spawner_pkg folder]

Open your package.xml file.

gedit package.xml

Fill in the description, your email address, and any license you want to use. Then save and close the file.

[Image: The package.xml file]

Now, open your setup.py file so that you can configure how the package is built and installed:

gedit setup.py
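The exact setup.py used in the original tutorial is not reproduced here, but a minimal sketch of what it needs looks like the following. The data_files entries install the launch, worlds, and models folders into the package’s share directory (so the launch file and Gazebo can find them after the build), and the console_scripts entry point registers the spawn_demo executable that the launch file below refers to. The package_files helper is just an illustrative name.

import os
from glob import glob
from setuptools import setup

package_name = 'warehouse_robot_spawner_pkg'

# Recursively collect every file under a directory so that colcon
# installs it into this package's share directory.
def package_files(directory):
    paths = []
    for (path, _, filenames) in os.walk(directory):
        for filename in filenames:
            paths.append((
                os.path.join('share', package_name, path),
                [os.path.join(path, filename)]))
    return paths

data_files = [
    ('share/ament_index/resource_index/packages', ['resource/' + package_name]),
    ('share/' + package_name, ['package.xml']),
    (os.path.join('share', package_name, 'launch'), glob('launch/*.launch.py')),
]
data_files += package_files('models')
data_files += package_files('worlds')

setup(
    name=package_name,
    version='0.0.0',
    packages=[package_name],
    data_files=data_files,
    install_requires=['setuptools'],
    zip_safe=True,
    maintainer='your_name',
    maintainer_email='your_email@example.com',
    description='Spawns the warehouse robot and the warehouse world in Gazebo',
    license='Apache License 2.0',
    entry_points={
        'console_scripts': [
            'spawn_demo = warehouse_robot_spawner_pkg.spawn_demo:main',
        ],
    },
)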

Save the file, and close it.

Now, we need to add our models. Create a new folder inside the ~/dev_ws/src/warehouse_robot_spawner_pkg directory named models.

mkdir ~/dev_ws/src/warehouse_robot_spawner_pkg/models

Go to your Gazebo models.

cd ~/.gazebo/models

Copy your models over to your package.

cp -r small_warehouse ~/dev_ws/src/warehouse_robot_spawner_pkg/models
cp -r mobile_warehouse_robot ~/dev_ws/src/warehouse_robot_spawner_pkg/models

Move the meshes over as well.

cp -r hokuyo ~/dev_ws/src/warehouse_robot_spawner_pkg/models
cp -r warehouse_robot ~/dev_ws/src/warehouse_robot_spawner_pkg/models

Check that everything copied over correctly.

cd ~/dev_ws/src/warehouse_robot_spawner_pkg/models
dir
[Image: The models folder after copying the models over]

Now let’s create a folder named worlds. We want to create an sdf file in here that handles the generation of the Gazebo environment and the warehouse.

mkdir ~/dev_ws/src/warehouse_robot_spawner_pkg/worlds

Move inside this folder.

cd ~/dev_ws/src/warehouse_robot_spawner_pkg/worlds

Open up a new file called warehouse.world. Even though this file has the .world extension, it is still an sdf file.

gedit warehouse.world

Add this code inside the file, click Save, then close it.

Create a Node

Now we need to create a ROS 2 node that will spawn the warehouse robot and connect it to ROS 2.

Move inside the warehouse_robot_spawner_pkg folder.

cd ~/dev_ws/src/warehouse_robot_spawner_pkg/warehouse_robot_spawner_pkg/

Open a new Python file named spawn_demo.py.

gedit spawn_demo.py

Write the code for the spawner node inside this file.
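The original spawn_demo.py is not reproduced here, but a minimal sketch of such a node is shown below. It reads the robot’s model.sdf from the package’s share directory and calls Gazebo’s /spawn_entity service (provided by the libgazebo_ros_factory.so plugin that the launch file loads) with the entity name, namespace, and x, y, z position that the launch file passes in as command-line arguments.

"""
Minimal sketch of a spawn node (a sketch, not the original tutorial script).
"""
import os
import sys

import rclpy
from ament_index_python.packages import get_package_share_directory
from gazebo_msgs.srv import SpawnEntity


def main():
    # Command-line arguments come from the launch file:
    # [entity_name, namespace, x, y, z]
    argv = sys.argv[1:]
    entity_name = argv[0]
    namespace = argv[1]
    x, y, z = float(argv[2]), float(argv[3]), float(argv[4])

    rclpy.init()
    node = rclpy.create_node('entity_spawner')

    # Path to the SDF file that was installed with this package
    sdf_file_path = os.path.join(
        get_package_share_directory('warehouse_robot_spawner_pkg'),
        'models', 'mobile_warehouse_robot', 'model.sdf')

    # Wait for Gazebo's spawn service to become available
    client = node.create_client(SpawnEntity, '/spawn_entity')
    while not client.wait_for_service(timeout_sec=1.0):
        node.get_logger().info('Waiting for /spawn_entity service...')

    # Fill in the service request
    request = SpawnEntity.Request()
    request.name = entity_name
    request.xml = open(sdf_file_path, 'r').read()
    request.robot_namespace = namespace
    request.initial_pose.position.x = x
    request.initial_pose.position.y = y
    request.initial_pose.position.z = z

    # Call the service and wait for the result
    future = client.call_async(request)
    rclpy.spin_until_future_complete(node, future)
    if future.result() is not None:
        node.get_logger().info('Entity spawned successfully.')
    else:
        node.get_logger().error('Spawn service call failed.')

    node.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
    main()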

Save and close the Python program.

Create a Launch File

Now let’s create a launch file.

mkdir ~/dev_ws/src/warehouse_robot_spawner_pkg/launch/
cd ~/dev_ws/src/warehouse_robot_spawner_pkg/launch/

Create a file named gazebo_world.launch.py

gedit gazebo_world.launch.py

Add this code.

# Copyright 2019 Open Source Robotics Foundation, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""
Demo for spawn_entity.
Launches Gazebo and spawns a model
"""
# A bunch of software packages that are needed to launch ROS2
import os
from launch import LaunchDescription
from launch.actions import IncludeLaunchDescription
from launch.launch_description_sources import PythonLaunchDescriptionSource
from launch.substitutions import ThisLaunchFileDir,LaunchConfiguration
from launch_ros.actions import Node
from launch.actions import ExecuteProcess
from ament_index_python.packages import get_package_share_directory

def generate_launch_description():
    use_sim_time = LaunchConfiguration('use_sim_time', default='True')
    world_file_name = 'warehouse.world'
    pkg_dir = get_package_share_directory('warehouse_robot_spawner_pkg')

    os.environ["GAZEBO_MODEL_PATH"] = os.path.join(pkg_dir, 'models')

    world = os.path.join(pkg_dir, 'worlds', world_file_name)
    launch_file_dir = os.path.join(pkg_dir, 'launch')

    gazebo = ExecuteProcess(
            cmd=['gazebo', '--verbose', world, '-s', 'libgazebo_ros_init.so', 
            '-s', 'libgazebo_ros_factory.so'],
            output='screen')

    #GAZEBO_MODEL_PATH has to be correctly set for Gazebo to be able to find the model
    #spawn_entity = Node(package='gazebo_ros', node_executable='spawn_entity.py',
    #                    arguments=['-entity', 'demo', 'x', 'y', 'z'],
    #                    output='screen')
    spawn_entity = Node(package='warehouse_robot_spawner_pkg', executable='spawn_demo',
                        arguments=['WarehouseBot', 'demo', '-1.5', '-4.0', '0.0'],
                        output='screen')

    return LaunchDescription([
        gazebo,
        spawn_entity,
    ])

Save the file and close it.

Build the Package

Return to the root of your workspace:

cd ~/dev_ws/

Make sure the setuptools package is installed.

sudo pip3 install setuptools

or 

sudo pip install setuptools

Build the package.

colcon build --packages-select warehouse_robot_spawner_pkg

Launch

Ok, now we are ready to launch.

Open a new terminal window.

Move to the root of the workspace.

cd ~/dev_ws/

Type the following command:

ros2 launch warehouse_robot_spawner_pkg gazebo_world.launch.py
[Image: The warehouse world launched in Gazebo]

Here is the console output.

[Image: The console output]

Let’s see what topics are active. Open a new terminal, and type:

ros2 topic list -t
[Image: The list of active ROS 2 topics]

Move the Robot Around the Warehouse

Now that we know how to spawn the robot and the world, let’s see how we can make the robot move using ROS 2.

I will make the mobile robot patrol the warehouse. It will use its laser range finder to follow walls.

Create a Package

Let’s create a package.

Open a new terminal window, and navigate to the src directory of your workspace:

cd ~/dev_ws/src

Now let’s create a package named warehouse_robot_controller_pkg.

ros2 pkg create --build-type ament_python warehouse_robot_controller_pkg

Package Setup

Now that we have our package, we need to make sure some important files are in there.

Move inside the folder, and see what you have so far.

cd ~/dev_ws/src/warehouse_robot_controller_pkg
dir

Open your package.xml file.

gedit package.xml
[Image: The package.xml file]

Fill in the description, your email address, and any license you want to use. Then save and close the file.

Now, open your setup.py file, and fill it in for this package.

gedit setup.py
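The file is not reproduced here, but a minimal sketch is shown below. The important part is the console_scripts section: it must register the robot_controller and robot_estimator executables that the launch file later in this section refers to.

import os
from glob import glob
from setuptools import setup

package_name = 'warehouse_robot_controller_pkg'

setup(
    name=package_name,
    version='0.0.0',
    packages=[package_name],
    data_files=[
        ('share/ament_index/resource_index/packages', ['resource/' + package_name]),
        ('share/' + package_name, ['package.xml']),
        (os.path.join('share', package_name, 'launch'), glob('launch/*.launch.py')),
    ],
    install_requires=['setuptools'],
    zip_safe=True,
    maintainer='your_name',
    maintainer_email='your_email@example.com',
    description='Wall-following controller and state estimator for the warehouse robot',
    license='Apache License 2.0',
    entry_points={
        'console_scripts': [
            # These names must match the executables used in the launch file
            'robot_controller = warehouse_robot_controller_pkg.robot_controller:main',
            'robot_estimator = warehouse_robot_controller_pkg.robot_estimator:main',
        ],
    },
)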

Save the file, and close it.

Create an Estimator and a Controller Node

Now we need to create two ROS 2 nodes. One node will be responsible for estimating the current state of the robot in the world (i.e. position and orientation). The other node will send velocity commands to the robot. 

Move inside the warehouse_robot_controller_pkg folder.

cd ~/dev_ws/src/warehouse_robot_controller_pkg/warehouse_robot_controller_pkg/

Open a new Python file named robot_estimator.py. Don’t be intimidated by how much code there is. Just take it one section at a time. I included a lot of comments so that you know what is going on.

gedit robot_estimator.py

Write the following code inside the file.

# Author: Addison Sears-Collins
# Date: March 19, 2021
# ROS Version: ROS 2 Foxy Fitzroy

# Python math library
import math

# ROS client library for Python
import rclpy

# Used to create nodes
from rclpy.node import Node

# Twist is linear and angular velocity
from geometry_msgs.msg import Twist 

# Position, orientation, linear velocity, angular velocity
from nav_msgs.msg import Odometry

# Handles laser distance scan to detect obstacles
from sensor_msgs.msg import LaserScan

# Used for laser scan
from rclpy.qos import qos_profile_sensor_data

# Enable use of std_msgs/Float64MultiArray message
from std_msgs.msg import Float64MultiArray 

# Scientific computing library for Python
import numpy as np

class Estimator(Node):
  """
  Class constructor to set up the node
  """
  def __init__(self):

    ############## INITIALIZE ROS PUBLISHERS AND SUBSCRIBERS ######
    super().__init__('Estimator')

    # Create a subscriber
    # This node subscribes to messages of type
    # nav_msgs/Odometry (i.e. position and orientation of the robot)
    self.odom_subscriber = self.create_subscription(
                           Odometry,
                           '/demo/odom',
                           self.odom_callback,
                           10)

    # Create a subscriber 
    # This node subscribes to messages of type 
    # geometry_msgs/Twist.msg. We are listening to the velocity commands here.
    # The maximum number of queued messages is 10.
    self.velocity_subscriber = self.create_subscription(
                               Twist,
                               '/demo/cmd_vel',
                               self.velocity_callback,
                               10)

    # Create a publisher
    # This node publishes the estimated position (x, y, yaw) 
    # The type of message is std_msgs/Float64MultiArray
    self.publisher_state_est = self.create_publisher(
                               Float64MultiArray, 
                               '/demo/state_est', 
                               10)

  def odom_callback(self, msg):
    """
    Receive the odometry information containing the position and orientation
    of the robot in the global reference frame. 
    The position is x, y, z.
    The orientation is an x, y, z, w quaternion.
    """
    roll, pitch, yaw = self.euler_from_quaternion(
      msg.pose.pose.orientation.x,
      msg.pose.pose.orientation.y,
      msg.pose.pose.orientation.z,
      msg.pose.pose.orientation.w)

    obs_state_vector_x_y_yaw = [msg.pose.pose.position.x,msg.pose.pose.position.y,yaw]

    # Publish the estimated state (x position, y position, yaw angle)
    self.publish_estimated_state(obs_state_vector_x_y_yaw)

  def publish_estimated_state(self, state_vector_x_y_yaw):
    """
    Publish the estimated pose (position and orientation) of the 
    robot to the '/demo/state_est' topic. 
    :param: state_vector_x_y_yaw [x, y, yaw] 
    x is in meters, y is in meters, yaw is in radians
    """
    msg = Float64MultiArray()
    msg.data = state_vector_x_y_yaw
    self.publisher_state_est.publish(msg)

  def euler_from_quaternion(self, x, y, z, w):
    """
    Convert a quaternion into euler angles (roll, pitch, yaw)
    roll is rotation around x in radians (counterclockwise)
    pitch is rotation around y in radians (counterclockwise)
    yaw is rotation around z in radians (counterclockwise)
    """
    t0 = +2.0 * (w * x + y * z)
    t1 = +1.0 - 2.0 * (x * x + y * y)
    roll_x = math.atan2(t0, t1)

    t2 = +2.0 * (w * y - z * x)
    t2 = +1.0 if t2 > +1.0 else t2
    t2 = -1.0 if t2 < -1.0 else t2
    pitch_y = math.asin(t2)

    t3 = +2.0 * (w * z + x * y)
    t4 = +1.0 - 2.0 * (y * y + z * z)
    yaw_z = math.atan2(t3, t4)

    return roll_x, pitch_y, yaw_z # in radians

  def velocity_callback(self, msg):
    """
    Listen to the velocity commands (linear forward velocity 
    in the x direction in the robot's reference frame and 
    angular velocity (yaw rate) around the robot's z-axis.
    [v,yaw_rate]
    [meters/second, radians/second]
    """
    # Forward velocity in the robot's reference frame
    v = msg.linear.x

    # Angular velocity around the robot's z axis
    yaw_rate = msg.angular.z

def main(args=None):
    """
    Entry point for the program.
    """
    # Initialize rclpy library
    rclpy.init(args=args)

    # Create the node
    estimator = Estimator()

    # Spin the node so the callback function is called.
    # Pull messages from any topics this node is subscribed to.
    # Publish any pending messages to the topics.
    rclpy.spin(estimator)

    # Destroy the node explicitly
    # (optional - otherwise it will be done automatically
    # when the garbage collector destroys the node object)
    estimator.destroy_node()
    
    # Shutdown the ROS client library for Python
    rclpy.shutdown()

if __name__ == '__main__':
    main()

Save and close the Python program.

Open a new Python file named robot_controller.py

gedit robot_controller.py

Write the following code inside the file. 

# Author: Addison Sears-Collins
# Date: March 19, 2021
# ROS Version: ROS 2 Foxy Fitzroy

############## IMPORT LIBRARIES #################
# Python math library
import math 

# ROS client library for Python
import rclpy 

# Enables pauses in the execution of code
from time import sleep 

# Used to create nodes
from rclpy.node import Node

# Enables the use of the string message type
from std_msgs.msg import String 

# Twist is linear and angular velocity
from geometry_msgs.msg import Twist 	
					
# Handles LaserScan messages to sense distance to obstacles (i.e. walls)      	
from sensor_msgs.msg import LaserScan	 

# Handle Pose messages
from geometry_msgs.msg import Pose 

# Handle float64 arrays
from std_msgs.msg import Float64MultiArray
					
# Handles quality of service for LaserScan data
from rclpy.qos import qos_profile_sensor_data 

# Scientific computing library
import numpy as np 

class Controller(Node):
  """
  Create a Controller class, which is a subclass of the Node 
  class for ROS2.
  """
  def __init__(self):
    """
    Class constructor to set up the node
    """
    ##################### ROS SETUP ####################################################
    # Initiate the Node class's constructor and give it a name
    super().__init__('Controller')

    # Create a subscriber
    # This node subscribes to messages of type Float64MultiArray  
    # over a topic named: /demo/state_est
    # The message represents the current estimated state:
    #   [x, y, yaw]
    # The callback function is called as soon as a message 
    # is received.
    # The maximum number of queued messages is 10.
    self.subscription = self.create_subscription(
                        Float64MultiArray,
                        '/demo/state_est',
                        self.state_estimate_callback,
                        10)
    self.subscription  # prevent unused variable warning

    # Create a subscriber
    # This node subscribes to messages of type 
    # sensor_msgs/LaserScan		
    self.scan_subscriber = self.create_subscription(
                           LaserScan,
                           '/demo/laser/out',
                           self.scan_callback,
                           qos_profile=qos_profile_sensor_data)
                           
    # Create a publisher
    # This node publishes the desired linear and angular velocity of the robot (in the
    # robot chassis coordinate frame) to the /demo/cmd_vel topic. Using the diff_drive
    # plugin enables the robot model to read this /demo/cmd_vel topic and execute
    # the motion accordingly.
    self.publisher_ = self.create_publisher(
                      Twist, 
                      '/demo/cmd_vel', 
                      10)

    # Initialize the LaserScan sensor readings to some large value
    # Values are in meters.
    self.left_dist = 999999.9 # Left
    self.leftfront_dist = 999999.9 # Left-front
    self.front_dist = 999999.9 # Front
    self.rightfront_dist = 999999.9 # Right-front
    self.right_dist = 999999.9 # Right

    ################### ROBOT CONTROL PARAMETERS ##################
    # Maximum forward speed of the robot in meters per second
    # Any faster than this and the robot risks falling over.
    self.forward_speed = 0.025 

    # Current position and orientation of the robot in the global 
    # reference frame
    self.current_x = 0.0
    self.current_y = 0.0
    self.current_yaw = 0.0

    ############# WALL FOLLOWING PARAMETERS #######################		
    # Finite states for the wall following mode
    #  "turn left": Robot turns towards the left
    #  "search for wall": Robot tries to locate the wall		
    #  "follow wall": Robot moves parallel to the wall
    self.wall_following_state = "turn left"
		
    # Set turning speeds (to the left) in rad/s 
    # These values were determined by trial and error.
    self.turning_speed_wf_fast = 3.0  # Fast turn
    self.turning_speed_wf_slow = 0.05 # Slow turn

    # Wall following distance threshold.
    # We want to try to keep within this distance from the wall.
    self.dist_thresh_wf = 0.50 # in meters	

    # We don't want to get too close to the wall though.
    self.dist_too_close_to_wall = 0.19 # in meters

  def state_estimate_callback(self, msg):
    """
    Extract the position and orientation data. 
    This callback is called each time
    a new message is received on the '/demo/state_est' topic
    """
    # Update the current estimated state in the global reference frame
    curr_state = msg.data
    self.current_x = curr_state[0]
    self.current_y = curr_state[1]
    self.current_yaw = curr_state[2]

    # Command the robot to keep following the wall		
    self.follow_wall()

  def scan_callback(self, msg):
    """
    This method gets called every time a LaserScan message is 
    received on the '/demo/laser/out' topic	
    """
    # Read the laser scan data that indicates distances
    # to obstacles (e.g. wall) in meters and extract
    # 5 distinct laser readings to work with.
    # Each reading is separated by 45 degrees.
    # Assumes 181 laser readings, separated by 1 degree. 
    # (e.g. -90 degrees to 90 degrees....0 to 180 degrees)

    #number_of_laser_beams = str(len(msg.ranges))		
    self.left_dist = msg.ranges[180]
    self.leftfront_dist = msg.ranges[135]
    self.front_dist = msg.ranges[90]
    self.rightfront_dist = msg.ranges[45]
    self.right_dist = msg.ranges[0]
			
  def follow_wall(self):
    """
    This method causes the robot to follow the boundary of a wall.
    """
    # Create a geometry_msgs/Twist message
    msg = Twist()
    msg.linear.x = 0.0
    msg.linear.y = 0.0
    msg.linear.z = 0.0
    msg.angular.x = 0.0
    msg.angular.y = 0.0
    msg.angular.z = 0.0			

    # Logic for following the wall
    # >d means no wall detected by that laser beam
    # <d means a wall was detected by that laser beam
    d = self.dist_thresh_wf
    
    if self.leftfront_dist > d and self.front_dist > d and self.rightfront_dist > d:
      self.wall_following_state = "search for wall"
      msg.linear.x = self.forward_speed
      msg.angular.z = -self.turning_speed_wf_slow # turn right to find wall

    elif self.leftfront_dist > d and self.front_dist < d and self.rightfront_dist > d:
      self.wall_following_state = "turn left"
      msg.angular.z = self.turning_speed_wf_fast

    elif (self.leftfront_dist > d and self.front_dist > d and self.rightfront_dist < d):
      if (self.rightfront_dist < self.dist_too_close_to_wall):
        # Getting too close to the wall
        self.wall_following_state = "turn left"
        msg.linear.x = self.forward_speed
        msg.angular.z = self.turning_speed_wf_fast		
      else: 			
        # Go straight ahead
        self.wall_following_state = "follow wall"  
        msg.linear.x = self.forward_speed	

    elif self.leftfront_dist < d and self.front_dist > d and self.rightfront_dist > d:
      self.wall_following_state = "search for wall"
      msg.linear.x = self.forward_speed
      msg.angular.z = -self.turning_speed_wf_slow # turn right to find wall

    elif self.leftfront_dist > d and self.front_dist < d and self.rightfront_dist < d:
      self.wall_following_state = "turn left"
      msg.angular.z = self.turning_speed_wf_fast

    elif self.leftfront_dist < d and self.front_dist < d and self.rightfront_dist > d:
      self.wall_following_state = "turn left" 
      msg.angular.z = self.turning_speed_wf_fast

    elif self.leftfront_dist < d and self.front_dist < d and self.rightfront_dist < d:
      self.wall_following_state = "turn left" 
      msg.angular.z = self.turning_speed_wf_fast
			
    elif self.leftfront_dist < d and self.front_dist > d and self.rightfront_dist < d:
      self.wall_following_state = "search for wall"
      msg.linear.x = self.forward_speed
      msg.angular.z = -self.turning_speed_wf_slow # turn right to find wall
    
    else:
      pass 

    # Send velocity command to the robot
    self.publisher_.publish(msg)	

def main(args=None):

    # Initialize rclpy library
    rclpy.init(args=args)
    
    # Create the node
    controller = Controller()

    # Spin the node so the callback function is called
    # Pull messages from any topics this node is subscribed to
    # Publish any pending messages to the topics
    rclpy.spin(controller)

    # Destroy the node explicitly
    # (optional - otherwise it will be done automatically
    # when the garbage collector destroys the node object)
    controller.destroy_node()
    
    # Shutdown the ROS client library for Python
    rclpy.shutdown()

if __name__ == '__main__':
    main()

Save and close the Python program.

Create a Launch File

Now let’s create a launch file.

mkdir ~/dev_ws/src/warehouse_robot_controller_pkg/launch/
cd ~/dev_ws/src/warehouse_robot_controller_pkg/launch/

Create a file named controller_estimator.launch.py

gedit controller_estimator.launch.py

Add the following code.

import os
from launch import LaunchDescription
from launch_ros.actions import Node


def generate_launch_description():

  return LaunchDescription([
    Node(package='warehouse_robot_controller_pkg', executable='robot_controller',
      output='screen'),
    Node(package='warehouse_robot_controller_pkg', executable='robot_estimator',
      output='screen'),
  ])

Save the file and close it.

Build the Package

Return to the root of your workspace:

cd ~/dev_ws/

Build the package.

colcon build --packages-select warehouse_robot_controller_pkg

Launch

Ok, now we are ready to launch.

Open a new terminal window.

Move to the root of the workspace.

cd ~/dev_ws/

Type the following command to launch the gazebo world.

ros2 launch warehouse_robot_spawner_pkg gazebo_world.launch.py

Open a new terminal window, and launch the controller.

ros2 launch warehouse_robot_controller_pkg controller_estimator.launch.py

The robot will struggle at times around corners, but that is fine.

[Image: The robot patrolling the warehouse]

Let’s see what topics are active. Open a new terminal, and type:

ros2 topic list -t
[Image: The list of active ROS 2 topics]

Moving the Robot Around Manually

By the way, if you ever want to move the robot around manually using the keyboard and already have the turtlebot3 package installed, you can use the following commands.

Open a new terminal window, and type:

ros2 launch warehouse_robot_spawner_pkg gazebo_world.launch.py

Then open another terminal window, and type:

ros2 run turtlebot3_teleop teleop_keyboard --ros-args --remap /cmd_vel:=/demo/cmd_vel

This command remaps the velocity commands sent from the keyboard from the /cmd_vel topic to the /demo/cmd_vel topic (which, as you saw in the sdf file, is the topic the robot gets its velocity commands from).
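As an aside, the same remapping can also be expressed in a Python launch file using the remappings argument of launch_ros. The sketch below is only illustrative; it assumes turtlebot3_teleop is installed, and in practice the command-line version above is the simpler way to run a keyboard teleop node because it keeps keyboard focus in your terminal.

from launch import LaunchDescription
from launch_ros.actions import Node

def generate_launch_description():
    return LaunchDescription([
        # Start the teleop node with /cmd_vel remapped to /demo/cmd_vel
        Node(
            package='turtlebot3_teleop',
            executable='teleop_keyboard',
            output='screen',
            remappings=[('/cmd_vel', '/demo/cmd_vel')],
        ),
    ])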

To install the Turtlebot3 package, you would need to use the following command:

sudo apt install ros-foxy-turtlebot3*

That’s it. Keep building!

How To Detect Objects Using Semantic Segmentation

In this tutorial, we will build a program to categorize each pixel in images and videos. These categories could include things like car, person, sidewalk, bicycle, sky, or traffic sign. This process is known as semantic segmentation. By the end of this tutorial, you will be able to generate the following output:

[Animation: Semantic segmentation output on a driving scene]

Semantic segmentation helps self-driving cars and other types of autonomous vehicles gain a deep understanding of their surroundings so they can make decisions without human intervention.

Our goal is to build an early prototype of a semantic segmentation application that could be deployed inside an autonomous vehicle. To accomplish this task, we’ll use a special type of neural network called ENet (Efficient Neural Network) (you don’t need to know the details of ENet to accomplish this task).

Here is the list of classes we will use for this project:

  • Unlabeled
  • Road
  • Sidewalk
  • Building
  • Wall
  • Fence
  • Pole
  • TrafficLight
  • TrafficSign
  • Vegetation
  • Terrain
  • Sky
  • Person
  • Rider
  • Car
  • Truck
  • Bus
  • Train
  • Motorcycle
  • Bicycle

Real-World Applications

  • Self-driving cars and other types of autonomous vehicles
  • Medical (brain and lung tumor detection)

Let’s get started!

Prerequisites

Installation and Setup

We now need to make sure we have all the software packages installed. Check to see if you have OpenCV installed on your machine. If you are using Anaconda, you can type:

conda install -c conda-forge opencv

Alternatively, you can type:

pip install opencv-python

Make sure you have NumPy installed, a scientific computing library for Python.

If you’re using Anaconda, you can type:

conda install numpy

Alternatively, you can type:

pip install numpy

Install imutils, an image processing library.

pip install imutils

Download Required Folders and Sample Images and Videos

This link contains the required files you will need to run this program. Download all these files and put them in a folder on your computer. 

Code for Semantic Segmentation on Images

In the same folder where you downloaded all the stuff in the previous section, open a new Python file called semantic_segmentation_images.py.

Here is the full code for the system. The only thing you’ll need to change in this code (if you wish to use your own image) is the name of your desired input image file on line 12. Just copy and paste the code into your file.

# Project: How To Detect Objects in an Image Using Semantic Segmentation
# Author: Addison Sears-Collins
# Date created: February 24, 2021
# Description: A program that classifies pixels in an image. The real-world
#   use case is autonomous vehicles. Uses the ENet neural network architecture.

import cv2 # Computer vision library
import numpy as np # Scientific computing library 
import os # Operating system library 
import imutils # Image processing library

ORIG_IMG_FILE = 'test_image_1.jpg'
ENET_DIMENSIONS = (1024, 512) # Dimensions that ENet was trained on
RESIZED_WIDTH = 600
IMG_NORM_RATIO = 1 / 255.0 # In grayscale a pixel can range between 0 and 255

# Read the image
input_img = cv2.imread(ORIG_IMG_FILE)

# Resize the image while maintaining the aspect ratio
input_img = imutils.resize(input_img, width=RESIZED_WIDTH)

# Create a blob. A blob is a group of connected pixels in a binary 
# image that share some common property (e.g. grayscale value)
# Preprocess the image to prepare it for deep learning classification
input_img_blob = cv2.dnn.blobFromImage(input_img, IMG_NORM_RATIO,
  ENET_DIMENSIONS, 0, swapRB=True, crop=False)
	
# Load the neural network (i.e. deep learning model)
print("Loading the neural network...")
enet_neural_network = cv2.dnn.readNet('./enet-cityscapes/enet-model.net')

# Set the input for the neural network
enet_neural_network.setInput(input_img_blob)

# Get the predicted probabilities for each of the classes (e.g. car, sidewalk)
# These are the values in the output layer of the neural network
enet_neural_network_output = enet_neural_network.forward()

# Load the names of the classes
class_names = (
  open('./enet-cityscapes/enet-classes.txt').read().strip().split("\n"))

# Print out the shape of the output
# (1, number of classes, height, width)
#print(enet_neural_network_output.shape)

# Extract the key information about the ENet output
(number_of_classes, height, width) = enet_neural_network_output.shape[1:4] 

# number of classes x height x width
#print(enet_neural_network_output[0])

# Find the class label that has the greatest probability for each image pixel
class_map = np.argmax(enet_neural_network_output[0], axis=0)

# Load a list of colors. Each class will have a particular color. 
if os.path.isfile('./enet-cityscapes/enet-colors.txt'):
  IMG_COLOR_LIST = (
    open('./enet-cityscapes/enet-colors.txt').read().strip().split("\n"))
  IMG_COLOR_LIST = [np.array(color.split(",")).astype(
    "int") for color in IMG_COLOR_LIST]
  IMG_COLOR_LIST = np.array(IMG_COLOR_LIST, dtype="uint8")
	
# If the list of colors file does not exist, we generate a 
# random list of colors
else:
  np.random.seed(1)
  IMG_COLOR_LIST = np.random.randint(0, 255, size=(len(class_names) - 1, 3),
    dtype="uint8")
  IMG_COLOR_LIST = np.vstack([[0, 0, 0], IMG_COLOR_LIST]).astype("uint8")
  
# Tie each class ID to its color
# This mask contains the color for each pixel. 
class_map_mask = IMG_COLOR_LIST[class_map]

# We now need to resize the class map mask so its dimensions
# is equivalent to the dimensions of the original image
class_map_mask = cv2.resize(class_map_mask, (
  input_img.shape[1], input_img.shape[0]),
	interpolation=cv2.INTER_NEAREST)

# Overlay the class map mask on top of the original image. We want the mask to
# be transparent. We can do this by computing a weighted average of
# the original image and the class map mask.
enet_neural_network_output = ((0.61 * class_map_mask) + (
  0.39 * input_img)).astype("uint8")
	
# Create a legend that shows the class and its corresponding color
class_legend = np.zeros(((len(class_names) * 25) + 25, 300, 3), dtype="uint8")
	
# Put the class labels and colors on the legend
for (i, (cl_name, cl_color)) in enumerate(zip(class_names, IMG_COLOR_LIST)):
  color_information = [int(color) for color in cl_color]
  cv2.putText(class_legend, cl_name, (5, (i * 25) + 17),
    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)
  cv2.rectangle(class_legend, (100, (i * 25)), (300, (i * 25) + 25),
                  tuple(color_information), -1)

# Combine the original image and the semantic segmentation image
combined_images = np.concatenate((input_img, enet_neural_network_output), axis=1) 

# Resize image if desired
#combined_images = imutils.resize(combined_images, width=1000)

# Display image
#cv2.imshow('Results', enet_neural_network_output) 
cv2.imshow('Results', combined_images) 
cv2.imshow("Class Legend", class_legend)
print(combined_images.shape)
cv2.waitKey(0) # Display window until keypress
cv2.destroyAllWindows() # Close OpenCV

To run the code, type the following command:

python semantic_segmentation_images.py

Here is the output I got:

[Image: Semantic segmentation results, original image and segmented image side by side]

How the Code Works

The first thing we need to do is to import the necessary libraries.

import cv2 # Computer vision library
import numpy as np # Scientific computing library 
import os # Operating system library 
import imutils # Image processing library

We set our constants: name of the image file you want to perform semantic segmentation on, the dimensions of the images that the ENet neural network was trained on, the width we want to resize our input image to, and the ratio that we use to normalize the color values of each pixel.

ORIG_IMG_FILE = 'test_image_1.jpg'
ENET_DIMENSIONS = (1024, 512) # Dimensions that ENet was trained on
RESIZED_WIDTH = 600
IMG_NORM_RATIO = 1 / 255.0 # In grayscale a pixel can range between 0 and 255

Read the input image, resize it, and create a blob. The blob is the preprocessed image, packaged as a 4-D array in the format that the neural network expects as input.

# Read the image
input_img = cv2.imread(ORIG_IMG_FILE)

# Resize the image while maintaining the aspect ratio
input_img = imutils.resize(input_img, width=RESIZED_WIDTH)

# Create a blob. A blob is a group of connected pixels in a binary 
# image that share some common property (e.g. grayscale value)
# Preprocess the image to prepare it for deep learning classification
input_img_blob = cv2.dnn.blobFromImage(input_img, IMG_NORM_RATIO,
  ENET_DIMENSIONS, 0, swapRB=True, crop=False)

We load the pretrained neural network, set the blob as its input, and then extract the predicted probabilities for each of the classes (i.e. sidewalk, person, car, sky, etc.).

# Load the neural network (i.e. deep learning model)
enet_neural_network = cv2.dnn.readNet('./enet-cityscapes/enet-model.net')

# Set the input for the neural network
enet_neural_network.setInput(input_img_blob)

# Get the predicted probabilities for each of the classes (e.g. car, sidewalk)
# These are the values in the output layer of the neural network
enet_neural_network_output = enet_neural_network.forward()

We load the class list.

# Load the names of the classes
class_names = (
  open('./enet-cityscapes/enet-classes.txt').read().strip().split("\n"))

Get the key parameters of the ENet output.

# Extract the key information about the ENet output
(number_of_classes, height, width) = enet_neural_network_output.shape[1:4] 

Determine the highest probability class for each image pixel.

# Find the class label that has the greatest probability for each image pixel
class_map = np.argmax(enet_neural_network_output[0], axis=0)

We want to create a class legend that is color coded.

# Load a list of colors. Each class will have a particular color. 
if os.path.isfile('./enet-cityscapes/enet-colors.txt'):
  IMG_COLOR_LIST = (
    open('./enet-cityscapes/enet-colors.txt').read().strip().split("\n"))
  IMG_COLOR_LIST = [np.array(color.split(",")).astype(
    "int") for color in IMG_COLOR_LIST]
  IMG_COLOR_LIST = np.array(IMG_COLOR_LIST, dtype="uint8")
	
# If the list of colors file does not exist, we generate a 
# random list of colors
else:
  np.random.seed(1)
  IMG_COLOR_LIST = np.random.randint(0, 255, size=(len(class_names) - 1, 3),
    dtype="uint8")
  IMG_COLOR_LIST = np.vstack([[0, 0, 0], IMG_COLOR_LIST]).astype("uint8")

Each pixel will need to have a color, which depends on the highest probability class for that pixel.

# Tie each class ID to its color
# This mask contains the color for each pixel. 
class_map_mask = IMG_COLOR_LIST[class_map]

Make sure the class map mask has the same dimensions as the original input image.

# We now need to resize the class map mask so its dimensions
# is equivalent to the dimensions of the original image
class_map_mask = cv2.resize(class_map_mask, (
  input_img.shape[1], input_img.shape[0]),
	interpolation=cv2.INTER_NEAREST)

Create a blended image of the original input image and the class map mask. In this example, I used 61% of the class map mask and 39% of the original input image. You can change those values, but make sure they add up to 100%.

# Overlay the class map mask on top of the original image. We want the mask to
# be transparent. We can do this by computing a weighted average of
# the original image and the class map mask.
enet_neural_network_output = ((0.61 * class_map_mask) + (
  0.39 * input_img)).astype("uint8")
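As an aside, OpenCV’s cv2.addWeighted can produce the same kind of transparent overlay in a single call. This is only an illustrative alternative (using the class_map_mask and input_img variables from the code above), not what the tutorial code uses:

# Equivalent transparent overlay: 61% class map mask, 39% original image
blended_output = cv2.addWeighted(class_map_mask, 0.61, input_img, 0.39, 0)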

Create a legend that shows each class and its corresponding color.

# Create a legend that shows the class and its corresponding color
class_legend = np.zeros(((len(class_names) * 25) + 25, 300, 3), dtype="uint8")
	
# Put the class labels and colors on the legend
for (i, (cl_name, cl_color)) in enumerate(zip(class_names, IMG_COLOR_LIST)):
  color_information = [int(color) for color in cl_color]
  cv2.putText(class_legend, cl_name, (5, (i * 25) + 17),
    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)
  cv2.rectangle(class_legend, (100, (i * 25)), (300, (i * 25) + 25),
                  tuple(color_information), -1)

Create the final image we want to display. The original input image is combined with the semantic segmentation image.

# Combine the original image and the semantic segmentation image
combined_images = np.concatenate((input_img, enet_neural_network_output), axis=1)

Display the image. 

# Display image
#cv2.imshow('Results', enet_neural_network_output) 
cv2.imshow('Results', combined_images) 
cv2.imshow("Class Legend", class_legend)
print(combined_images.shape)
cv2.waitKey(0) # Display window until keypress
cv2.destroyAllWindows() # Close OpenCV

Code for Semantic Segmentation on Videos

Open a new Python file called semantic_segmentation_videos.py.

Here is the full code for the system. The only things you’ll need to change in this code (if you wish to use your own video) are the name of your desired input video file on line 13 and the name of your desired output video file on line 17. Make sure the input video is 1920 x 1080 pixels and in mp4 format; otherwise it won’t work.

# Project: How To Detect Objects in a Video Using Semantic Segmentation
# Author: Addison Sears-Collins
# Date created: February 25, 2021
# Description: A program that classifies pixels in a video. The real-world
#   use case is autonomous vehicles. Uses the ENet neural network architecture.

import cv2 # Computer vision library
import numpy as np # Scientific computing library 
import os # Operating system library 
import imutils # Image processing library

# Make sure the video file is in the same directory as your code
filename = '4_orig_lane_detection_1.mp4'
file_size = (1920,1080) # Assumes 1920x1080 mp4

# We want to save the output to a video file
output_filename = 'semantic_seg_4_orig_lane_detection_1.mp4'
output_frames_per_second = 20.0 

ENET_DIMENSIONS = (1024, 512) # Dimensions that ENet was trained on
RESIZED_WIDTH = 1200
IMG_NORM_RATIO = 1 / 255.0 # In grayscale a pixel can range between 0 and 255

# Load the names of the classes
class_names = (
  open('./enet-cityscapes/enet-classes.txt').read().strip().split("\n"))
	
# Load a list of colors. Each class will have a particular color. 
if os.path.isfile('./enet-cityscapes/enet-colors.txt'):
  IMG_COLOR_LIST = (
    open('./enet-cityscapes/enet-colors.txt').read().strip().split("\n"))
  IMG_COLOR_LIST = [np.array(color.split(",")).astype(
    "int") for color in IMG_COLOR_LIST]
  IMG_COLOR_LIST = np.array(IMG_COLOR_LIST, dtype="uint8")
	
# If the list of colors file does not exist, we generate a 
# random list of colors
else:
  np.random.seed(1)
  IMG_COLOR_LIST = np.random.randint(0, 255, size=(len(class_names) - 1, 3),
    dtype="uint8")
  IMG_COLOR_LIST = np.vstack([[0, 0, 0], IMG_COLOR_LIST]).astype("uint8")

def main():

  # Load a video
  cap = cv2.VideoCapture(filename)

  # Create a VideoWriter object so we can save the video output
  fourcc = cv2.VideoWriter_fourcc(*'mp4v')
  result = cv2.VideoWriter(output_filename,  
                           fourcc, 
                           output_frames_per_second, 
                           file_size) 
	
  # Process the video
  while cap.isOpened():
		
    # Capture one frame at a time
    success, frame = cap.read() 
		
    # Do we have a video frame? If true, proceed.
    if success:
		
      # Resize the frame while maintaining the aspect ratio
      frame = imutils.resize(frame, width=RESIZED_WIDTH)

      # Create a blob. A blob is a group of connected pixels in a binary 
      # frame that share some common property (e.g. grayscale value)
      # Preprocess the frame to prepare it for deep learning classification
      frame_blob = cv2.dnn.blobFromImage(frame, IMG_NORM_RATIO,
                     ENET_DIMENSIONS, 0, swapRB=True, crop=False)
	
      # Load the neural network (i.e. deep learning model)
      enet_neural_network = cv2.dnn.readNet('./enet-cityscapes/enet-model.net')

      # Set the input for the neural network
      enet_neural_network.setInput(frame_blob)

      # Get the predicted probabilities for each of 
      # the classes (e.g. car, sidewalk)
      # These are the values in the output layer of the neural network
      enet_neural_network_output = enet_neural_network.forward()

      # Extract the key information about the ENet output
      (number_of_classes, height, width) = (
        enet_neural_network_output.shape[1:4]) 

      # Find the class label that has the greatest 
      # probability for each frame pixel
      class_map = np.argmax(enet_neural_network_output[0], axis=0)

      # Tie each class ID to its color
      # This mask contains the color for each pixel. 
      class_map_mask = IMG_COLOR_LIST[class_map]

      # We now need to resize the class map mask so its dimensions
      # is equivalent to the dimensions of the original frame
      class_map_mask = cv2.resize(class_map_mask, (
        frame.shape[1], frame.shape[0]), 
        interpolation=cv2.INTER_NEAREST)

      # Overlay the class map mask on top of the original frame. We want 
      # the mask to be transparent. We can do this by computing a weighted 
      # average of the original frame and the class map mask.
      enet_neural_network_output = ((0.90 * class_map_mask) + (
        0.10 * frame)).astype("uint8")
	
      # Combine the original frame and the semantic segmentation frame
      combined_frames = np.concatenate(
        (frame, enet_neural_network_output), axis=1) 

      # Resize frame if desired
      combined_frames = imutils.resize(combined_frames, width=1920)

      # Create an appropriately-sized video frame. We want the video height
      # to be 1080 pixels
      adjustment_for_height = 1080 - combined_frames.shape[0]
      adjustment_for_height = int(adjustment_for_height / 2)
      black_img_1 = np.zeros((adjustment_for_height, 1920, 3), dtype = "uint8")
      black_img_2 = np.zeros((adjustment_for_height, 1920, 3), dtype = "uint8")

      # Add black padding to the video frame on the top and bottom
      combined_frames = np.concatenate((black_img_1, combined_frames), axis=0) 
      combined_frames = np.concatenate((combined_frames, black_img_2), axis=0) 
      
      # Write the frame to the output video file
      result.write(combined_frames)
		
    # No more video frames left
    else:
      break
			
  # Stop when the video is finished
  cap.release()
	
  # Release the video recording
  result.release()

main()

To run the code, type the following command:

python semantic_segmentation_videos.py

Video Output

Here is the output:

How the Code Works

This code is pretty much the same as the code for images. The only difference is that we run the algorithm on each frame of the input video rather than a single input image.

I put detailed comments inside the code so that you can understand what is going on.

That’s it. Keep building!

How to Detect and Classify Road Signs Using TensorFlow

In this tutorial, we will build an application to detect and classify traffic signs. By the end of this tutorial, you will be able to build this:

[Image: Road sign classification output showing predicted and actual classes]

Our goal is to build an early prototype of a system that can be used in a self-driving car or other type of autonomous vehicle.

Real-World Applications

  • Self-driving cars/autonomous vehicles

Prerequisites

  • Python 3.7 or higher
  • You have TensorFlow 2 installed. I’m using TensorFlow 2.3.1.
    • Windows 10 Users, see this post.
    • If you want to use GPU support for your TensorFlow installation, you will need to follow these steps. If you have trouble following those steps, you can follow these steps (note that the steps change quite frequently, but the overall process remains relatively the same).
    • This post can also help you get your system set up, including your virtual environment in Anaconda (if you decide to go this route).

Helpful Tip


When you work through tutorials in robotics or any other field in technology, focus on the end goal. Focus on the authentic, real-world problem you’re trying to solve, not the tools that are used to solve the problem.

Don’t get bogged down in trying to understand every last detail of the math and the libraries you need to use to develop an application. 

Don’t get stuck in rabbit holes. Don’t try to learn everything at once.  

You’re trying to build products, not publish research papers. Focus on the inputs, the outputs, and what the algorithm is supposed to do at a high level. As you’ll see in this tutorial, you don’t need to learn all of computer vision before developing a robust road sign classification system.

Get a working road sign detector and classifier up and running; and, at some later date when you want to add more complexity to your project or write a research paper, then feel free to go back to the rabbit holes to get a thorough understanding of what is going on under the hood.

Trying to understand every last detail is like trying to build your own database from scratch in order to start a website or taking a course on internal combustion engines to learn how to drive a car. 

Let’s get started!

Find a Data Set

The first thing we need to do is find a data set of road signs.

We will use the popular German Traffic Sign Recognition Benchmark data set. This data set consists of 43 different road sign types and more than 50,000 images. Each image contains a single traffic sign.

Download the Data Set

Go to this link, and download the data set. You will see three data files. 

  • Training data set
  • Validation data set
  • Test data set

The data files are .p (pickle) files. 

What is a pickle file? Pickling is where you convert a Python object (dictionary, list, etc.) into a stream of bytes. That byte stream is saved as a .p file. This process is known as serialization.

Then, when you want to use the Python object in another script, you can use the Pickle library to convert that byte stream back into the original Python object. This process is known as deserialization.

Training, validation, and test data sets in computer vision can be large, so pickling them makes it convenient to save them to your computer and reload them later.
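Here is a small, self-contained sketch of that round trip. The file name and the data are just examples:

import pickle

data = {'features': [1, 2, 3], 'labels': ['stop', 'yield', 'speed_limit']}

# Serialization: convert the Python object into a byte stream and save it
with open('example.p', mode='wb') as f:
    pickle.dump(data, f)

# Deserialization: load the byte stream and rebuild the original object
with open('example.p', mode='rb') as f:
    restored = pickle.load(f)

print(restored['labels'])  # ['stop', 'yield', 'speed_limit']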

Installation and Setup

We need to make sure we have all the software packages installed. 

Make sure you have NumPy installed, a scientific computing library for Python.

If you’re using Anaconda, you can type:

conda install numpy

Alternatively, you can type:

pip install numpy

Install Matplotlib, a plotting library for Python.

For Anaconda users:

conda install -c conda-forge matplotlib

Otherwise, you can install like this:

pip install matplotlib

Install scikit-learn, the machine learning library:

conda install -c conda-forge scikit-learn 

Write the Code

Open a new Python file called load_road_sign_data.py

Here is the full code for the road sign detection and classification system:

# Project: How to Detect and Classify Road Signs Using TensorFlow
# Author: Addison Sears-Collins
# Date created: February 13, 2021
# Description: This program loads the German Traffic Sign 
#              Recognition Benchmark data set

import warnings # Control warning messages that pop up
warnings.filterwarnings("ignore") # Ignore all warnings

import matplotlib.pyplot as plt # Plotting library
import matplotlib.image as mpimg
import numpy as np # Scientific computing library 
import pandas as pd # Library for data analysis
import pickle # Converts an object into a character stream (i.e. serialization)
import random # Pseudo-random number generator library
from sklearn.model_selection import train_test_split # Split data into subsets
from sklearn.utils import shuffle # Machine learning library
from subprocess import check_output # Enables you to run a subprocess
import tensorflow as tf # Machine learning library
from tensorflow import keras # Deep learning library
from tensorflow.keras import layers # Handles layers in the neural network
from tensorflow.keras.models import load_model # Loads a trained neural network
from tensorflow.keras.utils import plot_model # Get neural network architecture

# Open the training, validation, and test data sets
with open("./road-sign-data/train.p", mode='rb') as training_data:
  train = pickle.load(training_data)
with open("./road-sign-data/valid.p", mode='rb') as validation_data:
  valid = pickle.load(validation_data)
with open("./road-sign-data/test.p", mode='rb') as testing_data:
  test = pickle.load(testing_data)

# Store the features and the labels
X_train, y_train = train['features'], train['labels']
X_valid, y_valid = valid['features'], valid['labels']
X_test, y_test = test['features'], test['labels']

# Output the dimensions of the training data set
# Feel free to uncomment these lines below
#print(X_train.shape)
#print(y_train.shape)

# Display an image from the data set
i = 500
#plt.imshow(X_train[i])
#plt.show() # Uncomment this line to display the image
#print(y_train[i])

# Shuffle the image data set
X_train, y_train = shuffle(X_train, y_train)

# Convert the RGB image data set into grayscale
X_train_grscale = np.sum(X_train/3, axis=3, keepdims=True)
X_test_grscale  = np.sum(X_test/3, axis=3, keepdims=True)
X_valid_grscale  = np.sum(X_valid/3, axis=3, keepdims=True)

# Normalize the data set
# Note that grayscale has a range from 0 to 255 with 0 being black and
# 255 being white
X_train_grscale_norm = (X_train_grscale - 128)/128 
X_test_grscale_norm = (X_test_grscale - 128)/128
X_valid_grscale_norm = (X_valid_grscale - 128)/128

# Display the shape of the grayscale training data
#print(X_train_grscale.shape)

# Display a sample image from the grayscale data set
i = 500
# squeeze function removes axes of length 1 
# (e.g. arrays like [[[1,2,3]]] become [1,2,3]) 
#plt.imshow(X_train_grscale[i].squeeze(), cmap='gray') 
#plt.figure()
#plt.imshow(X_train[i])
#plt.show()

# Get the shape of the image
# IMG_SIZE, IMG_SIZE, IMG_CHANNELS
img_shape = X_train_grscale[i].shape
#print(img_shape)

# Build the convolutional neural network's (i.e. model) architecture
cnn_model = tf.keras.Sequential() # Plain stack of layers
cnn_model.add(tf.keras.layers.Conv2D(filters=32,kernel_size=(3,3), 
  strides=(3,3), input_shape = img_shape, activation='relu'))
cnn_model.add(tf.keras.layers.Conv2D(filters=64,kernel_size=(3,3), 
  activation='relu'))
cnn_model.add(tf.keras.layers.MaxPooling2D(pool_size = (2, 2)))
cnn_model.add(tf.keras.layers.Dropout(0.25))
cnn_model.add(tf.keras.layers.Flatten())
cnn_model.add(tf.keras.layers.Dense(128, activation='relu'))
cnn_model.add(tf.keras.layers.Dropout(0.5))
cnn_model.add(tf.keras.layers.Dense(43, activation = 'sigmoid')) # 43 classes

# Compile the model
cnn_model.compile(loss='sparse_categorical_crossentropy', optimizer=(
  keras.optimizers.Adam(
  0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=False)), metrics =[
  'accuracy'])
	
# Train the model
history = cnn_model.fit(x=X_train_grscale_norm,
  y=y_train,
  batch_size=32,
  epochs=50,
  verbose=1,
  validation_data = (X_valid_grscale_norm,y_valid))
	
# Show the loss value and metrics for the model on the test data set
score = cnn_model.evaluate(X_test_grscale_norm, y_test,verbose=0)
print('Test Accuracy : {:.4f}'.format(score[1]))

# Plot the accuracy statistics of the model on the training and validation data
accuracy = history.history['accuracy']
val_accuracy = history.history['val_accuracy']
epochs = range(len(accuracy))
## Uncomment these lines below to show accuracy statistics
# line_1 = plt.plot(epochs, accuracy, 'bo', label='Training Accuracy')
# line_2 = plt.plot(epochs, val_accuracy, 'b', label='Validation Accuracy')
# plt.title('Accuracy on Training and Validation Data Sets')
# plt.setp(line_1, linewidth=2.0, marker = '+', markersize=10.0)
# plt.setp(line_2, linewidth=2.0, marker= '4', markersize=10.0)
# plt.xlabel('Epochs')
# plt.ylabel('Accuracy')
# plt.grid(True)
# plt.legend()
# plt.show() # Uncomment this line to display the plot

# Save the model
cnn_model.save("./road_sign.h5")

# Reload the model
model = load_model('./road_sign.h5')

# Get the predictions for the test data set using the reloaded model
predicted_classes = np.argmax(model.predict(X_test_grscale_norm), axis=-1)

# The true labels for the test data set
y_true = y_test

# Plot some of the predictions on the test data set
for i in range(15):
  plt.subplot(5,3,i+1)
  plt.imshow(X_test_grscale_norm[i].squeeze(), 
    cmap='gray', interpolation='none')
  plt.title("Predict {}, Actual {}".format(predicted_classes[i], 
    y_true[i]), fontsize=10)
plt.tight_layout()
plt.savefig('road_sign_output.png')
plt.show()

How the Code Works

Let’s go through each snippet of code in the previous section so that we understand what is going on.

Load the Image Data

The first thing we need to do is to load the image data from the pickle files.

with open("./road-sign-data/train.p", mode='rb') as training_data:
  train = pickle.load(training_data)
with open("./road-sign-data/valid.p", mode='rb') as validation_data:
  valid = pickle.load(validation_data)
with open("./road-sign-data/test.p", mode='rb') as testing_data:
  test = pickle.load(testing_data)
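
Each pickle file holds a Python dictionary, and the rest of the program only relies on its 'features' and 'labels' entries. If you want to peek at what was loaded, an optional check looks like this:

# Optional: inspect the structure of the loaded training dictionary
print(train.keys())             # should include 'features' and 'labels'
print(train['features'].shape)  # array of images
print(train['labels'].shape)    # array of integer class IDs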

Create the Train, Test, and Validation Data Sets

The pickle files already come split into training, validation, and test sets, so here we simply pull out the image features and labels for each set. Printing the shapes and displaying a sample image is a quick way to confirm the data loaded correctly.

X_train, y_train = train['features'], train['labels']
X_valid, y_valid = valid['features'], valid['labels']
X_test, y_test = test['features'], test['labels']

print(X_train.shape)
print(y_train.shape)

1_uncomment_x_train_shape

i = 500
plt.imshow(X_train[i])
plt.show() # Display the image

2_road_sign_display_image_from_dataset

Shuffle the Training Data

We shuffle the training set so that the order of the images does not introduce unwanted biases or patterns during training (for example, long runs of the same sign class in a row).

X_train, y_train = shuffle(X_train, y_train)
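
The important detail is that scikit-learn's shuffle reorders both arrays with the same random permutation, so each image stays paired with its correct label. Here is a tiny, self-contained sketch with made-up toy data that illustrates the idea:

import numpy as np
from sklearn.utils import shuffle

# Toy stand-ins for images and labels
images = np.array([[10], [20], [30]])
labels = np.array([1, 2, 3])

# Both arrays are reordered with the same random indices,
# so 10 stays paired with 1, 20 with 2, and 30 with 3
images_shuffled, labels_shuffled = shuffle(images, labels, random_state=0)
print(images_shuffled.squeeze())
print(labels_shuffled)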

Convert Data Sets from RGB Color Format to Grayscale

Our images are in RGB format. We convert them to grayscale so that the neural network only has one channel per image to process instead of three.

X_train_grscale = np.sum(X_train/3, axis=3, keepdims=True)
X_test_grscale  = np.sum(X_test/3, axis=3, keepdims=True)
X_valid_grscale  = np.sum(X_valid/3, axis=3, keepdims=True)

i = 500
plt.imshow(X_train_grscale[i].squeeze(), cmap='gray') 
plt.figure()
plt.imshow(X_train[i])
plt.show()
3_grayscale_road_sign
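
The conversion sums the three color channels (each divided by 3, i.e. an average), so the channel axis shrinks from 3 to 1 while the image height and width stay the same. You can verify this with a quick optional check:

# Optional: the last axis shrinks from 3 color channels to 1 grayscale channel
print(X_train.shape)          # (number of images, height, width, 3)
print(X_train_grscale.shape)  # (number of images, height, width, 1)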

Normalize the Data Sets to Speed Up Training of the Neural Network

We normalize the images so that the pixel values are centered around zero, which helps the neural network train faster and more reliably.

X_train_grscale_norm = (X_train_grscale - 128)/128 
X_test_grscale_norm = (X_test_grscale - 128)/128
X_valid_grscale_norm = (X_valid_grscale - 128)/128
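
Subtracting 128 and then dividing by 128 maps the grayscale values from the range [0, 255] to roughly [-1, 1], so the inputs end up centered near zero. An optional sanity check:

# Optional: the normalized pixel values should fall roughly within [-1, 1]
print(X_train_grscale_norm.min(), X_train_grscale_norm.max())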

Build the Convolutional Neural Network

In this snippet of code, we build the neural network’s architecture: two convolutional layers, a max pooling layer, dropout for regularization, and two fully connected layers, ending with a 43-unit output layer (one unit per traffic sign class).

cnn_model = tf.keras.Sequential() # Plain stack of layers
cnn_model.add(tf.keras.layers.Conv2D(filters=32,kernel_size=(3,3), 
  strides=(3,3), input_shape = img_shape, activation='relu'))
cnn_model.add(tf.keras.layers.Conv2D(filters=64,kernel_size=(3,3), 
  activation='relu'))
cnn_model.add(tf.keras.layers.MaxPooling2D(pool_size = (2, 2)))
cnn_model.add(tf.keras.layers.Dropout(0.25))
cnn_model.add(tf.keras.layers.Flatten())
cnn_model.add(tf.keras.layers.Dense(128, activation='relu'))
cnn_model.add(tf.keras.layers.Dropout(0.5))
cnn_model.add(tf.keras.layers.Dense(43, activation = 'sigmoid')) # 43 classes
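
If you would like to inspect the resulting architecture, Keras can print every layer along with its output shape and parameter count. This is optional and does not change the model:

# Optional: print each layer's output shape and parameter count
cnn_model.summary()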

Compile the Convolutional Neural Network

The compilation step configures the model for training: we use the sparse categorical cross-entropy loss (because the labels are integer class IDs rather than one-hot vectors), the Adam optimizer with a learning rate of 0.001, and accuracy as the metric to monitor.

cnn_model.compile(loss='sparse_categorical_crossentropy', optimizer=(
  keras.optimizers.Adam(
  0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=False)), metrics =[
  'accuracy'])

Train the Convolutional Neural Network

We now train the neural network on the training data for 50 epochs with a batch size of 32, checking its accuracy on the validation set at the end of each epoch.

history = cnn_model.fit(x=X_train_grscale_norm,
  y=y_train,
  batch_size=32,
  epochs=50,
  verbose=1,
  validation_data = (X_valid_grscale_norm,y_valid))
6-training-console-outputJPG
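
Training for 50 epochs is what the original program uses. As an optional variation (not part of the original code), you can pass Keras' EarlyStopping callback to fit so that training stops automatically once the validation accuracy stops improving; early_stop below is just a name chosen for this sketch:

# Optional variation: stop training once validation accuracy stops improving
early_stop = tf.keras.callbacks.EarlyStopping(
  monitor='val_accuracy', patience=5, restore_best_weights=True)

history = cnn_model.fit(x=X_train_grscale_norm,
  y=y_train,
  batch_size=32,
  epochs=50,
  verbose=1,
  validation_data = (X_valid_grscale_norm,y_valid),
  callbacks=[early_stop])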

Display Accuracy Statistics

We then take a look at how well the neural network performed. The accuracy on the test data set was ~95%. Pretty good!

score = cnn_model.evaluate(X_test_grscale_norm, y_test,verbose=0)
print('Test Accuracy : {:.4f}'.format(score[1]))
8-test-accuracyJPG

We also plot how the accuracy on the training and validation sets evolved over the epochs.

accuracy = history.history['accuracy']
val_accuracy = history.history['val_accuracy']
epochs = range(len(accuracy))

line_1 = plt.plot(epochs, accuracy, 'bo', label='Training Accuracy')
line_2 = plt.plot(epochs, val_accuracy, 'b', label='Validation Accuracy')
plt.title('Accuracy on Training and Validation Data Sets')
plt.setp(line_1, linewidth=2.0, marker = '+', markersize=10.0)
plt.setp(line_2, linewidth=2.0, marker= '4', markersize=10.0)
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.grid(True)
plt.legend()
plt.show() # Display the plot
7_training_validation_accuracy

Save the Convolutional Neural Network to a File

We save the trained neural network so that we can use it in another application at a later date.

cnn_model.save("./road_sign.h5")

Verify the Output

Finally, we take a look at some of the output to see how our neural network performs on unseen data. You can see in this subset that the neural network correctly classified 14 out of the 15 test examples.

# Reload the model
model = load_model('./road_sign.h5')

# Get the predictions for the test data set using the reloaded model
predicted_classes = np.argmax(model.predict(X_test_grscale_norm), axis=-1)

# The true labels for the test data set
y_true = y_test

# Plot some of the predictions on the test data set
for i in range(15):
  plt.subplot(5,3,i+1)
  plt.imshow(X_test_grscale_norm[i].squeeze(), 
    cmap='gray', interpolation='none')
  plt.title("Predict {}, Actual {}".format(predicted_classes[i], 
    y_true[i]), fontsize=10)
plt.tight_layout()
plt.savefig('road_sign_output.png')
plt.show()
9_road_sign_output-1
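
The 15 images above are only a small sample of the test set. If you want per-class precision and recall over all 43 classes, scikit-learn's classification_report (an optional addition, not part of the original program) can summarize the predictions we already computed:

# Optional: per-class precision and recall on the full test set
from sklearn.metrics import classification_report
print(classification_report(y_true, predicted_classes))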

That’s it. Keep building!