C++ Fundamentals for Robotics

In this post, we will learn the fundamentals of C++, one of the most popular languages (along with Python) for programming robots. Getting your head around these fundamentals will help you immensely when you learn ROS. The only prerequisite knowledge is that you have some basic programming experience in any language.

We will not cover the entire C++ language since it is enormous. We will go over the most important commands that you will use again and again over your robotics career. I will also provide links to some short C++ tutorials that do a great job explaining some specific aspects of the language.

Let’s get started!

You Will Need

In order to complete this tutorial, you will need:

Directions

How to Install the GCC/G++ Compilers

C++ is a compiled language. What that means is that when you want to run a program that you write in C++, your machine needs to have a special program that translates the C++ that you write into language that the computer understands and can execute.

The standard compiler for C and C++ on Linux is the GNU Compiler Collection (GCC); the gcc command compiles C and the g++ command compiles C++. Let’s see if the compiler is installed on your machine.

Within Ubuntu Linux, open up a new terminal window.

Type the following command to see if you have the C compiler (named GCC):

gcc

If GCC is not installed, the terminal reports that the gcc command was not found. Before installing it, let’s update the package list using this command:

sudo apt-get update

Now use this command to install a bunch of packages, including GCC, G++, and GNU Make:

sudo apt install build-essential

If the install command above fails with an error about a lock, another process is using the APT package management tool. First make sure no other installation or update is still running, then kill any stuck APT processes using this command:

sudo killall apt apt-get

Remove the lock files:

sudo rm /var/lib/apt/lists/lock
sudo rm /var/cache/apt/archives/lock
sudo rm /var/lib/dpkg/lock*

Reconfigure the packages and update:

sudo dpkg --configure -a
sudo apt update

Now run the install command again to install numerous packages, including GCC, G++, and GNU Make:

sudo apt install build-essential
Press Y to continue.

Wait while the whole thing downloads.

Now, install the manual pages about using GNU/Linux for development (note: it might already be installed):

sudo apt-get install manpages-dev

Check to see if both GCC and G++ are installed.

whereis gcc
whereis g++

Check what version you have.

gcc --version
g++ --version

How to Install the C/C++ Debugger

In this section, we will install the C/C++ debugger. It is called GNU Debugger (GDB) and enables us to detect problems or bugs in the code that we write.

In the terminal window, type the following command:

sudo apt-get install gdb

Type your password when prompted, and then press Enter.

Type the following command to verify that it is installed:

gdb

Type this command to quit.

q

Exit the terminal.

exit

Write Your First Program in C++

On the Ubuntu Linux desktop, click the 9 dots in the bottom left of the screen and search for the text editor named gedit. Double-click on it.

Write the following code and save it as hello_world.cpp.

[Screenshot: source code of hello_world.cpp]

This page on GeeksforGeeks explains what each line of code does.

You can close the text editor now by clicking on the x on the upper right corner of the screen.

Click the file cabinet on the left side of the screen. You should see your hello_world.cpp file. Drag and drop it on your Desktop.

How to Compile and Run Your Program

We now need to compile the hello_world.cpp program into an executable file that your computer can run.

Since the file is located on the desktop, open a terminal window, and type the following command:

cd Desktop

Type the following command to see if it is there:

dir

Compile hello_world.cpp by typing the following command:

g++ hello_world.cpp

If you see an error, check your program to see if it is exactly like I wrote it.

If you don’t see an error, type the dir command to see if the a.out file is there, then run the program by typing the following command:

./a.out

Congratulations! You have created and run your first C++ program. 

In this example, the name of the executable file was a.out. If we wanted to use another name, we could do that too.

g++ hello_world.cpp -o hello

Type dir to see the new file.

Type the following command to run it.

./hello

How to Debug Code Written in C++

In this section we’ll learn how to debug code written in C++. Debugging means finding and removing errors from a program.

Let’s create a file called product.cpp. We will multiply two numbers together and display the product.

Search for the gedit text editor in Ubuntu Linux. Open the application.

Write the program.

[Screenshot: source code of product.cpp]

Compile the program using the following command. We have to add that -g option so that the code is built with debugging functionality.

g++ -g product.cpp -o product

Execute the code using the following command:

./product

Now, debug the code using the following command:

gdb product

Create a breakpoint at line 5. A breakpoint tells the debugger to run the code but pause just before it executes that line (line 5 here).

b 5

Now, run the program and stop at the breakpoint.

r

Execute the next line of code.

n

Print the value of the first_number variable.

p first_number

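Run the rest of the program. In GDB, the c (continue) command resumes execution until the next breakpoint or the end of the program; typing r again would instead ask whether to restart the program from the beginning.

c

Quit the debugger:

q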

Classes and Objects in C++

One of the main differences between C and C++ is that C++ enables the programmer to define what are known as classes. A class has its own data and functions (also called methods or behavior). A class often represents something that we might find in the real world.

For example, consider the Car class. There are many different types of cars one can buy, but all cars have certain things in common. The data for the car class might include the following:

  • Number of wheels
  • Make
  • Model
  • Year
  • Current speed

Functions for the car class might include the following:

  • Speed up
  • Slow down

Classes are blueprints (i.e. templates) for what are called objects. Each time we want to create a new car in a program, for example, instead of defining that car’s data and methods from scratch, we can use the Car class as a template to define the new car’s data (number of wheels, make, model, year, etc.), and we can use the car’s functions to change its data (e.g. speed up and slow down will change the current speed).

GeeksforGeeks has a good, brief tutorial on creating classes and objects in C++. I recommend you go through it now.

After going through that tutorial, I recommend you go through the “Basics of C++” tutorial at GeeksforGeeks (just the Basics section, not the rest of the material there) to make sure you have practice with the tasks in C++ that you will do again and again. Go slowly so you understand what you are doing and why you are doing it. No need to hurry.

Now quickly skim the following tutorials. Just read the first couple of paragraphs on each page. No need to do a deep dive into any of these tutorials. I just want you to have a high-level understanding of what you can do in C++. You can come back to these tutorials when you need them during an actual project:

How to Create a C++ Project

In many large C++ projects, you will have to compile multiple source files, each containing hundreds or even thousands of lines of code. After being compiled, the resulting pieces need to be linked together into one executable. In this section and the next, I will show you how to do this, first by hand and then using a tool known as a Linux makefile.

Let’s develop a program again that multiplies two numbers together. This time, we will create a class called Product. The product class has only one method (or function) called calculate. The calculate method takes as input two numbers, and it outputs the product of those two numbers.

Best practice when creating classes in C++ is to split the class into two files: one file to declare the Product class (i.e. tell the compiler that a class exists called product) and one file to implement the Product class (i.e. to implement the Product class’s data and methods). We then need a third file that creates an object of the Product class and performs the multiplication. This last file is called main.cpp:

  1. Product.h: This is a header file for the Product class. We declare the Product class here (both data and methods) rather than in the Product.cpp file. 
    • Header files help make your projects more organized and speed up compile time. 
  2. Product.cpp: We already declared the data and methods of the product class in the Product.h file, but we now need to implement them. Product.cpp is the program where we implement the data and methods of the Product class.
  3. main.cpp: This is the main code of the project that is going to get built into an executable file that your machine understands. Here is where we create an object of the Product class and perform the multiplication. 

In case you are confused about the difference between declaring something and implementing something, take a look at the example below of my Product.h file:

[Screenshot: source code of Product.h]

Above we have only declared the Product class. We have not yet implemented it. Let’s do that now in the Product.cpp file:

[Screenshot: source code of Product.cpp]

Now that we have implemented the Product class, we need to create one more file, the main.cpp file. This file is where we perform the actual multiplication of two numbers. Make sure your code looks exactly like what I have below.

[Screenshot: source code of main.cpp]

Now that we have our class declaration, class implementation, and main program, we are ready to compile the code and execute it. We can do all that in two lines in the Linux terminal. Move to the directory where your files are located. Mine are located on my desktop, so I open up a Linux terminal window and type:

cd Desktop

Type the dir command to get a list of the files in your desktop.

And then compile the code using the following command:

g++ Product.cpp main.cpp -o main

Type dir

Notice that you have an executable file called main. This is the main program that includes all that code you developed above in a nice neat package that is ready to be executed. Let’s execute it now by typing:

./main

If you got “Output = 33”, congratulations! If you did not get that answer or got an error, go back to the code and make sure it is written exactly as I have written above.

Now remove that executable file:

rm main

rm above means “remove.”

How to Create a Linux Makefile

In the example above, each time we make changes to any of the three source files, we have to create a new executable file by typing the “g++ Product.cpp main.cpp -o main” command. Now imagine if we had a project that had 100 source files. Having to type out the command to compile 100 source files each time we made a tiny change to just one of the source files would get annoying really fast!

Fortunately, Linux has some solutions for this. One of the solutions is called a Linux Makefile. A makefile is a text file (or small program) that contains the commands that you would ordinarily need to type out manually to compile, link, and build the executable for all your source code. A makefile is run in Linux using the make command. When you run the make command on a makefile, all your source code files are compiled and linked into an executable file (like we did in the previous section).

Let’s suppose that you have 100 source files, and you have made a tiny change to one of those source files. One really cool thing about running the make command is that it knows which files have changed since the last time you built the executable, because it compares file timestamps. make will only rebuild the files that changed (and anything that depends on them). So rather than recompiling all 100 files each time you build your project, make will ensure that only the changed file is recompiled before the executable is linked again. You can imagine how much time make will save you.

The official manual for creating Makefiles is here at GNU.org in case you ever need a detailed reference to refer to. I’ll show you below how to create and use a makefile for the source code we developed in the last section.

Open up a new terminal window. Move to your desktop, and create a new directory called product_project using the mkdir (i.e. make directory) command.

Move to that directory using the cd product_project command.

Type gedit to open up the text editor.

Create the following makefile exactly as I have written below. Save it as makefile. The syntax is complicated, but don’t worry about trying to memorize this. Just refer back to this tutorial when you need to create a makefile in the future. You can also consult the makefile manual.

[Screenshot: contents of the makefile, saved under the name makefile]

After you have saved the makefile, open a new terminal window, move to the Desktop, and move your three programs (Product.cpp, Product.h and main.cpp) to this product_project directory using the mv (i.e. move) command. We need to move all cpp and h files.

mv ~/Desktop/*.cpp ~/Desktop/product_project/
mv ~/Desktop/*.h ~/Desktop/product_project/

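Now we need to execute the makefile. Make sure you are in the product_project directory, and then run the make command.

cd ~/Desktop/product_project
make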

You will now have three new files in your directory: an executable file named main and two object files named main.o and Product.o.

Now that the program is built, we can run it.

./main

How to Create a CMake File

An alternative to writing a Linux makefile by hand to build your C++ project is to use a software tool known as CMake. CMake is one of the most popular tools for building C++ projects because it is relatively user-friendly (compared to the Linux makefile process I described in the previous section): you describe the project in a short text file, and CMake generates the makefile for you. Here is a link to the CMake documentation.

Let’s see how to build a C++ project using CMake.

First, go to the desktop and create a product_project_2 folder. Move (or copy and paste) your three source code files (Product.h, Product.cpp, and main.cpp) to that new folder. 

Install CMake.

sudo apt-get install cmake

Make sure you are in the product_project_2 folder. Then create the following text file and save it as CMakeLists.txt. This file builds the executable file called main from main.cpp and Product.cpp.

[Screenshot: contents of CMakeLists.txt]

Close the text editor to return to the terminal window.

Create a new folder.

mkdir build

Move to the build folder.

cd build

Type the following command to generate the build files from the CMakeLists.txt file in the parent folder:

cmake ..

Now you need to build the project. Type:

make

Run the code:

./main

If everything worked properly, you should see Output = 33

Learn C++ Fundamentals

At this stage, I recommend you go through short tutorials that cover the basics of C++. Here are the concepts you should work through:

Conclusion

Congratulations! We have covered a lot of ground. You have come a long way from the beginning of this tutorial and now have a solid foundation in C++, one of the most popular languages for building robots.

Keep building!

The Complete Guide to Linux Fundamentals for Robotics

In this tutorial, you will learn the most common commands and tools you will use again and again as you work with ROS on Linux for your robotics projects. 

While there are hundreds of Linux commands, there are really only a handful that you will use repeatedly. I’m a firm believer in the Pareto Principle (also known as the 80/20 rule). Don’t waste your time memorizing a bunch of commands you may never use; instead, focus on learning the fundamentals, the handful of commands and tools that you will use most frequently. You can look up everything else when you need it.

This tutorial has a lot of steps, but be patient and take your time as you work all the way through it. By the end of this tutorial, you will have the rock-solid confidence to move around in Linux with ease.

Without further ado, let’s get started!

Prerequisites

Explore the Folder Hierarchy

Open up a fresh Linux terminal window.

Let’s check out the directory (i.e. folder) structure of the catkin_ws folder, the workspace folder for ROS. 

Install the tree program.

sudo apt-get install tree

Type:

tree catkin_ws

You should see a hierarchy of all the folders underneath the catkin_ws folder.

If at any time you want to see the hierarchy of all folders underneath a specific folder, just type:

tree <path to the folder>

Navigate Between Folders

Ok, now we need to move to our catkin_ws/src folder.

cd catkin_ws/src 

If you don’t have Git installed, install it now: https://git-scm.com/book/en/v2/Getting-Started-Installing-Git

You might have nothing in your catkin_ws/src directory at this stage. In order to learn Linux, it is best if we work with an actual ROS project with real folders and files so that we get comfortable with how Linux works. 

Fortunately, the Robot Ignite Academy (they provide excellent ROS courses by the way) has some projects and simulations that they made publicly available here: https://bitbucket.org/theconstructcore/. Let’s download their Linux files.

Type this command (all on one line):

git clone https://bitbucket.org/theconstructcore/linux_course_files.git

Type the following command to see what files you have in there:

dir

You should have a folder named linux_course_files

The Linux file system is made up of two main parts:

  1. Files: Used to store data
  2. Folders: Used to store files and other folders

Ok, let’s move to the folder that contains the Python file bb8_keyboard.py. We have to change directories (i.e. move folders) to get to where the program is located. Let’s do that now.

cd linux_course_files/move_bb8_pkg/src

Type the following command to see what is inside that folder:

dir

Let’s check the path to this directory.

pwd

Now, we move up one folder, to the move_bb8_pkg folder.

cd ..

How do we move back to the src folder?

cd src

How do we get back to the home folder?

cd ~

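Note that we could have also done:

cd /home/ros

~ is shorthand for the current user’s home directory, which is /home/ros on my machine (yours will be /home/<your username>).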

Now that we are at the home folder, let’s see what the full path to this directory is:

pwd

List all the files and folders in the home directory.

ls

That ls command doesn’t list the hidden files. We need to type this command to list all the files.

ls -la

The . in front of the files means that they are hidden. You won’t see them using the regular ls command.

We could have also done:

ls --all

To see all the options you can use with ls, just type:

ls --help

To get the manual, we could have alternatively typed:

man ls

To get out of the manual, type q.

Create New Folders and Files

Ok, let’s learn how to make a new folder. Type the following command:

mkdir test_folder

If you type dir, you should see a new folder in there named test_folder.

Move to that folder.

cd test_folder

Create a new text file named new_file.txt.

touch new_file.txt

Open the new file in a text editor.

gedit new_file.txt

Close the text editor.

Now, go to the linux_course_files folder.

cd  ~/catkin_ws/src/linux_course_files/

Create a new folder called scripts.

mkdir scripts

Move into the scripts folder.

cd scripts

Create a new Python program called hello_world.py

touch hello_world.py

Open the program using the text editor.

gedit hello_world.py

Write a simple program that prints “Hello World” every two seconds.

[Screenshot: source code of hello_world.py]

Save the program, and run it by typing:

python hello_world.py

Now go to the catkin_ws/src/linux_course_files/ folder.

cd  ~/catkin_ws/src/linux_course_files/

Move the scripts folder to the move_bb8_pkg folder.

The syntax is:

mv <file/folder we want to move> <destination>

So we type:

mv scripts move_bb8_pkg 

Note that in this case, we were in the linux_course_files folder, which contains both the scripts and move_bb8_pkg folders. For this reason, we did not have to type out the entire path of both folders (e.g. the entire path of the scripts folder is ~/catkin_ws/src/linux_course_files/scripts).

Now let’s move to the move_bb8_pkg folder to see if the scripts folder is in there.

cd move_bb8_pkg

Note, in the command above, if we had typed cd m (i.e. the first letter of the folder of interest) and then pressed the Tab key on our keyboard, the system would have automatically filled in the rest. This saves us from having to type out cd move_bb8_pkg.

The scripts folder is in there, containing the hello_world.py program.

cd scripts
dir

Let’s copy the hello_world.py file into a new Python file:

cp hello_world.py hello_world_copy.py

You have a new file called hello_world_copy.py, which has all the contents of hello_world.py.

The syntax for the copy operation in Linux is:

cp <file we want to copy> <name of the new file>

If you want to copy a folder, you can do the following command:

cp -r <folder we want to copy> <name of the new folder>

Remember, any time you want to see what other options you have with the cp command, just type:

cp --help

If you ever want to remove a file, here is the command:

rm <file to remove>

To remove a folder, you type:

rm -r <folder you want to remove>

Explore Permissions in Linux

Let’s take a look at how permissions work in Linux. Let’s open a new Linux terminal and change directories to our scripts folder.

cd  ~/catkin_ws/src/linux_course_files/move_bb8_pkg/scripts

Now type:

ls -la

In the output, notice the hello_world.py file. At the beginning of its line, you see the following:

-rw-r--r--

Linux has three permission types for files and directories:

  • Read: Denoted by the letter r. This means that the user can read the file or directory.
  • Write: Denoted by the letter w. This means that the user can write or modify the file or directory.
  • Execute: Denoted by the letter x. This means that the user can execute the file.

There are also three different user permission groups:

  • Owner: The user who owns the file or directory (in this case, that’s me!).
  • Group: Whatever group the file or directory was assigned to.
  • All Users: This permission group includes the rest of the users.

Given this information above, let’s translate the line of hello_world.py containing -rw-r--r--. The leading - means that this is a regular file (a directory would show a d instead). After that, everything is in order, reading from left to right:

  • Owner has read and write privileges (i.e. rw-)
  • The group has only read privileges (i.e. r--)
  • All other users have only read privileges (i.e. r--)

Notice how no users, including the owner, have execute privileges on the hello_world.py file.

Try executing the hello_world.py file now using the rosrun command, and see what you get. The syntax is rosrun <package name> <program you want to run>, so we type:

rosrun move_bb8_pkg hello_world.py

You get this ugly message about not being able to find the executable. How do we change that?

You must use the chmod command.

chmod +x hello_world.py

Now type:

ls -la

Can you see that little x, which means that execution permissions have been added to the file? The following permissions are now in place:

  • Owner: Read, Write and Execute (rwx)
  • Group: Read and Execute (r-x)
  • All Other Users: Read and Execute (r-x)

Now, run hello_world.py again using the ROS command rosrun.

rosrun move_bb8_pkg hello_world.py

Voila! You should see your program working now, with “Hello World” messages scrolling down the screen.

Create a Bash Script

Up until now, when we wanted to run commands in Linux, we opened up a terminal window and typed the command manually and then ran it. Examples of the commands we have used so far include cd (to change directories), ls (to list the folders and files inside a directory), and mv (to move a folder or file from one place to another), etc. 

Fortunately, Linux has something called bash scripts. Bash scripts are text files that can contain commands that we would normally type out manually in the Linux terminal. Then, when you want to run a set of commands, all you have to do is run the bash script (which contains the commands that you want to execute). Pretty cool right? Software engineers are always looking for ways to automate tedious tasks!

Let’s take a look now at bash scripts. Open a new terminal window. Move to your scripts folder.

cd ~/catkin_ws/src/linux_course_files/move_bb8_pkg/scripts

Let’s create a new file named bash_script.sh.

touch bash_script.sh

Type the following command, and you should see the bash_script.sh file inside:

dir

Now open bash_script.sh.

gedit bash_script.sh

Type these two lines of code inside that file and click Save.

[Screenshot: the two lines of bash_script.sh]

The first line of code is:

#!/bin/bash

This line, known as a shebang, tells Linux to run this file using the bash interpreter located at /bin/bash.

The .sh file extension is the convention for bash scripts. The echo command just tells the Linux terminal to print the message that follows to the screen.

Now, let’s close out the text editor and go back to the terminal window.

Type the following command:

./bash_script.sh

Uh oh! We have an error. Something about permissions. Let’s find out why.

Type the following command:

ls -la

Notice how the bash_script.sh file only has read and write permissions because there is no x. Let’s fix that by giving ourselves execute permissions on this file.

chmod +x bash_script.sh

Now type:

ls -la

You should see that we now have execute permissions.

Let’s try to run the program again.

./bash_script.sh

We can also pass arguments to bash scripts.

Create a new bash script file named demo.sh.

touch demo.sh

Open the file:

gedit demo.sh

Add this code:

[Screenshot: source code of demo.sh]

The $1 means the first argument. We can have $1, $2, $3…$N…depending on how many arguments you want to pass to the script. In our example above, we are only passing one argument. This argument is stored in the ARG1 variable.

The fi at the end of the if statement is necessary to close out the if statement.

Save the file.

Change the file’s permissions.

chmod +x demo.sh

Now run the program, passing in the argument AutomaticAddison, separated from the command by one space.

./demo.sh AutomaticAddison

Explore the .bashrc File

Type the following command to go back home.

cd

Type the following command to list all files.

ls -la

You see that .bashrc file? Open it up in a text editor.

gedit .bashrc

Hmmm. What is this file with all this crazy code in it?

The .bashrc is a special bash script which is always located in your home directory. It contains an assortment of commands, aliases, variables, configuration, settings, and useful functions. 

The .bashrc script runs automatically any time you open up a new terminal window, tab, or pane in Linux. However, if you have a terminal window open and want to rerun the .bashrc script (for example, after editing it), you have to use the following command from your home directory:

source .bashrc

Explore Environment Variables

Open a fresh, new terminal window and type:

export

You will see a list of all the environment variables in your system with their corresponding values. Environment variables are variables that describe the environment in which programs run. The programs that run on your computer use environment variables to answer questions such as: What is the username of the current user? What version of ROS is installed? Where is ROS installed?

There are lots of environment variables. How do we filter this list to get only the variables that contain ROS in them? Type the following command:

export | grep ROS

The ROS_PACKAGE_PATH variable tells the system where a program would find ROS packages.

If at any point, you want to change a variable’s value, you use the following syntax:

export ROS_PACKAGE_PATH="<some new path here>"

The grep command is pretty cool. You can use it in conjunction with other commands too. Go to your catkin_ws/src folder.

cd ~/catkin_ws/src

Type:

ls

You should see a list of all files. Now type this:

ls | grep hello

You will see that all files containing the string ‘hello’ in the name are shown on the screen.

Understand Processes in Linux

In this section, we’ll take a look at Linux processes. A process is a computer program in action. A computer program consists of data and a set of instructions for your computer to execute. 

At any given time multiple processes are running on your Linux system. There are foreground processes and background processes. 

  • Foreground processes (also known as interactive processes) require user input and will only initialize if requested by a user in a terminal window. 
  • Background processes (also known as non-interactive or automatic processes) launch automatically, without requiring prior user input.

Launch a Foreground Process

Let’s see the processes that are currently running on our system. Open a new terminal window and type:

ps faux

Now, let’s start a new foreground process. In a new terminal tab, type the following command:

rosrun move_bb8_pkg hello_world.py

Open a new tab, and type:

ps faux | grep hello_world

The first result is the hello_world.py program we are running. 

Go back to the hello_world.py terminal and kill the process by pressing:

Ctrl + C

Now return to the other terminal window and type this command:

ps faux | grep hello_world

The process is now gone because we killed it. (The grep --color=auto hello_world result is just the grep command we ran to search the process list, not the hello_world.py program, so you can ignore it.)

What happens if we have a process and all we want to do is to suspend it rather than kill it? What do we do?

Let’s take a look at that now.

In a new terminal window, launch the hello_world.py program again.

rosrun move_bb8_pkg hello_world.py

In a new terminal tab, type:

ps faux | grep hello_world

The number that is in the second column is the process ID (PID). The PID is 17208 (yours will be different). 

Now return to the window where “Hello World” keeps printing out and suspend the process (i.e. pause the foreground process and hand the terminal back to you) by typing:

Ctrl + Z

In a new terminal tab, type:

ps faux | grep hello_world

You will notice that the hello_world.py process still exists, but it is now suspended (stopped) in the background instead of running.

Now let’s kill the process. The syntax is kill <PID>. In my case, I type:

kill 17208

If you type ps faux | grep hello_world, you will notice the process is still not killed. The reason is that a suspended process cannot act on the kill signal until it starts running again. We need to go back to the window where we were running the hello_world.py script and resume the suspended process in the background by typing:

bg

In a new terminal tab, type:

ps faux | grep hello_world

You will see that the process has been killed.

Launch a Background Process

Now, let’s launch our hello_world.py program as a background process rather than a foreground process. Type the following in a new terminal window.

rosrun move_bb8_pkg hello_world.py &
38-launch-background-processJPG

The PID (process ID) is 17428.

Now try to kill the process.

Ctrl + C

Didn’t work, did it? Now try:

Ctrl + Z

Still didn’t work.

The reason the Ctrl + C or Ctrl + Z commands did not work is because they only work on foreground processes. We launched the process in the background. To kill it, we need to open up a new terminal tab and type kill <PID>:

kill 17428

To verify that the process is killed, type:

ps faux | grep hello_world
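Here is a small Python sketch of the same idea: start a process that runs alongside you, note its PID, and end it with the signal that kill sends by default. The function name and the sleeping child are assumptions for illustration, and os.kill with SIGTERM behaves this way on Linux and macOS.

```python
import os
import signal
import subprocess
import sys


def launch_and_kill():
    # Popen returns immediately, so the child keeps running on its own,
    # much like a shell job started with a trailing &
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"])
    pid = child.pid
    os.kill(pid, signal.SIGTERM)       # same default signal as `kill <PID>`
    exit_code = child.wait(timeout=5)  # negative value = ended by a signal
    return pid, exit_code


if __name__ == "__main__":
    print(launch_and_kill())
```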

Return to Table of Contents

Understand Secure Shell (SSH Protocol) in Linux

In this section, we will explore the SSH protocol. SSH protocol provides a secure way to access a computer over an unsecured network. Your computer (the “client”) can access another computer (the “server”) to transfer files to it, run programs, carry out commands, etc.

If you have been following me for a while, you might remember when I connected my Windows-based laptop computer (“client”) to a Raspberry Pi (“server”) using the SSH protocol. I was then able to control the Raspberry Pi directly from my laptop.

The most common use case for SSH protocol is when you want to control a robot remotely. For example, imagine a robot that has an onboard computer (e.g. a Raspberry Pi). You can use SSH to run programs on the Raspberry Pi via your personal laptop computer.

Let’s see how SSH works. We will connect to a Raspberry Pi that is already set up with SSH from our Ubuntu Linux virtual box machine. The Raspberry Pi is the “remote” server, and Ubuntu Linux, which is running in a Virtual Box on my Windows 10 machine, is the “local” client.

Note: You don’t have to go out and buy a Raspberry Pi right now. I just want to give you an idea of what the steps would be when you encounter a situation when you need to set up SSH communication between your local machine (e.g. personal laptop running Ubuntu Linux in a Virtual Box) and a remote server (e.g. Raspberry Pi) mounted on a robot.

The first thing we need to do is to install the openssh-server package on the Ubuntu Linux machine. 

Type:

sudo apt update
sudo apt install openssh-server

Enter your password and press Y.

Type the following:

sudo systemctl status ssh

You should see a message that says active (running).

39-ssh-runningJPG

Press q to exit.

Open the SSH port:

sudo ufw allow ssh

Check if ssh is installed.

ssh
41-check-ssh-installedJPG

On the Raspberry Pi, in a terminal window I type:

hostname -I
42-rpi-ip-addressJPG

I see my Raspberry Pi’s IP address is:

192.168.0.17

You can also type the command:

ip a

You will use the IP address next to inet.

45-raspberry-pi-ip-addressJPG
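If you want to pick out that inet address automatically, a short Python sketch like the one below can do it. The sample text is a hypothetical, shortened version of what ip a prints (the interface name wlan0 is an assumption), and inet_addresses is just a name I made up for the helper.

```python
import re


def inet_addresses(ip_output):
    """Return IPv4 addresses listed after `inet`, skipping loopback."""
    found = re.findall(
        r"^\s*inet (\d+\.\d+\.\d+\.\d+)/\d+", ip_output, flags=re.MULTILINE)
    return [address for address in found if not address.startswith("127.")]


# Hypothetical, shortened `ip a` output
SAMPLE = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
2: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP
    inet 192.168.0.17/24 brd 192.168.0.255 scope global wlan0
"""

print(inet_addresses(SAMPLE))
```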

I want to connect to it from my Ubuntu Linux virtual box machine. The syntax for connecting to a remote computer via SSH is:

ssh <user>@<host>

where user is the account I want to login to on the remote machine, and host is the IP address of the remote machine (server) I want to login to. In my case, from an Ubuntu Linux terminal window, I type:

ssh pi@192.168.0.17
43-connect-via-sshJPG

You will need to type the password for that username on the Raspberry Pi.

You can see that I am now connected to my Raspberry Pi. Type the following command to get a list of all the folders in the home directory.

ls
43b-list-files-on-rpiJPG

When I want to disconnect from the Raspberry Pi, I type:

exit
44-exit-remote-sshJPG

Typing the exit command gets me back to my normal user (i.e. ros).
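When you start scripting robot tasks, you will often build that ssh <user>@<host> command from Python. Below is a minimal sketch under my own assumptions: build_ssh_command is a hypothetical helper, and the actual call is left commented out because it needs network access and the Raspberry Pi's password.

```python
import shlex


def build_ssh_command(user, host, remote_command=None):
    """Build the argument list for `ssh <user>@<host> [command]`."""
    command = ["ssh", "{}@{}".format(user, host)]
    if remote_command:
        # ssh runs this command on the remote machine and then disconnects
        command.append(remote_command)
    return command


cmd = build_ssh_command("pi", "192.168.0.17", "ls")
print(" ".join(shlex.quote(part) for part in cmd))

# To actually run it (needs network access and the remote password):
# import subprocess
# subprocess.run(cmd)
```

Passing a command such as ls after the host is handy when you only need one result from the robot and don't want an interactive session.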

Return to Table of Contents

Explore the “sudo” and “apt” Commands

Linux provides a tool called the Advanced Package Tool (APT). The Advanced Package Tool automates the retrieval, configuration and installation of software packages.

Let’s update the list of available software packages using this command in a new terminal window.

apt-get update 
46-we-have-a-problemJPG

Hmm. That didn’t work. We have a problem. The problem is that we don’t have permissions to access the package database. Now, run this command:

sudo apt-get update

Type your password. Here is the output:

47-it-workedJPG

It worked!

What does “sudo” mean?

sudo is an abbreviated form of “super user do.” Putting “sudo” in front of a command tells Linux to run the command as a super user (i.e. root user) or another user.
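To see why the first attempt failed, it helps to know that the root user always has user ID 0. The sketch below (the helper name is my own invention, and os.geteuid() is available on Linux and macOS but not Windows) adds sudo in front of a command only when the current user is not root.

```python
import os


def with_sudo_if_needed(command, effective_uid):
    """Prefix `command` with sudo unless we already run as root (UID 0)."""
    if effective_uid == 0:
        return list(command)
    return ["sudo"] + list(command)


if __name__ == "__main__":
    # A regular user gets ['sudo', 'apt-get', 'update']; root gets the bare command
    print(with_sudo_if_needed(["apt-get", "update"], os.geteuid()))
```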

Return to Table of Contents

Keep Building!

Congratulations on reaching the end of this tutorial. I recommend you bookmark this page and come back to it regularly as you work with ROS on Linux. We have covered the most essential commands and tools you will use again and again as you work with ROS on Linux for your robotics projects. Keep building!

Real-Time Object Recognition Using a Webcam and Deep Learning

*** This tutorial is two years old and may no longer work properly. You can find an updated tutorial for object recognition at this link***

In this tutorial, we will develop a program that can recognize objects in a real-time video stream on a built-in laptop webcam using deep learning.

object-detection-recognition-video-demo

Object recognition involves two main tasks:

  1. Object Detection (Where are the objects?): Locate objects in a photo or video frame
  2. Image Classification (What are the objects?): Predict the type of each object in a photo or video frame

Humans can do both tasks effortlessly, but computers cannot.

Computers require a lot of processing power to take full advantage of the state-of-the-art algorithms that enable object recognition in real time. However, in recent years, the technology has matured, and real-time object recognition is now possible with only a laptop computer and a webcam.

Real-time object recognition systems are currently being used in a number of real-world applications, including the following:

  • Self-driving cars: detection of pedestrians, cars, traffic lights, bicycles, motorcycles, trees, sidewalks, etc.
  • Surveillance: catching thieves, counting people, identifying suspicious behavior, child detection.
  • Traffic monitoring: identifying traffic jams, catching drivers that are breaking the speed limit.
  • Security: face detection, identity verification on a smartphone.
  • Robotics: robotic surgery, agriculture, household chores, warehouses, autonomous delivery.
  • Sports: ball tracking in baseball, golf, and football.
  • Agriculture: disease detection in fruits.
  • Food: food identification.

There are a lot of steps in this tutorial. Have fun, be patient, and be persistent. Don’t give up! If something doesn’t work the first time around, try again. You will learn a lot more by fighting through to the end of this project. Stay relentless!

By the end of this tutorial, you will have the rock-solid confidence to detect and recognize objects in real time on your laptop’s GPU (Graphics Processing Unit) using deep learning.

Let’s get started!

Table of Contents

You Will Need

Install TensorFlow CPU

We need to get all the required software set up on our computer. I will be following this really helpful tutorial.

Open an Anaconda command prompt terminal.

1-open-command-promptJPG

Type the command below to create a virtual environment named tensorflow_cpu that has Python 3.6 installed. 

conda create -n tensorflow_cpu pip python=3.6

Press y and then ENTER.

A virtual environment is like an independent Python workspace which has its own set of libraries and Python version installed. For example, you might have a project that needs to run using an older version of Python, like Python 2.7. You might have another project that requires Python 3.7. You can create separate virtual environments for these projects.
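Once you have more than one environment, it is easy to lose track of which one is active. Here is a small Python sketch that reports it. The function name is hypothetical; it relies on the CONDA_DEFAULT_ENV variable that conda sets when you activate an environment, and it falls back to "(none)" when that variable is missing.

```python
import os
import sys


def describe_environment(environ=None):
    """Summarize which Python and which conda environment are in use."""
    environ = os.environ if environ is None else environ
    return {
        "python_version": "{}.{}".format(*sys.version_info[:2]),
        "interpreter_prefix": sys.prefix,
        # conda sets CONDA_DEFAULT_ENV when an environment is activated
        "conda_env": environ.get("CONDA_DEFAULT_ENV", "(none)"),
    }


print(describe_environment())
```

Run it inside two different environments and you should see the Python version and prefix change with each one.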

Now, let’s activate the virtual environment by using this command:

conda activate tensorflow_cpu
2-activate-virtual-environmetJPG

Type the following command to install TensorFlow CPU.

pip install --ignore-installed --upgrade tensorflow==1.9

Wait for TensorFlow CPU to finish installing. Once it is finished installing, launch Python by typing the following command:

python
3-launch-pythonJPG

Type:

import tensorflow as tf

Here is what my screen looks like now:

4-import-tensorflowJPG

Now type the following:

hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()

You should see a message that says: “Your CPU supports instructions that this TensorFlow binary….”. Just ignore that. Your TensorFlow will still run fine.

Now run this command to complete the test of the installation:

print(sess.run(hello))
6-test-installationJPG

Press CTRL+Z. Then press ENTER to exit.

Type:

exit

That’s it for TensorFlow CPU. Now let’s install TensorFlow GPU.

Return to Table of Contents

Install TensorFlow GPU

Your system must meet the following requirements:

  • Nvidia GPU (GTX 650 or newer…I’ll show you later how to find out what Nvidia GPU version is in your computer)
  • CUDA Toolkit v9.0 (we will install this later in this tutorial)
  • cuDNN v7.0.5 (we will install this later in this tutorial)
  • Anaconda with Python 3.7+

Here is a good tutorial that walks through the installation, but I’ll outline all the steps below.

Install CUDA Toolkit v9.0

The first thing we need to do is to install the CUDA Toolkit v9.0. Go to this link.

Select your operating system. In my case, I will select Windows, x86_64, Version 10, and exe (local).

7-select-target-platformJPG

Download the Base Installer as well as all the patches. I downloaded all these files to my Desktop. It will take a while to download, so just wait while your computer downloads everything.

8-base-installerJPG
9-patchesJPG

Open the folder where the downloads were saved.

10-downloaded-to-desktopJPG

Double-click on the Base Installer program, the largest of the files that you downloaded from the website.

Click Yes to allow the program to make changes to your device.

Click OK to extract the files to your computer.

11-click-okJPG
12-extract-filesJPG

I saw this error window. Just click Continue.

13-error-message-continueJPG

Click Agree and Continue.

14-agree-and-continueJPG

If you saw that error window earlier (“…you may not be able to run CUDA applications with this driver…”), select the Custom (Advanced) install option and click Next. Otherwise, do the Express installation and follow all the prompts.

15-custom-advancedJPG

Uncheck the Driver components, PhysX, and Visual Studio Integration options. Then click Next.

Click Next.

16-installation-location-click-nextJPG

Wait for everything to install.

17-prepare-for-installationJPG

Click Close.

18-installer-finishedJPG

Delete C:\Program Files\NVIDIA Corporation\Installer2.

19-delete-installer-2JPG

Double-click on Patch 1.

Click Yes to allow changes to your computer.

Click OK.

20-click-okJPG

Click Agree and Continue.

21-agree-and-continueJPG

Go to Custom (Advanced) and click Next.

22-custom-advancedJPG

Click Next.

23-click-nextJPG

Click Close.

The process is the same for Patch 2. Double-click on Patch 2 now.

Click Yes to allow changes to your computer.

Click OK.

Click Agree and Continue.

Go to Custom (Advanced) and click Next.

Click Next.

24-click-nextJPG

Click Close.

25-click-closeJPG

The process is the same for Patch 3. Double-click on Patch 3 now.

Click Yes to allow changes to your computer.

Click OK.

Click Agree and Continue.

Go to Custom (Advanced) and click Next.

Click Next.

Click Close.

The process is the same for Patch 4. Double-click on Patch 4 now.

Click Yes to allow changes to your computer.

Click OK.

Click Agree and Continue.

Go to Custom (Advanced) and click Next.

Click Next.

After you’ve installed Patch 4, your screen should look like this:

26-installed-patch-4JPG

Click Close.

To verify your CUDA installation, go to the command terminal on your computer, and type:

nvcc --version
26-verify-cuda-versionJPG
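If you would rather check the version from a script, the release number can be pulled out of the nvcc text with a regular expression. The sample line below is my assumption of what a CUDA 9.0 installation typically prints, and cuda_release is a hypothetical helper name.

```python
import re


def cuda_release(nvcc_output):
    """Return the release number (e.g. '9.0') found in nvcc's version text."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


# Assumed example of the last line printed by `nvcc --version`
SAMPLE = "Cuda compilation tools, release 9.0, V9.0.176"

print(cuda_release(SAMPLE))
```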

Return to Table of Contents

Install the NVIDIA CUDA Deep Neural Network library (cuDNN)

Now that we installed the CUDA 9.0 base installer and its four patches, we need to install the NVIDIA CUDA Deep Neural Network library (cuDNN). Official instructions for installing are on this page, but I’ll walk you through the process below.

Go to https://developer.nvidia.com/rdp/cudnn-download

Create a user profile if needed and log in.

27-become-a-memberJPG

Go to this page: https://developer.nvidia.com/rdp/cudnn-download

Agree to the terms of the cuDNN Software License Agreement.

28-agree-to-termsJPG

We have CUDA 9.0, so we need to click cuDNN v7.6.4 (September 27, 2019), for CUDA 9.0.

29-download-cudnnJPG

I have Windows 10, so I will download cuDNN Library for Windows 10.

30-cudnn-windows10JPG

In my case, the zip file downloaded to my Desktop. I will unzip that zip file now, which will create a new folder of the same name…just without the .zip part. These are your cuDNN files. We’ll come back to these in a second.

31-unzipJPG

Before we get going, let’s double check what GPU we have. If you are on a Windows machine, search for the “Device Manager.”

32-my-gpuJPG

Once you have the Device Manager open, you should see an option near the top for “Display Adapters.” Click the drop-down arrow next to that, and you should see the name of your GPU. Mine is NVIDIA GeForce GTX 1060.

33-nvidia-control-panelJPG

If you are on Windows, you can also check what NVIDIA graphics driver you have by right-clicking on your Desktop and clicking the NVIDIA Control Panel. My version is 430.86. This version fits the requirements for cuDNN.

Ok, now that we have verified that our system meets the requirements, let’s navigate to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0, your CUDA Toolkit directory.

34-navigate-to-cuda-toolkit-directoryJPG

Now go to your cuDNN files, that new folder that was created when you did the unzipping. Inside that folder, you should see a folder named cuda. Click on it.

35-named-cudaJPG

Click bin.

36-click-binJPG

Copy cudnn64_7.dll to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\bin. Your computer might ask you to allow Administrative Privileges. Just click Continue when you see that prompt.

Now go back to your cuDNN files. Inside the cuda folder, click on include. You should see a file named cudnn.h.

37-click-includeJPG
38-cudnnhJPG

Copy that file to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\include. Your computer might ask you to allow Administrative Privileges. Just click Continue when you see that prompt.

Now go back to your cuDNN files. Inside the cuda folder, click on lib -> x64. You should see a file named cudnn.lib. 

39-cudnn-libJPG

Copy that file to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\lib\x64. Your computer might ask you to allow Administrative Privileges. Just click Continue when you see that prompt.
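The three copy steps above follow one pattern: each file goes into the folder of the same name inside the CUDA Toolkit directory. Here is a Python sketch of that pattern. The function names are hypothetical, and the demo only does a dry run with empty placeholder files in a temporary folder, so it never touches your real CUDA installation.

```python
import shutil
import tempfile
from pathlib import Path

# (file inside the unzipped cuda folder, target folder inside the CUDA Toolkit)
CUDNN_FILES = [
    (Path("bin") / "cudnn64_7.dll", Path("bin")),
    (Path("include") / "cudnn.h", Path("include")),
    (Path("lib") / "x64" / "cudnn.lib", Path("lib") / "x64"),
]


def copy_cudnn_files(cudnn_cuda_dir, toolkit_dir):
    """Copy the three cuDNN files into the matching CUDA Toolkit folders."""
    copied = []
    for relative_file, relative_target in CUDNN_FILES:
        target_dir = Path(toolkit_dir) / relative_target
        target_dir.mkdir(parents=True, exist_ok=True)
        source = Path(cudnn_cuda_dir) / relative_file
        copied.append(Path(shutil.copy2(str(source), str(target_dir))))
    return copied


def demo():
    # Dry run in a throwaway directory with empty placeholder files
    with tempfile.TemporaryDirectory() as tmp:
        source, toolkit = Path(tmp) / "cuda", Path(tmp) / "v9.0"
        for relative_file, _ in CUDNN_FILES:
            (source / relative_file).parent.mkdir(parents=True, exist_ok=True)
            (source / relative_file).touch()
        return sorted(path.name for path in copy_cudnn_files(source, toolkit))


if __name__ == "__main__":
    print(demo())
```

On a real machine you would need to run the script with administrator rights, since C:\Program Files is protected.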

If you are using Windows, do a search on your computer for Environment Variables. An option should pop up to allow you to edit the Environment Variables on your computer.

Click on Environment Variables.

40-environment-variablesJPG

Make sure your CUDA_PATH variable is set to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0.

41-cuda-path-setJPG

I recommend restarting your computer now.

Return to Table of Contents

Install TensorFlow GPU

Now we need to install TensorFlow GPU. Open a new Anaconda terminal window. 

42-anaconda-terminal-windowJPG

Create a new Conda virtual environment named tensorflow_gpu by typing this command:

conda create -n tensorflow_gpu pip python=3.6

Type y and press Enter.

43-conda-virtual-environmentJPG

Activate the virtual environment.

conda activate tensorflow_gpu
44-activate-virtual-envJPG

Install TensorFlow GPU for Python.

pip install --ignore-installed --upgrade tensorflow-gpu==1.9

Wait for TensorFlow GPU to install.

Now let’s test the installation. Launch the Python interpreter.

python

Type this command.

import tensorflow as tf

If you don’t see an error, TensorFlow GPU is successfully installed.

45-test-installation-1JPG

Now type this:

hello = tf.constant('Hello, TensorFlow!')
46-now-type-thisJPG

And run this command. It might take a few minutes to run, so just wait until it finishes:

sess = tf.Session()
47-all-finishedJPG

Now type this command to complete the test of the installation:

print(sess.run(hello))
48-complete-the-testJPG

You can further confirm whether TensorFlow can access the GPU, by typing the following into the Python interpreter (just copy and paste into the terminal window while the Python interpreter is running).

tf.test.is_gpu_available(
    cuda_only=True,
    min_cuda_compute_capability=None
)
49-further-testJPG

To exit the Python interpreter, type:

exit()
50-exit-the-editorJPG

And press Enter.

Return to Table of Contents

Install TensorFlow Models

Now that we have everything set up, let’s install some useful libraries. I will show you the steps for doing this in my TensorFlow GPU virtual environment, but the steps are the same for the TensorFlow CPU virtual environment.

Open a new Anaconda terminal window. Let’s take a look at the list of virtual environments that we can activate.

conda env list
51-conda-env-listJPG

I’m going to activate the TensorFlow GPU virtual environment.

conda activate tensorflow_gpu

Install the libraries. Type this command:

conda install pillow lxml jupyter matplotlib opencv cython

Press y to proceed.

Once that is finished, you need to create a folder somewhere to hold the TensorFlow Models (e.g. C:\Users\addis\Documents\TensorFlow). If you have a D drive, you can save it there instead.

In your Anaconda terminal window, move to the TensorFlow directory you just created. You will use the cd command to change to that directory. For example:

cd C:\Users\addis\Documents\TensorFlow

Go to the TensorFlow models page on GitHub: https://github.com/tensorflow/models.

Click the button to download the zip file of the repository. It is a large file, so it will take a while to download.

52-download-zipJPG

Move the zip file to the TensorFlow directory you created earlier and extract the contents.

Rename the extracted folder to models instead of models-master. Your TensorFlow directory hierarchy should look like this:

TensorFlow

  • models
    • official
    • research
    • samples
    • tutorials
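A quick way to confirm the hierarchy above is to check for the four subfolders from Python. This is only a sketch: the helper names are mine, and the demo builds a deliberately incomplete layout in a temporary folder to show what a failed check looks like.

```python
import tempfile
from pathlib import Path

EXPECTED_FOLDERS = ("official", "research", "samples", "tutorials")


def missing_model_folders(tensorflow_dir):
    """List the expected subfolders of <tensorflow_dir>/models that are absent."""
    models_dir = Path(tensorflow_dir) / "models"
    return [name for name in EXPECTED_FOLDERS
            if not (models_dir / name).is_dir()]


def demo():
    # Build an incomplete layout on purpose, then report what is missing
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("official", "research"):
            (Path(tmp) / "models" / name).mkdir(parents=True)
        return missing_model_folders(tmp)


if __name__ == "__main__":
    print(demo())
```

An empty list means the folder was renamed and extracted the way the rest of this tutorial expects.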

Return to Table of Contents

Install Protobuf

Now we need to install Protobuf, which is used by the TensorFlow Object Detection API to configure the training and model parameters.

Go to this page: https://github.com/protocolbuffers/protobuf/releases

Download the latest *-win32.zip release (assuming you are on a Windows machine).

53-download-latestJPG

Create a folder in C:\Program Files and name it Google Protobuf.

Extract the contents of the downloaded *-win32.zip, inside C:\Program Files\Google Protobuf

54-extract-contentsJPG

Search for Environment Variables on your system. A window should pop up that says System Properties.

55-system-propertiesJPG

Click Environment Variables.

Go down to the Path variable and click Edit.

56-path-variable-editJPG

Click New.

57-click-newJPG

Add C:\Program Files\Google Protobuf\bin

You can also add it to the Path system variable.

Click OK a few times to close out all the windows.

Open a new Anaconda terminal window.

I’m going to activate the TensorFlow GPU virtual environment.

conda activate tensorflow_gpu

cd into your \TensorFlow\models\research\ directory and run the following command:

for /f %i in ('dir /b object_detection\protos\*.proto') do protoc object_detection\protos\%i --python_out=.
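That one-line loop runs protoc once for every .proto file in object_detection\protos. If the Windows for syntax looks cryptic, here is a Python sketch that builds the same list of commands. The function names are hypothetical and the demo uses empty placeholder files in a temporary folder; to really compile, you would pass each command to subprocess.run with cwd set to your research directory.

```python
import tempfile
from pathlib import Path


def protoc_commands(research_dir):
    """Build one protoc command per .proto file, relative to research_dir."""
    protos_dir = Path(research_dir) / "object_detection" / "protos"
    commands = []
    for proto in sorted(protos_dir.glob("*.proto")):
        relative = proto.relative_to(research_dir).as_posix()
        commands.append(["protoc", relative, "--python_out=."])
    return commands


def demo():
    # Placeholder files stand in for the real .proto definitions
    with tempfile.TemporaryDirectory() as tmp:
        protos = Path(tmp) / "object_detection" / "protos"
        protos.mkdir(parents=True)
        for name in ("box_coder.proto", "anchor_generator.proto", "README.md"):
            (protos / name).touch()
        return protoc_commands(tmp)


if __name__ == "__main__":
    for command in demo():
        print(" ".join(command))
```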

Now go back to the Environment Variables on your system. Create a New Environment Variable named PYTHONPATH (if you don’t have one already). Replace C:\Python27amd64 if you don’t have Python installed there. Also, replace <your_path> with the path to your TensorFlow folder.

C:\Python27amd64;C:\<your_path>\TensorFlow\models\research\object_detection
58-pythonpathJPG

For example:

C:\Python27amd64;C:\Users\addis\Documents\TensorFlow\models\research\object_detection

Now add these two paths to your PYTHONPATH environment variable:

C:\<your_path>\TensorFlow\models\research\
C:\<your_path>\TensorFlow\models\research\slim
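Putting the pieces together, the finished PYTHONPATH value is just those entries joined with semicolons. The sketch below assembles it for you; build_pythonpath is a hypothetical helper, and the default Python folder is the one from my machine, so swap in your own.

```python
from pathlib import PureWindowsPath


def build_pythonpath(tensorflow_dir, python_dir=r"C:\Python27amd64"):
    """Join the PYTHONPATH entries used in this tutorial with semicolons."""
    research = PureWindowsPath(tensorflow_dir) / "models" / "research"
    entries = [
        PureWindowsPath(python_dir),
        research / "object_detection",
        research,
        research / "slim",
    ]
    return ";".join(str(entry) for entry in entries)


print(build_pythonpath(r"C:\Users\addis\Documents\TensorFlow"))
```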

Return to Table of Contents

Install COCO API

Now, we are going to install the COCO API. You don’t need to worry about what this is at this stage. I’ll explain it later.

Download the Visual C++ 2015 Build Tools from here: https://go.microsoft.com/fwlink/?LinkId=691126

Choose the default installation.

59-visual-studioJPG

After it has installed, restart your computer.

60-setup-completedJPG

Open a new Anaconda terminal window.

I’m going to activate the TensorFlow GPU virtual environment.

conda activate tensorflow_gpu

cd into your \TensorFlow\models\research\ directory and run the following command to install pycocotools (everything below goes on one line):

pip install git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI
61-pycocotools-installedJPG

If it doesn’t work, install git: https://git-scm.com/download/win

Follow all the default settings for installing Git. You will have to click Next several times.

Once you have finished installing Git, run this command (everything goes on one line):

pip install git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI

Return to Table of Contents

Test the Installation

Open a new Anaconda terminal window.

I’m going to activate the TensorFlow GPU virtual environment.

conda activate tensorflow_gpu

cd into your \TensorFlow\models\research\object_detection\builders directory and run the following command to test your installation.

python model_builder_test.py

You should see an OK message.

62-test-your-installationJPG

Return to Table of Contents

Install LabelImg

Now we will install LabelImg, a graphical image annotation tool for labeling object bounding boxes in images.

Open a new Anaconda/Command Prompt window.

Create a new virtual environment named labelImg by typing the following command:

conda create -n labelImg

Activate the virtual environment.

conda activate labelImg

Install pyqt.

conda install pyqt=5

Press y to proceed.

Go to your TensorFlow folder, and create a new folder named addons.

63-new-folder-named-addonsJPG

Change to that directory using the cd command.

Type the following command to clone the repository:

git clone https://github.com/tzutalin/labelImg.git

Wait while labelImg downloads.

You should now have a folder named addons\labelImg under your TensorFlow folder.

Type exit to exit the terminal.

Open a new terminal window.

Activate the TensorFlow GPU virtual environment.

conda activate tensorflow_gpu

cd into your TensorFlow\addons\labelImg directory.

Type the following commands, one right after the other.

conda install pyqt=5
conda install lxml
pyrcc5 -o libs/resources.py resources.qrc
exit

Test the LabelImg Installation

Open a new terminal window.

Activate the TensorFlow GPU virtual environment.

conda activate tensorflow_gpu

cd into your TensorFlow\addons\labelImg directory.

Type the following command:

python labelImg.py

If you see this window, you have successfully installed LabelImg. Here is a tutorial on how to label your own images. Congratulations!

64-label-imgJPG

Return to Table of Contents

Recognize Objects Using Your WebCam

Approach

Note: This section gets really technical. If you know the basics of computer vision and deep learning, it will make sense. Otherwise, it will not. You can skip this section and head straight to the Implementation section if you are not interested in what is going on under the hood of the object recognition application we are developing.

In this project, we use OpenCV and TensorFlow to create a system capable of automatically recognizing objects in a webcam video stream. Each detected object is outlined with a bounding box labeled with the predicted object type as well as a detection score.

The detection score is the probability that a bounding box contains the object of a particular type (e.g. the confidence a model has that an object identified as a “backpack” is actually a backpack).
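To make bounding boxes and detection scores more concrete, here is a plain-Python sketch. It assumes boxes arrive in the normalized [ymin, xmin, ymax, xmax] order used by the TensorFlow Object Detection API; the sample numbers, the 0.5 threshold, and both helper names are made up for illustration.

```python
def to_pixel_box(box, image_width, image_height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel
    coordinates in (left, top, right, bottom) order."""
    ymin, xmin, ymax, xmax = box
    return (round(xmin * image_width), round(ymin * image_height),
            round(xmax * image_width), round(ymax * image_height))


def confident_detections(boxes, scores, labels, threshold=0.5):
    """Keep only the detections whose score reaches the threshold."""
    return [(label, score, box)
            for box, score, label in zip(boxes, scores, labels)
            if score >= threshold]


# Made-up detections for an 800 x 600 frame
boxes = [[0.25, 0.5, 0.75, 1.0], [0.0, 0.0, 0.5, 0.5]]
scores = [0.92, 0.31]
labels = ["backpack", "person"]

for label, score, box in confident_detections(boxes, scores, labels):
    print(label, score, to_pixel_box(box, 800, 600))
```

Only the backpack survives the filter here, because the second detection's score falls below the threshold.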

The model used in this project is ssd_inception_v2_coco. It uses the Single Shot MultiBox Detector (SSD) for its architecture and Inception v2 for feature extraction.

Single Shot MultiBox Detector (SSD)

Most state-of-the-art object detection methods involve the following stages:

  1. Hypothesize bounding boxes 
  2. Resample pixels or features for each box
  3. Apply a classifier

The Single Shot MultiBox Detector (SSD) eliminates the multi-stage process above and performs all object detection computations using just a single deep neural network.

Inception v2

Most state-of-the-art object detection methods based on convolutional neural networks at the time of the invention of Inception v2 added increasingly more convolution layers or neurons per layer in order to achieve greater accuracy. The problem with this approach is that it is computationally expensive and prone to overfitting. The Inception v2 architecture (as well as the Inception v3 architecture) was proposed in order to address these shortcomings.

Rather than stacking multiple kernel filter sizes sequentially within a convolutional neural network, the approach of the inception-based model is to perform a convolution on an input with multiple kernels all operating at the same layer of the network. By factorizing convolutions and using aggressive regularization, the authors were able to improve computational efficiency. Inception v2 factorizes the traditional 7 x 7 convolution into 3 x 3 convolutions.

Szegedy, Vanhoucke, Ioffe, Shlens, & Wojna (2015) demonstrated empirically in their landmark Inception v2 paper that factorizing convolutions and using aggressive dimensionality reduction can substantially lower computational cost while maintaining accuracy.
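A little arithmetic shows why factorizing helps. The sketch below counts weights only (biases are ignored) and assumes stride 1 with 64 input and 64 output channels, a number I picked purely for illustration. Three stacked 3 x 3 layers cover the same 7 x 7 area as one 7 x 7 layer, but with fewer weights.

```python
def conv_weights(kernel_size, in_channels, out_channels):
    """Number of weights in one square convolution layer (no biases)."""
    return kernel_size * kernel_size * in_channels * out_channels


def stacked_receptive_field(kernel_size, num_layers):
    """Receptive field of stacked stride-1 convolutions: each extra
    layer widens the field by kernel_size - 1."""
    return 1 + num_layers * (kernel_size - 1)


channels = 64  # illustrative assumption, not a value from the paper

single_7x7 = conv_weights(7, channels, channels)
three_3x3 = 3 * conv_weights(3, channels, channels)

print("receptive field of three 3x3 layers:", stacked_receptive_field(3, 3))
print("weights in one 7x7 layer:", single_7x7)
print("weights in three 3x3 layers:", three_3x3)
print("ratio:", round(three_3x3 / single_7x7, 3))
```

The ratio works out to 27/49, so the stacked version needs a bit over half the weights for the same coverage.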

Data Set

The ssd_inception_v2_coco model used in this project is pretrained on the Common Objects in Context (COCO) data set, a large-scale data set that contains 1.5 million object instances and more than 200,000 labeled images. The COCO data set required 70,000 crowd worker hours to gather, annotate, and organize images of objects in natural environments.

Software Dependencies

The following libraries form the object recognition backbone of the application implemented in this project:

  • OpenCV, a library of programming functions for computer vision.
  • Pillow, a library for manipulating images.
  • NumPy, a library for scientific computing.
  • Matplotlib, a library for creating graphs and visualizations.
  • TensorFlow Object Detection API, an open source framework developed by Google that enables the development, training, and deployment of pre-trained object detection models.

Return to Table of Contents

Implementation

Now to the fun part. We will recognize objects using our computer webcam.

Copy the following program, and save it to your TensorFlow\models\research\object_detection directory as object_detection_test.py.

# Import all the key libraries
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
import cv2

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
from utils import label_map_util
from utils import visualization_utils as vis_util

# Define the video stream
cap = cv2.VideoCapture(0)  

# Which model are we downloading?
# The models are listed here: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
MODEL_NAME = 'ssd_inception_v2_coco_2018_01_28'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'

# Path to the frozen detection graph. 
# This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'

# List of the strings that is used to add the correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'mscoco_label_map.pbtxt')

# Number of classes to detect
NUM_CLASSES = 90

# Download Model
opener = urllib.request.URLopener()
opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
tar_file = tarfile.open(MODEL_FILE)
for file in tar_file.getmembers():
    file_name = os.path.basename(file.name)
    if 'frozen_inference_graph.pb' in file_name:
        tar_file.extract(file, os.getcwd())

# Load a (frozen) Tensorflow model into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

# Loading label map
# Label maps map indices to category names, so that when our convolutional network 
# predicts `5`, we know that this corresponds to `airplane`.  Here we use internal 
# utility functions, but anything that returns a dictionary mapping integers to 
# appropriate string labels would be fine
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)


# Helper code
def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)
	
# Detection
with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        while True:

            # Read frame from camera
            ret, image_np = cap.read()
            # Stop if the webcam did not return a frame
            if not ret:
                break
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            # Extract image tensor
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            # Extract detection boxes
            boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
            # Extract detection scores
            scores = detection_graph.get_tensor_by_name('detection_scores:0')
            # Extract detection classes
            classes = detection_graph.get_tensor_by_name('detection_classes:0')
            # Extract number of detections
            num_detections = detection_graph.get_tensor_by_name(
                'num_detections:0')
            # Actual detection.
            (boxes, scores, classes, num_detections) = sess.run(
                [boxes, scores, classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})
            # Visualization of the results of a detection.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np,
                np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores),
                category_index,
                use_normalized_coordinates=True,
                line_thickness=8)

            # Display output
            cv2.imshow('object detection', cv2.resize(image_np, (800, 600)))

            if cv2.waitKey(25) & 0xFF == ord('q'):
                # Release the webcam before closing the display window
                cap.release()
                cv2.destroyAllWindows()
                break


print("We are finished! That was fun!")
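The label-map comment in the program says that a prediction of 5 corresponds to airplane. To show what kind of lookup that is, here is a plain-Python stand-in. It is not the label_map_util code itself; the helper names are hypothetical and only the first five COCO classes are included.

```python
# First five entries of the COCO label map (IDs start at 1)
COCO_SAMPLE = {1: "person", 2: "bicycle", 3: "car", 4: "motorcycle", 5: "airplane"}


def build_category_index(id_to_name):
    """Map each class ID to a small dictionary holding its ID and name."""
    return {class_id: {"id": class_id, "name": name}
            for class_id, name in id_to_name.items()}


def label_for(category_index, class_id):
    """Look up the display name for a predicted class ID."""
    return category_index.get(class_id, {"name": "N/A"})["name"]


category_index = build_category_index(COCO_SAMPLE)
print(label_for(category_index, 5))
```

As the comment in the program notes, anything that returns a dictionary from integers to string labels would work just as well.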

Open a new terminal window.

Activate the TensorFlow GPU virtual environment.

conda activate tensorflow_gpu

cd into your TensorFlow\models\research\object_detection directory.

At the time of this writing, we need to use NumPy version 1.16.4. Type the following command to see what version of NumPy you have on your system.

pip show numpy

If it is not 1.16.4, execute the following commands:

pip uninstall numpy
pip install numpy==1.16.4
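If you prefer to check the version from inside Python, you can compare version strings like this. It is a minimal sketch: both helper names are mine, and it assumes a plain numeric version such as 1.16.4 (the kind of string that numpy.__version__ reports for a regular release).

```python
def version_tuple(version):
    """Turn a plain numeric version such as '1.16.4' into (1, 16, 4)."""
    return tuple(int(part) for part in version.split(".")[:3])


def needs_reinstall(installed, required="1.16.4"):
    """True when the installed version differs from the required one."""
    return version_tuple(installed) != version_tuple(required)


print(needs_reinstall("1.16.4"))
print(needs_reinstall("1.17.0"))
```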

Now run your program:

python object_detection_test.py

In about 30 to 90 seconds, you should see your webcam power up and object recognition take action. That’s it! Congratulations on making it to the end of this tutorial!

object_detection_resultsJPG

Keep building!

Return to Table of Contents