In this blog post, we will take a look at the difference between maximum likelihood estimation and maximum a posteriori estimation. Both are methods for estimating the unknown parameters of a model.
What is Maximum Likelihood Estimation?
I will explain the term maximum likelihood estimation by using a real-world example.
Suppose you went out to the store and bought a six-sided die. You have no prior information about the type of die that you have. It could be a standard die with a uniform probability distribution (i.e. each face has a 1/6 chance of being face-up on any given roll), or you could have a weighted die where some numbers are more likely to appear than others. How would you calculate the probability of getting each number for a given roll of the die?
One way to calculate the probabilities is to roll the die 10,000 times and keep track of how many times (i.e. frequency in the table below) each face appeared. Your table might look something like this:
What you see above is the basis of maximum likelihood estimation. In maximum likelihood estimation, you estimate the parameters by maximizing the “likelihood function.” Stated more simply, you choose the parameter values that make the observed data (the frequencies in the table above) most probable.
So, what is the problem with maximum likelihood estimation? Well, as you saw above, we did not incorporate any prior knowledge (i.e. prior belief information) into our calculation. What if we knew that this was a weighted die where the probabilities of each face were as follows:
Maximum likelihood estimation does not take this into account. It assumes a uniform prior probability distribution. That is, it assumes that the probabilities of each face of the die are equal before we start rolling. The problem with this assumption is that you would need a huge dataset (i.e. you would have to roll the die many, many times, perhaps until your arm gets too tired to continue rolling!) before the estimated probabilities get close to the true ones.
More formally, maximum likelihood estimation looks like this:
Parameter = argmax P(Observed Data | Parameter)
where P = probability and | = given.
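To make this concrete, here is a minimal sketch in Python (the roll counts are made up, since the original frequency table is not reproduced here). For a die, the maximum likelihood estimate of each face's probability is simply that face's observed count divided by the total number of rolls:

```python
import numpy as np

# Hypothetical counts for faces 1-6 after rolling the die 10,000 times.
# (Illustrative numbers only, not real data.)
counts = np.array([1580, 1640, 1720, 1680, 1590, 1790])

# Maximum likelihood estimate: the relative frequency of each face.
# These are the parameter values that maximize P(Observed Data | Parameter).
mle_probs = counts / counts.sum()

for face, p in enumerate(mle_probs, start=1):
    print(f"P(face {face}) = {p:.3f}")
```

In other words, MLE here reduces to counting and normalizing, which is exactly why it leans entirely on the observed data.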
So what do we do? Maximum a posteriori (MAP) estimation to the rescue!
What is Maximum a Posteriori (MAP) Estimation?
Maximum a Posteriori (MAP) Estimation is similar to Maximum Likelihood Estimation (MLE), with one major difference: MAP takes prior probability information into account.
For example, if we knew that the die in our example above was a weighted die with the probabilities noted in the table in the previous section, MAP estimation factors this information into the parameter estimation.
More formally, MAP estimation looks like this:
Parameter = argmax P(Observed Data | Parameter)P(Parameter)
where P = probability and | = given.
Contrast the equation above with the MLE equation from the previous section. Note that the likelihood is now weighted by the prior probability. This distinction is important.
Imagine in MLE if we did 500 rolls, and one of the faces appeared only one time (i.e. 1/500). MLE would estimate that face's probability as 1/500, even if we had good reason to believe it should be higher. Having that extra nonzero prior probability factor keeps the model from overfitting to the observed data in the way that MLE does.
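Here is a minimal sketch of that scenario (the counts are hypothetical, and the prior is expressed as a Dirichlet distribution, a standard choice for multinomial problems like a die, not something specified in the original example). Note how the prior keeps the rarely seen face from being pinned near 1/500:

```python
import numpy as np

# Hypothetical outcome of 500 rolls: face 6 appeared only once.
counts = np.array([105, 102, 98, 97, 97, 1])
n = counts.sum()
k = len(counts)

# MLE: relative frequencies. Face 6 gets probability 1/500 = 0.002.
mle = counts / n

# Prior belief encoded as a Dirichlet(alpha) prior. Setting alpha = 11 for
# every face amounts to 10 "imaginary" prior observations of each face,
# i.e. a mild belief that the die is roughly fair.
alpha = np.full(k, 11.0)

# MAP estimate for a multinomial likelihood with a Dirichlet prior:
# theta_i = (n_i + alpha_i - 1) / (n + sum(alpha) - k)
map_est = (counts + alpha - 1) / (n + alpha.sum() - k)

print("MLE:", np.round(mle, 3))
print("MAP:", np.round(map_est, 3))
```

With the prior included, the estimate for face 6 moves from 0.002 to roughly 0.02, which is the regularizing effect described above.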
Why Is Maximum a Posteriori (MAP) Estimation the Ideal Estimation Method for Smaller Datasets?
It is ideal because it takes into account prior knowledge of an event. MLE does not and is prone to overfitting. For this reason, MAP can be viewed as a regularized form of MLE. Adding the prior probability information reduces over-dependence on the observed data for parameter estimation.
In this post, I will explore the idea of latent variable models, where a latent variable is a “hidden” variable (i.e., one we cannot observe directly but that plays a part in the overall model).
What is a Latent Variable?
A latent variable is a variable that is not directly observed but is inferred from other variables we can measure directly. A common example of a latent variable is quality of life. You cannot measure quality of life directly.
Real-world Example of Latent Variables
A few years ago, my wife and I decided to move cities to a place that would offer us a better quality of life. I created a spreadsheet in Excel containing all the factors that influence the quality of life in a particular location. They are:
Days per year of sunshine
Unemployment rate
Crime rate
Average household income
State income tax rate
Cost of living
Population density
Median home price
Average annual temperature
Number of professional sports teams
Average number of days of snow per year
Population
Average inches of rain per year
Number of universities
Distance to an international airport
If I wanted to feed this into a machine learning algorithm, I could then develop a mathematical model for quality of life.
The advantage of using a latent variable model for quality of life is that I can use fewer features, which reduces the dimensionality of what would otherwise be an enormous data set. All of the factors I mentioned are correlated in some way with the latent variable quality of life. So instead of using 15 columns of data, I can use just 1. The data then becomes not only easier to understand but also more useful if I want to use a quality of life score as a feature in a classification or regression problem, because that single score captures the underlying phenomenon we are interested in.
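As a rough sketch of this idea (the numbers are made up, only a handful of the factors are used, and principal component analysis is just one simple way to collapse correlated columns into a single score), the code below turns several observed columns into one composite quality-of-life-style feature:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 5 cities described by a few of the factors above
# (days of sunshine, unemployment rate, crime rate, median home price).
X = np.array([
    [280, 3.5, 2.1, 450_000],
    [180, 6.2, 4.8, 300_000],
    [300, 4.0, 3.0, 600_000],
    [150, 7.5, 5.5, 250_000],
    [240, 4.8, 2.9, 400_000],
])

# Standardize so each factor contributes on a comparable scale, then
# collapse the correlated columns into a single composite score.
X_scaled = StandardScaler().fit_transform(X)
score = PCA(n_components=1).fit_transform(X_scaled)

print(score.ravel())  # one quality-of-life-style score per city
```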
Exploring the Use of Latent Variables Across Different Model Types
One particularly useful framework for exploring the use of latent variables across different model types is the idea of path analysis. In path analysis, one develops a model that shows the relationships between variables that we can readily measure and variables that we cannot readily measure. These variables that we cannot readily measure are called latent variables.
By convention, variables that we can readily measure will be placed inside rectangles or boxes. Variables that we cannot readily measure, the latent variables, are typically located inside ovals.
Let’s look at an example now of how we can model latent variables using the latent variable of childhood intelligence.
As you can see in this diagram, we have four different shapes.
We have the oval at the bottom, which represents the intelligence of the child. This is the latent variable, the variable that we want to predict.
Then up above we have two boxes. One box includes the father’s SAT score. The other box is the mother’s SAT score. And then finally we have another box which represents the average SAT scores of students in the state that the child lives in.
Note that in this particular path model, we draw arrows showing relationships between two variables. The arrow between the father and the mother is indicative of correlation. We are inferring in this particular model a correlation between the SAT score of the father and the SAT score of the mother, and we are also inferring a relationship between both of their SAT scores and the intelligence of a child.
We are also showing a correlation relationship between the average SAT score of the students in the state and the child’s intelligence.
But notice that there are no arrows that are drawn from either the father or the mother to the average SAT scores in the state. The lack of arrows is indicative of a lack of correlation between those two variables.
In short, I believe that at the outset of a machine learning problem, we can draw different path models to map out the relationships between variables and to develop an appropriate framework for selecting a model to tackle that problem.
Getting back to the idea of path modeling that I described earlier, we can use this diagram not only to explore the relationship between observed variables and latent variables, but also as a springboard to develop quantifiable relationships between observed and latent variables…in other words, the numerical coefficients that we can use as inputs into a particular mathematical model.
Let’s take a look at an example, perhaps the simplest example we can think of…the equation for a line of best fit:
y = mx + b.
You might recall the equation y = mx + b from grade school. In this particular equation, y is the dependent variable, b is the intercept, m is the slope, and x is the independent variable.
In a linear regression model we typically see an equation of the form:
y = βx + ε
where ε (epsilon) is the “residual,” the difference between the actual and predicted values of y for a given value of x.
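To ground the notation, here is a small sketch (with made-up data points) that fits the line by ordinary least squares and then computes the residuals ε as the difference between the actual and predicted values:

```python
import numpy as np

# Made-up data points for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

# Ordinary least squares fit of y = m*x + b.
m, b = np.polyfit(x, y, deg=1)

# The residuals (epsilon): actual y minus predicted y at each x.
residuals = y - (m * x + b)

print(f"slope m = {m:.3f}, intercept b = {b:.3f}")
print("residuals:", np.round(residuals, 3))
```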
What would a path diagram look like for this example?
Let’s draw it.
I think this is a very cool way to take a very large data set and begin thinking about how to condense it into a data set with lower storage requirements. By adding weights to the arrows and by following the arrows that show the correlations between variables, whether observed or unobserved, we can use the diagram as an outline for pruning irrelevant features from the data set. The next step would be to use the remaining subset of features to develop a (hopefully useful) predictive model with one of the common machine learning models available.
Factor Analysis
One framework for generating latent variable models is the idea of factor analysis.
In factor analysis, we want to determine if the variations in the measured, observed variables are primarily the result of variations in the underlying latent variables. We can then use this information, the interdependencies between the latent variables and the observed variables, to eliminate some of the features in our data set and therefore reduce the storage requirements for a particular machine learning problem. We do this by estimating the covariance (the extent to which two variables move in tandem with each other) between the observed variables and the unobserved latent variables.
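Here is a minimal sketch of that idea using scikit-learn's FactorAnalysis (the data is simulated so that a single hidden factor drives all of the observed columns; nothing here comes from a real data set):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(42)

# Simulate 200 observations driven by one hidden factor. Each observed
# column is a noisy copy of the latent factor, so the columns co-vary.
latent = rng.normal(size=(200, 1))
X = latent @ rng.normal(size=(1, 5)) + 0.3 * rng.normal(size=(200, 5))

# Fit a one-factor model and recover a score for the hidden factor.
fa = FactorAnalysis(n_components=1, random_state=0)
scores = fa.fit_transform(X)

# The loadings show how strongly each observed variable moves with the factor.
print(np.round(fa.components_, 2))
```

If the loadings on some observed columns are near zero, those columns add little information about the latent variable and are candidates for removal.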
Why is this promising? It is promising because we can create models of things we want to measure but are not easily able to measure. And in doing that, we can tackle the representation bias inherent in many decisions made by human beings.
Real-World Example of Factor Analysis
Let’s look at a concrete example to see how all of this fits together and how it would be implemented in the real world.
Imagine we are a software engineering team at a big company trying to hire employees who have “outstanding” technical ability. “Technical ability,” like altruism, generosity, health, creativity, aggressiveness, and innovativeness, is a term that is not easily quantified or measured. So let us consider the term technical ability a latent variable.
We can then consider some observed variables as part of the candidate evaluation process:
Number of lines of code written
Kaggle scores
HackerRank score
Grade point average
SAT score
GRE score
Number of years of experience
These would be all of the observed variables that we believe are connected in some way to the underlying factor known as technical ability.
At the end of the day, technical ability is what we would like to discover, but since we can’t measure it directly, we use data from other variables to infer it. And by adding weights to each of the arrows in the model, we can determine which of the observed variables we can safely remove while still achieving a desirable result…a candidate who has the technical ability we want.
So you can see that by having a model developed through factor analysis, we can help eliminate some of the representation bias that might be present in a hiring system.
At the end of the day, people who hire teams of software engineers have their own inherent bias about who to hire and who not to hire. These biases might be based on irrelevant factors such as the hiring manager’s experience with people who looked similar to the candidate, how the candidate speaks, etc.
By evaluating the results of a model that is based on mathematics, we can then compare the model’s results to individual opinions on the team to try to reach a more appropriate outcome and hire the best candidate…all the while alleviating concerns about representation bias in the hiring process.
In this post, we will cover:
The main disadvantage of local search procedures like k-means
A real-world example of k-means and the local search problem
How K-means Works
The K-means algorithm is used to group unlabeled data sets into clusters based on similar attributes. It involves four main steps.
Select k: Determine the number of clusters k you want to group your data set into.
Select k Random Instances: You choose, at random, k instances from the data set. These instances are the initial cluster centroids.
Cluster Assignment: This step involves going through each of the instances and assigning that instance to the cluster centroid closest to it.
Move Centroid: You move each of the k centroids to the average of all the instances that were assigned to that cluster centroid.
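In practice, steps 3 and 4 are repeated until the centroids stop moving. Here is a minimal from-scratch sketch of the procedure in Python (the tiny 2-D data set at the bottom is made up purely for illustration):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """A minimal k-means sketch: random initialization, then repeated
    cluster-assignment and move-centroid steps until convergence."""
    rng = np.random.default_rng(seed)

    # Step 2: pick k random instances as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]

    for _ in range(n_iters):
        # Step 3: assign each instance to its closest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)

        # Step 4: move each centroid to the mean of its assigned instances.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])

        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving, so we have converged
        centroids = new_centroids

    return centroids, labels

# Tiny made-up 2-D data set: two obvious groups of points.
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])
centroids, labels = kmeans(X, k=2)
print(centroids)
print(labels)
```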
Notice how k-means is a local search procedure. What this means is that at each step, the algorithm makes the choice that looks best locally for each instance. Local search entails defining a neighborhood, searching within that neighborhood, and then proceeding based on the assumption that local is better…the neighboring centroid that is closest is the best.
Why is Local Search a Disadvantage?
The problem with methods like k-means that make continuous local improvements to the solution is that they are at risk of “missing the forest for the trees.” They might be so focused on what is local, that they miss the obvious big picture.
To explain this point, I will present an example from the real world.
Real-world Example
Suppose we had data on the voting behavior, income level, and age for 10,000 people in Los Angeles, CA. We want to group these people into clusters based on similarities of these attributes. Without looking at the unlabeled data set, we think there should be three clusters (Republican, Democrat, and Independent). We decide to run k-means on the data set and select k=3.
Here is the unlabeled data set:
Here are the results after running k-means:
The algorithm has defined three clusters, each with its own centroid.
What is the problem with this solution? The problem is that our solution is stuck in a local minimum. It is obvious to the human eye that there are two clusters, not three (blue = Democrats and red/green = Republicans). The red and green dots are really a single group. However, since k-means is a local search procedure, it optimizes locally.
We could keep running k-means for 10,000 more iterations, and the centroids would not move. The real centroid for those green and red points is somewhere between the green and red dots. However, k-means doesn’t know that. It just cares about finding the locally optimal solution.
Globally, we can see k-means gave us incorrect clustering because it was so focused on what was good in the neighborhoods and could not see the big picture. An analogy is this graph below:
Since k-means is so sensitive to the initial choice of centroids as well as the value of k, it helps to have some prior knowledge of the data set. This knowledge will be invaluable for making sure the results of k-means make sense in the context of the problem.
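One common mitigation (not a cure) is to restart k-means from many different random initializations and keep the run with the lowest total within-cluster distance. scikit-learn's KMeans does this through its n_init parameter; the data below is a made-up stand-in for the voting example:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Made-up stand-in for the voting data: two well-separated blobs.
X = np.vstack([
    rng.normal(loc=[0, 0], scale=0.5, size=(100, 2)),
    rng.normal(loc=[5, 5], scale=0.5, size=(100, 2)),
])

# n_init=10 runs k-means from 10 different random initializations and
# keeps the run with the lowest inertia (sum of squared distances of
# points to their assigned centroids), which reduces, but does not
# eliminate, the risk of getting stuck in a bad local optimum.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print("best inertia:", round(km.inertia_, 2))
print("centroids:\n", np.round(km.cluster_centers_, 2))
```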
Would local searching still be a potential issue if we had chosen k=2 in this case?
Yep! It would still be an issue, depending on the choice of initial centroids. As I mentioned earlier, k-means is sensitive to the initial choice of centroids as well as the value of k. Choose bad starting centroids that get the algorithm stuck in a bad local optimum, and you could run k-means 10,000 times and the centroids would not budge.