Predicting Steering Angles using Deep Learning and Behavioral Cloning

The Self-Driving Car Engineer program designed by Udacity is currently the only machine learning program focused entirely on autonomous driving. The program offers world-class training staff and prominent partners like Nvidia and Mercedes-Benz.

Besides interesting lessons and exercises, the program expects students to prove their deep learning skills in real-world projects.

The third project of this program is called Behavioral Cloning: we will build and train a deep neural network to drive a car like us!

The goals / steps of this project

  • Use the simulator to collect data of good driving behavior
  • Build a convolutional neural network in Keras that predicts steering angles from images
  • Train and validate the model with a training and validation set
  • Test that the model successfully drives around track one without leaving the road
  • Summarize the results with a written report

In this project we have to create a machine learning model that mimics a human driver. To be more specific: the model has to drive a car safely through an unseen track in a simulator. The simulator provided by Udacity has two modes:

In manual mode a human driver can control the car with a game controller or keyboard. The simulator records multiple images per second from three on-board cameras to disk. It also records telemetry data like steering, brake, throttle and speed to a CSV file.


In autonomous mode the simulator streams its camera pictures to a Python script using SocketIO and can receive commands for throttle and steering. Our goal is to design a neural network with the deep learning framework Keras that remote-controls the car safely through the track.

The simulator also offers two different tracks. To train our model we will use data of the first track. To validate it we will use the second track which is more demanding.


Loading the dataset

We are using only the samples for Track 1 provided by Udacity, which can be downloaded from here. Track 2 is used to test the trained model.

The dataset is described by the ‘driving_log.csv’ file. This CSV file contains the paths to the left, center and right camera image files on disk, together with the corresponding steering angle for each sample.

After loading the dataset and exploring the histogram of steering angles, we can see that the dataset is very unbalanced, with most entries corresponding to driving straight.
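As an illustration, loading the log and binning the angles might look like the sketch below (the column layout is the standard Udacity one; the helper names are mine):

```python
import csv
import numpy as np

def load_samples(csv_path):
    """Parse driving_log.csv into camera image paths and steering angles.

    Assumes the Udacity layout: center, left, right image paths in the
    first three columns and the steering angle in the fourth.
    """
    paths, angles = [], []
    with open(csv_path) as f:
        for row in csv.reader(f):
            if len(row) < 4:
                continue
            try:
                steering = float(row[3])
            except ValueError:
                continue  # skip the header row if present
            paths.append((row[0].strip(), row[1].strip(), row[2].strip()))
            angles.append(steering)
    return paths, np.array(angles)

def angle_histogram(angles, n_bins=21):
    """Count samples per steering-angle bin to expose the imbalance."""
    counts, edges = np.histogram(angles, bins=n_bins, range=(-1.0, 1.0))
    return counts, edges
```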

The dataset is biased towards small steering angles and certain directions as shown in the histogram below.


Partitioning the dataset

Before the dataset is partitioned, the samples are shuffled for better generalization. The dataset is then split into training (80%), validation (19%) and test (1%) sets.
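A minimal sketch of that shuffle-and-split step (the helper name and the fixed seed are my own choices):

```python
import random

def partition(samples, train=0.80, val=0.19, seed=42):
    """Shuffle, then split into train / validation / test (80/19/1)."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)  # shuffle before splitting
    n_train = int(len(samples) * train)
    n_val = int(len(samples) * val)
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])
```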

Dataset augmentation

Data preparation is required when working with neural networks and deep learning models. We will use data augmentation to generate new images, especially for the angle ranges with low counts in the steering-angle histogram.

Most of the methods are wrappers of implementations in Keras.
The data augmentation methods used in this project include:

  • zoom
  • rotation
  • height and width shift
  • horizontal flip
  • channel shift
  • crop and resize
  • brightness shift

All these transformations are randomly applied in memory to the original images using a generator, a Keras technique that avoids storing all these new images on disk.
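A NumPy-only sketch of how a few of these random transformations can be chained in memory (the project wraps Keras' image helpers instead; the 0.002 angle-per-pixel constant is an assumed value):

```python
import numpy as np

def random_augment(image, angle, rng=np.random):
    """Apply a random subset of augmentations to one image.

    Illustrates three of the methods (horizontal flip, brightness shift,
    width shift); the others wrap keras.preprocessing.image helpers in
    the real pipeline.
    """
    img = image.astype(np.float32)
    # Horizontal flip: mirror the road and negate the steering angle.
    if rng.rand() < 0.5:
        img = img[:, ::-1, :]
        angle = -angle
    # Brightness shift: scale all channels, then clip to the valid range.
    img = np.clip(img * rng.uniform(0.6, 1.4), 0, 255)
    # Width shift: roll the image sideways and compensate the angle
    # (0.002 steering units per shifted pixel is an assumed constant).
    shift = rng.randint(-20, 21)
    img = np.roll(img, shift, axis=1)
    angle += 0.002 * shift
    return img.astype(np.uint8), angle
```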

A very important way to introduce images with new steering angles is to use the side cameras to simulate recovery paths.
If you only use the center camera pictures, your car will soon leave the track and crash. This is because, when only the ideal driving path is recorded, the model does not know what to do when it is slightly off. One way to solve this would be to record new data while driving from the side of the street back to the middle.
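The side-camera trick can be sketched like this (the 0.20 correction matches the shift mentioned in the training strategy; pick_camera is a hypothetical helper name):

```python
import numpy as np

CORRECTION = 0.20  # steering offset applied to the side cameras

def pick_camera(center_path, left_path, right_path, angle, rng=np.random):
    """Randomly choose one of the three cameras and shift the angle.

    The left camera sees the car as if it were too far right, so we
    steer back right (+); the right camera is the mirror case (-).
    """
    choice = rng.randint(3)
    if choice == 0:
        return center_path, angle
    if choice == 1:
        return left_path, angle + CORRECTION
    return right_path, angle - CORRECTION
```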
Below we can see the original image in the top left and the result of each method applied independently.
And here we can see several images after randomly combining all these methods. These transformed images, and never the originals, are sent as input to the neural network.


Using these data augmentation methods, an effectively unlimited number of new images can be generated. In this project, images with steering angles near 0.0 (driving straight) are augmented less than the ones at the margins.
This histogram equalization is needed to give the model a chance to learn how to recover when the car is near the margins.

Keras generator for data augmentation

Memory is a limited resource, and it is neither possible nor efficient to load all images at once. Therefore we utilize a Keras generator to sample images such that all angles have a similar probability, no matter how they are represented in the dataset. This alleviates the model's bias towards driving straight.
To implement this, a hit dictionary is maintained for each angle range, and each bin is allowed a maximum percentage of the total number of already augmented images.

Given a row from the .csv file, this method randomly reads the center, left or right image, applies all image augmentation methods to it and returns the augmented image and the transformed steering angle. This method is used in the Keras generator as shown below.
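A possible shape for such a generator with the hit-bin cap (all names and the 10% cap are illustrative; load_fn stands in for the read-and-augment method described above; a plain Python generator like this can be passed to Keras' fit):

```python
import numpy as np

def balanced_generator(paths, angles, batch_size=32, n_bins=21,
                       max_bin_fraction=0.10, load_fn=None, rng=np.random):
    """Yield batches while capping how often each angle bin is sampled.

    A hit dictionary counts how many samples each steering-angle bin has
    contributed; a bin exceeding max_bin_fraction of the running total is
    resampled, which flattens the angle histogram over time.
    """
    edges = np.linspace(-1.0, 1.0, n_bins + 1)
    hits = {b: 0 for b in range(n_bins)}
    total = 0
    while True:
        X, y = [], []
        while len(X) < batch_size:
            i = rng.randint(len(angles))
            b = int(np.clip(np.digitize(angles[i], edges) - 1, 0, n_bins - 1))
            if total > batch_size and hits[b] > max_bin_fraction * total:
                continue  # this angle range is over-represented; resample
            hits[b] += 1
            total += 1
            X.append(load_fn(paths[i]))
            y.append(angles[i])
        yield np.array(X), np.array(y)
```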

Model architecture

We have to design a neural network which takes camera pictures as inputs and outputs steering angles. You can come up with your own neural network architecture. If you want to save time you can use the following two architectures as a starting point:

* comma.ai – Learning a Driving Simulator
* End to End Learning for Self-Driving Cars by Nvidia

I used the comma.ai network as a reference; it has a simple yet elegant architecture for predicting steering angles. comma.ai's approach to artificial intelligence for self-driving cars is based on an agent that learns to clone driver behaviors and plans maneuvers by simulating future events on the road.


The network has a normalization layer followed by 3 convolutional layers with 8×8 and 5×5 kernel sizes and strides of [2,2]. After the convolutional layers the output is flattened and then processed by one fully connected layer. The whole network has roughly 1 million parameters and offers great training performance on modest hardware.

The whole network can be implemented in Keras with just a few lines of code:
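A sketch in today's tf.keras API, under my assumptions about the exact layer sizes (60×120×3 input, 16/32/64 filters, one 128-unit hidden layer), with the dropout and Adam settings described below folded in:

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Lambda, Conv2D, Flatten, Dense, Dropout
from tensorflow.keras.optimizers import Adam

def build_model(input_shape=(60, 120, 3)):
    """comma.ai-style steering model: normalization, 3 strided convs,
    one hidden fully connected layer, a single steering-angle output."""
    model = Sequential([
        Input(shape=input_shape),
        # Normalize pixels from [0, 255] to [-1, 1] inside the graph.
        Lambda(lambda x: x / 127.5 - 1.0),
        Conv2D(16, (8, 8), strides=(2, 2), padding="same", activation="elu"),
        Conv2D(32, (5, 5), strides=(2, 2), padding="same", activation="elu"),
        Conv2D(64, (5, 5), strides=(2, 2), padding="same", activation="elu"),
        Flatten(),
        Dropout(0.5),          # prevents co-adaptation / overfitting
        Dense(128, activation="elu"),
        Dropout(0.5),
        Dense(1),              # single output: the steering angle
    ])
    model.compile(optimizer=Adam(learning_rate=0.0001),
                  loss="mean_squared_error")
    return model
```

With these sizes the parameter count lands close to the one million mentioned above.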

To make the architecture more robust and to prevent overfitting, dropout layers are added to the network. Dropout disables neurons with a given probability and prevents co-adaptation of features.

A ‘mean_squared_error’ loss function is used. As for the optimization algorithm, Adam is used, configured with a learning rate of 0.0001. This learning rate was selected after much trial and error; with larger learning rates (greater than or equal to 0.001) the car made abrupt steering angle changes.

Training strategy

The dataset provided by the project contains a CSV file of metadata and a list of images. The CSV file was first parsed; image paths and steering angles were extracted, shuffled and then split into training, validation and test data.

An image generator that takes the image metadata as input and generates batches of training data was implemented.
By using the right and left camera images with a 0.20 steering-angle shift, the dataset became three times larger. The input images were augmented using methods such as zoom, scale, brightness shift, horizontal flip, height and width shift and color channel shift. Images were also cropped to keep only the region of interest below the horizon and above the car's hood. The images were then resized to 120 × 60 pixels before being passed to the model.
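The crop-and-resize step might look like this (the crop margins are assumed values; nearest-neighbour resizing keeps the sketch dependency-free):

```python
import numpy as np

def preprocess(image, top=60, bottom=25, out_h=60, out_w=120):
    """Crop sky and hood, then nearest-neighbour resize to 120x60.

    top/bottom row counts are assumptions; the idea is to keep only the
    road region between the horizon and the car's hood.
    """
    cropped = image[top:image.shape[0] - bottom, :, :]
    h, w = cropped.shape[:2]
    rows = np.arange(out_h) * h // out_h  # nearest source row per output row
    cols = np.arange(out_w) * w // out_w  # nearest source column per output column
    return cropped[rows][:, cols]
```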

The Adam optimizer with a learning rate of 0.0001 was used. The model is trained over 10 epochs of 24,000 samples each.

Model fitting

One of the key strategies of this solution is to group the steering angles into a fixed number of hit bins and to augment only a certain number of images from each bin. This way each angle range is learned equally by the model, which prevents a bias towards driving straight. For better results the bin hits are reinitialized after each epoch. To implement this, a LifecycleCallback is used as shown below.
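A minimal version of that callback (bin_hits is the assumed hit dictionary shared with the generator):

```python
from tensorflow.keras.callbacks import Callback

class LifecycleCallback(Callback):
    """Reset the per-bin hit counts at the end of every epoch so that
    each epoch re-balances the sampled steering angles from scratch."""

    def __init__(self, bin_hits):
        super().__init__()
        self.bin_hits = bin_hits  # dictionary shared with the generator

    def on_epoch_end(self, epoch, logs=None):
        for b in self.bin_hits:
            self.bin_hits[b] = 0
```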

The network was trained for only 10 epochs using 25,000 augmented examples per epoch. Training takes roughly 10 minutes on a GTX 970 GPU.

The histogram after data augmentation

In the initial histogram we could see that the steering angles were mostly near 0.0. We noted that, in order to teach our car to take turns, we have to generate many new images with non-zero angles by augmentation.

The new histogram after augmentation is much better balanced.

The model loss

As shown in the plot below, both training and validation losses dropped steadily during the 10 training epochs. This is a good sign that the model did not overfit the data and generalized well.

Visualization of convolutional layers

Below we can see what the first 3 convolutional layers learned.

Conv Layer 1


Conv Layer 2

Conv Layer 3


Testing in simulation

The best way to test the model is to let the car drive autonomously on both the training Track 1 and the unseen test Track 2.
On Track 1 the car drives very smoothly and does not leave the road. On the unseen Track 2 it stays on the road most of the time, but it also hits the margins twice.
The model was configured and trained well, but it still needs improvements to generalize better on unseen roads.

Track 1

Track 2



