Self-Driving Car Engineer Nanodegree

Project: Vehicle Detection

The goals / steps of this project are the following:

* Perform a Histogram of Oriented Gradients (HOG) feature extraction on a labeled training set of images and train a Linear SVM classifier.
* Apply a color transform and append binned color features, as well as histograms of color, to the HOG feature vector.
* Implement a sliding-window technique and use the trained classifier to search for vehicles in images.
* Run the pipeline on a video stream, create a heat map of recurring detections frame by frame to reject false positives, and estimate a bounding box for each detected vehicle.

Histogram of Oriented Gradients (HOG)

1. Explain how (and identify where in your code) you extracted HOG features from the training images.

The code for this step is contained in code cells #3 (get_hog_features) and #4 (extract_features) of the IPython notebook. Both functions were provided and explained in the lessons, and I did not need to make any changes.

I started by reading in all the vehicle and non-vehicle images. Here is one example of each of the vehicle and non-vehicle classes:

*[image: example of a vehicle and a non-vehicle image]*

I then explored different color spaces and different skimage.hog() parameters (orientations, pixels_per_cell, and cells_per_block). I grabbed random images from each of the two classes and displayed them to get a feel for what the skimage.hog() output looks like.

Here is an example using the YCrCb color space and HOG parameters of orientations=9, pixels_per_cell=(8, 8) and cells_per_block=(2, 2):

*[image: example skimage.hog() output with the parameters above]*
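The get_hog_features helper is a thin wrapper around skimage.hog(). A minimal sketch along the lines of the lesson code (the vis and feature_vec parameter names follow the lessons; note that the visualize keyword is spelled visualise in older skimage releases):

```python
from skimage.feature import hog

def get_hog_features(img_channel, orient=9, pix_per_cell=8, cell_per_block=2,
                     vis=False, feature_vec=True):
    """Return HOG features for one image channel, optionally with a visualization."""
    if vis:
        features, hog_image = hog(img_channel, orientations=orient,
                                  pixels_per_cell=(pix_per_cell, pix_per_cell),
                                  cells_per_block=(cell_per_block, cell_per_block),
                                  transform_sqrt=False,
                                  visualize=True,  # 'visualise' in older skimage versions
                                  feature_vector=feature_vec)
        return features, hog_image
    return hog(img_channel, orientations=orient,
               pixels_per_cell=(pix_per_cell, pix_per_cell),
               cells_per_block=(cell_per_block, cell_per_block),
               transform_sqrt=False, visualize=False,
               feature_vector=feature_vec)
```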

2. Explain how you settled on your final choice of HOG parameters.

I used the YCrCb color space and did not try others. I also used all HOG channels. I tried various values of the orient, pix_per_cell, and cell_per_block parameters in the classifier and settled on the values below because they gave the best results.

```python
### Tweak these parameters and see how the results change.
color_space = 'YCrCb'  # Can be RGB, HSV, LUV, HLS, YUV, YCrCb
orient = 9             # HOG orientations
pix_per_cell = 8       # HOG pixels per cell
cell_per_block = 2     # HOG cells per block
hog_channel = 'ALL'    # Can be 0, 1, 2, or "ALL"
```
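As a rough illustration of what hog_channel = 'ALL' means in practice (assuming an RGB input image and the get_hog_features sketch above), the image is first converted to YCrCb and the HOG features from all three channels are concatenated:

```python
import cv2
import numpy as np

# Convert to the chosen color space, then stack HOG features from every channel.
feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2YCrCb)
hog_features = np.hstack([
    get_hog_features(feature_image[:, :, ch], orient, pix_per_cell, cell_per_block,
                     vis=False, feature_vec=True)
    for ch in range(3)
])
```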

3. Describe how (and identify where in your code) you trained a classifier using your selected HOG features (and color features if you used them).

Again I used the extract_features method from the lessons as-is, in code cell #4. Since the image dataset was small, I decided to use all three types of features (HOG, color histogram, and spatial). I also used the YCrCb color space because it gave the best results. For the spatial and histogram features I kept the default parameters because they were already tuned in the lessons. For HOG I chose all channels with 9 orientations, because a single HOG channel did not work well for the classifier. I used a linear SVM, which is known to work well with HOG features. I scaled the features to zero mean and unit variance using StandardScaler and randomly split 20% of the data into a test set. The code for the classifier is in cell #8 of the IPython notebook.
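A condensed sketch of the training code in cell #8 (assuming car_features and notcar_features are the lists returned by extract_features for the two image sets):

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

# Stack the feature vectors and build the label vector (1 = car, 0 = not car)
X = np.vstack((car_features, notcar_features)).astype(np.float64)
y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features))))

# Scale features to zero mean and unit variance
X_scaler = StandardScaler().fit(X)
scaled_X = X_scaler.transform(X)

# Randomly hold out 20% of the data as a test set
X_train, X_test, y_train, y_test = train_test_split(
    scaled_X, y, test_size=0.2, random_state=np.random.randint(0, 100))

svc = LinearSVC()
svc.fit(X_train, y_train)
print('Test Accuracy of SVC = ', round(svc.score(X_test, y_test), 4))
```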

The statistics of the whole process are printed below:

```
49.05 seconds to extract car features...
50.92 seconds to extract non-car features...
6.36 seconds to scale/transform features...
Using: 9 orientations 8 pixels per cell and 2 cells per block
Feature vector length: 8460
31.16 seconds to train Linear SVM...
Test Accuracy of SVC =  0.9904
```

Sliding Window Search

1. Describe how (and identify where in your code) you implemented a sliding window search. How did you decide what scales to search and how much to overlap windows?

For testing on images I used the sliding window functions provided in the lessons; they are in cell #5 of the IPython notebook. I used a single window size of (96, 96) and an overlap of (0.75, 0.75). For this project I did not find the need to use multiple scales.
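For reference, the window generation and search call looks roughly like this. The y_start_stop range restricting the search to the lower part of the frame and the spatial/histogram parameters are illustrative assumptions; the slide_window and search_windows signatures follow the lesson code:

```python
# Generate (96, 96) windows with 75% overlap and classify each one
windows = slide_window(image, x_start_stop=[None, None], y_start_stop=[400, 656],
                       xy_window=(96, 96), xy_overlap=(0.75, 0.75))
hot_windows = search_windows(image, windows, svc, X_scaler,
                             color_space='YCrCb', spatial_size=(32, 32),
                             hist_bins=32, orient=9, pix_per_cell=8,
                             cell_per_block=2, hog_channel='ALL')
```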

Here are the example outputs:

*[images: sliding-window detections on two test images]*

However, for the actual video processing I used the find_cars method with a scale of 1.5 (also in cell #5), because it uses HOG sub-sampling search and is more efficient.
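A hedged usage sketch of that call (the ystart/ystop values and the spatial/histogram parameters are assumptions, and, as noted in the Discussion, my find_cars returns a list of bounding boxes instead of drawing them):

```python
# HOG sub-sampling search: HOG is computed once per frame and the
# feature map is sub-sampled at each window position.
bboxes = find_cars(frame, ystart=400, ystop=656, scale=1.5,
                   svc=svc, X_scaler=X_scaler, orient=9,
                   pix_per_cell=8, cell_per_block=2,
                   spatial_size=(32, 32), hist_bins=32)
```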

2. Show some examples of test images to demonstrate how your pipeline is working. What did you do to optimize the performance of your classifier?

I decided to use all three types of features (HOG, color histogram, and spatial) so that the classifier does not overfit. I also used the YCrCb color space because it gave the best results. Here are the example images.

*[images: pipeline output on the six test images]*
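The spatial and color-histogram features mentioned above follow the lesson helpers with their default sizes; a minimal sketch:

```python
import cv2
import numpy as np

def bin_spatial(img, size=(32, 32)):
    # Spatially binned color: downsample the image and flatten into a 1-D vector
    return cv2.resize(img, size).ravel()

def color_hist(img, nbins=32, bins_range=(0, 256)):
    # Histogram each color channel and concatenate the counts
    hists = [np.histogram(img[:, :, ch], bins=nbins, range=bins_range)[0]
             for ch in range(3)]
    return np.concatenate(hists)
```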

Video Implementation

1. Provide a link to your final video output.

Here’s a link to my video result. The code for my video processing pipeline is in code cell #10 of the IPython notebook.
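The video is processed frame by frame with moviepy; a minimal sketch of the pipeline wiring in cell #10 (the file names and the vehicle_detection_pipeline stand-in for the per-frame steps are assumptions):

```python
from moviepy.editor import VideoFileClip

def process_frame(frame):
    # vehicle_detection_pipeline is a hypothetical stand-in for the per-frame steps:
    # find_cars -> heatmap over recent frames -> threshold -> label -> draw boxes.
    return vehicle_detection_pipeline(frame)

clip = VideoFileClip('project_video.mp4')            # input file name assumed
annotated = clip.fl_image(process_frame)
annotated.write_videofile('project_video_output.mp4', audio=False)
```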

2. Describe how (and identify where in your code) you implemented some kind of filter for false positives and some method for combining overlapping bounding boxes.

Positive detections from find_cars are collected over a queue of recent frames and accumulated into a heatmap, which is thresholded to reject false positives. scipy.ndimage.measurements.label() then identifies individual blobs in the integrated heatmap; each blob is assumed to correspond to a vehicle, and a bounding box is drawn around each labeled region. Below is a sketch of this step, followed by an example result showing the heatmap from a series of frames of video, the result of scipy.ndimage.measurements.label(), and the bounding boxes then overlaid on the last frame of video.
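A minimal sketch of this step, assuming find_cars returns a list of ((x1, y1), (x2, y2)) boxes per frame and using a 30-frame queue with a heat threshold of 15 (approximating the persistence rule described in the Discussion):

```python
from collections import deque
import numpy as np
from scipy.ndimage.measurements import label

history = deque(maxlen=30)  # bounding boxes from the most recent 30 frames

def add_heat(heatmap, bbox_list):
    # Add +1 heat to every pixel inside each detected bounding box
    for ((x1, y1), (x2, y2)) in bbox_list:
        heatmap[y1:y2, x1:x2] += 1
    return heatmap

def apply_threshold(heatmap, threshold):
    # Zero out pixels that were not detected persistently enough
    heatmap[heatmap <= threshold] = 0
    return heatmap

def integrate_detections(frame_shape, new_bboxes, threshold=15):
    history.append(new_bboxes)
    heat = np.zeros(frame_shape[:2], dtype=np.float32)
    for bboxes in history:
        add_heat(heat, bboxes)
    apply_threshold(heat, threshold)
    return label(heat)  # (label_map, number_of_vehicles)
```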

Here are six frames and their corresponding heatmaps:

*[images: six video frames and their corresponding heatmaps]*

Here is the output of scipy.ndimage.measurements.label() on the integrated heatmap from all six frames:

*[image: output of scipy.ndimage.measurements.label() on the integrated heatmap]*

Here the resulting bounding boxes are drawn onto the last frame in the series:

*[image: resulting bounding boxes drawn onto the last frame]*
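Drawing the final boxes around each labeled region follows the lesson-style helper; a minimal sketch:

```python
import cv2
import numpy as np

def draw_labeled_bboxes(img, labels):
    # labels is the (label_map, n_vehicles) tuple returned by label()
    for car_number in range(1, labels[1] + 1):
        ys, xs = (labels[0] == car_number).nonzero()
        bbox = ((int(np.min(xs)), int(np.min(ys))),
                (int(np.max(xs)), int(np.max(ys))))
        cv2.rectangle(img, bbox[0], bbox[1], (0, 0, 255), 6)
    return img
```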

Discussion

1. Briefly discuss any problems / issues you faced in your implementation of this project. Where will your pipeline likely fail? What could you do to make it more robust?

The great thing about this project was that most of the code was already provided and explained in the lessons. Most of it works as-is, and the main challenge is tuning the various parameters. The only change I made is in the find_cars function: instead of drawing rectangles on the image, I return the list of bounding boxes (similar to the search_windows method) so that heat and thresholds can be applied to these bounding boxes as well.

The other challenge was eliminating false positives by keeping a queue of 30 frames. Currently I discard any detection that does not persist for 15 frames as a false positive, but that threshold may vary from video to video and can fail when a vehicle appears and disappears quickly. I would like to try these things to make it more robust: