# Vehicle Detection Project
This is a project for the Udacity Self-Driving Car Nanodegree program. The aim of the project is to detect vehicles in a dash camera video. The implementation is in the file vehicle_detection.ipynb, and it achieves 21 FPS without batch processing. The final video output is [here](https://www.youtube.com/watch?v=PncSIx8AHTs).
This README explains each step of the pipeline in detail.
## Introduction to object detection
Detecting vehicles in a video stream is an object detection problem. Object detection can be approached as either a classification problem or a regression problem. In the classification approach, the image is divided into small patches, and each patch is run through a classifier to determine whether it contains an object. Bounding boxes are then placed around the patches that are classified as containing an object with high probability. In the regression approach, the whole image is run through a convolutional neural network that directly generates one or more bounding boxes for the objects in the image.
| classification | regression |
|----------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------|
| Classification on portions of the image to determine objects, generate bounding boxes for regions that have positive classification results. | Regression on the whole image to generate bounding boxes |
| 1. sliding window + HOG 2. sliding window + CNN 3. region proposals + CNN | generate bounding box coordinates directly from CNN |
| RCNN, Fast-RCNN, Faster-RCNN | SSD, YOLO |
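
As a rough illustration of the classification approach, the sketch below slides a fixed-size window over an image and keeps the windows that a classifier scores above a threshold. It is not the code used in this project (which takes the regression approach). The names `sliding_windows`, `detect`, `classify`, the 64-pixel window, the 32-pixel stride, and the 0.5 threshold are all assumptions made for this example.

```python
def sliding_windows(img_w, img_h, win=64, stride=32):
    """Yield (x1, y1, x2, y2) boxes for square windows that fit inside the image."""
    for y in range(0, img_h - win + 1, stride):
        for x in range(0, img_w - win + 1, stride):
            yield (x, y, x + win, y + win)


def detect(img_w, img_h, classify, threshold=0.5, win=64, stride=32):
    """Return the windows whose classifier score reaches the threshold.

    `classify` is a hypothetical callable that maps a box to a probability
    in [0, 1]; a real pipeline would crop the patch and run HOG + SVM or a CNN.
    """
    return [box for box in sliding_windows(img_w, img_h, win, stride)
            if classify(box) >= threshold]


# Toy usage: a dummy classifier that only "fires" on the window starting at x = 32.
dummy = lambda box: 1.0 if box[0] == 32 else 0.0
boxes = detect(128, 64, dummy)
```

Every window costs one classifier call, which is why this approach tends to be slower than a single forward pass over the whole image.
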
In this project, we will use tiny-YOLO v1, since it is easy to implement and reasonably fast.
## The tiny-YOLO v1
### Architecture of the convolutional neural network
Tiny YOLO v1 consists of 9 convolution layers and 3 fully connected layers. Each convolution is followed by a leaky ReLU activation, and the first six convolutions are also followed by max pooling. The 9 convolution layers can be understood as the feature extractor, whereas the 3 fully connected layers can be understood as the "regression head" that predicts the bounding boxes.
![model](./output_images/mode_yolo_plot.jpg)
The model has a total of 45,089,374 parameters. The architecture is detailed in the following table:
| Layer (type) | Output Shape | Param # | Connected to |
|---------------------------------|-----------------------|---------|----------------------------|
| convolution2d_1 (Convolution2D) | (None, 16, 448, 448) | 448 | convolution2d_input_1[0][0] |
| leakyrelu_1 (LeakyReLU) | (None, 16, 448, 448) | 0 | convolution2d_1[0][0] |
| maxpooling2d_1 (MaxPooling2D) | (None, 16, 224, 224) | 0 | leakyrelu_1[0][0] |
| convolution2d_2 (Convolution2D) | (None, 32, 224, 224) | 4640 | maxpooling2d_1[0][0] |
| leakyrelu_2 (LeakyReLU) | (None, 32, 224, 224) | 0 | convolution2d_2[0][0] |
| maxpooling2d_2 (MaxPooling2D) | (None, 32, 112, 112) | 0 | leakyrelu_2[0][0] |
| convolution2d_3 (Convolution2D) | (None, 64, 112, 112) | 18496 | maxpooling2d_2[0][0] |
| leakyrelu_3 (LeakyReLU) | (None, 64, 112, 112) | 0 | convolution2d_3[0][0] |
| maxpooling2d_3 (MaxPooling2D) | (None, 64, 56, 56) | 0 | leakyrelu_3[0][0] |
| convolution2d_4 (Convolution2D) | (None, 128, 56, 56) | 73856 | maxpooling2d_3[0][0] |
| leakyrelu_4 (LeakyReLU) | (None, 128, 56, 56) | 0 | convolution2d_4[0][0] |
| maxpooling2d_4 (MaxPooling2D) | (None, 128, 28, 28) | 0 | leakyrelu_4[0][0] |
| convolution2d_5 (Convolution2D) | (None, 256, 28, 28) | 295168 | maxpooling2d_4[0][0] |
| leakyrelu_5 (LeakyReLU) | (None, 256, 28, 28) | 0 | convolution2d_5[0][0] |
| maxpooling2d_5 (MaxPooling2D) | (None, 256, 14, 14) | 0 | leakyrelu_5[0][0] |
| convolution2d_6 (Convolution2D) | (None, 512, 14, 14) | 1180160 | maxpooling2d_5[0][0] |
| leakyrelu_6 (LeakyReLU) | (None, 512, 14, 14) | 0 | convolution2d_6[0][0] |
| maxpooling2d_6 (MaxPooling2D) | (None, 512, 7, 7) | 0 | leakyrelu_6[0][0] |
| convolution2d_7 (Convolution2D) | (None, 1024, 7, 7) | 4719616 | maxpooling2d_6[0][0] |
| leakyrelu_7 (LeakyReLU) | (None, 1024, 7, 7) | 0 | convolution2d_7[0][0] |
| convolution2d_8 (Convolution2D) | (None, 1024, 7, 7) | 9438208 | leakyrelu_7[0][0] |
| leakyrelu_8 (LeakyReLU) | (None, 1024, 7, 7) | 0 | convolution2d_8[0][0] |
| convolution2d_9 (Convolution2D) | (None, 1024, 7, 7) | 9438208 | leakyrelu_8[0][0] |
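
The parameter counts and output sizes listed for the convolution layers can be checked with a few lines of arithmetic. The sketch below assumes 3x3 kernels with a bias term, a 3-channel 448x448 input, "same" padding, and 2x2 max pooling after the first six convolutions. None of these values are stated explicitly in this README; they are inferred because they reproduce the listed numbers. The helper `conv_params` is a hypothetical name used only for this example.

```python
def conv_params(c_in, c_out, k=3):
    """Weights (k * k * c_in per filter) plus one bias, times c_out filters."""
    return (k * k * c_in + 1) * c_out


# Filter counts of the nine convolution layers, taken from the listing above.
filters = [16, 32, 64, 128, 256, 512, 1024, 1024, 1024]
# Assumption: only the first six convolutions are followed by 2x2 max pooling.
pooled = [True] * 6 + [False] * 3

c_in, size = 3, 448  # assumed input: 3 channels, 448 x 448 pixels
rows = []
for c_out, pool in zip(filters, pooled):
    # Record (channels, spatial size, parameter count) of this convolution.
    rows.append((c_out, size, conv_params(c_in, c_out)))
    if pool:
        size //= 2  # 2x2 max pooling halves the width and height
    c_in = c_out

conv_total = sum(params for _, _, params in rows)

for i, (channels, side, params) in enumerate(rows, start=1):
    print(f"conv {i}: ({channels}, {side}, {side})  params={params}")
print("convolution layers total:", conv_total)
```

Under these assumptions the nine convolution layers account for 25,168,800 parameters; the rest of the 45,089,374 total belongs to layers outside the listed rows.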