# SSD in TensorFlow: Traffic Sign Detection and Classification
## Overview
Implementation of [Single Shot MultiBox Detector (SSD)](https://arxiv.org/abs/1512.02325) in TensorFlow, to detect and classify traffic signs. This implementation was able to achieve 40-45 fps on a GTX 1080 with an Intel Core i7-6700K.
*Note: this project is still a work in progress.* The main issue now is model overfitting. I am currently working on pre-training the model on VOC2012 first, then applying transfer learning to traffic sign detection.
Currently only stop signs and pedestrian crossing signs are detected. Example detection images are below.
![example1](inference_out/stop_1323896809.avi_image12.png)
![example2](inference_out/pedestrian_1323896918.avi_image9.png)
![example3](inference_out/stop_1323804419.avi_image31.png)
![example4](inference_out/stop_1323822840.avi_image5.png)
![example5](inference_out/pedestrianCrossing_1330547304.avi_image1.png)
![example6](inference_out/pedestrianCrossing_1333395817.avi_image21.png)
The model was trained on the [LISA Traffic Sign Dataset](http://cvrr.ucsd.edu/LISA/lisa-traffic-sign-dataset.html), a dataset of US traffic signs.
## Dependencies
* Python 3.5+
* TensorFlow v0.12.0
* Pickle
* OpenCV-Python
* Matplotlib (optional)
## How to run
Clone this repository. The steps below refer to its location as `$ROOT`.
To run predictions using the pre-trained model:
* [Download the pre-trained model](https://drive.google.com/open?id=0BzaCOTL9zhUlekM3NWU1bmNqeVk) to `$ROOT`
* `cd $ROOT`
* `python inference.py -m demo`
* This will take the images from `sample_images`, annotate them, and display them on screen
* To run predictions on your own images and/or videos, use the `-i` flag in `inference.py` (see the code for more details)
* Note the model severely overfits at this time
Training the model from scratch:
* Download the [LISA Traffic Sign Dataset](http://cvrr.ucsd.edu/LISA/lisa-traffic-sign-dataset.html), and store it in a directory `$LISA_DATA`
* `cd $LISA_DATA`
* Follow the instructions in the LISA Traffic Sign Dataset to create `mergedAnnotations.csv` such that only stop signs and pedestrian crossing signs are included
* `cp $ROOT/data_gathering/create_pickle.py $LISA_DATA`
* `python create_pickle.py`
* `cd $ROOT`
* `ln -s $LISA_DATA/resized_images_* .`
* `ln -s $LISA_DATA/data_raw_*.p .`
* `python data_prep.py`
* This performs box matching between ground-truth boxes and default boxes, and packages the data into a format used later in the pipeline
* `python train.py`
* This trains the SSD model
* `python inference.py -m demo`
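The box matching step performed by `data_prep.py` can be illustrated with a minimal sketch. This is not the code from this repository: the function names `iou` and `match_default_boxes`, the `[x1, y1, x2, y2]` box layout, and the 0.5 overlap threshold (the value used in the SSD paper) are assumptions made for illustration.

```python
import numpy as np


def iou(box, boxes):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    # Clip at zero so non-overlapping boxes get zero intersection area.
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_box = (box[2] - box[0]) * (box[3] - box[1])
    area_boxes = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_box + area_boxes - inter)


def match_default_boxes(gt_boxes, gt_labels, default_boxes, threshold=0.5):
    """Assign a class label to every default box (0 means background).

    A default box takes the label of a ground-truth box when their IoU
    reaches the threshold (assumed value, taken from the SSD paper).
    """
    gt_boxes = np.asarray(gt_boxes, dtype=np.float64)
    default_boxes = np.asarray(default_boxes, dtype=np.float64)
    labels = np.zeros(len(default_boxes), dtype=np.int64)
    for box, label in zip(gt_boxes, gt_labels):
        labels[iou(box, default_boxes) >= threshold] = label
    return labels
```

If `labels` contains only zeros for an image, no default box matched any ground-truth box. Samples like that are the ones dropped from the training set, as described under "Dataset characteristics".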
## Differences from the original SSD implementation
Obviously, we are only detecting certain traffic signs in this implementation, whereas the original SSD implementation detected a greater number of object classes in the PASCAL VOC and MS COCO datasets. Other notable differences are:
* Uses AlexNet as the base network
* Input image resolution is 400x260
* Uses a dynamic scaling factor based on the dimensions of the feature map relative to original image dimensions
## Performance
As mentioned above, this SSD implementation was able to achieve 40-45 fps on a GTX 1080 with an Intel Core i7-6700K.
The inference time is the sum of the neural network inference time and the Non-Maximum Suppression (NMS) time. The neural network inference time is significantly less than the NMS time: the network generally takes 7-8 ms per frame, whereas NMS takes 15-16 ms. Together these come to roughly 22-24 ms per frame, which is consistent with the 40-45 fps figure. The NMS algorithm implemented here has not been optimized and runs on the CPU only, so that is where further performance work is most likely to pay off.
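For reference, greedy NMS can be sketched with vectorized NumPy as follows. This is a generic, class-agnostic illustration rather than the code in this repository: the function name `non_max_suppression`, the `[x1, y1, x2, y2]` box layout, and the 0.5 IoU threshold are assumptions, and boxes are assumed to have positive area.

```python
import numpy as np


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS. Returns indices of kept boxes, highest score first."""
    boxes = np.asarray(boxes, dtype=np.float64)
    scores = np.asarray(scores, dtype=np.float64)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # Intersection of the best box with every remaining box.
        x1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[best] + areas[rest] - inter)
        # Drop every remaining box that overlaps the best box too much.
        order = rest[iou <= iou_threshold]
    return keep
```

With several classes, a function like this would typically be called once per class on that class's detections.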
## Dataset characteristics
The entire LISA Traffic Sign Dataset consists of 47 distinct traffic sign classes. Since we are only concerned with a subset of those classes, we only use a subset of the LISA dataset. We also ignore all training samples for which we do not find a matching default box, further reducing the dataset's size. As a result, we end up with very little data to work with.
To mitigate this, we can perform image data augmentation and/or pre-train the model on a larger dataset (e.g. VOC2012, ILSVRC).
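As one possible form of image data augmentation, here is a small photometric sketch that randomly changes contrast and brightness. It is not part of this repository: the function name `jitter_brightness_contrast` and the jitter ranges are assumptions chosen for illustration. A photometric transform is used here because it leaves the bounding boxes unchanged.

```python
import numpy as np


def jitter_brightness_contrast(image, rng, max_delta=32.0,
                               contrast_range=(0.7, 1.3)):
    """Randomly scale contrast and shift brightness of a uint8 image.

    `rng` is a numpy.random.RandomState. The default ranges are
    illustrative assumptions, not tuned values.
    """
    alpha = rng.uniform(*contrast_range)      # contrast gain
    beta = rng.uniform(-max_delta, max_delta)  # brightness offset
    out = alpha * image.astype(np.float32) + beta
    # Clip back to the valid pixel range before converting to uint8.
    return np.clip(out, 0, 255).astype(np.uint8)
```

For example, `jitter_brightness_contrast(img, np.random.RandomState(0))` on a 400x260 input returns an image of the same shape, so the existing box annotations can be reused as they are.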
## Training process
Given the small size of our pruned dataset, I chose a train/validation split of 95/5. The model was trained with the Adadelta optimizer, using the default parameters provided by TensorFlow. Training ran for 200 epochs with a batch size of 32.
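A 95/5 split and the resulting number of batches per epoch can be sketched as below. The helper names `train_val_split` and `batches_per_epoch`, the fixed seed, and the sample count in the usage note are assumptions for illustration. The actual split logic lives in the training code.

```python
import math

import numpy as np


def train_val_split(num_samples, val_fraction=0.05, seed=0):
    """Shuffle sample indices and hold out a validation fraction."""
    indices = np.random.RandomState(seed).permutation(num_samples)
    num_val = max(1, int(round(num_samples * val_fraction)))
    return indices[num_val:], indices[:num_val]


def batches_per_epoch(num_train, batch_size=32):
    """Number of batches needed to see every training sample once."""
    return math.ceil(num_train / batch_size)
```

With a hypothetical 1000 samples, this holds out 50 for validation and trains on 950, which is 30 batches per epoch at a batch size of 32.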
## Areas of improvement
There are multiple potential areas of improvement in this project:
* Pre-train the model on VOC2012 and/or ILSVRC
* Image data augmentation
* Hyper-parameter tuning
* Optimize the NMS algorithm, or leverage an existing optimized NMS implementation
* Implement and report mAP metric
* Try different base networks
* Expand to more traffic sign classes
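For the mAP item above, per-class average precision (AP) could be computed along the following lines, with mAP being the mean of the per-class values. This is only a sketch under assumptions: the function name `average_precision` is hypothetical, it uses the all-point interpolated area under the precision-recall curve, and it expects each detection to have already been marked as a true or false positive (for example by IoU against the ground truth).

```python
import numpy as np


def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the interpolated precision-recall curve for one class."""
    order = np.argsort(scores)[::-1]  # rank detections by confidence
    tp = np.asarray(is_true_positive, dtype=np.float64)[order]
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(1.0 - tp)
    recall = cum_tp / num_ground_truth
    precision = cum_tp / (cum_tp + cum_fp)
    # Pad the curve, then make precision monotonically non-increasing.
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([0.0], precision, [0.0]))
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum rectangle areas wherever recall changes.
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))
```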