car-recognition-yolo-master.zip (CSDN library resource)


Uploaded 2023-01-29 09:46:32, 34.7 MB, RAR


car-recognition-yolo-master.rar (11 files)

car-recognition-yolo-master

car-recognition-yolo_code

src

keras_yolo.py 16KB

images.rar 32.36MB

Auto_driving_car_yolo.py 12KB

object_classes.txt 3B

yolo_自动驾驶_车辆识别介绍.docx 1.62MB

coco_classes.txt 625B

yolo_utils.py 3KB

yolo_自动驾驶_车辆识别介绍_20190124175811.pdf 873KB

README.md 115B

LICENSE 1KB

README.md 17KB

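# Car Recognition with YOLO

**Object detection with the YOLO algorithm (using the Keras framework)**

# 1. Problem description

Suppose you are working on a self-driving car and decide to start with a car detection system. To collect data, you have mounted a camera on the hood of the car, and it photographs the road ahead every few seconds while you drive. You have gathered all of these images into a folder and labeled them by drawing a bounding box around every car you found. Here is an example of such a bounding box:

![](http://www.writebug.com/myres/static/uploads/2021/10/19/46a2ccb12491c7705643022f205098f2.writebug)

If you want YOLO to recognize 80 classes of objects (see coco_classes.txt), you can represent the class label c either as an integer from 1 to 80 or as an 80-dimensional vector (80 numbers) that holds 1 at the position of the class and 0 everywhere else. Because a YOLO model takes a long time to train, I use a pre-trained model here.

# 2. The YOLO algorithm

YOLO ("you only look once") is a popular algorithm because it achieves high accuracy while still running in real time. It "looks only once" at the image in the sense that a single forward pass through the network is enough to make predictions. After non-max suppression, it outputs the recognized objects together with their bounding boxes.

## 2.1 Model detail

The first things to know are:

- The input is a batch of images of shape (m, 608, 608, 3).
- The output is a list of bounding boxes along with the recognized classes. Each bounding box is described by six numbers (Pc, bx, by, bh, bw, c). If c is expanded into an 80-dimensional vector, each bounding box is described by 85 numbers.

I will use 5 anchor boxes, so the overall pipeline is: IMAGE (m, 608, 608, 3) -> DEEP CNN -> ENCODING (m, 19, 19, 5, 85).

The encoding looks like this:

![](http://www.writebug.com/myres/static/uploads/2021/10/19/f6c70ae9dde1041f1851efdf71709d28.writebug)

If the center (midpoint) of an object falls inside a grid cell, that cell is responsible for detecting the object. Because we use 5 anchor boxes, each of the 19×19 cells encodes information about 5 boxes. Anchor boxes are defined by their height and width. For simplicity, we flatten the last two dimensions of the (19, 19, 5, 85) encoding, so the output of the deep CNN becomes (19, 19, 425), as shown below:

![](http://www.writebug.com/myres/static/uploads/2021/10/19/f84068048f0ce88938990b55ae3b6d4b.writebug)

Now, for each box (of each cell), we compute the following elementwise product and extract the probability that the box contains a given class:

![](http://www.writebug.com/myres/static/uploads/2021/10/19/6a3f8577c30d84357e8e7ef3a7c1c8dd.writebug)

Here is a visualization of the predictions:

![](http://www.writebug.com/myres/static/uploads/2021/10/19/b46c0a68fca624f053b94f55559046d4.writebug)

Each cell outputs 5 anchor boxes, so in a single look at the image (one forward pass) the model predicts 19×19×5 = 1805 boxes. Different colors denote different classes. The figure above draws only the boxes to which the model assigned a high probability, but that is still too many. We want the algorithm to output far fewer boxes, which is what non_max_suppression is for. The steps are:

- Discard boxes with a low score (meaning the box is not very confident about detecting a class).
- When several boxes overlap and detect the same object, keep only one of them.

## 2.2 Filtering with a threshold on class scores

Apply a first filter by thresholding: discard every anchor box whose class "score" is below the threshold.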
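The class score being thresholded is the product of the box confidence Pc and the class probabilities described in section 2.1. As a rough illustration only, the NumPy sketch below flattens a dummy encoding and scores a single box. All numbers in it (the confidence, the class probabilities, and the 0.4 threshold) are made up for the example and do not come from the pre-trained model.

```python
import numpy as np

# Dummy encoding for one image: 19x19 cells, 5 anchor boxes, 85 numbers per box.
encoding = np.zeros((19, 19, 5, 85))
flat = encoding.reshape(19, 19, 5 * 85)   # flatten the last two dimensions
num_boxes = 19 * 19 * 5                   # boxes predicted per forward pass

# Hypothetical values for a single anchor box (illustration only).
p_c = 0.60                    # confidence that the box contains some object
class_probs = np.zeros(80)    # probabilities c1..c80, stored at indices 0..79
class_probs[2] = 0.73
class_probs[7] = 0.15

# Elementwise product: one score per class for this box.
scores = p_c * class_probs
best_class = int(np.argmax(scores))       # index of the highest-scoring class
best_score = float(scores[best_class])    # the box's class score

threshold = 0.4                           # assumed threshold for this example
keep = best_score >= threshold

print(flat.shape, num_boxes, best_class, round(best_score, 3), keep)
```

The same product, argmax, and comparison are applied to every box at once in the tensor version of the filter.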
The model outputs 19×19×5×85 numbers in total. Each anchor box is described by 85 numbers (Pc, bx, by, bh, bw, and 80 class probabilities). Rearranging the (19, 19, 5, 85) or (19, 19, 425) encoding into the following tensors makes the next step easier:

- **box_confidence**: tensor of shape (19, 19, 5, 1) containing Pc (the confidence that an object is present) for each of the 5 anchor boxes predicted in each of the 19x19 cells
- **boxes**: tensor of shape (19, 19, 5, 4) containing (bx, by, bh, bw) for every anchor box
- **box_class_probs**: tensor of shape (19, 19, 5, 80) containing the detection probabilities (c1, c2, ... c80) for each of the 80 classes for each of the 5 anchor boxes per cell

```python
import tensorflow as tf
from keras import backend as K  # Keras 2.x backend API, as used by the original code


def yolo_filter_boxes(box_confidence, boxes, box_class_probs, threshold=.6):
    """
    Filters YOLO boxes by thresholding on object and class confidence.

    :param box_confidence: tensor of shape (19, 19, 5, 1)
    :param boxes: tensor of shape (19, 19, 5, 4)
    :param box_class_probs: tensor of shape (19, 19, 5, 80)
    :param threshold: real value; if [highest class probability score < threshold],
                      the corresponding box is discarded
    :return:
        scores  -- tensor of shape (None,), class probability score of the selected boxes
        boxes   -- tensor of shape (None, 4), (b_x, b_y, b_h, b_w) coordinates of the selected boxes
        classes -- tensor of shape (None,), index of the class detected by the selected boxes
    """
    # Step 1: compute the per-class score of every anchor box, shape (19, 19, 5, 80).
    box_scores = box_confidence * box_class_probs

    # Step 2: for each box, find the best class and its score
    # (axis=-1 operates on the last dimension).
    box_classes = K.argmax(box_scores, axis=-1)
    box_class_scores = K.max(box_scores, axis=-1)

    # Step 3: build a boolean mask from the threshold.
    filtering_mask = (box_class_scores >= threshold)

    # Apply the mask to scores, boxes and classes.
    scores = tf.boolean_mask(box_class_scores, filtering_mask)
    boxes = tf.boolean_mask(boxes, filtering_mask)
    classes = tf.boolean_mask(box_classes, filtering_mask)

    return scores, boxes, classes
```

## 2.3 non_max_suppression

Even after the score threshold has removed the low-scoring boxes, many overlapping anchor boxes are still left. A second filter is needed, as shown below, to turn the image on the left into the image on the right. This is called non-maximum suppression (NMS).

![](http://www.writebug.com/myres/static/uploads/2021/10/19/b1fb4560d9992dda18f39bcc4dac242e.writebug)
In the example above, the model predicted 3 cars, but all 3 predictions are actually the same car. NMS keeps only the most probable of the 3 boxes. So how is NMS implemented? It relies on Intersection over Union (IoU), shown below:

![](http://www.writebug.com/myres/static/uploads/2021/10/19/53738486bb0338f81f757bcb6c4f6bce.writebug)

Implement iou(). Some hints:

- In this exercise only, we define a box using its two corners (upper left and lower right): (x1, y1, x2, y2) rather than the midpoint and height/width
- To calculate the area of a rectangle you need to multiply its height (y2 - y1) by its width (x2 - x1)

You'll also need to find the coordinates (xi1, yi1, xi2, yi2) of the intersection of two boxes. Remember that:

- xi1 = maximum of the x1 coordinates of the two boxes
- yi1 = maximum of the y1 coordinates of the two boxes
- xi2 = minimum of the x2 coordinates of the two boxes
- yi2 = minimum of the y2 coordinates of the two boxes

```python
import numpy as np


def iou(box1, box2):
    """
    Compute the intersection over union (IoU) of two boxes.

    :param box1: first box, as (x1, y1, x2, y2)
    :param box2: second box, as (x1, y1, x2, y2)
    :return: iou, a real number
    """
    # Corners of the intersection rectangle.
    xi1 = np.maximum(box1[0], box2[0])
    yi1 = np.maximum(box1[1], box2[1])
    xi2 = np.minimum(box1[2], box2[2])
    yi2 = np.minimum(box1[3], box2[3])

    # Clamp width and height at 0 so that boxes that do not overlap
    # get an intersection area of 0 instead of a spurious positive product.
    inter_area = np.maximum(xi2 - xi1, 0) * np.maximum(yi2 - yi1, 0)

    # Union(A, B) = A + B - Inter(A, B)
    box1_area = (box1[2] - box1[0]) * (box1[3] - box1[1])
    box2_area = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union_area = box1_area + box2_area - inter_area

    # Intersection over union.
    iou = inter_area / union_area

    return iou
```

Now non-max suppression can be implemented. The key steps are:

- Select the anchor box with the highest score
- Compute the overlap of the other anchor boxes with the selected box, and remove those that overlap with that ancho
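
The steps listed above follow the usual greedy NMS pattern. The following is a minimal NumPy sketch of that pattern, not the repository's own implementation: the helper names `iou_corners` and `greedy_nms`, the `iou_threshold` of 0.5, and the sample boxes and scores are all assumptions made for illustration. Boxes use the corner format (x1, y1, x2, y2).

```python
import numpy as np


def iou_corners(box1, box2):
    """IoU of two boxes given as (x1, y1, x2, y2); hypothetical helper."""
    xi1, yi1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    xi2, yi2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    # Width and height are clamped at 0 for boxes that do not overlap.
    inter = max(xi2 - xi1, 0.0) * max(yi2 - yi1, 0.0)
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area1 + area2 - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes kept by greedy non-max suppression."""
    order = list(np.argsort(scores)[::-1])   # indices sorted by descending score
    keep = []
    while order:
        best = order.pop(0)                  # highest-scoring remaining box
        keep.append(int(best))
        # Drop every remaining box that overlaps the selected one too much.
        order = [i for i in order
                 if iou_corners(boxes[best], boxes[i]) < iou_threshold]
    return keep


# Three overlapping detections of one car plus one separate detection (made-up data).
boxes = np.array([[100, 100, 200, 200],
                  [105,  95, 205, 195],
                  [110, 105, 210, 205],
                  [300, 300, 380, 360]], dtype=float)
scores = np.array([0.9, 0.75, 0.6, 0.8])

kept = greedy_nms(boxes, scores)
print(kept)  # [0, 3]
```

With this data the two lower-scoring boxes around the first car are suppressed, while the separate box at index 3 survives because it does not overlap the selected one. In a TensorFlow pipeline, `tf.image.non_max_suppression` offers a built-in version of this selection.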
