You Only Look Once: Unified, Real-Time Object Detection

Joseph Redmon∗, Santosh Divvala∗†, Ross Girshick, Ali Farhadi∗†

University of Washington∗, Allen Institute for AI†, Facebook AI Research

http://pjreddie.com/yolo/

Abstract

We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. Since the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance.

Our unified architecture is extremely fast. Our base YOLO model processes images in real-time at 45 frames per second. A smaller version of the network, Fast YOLO, processes an astounding 155 frames per second while still achieving double the mAP of other real-time detectors. Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background. Finally, YOLO learns very general representations of objects. It outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

1. Introduction

Humans glance at an image and instantly know what objects are in the image, where they are, and how they interact. The human visual system is fast and accurate, allowing us to perform complex tasks like driving with little conscious thought. Fast, accurate algorithms for object detection would allow computers to drive cars without specialized sensors, enable assistive devices to convey real-time scene information to human users, and unlock the potential for general purpose, responsive robotic systems.

Current detection systems repurpose classifiers to perform detection. To detect an object, these systems take a classifier for that object and evaluate it at various locations and scales in a test image. Systems like deformable parts models (DPM) use a sliding window approach where the classifier is run at evenly spaced locations over the entire image [10].
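To make the cost of the sliding window approach concrete, the following minimal sketch enumerates the windows a classifier would have to score on a single image. The window size, scales, and stride used here are hypothetical values chosen for illustration; they are not the settings used by DPM or by any system cited in the paper.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def sliding_windows(
    image_w: int,
    image_h: int,
    base_size: int = 64,                     # hypothetical window side length
    scales: Tuple[float, ...] = (1.0, 2.0),  # hypothetical scale factors
    stride_fraction: float = 0.5,            # hypothetical: step by half a window
) -> Iterator[Box]:
    """Yield square windows at evenly spaced locations and several scales."""
    for scale in scales:
        size = int(base_size * scale)
        stride = max(1, int(size * stride_fraction))
        # Only emit windows that lie fully inside the image.
        for y in range(0, image_h - size + 1, stride):
            for x in range(0, image_w - size + 1, stride):
                yield (x, y, size, size)


# Every window is one separate classifier evaluation.
windows = list(sliding_windows(448, 448))
print(len(windows))  # 205 windows for this coarse, two-scale grid
```

Even this coarse grid requires a couple of hundred classifier evaluations per image, and finer strides or more scales grow that count quickly.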

More recent approaches like R-CNN use region proposal methods to first generate potential bounding boxes in an image and then run a classifier on these proposed boxes. After classification, post-processing is used to refine the bounding boxes, eliminate duplicate detections, and rescore the boxes based on other objects in the scene [13]. These complex pipelines are slow and hard to optimize because each individual component must be trained separately.

Figure 1: The YOLO Detection System. Processing images with YOLO is simple and straightforward. Our system (1) resizes the input image to 448 × 448, (2) runs a single convolutional network on the image, and (3) thresholds the resulting detections by the model’s confidence.

[Figure 1 panels: 1. Resize image. 2. Run convolutional network. 3. Non-max suppression. Example detections shown: Dog: 0.30, Person: 0.64, Horse: 0.28.]
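The last step of Figure 1 (confidence thresholding followed by non-max suppression) can be sketched roughly as below. This is an illustrative sketch, not the authors' implementation: the score threshold, the IoU threshold, the per-class greedy suppression, and all box coordinates are assumptions. Only the three labels and scores are taken from the figure.

```python
from typing import List, Tuple

BoxXYXY = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Detection = Tuple[str, float, BoxXYXY]       # (label, confidence, box)


def iou(a: BoxXYXY, b: BoxXYXY) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def postprocess(
    detections: List[Detection],
    score_threshold: float = 0.25,  # hypothetical value
    iou_threshold: float = 0.5,     # hypothetical value
) -> List[Detection]:
    """Drop low-confidence boxes, then greedily suppress same-class overlaps."""
    candidates = sorted(
        (d for d in detections if d[1] >= score_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept: List[Detection] = []
    for det in candidates:
        # Keep a box unless a higher-scoring box of the same class overlaps it.
        if all(det[0] != k[0] or iou(det[2], k[2]) <= iou_threshold for k in kept):
            kept.append(det)
    return kept


# Scores for Dog, Person, Horse come from Figure 1; the boxes, the duplicate
# "Person" and the low-scoring "Cat" are made up for illustration.
raw = [
    ("Dog", 0.30, (50.0, 250.0, 150.0, 350.0)),
    ("Person", 0.64, (100.0, 50.0, 200.0, 300.0)),
    ("Person", 0.40, (105.0, 55.0, 205.0, 305.0)),  # near-duplicate box
    ("Horse", 0.28, (180.0, 100.0, 400.0, 320.0)),
    ("Cat", 0.05, (0.0, 0.0, 30.0, 30.0)),          # below the threshold
]
final = postprocess(raw)
print([label for label, _, _ in final])  # ['Person', 'Dog', 'Horse']
```

The near-duplicate "Person" box is removed because it overlaps the higher-scoring one, and the "Cat" box never reaches suppression because it falls below the confidence threshold.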

We reframe object detection as a single regression problem, straight from image pixels to bounding box coordinates and class probabilities. Using our system, you only look once (YOLO) at an image to predict what objects are present and where they are.

YOLO is refreshingly simple: see Figure 1. A single convolutional network simultaneously predicts multiple bounding boxes and class probabilities for those boxes. YOLO trains on full images and directly optimizes detection performance. This unified model has several benefits over traditional methods of object detection.

First, YOLO is extremely fast. Since we frame detection as a regression problem we don’t need a complex pipeline. We simply run our neural network on a new image at test time to predict detections. Our base network runs at 45 frames per second with no batch processing on a Titan X GPU and a fast version runs at more than 150 fps. This means we can process streaming video in real-time with less than 25 milliseconds of latency. Furthermore, YOLO achieves more than twice the mean average precision of other real-time systems. For a demo of our system running in real-time on a webcam please see our project webpage: http://pjreddie.com/yolo/.
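The latency figure follows from the frame rates by simple arithmetic, as the short check below shows. It assumes that, with no batching, the latency of a frame is roughly the time the network needs to process that one frame; capture and display overheads are ignored.

```python
def frame_time_ms(fps: float) -> float:
    """Per-frame processing time in milliseconds at a given frame rate."""
    return 1000.0 / fps


base_ms = frame_time_ms(45)   # base network: about 22.2 ms per frame
fast_ms = frame_time_ms(150)  # fast version at 150 fps: about 6.7 ms per frame

print(f"base: {base_ms:.1f} ms, fast: {fast_ms:.1f} ms")
```

At 45 frames per second each frame takes about 22.2 ms, which is consistent with the stated bound of less than 25 milliseconds.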

Second, YOLO reasons globally about the image when

arXiv:1506.02640v5 [cs.CV] 9 May 2016
