Review: Text Detection in Natural Scene Images (CSDN Library resource)

Uploaded 2021-03-07 00:26:32. PDF, 170 KB.

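Text detection in natural scene images is an important research direction in computer vision. It concerns how a computer can automatically detect and recognize text in natural scene images. In the real world, images and videos are a major part of multimedia content and contain a large amount of text, which plays an important role in how people obtain information. In applications such as image retrieval and object tracking, for example, the text in an image can provide a rich and accurate source of information.

Text detection and recognition problems can be roughly divided into two categories: overlay text and scene text. Overlay text is text artificially superimposed on an image or video, such as captions in news broadcasts and subtitles in movies. Such text contrasts sharply with its background and has a uniform font size, so it is relatively easy to locate, extract, and recognize. Scene text is much harder to detect and recognize. It is text that physically exists in the natural environment, such as road signs, street signs, and shop signs. It can appear against all kinds of complex backgrounds, its color is sometimes close to the background color, and it may be affected by the natural environment and lighting, all of which make detection and recognition challenging.

With the spread of smartphones and cameras, detecting and recognizing objects in images has become an increasingly important part of daily life. Automating text detection and recognition helps people obtain information from images and benefits many applications, for example by improving the accuracy of image retrieval and by helping intelligent robots understand and interact with their environment.

Many technical approaches to text detection exist. This review mainly introduces methods that use machine learning for text detection and compares them. It also gives a brief introduction to text recognition, describes some standard datasets, and summarizes several research directions worth pursuing.

Machine learning and deep learning play a central role in text detection in natural scene images. Machine learning methods locate and classify text regions based on image features: from a training dataset, a model learns to distinguish text regions from non-text regions. Deep learning methods such as convolutional neural networks (CNNs) have shown strong performance in image processing and recognition tasks. They automatically extract and learn feature representations from text images, which enables accurate text localization and recognition.

Alongside text detection, the review discusses text recognition and lists standard datasets, such as those released by ICDAR (International Conference on Document Analysis and Recognition), which are widely used to evaluate and improve text detection algorithms. It also raises research questions for the field, such as how to improve detection accuracy and how to adapt algorithms to text detection under different lighting and weather conditions.

Overall, text detection and recognition is an active research area in computer vision and document analysis, and new theories and applications will continue to emerge as the technology develops. Researchers are working to improve the robustness, accuracy, and efficiency of the algorithms so that text detection can be used in more practical applications and people can obtain information from images and videos more easily.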


A Review: Text Detection in Natural Scene Image

Yue Sun, Abdusalam Dawut, Askar Hamdulla*

Institute of Information Science and Engineering, Xinjiang University, Urumqi, 830046, China

*corresponding author’s email: askarhamdulla@sina.com

Abstract—Multimedia files such as images and videos contain a large amount of textual information, which is an important route for people to obtain information. Therefore, the automatic detection and recognition of text has become an increasingly popular topic in computer vision and document analysis. Addressing the problem of scene text detection and recognition, this paper introduces a variety of text detection methods. It focuses on the use of machine learning for text detection and compares these methods. It then briefly introduces text recognition and some standard datasets. Finally, some valuable research topics in this field are proposed and summarized.

Keywords: Text detection, text recognition, natural image, machine learning, deep learning.

1. INTRODUCTION

With the popularization of digital devices such as mobile phones and cameras, detecting and recognizing objects in images has become an increasingly important part of people's lives. Among the many carriers of information, multimedia files such as images and videos contain a great deal of text, which is an important way for people to obtain information. The abundant and accurate information carried by text is valuable for many applications, such as image retrieval and object tracking. Therefore, text detection and recognition have become active topics in computer vision [1].

Text in images can be roughly divided into two categories: overlay text and scene text. Overlay text, also called superimposed text, is text artificially superimposed on images or video, such as transcripts and titles in news broadcasts and subtitles in movies. The background of superimposed text is simple, the text contrasts sharply with the background, and the font and size are uniform. Therefore, superimposed text is easy to locate, extract, and recognize. Scene text, by contrast, often has a complex background and sometimes has a color similar to the background, and it is affected by hardware devices, illumination intensity, shooting angle, the natural environment, and human interference. These factors make the detection, localization, and recognition of scene text more difficult.

The main factors that degrade scene text detection and recognition results are as follows [1]:

1) Multiple text attributes: Superimposed text has uniform fonts, sizes, colors, and spacing. Scene text, in contrast, varies in font and color, as well as in size and orientation. These attributes can differ even within the same background.

2) Intricate background: In most cases, the background and the text are easily confused, which makes it easy to make mistakes when distinguishing between text and non-text.

3) Other disturbance factors: Interference such as noise, blur, distortion, low resolution, uneven illumination, and partial occlusion increases the error rate of scene text detection.

To address these problems, researchers have proposed many methods.

2. RESEARCH PROGRESS ON SCENE TEXT DETECTION

In recent years, with the continuous development of machine vision, research on scene text detection and recognition has been roughly divided into text detection, text recognition, and end-to-end methods.

The general steps for text detection are as follows:

Figure 1. Text detection procedures
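
The stages named in Figure 1 (input, preprocessing, feature extraction, text/non-text decision, boxing) can be sketched in code. The following is only a minimal illustration, not the authors' method: it assumes NumPy, uses gradient magnitude as a stand-in for real text features, and applies a hypothetical threshold of 0.1. All function names are invented for this example.

```python
import numpy as np

def preprocess(image):
    """Convert an RGB uint8 image (H, W, 3) to grayscale floats in [0, 1]."""
    gray = image.astype(np.float64).mean(axis=2)
    return gray / 255.0

def extract_features(gray):
    """Per-pixel gradient magnitude (an assumed stand-in for text features)."""
    gy, gx = np.gradient(gray)
    return np.hypot(gx, gy)

def classify(features, threshold=0.1):
    """Mark pixels whose feature value exceeds a hypothetical threshold."""
    return features > threshold

def box_text_area(mask):
    """Return (top, left, bottom, right) enclosing all marked pixels, or None."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None
    return int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max())

def detect_text(image, threshold=0.1):
    """Run the stages in order: preprocess, extract, classify, box."""
    gray = preprocess(image)
    features = extract_features(gray)
    mask = classify(features, threshold)
    return box_text_area(mask)
```

A practical detector would replace `extract_features` and `classify` with the feature-based or learned components discussed in the rest of this section, and would typically return one box per text line rather than a single enclosing box.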

Scene text detection methods can be roughly divided into two categories: text positioning based on text features (i.e., using traditional methods for text localization), and text positioning based on machine learning.

2.1 Text positioning based on text features

Text in natural images has the following characteristic features: distinctive texture, uniform color within characters, abundant character edges, and a specific stroke width. Researchers have therefore proposed a number of text localization methods that exploit these features.

2.1.1 Text positioning based on texture

The texture-based method treats characters as a special kind of texture and uses image texture features to determine whether pixels belong to text. The rationale is that characters have a certain arrangement direction. The method proceeds as follows: First, the image is divided into several non-overlapping sub-regions.
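
The first step above, splitting the image into non-overlapping sub-regions, can be sketched as follows. This is an illustrative sketch assuming NumPy. The block size, the use of per-block intensity variance as a texture measure, the threshold, and all function names are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def split_into_blocks(gray, block_size):
    """Split a 2-D image into non-overlapping square blocks.

    Rows and columns that do not fill a whole block are cropped.
    Returns an array of shape (n_rows, n_cols, block_size, block_size).
    """
    h, w = gray.shape
    n_rows, n_cols = h // block_size, w // block_size
    cropped = gray[:n_rows * block_size, :n_cols * block_size]
    blocks = cropped.reshape(n_rows, block_size, n_cols, block_size)
    return blocks.swapaxes(1, 2)

def block_texture_energy(blocks):
    """Intensity variance per block (an assumed, very simple texture proxy)."""
    return blocks.var(axis=(2, 3))

def text_candidate_blocks(gray, block_size=8, threshold=0.01):
    """Boolean grid marking blocks whose texture measure exceeds the threshold."""
    blocks = split_into_blocks(gray.astype(np.float64), block_size)
    return block_texture_energy(blocks) > threshold
```

Flat regions yield a variance near zero, while blocks with dense intensity changes score higher. Texture-based methods in the literature generally rely on richer descriptors than variance, so this only illustrates the block-wise structure of the approach.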

[Figure 1 flowchart: Input image → Preprocessing → Feature extraction → Text/non-text decision based on features → Bounding boxes drawn around text regions]
