Embedded Architecture Resource - CSDN Library


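In information technology, an embedded system is a computer system built into a device so that the device can perform a specific function or operation. Embedded systems are typically special-purpose, resource-constrained, real-time, highly reliable, and low-power, and they are widely used in consumer electronics, industrial control, transportation, and medical equipment. The following points summarize the uploaded file.

The resource is titled "Embedded Architecture". Here "architecture" means the design structure of an embedded system: how hardware and software are combined and how they interact. The file description mentions a "System-on-Chip Architecture". A system-on-chip (SoC) integrates the microprocessor, memory, input/output devices, and the other parts of an electronic system onto a single chip. This reduces system size, improves performance, and lowers power consumption, and it is an important trend in embedded system design.

The keywords in the file include "FPGA" (field programmable gate array). An FPGA is a digital integrated circuit that is configured by programming, which lets designers redefine the logic function of the circuit at the hardware level. This makes it well suited to compute-heavy tasks such as image feature detection and matching. "SIFT" (scale-invariant feature transform) and "BRIEF" (binary robust independent elementary features) are two image processing algorithms, used here for feature detection and feature description, respectively. SIFT detects feature points in an image and describes them, while BRIEF generates a binary descriptor for each feature point, and those descriptors are then matched.

The paper presents an FPGA-based embedded system that performs visual detection and matching in real time. Establishing correspondences between two images taken at different times or from different viewpoints is a fundamental task in video analytics and computer vision systems, and its high computational complexity makes it a challenge for most embedded systems. The system therefore optimizes the FPGA architecture to reduce resource usage and implements BRIEF feature description and matching on the FPGA. It achieves feature detection and matching at 60 frames per second on 720p video, which meets or exceeds the needs of most practical real-time video analytics applications.

From a technical standpoint, the design is significant because it shows how to optimize an FPGA architecture for a specific algorithm and how to use the strengths of an FPGA to raise embedded system performance. This contributes to video analytics and computer vision, and it also offers ideas for the design and optimization of embedded system architectures.

"Real-time" in the file refers to the ability of a system to respond to external events in time. Real-time video analytics requires complex image processing to finish within a very short time, which usually calls for a fast processor and efficient algorithms. "Efficiency" and "effectiveness" mean that the proposed architecture is not only efficient but also performs well in practice. These terms point to two important goals of embedded system design: the system must complete its task within the specified time, and it must offer high efficiency and a good user experience.

The architecture described in the paper, together with its FPGA optimizations for visual feature detection and matching, reflects the current state and future direction of embedded systems in video analytics and computer vision. By optimizing hardware architecture and algorithms, embedded systems are becoming powerful enough to perform tasks that previously only high-end computers could handle, while keeping their advantages of small size and low cost.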


IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 24, NO. 3, MARCH 2014 525

An Embedded System-on-Chip Architecture for

Real-time Visual Detection and Matching

Jianhui Wang, Sheng Zhong, Luxin Yan, Member, IEEE, and Zhiguo Cao

Abstract—Detecting and matching image features is a funda-

mental task in video analytics and computer vision systems. It

establishes the correspondences between two images taken at

different time instants or from different viewpoints. However,

its large computational complexity has been a challenge to most

embedded systems. This paper proposes a new FPGA-based em-

bedded system architecture for feature detection and matching.

It consists of scale-invariant feature transform (SIFT) feature

detection, as well as binary robust independent elementary

features (BRIEF) feature description and matching. It is able to

establish accurate correspondences between consecutive frames

for 720-p (1280 × 720) video. It optimizes the FPGA architecture

for the SIFT feature detection to reduce the utilization of

FPGA resources. Moreover, it implements the BRIEF feature

description and matching on FPGA. Due to these contributions,

the proposed system achieves feature detection and matching

at 60 frame/s for 720-p video. Its processing speed can meet

and even exceed the demand of most real-life real-time video

analytics applications. Extensive experiments have demonstrated

its efﬁciency and effectiveness.

Index Terms—Binary robust independent elementary features

(BRIEF), feature detection and matching, ﬁeld programmable

gate array (FPGA), scale-invariant feature transform (SIFT),

system-on-chip (SoC).

I. Introduction

EFFICIENT detection and reliable matching of visual

features is a fundamental problem in computer vision

applications, such as object recognition, structure from motion,

image indexing, and visual localization. Real-time perfor-

mance is a critical demand to most of these applications,

which require the detection and matching of the visual fea-

tures in real time. Although feature detection and matching

methods have been studied in the literature, due to their

computational complexity, their pure software implementation

without using special hardware is far from satisfactory in their

performance for real-time applications. This paper is focused

on a new hardware design to enable real-time performance of

establishing correspondences between two consecutive frames

of high-resolution video.

Manuscript received March 21, 2013; revised June 19, 2013 and July 31,

2013; accepted August 5, 2013. Date of publication August 29, 2013; date

of current version March 4, 2014. This work was supported in part by the

National Pre-Research Foundation under Grant 625010221. This paper was

recommended by Associate Editor T.-S. Chang.

The authors are with the Science and Technology on Multi-Spectral

Information Processing Laboratory, School of Automation, Huazhong

University of Science and Technology, Wuhan 430074, China (e-mail:

wang.ddu@gmail.com; zhongsheng@hust.edu.cn; yanluxin@gmail.com;

zgcao@hust.edu.cn).

Color versions of one or more of the figures in this paper are available

online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCSVT.2013.2280040

In the literature, there are many different methods to detect

local features in an image, such as Harris [1], scale-invariant

feature transform (SIFT) [2], and SURF [3]. SIFT is one of

the most efﬁcient methods to detect and describe distinctive

invariant features from images. Its signiﬁcant advantage over

other methods is that the SIFT feature is invariant to image

translation, scaling, and rotation, while at the same time quite

robust to illumination changes. However, it is known that it is

very difﬁcult, if not impossible, to achieve software-based real-

time computing of SIFT due to its computational complexity.

Recently, there have been some studies using special hardware

[4]–[7] to accelerate the detection part of the SIFT algorithm,

and some of these works may achieve satisfactory real-time

performance, such as the design in [7]. However, to the best of our

knowledge, a full-fledged feature detection, description, and

matching system is yet to be designed. Besides the detection

part of SIFT, obtaining the SIFT feature descriptors is also

critical and it has been the performance bottleneck of the

whole system because it is very difﬁcult, if not impossible,

to integrate the description part of SIFT into FPGA. The main

challenge is its operational complexity, which prevents it from

being parallelized effectively.

There have been many modiﬁcations and variants of the

original SIFT descriptor to speed it up at the algorithmic

level. Broadly speaking, these methods can be divided into

two classes. One is to shorten the size of the SIFT feature by

applying dimensionality reduction, such as principal compo-

nent analysis (PCA) [8], to the original SIFT feature descriptor.

Another way is to quantize its ﬂoating-point coordinates into

integer codes on fewer bits, such as the results presented in

[9]–[11]. Building on these contributions, Calonder et al.

[12] presented a method to extract feature descriptors very

efﬁciently, called binary robust independent elementary fea-

tures (BRIEF), which greatly reduced the memory demanded

to store the feature descriptors and the time consumed to

match the features, while yielding comparable recognition

accuracy.
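
To make the idea of a binary descriptor concrete, the following is a minimal sketch of a BRIEF-style descriptor in plain Python. It is not the authors' FPGA design. The 256-bit length, the 31 × 31 patch, the uniform sampling of test locations, and the function names make_test_pairs and brief_descriptor are all assumptions made for illustration, and the smoothing step that BRIEF applies before the tests is omitted.

```python
import random

def make_test_pairs(n_bits=256, patch_size=31, seed=0):
    """Draw n_bits pairs of pixel offsets inside a square patch.

    The pairs are drawn once and reused for every key-point.
    Uniform sampling is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    half = patch_size // 2

    def offset():
        return (rng.randint(-half, half), rng.randint(-half, half))

    return [(offset(), offset()) for _ in range(n_bits)]

def brief_descriptor(image, keypoint, pairs):
    """Return the descriptor as an integer bit string.

    Bit i is 1 when the intensity at the first offset of pair i is
    strictly smaller than the intensity at the second offset.
    The key-point must lie at least half a patch away from the border.
    """
    row, col = keypoint
    bits = 0
    for i, ((dy1, dx1), (dy2, dx2)) in enumerate(pairs):
        if image[row + dy1][col + dx1] < image[row + dy2][col + dx2]:
            bits |= 1 << i
    return bits

# Hypothetical 64 x 64 test image with a repeating intensity pattern.
image = [[(7 * r + 13 * c) % 256 for c in range(64)] for r in range(64)]
flat = [[128] * 64 for _ in range(64)]

pairs = make_test_pairs()
descriptor = brief_descriptor(image, (32, 32), pairs)
print(f"{descriptor:064x}")
```

Each bit needs only one comparison of two pixel values, which is why the descriptor is cheap to compute and compact to store.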

In order to achieve real-time feature detection and matching

for 720-p video, we propose to replace the original SIFT

descriptor by the BRIEF descriptor in this paper. Considering

the space, power, and real-time constraints of an embedded

system, we implement the whole system on a single FPGA

chip. This system consists of SIFT detection, BRIEF descrip-

tion, and BRIEF matching. The proposed FPGA-based feature

1051-8215 © 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.

See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


detection and matching system can establish correspondences

between two 1280 × 720 images within 16 ms, which can

meet the demands of most real-time applications. The main

contributions of this paper include the following aspects.

1) To the best of our knowledge, it is the ﬁrst hardware

design to implement visual feature detection, description

and matching on a single FPGA based on SIFT+BRIEF.

Moreover, it can complete these computations in real

time for 720-p video. More speciﬁcally, it can achieve

60 frames/s for 1280 × 720 images.

2) It combines the stability and repeatability of SIFT key-

points with the efﬁciency of BRIEF to meet the real-

time requirements from real-life computer vision appli-

cations.

3) Due to the optimization in the proposed design, the SIFT

feature detection part presented in this paper is one of

the fastest designs, while its FPGA resource utilization

is the smallest.
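
The relation between the 60 frames/s target and the 16 ms figure quoted above can be checked with a short calculation. The sketch below uses only the resolution and frame rate stated in the text. The function names are illustrative, and the pixel rate counts active pixels only, ignoring any blanking intervals of a real video interface.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds."""
    return 1000.0 / fps

def pixel_rate(width, height, fps):
    """Active pixels that must be consumed per second."""
    return width * height * fps

budget = frame_budget_ms(60)
rate = pixel_rate(1280, 720, 60)
print(f"frame budget: {budget:.2f} ms")
print(f"pixel rate:   {rate / 1e6:.1f} Mpixel/s")
```

A processing time of 16 ms per image pair fits inside the budget of about 16.67 ms per frame, which is what sustaining 60 frames/s requires.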

The rest of the paper is organized as follows. Section II

presents related works. It starts from the early attempts in

this topic and reviews related implementations speciﬁcally

on FPGAs. Section III introduces the proposed SIFT feature

detection associated with BRIEF feature description method.

Section IV presents the technical details of our design. Ex-

perimental results are reported in Section V and Section VI

concludes the paper.

II. Related Works

Generally speaking, there are two approaches to estab-

lish correspondences for image sequences: intensity-based

and feature-based. Feature-based approaches ﬁnd correspond-

ing local visual features between images, such as points,

lines, and contours, while the intensity-based approaches

compare intensity patterns in images via correlation met-

rics. As shown in the works of Lowe [2], Sivic and

Zisserman [13], and Mikolajczyk and Schmid [14], feature-

based methods have shown very good results for image

matching [15]–[17]. However, due to their computational

complexity, it is very difﬁcult, if not impossible, to have

pure software implementations (without using any special-

ized hardware, e.g., GPU) of these methods for real-time

applications.
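
As an illustration of what comparing intensity patterns via a correlation metric means, the sketch below computes the normalized cross-correlation of two equally sized patches. The text does not name a specific metric, so the choice of normalized cross-correlation and the function name ncc are assumptions.

```python
import math

def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equally sized patches.

    Returns a value in [-1, 1]. A patch with no intensity variation
    has no defined correlation, so 0.0 is returned in that case.
    """
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    if len(a) != len(b) or not a:
        raise ValueError("patches must be non-empty and of equal size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0

patch = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
brighter = [[2 * v + 10 for v in row] for row in patch]
inverted = [[255 - v for v in row] for row in patch]

print(ncc(patch, brighter))
print(ncc(patch, inverted))
```

The score is unaffected by a linear change in brightness, but every candidate position requires a pass over all pixels of the patch, which is one reason feature-based methods are attractive for matching.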

McCartney et al. [15] presented an image registration algo-

rithm for image sequences captured by unmanned aerial ve-

hicles (UAV). This design was a pure software-based system,

and it consumed about 1 s to complete image registration for

640 × 480 images, in which the image feature detection and

matching steps consumed about 0.8 s. These timings were measured on

a personal computer (PC) equipped with an Intel Core 2 Duo

processor running at 1.67 GHz and 2 GB of RAM.

Agrawal et al. [16] presented two feature detection methods

named as CenSurE-DOB and CenSurE-OCT, respectively, as

well as a feature description method called MUSURF. They

claimed that the feature detection part CenSurE-DOB took

23 ms and CenSurE-OCT 17 ms to detect center-surround

features from 512 × 382 images, and the feature description

step only took 10 ms. Their results were obtained on an

Intel Pentium-M 2 GHz machine. This design may be adequate for

real-time applications on low-resolution image sequences; how-

ever, its performance is still far from the demands of high-

resolution image sequences. Moreover, the time consumed by

the feature matching process was excluded from the processing

time presented in [16].

Wang et al. [17] presented a multiple view kernel projection

(MVKP)-based real-time image matching design. It utilized a

kernel-projection scheme to describe the image patch centered

at a detected key-point. It treated feature matching as a

classiﬁcation problem, which leads to an online matching

speed ﬁve times faster than SIFT. It took about 0.1 s to

complete the feature description on a PC with a Pentium IV

1.4-GHz CPU.

In order to achieve real-time performance, GPU-based soft-

ware implementation has been studied, such as the work

presented by Cornelis [18], Heymann [19], and Sinha [20]. The

work presented in [18] can achieve 100 frame/s for 640 × 480

images and took about 32 ms to match 3000 features to another

3000 features. However, these methods have to largely depend

on the performance of the GPU chip and other hardware

conﬁgurations, and their performances vary signiﬁcantly from

computer to computer.

As the real-time performance is critical to real-life computer

vision applications, it is natural to resort to effective hardware

design for efﬁcient feature detection and description. Bonato

et al. [5] presented an FPGA-based architecture for SIFT

feature detection. This implementation operated at 30 frame/s

on 320 × 240 images. However, its implementation of the

feature description part of SIFT was on the NIOS II soft-

core processor. This step took 11.7 ms per detected feature,

which makes it infeasible to perform as a full real-time SIFT

implementation. As a single image may have hundreds of

features, it is still far from satisfactory for the real-time

performance.

Zhong et al. [4] presented a design for SIFT feature de-

tection and description based on FPGA+DSP architecture for

320 × 256 images. The feature detection part of their design

can achieve satisfactory real-time performance. However, the

feature description part of their system was implemented in

DSP. Although it took 80 μs to detect one feature, it may fall

short of real-time performance when the number of features

in an image reaches 400 or more (this is generally the case

for high-resolution images such as HDTV).

Mizuno et al. [6] proposed a SIFT feature detection design

for HDTV based on ASIC. In this system, the input images

were stored in the external SDRAM and divided into several

regions of interest (ROIs), and the feature detection archi-

tecture detects features in each ROI. Since the ROI image

is much smaller than the input image, it can reduce the on-

chip memory effectively. However, it needed external memory

to store the input image. Moreover, the performance of the

feature description of this design was not reported, and it is

unclear if it can be done in real time.

The FPGA-based design presented by Svab et al. [21]

took about 100 ms to detect the SURF features from a

1024 × 768 image. In this design, the feature detection part

of SURF was implemented in FPGA logical blocks, but the


feature description part was implemented in POWERPC with

ﬂoating-point arithmetic. Although 10 frames/s is satisfactory

for a few real-life applications, it is still far from true real-

time performance. Moreover, when considering the feature

matching part, it cannot satisfy the demand of real-time

performance.

The SIFT algorithm was modiﬁed in [22] to obtain a high-

speed feature detector. This system took 31 ms to detect

multiple features from 640 × 480 images. It implemented two

octaves with four scales of SIFT, and reduced the dimension

of the feature descriptor from 128 to 72 in order to obtain

the desired speed. Due to these approximations, this dedicated

design gives a near real-time performance. However, the

performance of the feature description of their design is not

presented, let alone the feature matching. Moreover, although

the feature descriptor of their design is reduced to 72 from

128, the memory requirement to store the feature descriptor is

still very demanding.
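
The storage cost behind this remark can be estimated per descriptor. In the sketch below, the element counts 128 and 72 come from the text, while the 8 bits per element, the 256-bit BRIEF length, and the count of 1000 features are assumptions chosen only to make the comparison concrete.

```python
def descriptor_bytes(n_elements, bits_per_element):
    """Storage for one descriptor, rounded up to whole bytes."""
    return (n_elements * bits_per_element + 7) // 8

N_FEATURES = 1000  # hypothetical number of features per frame

layouts = {
    "SIFT, 128 elements": descriptor_bytes(128, 8),   # assumes 8 bits per element
    "reduced, 72 elements": descriptor_bytes(72, 8),  # assumes 8 bits per element
    "BRIEF, 256 bits": descriptor_bytes(256, 1),      # assumes a 256-bit descriptor
}

for name, size in layouts.items():
    total_kib = size * N_FEATURES / 1024
    print(f"{name}: {size} B per feature, {total_kib:.1f} KiB per frame")
```

If the elements were stored as 32-bit floating-point values instead, the first two rows would grow by a factor of four while the binary descriptor would stay the same size.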

Schaeferling et al. [23] proposed a SURF-based object

recognition system on a single FPGA. One of the core parts of

this system is Flex-SURF+. It contains an array of difference

elements (DEs) to overcome the irregular memory access

behavior exposed by SURF during the computation of image

ﬁlter responses. A special computing pipeline processes these

ﬁlter responses and determines the ﬁnal detection results. Flex-

SURF+ allows a tradeoff between area and speed, which

makes it efﬁcient with high-end FPGAs as well as low-end

ones, depending on the application requirements. By using

Flex-SURF+, the SURF detector's determinant calculation step

requires just 70 ms per frame. The minimum, average, and

maximum total execution time per 320 × 240 frame are 191 ms,

481 ms, and 1053 ms, respectively.

Huang et al. [7] presented a hardware-based SIFT feature

extraction architecture, which can extract SIFT features for

640 × 480 images within 33 ms when the number of feature

points to be extracted is fewer than 890. This system consists

of two interactive hardware components, one for feature detec-

tion and the other for feature description. This system took 3.4

ms to detect SIFT features from 640 × 480 images and took

33.1 μs to extract the descriptor for one detected feature. How-

ever, the details of the feature description were not presented.

Moreover, there was no report on the time consumption for

feature matching, which is also a computationally intensive

part for the system.
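
The per-feature description costs quoted in this section for [5], [4], and [7] can be turned into per-frame times with simple arithmetic. The sketch below assumes that description time scales linearly with the number of features and that detection and description run one after the other, which the cited designs may not do. The function names and the count of 400 features are illustrative.

```python
# Per-feature description cost in microseconds, as quoted in the text.
PER_FEATURE_US = {
    "Bonato et al. [5]": 11700.0,
    "Zhong et al. [4]": 80.0,
    "Huang et al. [7]": 33.1,
}

def description_time_ms(n_features, per_feature_us):
    """Total description time for one frame, in milliseconds."""
    return n_features * per_feature_us / 1000.0

def max_features(budget_ms, per_feature_us, fixed_ms=0.0):
    """Largest feature count whose description fits in the budget
    after subtracting a fixed cost such as detection time."""
    return int((budget_ms - fixed_ms) * 1000.0 / per_feature_us)

for name, cost_us in PER_FEATURE_US.items():
    print(f"{name}: {description_time_ms(400, cost_us):.1f} ms for 400 features")

# Huang et al. [7]: 3.4 ms detection plus 890 descriptors.
huang_total_ms = 3.4 + description_time_ms(890, PER_FEATURE_US["Huang et al. [7]"])
print(f"Huang et al. [7], 890 features: {huang_total_ms:.3f} ms")
print(max_features(33.0, PER_FEATURE_US["Huang et al. [7]"], fixed_ms=3.4))
```

Under these assumptions the numbers reported for [7] are roughly self-consistent: 890 descriptors plus detection come to just under 33 ms.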

In this paper, we propose a single FPGA-based design

to detect and match visual features in real time for real-life

applications. Considering the stability and repeatability of the

image matching system, we select SIFT key-points as the feature points

in our design. In order to achieve real-time performance,

we replace the SIFT descriptor by the BRIEF descriptor.

Moreover, we also implement a BRIEF feature matching

component in the FPGA. The processing speed of the proposed

system can comfortably achieve 60 frames/s for 1280 × 720

images, which is appealing to most real-life applications.

III. Method

The main purpose of our design is to achieve real-time

performance for image feature detection and matching on

720-p video. In this section, we introduce the process to

establish correspondences between consecutive video frames.

As the proposed system is based on SIFT key-points associated

with the BRIEF descriptors, we also briefly review the SIFT

detection and BRIEF description in this section to make

this paper self-contained. For details please see [2], [12],

and [24].

Fig. 1. Process diagram of the proposed system.

A. SIFT Key-point Associated With BRIEF Feature

As mentioned in Section II, it is very difﬁcult for a

pure software implementation without using special hardware

to achieve real-time performance for high-resolution feature

detection and matching. It is natural to resort to effective

hardware design. In the proposed design, we implement the

whole feature detection and matching system on a single

FPGA. Hence, we must choose the appropriate feature de-

tection, description, and matching methods considering not

only their computational complexity, but also the constraints on

the available FPGA resources. Among the state-of-the-art

feature description approaches in the literature, such as SIFT,

SURF, and BRIEF, we found that BRIEF is a good choice as

it is very efficient and accurate. Moreover, it is also very suitable

for parallel computation, which is a very important factor

in our FPGA implementation.

Although the BRIEF descriptor is associated with the SURF

key-point in the original BRIEF paper [12], we combine it

with SIFT key-point in this design. Since SIFT is more stable

and robust than SURF and the number of features detected

by SIFT is smaller than that of SURF, it can greatly reduce

memory requirement to store the feature descriptors. Although

detecting SURF feature points is faster than SIFT points

based on software, these two methods have almost the same

processing speed when implemented in hardware. For these

reasons, we choose SIFT instead of SURF as the feature

detection method in our system.

In order to establish correspondences between consecutive

video frames, we need to store the visual features of the

two frames. The main task of the feature matching part in our

system is to find feature matching pairs that have the minimum

Hamming distances. The process diagram of the proposed

system is presented in Fig. 1. The SIFT feature detection

module detects the SIFT key-points for every frame (of

the input 720-p video) and the BRIEF feature description

part extracts the BRIEF descriptors for each detected SIFT

key-point. Then, the extracted BRIEF feature descriptors are

stored. Finally the BRIEF feature matching part reads the

BRIEF feature descriptors from feature storage A and B (for

the two images, respectively) to ﬁnd the matching pairs. The
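
The matching step described above, finding for each descriptor in storage A the descriptor in storage B with the minimum Hamming distance, can be sketched in software as an exhaustive search. This is an illustration of the principle only, not the hardware architecture of the paper. The function names and the 8-bit toy descriptors are assumptions, and any acceptance threshold or cross-check the real system may apply is left out.

```python
def hamming(a, b):
    """Number of differing bits between two integer-coded descriptors."""
    return bin(a ^ b).count("1")

def match_min_hamming(storage_a, storage_b):
    """For each descriptor in storage_a, find the closest one in storage_b.

    Returns a list of (index_a, index_b, distance) tuples.
    Ties are resolved in favor of the lowest index in storage_b.
    """
    if not storage_b:
        return []
    matches = []
    for i, desc_a in enumerate(storage_a):
        best_j, best_d = 0, hamming(desc_a, storage_b[0])
        for j in range(1, len(storage_b)):
            d = hamming(desc_a, storage_b[j])
            if d < best_d:
                best_j, best_d = j, d
        matches.append((i, best_j, best_d))
    return matches

# Toy 8-bit descriptors standing in for the two feature storages.
storage_a = [0b10110010, 0b00001111]
storage_b = [0b00001110, 0b10110011, 0b11111111]

print(match_min_hamming(storage_a, storage_b))
# prints [(0, 1, 1), (1, 0, 1)]
```

The distance is an XOR followed by a bit count, two operations that are inexpensive in digital logic and can be evaluated for many descriptor pairs at once.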
