from general visual recognition to facilitate mobile visual search. Furthermore, they released a data set for performance evaluation. Chatzilari et al. [10] performed an extensive comparative study of different recognition approaches on mobile devices, evaluating the performance of feature extraction and encoding algorithms. Compared with general visual recognition, mobile visual recognition poses its own unique challenges:
– Limited network bandwidth With the development of Internet communication technologies such as 4G, network bandwidth has increased rapidly. However, it remains a bottleneck in many areas, especially densely populated ones where many people use mobile devices simultaneously. Many mobile visual systems extract features on the mobile side, but the amount of visual feature data sent from the mobile side to the server must be kept small to satisfy real-time query requirements, which can degrade recognition performance. Therefore, under limited network bandwidth, sending compressed features without sacrificing recognition performance is a challenging problem in the mobile recognition environment.
– Limited battery power Existing mobile devices have limited battery capacity. Sending a feature vector of the query image instead of the raw image saves network bandwidth and further reduces the transmission cost; however, computing features on the device consumes significant battery power. This strains users' tolerance for short battery life, since recharging is usually inconvenient, especially when traveling.
– Diverse photo-taking conditions Because of the different camera configurations of mobile devices (e.g., different resolutions) and diverse indoor/outdoor conditions (e.g., varying weather), achieving robust visual recognition under these conditions is also very challenging.
To address these problems, many existing works [13, 33, 41, 108] have developed visual recognition methods to improve the mobile visual recognition experience. These methods directly extract visual features for image representation, including deep features [52]. To reduce the amount of data sent from the mobile device to the server, encoding methods have been developed to compress the visual features on the mobile side, such as SURF [7], CHoG [8], and BoHB [40]. However, one shortcoming of these approaches is that they analyze the visual content alone, ignoring the rich contextual information (e.g., GPS and time) that is easily acquired by the mobile device and that can both reduce recognition time and improve recognition performance.
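To make this client-side compression concrete, the following Python sketch scalar-quantizes a floating-point descriptor to one byte per dimension before transmission; the 128-D descriptor, the quantization scheme, and the function names are illustrative assumptions, not details of CHoG, BoHB, or any other cited system, which rely on more sophisticated entropy-coded representations.

import numpy as np

def compress_descriptor(desc: np.ndarray, levels: int = 256) -> bytes:
    # Illustrative scalar quantization: L2-normalize, then map each
    # component from [-1, 1] onto {0, ..., levels - 1} (one byte each).
    desc = desc / (np.linalg.norm(desc) + 1e-12)
    codes = np.round((desc + 1.0) / 2.0 * (levels - 1)).astype(np.uint8)
    return codes.tobytes()

def decompress_descriptor(payload: bytes, levels: int = 256) -> np.ndarray:
    # Server-side inverse: recover an approximate float descriptor.
    codes = np.frombuffer(payload, dtype=np.uint8).astype(np.float32)
    return codes / (levels - 1) * 2.0 - 1.0

# A 128-D float32 descriptor (512 bytes) shrinks to 128 bytes, i.e., a
# 4x reduction in query traffic, at the cost of a small quantization error.
query = np.random.randn(128).astype(np.float32)
payload = compress_descriptor(query)
approx = decompress_descriptor(payload)
print(len(payload))

Even this naive scheme quarters the transmitted payload; the dedicated codecs cited above aim to push the bit rate far lower by exploiting the statistics of the descriptors themselves.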
In fact, mobile devices provide rich contextual information, which can be categorized into two levels. One is internal contextual information that is intrinsically contained in the mobile device, such as stored textual/visual content, camera settings, and other sensors' parameters. The other is external contextual information that can be easily acquired by the mobile device, such as time and geo-location. Researchers have exploited many of these cues to improve recognition performance. Commonly used contexts include location, direction, time, text, gravity, acceleration, and other camera parameters. For example, in [95], content analysis is essentially filtered
by a pre-defined area centered at the GPS location of the query image. Chen et al. [11] utilized GPS information to narrow the search space for landmark recognition. Ji et al. [51] designed a GPS-based location discriminative vocabulary coding scheme, which achieves extremely low-bit-rate query transmission for mobile landmark search. Chen et al. [19, 22] combined visual information with contextual information, including location and direction, for mobile landmark recognition.
Runge et al. [86] suggested tags for images using the location name and time period. Gui et al. [36] fused the outputs of inertial sensors with computer vision techniques for mobile scene recognition. In such cases, exploiting contextual information in mobile visual recognition can reduce recognition time and improve recognition performance.
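As a concrete illustration of the location-based filtering used in [95] and [11], the Python sketch below restricts the candidate set to database entries within a radius of the query's GPS fix before any visual matching; the radius, the record layout, and the function names are our own illustrative assumptions.

import math

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance in kilometers between two GPS fixes.
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def geo_filter(candidates, query_lat, query_lon, radius_km=1.0):
    # Keep only entries near the query location, so the expensive
    # visual matcher runs on a much smaller candidate set.
    return [c for c in candidates
            if haversine_km(c["lat"], c["lon"], query_lat, query_lon) <= radius_km]

# Hypothetical landmark records; real entries would also carry descriptors.
db = [{"name": "Eiffel Tower", "lat": 48.8584, "lon": 2.2945},
      {"name": "Notre-Dame", "lat": 48.8530, "lon": 2.3499}]
print([c["name"] for c in geo_filter(db, 48.8580, 2.2950)])  # ['Eiffel Tower']

Because the geometric test is orders of magnitude cheaper than descriptor matching, such pruning cuts both recognition time and the chance of visually similar but spatially implausible matches.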
In this survey, we give a comprehensive overview of Context-Aware Mobile Visual Recognition (CAMVR). A typical pipeline for CAMVR is shown at the top of Fig. 1. On the client side, the input is the captured object (e.g., a landmark, food, clothing, or a painting) together with the contextual information acquired by the mobile phone (e.g., location, time, and weather). After the input is sent to the server, a recognition method (e.g., classification or retrieval) is selected on the server side to recognize the object, and the relevant information is returned to the user as the output. Viewed as an overall system, CAMVR can be examined from three aspects, namely contextual information, recognition methods, and recognition types. Built on the CAMVR system, there are many potential applications (shown at the bottom of Fig. 1), such as mobile product search, mobile recommendation, and augmented reality.
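Putting the pieces together, a minimal end-to-end sketch of the Fig. 1 pipeline might look as follows; the query structure, the crude bounding-box pruning, and the toy similarity function are illustrative assumptions rather than any specific published system.

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class CamvrQuery:
    # What the client transmits: compressed visual features plus context.
    features: bytes
    lat: Optional[float] = None
    lon: Optional[float] = None

def similarity(a: bytes, b: bytes) -> float:
    # Toy byte-overlap score standing in for real descriptor matching.
    return float(sum(x == y for x, y in zip(a, b)))

def recognize(query: CamvrQuery, database: List[dict]) -> str:
    # Server side: prune candidates by context when available, then
    # return the name of the visually most similar entry (retrieval).
    candidates = database
    if query.lat is not None and query.lon is not None:
        # Crude ~1 km bounding box; a real system would use a proper
        # geodesic distance as in the geo_filter sketch above.
        nearby = [d for d in database
                  if abs(d["lat"] - query.lat) < 0.01
                  and abs(d["lon"] - query.lon) < 0.01]
        if nearby:              # fall back to full search otherwise
            candidates = nearby
    return max(candidates,
               key=lambda d: similarity(query.features, d["features"]))["name"]

The server could equally dispatch to a classifier instead of the retrieval branch; these two families of recognition methods are surveyed in Sect. 3.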
The rest of the survey is organized as follows: In Sects. 2 through 4, we survey state-of-the-art approaches to CAMVR according to different contextual information, different recognition methods, and different recognition types, respectively. In Sect. 5, we introduce various application scenarios based on CAMVR. Finally, we conclude the paper with a discussion of future research directions in Sect. 6.