
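The paper "Understanding Multiple Features with Hyper-cube for Distinguishing Uncertain Objects in Mobile Crowdsensing" studies how to understand multiple features of uncertain objects in mobile crowdsensing (MCS) and how to distinguish those objects with a hypercube structure. A few key concepts and some background are needed first. Mobile crowdsensing refers to applications that use mobile devices to collect and share data about the human body or the physical world, with the goal of benefiting users. Smartphones integrate a variety of sensors and have strong computing, storage, and communication capabilities, which makes them an ideal platform for mobile crowdsensing. In mobile crowdsensing the data usually carry uncertainty, and the objects they correspond to are often only vaguely specified. To improve performance, researchers tend to increase the number of features, but the more features are used, the more redundancy and cost are introduced. The number of features selected for a given application is therefore a tradeoff between accuracy and cost.

Another important concept in the paper is relative entropy. In information theory, relative entropy measures the difference between two probability distributions. The paper argues that, compared with Euclidean distance, defining the edges between hypercube vertices with relative entropy measures the difference between the probability distributions of the data more accurately.

The core of the paper is a model that casts the tradeoff between accuracy and cost as an optimization problem. It proposes a multi-feature sensing model built on a hypercube structure, in which each feature of an uncertain object is represented as a component of a vertex's coordinate in the hypercube, and the edges between vertices are defined with relative entropy rather than Euclidean distance. The authors evaluate the scheme with real data from a crowdsensing recognition case, collected by sensors on smartphones. The keywords are uncertain object, crowdsensing, relative entropy, and multiple features.

The introduction reviews the popularity of smartphones, which are equipped with various sensors and also have powerful computing, storage, and communication capabilities. This combination of sensing and computing makes smartphones well suited to mobile crowdsensing, which can be used to sense the environment, infrastructure, and even social activities. For example, P. Dutta et al. [2] propose a system (Common Sense) that monitors the air condition in a city. More broadly, researchers can use crowdsensing to detect pollution, monitor how infrastructure is used, and analyze the hot spots and dynamics of social activities.

The study proposes a new method for balancing accuracy and cost in feature selection. By using the hypercube structure with relative entropy as the measure, the authors give mobile crowdsensing applications a framework for handling uncertain data and identifying specific objects. The work has potential value in smart cities, health monitoring, environmental science, and other fields. Optimizing feature selection and cost can further improve the efficiency and accuracy of smartphones and other mobile devices when collecting and processing data.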


Understanding Multiple Features with Hypercube for Distinguishing Uncertain Objects in Mobile Crowdsensing

Liu Bin∗, Chao Song∗, Ming Liu∗, Nianbo Liu∗, Jinqi Zhu†

∗ School of Computer Science and Engineering, University of Electronic Science and Technology of China, China

† School of Computer and Information Engineering, Tianjin Normal University, China

Email: binliu520@gmail.com, {chaosong, csmliu, liunb}@uestc.edu.cn, jingpei719@163.com

Abstract—Uncertain data are inherent in mobile crowdsensing applications, and the objects that they correspond to are usually only vaguely specified. To improve performance, we often increase the number of features. However, the more features are used, the more redundancy and cost are involved. The number of features selected for a given application is therefore a tradeoff between accuracy and cost. In this paper, we model this tradeoff as an optimization problem. To investigate the problem, we propose to model sensing with multiple features under a hypercube structure. In our scheme, each feature of an uncertain object is represented as a component of a vertex's coordinate in the hypercube. We define the edges between vertices with relative entropy rather than Euclidean distance, because relative entropy accurately measures the difference between two probability distributions of data. We evaluate our proposed schemes with real data from a crowdsensing recognition case, collected by sensor-equipped smartphones.

Keywords—uncertain object, crowdsensing, relative entropy, multiple features

I. INTRODUCTION

Recently, smartphones have become extremely popular. They are not only equipped with various sensors but also have powerful computing, storage, and communication capabilities. This integration of sensing and computing makes the smartphone well suited to mobile crowdsensing. Mobile crowdsensing (MCS) refers to applications that leverage mobile devices to collect and share data about the human body or the physical world, toward a goal that benefits users. With crowdsensing techniques, we can sense the environment, infrastructure, and even social activities [1]. For example, P. Dutta et al. [2] propose a system (Common Sense) to monitor the air condition in a city.

In crowdsensing applications, the data collected by smartphones are usually inherently uncertain. For example, because of data randomness and the limitations of measuring sensors, the collected data may be incomplete or may even contain errors. In other cases, the collected data may correspond to objects that are only vaguely specified, so we need to explore data representation uncertainties [3]. Generally, an uncertain object can be represented by a probability distribution of uncertain data.
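
As a minimal sketch of that last point, the snippet below turns a handful of raw readings for one object into a discrete probability distribution by fixed-width binning. The function name `empirical_distribution`, the bin width, and the loudness values are hypothetical illustrations, not taken from the paper.

```python
from collections import Counter


def empirical_distribution(readings, bin_width):
    """Return {bin_index: probability} for a list of numeric readings."""
    if not readings:
        raise ValueError("need at least one reading")
    # Assign each reading to a fixed-width bin via floor division.
    counts = Counter(int(r // bin_width) for r in readings)
    total = len(readings)
    return {b: c / total for b, c in sorted(counts.items())}


# Hypothetical loudness readings (dB) observed for one object.
readings = [41.2, 43.8, 44.1, 47.5, 52.0, 44.9, 46.3, 43.0]
dist = empirical_distribution(readings, bin_width=5.0)
print(dist)
```

Each key is a bin index (bin 8 covers 40 to 45 dB here) and the values sum to 1, so two objects observed through the same feature can be compared as distributions rather than as single readings.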

In crowdsensing, features are extracted from sensory data. For instance, the x-axis reading of a gyroscope is one kind of feature. Sometimes one sensor yields more than one feature. For example, ambient sound collected by a microphone is characterized by loudness, frequency, and so on. For simplicity, we use the single term feature whenever we mention a sensor or a feature in this paper. In short, we treat all sensors as features, whether a sensor generates a single feature or multiple features.

To better distinguish uncertain objects from uncertain data, the usual first step is to increase the number of features in order to improve accuracy. For example, in [4], to improve the localization service, the authors used at least four sensors, including the microphone, camera, WiFi radio, and accelerometer. The aim is to combine multiple features for a reliable localization service. Multiple features were also used in [5], where the authors used the accelerometer, microphone, GSM radio, and GPS sensors of smartphones to detect traffic conditions such as potholes, bumps, braking, and honking. However, the more features we introduce into an application, the more energy it costs. Therefore, the number of features selected for a given application is a tradeoff between accuracy and cost. Existing crowdsensing solutions nevertheless usually combine all potentially valuable features or sensors to improve accuracy, without considering the tradeoff between energy spending and accuracy.

In this paper, we propose to use multiple features to distinguish uncertain objects in crowdsensing under a constraint on energy cost. In our scheme, we model the tradeoff between accuracy and energy cost as an optimization problem. We achieve this in two steps. First, we quantify accuracy and energy cost respectively. Second, we maximize accuracy subject to the energy cost constraint.
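
The two steps can be illustrated with a small sketch, assuming accuracy and energy cost have already been quantified. It is not the paper's formulation, which lies outside this excerpt. The per-feature costs, the `gain` values, and the `toy_accuracy` model (independent per-feature detection) are all hypothetical, and the search is a plain exhaustive enumeration that only suits a small number of features.

```python
from itertools import combinations


def select_features(costs, accuracy_fn, budget):
    """Exhaustively pick the feature subset with the highest accuracy
    whose total energy cost does not exceed the budget."""
    names = sorted(costs)
    best_subset, best_acc = (), accuracy_fn(())
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            if sum(costs[f] for f in subset) > budget:
                continue  # violates the energy constraint
            acc = accuracy_fn(subset)
            if acc > best_acc:
                best_subset, best_acc = subset, acc
    return best_subset, best_acc


# Hypothetical energy cost per feature (arbitrary units).
costs = {"accelerometer": 1.0, "gps": 6.0, "microphone": 3.0, "wifi": 4.0}
# Hypothetical chance that each feature alone distinguishes the object.
gain = {"accelerometer": 0.5, "gps": 0.8, "microphone": 0.4, "wifi": 0.6}


def toy_accuracy(subset):
    """Toy model: the subset fails only if every feature in it fails."""
    miss = 1.0
    for f in subset:
        miss *= 1.0 - gain[f]
    return 1.0 - miss


best, acc = select_features(costs, toy_accuracy, budget=8.0)
print(best, round(acc, 3))
```

With these made-up numbers the budget excludes any subset that pairs GPS with the microphone or WiFi, so the search settles on a cheap feature combined with one expensive one instead of using every sensor.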

Moreover, to investigate this problem, we propose to use the hypercube to explain the optimal solution of sensing with multiple features. We map the features selected for distinguishing an uncertain object into a hypercube space. In the hypercube, each feature is represented as the corresponding component of the uncertain object's coordinate. Since uncertain objects are usually represented by probability distributions of uncertain data, we prefer relative entropy to Euclidean distance as the measure when constructing feature hypercubes, because relative entropy accurately measures the difference between two probability distributions of data.
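
To illustrate why the choice of measure matters, the sketch below compares relative entropy with Euclidean distance on two pairs of hypothetical two-bin histograms. The helper names and the histogram values are assumptions made for illustration; the code uses the standard discrete definition of relative entropy, not the paper's hypercube construction.

```python
import math


def kl_divergence(p, q):
    """Relative entropy D(P || Q) in nats for two discrete distributions
    given as equal-length lists of probabilities."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same support")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # a zero-probability bin contributes nothing
        if qi == 0.0:
            return math.inf  # P puts mass where Q has none
        total += pi * math.log(pi / qi)
    return total


def euclidean(p, q):
    """Euclidean distance between two probability vectors."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


# Two hypothetical pairs of histograms; each pair differs by moving
# 0.1 of probability mass from one bin to the other.
a, b = [0.5, 0.5], [0.6, 0.4]
c, d = [0.89, 0.11], [0.99, 0.01]

print(euclidean(a, b), euclidean(c, d))
print(kl_divergence(a, b), kl_divergence(c, d))
```

Both pairs are the same Euclidean distance apart, yet the relative entropy of the second pair is noticeably larger, because an outcome with probability 0.11 under one distribution is nearly impossible under the other. Relative entropy is also asymmetric, so an edge weighted this way depends on direction.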

The main contributions of this paper are as follows:

1) We model the tradeoff between accuracy and energy cost of distinguishing uncertain objects in crowdsensing as an optimization problem.

2) We quantify the accuracy of distinguishing uncertain objects.

3) We investigate our model by leveraging the hypercube to explain the optimal solution of feature selection.

4) We evaluate our proposed schemes with real data of multiple features, and the results demonstrate their low-cost, high-accuracy performance.

The rest of the paper is structured as follows. Section II presents a brief overview of related work. Section III introduces the optimization problem between multiple feature selection and energy cost in crowdsensing. Section IV explains the algebraic solution of multiple features with the hypercube. Section V evaluates our schemes on data collected in realistic scenarios, and Section VI concludes the paper with directions for future work.

II. RELATED WORK

A. Uncertain Data in Crowdsensing

Uncertain data are data that contain specific uncertainty. They are typically found in the area of sensor networks. Three main models explain the generation of uncertain data: attribute uncertainty, correlated uncertainty, and tuple uncertainty [6]. In most cases, data may contain errors or may be only partially complete because of sources of imprecision such as data randomness and the limitations of measuring sensors. Besides, the collected data may correspond to objects that are only vaguely specified, so we need to explore data representation uncertainty [3]. Modeling uncertain data has been studied extensively in [7]–[9]. A database that provides incomplete information consists of a set of possible instances of the database.

Tao et al. [10] describe the query result for a d-dimensional point x that belongs to an uncertain object O using a probability threshold p ∈ [0, 1]. They suppose that the uncertain object O is associated with a probability density function p(x) and a d-dimensional uncertainty region Ω. They then integrate the probability density function over the region Ω. Finally, they define object O as a result if the integral is larger than the threshold; otherwise it is not a result. In [8], the authors define a probabilistic database as a pair < x, p >, where x is a finite set of possible database instances consistent with a given schema, and p(I) is the probability associated with any instance I ∈ x.
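
The threshold test attributed to Tao et al. [10] above can be sketched as follows, under two stated assumptions: the density is discretized into weighted sample points, so the integral becomes a sum, and the integration is restricted to an axis-aligned query box, which the text above does not name. The function names and the sample values are hypothetical.

```python
def appearance_probability(samples, lower, upper):
    """Probability mass of a discretized uncertain object that falls
    inside the axis-aligned box [lower, upper].

    samples: list of (point, probability) pairs whose probabilities
    sum to 1; each point is a tuple of d coordinates.
    """
    inside = 0.0
    for point, prob in samples:
        if all(lo <= x <= hi for x, lo, hi in zip(point, lower, upper)):
            inside += prob
    return inside


def is_result(samples, lower, upper, threshold):
    """The object qualifies if its mass in the box exceeds the threshold."""
    return appearance_probability(samples, lower, upper) > threshold


# Hypothetical 2-D uncertain object with four equally likely positions.
obj = [((1, 1), 0.25), ((2, 1), 0.25), ((2, 3), 0.25), ((4, 4), 0.25)]
mass = appearance_probability(obj, lower=(0, 0), upper=(2, 3))
print(mass, is_result(obj, (0, 0), (2, 3), threshold=0.5))
```

Three of the four positions fall inside the box, so the object is returned for a threshold of 0.5 and rejected for a threshold of 0.9.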

B. Energy-aware Mobile Computing

Energy awareness has attracted a lot of attention recently. One important reason is the uptime of battery-powered mobile devices such as smartphones and tablets. In [11], the authors found that implementing an algorithm in alternative ways leads to different energy consumption. For smartphone applications that rely on multiple sensors, we have to balance the tradeoff between energy cost and accuracy. For example, localization can be achieved with different sensors in a smartphone. GPS achieves high accuracy but may consume too much energy. Alternatives such as cellular towers (GSM), WiFi, or Bluetooth achieve different localization accuracy at different energy cost. Maximilian Schirmer et al. [12] propose sensor substitution approaches to reduce the energy consumption of using smartphone sensors.

C. Hypercube

Applications of the hypercube were initially studied in parallel and distributed computing [13] [14]. Most recent hypercube-based research focuses on routing. In [15], the authors use a hypercube to design a team multicast routing protocol that addresses scalability in mobile ad hoc networks. H. Huo et al. [16] defined routing selection and maintenance rules based on a logical hypercube structure. In [17], the authors linked Bluetooth devices as a hypercube to construct a parallel computation and communication environment. Also leveraging hypercube properties, the authors of [18] [19] mapped social features into a hypercube to design routing algorithms for human contact networks (HCNs). We likewise use the hypercube to analyze the multiple-feature selection solution from a geometric view. In future work, we will explore hypercube properties for object prediction.

D. Relative Entropy

Bin Jiang et al. [20] propose using KL divergence as a probability similarity measure for clustering uncertain data. With KL divergence, which captures the difference between the probability distributions of two objects, they adapt partitioning and density-based clustering approaches. They also test their new clustering algorithms on synthetic data. In [21] [22], KL divergence was treated as a feature to differentiate two kinds of sound. For example, S. Basu et al. [21] introduce relative entropy (KL divergence) as a sound feature to distinguish speech from other ambient sounds.
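
For reference, the standard textbook definition of relative entropy is given below for the discrete and continuous cases. The symbols P, Q, p, and q are generic notation chosen here, not notation from this paper.

```latex
% Discrete distributions P and Q over the same outcomes x
D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}

% Continuous distributions with densities p and q
D_{\mathrm{KL}}(P \,\|\, Q) = \int p(x) \log \frac{p(x)}{q(x)} \, dx

% Two basic properties
D_{\mathrm{KL}}(P \,\|\, Q) \ge 0, \qquad
D_{\mathrm{KL}}(P \,\|\, Q) = 0 \iff P = Q
```

The quantity is not symmetric in P and Q, so it is a divergence rather than a distance in the metric sense.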

III. OPTIMIZATION PROCESS OF FEATURE SELECTION

In this section, we introduce the optimization problem between multiple feature selection and energy cost in crowdsensing. We first explain the data uncertainty of crowdsensing. We then define accuracy as our optimization target under the constraint of energy cost. Table I summarizes the essential notations used throughout the paper. (Some notations are introduced later.)

A. Uncertain Data

There are many studies on modeling uncertain data [7]–[9]. In [7], the possible world model was proposed. In this model, any legitimate combination of tuples constitutes an instance of a possible world. An uncertain database can be defined probabilistically as a finite probability space whose outcomes are all possible database instances consistent with a given schema. This can be represented as the pair < X, p >, where X is a finite set of possible database instances consistent with a given schema, and p(x) is the probability associated with any instance x ∈ X. We note that
