Combating the evolving spammers in online social networks

Qiang Fu, Bo Feng, Dong Guo a,b, Qiang Li a,b

College of Computer Science and Technology, Jilin University, Changchun, China

Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China

ARTICLE INFO

Article history:

Received 17 February 2017

Received in revised form 22 August 2017

Accepted 22 August 2017

Available online 7 September 2017

ABSTRACT

Online social networks, such as Facebook and Sina Weibo, have become the most popular platforms for information sharing and social activities. Spammers have exploited social networks as a new way to spread spam through fake accounts. Many detection methods have been proposed to solve this problem and have proved successful to some extent. However, as spammers’ strategies for evading detection evolve, many existing methods lose their efficacy. A major limitation of previous approaches is that they detect spammers using features taken from a single, static point in time, without considering temporal factors. In this study, we approach the challenge of spammer detection by leveraging the temporal evolution patterns of users. We propose a dynamic metric to measure the change in users’ activities and design new features to quantify users’ evolution patterns. We then develop a framework that combines unsupervised and supervised learning to distinguish between spammers and legitimate users. We test our method on a real-world dataset with a large number of users. The evaluation results show that our approach can effectively distinguish spammers from legitimate users in terms of their temporal evolution patterns. They also demonstrate a high level of similarity among spammers’ temporal evolution patterns. Compared with other detection methods, our method achieves better performance. To the best of our knowledge, our study is the first to provide a generic and efficient framework for depicting the evolution patterns of users. It can handle the problem of spammers updating their strategies to evade detection and is a valuable reference for this research field.

Keywords: Online social networks; Spammer detection; Temporal evolution; Machine learning; Classification

* Corresponding author.

E-mail addresses: fuqiang15@mails.jlu.edu.cn (Q. Fu), fengbo16@mails.jlu.edu.cn (B. Feng), guodong@jlu.edu.cn (D. Guo), li_qiang@jlu.edu.cn (Q. Li).

http://dx.doi.org/10.1016/j.cose.2017.08.014

Computers & Security 72 (2018) 60–73

Available online at www.sciencedirect.com

Journal homepage: www.elsevier.com/locate/cose

1. Introduction

Online Social Networks (OSNs), such as Twitter, Facebook, and Sina Weibo, have become an essential part of people’s daily lives. Facebook, the most popular social network worldwide, has a monthly average of 1.51 billion active users (Statista, 2016). With their increasing influence among users, OSNs have become an ideal platform for spammers to spread spam. The term “spammers” mainly refers to users that initiate unsolicited social relationships or send unsolicited messages through fake accounts, social bots or spam applications (Yang et al., 2014). The types of attacks launched through spam include, but are not limited to, product advertisements, phishing attacks, and drive-by-download attacks. Spam can degrade users’ social media experience by annoying them with content that they are not interested in. Furthermore, it can lead to privacy leaks or economic losses if users are tricked into visiting phishing websites. Hence, accurately detecting spammers to make online social networks more user-friendly and secure is one of the most serious issues in existing OSNs.

The primary challenge in detecting spammers is that they rapidly upgrade their spam strategies to keep pace with the development of detection systems (Yang et al., 2013). Methods that use common features based on user profiles and message content, such as those of Chu et al. (2012), Egele et al. (2013), and Gao et al. (2012), can be evaded by spammers who purchase followers or use tools that automatically post messages with the same meaning but different words (Yang et al., 2013). Yang et al. (2012) found that spammers tend to be interconnected, forming account communities, thus rendering certain advanced features for detecting spammers, such as the clustering coefficient, ineffective. The primary assumption of the PageRank-based method is that only a limited number of edges represent reciprocal social relationships between spammers and legitimate users, yet evidence has been found that legitimate users follow spammers more often than expected. Ghosh et al. (2012) found that a small fraction of users, known as social capitalists, follow back anyone who follows them to increase their reputation. Yang et al. (2012) also discovered supporter accounts that help spammers avoid detection by increasing their followers, allowing them to prey on more victims.

As conventional detection methods cannot cope with the new strategies adopted by spammers, researchers have proposed new approaches to meet these challenges. For example, Yang et al. (2013) used features that are more sophisticated than the previous ones to improve the efficiency of machine learning classifiers. Boshmaf et al. (2016) made the PageRank-based method more robust by using information about victims, that is, benign social network users who have mutual connections with spammers. These studies provide a deeper insight into the difference between spammers and legitimate users and improve detection accuracy. However, these methods infer whether an individual user is a spammer from user characteristics at a single instant of time. Their real-world user data are collected at a single point in time, and the experiments are conducted and evaluated from the perspective of a static social network. In fact, social networks are constantly changing, and spammers may be able to improve the effectiveness of an attack through persistent efforts (Liu et al., 2015). Therefore, as spammers evolve their strategies to evade detection, the capacity of these approaches to detect them efficiently becomes doubtful.

In this study, we introduce temporal factors into the detection of spammers by inspecting the activities of users over an extended period of time, and we offer a detection framework to identify the spammers that evade detection by changing their strategies. Intuitively, even if many spammers can make their accounts appear like legitimate user accounts at some static time points to avoid being detected, it is infeasible for them to manipulate the dynamic changing process of features over an extended period of time because of the high cost (Yang et al., 2013).

To achieve our research goals, we collect the profiles of a vast number of social network users and track their activities over a series of points in time. A window-based dynamic metric is used to assess the temporal evolution patterns of users and uncover a clear distinction between legitimate users and spammers concerning different aspects of the temporal patterns. Based on the dynamic metric, new temporal user features are designed to detect spammers. Instead of using these features to identify spammers directly, we investigate the similarity in the temporal patterns of different spammers and apply a clustering algorithm (Maulik and Bandyopadhyay, 2000) to users by abstracting their dynamic metrics into feature vectors. The results indicate that it is relatively easy to group spammers into the same cluster. We combine the new features with the clustering results to build a machine learning classifier for accurate detection of spammers. Finally, we evaluate our method using a real-world dataset and demonstrate the effectiveness of our approach by comparing it with two conventional spammer detection methods.

The contributions of this paper are threefold:

• We propose a dynamic metric model based on a sliding window to measure the dynamic changes in social network users’ activities.

• On the basis of the dynamic metric, we design new features to describe the temporal evolution patterns and devise a novel framework that combines unsupervised clustering and supervised classification to detect spammers in OSNs.

• We implement our approach and evaluate it using a real-world dataset. Compared to conventional methods, it achieves better performance.
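The sliding-window idea can be illustrated with a small sketch. This is only an illustration under our own assumptions, not the dynamic metric defined in Section 4: the function names `window_changes` and `temporal_features`, the window size, and the choice of mean absolute step-to-step change as the per-window measure are all hypothetical. It shows how a time series of one activity measure (for example, a follower count sampled at successive time points) could be summarized into temporal features.

```python
from statistics import mean, pstdev


def window_changes(series, window=3):
    """Mean absolute step-to-step change inside each sliding window.

    `series` is one activity measure sampled at successive time points.
    This change measure is an assumed stand-in, not the paper's metric.
    """
    if window < 2 or len(series) < window:
        return []
    changes = []
    for start in range(len(series) - window + 1):
        chunk = series[start:start + window]
        steps = [abs(b - a) for a, b in zip(chunk, chunk[1:])]
        changes.append(mean(steps))
    return changes


def temporal_features(series, window=3):
    """Summarize the per-window changes as (average, spread)."""
    changes = window_changes(series, window)
    if not changes:
        return (0.0, 0.0)
    return (mean(changes), pstdev(changes))


# Hypothetical series: a steady account and a strongly fluctuating one.
stable = [10, 11, 10, 11, 10, 11]
volatile = [10, 40, 5, 60, 2, 80]

print(temporal_features(stable))
print(temporal_features(volatile))
```

Feature vectors of this kind could then be passed to a clustering step and a classifier, which is the combination the framework above describes; the specific algorithms are left to the later sections.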

The remainder of this paper is organized as follows: Section 2 covers related work on spammer detection in online social networks. Section 3 provides the motivation and some assumptions used in this study. Section 4 provides further details about the proposed dynamic metric. Section 5 describes our approach for detecting spammers in online social networks. Section 6 explains the experiments conducted and the results of the evaluation of our approach. Section 7 concludes the paper.

2. Related work

Owing to the flood of spam in online social networks, many studies have investigated spammers in different kinds of social networks. Some previous works focused on characterizing spammers (Almaatouq et al., 2014; Gao et al., 2010; Grier et al., 2010; Stringhini et al., 2010; Thomas et al., 2011). After collecting a certain amount of data on spammer accounts, these studies analyzed the characteristics of spammers from different aspects, such as individual profiles, propagative behavior, message contents and social relationships. Stringhini et al. (2010) collected data related to spammers using honeypots and then divided the spammers into different categories according to their behavior patterns. Yang et al. (2014) deployed social honeypots on Twitter with diverse strategies to trap spammers, consequently revealing the preferences of spammers when they look for targets. These works showed the fundamental characteristics of spammers from a variety of aspects and laid the foundation for follow-up research, but did not provide sophisticated and efficient methods of detection.

Subsequently, many approaches have been proposed to combat spam in online social networks. These approaches can be generally divided into two types: machine learning-based (Hu et al., 2013; Lee et al., 2010; Lee and Kim, 2012; Wang et al., 2015) and social-graph-based (Cao et al., 2012; Xue et al., 2013; Yang et al., 2012) approaches. Chen et al. (2015) constructed a huge ground-truth dataset consisting of 6.5 million spam tweets and 6 million non-spam tweets, and then conducted a comprehensive evaluation of different machine learning algorithms using lightweight features. Gao et al. (2012) presented an online spam filtering system as a component of online social network platforms. They grouped user-generated messages into campaigns using an incremental clustering algorithm and used six features to distinguish spam campaigns. Cao et al. (2012) designed an inference scheme to detect fake accounts by computing the landing probability of early terminated random walks. Yang et al. (2013) further explored friend invitation graphs and developed a detection system based on them. Given the ability of spammers to rapidly change their tactics to evade such detection (Zhu et al., 2012), these approaches are not very effective.
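To make the idea of landing probabilities of early terminated random walks concrete, the following is a minimal sketch and not the scheme of Cao et al. (2012) itself. It assumes an undirected graph given as an adjacency dictionary, trust that starts at a set of known-legitimate seed nodes, a number of propagation steps on the order of log2 of the graph size, and a final division by node degree; the function names and the toy graph are hypothetical.

```python
import math


def landing_probabilities(adjacency, seeds, steps=None):
    """Propagate trust from seed nodes for a small number of steps.

    adjacency: dict mapping each node to a list of its neighbours.
    Stopping early (assumed here: about log2(n) steps) keeps trust from
    spreading evenly across weakly connected regions of the graph.
    """
    n = len(adjacency)
    if steps is None:
        steps = max(1, math.ceil(math.log2(n)))
    trust = {node: 0.0 for node in adjacency}
    for seed in seeds:
        trust[seed] = 1.0 / len(seeds)
    for _ in range(steps):
        nxt = {node: 0.0 for node in adjacency}
        for node, neighbours in adjacency.items():
            if not neighbours:
                nxt[node] += trust[node]  # isolated node keeps its trust
                continue
            share = trust[node] / len(neighbours)
            for nb in neighbours:
                nxt[nb] += share
        trust = nxt
    return trust


def degree_normalised(trust, adjacency):
    """Divide each node's trust by its degree before ranking."""
    return {n: trust[n] / max(1, len(adjacency[n])) for n in adjacency}


# Toy graph: a, b, c, d form a legitimate region; x, y, z form a fake
# region joined to it by the single edge d-x.
graph = {
    "a": ["b", "c", "d"], "b": ["a", "c", "d"], "c": ["a", "b", "d"],
    "d": ["a", "b", "c", "x"],
    "x": ["d", "y", "z"], "y": ["x", "z"], "z": ["x", "y"],
}
scores = degree_normalised(landing_probabilities(graph, seeds=["a"]), graph)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 4))
```

In this toy graph the nodes behind the single connecting edge end up with the lowest scores, which is the intuition behind ranking accounts by such probabilities. The scores rely only on graph structure at one moment, which is the static view criticized above.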

To meet this challenge, researchers have developed countermeasures. Yang et al. (2013) identified and verified several common evasion strategies used by spammers, designed more sophisticated detection features based on the analysis, and then proposed a formal model to evaluate the robustness of the new features. Fu et al. (2015) extracted the carefulness of users as a metric to indicate how careful a user is when following another user, and subsequently used this parameter to adjust and improve existing features and methods. Chen et al. (2017) focused on the “Twitter Spam Drift” problem, in which spammers post more tweets with similar semantic meaning but different text to evade detection; they proposed an “Lfun” approach, which learns from unlabeled tweets, to address this problem. However, the weakness of the above approaches is that they do not address the critical issue: they still consider the social network as a static system, whereas spammers are constantly finding new evasion techniques. To address this problem, we propose a dynamic metric to describe the temporal patterns of users and develop a novel method for identifying spammers. This is one of the main differences between our method and previous approaches.

3. Preliminary

Before describing our study in detail, we provide the motivation behind our work and the assumptions used in our approach.

3.1. Motivation

As mentioned earlier, the means that spammers use to evade detection are becoming increasingly sophisticated. Even if an inspection system is capable of discovering the majority of spam accounts during one period, its capacity to do so during another period is uncertain.

The main reason for this new challenge is that most detection mechanisms characterize users on the basis of their features at a single point in time, whereas spammers continuously optimize their spamming strategies. This motivated us to gain a deeper insight into users in terms of their temporal evolution patterns and to design a detection system accordingly.

3.2. Assumptions

To make the proposed approach more reasonable, we make the following assumptions according to our observations of the dataset and the experience of previous studies.

Assumption 1. Spammers constantly change their spamming strategies.

This assumption is mentioned and utilized in many existing approaches, such as those of Liu et al. (2016) and Tan et al. (2013). The reason behind this assumption is easy to understand: as the intensity of detection increases, only those spammers who adjust their strategies according to the detection method will be able to survive. Meanwhile, legitimate users do not have to make these adjustments, and thus form relatively non-volatile patterns.

Assumption 2. Spammers tend to control a large number of accounts to spread spam.

The profits of spammers’ activities depend on how many users their spam messages can reach. Because of the broad adoption of features based on bursty properties, such as time interval, spammers cannot generate a massive amount of spam information using only a few accounts. Therefore, spammers usually create or compromise a significant number of accounts and use them to spread spam to a large set of users, which results in spam accounts with similar behavioral patterns.

4. Dynamic metric

In this section, we first present the activity measures that are used to build our dynamic metric. These activity measures principally consist of features based on users’ activities. We then describe the proposed dynamic metric and the new features for characterizing the evolution patterns and detecting spammers.

4.1. Activity measures

The features that we use as activity measures to build the dynamic metric are divided into two categories: graph-based and non-graph-based. For a spammer, the first step in spreading malicious information in social networks is to establish social relationships with other users; thus, features based on the social graph are a primary source of information about users’ preferences and characteristics. We select four of these features: degree centrality, bidirectional link ratio, betweenness centrality, and
