Combating the evolving spammers in online social networks
Qiang Fu a, Bo Feng a, Dong Guo a,b, Qiang Li a,b,*
a College of Computer Science and Technology, Jilin University, Changchun, China
b Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
ARTICLE INFO
Article history:
Received 17 February 2017
Received in revised form 22 August 2017
Accepted 22 August 2017
Available online 7 September 2017
ABSTRACT
Online social networks, such as Facebook and Sina Weibo, have become the most popular platforms for information sharing and social activities. Spammers exploit social networks as a new channel for spreading spam through fake accounts. Many detection methods have been proposed to solve this problem and have proved successful to some extent. However, as spammers’ strategies for evading detection evolve, many existing methods lose their efficacy. A major limitation of previous approaches is that they use features from a single static time point to detect spammers, without considering temporal factors. In this study, we approach the challenge of spammer detection by leveraging the temporal evolution patterns of users. We propose a dynamic metric to measure the change in users’ activities and design new features to quantify users’ evolution patterns. We then develop a framework that combines unsupervised and supervised learning to distinguish between spammers and legitimate users. We test our method on a real-world dataset with a large number of users. The evaluation results show that our approach can efficiently distinguish spammers from legitimate users on the basis of their temporal evolution patterns. The results also demonstrate a high level of similarity among spammers’ temporal evolution patterns. Compared with other detection methods, our method achieves better performance. To the best of our knowledge, our study is the first to provide a generic and efficient framework for depicting the evolution patterns of users. It can handle the problem of spammers updating their strategies to evade detection and is a valuable reference for this research field.
© 2017 Elsevier Ltd. All rights reserved.
Keywords:
Online social networks
Spammer detection
Temporal evolution
Machine learning
Classification
1. Introduction
Online Social Networks (OSNs), such as Twitter, Facebook, and Sina Weibo, have become an essential part of people’s daily lives. Facebook, the most popular social network worldwide, has a monthly average of 1.51 billion active users (Statista, 2016). With their increasing influence among users, OSNs have become an ideal platform for spammers to spread spam. The term “spammers” mainly refers to users who initiate unsolicited social relationships or send unsolicited messages through fake accounts, social bots or spam applications (Yang et al., 2014). The types of attacks launched through spam include, but are not limited to, product advertisements, phishing attacks, and drive-by-download attacks. Spam degrades users’ social media experience by annoying them with content that they are not interested in. Furthermore, it can lead to privacy leaks or economic losses if users are lured to phishing websites. Hence, accurately detecting spammers to make online social networks more user-friendly and secure is one of the most serious issues in existing OSNs.

* Corresponding author.
E-mail addresses: fuqiang15@mails.jlu.edu.cn (Q. Fu), fengbo16@mails.jlu.edu.cn (B. Feng), guodong@jlu.edu.cn (D. Guo), li_qiang@jlu.edu.cn (Q. Li).
http://dx.doi.org/10.1016/j.cose.2017.08.014
0167-4048/© 2017 Elsevier Ltd. All rights reserved.
computers & security 72 (2018) 60–73
Available online at www.sciencedirect.com
journal homepage: www.elsevier.com/locate/cose
The primary challenge in detecting spammers is that they upgrade their spam strategies rapidly to keep pace with the development of detection systems (Yang et al., 2013). For methods that use common features based on user profiles and message content, such as Chu et al. (2012), Egele et al. (2013), and Gao et al. (2012), spammers can evade detection by purchasing followers or by using tools that automatically post messages with the same meaning but different words (Yang et al., 2013). Yang et al. (2012) found that spammers tend to be interconnected, forming account communities, which renders certain advanced detection features, such as the clustering coefficient, ineffective. The primary assumption of the PageRank-based method is that only a limited number of edges maintain reciprocal social relationships between spammers and legitimate users, yet evidence has been found that legitimate users follow spammers more often than expected. Ghosh et al. (2012) found that a small fraction of users, known as social capitalists, follow back anyone who follows them in order to increase their reputation. Yang et al. (2012) also discovered supporter accounts that help spammers avoid detection by increasing their follower counts, allowing them to prey on more victims.
As conventional detection methods cannot cope with the new strategies adopted by spammers, researchers have proposed new approaches to meet these challenges. For example, Yang et al. (2013) used features more sophisticated than earlier ones to improve the efficiency of machine learning classifiers. Boshmaf et al. (2016) made the PageRank-based method more robust by using information about victims, that is, benign social network users who have mutual connections with spammers. These studies provide deeper insight into the differences between spammers and legitimate users and improve detection accuracy. However, these methods infer whether an individual user is a spammer from user characteristics at a single instant. Their real-world user data are collected at a single point in time, and the experiments are conducted and evaluated from the perspective of a static social network. In fact, social networks are constantly changing, and spammers may be able to improve the effectiveness of an attack through persistent efforts (Liu et al., 2015). Therefore, as spammers evolve their strategies to evade detection, the capacity of these approaches to detect them efficiently becomes doubtful.
In this study, we introduce temporal factors into the detection of spammers by inspecting the activities of users over an extended period of time, and we offer a detection framework to identify spammers that evade detection by changing their strategies. Intuitively, even if many spammers can make their accounts look like legitimate user accounts at certain static time points to avoid detection, it is practically impossible for them to manipulate the dynamic evolution of their features over an extended period of time because of the high cost (Yang et al., 2013).
To achieve our research goals, we collect the profiles of a vast number of social network users and track their activities over a series of time points. A window-based dynamic metric is used to assess the temporal evolution patterns of users, and it uncovers a clear distinction between legitimate users and spammers in several aspects of those patterns. Based on the dynamic metric, new temporal user features are designed to detect spammers. Instead of using these features to identify spammers directly, we investigate the similarity in the temporal patterns of different spammers and apply a clustering algorithm (Maulik and Bandyopadhyay, 2000) to users by abstracting their dynamic metrics into feature vectors. The results indicate that spammers are relatively easy to group into the same cluster. We combine the new features with the clustering results to build a machine learning classifier for accurate detection of spammers. Finally, we evaluate our method on the real-world dataset and demonstrate the effectiveness of our approach by comparing it with two conventional spammer detection methods.
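As a rough illustration of what a window-based dynamic metric could look like, the sketch below slides a fixed-size window over one user's time series for a single activity measure (for example, a follower count sampled at each collection point) and reports the mean relative change inside each window. The function name `window_change`, the window size, and the relative-change formula are assumptions made purely for illustration; they are not the metric defined by the authors in Section 4.

```python
from typing import List, Sequence


def window_change(series: Sequence[float], window: int = 3) -> List[float]:
    """Mean absolute relative change inside each sliding window.

    Hypothetical illustration only; not the metric defined in the paper.
    """
    if window < 2:
        raise ValueError("window must cover at least two snapshots")
    changes = []
    for start in range(len(series) - window + 1):
        chunk = series[start:start + window]
        # Relative change between consecutive snapshots in this window.
        steps = [
            abs(b - a) / max(abs(a), 1.0)  # guard against division by zero
            for a, b in zip(chunk, chunk[1:])
        ]
        changes.append(sum(steps) / len(steps))
    return changes


# A slowly growing account versus one whose follower count jumps abruptly.
stable = [100, 101, 102, 103, 104]
bursty = [100, 100, 400, 400, 100]
print(window_change(stable))
print(window_change(bursty))
```

Under these assumptions, the stable series yields values close to zero in every window, while the abrupt series yields much larger values, which is the kind of contrast a temporal feature would need to expose.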
The contributions of this paper are threefold:
• We propose a dynamic metric model based on a sliding window to measure the dynamic changes in social network users’ activities.
• On the basis of the dynamic metric, we design new features to describe temporal evolution patterns and propose a novel framework that combines unsupervised clustering and supervised classification to detect spammers in OSNs.
• We implement our approach and evaluate it on a real-world dataset. Compared with conventional methods, it achieves better performance.
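The combination of unsupervised clustering and supervised classification named in the second contribution can be pictured with a small two-stage sketch. It assumes each user is already represented by a short vector of temporal features, clusters all users with a plain k-means, appends the one-hot cluster membership to each vector, and then trains a 1-nearest-neighbour classifier on the labelled subset. The helper names (`kmeans`, `augment`, `predict_1nn`), the choice of k-means and 1-NN, and the toy data are all illustrative assumptions, not the algorithms or data used in this study.

```python
import math
from typing import List, Sequence


def kmeans(points: Sequence[Sequence[float]], k: int, iters: int = 20) -> List[int]:
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(far))
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        assign = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def augment(features, assign, k):
    """Append one-hot cluster membership to each feature vector."""
    return [
        list(f) + [1.0 if a == c else 0.0 for c in range(k)]
        for f, a in zip(features, assign)
    ]


def predict_1nn(train_x, train_y, x):
    """Label of the closest labelled example (1-nearest-neighbour)."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    return train_y[best]


# Toy temporal feature vectors; only the first four users are labelled.
features = [(0.01, 0.02), (0.02, 0.01), (1.5, 1.4), (1.4, 1.6),
            (0.03, 0.02), (1.6, 1.5)]
labels = [0, 0, 1, 1]  # 1 = spammer (hypothetical labels)
clusters = kmeans(features, k=2)
augmented = augment(features, clusters, k=2)
predictions = [predict_1nn(augmented[:4], labels, x) for x in augmented[4:]]
print(clusters, predictions)
```

The design point is that the clustering stage uses every user, labelled or not, so group-level similarity among accounts can inform the classifier even when few labels are available.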
The remainder of this paper is organized as follows: Section 2 covers related work on spammer detection in online social networks. Section 3 provides the motivation and the assumptions used in this study. Section 4 provides further details about the proposed dynamic metric. Section 5 describes our approach for detecting spammers in online social networks. Section 6 explains the experiments conducted and the results of the evaluation of our approach. Section 7 concludes the paper.
2. Related work
Due to the flood of spam in online social networks, many studies have investigated spammers in different kinds of social networks. Some previous works focused on characterizing spammers (Almaatouq et al., 2014; Gao et al., 2010; Grier et al., 2010; Stringhini et al., 2010; Thomas et al., 2011). After collecting a certain amount of data on spammer accounts, these studies analyzed the characteristics of spammers from different aspects, such as individual profiles, propagation behavior, message content and social relationships. Stringhini et al. (2010) collected data on spammers by using honeypots and then divided the spammers into different categories according to their behavior patterns. Yang et al. (2014) deployed social honeypots on Twitter with diverse strategies to trap spammers, thereby revealing the preferences of spammers when selecting their targets. These works showed the fundamental characteristics of spammers from a variety of aspects and laid the foundation for follow-up research, but did not provide sophisticated and efficient methods of detection.
Subsequently, many approaches have been proposed to combat spam in online social networks. These approaches can generally be divided into two types: machine learning-based (Hu et al., 2013; Lee et al., 2010; Lee and Kim, 2012; Wang et al., 2015) and social-graph-based (Cao et al., 2012; Xue et al., 2013; Yang et al., 2012) approaches. Chen et al. (2015) constructed a huge ground-truth dataset consisting of 6.5 million spam tweets and 6 million non-spam tweets, and then conducted a comprehensive evaluation of different machine learning algorithms using lightweight features. Gao et al. (2012) presented an online spam filtering system as a component of online social network platforms. They aggregated user-generated messages into campaigns using an incremental clustering algorithm and used six features to distinguish spam campaigns. Cao et al. (2012) designed an inference scheme to detect fake accounts by computing the landing probability of early-terminated random walks. Yang et al. (2013) further explored friend invitation graphs and developed a detection system based on them. Given the ability of spammers to rapidly change their tactics to evade such detection (Zhu et al., 2012), these approaches are not very effective.
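To make the idea of an early-terminated random walk concrete, the following is a minimal sketch rather than the scheme of any cited work. It assumes an undirected social graph given as an adjacency dictionary, a set of trusted seed accounts, and a walk length of about log2(n) steps; the function name `landing_scores`, the toy graph, and the degree normalisation are assumptions introduced here for illustration.

```python
import math
from typing import Dict, Iterable, List


def landing_scores(graph: Dict[str, List[str]], seeds: Iterable[str]) -> Dict[str, float]:
    """Degree-normalised landing probability of a short random walk from seeds."""
    seeds = list(seeds)
    prob = {node: 0.0 for node in graph}
    for s in seeds:
        prob[s] = 1.0 / len(seeds)
    # Stop early: roughly log2(n) steps instead of running to convergence.
    steps = max(1, math.ceil(math.log2(len(graph))))
    for _ in range(steps):
        nxt = {node: 0.0 for node in graph}
        for node, nbrs in graph.items():
            if not nbrs:
                nxt[node] += prob[node]  # isolated node keeps its mass
                continue
            share = prob[node] / len(nbrs)
            for nb in nbrs:
                nxt[nb] += share
        prob = nxt
    return {node: prob[node] / max(len(graph[node]), 1) for node in graph}


# Toy graph: a tightly knit legitimate region (a-d), a fake region (x-z),
# and a single edge d-x connecting them.
graph = {
    "a": ["b", "c", "d"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["a", "b", "c", "x"],
    "x": ["d", "y", "z"],
    "y": ["x", "z"],
    "z": ["x", "y"],
}
scores = landing_scores(graph, seeds=["a"])
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 4))
```

Because the walk is cut short, little probability mass crosses the single connecting edge, so accounts in the fake region end up with lower scores than those near the trusted seed. This also shows why the approach weakens when spammers obtain many links to legitimate users.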
To meet this challenge, researchers have developed countermeasures. Yang et al. (2013) identified and verified several common evasion strategies used by spammers, designed more sophisticated detection features based on that analysis, and then proposed a formal model to evaluate the robustness of the new features. Fu et al. (2015) extracted the carefulness of users as a metric indicating how careful a user is when following another user, and subsequently used this parameter to adjust and improve existing features and methods. Chen et al. (2017) focused on the “Twitter spam drift” problem, in which spammers post more tweets with similar semantic meaning but different text to evade detection; they proposed the “Lfun” approach, which learns from unlabeled tweets to address this problem. However, the weakness of the above approaches is that they do not address the critical issue: they still treat social networks as a static system, whereas spammers are constantly finding new evasion techniques. To address this problem, we propose a dynamic metric to describe the temporal patterns of users and develop a novel method for identifying spammers. This is one of the main differences between our method and previous approaches.
3. Preliminary
Before presenting our study in detail, we describe the motivation behind our work and the assumptions used in our approach.
3.1. Motivation
As mentioned earlier, the means that spammers use to evade detection are becoming increasingly sophisticated. Even if an inspection system can discover the majority of spamming accounts during one period, its capacity to do so during another period is uncertain.
The main reason for this new challenge is that most detection mechanisms characterize users on the basis of their features at a single point in time, whereas spammers continuously optimize their spamming strategies. This motivated us to gain deeper insight into users in terms of their temporal evolution patterns and to design a detection system accordingly.
3.2. Assumptions
To make the proposed approach well founded, we make the following assumptions based on observations of the dataset and the experience of previous studies.
Assumption 1. Spammers constantly change their spamming strategies.
This assumption is mentioned and used in many existing approaches, such as Liu et al. (2016) and Tan et al. (2013). The reasoning behind it is straightforward: as the intensity of detection increases, only those spammers who adjust their strategies to the detection method will be able to survive. Meanwhile, legitimate users do not have to make such adjustments and thus form relatively non-volatile patterns.
Assumption 2. Spammers tend to control a large number of accounts to spread spam.
The profit from spammers’ activities depends on how many users their spam messages can reach. Because of the broad adoption of features based on bursty properties, such as the time interval, spammers cannot generate a massive amount of spam using only a few accounts. Therefore, spammers usually create or compromise a significant number of accounts and use them to spread spam to a large set of users, which results in spam accounts with similar behavioral patterns.
4. Dynamic metric
In this section, we first present the activity measures that are used to build our dynamic metric. These activity measures principally consist of features based on users’ activities. We then describe the proposed dynamic metric and the new features for characterizing evolution patterns and detecting spammers.
4.1. Activity measures
The features that we use as activity measures to build the dynamic metric are divided into two categories: graph-based and non-graph-based. For a spammer, the first step in spreading malicious information in a social network is to establish social relationships with other users; features based on the social graph are therefore a primary source of information about users’ preferences and characteristics. We select four of these features: degree centrality, bidirectional link ratio, betweenness centrality, and