Assessing Transferability of Adversarial Examples against
Malware Detection Classifiers
Yixiang Wang, Jiqiang Liu, Xiaolin Chang
Beijing Key Laboratory of Security and Privacy in Intelligent Transportation
Beijing Jiaotong University, Beijing, China
{14281175, jqliu, xlchang}@bjtu.edu.cn
ABSTRACT
Machine learning (ML) algorithms provide better performance than
traditional algorithms in various applications. However, unknown
flaws in ML classifiers make them sensitive to adversarial examples,
which are generated by adding small but purposeful distortions to
natural examples. This paper investigates the transferability of
adversarial examples generated on a sparse and structured dataset,
and the ability of adversarial training to resist such examples. The
results demonstrate that adversarial examples generated on a DNN can
fool a set of ML classifiers, including decision trees, random
forests, SVMs, CNNs, and RNNs. Moreover, adversarial training can
improve the robustness of the DNN against such attacks.
KEYWORDS
Adversarial machine learning, Adversarial examples,
Transferability, Deep neural network, Machine learning
1 INTRODUCTION
Deep neural networks (DNNs) are changing how people view
machine learning (ML) by solving a variety of real-life
challenges more efficiently and effectively than traditional
algorithms. DNNs are transforming data processing methods in
many fields, such as speech recognition [1], computer vision
[2], natural language processing [15], and intrusion detection
[3]. Meanwhile, adversaries have gained the impetus to
manipulate deep neural networks into producing
misclassifications. Adversarial ML is defined as a class of
techniques [10] that reduce the effectiveness of ML models or
extract their underlying algorithms. An adversary who
constructs misclassified inputs can benefit by evading
detection [4]. Adversarial examples are inputs crafted to cause
a learning algorithm to misclassify. Recent research has shown
that attackers can force deep learning object classification
models to misclassify images by making imperceptible
modifications to pixel values [9].
There has been prior research on constructing adversarial
examples. Szegedy et al. [5] constructed adversarial examples
from dense and unstructured data such as images, and their
proposed defense was likewise based on this type of data.
Grosse et al. [12] generated adversarial examples on a sparse
and structured dataset, DREBIN, but did not investigate
transferability. Maria et al. [13] demonstrated the
transferability of adversarial examples, but in the context of
intrusion detection.
In this paper, we investigate the transferability of
adversarial examples and design defenses against them on a
malware detection dataset that is sparse and structured.
Concretely, we use the classic Android malware dataset DREBIN,
extract sparse and structured data from the 100 most frequent
static features, and generate adversarial examples with the
JSMA algorithm against a deep learning model that reaches
state-of-the-art performance. Rigorous experiments then verify
the transferability of these adversarial examples against both
traditional ML classifiers and deep learning classifiers, and
we analyze the reasons for this transferability. Moreover, we
apply adversarial training to reduce the impact of adversarial
perturbations on the model and improve its robustness.
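
To make the generation step concrete, the following Python sketch illustrates a JSMA-style greedy attack on binary malware feature vectors, in the spirit of Grosse et al. [12]. It is an illustrative assumption rather than our exact implementation; the function and parameter names (jsma_binary_attack, max_changes) are hypothetical.

    import torch

    def jsma_binary_attack(model, x, y_target, max_changes=20):
        # Greedy JSMA-style attack on a binary feature vector (illustrative sketch).
        # model: maps a (1, n_features) float tensor to class logits.
        # x: (1, n_features) 0/1 feature vector; y_target: desired (wrong) class index.
        x_adv = x.clone().float()
        for _ in range(max_changes):
            x_adv.requires_grad_(True)
            logits = model(x_adv)
            if logits.argmax(dim=1).item() == y_target:
                break  # the model already predicts the target class
            # Gradient of the target-class logit with respect to every input feature
            grad = torch.autograd.grad(logits[0, y_target], x_adv)[0].squeeze(0)
            # Only features that are still 0 may be flipped (features are only added)
            scores = grad.clone()
            scores[x_adv.detach().squeeze(0) == 1] = float("-inf")
            idx = int(scores.argmax())
            if torch.isinf(scores[idx]):
                break  # nothing left to flip
            x_adv = x_adv.detach()
            x_adv[0, idx] = 1.0  # flip the most influential unused feature
        return x_adv.detach()

Restricting perturbations to 0-to-1 flips mirrors the common constraint that an attacker may add features (e.g., permissions or API calls) to a malicious app without removing the functionality it needs.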
To the best of our knowledge, we are the first to use sparse
and structured data to investigate the transferability of
adversarial examples in the malware detection field. Our
experimental results indicate that, regardless of data type,
adversarial examples can be generated algorithmically and
exhibit transferability. Accordingly, adversarial training can
improve the robustness of the model and reduce the impact of
adversarial examples on its predictions.
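
The following Python sketch illustrates the general idea of adversarial training, in which part of each training batch is replaced by adversarial examples before the usual parameter update. It is a simplified assumption of the procedure rather than our exact training configuration; attack_fn and adv_ratio are hypothetical names.

    import torch
    import torch.nn.functional as F

    def adversarial_training_epoch(model, loader, optimizer, attack_fn, adv_ratio=0.5):
        # One epoch of adversarial training (illustrative sketch).
        # attack_fn(model, x, y) is assumed to return adversarial versions of a batch.
        model.train()
        for x, y in loader:
            n_adv = int(adv_ratio * x.size(0))
            if n_adv > 0:
                # Replace part of the batch with adversarial examples
                x_adv = attack_fn(model, x[:n_adv], y[:n_adv])
                x = torch.cat([x_adv.detach(), x[n_adv:]], dim=0)
            optimizer.zero_grad()
            loss = F.cross_entropy(model(x), y)
            loss.backward()
            optimizer.step()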
The rest of the paper is organized as follows. Section 2
presents background and related work. Section 3 introduces the
methodology for verifying transferability. Section 4 presents
the experimental analysis and discussion. Section 5 summarizes
the conclusions.
2 BACKGROUND AND RELATED WORK