Received 17 November 2022, accepted 17 December 2022, date of publication 26 December 2022,
date of current version 3 January 2023.
Digital Object Identifier 10.1109/ACCESS.2022.3232290
Generalization of Forgery Detection With Meta
Deepfake Detection Model
VAN-NHAN TRAN¹, SEONG-GEUN KWON², SUK-HWAN LEE³, HOANH-SU LE⁴, AND KI-RYONG KWON¹
¹Department of Artificial Intelligence Convergence, Pukyong National University, Busan 48513, South Korea
²Department of Electronics Engineering, Kyungil University, Gyeongsan 38428, South Korea
³Department of Computer Engineering, Dong-A University, Busan 49315, South Korea
⁴Faculty of Information Systems, University of Economics and Law, Vietnam National University Ho Chi Minh City, Ho Chi Minh 700000, Vietnam
Corresponding author: Ki-Ryong Kwon (krkwon@pknu.ac.kr)
This work was supported in part by the Basic Science Research Program through the National Research Foundation of Korea (NRF)
funded by the Ministry of Education under Grant 2020R1I1A306659411 and Grant 2020R1F1A1069124; and in part by the Ministry of
Science and ICT (MSIT), South Korea, through the Information Technology Research Center (ITRC) Support Program supervised by the
Institute for Information & Communications Technology Planning & Evaluation (IITP) under Grant IITP-2022-2020-0-01797.
ABSTRACT Face forgery generating algorithms that produce a range of manipulated videos/images have
developed quickly. Consequently, this causes an increase in the production of fake information, making it
difficult to identify. Because facial manipulation technologies raise severe concerns, face forgery detection is
gaining increasing attention in the area of computer vision. In real-world applications, face forgery detection
systems frequently encounter and perform poorly in unseen domains, due to poor generalization. In this
paper, we propose a deepfake detection method based on meta-learning,
called Meta Deepfake Detection (MDD). The goal is to develop a
generalized model capable of directly handling new, unseen domains
without the need for model updates. The MDD algorithm assigns different
weights to facial images from different domains. Specifically, MDD uses
meta-weight learning to shift information from the source domains to
the target domains through meta-optimization steps, so that the model
generates effective representations of both the source and target
domains. We build multi-domain sets using a meta-splitting strategy to
create a meta-train set and a meta-test set. Based on these sets, the
model computes the gradients and performs backpropagation. The inner-
and outer-loop gradients are aggregated to update the model and enhance
generalization. By introducing a pair-attention loss and an
average-center alignment loss, the detection capabilities of the system
are substantially enhanced. In addition, we use evaluation benchmarks
built from several popular deepfake datasets to compare the
generalization of our proposal against several baselines and assess its
effectiveness.
INDEX TERMS Deepfake detection, meta-learning, artificial intelligence, computer vision.
I. INTRODUCTION
Face recognition systems have progressed substantially in recent times.
In particular, deep learning technologies have significantly improved
the performance of this task. However, the sophistication of face image
manipulation puts existing facial recognition algorithms in danger of
becoming ineffective. With the development of technologies such as
Generative Adversarial Networks (GANs) [1], the GAN family, and
Variational AutoEncoders [2], [3], fake facial images and videos can be
made and used to deceive recognition systems. Many manipulation
algorithms [4], [5], [6] allow a person to produce high-quality fake
faces without expert skills or special knowledge of training. As a
result, it is often challenging for the human eye to tell the
difference between actual and manipulated images. This has led to an
increase in the use of modified multimedia content in various
cybercrime activities. The technology may be used maliciously,
resulting in a major trust issue for modern society. Because such
methods can produce high-quality fake images that are indistinguishable
even to the human eye, the scientific community has shown considerable
interest in developing techniques for distinguishing authentic faces
from fraudulent images. Many methods for deepfake detection have been
proposed in [7], [8], [9], [10], and [11]. These proposals primarily
take inspiration from the binary classification problem, applying its
models to the deepfake detection challenge in order to differentiate
between real and fake photos. The common model for these proposals
typically combines data preprocessing with backbone networks to extract
features from faces in images or videos, and then uses a binary
classifier network to classify them as real or fake. However, due to
the rapid advancement of face forgery generation algorithms, some
samples are extremely similar to one another and differ only in a few
small features, so it is getting harder to distinguish fake from real
features in fake images. In addition, fake images produced by different
algorithms vary widely. This results in poor performance for such
global feature-based systems that use binary classifier networks.
The associate editor coordinating the review of this manuscript and
approving it for publication was Liangxiu Han.
VOLUME 11, 2023
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
V.-N. Tran et al.: Generalization of Forgery Detection With Meta Deepfake Detection Model
FIGURE 1. Overview architecture of our proposed MDD.
At present, face forgery generation algorithms are multiplying rapidly;
they include expression swapping, identity swapping, face swapping, and
face synthesis. Based on these algorithms, a variety of manipulated
datasets have been created to support the research and development of
face forgery detection. The common datasets used in the experiments of
this paper are DFDC [12], Celeb-DF-v2 [13], and FaceForensics++ [9].
The synthetic faces in each of these datasets were produced using the
same algorithm, leading to a similar data distribution within each one.
When training and testing are both performed on one dataset, only a
single data distribution is used to assess the outcomes, and results
are often poor when testing on other datasets. In real-world
applications, however, the model is frequently used in a significantly
different domain (an unseen domain) whose distribution differs from
that of the source domains. Generalized face forgery detection, which
must cope with unseen facial manipulations, is therefore more difficult
and has been less researched.
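The cross-dataset setting described above can be illustrated with a
short sketch. It assumes, purely for illustration, that each dataset
named in the text is treated as one domain and that every domain is
held out in turn as the unseen target; the function name
`leave_one_domain_out` is hypothetical, and the benchmark composition
actually used in the experiments is not specified in this section.

```python
from typing import Dict, List

# Dataset names come from the text; treating each dataset as a single
# "domain" is an assumption made only for this illustration.
DOMAINS: List[str] = ["DFDC", "Celeb-DF-v2", "FaceForensics++"]


def leave_one_domain_out(domains: List[str]) -> List[Dict[str, object]]:
    """Build one split per domain, holding that domain out as unseen."""
    splits = []
    for target in domains:
        # Every other domain is a source domain available for training.
        sources = [d for d in domains if d != target]
        splits.append({"sources": sources, "target": target})
    return splits


if __name__ == "__main__":
    for split in leave_one_domain_out(DOMAINS):
        print(f"train on {split['sources']}, test on unseen {split['target']}")
```

A detector that only performs well when the target dataset also appears
among the sources has not demonstrated generalization in this sense.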
In this research, we design a generalized face forgery detection model
to solve the face authentication issue. Without any model updating, the
model can be evaluated directly on unseen domains after being trained
on a number of source domains. Inspired by [14], [15], and [16], we use
meta-learning to propose a novel deepfake detection algorithm, termed
Meta Deepfake Detection (MDD). With a meta-optimization objective, the
MDD shifts the source domain to the target domain in order to learn
efficient face representations on both synthetic source and target
domains. To increase model generalization, the gradients from the
meta-train and the meta-test sets are combined through
meta-optimization. The MDD can therefore handle unseen domains without
model updating. The following is a summary of our main contributions:
• We propose a Meta Deepfake Detection model (MDD) to address
generalization in deepfake forgery detection; it uses meta-learning
to learn knowledge that transfers across domains, enhancing model
generalization.
• We emphasize the generalized deepfake detection
challenge, which necessitates that a trained model
generalizes effectively on new domains without any
updating.
• We propose two loss functions. Pair-Attention Loss (PAL)
concentrates on maximizing positive and negative pairings and
separating positive samples from negative samples.
Average-Center Alignment Loss (ACA) minimizes the variations
within each class while retaining the capacity to differentiate
between features of different classes. Moreover, these two losses
are aggregated with the softmax loss to update the entire
model and learn across domains.
• We apply data preprocessing along with the block shuf-
fling transformation technique to increase the perfor-
mance of the generalized model.
• Some generalized deepfake detection benchmarks are
used for the evaluation of our proposal. A number of
experiments on these evaluation benchmarks are con-
ducted and compared with some related methods.
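The meta-optimization idea summarized in this section (split the source
domains into meta-train and meta-test sets, take an inner step on
meta-train, then aggregate the meta-train and meta-test gradients) can
be sketched as follows. This is a minimal illustration under stated
assumptions, not the MDD implementation: it uses a toy linear
classifier with binary cross-entropy in place of the backbone network
and the PAL/ACA losses, a first-order approximation that ignores
second-order terms, and a random single-domain hold-out as the meta
split. All function names, learning rates, and the toy data are
hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[List[float], int]  # (features, label), 1 = fake, 0 = real


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_grad(w: Sequence[float], batch: Sequence[Sample]) -> Tuple[float, List[float]]:
    """Mean binary cross-entropy of a linear classifier and its gradient."""
    grad = [0.0] * len(w)
    loss = 0.0
    for x, y in batch:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
        loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
        for i, xi in enumerate(x):
            grad[i] += (p - y) * xi
    n = len(batch)
    return loss / n, [g / n for g in grad]


def meta_split(domains: Dict[str, List[Sample]], rng: random.Random) -> Tuple[List[Sample], List[Sample]]:
    """Assumed split: hold out one random source domain as meta-test."""
    names = sorted(domains)
    held_out = rng.choice(names)
    meta_train = [s for n in names if n != held_out for s in domains[n]]
    return meta_train, domains[held_out]


def meta_step(w: Sequence[float], domains: Dict[str, List[Sample]], rng: random.Random,
              inner_lr: float = 0.1, outer_lr: float = 0.1) -> List[float]:
    """One first-order meta-optimization step (assumed update rule)."""
    meta_train, meta_test = meta_split(domains, rng)
    _, g_train = loss_and_grad(w, meta_train)
    # Inner loop: temporary gradient step using the meta-train set only.
    w_inner = [wi - inner_lr * gi for wi, gi in zip(w, g_train)]
    # Meta-test gradient evaluated at the temporary weights.
    _, g_test = loss_and_grad(w_inner, meta_test)
    # Outer update: aggregate meta-train and meta-test gradients.
    return [wi - outer_lr * (a + b) for wi, a, b in zip(w, g_train, g_test)]


def toy_domain(shift: float, rng: random.Random, n: int = 40) -> List[Sample]:
    """Synthetic 1-D features plus a bias term; `shift` mimics a domain gap."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x0 = (1.0 if y else -1.0) + shift + rng.gauss(0.0, 0.3)
        data.append(([x0, 1.0], y))
    return data


if __name__ == "__main__":
    rng = random.Random(0)
    domains = {"A": toy_domain(-0.3, rng), "B": toy_domain(0.0, rng), "C": toy_domain(0.3, rng)}
    everything = [s for d in domains.values() for s in d]
    w = [0.0, 0.0]
    print("loss before:", loss_and_grad(w, everything)[0])
    for _ in range(100):
        w = meta_step(w, domains, rng)
    print("loss after: ", loss_and_grad(w, everything)[0])
```

Evaluating the meta-test gradient at the temporarily updated weights is
what rewards updates that help on a domain the inner step did not see.
A full second-order variant would also differentiate through the inner
step.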
II. RELATED WORK
A. FACE FORGERY GENERATION
Deep generative models, which are gaining popularity, are being used to
synthesize fake videos and images, and manipulation algorithms are
expanding along with them. Well-known categories of algorithms include
face swap, face manipulation, and expression reenactment.
1) FACE SWAP
Face swapping involves replacing the face of a source image with that
of a target image. Notable research includes RSGAN [17], a
region-separative generative adversarial network that handles the face
and hair appearances in the latent-space representations of the faces
and reconstructs the full face to achieve face swapping. FSGAN [18]
proposed the Face Swapping GAN, which derives a recurrent neural
network (RNN) for face reenactment and adapts to changes in pose and
expression. FSGANv2 [19] offered a subject-agnostic swapping scheme for
face reenactment that adjusts for significant pose and expression
variations. MobileFaceSwap [20] proposed an advanced face swapping
approach with a lightweight Identity-aware Dynamic Network (IDN) that
dynamically adjusts the model parameters according to the identity
information.
2) FACE MANIPULATION
Face manipulation is a generation task in which the facial attributes
and styles of the output face are changed toward those of the intended
target. AttGAN [21] applied an attribute classification constraint to
ensure that the desired attributes are changed precisely in the
resulting image while attribute-excluding details are preserved.
Moreover, the approach was extended to allow attribute style adjustment
in an unsupervised setting. STGAN [22] presented a selective transfer
perspective that uses the target attribute vector to direct flexible
translation to the desired target domain. MaskGAN [6] proposed a model
with two primary components, a Dense Mapping Network (DMN) and Editing
Behavior Simulated Training (EBST), to modify target images and learn
style mapping by using a modified mask. StarGANv2 [23] proposed a
framework that satisfies both the diversity of generated images and
scalability across multiple domains when learning a mapping across
several visual domains. FacialGAN [24] proposed a framework that allows
for the simultaneous manipulation of dynamic face features and
extensive style transfers.
3) EXPRESSION REENACTMENT
Facial expression reenactment is a conditional face synthesis problem
that aims to transfer a source face shape to a target face while
preserving the identity and appearance of the target face. Related
research includes MarioNETte [25], which creates high-quality
reenactments of unseen identities in a few-shot setting using an image
attention block, a facial landmark transformer, and target feature
alignment. DEA-GAN [26] presented a self-supervised hybrid model that
learns a pose-invariant embedded face for each video by using a
multi-frame deforming auto-encoder. FReeNet [27] proposed a
multi-identity face reenactment framework that shares a common model
and transfers facial expressions from the source face to the target
face. AD-NeRF [28] proposed an audio-driven talking head technique that
renders portraits by directly mapping audio features to dynamic neural
radiance fields. FACEGAN [29] proposed a model that uses the Action
Unit (AU) representation to transfer facial motion from the driving
face.
B. FACE FORGERY DETECTION
Face forgery detection methods can be divided into groups according to
the clues they rely on: spatial clues, temporal clues, and
generalizable clues.
1) SPATIAL CLUE FOR DETECTION
The work in [30] presented an innovative attention-based layer to boost
classification efficiency and generate an attention map showing the
altered face areas. Furthermore, the work in [31] designed an
inconsistency-aware wavelet dual-branch network to distinguish real
from fake images. Capsule-forensics [32] proposed a method that employs
a deep convolutional neural network and a capsule network to identify
several types of spoofs, including replay attacks that use printed
pictures or recorded videos, as well as computer-generated videos.
FakeLocator [33] introduced an attention mechanism based on face
parsing and suggested single sample clustering and partial data
augmentation to improve the training data. The research in [34]
concentrated on the analysis of deepfakes of human faces, with the goal
of developing a novel detection technique that can find forensic traces
concealed in images.
2) TEMPORAL CLUE FOR DETECTION
MesoNet [35] presented a method for quickly and effec-
tively identifying face tampering in videos with a focus on