
Omni-frequency Channel-selection Representations for Unsupervised Anomaly Detection

Yufei Liang∗, Jiangning Zhang∗, Shiwei Zhao, Runze Wu, Yong Liu†, and Shuwen Pan

Abstract: Density-based and classification-based methods have dominated unsupervised anomaly detection in recent years, while reconstruction-based methods are rarely mentioned because of their poor reconstruction ability and low performance. However, the latter require no costly extra training samples for unsupervised training, which makes them more practical, so this paper focuses on improving this kind of method and proposes a novel Omni-frequency Channel-selection Reconstruction (OCR-GAN) network that handles the anomaly detection task from a frequency perspective. Concretely, we propose a Frequency Decoupling (FD) module to decouple the input image into different frequency components and model the reconstruction process as a combination of parallel omni-frequency image restorations, as we observe a significant difference in the frequency distribution of normal and abnormal images. Given the correlation among multiple frequencies, we further propose a Channel Selection (CS) module that performs frequency interaction among different encoders by adaptively selecting different channels. Abundant experiments demonstrate the effectiveness and superiority of our approach over different kinds of methods, e.g., achieving a new state-of-the-art 98.3 detection AUC on the MVTec AD dataset without extra training data, which markedly surpasses the reconstruction-based baseline by +38.1↑ and the current SOTA method by +0.3↑. Source code will be available at https://github.com/zhangzjn/OCR-GAN.

Index Terms: Anomaly detection, omni-frequency decoupling, unsupervised learning, reconstruction-based network.

I. INTRODUCTION

ANOMALY detection is a binary classification task that distinguishes whether a given image deviates from the predefined normality. It is an essential task in visual image understanding with various real-world applications, e.g., novelty detection [1], product quality monitoring based on industrial images [2], automatic defect restoration [3], human health monitoring [4] and video surveillance [5]–[8]. In real-world applications, anomaly detection tasks can be divided into sensory AD (Fig. 1a) and semantic AD (Fig. 1b): the former only suffers from covariate shift without semantic shift, while the latter is the opposite. In sensory AD, most anomalies appear in the form of defects, as in the defect detection task on the MVTec AD [2] and KolektorSDD [9] datasets. In contrast, the semantic AD task detects images with label shifts, assuming that normal and abnormal samples come from different semantic distributions, as in the one-class detection task on CIFAR-10 [10]. This work focuses on solving the sensory AD task but also evaluates on the related semantic AD dataset.

∗ Equal contribution.

† Corresponding author.

Yufei Liang, Jiangning Zhang, and Yong Liu are with the Laboratory of Advanced Perception on Robotics and Intelligent Learning, College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China; Email: 22032139@zju.edu.cn, 186368@zju.edu.cn, yongliu@iipc.zju.edu.cn.

Shuwen Pan is with the Discipline of Control Science and Engineering, School of Information and Electrical Engineering, Zhejiang University City College, Hangzhou 310015, China; Email: pansw@zucc.edu.cn

Shiwei Zhao and Runze Wu are with the Fuxi AI Lab, NetEase Games, Hangzhou 310012, China; Email: zhaoshiwei@corp.netease.com, wurunze1@corp.netease.com.

Fig. 1. Illustrations of sensory anomaly detection (Left) and semantic anomaly detection (Right). [Figure panels: "Sensory Anomaly Detection" and "Semantic Anomaly Detection & One-Class Detection"; each panel shows Train and Test rows of Normal and Abnormal samples, with Cat and Dog as the example classes in the semantic panel.]

In anomaly detection, obtaining abnormal samples and detecting novel abnormalities are time-consuming and costly, which forces us to develop unsupervised methods for more practical applications. Current unsupervised anomaly detection methods are mainly divided into three categories: density-based (Fig. 2a), classification-based (Fig. 2b) and reconstruction-based (Fig. 2c) methods. a) Density-based methods generally employ a pre-trained model to extract meaningful vectors of the input image. The anomaly score can be obtained by calculating the similarity between the embedding representation of the test image and the reference density distribution. This kind of method [11]–[13] achieves a high AUC score on the popular MVTec AD [2] dataset, but it needs pre-trained models and offers limited model interpretability. b) Classification-based methods try to find the classification boundaries of normal data. Self-supervised methods are representative of classification-based methods; they use a model trained on a proxy task to detect anomalies. Thus, self-supervised methods rely on how well the proxy tasks match the test data. For example, CutPaste [14] performs well in anomaly detection on the MVTec AD dataset, but it is difficult for this method to perform well on other datasets. Also, these methods need pre-trained models and extra training data. c) Reconstruction-based methods [15]–[18] contain a generator to reconstruct the input image, and the anomaly score is the reconstruction error, which is more interpretable. These methods do not need pre-trained models or extra training data. However, current reconstruction-based methods without extra training data are much less competitive than other methods because of the generator's poor reconstruction ability. In summary, current unsupervised anomaly detection approaches still suffer from two main challenges: (1) Some works achieve a high AUC score but require abnormal samples or extra training data that are hard to obtain and costly for practical use. (2) Current reconstruction-based methods are more practical and need neither pre-trained models nor extra training data, but suffer from low performance. This paper focuses on the reconstruction-based method, as it requires only normal samples and no extra training data, which is more practical.

arXiv:2203.00259v1 [cs.CV] 1 Mar 2022

Fig. 2. Pipeline illustrations of three kinds of unsupervised anomaly detection methods, one per column: (a) density-based, (b) classification-based and (c) reconstruction-based. The bottom two rows indicate whether a Pretrained Model and Extra Training Data are used for each kind of method.

Fig. 3. (a) Energy distribution over frequencies for normal and abnormal samples in the MVTec AD dataset (axes: Frequency vs. Frequency Energy); the shaded region represents the standard deviation. Normal and abnormal data have noticeable frequency distribution differences. (b) Development of the three kinds of methods (axes: Year vs. AUC; methods plotted include AnoGAN'17, Skip-GANomaly'19, Puzzle-AE'20, GradCon'20, SPADE'20, InTra'21, DGAD'21, DifferNet'21, CutPaste'21, Draem'21 and ours). Our approach surpasses the SOTA reconstruction-based method without extra training data by a large margin, i.e., +18.3↑. Note that the current SOTA Draem [19] is not a classical reconstruction-based method, as it requires a new training strategy and extra training data.

To improve the performance of the reconstruction-based method, we need to enhance the reconstruction ability of the generator for the anomaly detection task. For an image, different frequency bands contain different types of information, e.g., low frequencies carry more semantic information while high frequencies carry more detailed texture information. Also, model performance can be improved from the frequency-domain perspective in many computer vision tasks; e.g., in image super-resolution, [20] separates the different frequency components to compensate for the loss of information in different frequency bands of real LR images. Motivated by this idea, we analyze the frequency distribution of normal and abnormal images in the anomaly detection task. As shown in Fig. 3(a), we compute the frequency energy distribution of normal and abnormal images from the amplitude spectrum of the Fourier-transformed image, which reflects its energy distribution. We re-examine the reconstruction paradigm and find that normal and abnormal samples have different frequency distributions in sensory AD, so it may be difficult and unsuitable for a single generator to learn the full-frequency reconstruction of the RGB image. Therefore, we propose an anomaly detection framework that uses multiple frequency branches to reconstruct information from different frequency bands separately. To differentiate the use of information from different frequency bands, we propose an effective Frequency Decoupling (FD) module that pre-obtains the omni-frequency representation of the input image, and we use parallel generators to reconstruct images of multiple frequencies. Considering model efficiency, we conduct experiments with 2 or 3 frequency branches in this paper. Different frequency branches in the framework are independent by default. However, an image contains information in multiple frequency bands, and in the real world the information in different bands is not unrelated but complementary. So, we design a tailored Channel Selection (CS) module that realizes omni-frequency interaction among multiple branches by adaptively selecting different channel features. Based on the above modules and the baseline Skip-GANomaly [21], we propose a novel Omni-frequency Channel-selection Reconstruction (OCR-GAN) network. Our method consistently achieves state-of-the-art (SOTA) results on multiple public datasets. Specifically, on MVTec AD (Fig. 3(b)), our OCR-GAN improves by +0.3↑ over the current SOTA method Draem [19] and by a significant +18.3↑ over DGAD [17], the SOTA reconstruction-based method without extra training data, which strongly indicates that the reconstruction-based method can perform well even without extra training data and pre-trained models. To the best of our knowledge, this paper is the first attempt to explore omni-frequency information with a reconstruction-based anomaly detection method. Our main contributions can be summarized as follows:

• We rethink the difference between normal and abnormal images from the frequency domain perspective and propose a novel framework for anomaly detection based on omni-frequency reconstruction.

• We propose an effective FD module to obtain the information of different frequency bands of the image, which enables omni-frequency reconstruction by multiple branches.

• We propose a CS module to realize omni-frequency interaction among multiple branches and adaptive selection of different channel features.

• Abundant experiments demonstrate the superiority of our OCR-GAN over SOTA methods, e.g., we achieve a new SOTA 98.3 detection AUC on the MVTec AD dataset without extra training data, which markedly surpasses the SOTA reconstruction-based method without extra training data by +18.3↑ and the SOTA method by +0.3↑.
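The frequency-energy comparison described for Fig. 3(a) can be illustrated with a short sketch. The code below is not the authors' analysis script; it assumes a 2-D grayscale image and summarizes the amplitude spectrum as a mean amplitude per radial frequency band. The function name `radial_energy_profile` and the `num_bins` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np

def radial_energy_profile(image: np.ndarray, num_bins: int = 32) -> np.ndarray:
    """Mean amplitude of the 2-D Fourier spectrum per radial frequency band.

    Assumption: `image` is a 2-D grayscale array; band 0 is the lowest frequency.
    """
    # Amplitude spectrum with the zero frequency shifted to the centre.
    amplitude = np.abs(np.fft.fftshift(np.fft.fft2(image)))
    h, w = image.shape
    yy, xx = np.indices((h, w))
    radius = np.hypot(yy - h // 2, xx - w // 2)
    # Assign every spectrum coefficient to one of num_bins radial bands.
    edges = np.linspace(0.0, radius.max() + 1e-9, num_bins + 1)
    band = np.clip(np.digitize(radius.ravel(), edges) - 1, 0, num_bins - 1)
    total = np.bincount(band, weights=amplitude.ravel(), minlength=num_bins)
    count = np.bincount(band, minlength=num_bins)
    return total / np.maximum(count, 1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    smooth = np.ones((64, 64))             # energy only at the zero frequency
    noisy = smooth + rng.normal(size=(64, 64))
    print(radial_energy_profile(smooth, 8).round(2))
    print(radial_energy_profile(noisy, 8).round(2))
```

Averaging such profiles over the normal and the abnormal images of a dataset, and plotting the mean with its standard deviation, would give a curve of the kind the caption of Fig. 3(a) describes.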

The remainder of the paper is organized as follows. Section II reviews related work. Details of the proposed OCR-GAN method are given in Section III. Experimental results are presented in Section IV. Section V concludes the paper with a discussion and summary.

II. RELATED WORK

Anomaly detection methods can be mainly divided into density-based, classification-based and reconstruction-based methods, as follows.

A. Density-based methods

Density-based methods build a density estimation model for the distribution of normal training data and assume that normal data have a higher likelihood under this model than abnormal data during inference. Parametric density estimation assumes that the density of normal data can be represented by some reference distribution. A pre-trained network is used to extract meaningful vectors representing the whole image or image patches for anomaly detection. The similarity between the representation vector of the test image and the reference vector is used as the anomaly score. Some studies [22]–[27] train the model on the entire image, while others [12], [13], [28], [29] work on image patches. The reference for normal data can be the parameters of the Gaussian distribution of the normal image embedding vectors [13], [30], the mixed Gaussian distribution [31], [32], the Poisson distribution [33], the center of the sphere containing the embeddings from normal images [12], [34], the entire set of normal embedding vectors [11], [35], the feature of the last layer in the network [36], [37], or the mid-level feature representation [38]. The Mahalanobis distance is used to calculate the anomaly score between the embedding vector of the test image and the reference vector of the normal training distribution. These methods have achieved good performance recently, but they lack interpretability: it is difficult to clearly identify which part of the image causes a high anomaly score. Also, this kind of method requires a pre-trained model for extracting vectors, which is less practical for various real scenarios.
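As a minimal sketch of the Gaussian-reference variant described above, the following code fits a mean and covariance to embedding vectors of normal images and scores a test embedding by its Mahalanobis distance. It assumes the embeddings have already been extracted by some pre-trained network; the helper names `fit_gaussian` and `mahalanobis_score` and the regularization constant `eps` are hypothetical, not taken from any cited method.

```python
import numpy as np

def fit_gaussian(embeddings: np.ndarray, eps: float = 1e-3):
    """Estimate the mean and inverse covariance of normal embeddings (N x D).

    Assumption: `eps` adds a small ridge term so the covariance is invertible.
    """
    mean = embeddings.mean(axis=0)
    cov = np.cov(embeddings, rowvar=False) + eps * np.eye(embeddings.shape[1])
    return mean, np.linalg.inv(cov)

def mahalanobis_score(x: np.ndarray, mean: np.ndarray, cov_inv: np.ndarray) -> float:
    """Anomaly score: Mahalanobis distance of embedding x to the normal Gaussian."""
    diff = x - mean
    return float(np.sqrt(diff @ cov_inv @ diff))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    normal_embeddings = rng.normal(size=(500, 4))   # stand-in for real features
    mean, cov_inv = fit_gaussian(normal_embeddings)
    print(mahalanobis_score(rng.normal(size=4), mean, cov_inv))   # typical sample
    print(mahalanobis_score(np.full(4, 8.0), mean, cov_inv))      # far from normal
```

A larger distance means the test embedding is less likely under the fitted normal distribution, which is why it can serve directly as the anomaly score.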

Another method of density estimation is normalizing flows, which learn bijective transformations between data distributions with a special property. DifferNet [39], which uses normalizing flows to estimate the precise likelihood, achieved good anomaly detection performance on MVTec AD. Since flow-based methods perform no dimensionality reduction, the computation cost is significant. This kind of method also needs pre-trained models to extract features.

B. Classification-based methods

Classification-based methods [40] try to find the classification boundaries of normal data. DeepSVDD [34] first introduces one-class classification to anomaly detection: samples that deviate from the description of the normal training samples are considered abnormal. Moreover, some self-supervised learning methods design proxy tasks that help the model distinguish anomalies from normal samples. One classical self-supervised anomaly detection method is isolation forest [41], which is based on the observation that abnormal samples can be isolated in fewer steps than normal samples. Other proxy tasks for self-supervised anomaly detection include image transformation prediction [24], [42], contrastive learning [43] and proxy binary classification [14]. [14] uses data augmentation to generate pseudo-anomaly data and then trains the feature extraction model on a binary classification proxy task against the normal training samples. Self-supervised methods rely on the design of the proxy task, which makes it difficult for them to perform well across multiple datasets.
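The proxy binary classification idea can be sketched as follows. This is an illustrative approximation only: it assumes the pseudo-anomaly is made by copying one rectangular patch of a normal image to another location, and it omits the further transformations that a method such as [14] may apply. The names `cut_paste`, `make_proxy_batch` and `patch_frac` are hypothetical.

```python
import numpy as np

def cut_paste(image: np.ndarray, rng: np.random.Generator,
              patch_frac: float = 0.25) -> np.ndarray:
    """Copy a random patch of the image to another random location."""
    h, w = image.shape[:2]
    ph, pw = max(1, int(h * patch_frac)), max(1, int(w * patch_frac))
    # Source and destination corners are drawn independently.
    sy, sx = rng.integers(0, h - ph + 1), rng.integers(0, w - pw + 1)
    dy, dx = rng.integers(0, h - ph + 1), rng.integers(0, w - pw + 1)
    out = image.copy()
    out[dy:dy + ph, dx:dx + pw] = image[sy:sy + ph, sx:sx + pw]
    return out

def make_proxy_batch(normal_images, rng: np.random.Generator):
    """Build inputs and labels for the proxy task: 0 = normal, 1 = pseudo-anomaly."""
    augmented = [cut_paste(img, rng) for img in normal_images]
    x = np.stack(list(normal_images) + augmented)
    y = np.array([0] * len(normal_images) + [1] * len(augmented))
    return x, y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    images = [rng.random((32, 32)) for _ in range(4)]
    x, y = make_proxy_batch(images, rng)
    print(x.shape, y)
```

A binary classifier trained on such batches learns features from normal data alone, which also shows why the approach depends on how closely the synthetic patches resemble the real defects of a given dataset.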

Fig. 4. Overview of the proposed OCR-GAN. The input image I goes through the Frequency Decoupling (FD) module to obtain omni-frequency images {I₁, I₂, …} from pre-processed Gaussian images. Then {I₁, I₂, …} are fed into multiple generators {φ₁, φ₂, …} to reconstruct the corresponding images {Î₁, Î₂, …}, which are added to obtain the final output Î. The proposed Channel Selection (CS) module performs omni-frequency interaction among the encoders of the generators {φ₁, φ₂, …}.

C. Reconstruction-based methods

One family of reconstruction-based methods is sparse reconstruction, which assumes that normal samples can be reconstructed with a limited number of basis functions while abnormal samples are more expensive to reconstruct. L₁ norm-based kernel PCA [44] and low-rank embedded networks [45] belong to the sparse reconstruction methods.

Abnormal images get a higher reconstruction error because they follow a different data distribution than normal images, so the reconstruction error can be used as the anomaly score for reconstruction-based methods. The autoencoder (AE) [46] and generative adversarial networks (GAN) [47] can reconstruct samples from the normal training data. [48] proposes to use an autoencoder for the reconstruction process and structural similarity to measure the reconstruction error. Some studies [47], [49] have shown that adversarial training improves generation results. Moreover, GAN-based methods offer additional metrics that can serve as the anomaly score, e.g., the output of the discriminator [50], [51] and the latent space distance [21], [51], [52]. Because the capacity of the generator is strong, it is difficult to ensure poor reconstruction of abnormal samples, so OCGAN [53] uses a denoising autoencoder, a latent discriminator, a visual discriminator, and a classifier to ensure that any example generated from the learned latent space is indeed from the normal class. In GAN-based methods, the discriminator is usually used to distinguish the reconstructed image from the original image, but OGNet [54] redefines the role of the discriminator, using it to distinguish reconstructed images of different qualities.
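The principle of using the reconstruction error as the anomaly score can be shown with a toy sketch. Here a linear autoencoder obtained from PCA stands in for a trained generator; this is an assumption made only to keep the example self-contained, and it is not the AE or GAN of any cited work. The names `fit_linear_autoencoder`, `reconstruct` and `anomaly_score` are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(normal: np.ndarray, k: int):
    """PCA stand-in for a generator trained on normal data (N x D).

    Returns the data mean and the top-k principal directions (k x D).
    """
    mean = normal.mean(axis=0)
    _, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
    return mean, vt[:k]

def reconstruct(x: np.ndarray, mean: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Encode x into k coefficients and decode it back to the input space."""
    return mean + (x - mean) @ basis.T @ basis

def anomaly_score(x: np.ndarray, mean: np.ndarray, basis: np.ndarray) -> float:
    """Mean squared reconstruction error, used directly as the anomaly score."""
    return float(np.mean((x - reconstruct(x, mean, basis)) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    directions = rng.normal(size=(2, 8))
    normal = rng.normal(size=(200, 2)) @ directions   # normal data on a 2-D subspace
    mean, basis = fit_linear_autoencoder(normal, k=2)
    good = rng.normal(size=2) @ directions
    bad = good + 5.0 * rng.normal(size=8)             # leaves the normal subspace
    print(anomaly_score(good, mean, basis), anomaly_score(bad, mean, basis))
```

The model only learns to reproduce the structure of normal data, so a sample that departs from that structure is reconstructed poorly and receives a higher score. The same sketch also hints at the capacity issue noted above: a model expressive enough to reconstruct everything would no longer separate the two cases.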

Recently, [55] utilizes backpropagated gradients as representations to characterize anomalies, and DGAD [17] learns representations under the guidance of the discriminator to improve model performance. The generation ability of the generator has a significant influence on the effectiveness of reconstruction-based methods, so [56] proposes to construct GAN ensembles for anomaly detection, as GAN ensembles often outperform a single GAN, and [57] proposes a multistage GAN to detect fabric defects. These methods indiscriminately reconstruct all frequencies of the RGB image, which may be difficult for the generator and lead to poor results in anomaly detection. Since we find that normal and abnormal samples have different frequency distributions, we propose a new paradigm that uses parallel branches to reconstruct omni-frequency images.
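One plausible way to obtain omni-frequency images for parallel branches is a difference-of-Gaussians decomposition, sketched below. This is an assumption for illustration and not necessarily the FD module of this paper: the image is blurred with increasing Gaussian widths, and each band is the difference between two consecutive blur levels, so that the bands sum back to the input. The names `gaussian_blur`, `frequency_decouple` and the `sigmas` values are hypothetical.

```python
import numpy as np

def gaussian_blur(image: np.ndarray, sigma: float) -> np.ndarray:
    """Separable Gaussian blur of a 2-D image with reflect padding."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    padded = np.pad(image, radius, mode="reflect")
    # Filter along rows, then along columns; "valid" restores the input size.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def frequency_decouple(image: np.ndarray, sigmas=(1.0, 2.0)):
    """Split an image into len(sigmas) + 1 bands, highest frequency first."""
    image = image.astype(float)
    bands, previous = [], image
    for sigma in sigmas:
        blurred = gaussian_blur(image, sigma)
        bands.append(previous - blurred)   # detail lost between two blur levels
        previous = blurred
    bands.append(previous)                 # remaining low-frequency content
    return bands

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((32, 32))
    bands = frequency_decouple(img)
    # The bands telescope, so their sum reproduces the input image.
    print(len(bands), np.allclose(sum(bands), img))
```

Because the bands add up to the input, summing the outputs of per-band generators yields a full reconstruction, which matches the "added to obtain the final output" step described for the framework.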

III. OUR APPROACH

A. Overview

In this section, we aim at improving the current reconstruction-based approach without extra training data and designing a generalized network for anomaly detection. As the difference between normal and abnormal images varies across frequency bands, we perform anomaly detection from the perspective of the frequency domain. As shown in Fig. 4, our method derives from a frequency-decoupling idea and comprises multiple generators, i.e., G = {φ₁, φ₂, …}, that reconstruct omni-frequency images {Î₁, Î₂, …}; the generators are trained alternately with a discriminator D to further boost model performance. Concretely, we propose an effective FD module to decouple the input image I into omni-frequency images {I₁, I₂, …} and a CS module to realize omni-frequency interaction by adaptively selecting channels among the encoders of {φ₁, φ₂, …}. Once training is finished, abnormal images are poorly reconstructed and receive higher anomaly scores than normal images.
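The idea of adaptively selecting channels across branches can be sketched with a selective-kernel-style attention. This design is an assumption, since the overview above does not specify the internals of the CS module: branch features are summed, pooled into a channel descriptor, and mapped to per-branch, per-channel weights that are normalized with a softmax across branches. The function `channel_selection` and the weight matrices `w_reduce` and `w_expand` are hypothetical; random values stand in for learned parameters.

```python
import numpy as np

def channel_selection(features: np.ndarray, w_reduce: np.ndarray,
                      w_expand: np.ndarray):
    """Re-weight the channels of each branch with softmax weights across branches.

    features: (B, C, H, W) encoder features of B frequency branches.
    w_reduce: (D, C) and w_expand: (B, C, D) stand in for learned layers.
    """
    fused = features.sum(axis=0)                       # (C, H, W)
    descriptor = fused.mean(axis=(1, 2))               # global average pool, (C,)
    hidden = np.maximum(w_reduce @ descriptor, 0.0)    # ReLU bottleneck, (D,)
    logits = w_expand @ hidden                         # (B, C)
    logits -= logits.max(axis=0, keepdims=True)        # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=0, keepdims=True)      # softmax over branches
    selected = features * weights[:, :, None, None]
    return selected, weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B, C, H, W, D = 3, 4, 5, 5, 2
    feats = rng.normal(size=(B, C, H, W))
    selected, weights = channel_selection(
        feats, rng.normal(size=(D, C)), rng.normal(size=(B, C, D)))
    print(weights.sum(axis=0))   # each channel's weights sum to 1 across branches
```

Since the weights for each channel sum to one across branches, the module decides for every channel how much each frequency branch contributes, which is one way to let otherwise independent branches exchange information.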
