Unsupervised Deep Transfer Learning for Intelligent Fault Diagnosis: An Open Source and Comparative Study

Zhibin Zhao^{a,b}, Qiyang Zhang^{a}, Xiaolei Yu^{a}, Chuang Sun^{a}, Shibin Wang^{a}, Ruqiang Yan^{a,∗}, Xuefeng Chen^{a}

^{a} School of Mechanical Engineering, Xi'an Jiaotong University, Xi'an, China
^{b} Centre for Health Informatics, University of Manchester, Manchester, United Kingdom
Abstract
Recent progress in intelligent fault diagnosis has largely depended on deep learning and the availability of plenty of labeled data. However, machines often operate under varying working conditions, or the target task has a different distribution from the collected data used for training (the domain shift problem). This has motivated deep transfer learning based (DTL-based) intelligent fault diagnosis, which attempts to mitigate the domain shift problem. Moreover, the newly collected testing data are usually unlabeled, which leads to a subclass of DTL methods called unsupervised deep transfer learning based (UDTL-based) intelligent fault diagnosis. Although UDTL-based methods have achieved great progress in the field of fault diagnosis, a standard open source code framework and a comparative study for UDTL-based intelligent fault diagnosis have not yet been established. In this paper, commonly used UDTL-based algorithms for intelligent fault diagnosis are integrated into a unified testing framework, and the framework is tested on five datasets. Extensive experiments are performed to provide a systematic comparative analysis and benchmark accuracies, making further studies more comparable and meaningful. To emphasize the importance and reproducibility of UDTL-based intelligent fault diagnosis, the testing framework with source code will be released to the research community to facilitate future research. Finally, the comparative analysis also reveals some open and essential issues in DTL for intelligent fault diagnosis that are rarely studied, including the transferability of features, the influence of backbones, negative transfer, and physical priors. In summary, the released framework and comparative study can serve as an extended interface and benchmark results for carrying out new studies on UDTL-based intelligent fault diagnosis. The code framework is available at https://github.com/ZhaoZhibin/UDTL.
Keywords: Unsupervised deep transfer learning, intelligent fault diagnosis, open source study
1. Introduction
With the rapid development of industrial big data and the Internet of Things under the Industry 4.0 background, Prognostics and Health Management (PHM) for industrial equipment is becoming increasingly important, leading to more and more intelligent maintenance systems for industrial equipment. Intelligent fault diagnosis is becoming an important branch of machine PHM technology, and traditional machine learning methods, including the support vector machine (SVM) and the artificial neural network (ANN), have been widely applied in this field. Meanwhile, with the growth of available data, data-driven intelligent methods with the ability for representation learning have become increasingly popular. Against this background, Deep Learning (DL) [1], with its advantages in adaptive feature extraction and pattern recognition for data processing, has gradually become a hot research focus for the PHM of industrial equipment. The effectiveness of DL models, such as the Convolutional Neural Network (CNN) [2], the Deep Belief Network (DBN) [3], and the Sparse Autoencoder (SAE) [4], has been validated for PHM tasks in current research.

∗Corresponding author. Email address: zhibinzhao1993@gmail.com (Zhibin Zhao)
Preprint submitted to XXXX, January 1, 2020. arXiv:1912.12528v1 [eess.SP] 28 Dec 2019
The effectiveness of DL for intelligent fault diagnosis rests on two assumptions: 1) plenty of labeled data are available; 2) the fault patterns of the training datasets in the source domain are the same as those of the testing datasets in the target domain (mathematically, the training datasets (the source domain) should follow the same distribution as the testing datasets (the target domain)). Labeled data for model training can be collected by fault seeding or simulations in the laboratory. However, training datasets acquired in the laboratory are not strictly consistent with the data generated by real industrial equipment. If DL models are trained on such datasets, they may overfit the training data, leading to weak generalization in real industrial applications, especially under new conditions that the models were not trained on. Moreover, machines often operate under varying working conditions in real applications, which requires trained models to adapt to changes in working conditions. These two aspects make it hard for models trained in the source domain to be generalized or transferred directly to the target domain. Nevertheless, common characteristics in the data from the two domains, due to the intrinsic similarity of different application scenarios or working conditions, make this domain shift manageable. Hence, to let DL models trained in the source domain work well in the target domain, an effective way is to fine-tune them with a few labeled data in the target domain; the fine-tuned model can then be used to diagnose new data in the target domain. This approach is also called deep transfer learning based (DTL-based) intelligent fault diagnosis. However, newly collected engineering data, or data under different working conditions, are usually unlabeled, and it is sometimes very difficult or even impossible to label them. Therefore, it is necessary to investigate the unsupervised version of DTL, in which there are no labeled data in the target domain; in this paper, we mainly focus on this variant, called unsupervised deep transfer learning based (UDTL-based) intelligent fault diagnosis.
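The domain shift problem can be illustrated with a toy example (the distributions below are purely hypothetical and not taken from any dataset in this paper): a simple threshold classifier fitted on source-domain features loses accuracy as soon as the target-domain feature distribution shifts.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(healthy_mean, faulty_mean, n=500):
    """Simulate a 1-D feature (e.g. vibration RMS) for two health states."""
    x = np.concatenate([rng.normal(healthy_mean, 0.5, n),
                        rng.normal(faulty_mean, 0.5, n)])
    y = np.concatenate([np.zeros(n), np.ones(n)])  # 0 = healthy, 1 = faulty
    return x, y

# Source domain: data collected in the laboratory.
x_src, y_src = make_data(healthy_mean=0.0, faulty_mean=3.0)
# Target domain: a different working condition shifts the feature distribution.
x_tgt, y_tgt = make_data(healthy_mean=1.5, faulty_mean=4.5)

# The simplest possible classifier trained on the source domain: a midpoint threshold.
threshold = (x_src[y_src == 0].mean() + x_src[y_src == 1].mean()) / 2

acc_src = ((x_src > threshold) == y_src).mean()
acc_tgt = ((x_tgt > threshold) == y_tgt).mean()
print(f"source accuracy: {acc_src:.2f}, target accuracy: {acc_tgt:.2f}")
```

Under these assumed distributions, the source accuracy stays near 1.0 while the target accuracy drops sharply; closing this gap without target labels is exactly what UDTL methods attempt.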
UDTL is widely used and has achieved great success in computer vision (CV) and natural language processing (NLP), owing to its application value, open source code, and established baseline accuracies in those fields. In the field of UDTL-based intelligent fault diagnosis, however, there is little open source code and there are few baseline accuracies, even though plenty of research has been published, often by simply reusing models already published in other fields. Due to the lack of open source code, the results in these published papers are very hard to reproduce for further comparison. This hinders the identification of state-of-the-art methods in this field and, in the long run, is unfavorable to its advancement. Hence, it is very important to perform a comparative study, provide baseline accuracies, and release open source code for the UDTL-based algorithms widely applied to intelligent fault diagnosis. More importantly, open source code is essential for the research community to find the existing problems of, and potential improvements to, these algorithms.
For testing UDTL-based algorithms, a unified testing framework, parameter settings, and datasets are three important factors affecting the fairness and effectiveness of comparisons between algorithms. However, due to the lack of open source code, these factors are often inconsistent, and many unfair or unsuitable comparisons exist among UDTL-based algorithms, leading to similar studies and ineffective improvements in current research, which is harmful to the development of advanced algorithms. Researchers keep adapting new algorithms that have already been published in the DTL field, and the proposed algorithms always perform better than the previous ones, which raises the question: is the improvement beneficial to intelligent fault diagnosis, or does it just depend on excessive parameter tuning? Meanwhile, the open and essential issues in DTL for intelligent fault diagnosis are rarely studied, such as the transferability of features, the influence of backbones, and which transfer learning method works better.
To fill this gap, in this paper, commonly used UDTL-based algorithms for intelligent fault diagnosis are integrated into a unified testing framework and tested on five datasets. The UDTL-based intelligent diagnosis methods discussed in this study mainly consist of four kinds: network-based DTL, instance-based DTL, mapping-based DTL, and adversarial-based DTL. The testing framework with source code will be released to the research community to facilitate research on DTL for intelligent fault diagnosis. With this comparative study and the open source code, the authors try to give a benchmark performance of current algorithms (it is worth mentioning that the results are just a lower bound of the accuracy) and attempt to find the core factors that determine their transfer performance.
The main contributions of this paper are summarized as follows:
1) Various datasets and data splitting. We collect most of the publicly available datasets suitable for UDTL-based intelligent fault diagnosis and provide a detailed discussion of their adaptability. We also discuss the way of data splitting and explain that it is more appropriate to split the data into training and testing sets regardless of whether they belong to the source domain or the target domain.
2) Benchmark accuracy and further discussion. We evaluate various UDTL-based intelligent diagnosis methods, including network-based DTL, instance-based DTL, mapping-based DTL, and adversarial-based DTL, on different datasets, and provide a systematic comparative analysis and benchmark accuracies (it is worth mentioning that the results are just a lower bound of the accuracy) from several perspectives, to make future studies in this field more comparable and meaningful. We also discuss the transferability of features, the influence of backbones, negative transfer, and other potential studies and applications.
3) Open source code. To emphasize the importance and reproducibility of UDTL-based intelligent fault diagnosis, we release the whole evaluation code framework, which implements all the UDTL-based methods discussed in this paper under a unified interface, for the advancement of this field. Meanwhile, this is an extensible framework that provides an extended interface for everyone to combine different algorithms and load their own datasets to carry out new studies. The code framework is available at https://github.com/ZhaoZhibin/UDTL.
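As an illustration of what such a unified, extensible interface can look like, the sketch below uses a simple registry pattern; all names (`MODELS`, `register`, the `"mkmmd"` key) are purely illustrative and are not the actual UDTL API.

```python
# A minimal registry pattern for plugging in new algorithms under one interface.
# All names here are illustrative; see the UDTL repository for the real interface.
MODELS = {}

def register(name):
    """Decorator that registers a model class under a string key."""
    def wrapper(cls):
        MODELS[name] = cls
        return cls
    return wrapper

@register("mkmmd")
class MKMMDModel:
    def fit(self, source_data, target_data):
        # A real implementation would train a backbone with an MK-MMD loss here.
        return f"aligning {len(source_data)} source and {len(target_data)} target samples"

# A configuration string is then enough to select an algorithm at run time.
model = MODELS["mkmmd"]()
print(model.fit(source_data=[1, 2, 3], target_data=[4, 5]))
```

The benefit of this design is that adding a new transfer algorithm or dataset loader only requires registering a new class, without touching the shared training and evaluation loop.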
The rest of this paper is organized as follows: Section 2 provides a brief review of UDTL-based intelligent fault diagnosis. Evaluation algorithms, applications, datasets, data preprocessing and splitting, and the evaluation methodology are introduced in Sections 3 to 7. After that, evaluation results and further discussions are presented in Sections 8 and 9, followed by the conclusion in Section 10.
2. Brief Review
Transfer learning, a well-known tool for solving the problem of limited or no labeled data in the target domain, has developed rapidly in the field of artificial intelligence. Pan et al. [5] and Weiss et al. [6] reviewed the basic progress and various applications of transfer learning in 2009 and 2016, respectively. Recently, owing to the strong representation ability of DL, which can learn more transferable features without any hand-crafted features, DTL (transfer learning based on DL models) has emerged as a popular branch and achieved many inspiring results; readers can refer to some excellent survey papers on DTL [7, 8]. Intelligent fault diagnosis is a natural transfer learning problem because of changes in working conditions and the lack of labeled fault data. Many traditional transfer learning methods have been applied in fault diagnosis research, such as transfer component analysis (TCA) based models [9] and subspace learning-based methods [10]. Since this paper mainly focuses on the application of DTL in intelligent fault diagnosis, the following part mainly reviews DTL-based intelligent fault diagnosis.
According to Tan et al. [7], DTL methods can be classified into four categories: network-based DTL, instance-based DTL, mapping-based DTL, and adversarial-based DTL. In the following, a brief review of DTL in intelligent fault diagnosis is summarized according to these four categories (for more detailed information, readers can refer to two excellent review papers [11, 12] published recently).
Network-based DTL: Network-based DTL means that partial network parameters pre-trained in the source domain are transferred to become partial network parameters for the testing procedure, or that network parameters are fine-tuned with a few labeled data in the target domain. Deep neural networks pre-trained on source data were used in [13–22] by freezing part of their parameters; part of the network parameters were then transferred to the target network, and the remaining parameters were fine-tuned with a small amount of target data. Deep neural networks pre-trained on ImageNet were used in [23–28] and fine-tuned with limited target data to adapt to the domain of engineering applications. Qureshi et al. [29] pre-trained nine deep sparse auto-encoders on one wind farm, and predictions on other wind farm datasets were made by fine-tuning the pre-trained networks. Zhong et al. [30] trained a CNN on sufficient normal samples and then replaced the fully-connected layers with a support vector machine (SVM) as the target model, trained and tested on fewer fault samples. Han et al. [31] discussed and compared three fine-tuning strategies for diagnosing unseen machine conditions: fine-tuning only the classifier, fine-tuning the feature descriptor, and fine-tuning both the feature descriptor and the classifier. Besides, Xu et al. [32] pre-trained an offline CNN on the source domain, directly transferred its parameters to the shallow layers of an online CNN, and fine-tuned the online CNN on the target domain for online fault diagnosis.
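The "freeze part of the network, fine-tune the rest" idea can be sketched without any deep learning framework (a toy numpy example; the layer sizes and data are illustrative and not taken from any model in this paper): the pre-trained feature extractor is kept fixed, and only the classifier head receives gradient updates on the few labeled target samples.

```python
import numpy as np

rng = np.random.default_rng(1)

# "Pre-trained" feature extractor (frozen) and classifier head (to be fine-tuned).
W_feat = rng.normal(size=(4, 8))   # frozen: never updated below
w_clf = rng.normal(size=8)         # fine-tuned on the target data

def features(x):
    return np.tanh(x @ W_feat)     # frozen feature extractor

def predict(x):
    return 1 / (1 + np.exp(-(features(x) @ w_clf)))  # sigmoid classifier head

# A few labeled target-domain samples (toy data, constructed so the labels are
# learnable from the frozen features).
x_tgt = rng.normal(size=(16, 4))
y_tgt = (features(x_tgt) @ rng.normal(size=8) > 0).astype(float)

# Fine-tune ONLY the classifier head with plain gradient descent.
W_feat_before = W_feat.copy()
lr = 0.5
for _ in range(300):
    p = predict(x_tgt)
    grad = features(x_tgt).T @ (p - y_tgt) / len(y_tgt)  # logistic-loss gradient
    w_clf -= lr * grad                                   # W_feat stays frozen

acc = ((predict(x_tgt) > 0.5) == y_tgt).mean()
```

The same pattern generalizes to the three strategies compared by Han et al. [31]: which parameter blocks receive gradient updates is the only difference between them.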
Instance-based DTL: Instance-based DTL refers to reweighting instances in the source domain to assist the classifier in predicting on the target domain, or using the statistics of instances in the target domain to help align the domains; examples include TrAdaBoost [33] and adaptive Batch Normalization (AdaBN) [34]. Xiao et al. [35] used TrAdaBoost to enhance the diagnostic capability of the fault classifier by adjusting the weight factor of each training sample. Zhang et al. [36] and Qian et al. [37] used AdaBN to improve the domain adaptation ability of the model by ensuring that each layer receives data from a similar distribution.
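The core of AdaBN can be sketched in a few lines (a simplified numpy illustration on assumed Gaussian toy features; real AdaBN replaces the batch-norm statistics of every layer of a trained network): the normalization statistics computed on the source domain are swapped for statistics recomputed on unlabeled target data, realigning the layer's output distribution without touching any learned weights.

```python
import numpy as np

rng = np.random.default_rng(2)

def batch_norm(x, mean, var, eps=1e-5):
    """Normalize features with the given (domain-specific) statistics."""
    return (x - mean) / np.sqrt(var + eps)

# Toy pre-activation features: the target domain is shifted and rescaled.
src = rng.normal(loc=0.0, scale=1.0, size=(1000, 3))
tgt = rng.normal(loc=2.0, scale=3.0, size=(1000, 3))

# Standard BN at test time: reuse the statistics estimated on the source domain.
src_mean, src_var = src.mean(axis=0), src.var(axis=0)
out_plain = batch_norm(tgt, src_mean, src_var)

# AdaBN: recompute the statistics on the unlabeled target data (no labels needed).
tgt_mean, tgt_var = tgt.mean(axis=0), tgt.var(axis=0)
out_adabn = batch_norm(tgt, tgt_mean, tgt_var)

print("mean of plain BN output:", out_plain.mean(axis=0).round(2))  # far from 0
print("mean of AdaBN output:  ", out_adabn.mean(axis=0).round(2))   # close to 0
```

With the source statistics, the target activations stay off-center and mis-scaled; with the recomputed statistics, each layer again sees data from a similar (zero-mean, unit-variance) distribution, which is the effect exploited in [36, 37].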
Mapping-based DTL: Mapping-based DTL refers to mapping instances from both the source and target domains into a feature space through a deep neural network. In this feature space, the domain divergence is minimized by distance metrics such as correlation alignment (CORAL) [38], maximum mean discrepancy (MMD) [39, 40], multi-kernel MMD (MK-MMD) [41, 42], joint distribution adaptation (JDA) [43], balanced distribution adaptation (BDA) [44], and joint maximum mean discrepancy (JMMD) [45]. Wang et al. [46] used BDA to adaptively balance the importance of the marginal and conditional distribution discrepancies between feature domains learned by deep neural networks for power data analysis. Wang et al. [47] minimized the CORAL loss to reduce the marginal and conditional distribution discrepancy between domains in the feature space for fault diagnosis of a thermal system. Another metric distance called MMD