8148 IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, VOL. 64, NO. 10, OCTOBER 2017
Data-Driven Distributed Local Fault Detection for
Large-Scale Processes Based on the
GA-Regularized Canonical Correlation Analysis
Qingchao Jiang, Steven X. Ding, Yang Wang, and Xuefeng Yan
Abstract—Large-scale processes have become common,
and fault detection for such processes is imperative. This
work studies the data-driven distributed local fault detection
problem for large-scale processes with interconnected sub-
systems and develops a genetic algorithm (GA)-regularized
canonical correlation analysis (CCA)-based distributed
local fault detection scheme. For each subsystem, the
GA-regularized CCA is first performed with all of its coupled
systems, which aims to preserve the maximum correlation
with the minimal communication cost. A CCA-based
residual is then generated, and a corresponding statistic is
constructed to achieve optimal fault detection for the sub-
system. The distributed fault detector performs local fault
detection for each subsystem using its own measurements
and the information provided by its coupled subsystems
and therefore exhibits a superior monitoring performance.
The regularized CCA-based distributed fault detection ap-
proach is tested on a numerical example and the Tennessee
Eastman benchmark process. Monitoring results indicate
the efficiency and feasibility of the proposed approach.
Index Terms—Canonical correlation analysis (CCA),
distributed fault detection, genetic algorithm (GA), large-
scale processes.
Manuscript received November 15, 2016; revised February 27, 2017
and March 26, 2017; accepted April 1, 2017. Date of publication April
27, 2017; date of current version September 11, 2017. This work
was supported in part by the National Natural Science Foundation of
China under Grant 61603138, in part by the Fundamental Research
Funds for the Central Universities under Grant 222201717006 and Grant
222201714027, in part by the Young Teacher Study Abroad Program of
Shanghai under Grant A1-0217-16-003-01, and in part by the Alexander
von Humboldt Foundation. (Corresponding authors: Steven X. Ding and
Xuefeng Yan.)
Q. Jiang is with the Key Laboratory of Advanced Control and Optimization
for Chemical Processes of Ministry of Education, East China
University of Science and Technology, Shanghai 200237, P.R. China.
He was formerly with the Institute for Automatic Control and Complex
Systems (AKS), University of Duisburg-Essen, Duisburg 47057, Ger-
many (e-mail: qchjiang@ecust.edu.cn).
S. X. Ding is with the Institute for Automatic Control and Complex Systems,
University of Duisburg-Essen, Duisburg 47057, Germany (e-mail:
steven.ding@uni-due.de).
Y. Wang is with the School of Electric Engineering, Shanghai
Dianji University, Shanghai 200240, China, and also with the School
of Mechatronic Engineering and Automation, Shanghai University,
Shanghai 200072, China (e-mail: wangyang@sdju.edu.cn).
X. Yan is with the Key Laboratory of Advanced Control and Optimization
for Chemical Processes of Ministry of Education, East China
University of Science and Technology, Shanghai 200237, China (e-mail:
xfyan@ecust.edu.cn).
Color versions of one or more of the figures in this paper are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIE.2017.2698422
I. INTRODUCTION
WITH the increasing demand for process safety and product
quality, fault detection remains a key issue in academic
research and industrial application [1]–[6]. Nowadays,
plant-wide processes are usually characterized by a large scale,
multiple operation units, and complex correlations; monitoring
and control for these large-scale processes are among the new
challenges in the process control community [7]–[9]. Consider-
ing the wide use of intelligent sensing and computing devices,
recent research has seen significant advances in distributed mod-
eling, control, and monitoring methodologies for large-scale
systems [10]–[23].
The development of data gathering, transmitting, and process-
ing techniques has led to the rapid progress of data-driven mul-
tivariate statistical process-monitoring (MSPM) methods [2],
[3], [24]. To deal with large-scale processes, multiblock and
distributed monitoring schemes, which divide the entire process
data into several blocks to obtain local and global information,
have attracted considerable attention [12], [25], [26]. Traditional
methods generally assume that the processes are decomposed
using process knowledge. However, accurate process knowledge
is not always available. Fully data-driven distributed monitoring
schemes have therefore been developed [27]–[29]. These studies
have strengthened the foundation of data-driven distributed monitoring.
Nevertheless, they usually focus on the overall monitoring
performance of the entire process rather than on that of a local unit.
Local fault detection for large-scale processes remains an open
question.
As a representative multivariate analysis method, canonical
correlation analysis (CCA) has shown its efficiency in exploring
the relationship between two sets of variables. The first appli-
cations of canonical variate analysis (CVA, a generalized form
of CCA) have been reported in [1] and [30]–[32]. Unlike the
CVA-based methods that depend on canonical variables, in [33],
CCA is used to characterize the process input–output relations
and generate the fault detection residual. In [34], the CCA-based
method is improved to deal with incipient multiplicative faults
and the efficiency is shown. However, to the knowledge of the
authors, the use of CCA to characterize the correlation rela-
tions between a subsystem and the entire process has not been
discussed.
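To make the idea of a CCA-based residual concrete, the following is a minimal sketch of one common formulation, not necessarily the exact scheme of [33] or of this work. It assumes two data blocks `U` and `Y` with samples in rows, estimates the canonical weights by a singular value decomposition of the whitened cross-covariance, and forms a residual whose covariance is the identity minus the squared canonical correlations. All function names, variable names, and the simulated data are hypothetical and serve illustration only.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T

def fit_cca(U, Y):
    """Estimate CCA between two data blocks (rows are samples)."""
    mu_u, mu_y = U.mean(axis=0), Y.mean(axis=0)
    Uc, Yc = U - mu_u, Y - mu_y
    n = U.shape[0]
    S_uu = Uc.T @ Uc / (n - 1)
    S_yy = Yc.T @ Yc / (n - 1)
    S_uy = Uc.T @ Yc / (n - 1)
    Su_is, Sy_is = inv_sqrt(S_uu), inv_sqrt(S_yy)
    # SVD of the whitened cross-covariance; s holds the canonical correlations
    G, s, Rt = np.linalg.svd(Su_is @ S_uy @ Sy_is, full_matrices=False)
    J = Su_is @ G        # canonical weights for the U block
    L = Sy_is @ Rt.T     # canonical weights for the Y block
    return mu_u, mu_y, J, L, s

def t2_statistic(U, Y, model):
    """Residual r = L^T y - diag(s) J^T u and its T^2-type statistic."""
    mu_u, mu_y, J, L, s = model
    R = (Y - mu_y) @ L - ((U - mu_u) @ J) * s
    # In this formulation the residual covariance is I - diag(s^2)
    return np.sum(R ** 2 / (1.0 - s ** 2), axis=1)

# Simulated (hypothetical) fault-free data: Y depends linearly on U plus noise
rng = np.random.default_rng(0)
U = rng.normal(size=(500, 3))
A = rng.normal(size=(3, 2))
Y = U @ A + 0.1 * rng.normal(size=(500, 2))

model = fit_cca(U, Y)
t2_normal = t2_statistic(U, Y, model)

# A constant offset on the first Y variable plays the role of a sensor fault
Y_fault = Y + np.array([0.5, 0.0])
t2_fault = t2_statistic(U, Y_fault, model)
```

In this sketch the statistic stays small while the correlation learned from fault-free data holds and grows when the offset breaks it. A detection threshold, for example from a chi-square quantile under a Gaussian assumption, would still have to be chosen.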
In this work, we study the distributed local fault detection for
large-scale processes that consist of interconnected subsystems.
0278-0046 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.