Model-free optimal control design for a class of linear discrete-time
systems with multiple delays using adaptive dynamic programming
Jilie Zhang a, Huaguang Zhang a,b,*, Yanhong Luo a, Tao Feng a
a School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning 110819, P.R. China
b State Key Laboratory of Synthetical Automation for Process Industries (Northeastern University), Shenyang, Liaoning 110819, P.R. China
article info
Article history:
Received 9 July 2013
Received in revised form 4 November 2013
Accepted 16 December 2013
Communicated by D. Liu
Available online 24 January 2014
Keywords:
Model-free optimal control
Discrete-time delay system
Optimal control
Adaptive dynamic programming
abstract
In this paper, a model-free optimal control scheme for a class of linear discrete-time systems with
multiple delays in state, control and output vectors is proposed. The optimal control can be obtained
by adaptive dynamic programming (ADP) using only measured input/output data from the system.
First, we specify the class of systems to be addressed. Then, a model-free optimal
control is designed to minimize the given cost functional by ADP, combining a method similar to
Q-learning with a value iteration (VI) algorithm and using only the measured input/output data.
Finally, several numerical examples are given to illustrate the effectiveness of our approach.
© 2014 Elsevier B.V. All rights reserved.
1. Introduction
Since systems with time-delay phenomena are ubiquitous in
various research fields, such as biology, chemistry, economics,
mechanics, electrical engineering, physics, as well as the engineering sciences
[1–4], the optimal control problem has been a key topic for
time-delay systems over the past several decades [5–7]. In
fact, optimal control of time-delay systems is an
infinite-dimensional control problem [8], which is hard to solve.
However, because adaptive (approximate) dynamic programming (ADP)
is a powerful tool for solving optimal control problems [9–11],
ADP-based optimal control has attracted considerable attention from
researchers.
In recent years, ADP has been used to design the optimal control
for control systems [12–18,20,32–34]. The optimal control problem
for continuous-time systems is studied in [12–15,17,20,32,34],
while Refs. [16,18,33] design the optimal control for
discrete-time systems. However, to the best of our knowledge,
ADP-based optimal control results for time-delay systems
are rare. There exist only a few relevant results, such as [19,21,22].
An optimal control scheme for nonlinear systems with delays is
proposed in [21] using a new iterative ADP algorithm. In [19], a
new iterative heuristic dynamic programming (HDP) algorithm is
proposed to solve the optimal control problem for a class of
nonlinear discrete time-delay systems with saturating actuators.
Local and global optimization search processes are developed
to solve the optimal control problem in the iterative HDP
algorithm. Later, Ref. [22] designs the optimal control for tracking
control systems by a novel HDP iteration algorithm which comprises
state updating, control policy iteration and performance index
iteration. However, most of the above results design the optimal
control for time-delay systems assuming that the system model is known.
Although ADP algorithms for obtaining the optimal
control of time-delay systems have made some progress, how to
design a model-free optimal control for time-delay systems is
still an open problem. For the simpler case without delays, Lewis has
contributed a model-free optimal control design [23],
but little research has focused on designing model-free optimal
control for systems with multiple time delays. Therefore, we use
the method in [23] to obtain a control that drives time-delay
systems rather than systems without delays. Namely, we
design the optimal control for an equivalent system by the
method in [23], and then use it to drive the original system. However,
the system must satisfy certain conditions. Although the
systems we address are not general, this is a step forward in
designing model-free optimal control for time-delay systems
in the ADP field. The other contribution is that we identify a class of
systems with delays that can be driven by an optimal control
without delays.
In this paper, we not only extend the necessary and sufficient
conditions in [24] to linear discrete-time systems with multiple
http://dx.doi.org/10.1016/j.neucom.2013.12.038
* Corresponding author.
E-mail addresses: jilie0226@163.com (J. Zhang), hgzhang@ieee.org (H. Zhang),
neuluo@gmail.com (Y. Luo), sunnyfengtao@163.com (T. Feng).
Neurocomputing 135 (2014) 163–170