Deep Learning of Dynamic Factor Models for Asset Pricing
Nikolay Nikolaev
https://nikolaevny.github.io/
The large amount of transactions on the electronic exchanges today stimulates the development
of computational algorithms for automated portfolio trading [1], [2], [3], [4]. These algorithms
continuously optimize the portfolio with the arrival of new information on the market. The
optimization strategy typically includes asset pricing, asset selection and rebalancing. A popular
formalism for asset pricing is the factor model [5]. A factor model describes the relationships
between assets by regressing their returns on unobservable latent variables (called risk factors).
Such a model compresses the dependencies between multiple return series into a small number
of correlations. The difficulty is to choose a subset of factors that, taken together, are
profitable to trade (i.e., to select a parsimonious factor model).
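To make the formalism concrete, the following is a minimal numpy sketch of a linear factor model on simulated data; the dimensions, loadings, and noise level are illustrative assumptions, not estimates from real returns:

```python
import numpy as np

rng = np.random.default_rng(0)

T, N, K = 500, 10, 3                 # time steps, assets, latent factors
B = rng.normal(size=(N, K))          # factor loadings (betas), one row per asset
f = rng.normal(size=(T, K))          # unobserved factor realizations
eps = 0.1 * rng.normal(size=(T, N))  # idiosyncratic noise

r = f @ B.T + eps                    # asset returns: r_t = B f_t + eps_t

# With K << N, the N x N return covariance is summarized by the loadings
# and the noise level, which is the parsimony the factor model provides.
implied_cov = B @ B.T + 0.01 * np.eye(N)
sample_cov = np.cov(r, rowvar=False)
```

Here ten correlated return series are described by only three factors, so the full covariance structure is recovered from far fewer parameters than an unrestricted covariance matrix would need.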
Among the various approaches to factor modelling, current research actively investigates machine
learning [6], [7] and deep learning [1], [2], [3], [4]. These paradigms offer stable algorithms
that are robust to small unessential changes in the returns and volatilities, and are especially suitable
for induction and selection of variables representing asset correlations [8]. These algorithms provide
capacity for learning linear and nonlinear factor models, as well as for identifying interactions
between price return series. Some of the most recent frameworks are the Deep Learning Factor Network [1],
the Deep Autoencoder Asset Pricing Network [2], and the Deep Multilayer Factor Network [4]. A common
limitation of all the machine and deep learning algorithms cited above is that they apply static training
procedures during model calibration; that is, they do not handle the temporal dimension explicitly, and so
they lose accuracy. It should be noted that the standard portfolio optimizers also treat factor models
with static procedures, and this leads to suboptimal solutions. Even if a model has temporal
variables, it is important to apply dynamic calibration in order to achieve its full descriptive potential.
Our research developed an original nonlinear dynamic factor model for asset pricing using deep
learning technology. We designed a dynamic factor model represented by a recurrent neural network
with local memory that carries temporal information. The network is trained with a dynamic learning
algorithm, BPTT (Backpropagation Through Time) [9], which unfolds the model back in time to
capture temporal dependencies (with dynamic derivatives). This is a deep learning mechanism because
the unrolling creates a deep network structure. The BPTT is used to update the parameters of the hidden
layer which produces the latent factors. The output layer computes the beta loadings. The strength of
our model is that it predicts asset returns from inferred factor realizations, more precisely, the factors
are forecasted arrangements of individual asset contributions to the portfolio. The overall model
is a time-dependent function whose forecasts are less sensitive to noise and nonstationarities in the
time series (an advantage over the previous deep factor networks, which were trained as static functions).
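The unfolded-recurrence training scheme described above can be sketched in plain numpy, assuming a simple Elman-style recurrence; the toy AR(1) data, dimensions, initialization scales, and learning rate are illustrative assumptions, not the paper's DDFM specification:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: N assets driven by K persistent latent factors (AR(1) dynamics).
T, N, K = 200, 4, 2
true_f = np.zeros((T, K))
for t in range(1, T):
    true_f[t] = 0.8 * true_f[t - 1] + 0.1 * rng.normal(size=K)
true_B = rng.normal(size=(N, K))
R = true_f @ true_B.T + 0.02 * rng.normal(size=(T, N))

# Recurrent model: the hidden state f_t = tanh(Wf f_{t-1} + Wr r_{t-1})
# plays the role of the latent factors; the output layer r_hat_t = B f_t
# holds the beta loadings.
Wf = 0.3 * rng.normal(size=(K, K))
Wr = 0.3 * rng.normal(size=(K, N))
B = 0.3 * rng.normal(size=(N, K))
lr = 0.3

def forward():
    f = np.zeros((T, K))
    r_hat = np.zeros((T, N))
    for t in range(1, T):
        f[t] = np.tanh(Wf @ f[t - 1] + Wr @ R[t - 1])
        r_hat[t] = B @ f[t]
    return f, r_hat

losses = []
for epoch in range(300):
    f, r_hat = forward()
    err = r_hat - R                          # one-step-ahead forecast errors
    losses.append(0.5 * np.mean(err[1:] ** 2))
    # BPTT: walk the unrolled network backwards, accumulating the dynamic
    # derivatives that flow through the recurrent state.
    dWf, dWr, dB = np.zeros_like(Wf), np.zeros_like(Wr), np.zeros_like(B)
    df_next = np.zeros(K)                    # gradient arriving from f_{t+1}
    for t in range(T - 1, 0, -1):
        df = B.T @ err[t] + df_next          # total gradient w.r.t. f_t
        dpre = df * (1.0 - f[t] ** 2)        # through the tanh nonlinearity
        dB += np.outer(err[t], f[t])
        dWf += np.outer(dpre, f[t - 1])
        dWr += np.outer(dpre, R[t - 1])
        df_next = Wf.T @ dpre                # hand the gradient back to f_{t-1}
    for W, dW in ((Wf, dWf), (Wr, dWr), (B, dB)):
        W -= lr * dW / T                     # averaged gradient step
```

The backward pass illustrates the "dynamic derivatives": the gradient at each step combines the local output error with the gradient handed back from the following time step, which a static training procedure would discard.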
The proposed Deep Dynamic Factor Model (DDFM) is a modern tool for portfolio construction. We
investigated the usefulness of DDFM for building sparse portfolios [6] that aim to outperform the equally
weighted benchmark. The complexity of the DDFM is reduced with a neural network pruning
technique, using a cardinality parameter to control the degree of sparseness. The machine learning
algorithms LASSO and LARS were chosen for comparison because they are probably the most popular
algorithms [7] for computing sparse portfolios [6]. The LASSO optimizes the portfolio using l1-norm
regularization, which achieves sparsity by shrinking the optimal allocations and driving the smallest
of them exactly to zero. The LARS is a sophisticated algorithm that solves the same regularized
optimization problem by efficiently tracing the solution path across values of the regularization
hyperparameter. For further comparison we also implemented a Ridge Regression Portfolio (RRP)
algorithm, which uses the typical squared-error training loss.
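The two routes to sparsity discussed above, hard pruning to a fixed cardinality and l1-regularized optimization, can be sketched as follows. This is a toy numpy illustration: the tracking objective, simulated data, and hyperparameters are assumptions, not the paper's setup, and the generic coordinate-descent LASSO solver stands in for the cited implementations:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hard pruning with a cardinality parameter (an illustrative stand-in for
# the DDFM pruning step, which operates on network weights).
def prune_to_cardinality(W, k):
    """Zero all but the k largest-magnitude entries of W."""
    flat = np.abs(W).ravel()
    if k >= flat.size:
        return W.copy()
    threshold = np.partition(flat, -k)[-k]   # k-th largest magnitude
    return np.where(np.abs(W) >= threshold, W, 0.0)

# LASSO-style sparse allocations via coordinate descent with
# soft-thresholding, versus the dense closed-form ridge solution.
def soft_threshold(x, lam):
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def lasso_weights(R, y, lam, n_iter=200):
    """min_w 0.5 * ||y - R w||^2 + lam * ||w||_1 by coordinate descent."""
    _, N = R.shape
    w = np.zeros(N)
    col_sq = (R ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(N):
            resid_j = y - R @ w + R[:, j] * w[j]   # residual excluding asset j
            w[j] = soft_threshold(R[:, j] @ resid_j, lam) / col_sq[j]
    return w

def ridge_weights(R, y, lam):
    """l2 penalty shrinks allocations but leaves them all nonzero."""
    N = R.shape[1]
    return np.linalg.solve(R.T @ R + lam * np.eye(N), R.T @ y)

# Toy data: the target is a combination of the first 3 of 8 assets.
T, N = 300, 8
R = 0.02 * rng.normal(size=(T, N))
y = R[:, :3].mean(axis=1)

w_lasso = lasso_weights(R, y, lam=0.01)
w_ridge = ridge_weights(R, y, lam=0.01)
```

The soft-thresholding step is what sets small allocations exactly to zero, so the l1 portfolio ends up sparse while the ridge portfolio stays dense; the cardinality parameter of the pruning function plays the analogous role of directly fixing the number of surviving weights.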