A Driving Behavior Recognition Model with
Bi-LSTM and Multi-Scale CNN
He Zhang, Zhixiong Nan*, Tao Yang, Yifan Liu and Nanning Zheng
Abstract— In autonomous driving, perceiving the driving
behaviors of surrounding agents is important for the ego-vehicle
to make reasonable decisions. In this paper, we propose a
neural network model based on trajectory information for
driving behavior recognition. Unlike existing trajectory-based
methods that recognize driving behavior using hand-crafted
features or by directly encoding the trajectory, our model
includes a Multi-Scale Convolutional Neural Network (MSCNN)
module that automatically extracts high-level features
intended to encode rich spatial and temporal information.
Given the trajectory sequence of an agent as input, the
Bi-directional Long Short-Term Memory (Bi-LSTM) module and
the MSCNN module each process the input to generate a feature,
and the two features are then fused to classify the behavior
of the agent. We evaluate the proposed model on the public
BLVD dataset, achieving satisfactory performance.
I. INTRODUCTION
Understanding complex traffic scenarios has recently been
widely studied in the autonomous driving community [1].
When constructing a safe and reliable autonomous driving
system or Advanced Driver Assistance System (ADAS), it is
necessary to perceive the driving behavior of other agents
around the autonomous vehicle in real time in order to
analyze the dynamic evolution of the traffic scene and make
reasonable decisions. For example,
sensing the braking behavior and the lane changing behavior
of vehicles in front of the autonomous vehicle is signifi-
cant for predicting possible dangerous events. Meanwhile,
accurate recognition of driving behavior can not only assist
path planning and motion decisions but also serve as high-
level semantics to assist trajectory prediction of vehicles or
pedestrians [2], [3]. In this paper, we focus on accurately
identifying interactive behavior in the traffic environment,
where interactive behavior refers to the movement status
of surrounding traffic agents (vehicles, pedestrians, riders,
etc.) relative to the ego-vehicle [4]. The driving behavior
categories of vehicles around the ego-vehicle are shown in
Fig. 1.
*Corresponding author: Zhixiong Nan nzx2018@xjtu.edu.cn
The authors are with the Institute of Artificial Intelligence and Robotics,
Xi’an Jiaotong University, Xi’an, China
Fig. 1. Driving behavior categories
In the autonomous driving environment, the trajectory
sequence is considered relatively reliable and valuable
information for modeling traffic agent behaviors. Due to the
complexity and dynamics of real traffic environments,
classifying driving behavior is challenging. The main
challenges are three-fold: 1) Driving events generally differ
in temporal duration. If we use a large window to split the
trajectory into training samples, a single sample may contain
multiple kinds of behaviors; 2) For a fixed temporal window,
the number and behavior types of agents around the ego-vehicle
are highly dynamic; 3) The behavioral data are severely
imbalanced, and only limited training samples are available
for most anomalous behavior categories.
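The first and third challenges can be made concrete with a minimal Python sketch. Everything in it is an illustrative assumption rather than part of the proposed model: the per-frame label names, the window and stride values, and the inverse-frequency weighting (a common heuristic for imbalanced classes, not a technique taken from this paper).

```python
from collections import Counter


def sliding_windows(labels, window, stride):
    """Yield (start, label_slice) for every full window over per-frame labels."""
    for start in range(0, len(labels) - window + 1, stride):
        yield start, labels[start:start + window]


def mixed_fraction(labels, window, stride):
    """Return the fraction of windows that contain more than one behavior."""
    windows = list(sliding_windows(labels, window, stride))
    if not windows:
        return 0.0
    mixed = sum(1 for _, w in windows if len(set(w)) > 1)
    return mixed / len(windows)


def inverse_frequency_weights(labels):
    """Weight each class by total / (num_classes * class_count).

    This is a generic heuristic for imbalanced data, assumed here
    only for illustration.
    """
    counts = Counter(labels)
    total = len(labels)
    return {c: total / (len(counts) * n) for c, n in counts.items()}


# Hypothetical per-frame labels for one agent (names are made up).
frame_labels = ["straight"] * 12 + ["lane_change"] * 4 + ["braking"] * 8

small = mixed_fraction(frame_labels, window=4, stride=4)   # short windows
large = mixed_fraction(frame_labels, window=12, stride=4)  # long windows
weights = inverse_frequency_weights(frame_labels)

print(small, large, weights)
```

With these toy labels, the short window never straddles two behaviors, while most of the long windows do, and the rarest class receives the largest weight.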
Recent progress in Lidar, GPS and visual vehicle detection
technologies allows accurate and robust trajectory data to be
collected, which makes it possible to leverage data-driven
methods for the driving behavior recognition task. Existing
trajectory-based methods can generally be divided into
two types: one constructs a classifier based on
hand-crafted features [5]–[10]; the other directly models
the trajectory sequence to obtain dynamic evolution rules
and then performs behavior classification [2], [3], [11].
However, both types have drawbacks. The
former requires domain knowledge to design manual
features and generally needs different features for
different datasets, leading to a lack of generalization across
scenarios. The drawback of the latter is that the
raw information contained in trajectory points may be
insufficient, which may lead to under-fitting of the model.
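To illustrate what the first type of method involves, the sketch below computes a few hand-crafted features from a sequence of (x, y) trajectory points. The specific features (mean speed, lateral offset, net heading change), the function name, and the sampling interval `dt` are hypothetical choices made for this example; they are not taken from the cited works.

```python
import math


def handcrafted_features(points, dt=0.1):
    """Compute simple illustrative features from a list of (x, y) points.

    `dt` is the assumed time step between consecutive points, in seconds.
    """
    if len(points) < 2:
        raise ValueError("need at least two trajectory points")
    speeds, headings = [], []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        speeds.append(math.hypot(dx, dy) / dt)
        headings.append(math.atan2(dy, dx))
    # Wrap the heading difference into [-pi, pi).
    delta = headings[-1] - headings[0]
    delta = (delta + math.pi) % (2 * math.pi) - math.pi
    return {
        "mean_speed": sum(speeds) / len(speeds),
        "lateral_offset": points[-1][1] - points[0][1],
        "heading_change": delta,
    }


# Two toy trajectories: driving straight, and shifting two units sideways.
straight = [(float(i), 0.0) for i in range(5)]
shifted = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0), (3.0, 2.0), (4.0, 2.0)]

feats = handcrafted_features(straight, dt=1.0)
shift_feats = handcrafted_features(shifted, dt=1.0)
print(feats)
print(shift_feats)
```

In this toy case only the lateral offset separates the two trajectories, and a different dataset or coordinate frame would likely call for different features, which is the generalization problem described above.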
To overcome these drawbacks of conventional
methods, we propose a neural network model to recognize
the driving behavior of surrounding agents. Unlike existing
trajectory-based methods that recognize driving behavior
using hand-crafted features or by directly encoding the
arXiv:2103.00801v1 [cs.CV] 1 Mar 2021