SCIENCE CHINA Information Sciences
REVIEW
Recent Advances and Challenges in Task-oriented Dialog System
Zheng Zhang, Ryuichi Takanobu, Minlie Huang* & Xiaoyan Zhu
Dept. of Computer Science & Technology, Tsinghua University, Beijing 100084, China;
Institute for Artificial Intelligence, Tsinghua University (THUAI), Beijing 100084, China;
Beijing National Research Center for Information Science & Technology, Beijing 100084, China
Abstract Due to their significance and value in human-computer interaction and natural language processing, task-oriented dialog systems are attracting more and more attention in both academic and industrial communities. In this paper, we survey recent advances and challenges in an issue-specific manner. We discuss three critical topics for task-oriented dialog systems: (1) improving data efficiency to facilitate dialog system modeling in low-resource settings, (2) modeling multi-turn dynamics for dialog policy learning to achieve better task-completion performance, and (3) integrating domain ontology knowledge into the dialog model in both pipeline and end-to-end models. We also review recent progress in dialog evaluation and some widely used corpora. We believe that this survey can shed light on future research in task-oriented dialog systems.
Keywords task-oriented dialog, low-resource, dialog state tracking, dialog policy, end-to-end model
Citation Zhang Z, Takanobu R, Huang M, Zhu X. Recent Advances and Challenges in Task-oriented Dialog
System. Sci China Inf Sci, for review
1 Introduction
Building task-oriented (also referred to as goal-oriented) dialog systems has become a hot topic in both the research community and industry. A task-oriented dialog system aims to assist the user in completing certain tasks in a specific domain, such as restaurant booking, weather query, and flight booking, which makes it valuable for real-world business. Compared to open-domain dialog systems, where the major goal is to maximize user engagement [1], task-oriented dialog systems focus on accomplishing specific tasks in one or multiple domains. Typically, task-oriented dialog systems are built on top of a structured ontology, which defines the domain knowledge of the tasks.
1.1 General Framework
The architecture of task-oriented dialog systems can be roughly divided into two classes: pipeline and end-to-end approaches. In pipeline approaches, the model often consists of several components, including Natural Language Understanding (NLU), Dialog State Tracking (DST), Dialog Policy, and Natural Language Generation (NLG), which are combined in a pipeline manner as shown in Figure 1. The NLU, DST and NLG components are often trained individually before being aggregated together, while the dialog policy component is trained within the composed system. It is worth noting that although the NLU-DST-Policy-NLG framework is a typical configuration of the pipeline system, there are still some
* Corresponding author (email: aihuang@tsinghua.edu.cn)
arXiv:2003.07490v2 [cs.CL] 19 Mar 2020
Zhang Z, et al. Sci China Inf Sci 2
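[Figure 1 diagram. Components: User, Natural Language Understanding, Dialog State Tracking, Dialog State, Dialog Policy, and Natural Language Generation, with Dialog State Tracking and Dialog Policy grouped as the Dialog Manager and connected to a KB query. Example: the user utterance “I want to find a Chinese restaurant.” is understood as Inform (cuisine=“Chinese”), and the system act Request (location) is rendered as “Where do you want to eat?”]
Figure 1 General framework of a pipeline task-oriented dialog system.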
other kinds of configurations. Recent works merge some of the typical components, such as word-level DST and word-level policy, resulting in various pipeline configurations.
In end-to-end approaches, the dialog system is trained in an end-to-end manner, without specifying each individual component. Commonly, the training process is formulated as generating a response utterance given the dialog context and the backend knowledge base.
1.1.1 Natural Language Understanding
Given a user utterance, the natural language understanding (NLU) component maps it to a structured
semantic representation. A popular schema for semantic representation is the dialog act, which consists
of intent and slot-values, as illustrated in Table 1. The intent type is a high-level representation of an
utterance, such as Query and Inform. Slot-value pairs are the task-specific semantic elements that are
mentioned in an utterance. Based on the dialog act structure, the task of NLU can be further decomposed
into two tasks: intent detection and slot-value extraction. The former is normally formulated as an intent
classification task by taking the utterance as input, and the slot-value extraction task is often viewed as
a sequence labeling problem:
$$p_{\text{intent}}(d \mid x_1, x_2, \ldots, x_n) \tag{1}$$
$$p_{\text{slot}}(y_1, y_2, \ldots, y_n \mid x_1, x_2, \ldots, x_n) \tag{2}$$
where $d$ indicates the intent class and $y_1$ to $y_n$ are the labels of each token in the utterance $[x_1, x_2, \ldots, x_n]$, in which $x_i$ is a token and $n$ is the number of tokens. $p_{\text{intent}}$ and $p_{\text{slot}}$ are often implemented using recurrent neural networks, such as LSTM, to predict the intent class $d$ and the sequence label $y_t$ respectively.
Table 1 An example of dialog act for an utterance in the restaurant reservation domain.
Utterance    How about a British restaurant in north part of town.
Intent       Query
Slot-Value   Cuisine=British, Location=North
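As an illustration of how the outputs of Equations (1) and (2) combine into a dialog act, the following minimal sketch decodes per-token labels into slot-value pairs for the utterance in Table 1. The BIO tagging scheme, the function name `bio_to_slots`, and the hard-coded labels and intent are assumptions made for this example; they stand in for the predictions of trained models and are not taken from the survey.

```python
from typing import Dict, List, Optional


def bio_to_slots(tokens: List[str], labels: List[str]) -> Dict[str, str]:
    """Collect slot-value pairs from per-token BIO labels (one label per token)."""
    slots: Dict[str, str] = {}
    current_slot: Optional[str] = None
    current_words: List[str] = []
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            # A new slot span starts; store the previous span first.
            if current_slot is not None:
                slots[current_slot] = " ".join(current_words)
            current_slot, current_words = label[2:], [token]
        elif label.startswith("I-") and current_slot == label[2:]:
            current_words.append(token)
        else:
            # "O" (or an inconsistent I- tag) closes any open span.
            if current_slot is not None:
                slots[current_slot] = " ".join(current_words)
            current_slot, current_words = None, []
    if current_slot is not None:
        slots[current_slot] = " ".join(current_words)
    return slots


# Hypothetical model outputs for the utterance in Table 1.
tokens = "How about a British restaurant in north part of town".split()
labels = ["O", "O", "O", "B-Cuisine", "O", "O", "B-Location", "O", "O", "O"]
intent = "Query"  # stands in for the most probable class under p_intent

dialog_act = {"intent": intent, "slots": bio_to_slots(tokens, labels)}
print(dialog_act)
```

The sketch keeps the surface form of each value ("north"); mapping it to the canonical ontology value ("North" in Table 1) would be a separate normalization step that is not shown here.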
1.1.2 Dialog State Tracking
The dialog state tracker estimates the user’s goal at each time step by taking the entire dialog context as input. The dialog state at time t can be regarded as an abstracted representation of the previous turns until t. Most existing works adopt a belief state for dialog state representation, in which the state is composed of several probability distributions over the value vocabulary of each slot. Therefore, this problem can be formulated as a multi-task classification task:
$$p_i(d_{i,t} \mid u_1, u_2, \ldots, u_t) \tag{3}$$
where for each specific slot $i$, there is a tracker $p_i$, and $u_t$ represents the utterance in turn $t$. The class of slot $i$ in the $t$-th turn is $d_{i,t}$. However, this approach falls short when facing previously unseen values at run time. To mitigate this issue, there are some generative approaches that generalize well to new domains and previously unseen values.
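The belief-state formulation can be made concrete with a small sketch. The code below is a toy stand-in for the per-slot trackers of Equation (3): it scores each value in a closed vocabulary by keyword matching over the dialog history and normalizes the scores into one distribution per slot. The ontology, the scoring rule, and the name `track` are all assumptions for illustration; a real tracker would be a learned classifier.

```python
from typing import Dict, List

# Hypothetical ontology: each slot has a closed value vocabulary.
ONTOLOGY: Dict[str, List[str]] = {
    "cuisine": ["chinese", "british", "none"],
    "location": ["north", "south", "none"],
}


def track(utterances: List[str]) -> Dict[str, Dict[str, float]]:
    """Toy stand-in for the per-slot trackers p_i(d_{i,t} | u_1, ..., u_t)."""
    belief: Dict[str, Dict[str, float]] = {}
    for slot, values in ONTOLOGY.items():
        # Start every value at 1.0 so the result is a proper distribution
        # even when nothing has been mentioned yet.
        scores = {value: 1.0 for value in values}
        for turn, utterance in enumerate(utterances, start=1):
            words = utterance.lower().replace(".", " ").split()
            for value in values:
                if value in words:
                    scores[value] += turn  # mentions in later turns weigh more
        total = sum(scores.values())
        belief[slot] = {value: score / total for value, score in scores.items()}
    return belief


history = ["I want to find a Chinese restaurant.", "Somewhere in the north."]
belief_state = track(history)
best = {slot: max(dist, key=dist.get) for slot, dist in belief_state.items()}
print(best)
```

Because the output is restricted to the fixed vocabulary, a request for a value outside it (for example "Italian") leaves the cuisine distribution uniform, which illustrates the unseen-value limitation of classification-based tracking noted above.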
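[Figure 2 diagram: a chain of states $S_{t-1}$, $S_t$, $S_{t+1}$, $S_{t+2}$ linked by actions $a_{t-1}$, $a_t$, $a_{t+1}$ and rewards $R_{t-1}$, $R_t$, $R_{t+1}$, in rows labeled Reward, State, and Action.]
Figure 2 The framework of Markov Decision Process [2]. At time $t$, the system takes an action $a_t$, receiving a reward $R_t$ and transferring to a new state $S_{t+1}$.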
1.1.3 Dialog Policy
Conditioned on the dialog state, the dialog policy generates the next system action. Since the dialog acts in a session are generated sequentially, it is often formulated as a Markov Decision Process (MDP), which can be addressed by Reinforcement Learning (RL). As illustrated in Figure 2, at a specific time step $t$, the system takes an action $a_t$, receives a reward $R_t$, and the state is updated to $S_{t+1}$.
A typical approach is to first train the dialog policy off-line through supervised learning or imitation
learning based on the dialog corpus, and then fine-tune the model through RL with real users. Since real
user dialogs are costly, user simulation techniques are introduced to provide affordable training dialogs.
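The state, action, and reward loop of the MDP formulation can be sketched with a toy environment that plays the role of a user simulator. Everything in this example is an assumption for illustration: the two slots, the reward values (-1 per turn, +20 for a successful booking, -10 for a premature one), and the hand-written `rule_policy`. It shows only the interaction loop and the episode return, not a learning algorithm.

```python
from typing import Callable, FrozenSet, List, Tuple

SLOTS = ("cuisine", "location")
State = FrozenSet[str]  # the slots the simulated user has already provided


def step(state: State, action: str) -> Tuple[State, float, bool]:
    """Toy user simulator: returns (next_state, reward, done)."""
    if action.startswith("request:"):
        slot = action.split(":", 1)[1]
        # The simulated user always answers the requested slot; each turn costs 1.
        return state | {slot}, -1.0, False
    if action == "book":
        success = all(slot in state for slot in SLOTS)
        return state, (20.0 if success else -10.0), True
    raise ValueError(f"unknown action: {action}")


def rule_policy(state: State) -> str:
    """Hand-written policy: request the first missing slot, then book."""
    for slot in SLOTS:
        if slot not in state:
            return f"request:{slot}"
    return "book"


def rollout(policy: Callable[[State], str], gamma: float = 1.0,
            max_turns: int = 10) -> Tuple[List[str], float]:
    """Run one dialog session and return the actions and the discounted return."""
    state: State = frozenset()
    actions: List[str] = []
    total, discount = 0.0, 1.0
    for _ in range(max_turns):
        action = policy(state)
        state, reward, done = step(state, action)
        actions.append(action)
        total += discount * reward
        discount *= gamma
        if done:
            break
    return actions, total


actions, episode_return = rollout(rule_policy)
print(actions, episode_return)
```

An RL algorithm would replace `rule_policy` with a learned policy and use returns such as `episode_return` as the training signal; because each rollout only calls the simulator, it avoids the cost of collecting dialogs with real users.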
1.1.4 Natural Language Generation
Given the dialog act generated by the dialog policy, the natural language generation component maps the
act to a natural language utterance, which is often modeled as a conditioned language generation task [3].
To improve user experience, the generated utterance should (1) fully convey the semantics of a dialog act
for task-completion, and (2) be natural, specific, and informative, analogous to human language.
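To make the input and output of this component concrete, the sketch below maps a dialog act to an utterance with hand-written templates and checks requirement (1), that every informed slot value appears in the text. Template lookup is a simple baseline swapped in for illustration, not the conditioned neural generation discussed above; the templates, the restaurant name, and the function names are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical templates keyed by (intent, sorted slot names).
TEMPLATES = {
    ("Request", ("location",)): "Where do you want to eat?",
    ("Inform", ("cuisine", "name")): "{name} is a nice {cuisine} restaurant.",
}


def generate(intent: str, slots: Dict[str, Optional[str]]) -> str:
    """Render a dialog act; a KeyError means no template covers this act."""
    template = TEMPLATES[(intent, tuple(sorted(slots)))]
    return template.format(**slots)


def conveys_all_slots(utterance: str, slots: Dict[str, Optional[str]]) -> bool:
    """Check that every slot value that has been set appears in the utterance."""
    text = utterance.lower()
    return all(value.lower() in text for value in slots.values() if value is not None)


request_act = {"location": None}  # requested slots carry no value
inform_act = {"name": "Golden Wok", "cuisine": "Chinese"}

print(generate("Request", request_act))
print(generate("Inform", inform_act))
```

Templates satisfy requirement (1) by construction but tend to fall short on requirement (2), since the same act always yields the same sentence.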
1.1.5 End-to-end Methods
Inspired by research on open-domain dialog systems, the end-to-end approaches for task-oriented dialog systems use neural models to build the system in an end-to-end manner without modular design, as shown in Figure 3. Most of these methods utilize seq2seq neural networks as the underlying framework. The end-to-end formulation can avoid the problem of error propagation within cascaded components during training.
1.2 Main Challenges
Recent advances in task-oriented dialog systems are dominated by neural approaches. These approaches
can be roughly classified into two genres: pipeline and end-to-end methods.
[Figure 3 diagram. Components: Encoder, Latent Variable, KB Query, knowledge, and Decoder.]
Figure 3 The framework of end-to-end dialog systems. It first encodes natural language context to get some latent variables, which can be used for KB query. Then based on the latent variables and query results, the decoder generates a natural language response.
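The KB query step in this framework can be illustrated in isolation. In the sketch below, an explicit dictionary of slot constraints stands in for whatever the model derives from its latent variables; the knowledge base entries and the name `query_kb` are hypothetical, and how a neural model produces such a query is exactly the part this example leaves out.

```python
from typing import Dict, List

# Hypothetical knowledge base of restaurant entities.
KB: List[Dict[str, str]] = [
    {"name": "Golden Wok", "cuisine": "chinese", "location": "north"},
    {"name": "Red Lantern", "cuisine": "chinese", "location": "south"},
    {"name": "The Crown", "cuisine": "british", "location": "north"},
]


def query_kb(constraints: Dict[str, str], kb: List[Dict[str, str]] = KB) -> List[Dict[str, str]]:
    """Return every entity whose attributes match all given constraints."""
    return [
        entity for entity in kb
        if all(entity.get(slot) == value for slot, value in constraints.items())
    ]


results = query_kb({"cuisine": "chinese", "location": "north"})
print([entity["name"] for entity in results])
```

The retrieved entities would then be passed to the decoder together with the encoded context to produce the response.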
In pipeline methods, recent research focuses more on the dialog state tracking and dialog policy components, which are together called Dialog Management. This is because both NLU and NLG components are standalone language processing tasks, which are less interwoven with the task in dialog systems. Based on the domain ontology, the DST task can be seen as a classification task that predicts the value of each slot. However, when the training data is not sufficient, such classification-based methods can suffer from the out-of-vocabulary (OOV) problem and cannot be directly generalized to new domains. The dialog policy learning task is often considered a reinforcement learning task. Nevertheless, unlike other well-known RL tasks, such as playing video games and Go, the training of dialog policy requires real humans to serve as the environment, which is very costly. Furthermore, most existing methods use manually defined rewards, such as task-completion rate and session turn number, which cannot reliably evaluate the performance of a system.
For end-to-end methods, the data-hungry nature of the vanilla sequence-to-sequence model makes it
difficult to learn the sophisticated slot filling mechanism in task-oriented dialog systems with a limited
amount of domain-specific data. The knowledge base query issue requires the model to generate an
intermediate query besides the encoder and the decoder, which is not straightforward. Another drawback
is that the encoder-decoder framework utilizes a word-level strategy, which may lead to sub-optimal
performance because the strategy and language functions are entangled together.
Based on the above analysis, we identify three key issues in task-oriented dialog systems, which we discuss in detail:
• Data Efficiency Most neural approaches are data-hungry, requiring a large amount of data to
fully train the model. However, in task-oriented dialog systems, the domain-specific data is often
hard to collect and expensive to annotate. Therefore, the problem of low-resource learning is one
of the major challenges.
• Multi-turn Dynamics The core feature of task-oriented dialog as compared to open-domain dialog is its emphasis on goal-driven multi-turn strategy. In each turn, the system action should be consistent with the dialog history and should guide the subsequent dialog toward a larger task reward. Nevertheless, model-free RL methods, which have shown superior performance on many tasks, cannot be directly applied to task-oriented dialog, due to the costly training environment and imperfect reward definition. Therefore, many solutions have been proposed to tackle these problems in multi-turn interactive training to learn a better policy, including model-based planning, reward estimation and end-to-end policy learning.
• Knowledge Integration A task-oriented dialog system has to query the knowledge base (KB)
to retrieve some entities for response generation. In pipeline methods, the KB query is mostly
constructed according to DST results. Compared to pipeline models, the end-to-end approaches