‘‘MoGCN’’ (Mixture of Gated Convolutional Neural Network)
Research on Chinese Ancient and Modern Writing Habits Based on Ergonomics
ABSTRACT
This paper addresses the respective issues and proposes an efficient automatic processing solution for NER on ancient Chinese data, including data-driven tagging and an innovative end-to-end network, ‘‘MoGCN’’ (Mixture of Gated Convolutional Neural Network).
Future work should focus on further exploration of NER optimization for massive traditional Chinese texts, incorporating linguistic features and learning strategies.
I. INTRODUCTION
Named Entity Recognition (NER) is a fundamental task that automatically extracts useful named entities from text, and it plays a crucial role in Natural Language Processing (NLP) and Information Retrieval (IR).
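As a concrete illustration, entity extraction is usually cast as sequence labeling over characters. The sketch below decodes entities from a BIO-tagged character sequence; the BIO scheme, the entity types, and the example text are assumptions for illustration only, since this excerpt does not specify the paper's tag set.

```python
def decode_bio(tokens, tags):
    """Collect (entity_text, entity_type) spans from per-character BIO tags.

    A hypothetical decoder for illustration: "B-X" opens an entity of type X,
    "I-X" continues it, anything else (e.g. "O") closes the current span.
    """
    entities, cur, cur_type = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if cur:
                entities.append(("".join(cur), cur_type))
            cur, cur_type = [tok], tag[2:]
        elif tag.startswith("I-") and cur_type == tag[2:]:
            cur.append(tok)
        else:
            if cur:
                entities.append(("".join(cur), cur_type))
            cur, cur_type = [], None
    if cur:
        entities.append(("".join(cur), cur_type))
    return entities


# Hypothetical example: "司马迁著史记" (Sima Qian wrote the Shiji)
spans = decode_bio(list("司马迁著史记"),
                   ["B-PER", "I-PER", "I-PER", "O", "B-BOOK", "I-BOOK"])
# → [("司马迁", "PER"), ("史记", "BOOK")]
```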
However, this task encounters two major problems. First, the distinction of text styles in diachronic corpora [4] brings great difficulty to entity identification, since different types of ancient texts differ considerably in entity content and in the complexity of their Chinese character distributions.
To tackle these problems, we propose a novel entity extraction approach designed specifically for Chinese historical texts: a prior semi-automatic procedure that constructs an annotated corpus through database-oriented mapping rules, followed by a novel supervised sequential tagger built on Deep Learning (DL) architectures.
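The database-oriented mapping rules are not detailed in this excerpt. One plausible minimal reading is longest-match projection of a curated entity database onto raw text; the function name, the lexicon, and the longest-match policy below are all assumptions, not the authors' procedure.

```python
def annotate_with_lexicon(text, lexicon):
    """Project a database of known entities onto raw text as BIO tags.

    lexicon: {entity_string: entity_type}. Longer entries are matched first
    so that e.g. a three-character name wins over a two-character substring.
    """
    tags = ["O"] * len(text)
    for name in sorted(lexicon, key=len, reverse=True):
        start = 0
        while (i := text.find(name, start)) != -1:
            # only tag spans not already claimed by a longer match
            if all(t == "O" for t in tags[i:i + len(name)]):
                tags[i] = "B-" + lexicon[name]
                for j in range(i + 1, i + len(name)):
                    tags[j] = "I-" + lexicon[name]
            start = i + 1
    return tags
```

The resulting silver-standard tags would then be reviewed by annotators, which is what makes the procedure semi-automatic.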
One innovation of this paper is the introduction of a more effective deep block, ‘‘MoGCN’’ (Mixture of Gated Convolutional Neural Network), which applies a gated mechanism to the traditional local representation of multiple character-based embeddings.
II. RELATED WORK
The major approaches to NER for Chinese historical texts have focused on handcrafted heuristic rules [6]–[11], which quickly formulate entity features from context-derived patterns but depend heavily on relevant domain knowledge.
B. DL-BASED NEURAL NETWORKS FOR NER TASKS
Recent advances in NER have utilized deep neural networks to model Chinese texts, demonstrating a more powerful capacity for feature abstraction. These methods can be divided into two main branches: RNN (Recurrent Neural Network)-based and CNN (Convolutional Neural Network)-based NER models.
In this paper, we offer a holistic solution to this research gap with an automatically constructed labeled corpus, extracting historical entities from unstructured raw texts.
III. PROPOSED FRAMEWORK
The overall architecture of the framework in this paper consists of two major phases: the semi-automatic construction of an annotated historical corpus and a supervised DL-based sequential model for NER, as shown in Figure 1.
However, the difference is that we replace the original CNN with a self-defined stacked module, ‘‘MoGCN’’ (Mixture of Gated Convolutional Neural Network), which exhibits a stronger fitting capability. To capture rich semantic representations, MoGCN blocks employ multiple kernels of varying widths, producing robust mappings from the embedding vectors into diverse hidden neural spaces, as demonstrated in Figure 2.
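The multi-kernel gated convolution described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the kernel widths, random weight initialization, and the `gated_conv_mixture` name are all assumptions, and the sigmoid gate follows the gated mechanism the section describes.

```python
import numpy as np


def conv1d_same(x, w):
    """1-D convolution with 'same' padding. x: (T, d_in), w: (k, d_in, d_out)."""
    k = w.shape[0]
    left = k // 2
    xp = np.pad(x, ((left, k - 1 - left), (0, 0)))
    return np.stack([np.tensordot(xp[t:t + k], w, axes=([0, 1], [0, 1]))
                     for t in range(x.shape[0])])


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gated_conv_mixture(x, widths, d_out, rng):
    """Run one gated convolution per kernel width, then concatenate.

    Each width yields a content map A and a gate map B over the same
    character embeddings; the gate modulates A element-wise, and the
    per-width outputs are concatenated into one hidden representation.
    """
    outs = []
    for k in widths:
        Wa = 0.1 * rng.standard_normal((k, x.shape[1], d_out))  # content filter
        Wb = 0.1 * rng.standard_normal((k, x.shape[1], d_out))  # gate filter
        A, B = conv1d_same(x, Wa), conv1d_same(x, Wb)
        outs.append(A * sigmoid(B))  # gated output for this kernel width
    return np.concatenate(outs, axis=-1)  # (T, d_out * len(widths))
```

Concatenating across widths is one plausible way to mix the kernels; the paper's exact combination rule is not given in this excerpt.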
As such, we develop a more powerful block, the Dilated Residual Network (DRN), which combines the strengths of Dilated CNNs (DCN) and residual blocks.
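A rough sketch of combining dilation with residual connections, under stated assumptions: the kernel size of 3, the tanh nonlinearity, the dilation-rate schedule, and the `drn_block` name are illustrative choices, not details from the paper.

```python
import numpy as np


def dilated_conv1d(x, w, rate):
    """Dilated 1-D convolution with 'same' padding. x: (T, d), w: (k, d, d).

    Taps are spaced `rate` positions apart, so the receptive field grows
    without adding parameters: effective span = (k - 1) * rate + 1.
    """
    k = w.shape[0]
    span = (k - 1) * rate
    left = span // 2
    xp = np.pad(x, ((left, span - left), (0, 0)))
    return np.stack([np.tensordot(xp[t:t + span + 1:rate], w, axes=([0, 1], [0, 1]))
                     for t in range(x.shape[0])])


def drn_block(x, rates, rng):
    """Stack dilated convolutions, each wrapped in a residual connection."""
    h = x
    for r in rates:
        W = 0.1 * rng.standard_normal((3, h.shape[1], h.shape[1]))
        h = h + np.tanh(dilated_conv1d(h, W, r))  # residual skip keeps gradients flowing
    return h
```

Exponentially growing rates (1, 2, 4, ...) are the common convention for dilated stacks, letting a few layers cover long character contexts.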
Moreover, the Gated Linear Unit (GLU) [46] is used in place of a conventional activation function (i.e., tanh or sigmoid); its gated mechanism controls the non-linearity of the activation.
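The GLU of [46] computes GLU(X) = (XW + b) ⊗ σ(XV + c), where σ is the sigmoid: a linear path modulated element-wise by a learned gate. A minimal sketch (the parameter shapes are illustrative):

```python
import numpy as np


def glu(x, W, V, b, c):
    """GLU(X) = (XW + b) * sigmoid(XV + c).

    The linear half passes gradients through un-squashed, while the
    sigmoid gate decides, per element, how much of it to let through.
    """
    return (x @ W + b) * (1.0 / (1.0 + np.exp(-(x @ V + c))))
```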
IV. EXPERIMENTS
D. EVALUATION AND ANALYSIS
As shown in Table 4, our method (Avg. F1 = 86.99%) achieves the top performance among all the models, an improvement of 1.5%–11%. Notably, our model outperforms the previously best reported model (LSTM-CNNs-CRF) by 1.5%. In addition, to test model robustness, we run sub-experiments on the three sub-datasets; the results are provided in Table 5.