1. Tools built on open source annotated corpora work poorly on domain-specific CWS.
2. Annotated domain data is scarce due to the high cost of annotation.
3. How to leverage open source annotated data despite its generality remains an open question.
Recently, efforts have been made to exploit open source (high-resource) data to improve the performance of domain-specific (low-resource) tasks and to reduce the amount of domain annotated data required (Yang et al., 2017; Peng and Dredze, 2016; Mou et al., 2016). In this paper, we further this line of work by developing a multi-task learning (Caruana, 1997; Peng and Dredze, 2016) framework, named Adaptive Multi-Task Transfer Learning. Inspired by the success of Domain Adaptation (Saenko et al., 2010; Tzeng et al., 2014; Long and Wang, 2015b), we propose to minimize the distribution distance between the hidden representations of the source and target domains, thereby making the hidden representations adapt to each other and yielding domain-invariant features. Finally, we annotate 3 medical datasets from different medical departments and a medical forum, together with 3 open source datasets??. The contributions of this paper can be summarized as follows:
• We propose a novel framework for Chinese word segmentation in the medical domain.
• To the best of our knowledge, we are the first to analyze the performance of transfer learning methods against the degree of disparity between the target and source domains.
• Our framework outperforms strong baselines especially when there is substantial disparity.
• We open source 3 medical CWS datasets from different sources, which can be used for further study.
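To make the distribution-distance idea concrete, the sketch below computes a linear (mean-embedding) form of Maximum Mean Discrepancy, one common choice of adaptation loss in this line of work (e.g., Long and Wang, 2015b). This is an illustrative stand-in, not the paper's actual loss; the function name and toy vectors are our own.

```python
def linear_mmd(source_batch, target_batch):
    """Squared distance between the mean feature vectors of two batches.

    A simple (linear-kernel) instance of Maximum Mean Discrepancy:
    MMD^2 = || mean(source) - mean(target) ||^2.
    Minimizing such a term pushes source- and target-domain hidden
    representations toward a shared, domain-invariant distribution.
    """
    dim = len(source_batch[0])
    mean_s = [sum(v[d] for v in source_batch) / len(source_batch) for d in range(dim)]
    mean_t = [sum(v[d] for v in target_batch) / len(target_batch) for d in range(dim)]
    return sum((a - b) ** 2 for a, b in zip(mean_s, mean_t))

# Toy hidden states: matching batch means give zero penalty,
# a shifted target batch gives a positive penalty.
src = [[1.0, 0.0], [0.0, 1.0]]
tgt_same = [[0.5, 0.5], [0.5, 0.5]]
tgt_shifted = [[1.5, 0.5], [1.5, 0.5]]
print(linear_mmd(src, tgt_same))     # 0.0 — identical batch means
print(linear_mmd(src, tgt_shifted))  # 1.0 — means differ by (1.0, 0.0)
```

In practice the batches would be the Bi-LSTM hidden states of source- and target-domain sentences, and this term would be added to the tagging loss during multi-task training.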
2 Related Work
2.1 Chinese word segmentation
Statistical Chinese word segmentation has been studied for decades. Xue (2003) was the first to treat it as a sequence tagging problem, using a maximum entropy model. Peng et al. (2004) achieved better results with a conditional random field model (Lafferty et al., 2001). This approach has been followed by many subsequent works (Zhao et al., 2006; Sun et al., 2012).
Recently, neural network models have been applied to CWS. These methods use features automatically derived by neural networks instead of hand-crafted discrete features. Zheng et al. (2013) first adopted a neural network architecture for CWS. Chen et al. (2015b) used Long Short-Term Memory (LSTM) networks to capture long-term dependencies. Chen et al. (2015a) proposed a gated recursive neural network (GRNN) to incorporate context information. In this paper, we adopt the Bidirectional LSTM-CRF model (Huang et al., 2015) as our base model.
2.2 Transfer Learning
Transfer learning distills knowledge from the source domain to help the target domain achieve higher performance (Pan and Yang, 2010). In feature-based models, many transfer approaches have been studied, including instance transfer (Jiang and Zhai, 2007; Liao et al., 2005), feature representation transfer (Argyriou et al., 2006; Argyriou et al., 2007), parameter transfer (Lawrence and Platt, 2004; Bonilla et al., 2007) and relational knowledge transfer (Mihalkova et al., 2007; Mihalkova et al., 2009).
Recently, the transferability of neural networks has also been studied. For example, Mou et al. (2016) studied two methods (INIT, MULT) on NLP applications. Peng and Dredze (2016) proposed to use a domain mask and linear projection on top of multi-task learning (MTL) (Long and Wang, 2015a). In this paper, we follow MTL and extend the framework with a novel loss function.
3 Single-Task Chinese word segmentation
In this section, we briefly formulate the Chinese word segmentation task and introduce our base model,
Bi-LSTM-CRF (Huang et al., 2015).
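Following the sequence tagging formulation (Xue, 2003), CWS is typically cast as assigning each character one of the BMES labels: B(egin), M(iddle), or E(nd) of a multi-character word, or S(ingle) for a one-character word. The tagger (here, Bi-LSTM-CRF) then predicts these labels. The sketch below shows the conversion between segmentations and BMES tags; the function names and example sentence are our own illustration, not the paper's code.

```python
def words_to_bmes(words):
    """Convert a segmented sentence (list of words) to per-character BMES tags."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags

def bmes_to_words(chars, tags):
    """Recover the word segmentation from characters and their BMES tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends after an E or S tag
            words.append(current)
            current = ""
    if current:                # tolerate a truncated tag sequence
        words.append(current)
    return words

# Toy segmentation: 北京大学 / 生 / 前来 / 报到
words = ["北京大学", "生", "前来", "报到"]
tags = words_to_bmes(words)
print(tags)  # ['B', 'M', 'M', 'E', 'S', 'B', 'E', 'B', 'E']
print(bmes_to_words("".join(words), tags))
```

Under this encoding, the Bi-LSTM produces per-character label scores and the CRF layer models label-transition constraints (e.g., B must be followed by M or E), so decoding yields a valid segmentation.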