essentially composed of multiple channels: hand location, handshape, hand orientation, hand movement, eye gaze, head tilt, shoulder tilt, body gesture, and facial expression. Linguistic meaning in sign language is described and represented by these information channels. This multi-channel nature of CSL makes it difficult to encode sign languages as a linear, single-channel character string. Scholars abroad hold that sign languages also have writing systems, such as the SignWriting system [6], ASL-phabet [7], and HamNoSys [8]. However, these writing systems have a narrow range of users. A report circulated on the SignWriting mailing list (sw-l@majordomo.valenciacc.edu) indicates that at least 14 schools worldwide currently use the system. This narrow user base partly reflects the fact that the deaf community is small in scale and may be shrinking because of demographic, political, technological, and other factors [9].
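To make the multi-channel encoding problem concrete, the following is a minimal Python sketch of how a single sign could be represented as parallel channels; the channel value types, the frame-based structure, and the naive linearization function are illustrative assumptions rather than any established coding scheme.

from dataclasses import dataclass, field
from typing import List

# One time frame of a sign, with the channels listed above.
# Plain strings as channel values are a simplifying assumption.
@dataclass
class SignFrame:
    hand_location: str       # e.g., "chest"
    handshape: str           # e.g., "flat"
    hand_orientation: str    # e.g., "palm-up"
    hand_movement: str       # e.g., "arc-right"
    eye_gaze: str            # e.g., "addressee"
    head_tilt: str           # e.g., "neutral"
    shoulder_tilt: str       # e.g., "left"
    body_gesture: str        # e.g., "lean-forward"
    facial_expression: str   # e.g., "raised-brows"

@dataclass
class Sign:
    gloss: str
    frames: List[SignFrame] = field(default_factory=list)

def linearize(sign: Sign) -> str:
    # Naive flattening of the parallel channels into one string.
    # The temporal alignment between channels is collapsed, which is
    # exactly the information loss discussed above.
    return "|".join(";".join(vars(f).values()) for f in sign.frames)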
Even if CSL had a writing system, its multi-channel nature would inevitably cause such a system to lose many linguistic details. Ideally, sign language understanding should map the visual-spatial characteristics of signing directly onto semantic units in the brain, rather than first transcribing signs into written text and then converting the text into semantics. This is the most natural way in which the brain comprehends sign languages. From this perspective, this paper presents a computational cognitive model for CSL comprehension based on the cognitive functionalities of the human brain combined with a knowledge representation theory from artificial intelligence. We further argue that an ideal model should be quantitative and expressible in a programmable form.
The rest of this paper is organized as follows: First, the background of CSL understanding is presented to provide a brief introduction to the problems of CSL information processing. Second, we present a systematic review of psychological and neurophysiological studies whose converging evidence uncovers the cognitive and neural mechanisms of sign language comprehension in the human brain. Third, a computational model for CSL comprehension based on the cognitive mechanism of sign language is proposed. Fourth, the relevant meanings of a sign are treated as nodes within a semantic neural network, and the relevance between each meaning and the corresponding sign is regulated using spreading activation theory, as sketched below. Finally, the conclusion and future work are given in the last section.
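As a concrete preview of the spreading activation mechanism referred to in the fourth point above, the following is a minimal Python sketch over a toy semantic network; the node names, association weights, decay factor, and threshold are illustrative assumptions, not parameters of the model itself.

from collections import defaultdict

# Toy semantic network: node -> list of (neighbor, association weight).
# All nodes and weights are invented for illustration only.
NETWORK = {
    "SIGN:tree": [("plant", 0.9), ("wood", 0.6)],
    "plant": [("green", 0.7), ("grow", 0.5)],
    "wood": [("furniture", 0.4)],
    "green": [], "grow": [], "furniture": [],
}

def spread_activation(source, decay=0.8, threshold=0.1):
    # Propagate activation from a perceived sign into the network;
    # meanings whose activation exceeds the threshold remain as
    # candidate interpretations of the sign.
    activation = defaultdict(float)
    activation[source] = 1.0
    frontier = [source]
    while frontier:
        node = frontier.pop()
        for neighbor, weight in NETWORK.get(node, []):
            new_act = activation[node] * weight * decay
            if new_act > threshold and new_act > activation[neighbor]:
                activation[neighbor] = new_act
                frontier.append(neighbor)
    return dict(activation)

print(spread_activation("SIGN:tree"))
# e.g. {'SIGN:tree': 1.0, 'plant': 0.72, 'wood': 0.48, 'furniture': 0.15,
#       'green': 0.40, 'grow': 0.29}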
2 SIGN LANGUAGE PERCEPTION AND COMPREHENSION
IN THE HUMAN BRAIN
An important source of inspiration for research on natural language processing is the cognitive mechanism of the brain. Humans have abundant sensory organs, and the brain can thus abstract overall knowledge of a language from the perceived information and then complete language understanding, thereby realizing more complex intellectual activities. The brain transmits and exchanges on the order of 1 PB of data about one trillion times every second, and it processes sound, sign, image, and other data synchronously. The human brain is clearly an inborn natural language processor [10].
Computational cognitive models of how the human brain comprehends sign language are much desired in both computational linguistics and cognitive computation. This section summarizes the psychological and neurophysiological findings on sign language perception and comprehension in the human brain. The summary follows the cognitive process [11], covering the mental processes of perception, memory, and judgment. A computational cognitive model is developed in Section 3 on the basis of these findings.
A. Perception
Sign language exploits visual-spatial mechanisms to
express grammatical structure and function. Visual-
spatial perception, memory, and mental transformations
are prerequisites to grammatical processing in American
Sign Language [12], and are also central to visual mental
imagery [13].
A series of experiments has been conducted to investigate visual attention [14]. Movement recognition in peripheral vision is important in sign perception because signers mainly look at the face instead of tracking the hands when they communicate through sign language [15]. Therefore, lexical identification depends on peripheral vision when signs are produced away from the face. The recognition of movement direction seems to be a selective function of peripheral vision [16]. At present, whether deaf subjects merely have a stronger ability to perceive the periphery of the visual field or are more efficient at allocating attention to peripheral vision remains unclear. Literature [17] showed that auditory deprivation can change visual attention processing. The authors determined that deaf subjects did not shift their attention when processing information (an alphabet set) presented in central vision, whereas hearing subjects had to shift their attention continuously to search for the alphabet set. Similarly, Literature [18] determined that the lack of auditory input causes weaker selective (or more distributed) visual attention among deaf children. Literature [17] proposed that intermodal sensory compensation results in more effective visual processing; that is, the strong allocation of visual attention can be attributed to neural reorganization caused by auditory deprivation from birth. Recent MRI evidence supports this hypothesis [19].
These findings are cases of selective attention, in which attention selectively processes certain stimuli and ignores others. Selective attention refers to the selective orientation and concentration of the senses (i.e., visual, auditory, and gustatory) and consciousness (i.e., awareness and thinking) on certain targets synchronously, while other factors are ignored. Studies on attention cannot