J. Shanghai Jiaotong Univ. (Sci.), 2015, 20(1): 1-6
DOI: 10.1007/s12204-009-0501-3
Sentiment Analysis for Chinese Text Based on
Emotion Degree Lexicon and Cognitive Theories
WU Xing∗ (武 星), LÜ Hai-tao (吕海涛), ZHUO Shao-jian (卓少剑)
(School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China)
© Shanghai Jiaotong University and Springer-Verlag Berlin Heidelberg 2015
Abstract: The mass data generated by users of social media and social networks play an important role in tracking
users’ sentiments and opinions online. A good polarity lexicon, which can effectively improve the classification
results of sentiment analysis, is indispensable for analyzing users’ sentiments. Inspired by social cognitive theories,
we combine a basic emotion value lexicon and a social evidence lexicon to improve the traditional polarity lexicon. The
proposed method obtains significant improvement in Chinese text sentiment analysis by using the proposed lexicon
and a new syntactic analysis method.
Key words: Chinese text, sentiment analysis, emotion lexicon, social cognitive theory, emotion tendency
CLC number: TP 301.2 Document code: A
0 Introduction
Sentiment analysis of users is of great importance in commodity promotion and marketing. There
have been several studies concerning sentiment analysis. For example, Eirinaki et al.[1] presented
an algorithm which not only analyzes the overall sentiment of a document/review, but also identifies
the semantic orientation of specific components of the review which lead to a particular sentiment.
Zhai et al.[2] modeled classic sentiment analysis methods as a semi-supervised learning problem, in
which lexical characteristics of the problem were exploited to automatically identify some labeled
examples.
Present sentiment analysis methods mainly focus on emotion tendency analysis[3-4]. Generally, text
polarity can be classified into three classes: positive, negative and neutral. Sentiment analysis
usually consists of subjectivity classification, emotion polarity, semantic orientation, opinion
mining, emotion analysis, opinion extraction and emotion summarization[5-6].
Received date: 2014-05-16
Foundation item: the National Natural Science Foundation of China (No. 61303094), the Doctoral Fund of
Ministry of Education of China (No. 20123108120027), the Program of Science and Technology Commission of
Shanghai Municipality (No. 14511107100), the Shanghai Leading Academic Discipline Project (No. J50103),
and the Innovation Program of Shanghai Municipal Education Commission (No. 14YZ024)
∗E-mail: xingwuvip@aliyun.com
The purpose of sentiment analysis is to find user reviews and emotion polarity from texts. User
reviews are of great significance: they can attract potential customers, help sellers make decisions,
and provide product feedback. They can also be used to make predictions for political elections and
other major events. In addition, sentiment analysis technology contributes to research in other
natural language processing (NLP) fields, for instance, automatic text summarization and question
answering systems. In the sentiment analysis field, there are two methods: combining rules with an
emotional dictionary[7] and using machine learning technology[8]. In the former method, texts are
classified by using positive emotional words and negative emotional words. The latter method uses
naive Bayes, maximum entropy or support vector machine classifiers to classify texts.
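As an illustration of the dictionary-based method, the following minimal sketch counts positive and negative emotional words in a text and maps the result to one of the three polarity classes. It is not the method proposed in this paper: the two word sets are hypothetical toy lexicons, the function name classify_polarity is our own, and the input is assumed to be already segmented into words.

```python
from typing import Iterable

# Hypothetical toy lexicons for illustration only; a real system would
# load a full polarity lexicon instead of these few words.
POSITIVE_WORDS = {"好", "喜欢", "满意"}
NEGATIVE_WORDS = {"差", "讨厌", "失望"}


def classify_polarity(tokens: Iterable[str]) -> str:
    """Classify a segmented text as 'positive', 'negative' or 'neutral'.

    Each positive emotional word adds 1 to the score and each negative
    emotional word subtracts 1; the sign of the total gives the class.
    """
    score = 0
    for token in tokens:
        if token in POSITIVE_WORDS:
            score += 1
        elif token in NEGATIVE_WORDS:
            score -= 1
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    # Assumes word segmentation has already been performed.
    print(classify_polarity(["我", "喜欢", "这个", "手机"]))
    print(classify_polarity(["质量", "差", "失望"]))
```

Such a counting rule ignores negation and degree words, which is one reason why dictionary-based methods are usually combined with additional rules.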
At present, the majority of sentiment analysis research focuses on the English language. Recent
studies, however, show that non-native English speakers heavily support the growing use of network
media, and the Internet has developed rapidly in China in recent years. Natural language processing
research depends primarily on the availability of resources such as lexicons and corpora, but these
are still very limited for sentiment analysis research in the Chinese language. Cambria et al.[9]
developed a Chinese common and common-sense knowledge base for sentiment analysis by blending the
largest existing taxonomy of English common knowledge, and by using machine translation techniques
to effectively translate its content into Chinese. However, the difference between English and
Chinese in grammar is enormous, so the approach proposed by Cambria et al. cannot effectively
classify Chinese text.