In sentiment analysis of product reviews, one important problem
is to produce a summary of opinions based on product
features/attributes (also called aspects). However, people often
express the same feature with many different words or
phrases. To produce a useful summary, these words and phrases,
which are domain synonyms, need to be grouped under the same
feature group. Although several methods have been proposed to
extract product features from reviews, limited work has been done
on clustering or grouping of synonym features. This paper focuses
on this task. Classic methods for solving this problem are based
on unsupervised learning using some form of distributional
similarity. However, we found that these methods do not perform well.
We therefore model the task as a semi-supervised learning problem. Lexical
characteristics of the problem are exploited to automatically
identify some labeled examples. Empirical evaluation shows that
the proposed method outperforms existing state-of-the-art
methods by a large margin.
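
To make the notion of distributional similarity concrete, the following is a minimal sketch of one common form: two feature expressions are compared by the cosine similarity of their context-word count vectors. It is an illustration only, not the method proposed or evaluated in this paper. The toy reviews, the window size, and the function names `context_vector` and `cosine` are all assumptions introduced for this example, and only single-word expressions are handled.

```python
import math
from collections import Counter


def context_vector(term, sentences, window=2):
    """Count the words that occur within `window` tokens of `term`.

    Hypothetical helper: handles single-token terms and whitespace
    tokenization only.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok != term:
                continue
            lo, hi = max(0, i - window), i + window + 1
            # Add every neighbor in the window except the term itself.
            counts.update(
                t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
            )
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy data (assumed, not taken from any real review corpus).
reviews = [
    "the picture is very clear",
    "the photo is very clear",
    "the battery is too weak",
]

picture = context_vector("picture", reviews)
photo = context_vector("photo", reviews)
battery = context_vector("battery", reviews)

sim_picture_photo = cosine(picture, photo)
sim_picture_battery = cosine(picture, battery)
```

On this toy data, "picture" and "photo" share identical contexts and therefore score higher than "picture" and "battery". A purely unsupervised grouping step would cluster expressions using only scores of this kind.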