A Product Features Mining Method based on
Association Rules and the Degree
of Property Co-occurrence
Shaoning Shi
College of Mathematics & Computer Science
Baoding, Hebei, China
Shishaoning521@163.com
Yu Wang
College of Mathematics & Computer Science
Baoding, Hebei, China
wy@hbu.edu.cn
Abstract—With the rapid development of e-commerce, mining
online consumer reviews has become increasingly important.
This paper proposes a product features mining method based on
association rules and the degree of property co-occurrence.
The method first mines candidate features with the Apriori
algorithm, then prunes them using the degree of property
co-occurrence between specifiers and product features, and
finally mines non-frequent features through opinion words to
supplement the Apriori algorithm. Experimental results show
that the method is effective.
Keywords-association rules; property co-occurrence;
candidate features; pruning
I. INTRODUCTION
Online shopping has gradually become an indispensable part of
daily life. The accompanying consumer reviews are not only a
way for people to express likes or dislikes but also an
important reference for potential buyers deciding whether to
buy a product, and the latter directly affects sales.
Therefore, more and more businesses pay attention to online
consumer reviews and use them to improve their products and
services. DoubleClick Inc. studied the U.S. apparel, computer
hardware, and sports and fitness products industries [1]; the
results show that more than 50% of Internet users searched
online for product information and product reviews before
making a purchase.
However, the number of online product reviews is huge and the
reviews are unstructured, so it is difficult to extract useful
information from them. The contradiction between the mass of
reviews and the limited capacity for manual reading has become
a major challenge for researchers. How to mine product reviews
automatically has become an important issue, and product
features mining is one of its most critical technologies.
This paper proposes a product features mining method based on
association rules and the degree of property co-occurrence. We
obtain the candidate features with the Apriori algorithm and
introduce point-wise mutual information (PMI) into the pruning
step: we calculate the PMI between each candidate feature and
the specifier and filter out the candidates whose value is
below the threshold. We then mine the non-frequent features
through opinion words to supplement the Apriori algorithm, so
that we obtain more comprehensive and accurate product
features.
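The first two steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each review sentence is assumed to be already reduced to a set of noun tokens, the specifier is assumed to be a single token (here the hypothetical "phone"), probabilities are estimated from sentence-level co-occurrence counts, and the function names `frequent_itemsets`, `pmi`, and `prune` as well as the threshold values are invented for the example.

```python
import math
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Level-wise (Apriori-style) search for itemsets meeting min_support."""
    sets = [frozenset(t) for t in transactions]
    n = len(sets)
    # Level 1: frequent single items.
    counts = Counter(item for t in sets for item in t)
    frequent = {frozenset([i]): c / n
                for i, c in counts.items() if c / n >= min_support}
    result = dict(frequent)
    k = 2
    while frequent and k <= max_size:
        items = sorted({i for s in frequent for i in s})
        # Apriori property: keep a k-item candidate only if every
        # (k-1)-item subset was frequent at the previous level.
        candidates = [
            frozenset(c) for c in combinations(items, k)
            if all(frozenset(sub) in frequent
                   for sub in combinations(c, k - 1))
        ]
        frequent = {}
        for cand in candidates:
            support = sum(1 for t in sets if cand <= t) / n
            if support >= min_support:
                frequent[cand] = support
        result.update(frequent)
        k += 1
    return result


def pmi(feature, specifier, transactions):
    """log2 of p(feature, specifier) / (p(feature) * p(specifier))."""
    n = len(transactions)
    f = sum(1 for t in transactions if feature in t)
    s = sum(1 for t in transactions if specifier in t)
    both = sum(1 for t in transactions if feature in t and specifier in t)
    if both == 0:
        # The pair never co-occurs, so it can never pass a finite threshold.
        return float("-inf")
    return math.log2((both / n) / ((f / n) * (s / n)))


def prune(candidates, specifier, transactions, threshold):
    """Keep the candidates whose PMI with the specifier reaches the threshold."""
    return [c for c in candidates
            if pmi(c, specifier, transactions) >= threshold]


# Toy data (assumption): one set of noun tokens per review sentence.
sentences = [
    {"phone", "screen"},
    {"phone", "screen", "battery"},
    {"phone", "battery"},
    {"friend", "gift"},
    {"phone", "screen"},
    {"friend", "weather"},
]
freq = frequent_itemsets(sentences, min_support=0.3)
singles = sorted(next(iter(s)) for s in freq if len(s) == 1)
candidates = [w for w in singles if w != "phone"]
kept = prune(candidates, "phone", sentences, threshold=0.0)
```

On this toy data "friend" is frequent enough to become a candidate but never co-occurs with the specifier, so the pruning step drops it while "battery" and "screen" are kept. The opinion-word step for non-frequent features is not sketched here.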
II. RELATED WORKS
Product features mining has become an active research topic,
and work on English reviews has achieved certain results. Liu
et al. [2] first proposed using association rules to mine
product features in 2004; they mined product features for
mobile phones, MP3 players, and DVD players and obtained
comprehensive product features, but the accuracy was markedly
low. Kobayashi et al. [3] extract product features and
consumer opinions through a semi-automated method, but this
method requires considerable manual effort. Popescu et al. [4]
built the OPINE system on KnowItAll (Etzioni et al., 2005),
which obtains product features by calculating point-wise
mutual information. It achieves higher precision than Liu's
method, but its recall is lower.
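For reference, the point-wise mutual information between two terms x and y is commonly defined as shown below, where p(x, y) is the probability that the two terms co-occur and p(x) and p(y) are their individual probabilities. This is the standard textbook definition; how OPINE estimates these probabilities is not described in this section.

```latex
\mathrm{PMI}(x, y) = \log \frac{p(x, y)}{p(x)\, p(y)}
```

A positive value means that x and y co-occur more often than they would if they were independent, which is why a threshold on this value can serve as a filter.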
Because of differences in cultural background, grammar, and
terminology between English and Chinese, the product features
mining methods for English cannot be applied directly to
Chinese, so Chinese product features mining methods have
gradually appeared. Yao et al. [5] use an ontology to extract
the product features of automobiles and identify product
features of different granularity with a dictionary, but this
method requires a lot of manual participation. Li Shi et al.
[6] adapt an algorithm developed for English to Chinese and
extract the product features based on association rules. The
method obtains higher recall, but its precision is markedly
lower, and the pruning methods they propose involve
substantial manual participation. Li Shi et al. [7] further
improve their method by combining an unsupervised algorithm
based on Apriori with supervised opinion analysis technology
to extract product features and consumer opinions. This method
does not need to establish a product property model, so it is
more suitable for mining product features. Although the method
obtains more comprehensive product features, the precision is
still not ideal.
III. MINING PRODUCT FEATURES
This paper proposes a new Chinese product features mining
method based on association rules and the degree of property
co-occurrence. There are
2011 International Conference on Computer Science and Network Technology
978-1-4577-1587-7/11/$26.00 ©2011 IEEE