Team # 2002116 Page 3 of 38
1 Introduction
Our society has witnessed the rise of many online marketplaces, with a total worldwide
market value of 4.3 trillion dollars [1]. One salient feature of online marketplaces, compared
with traditional platforms, is the massive volume of review texts and ratings. Among these
marketplaces, Amazon has received the most attention, owing to its great success [1]. Amazon
also lets customers freely express their opinions and rate the products they have purchased.
Previous work [2] indicates that customers rely heavily on reviews and ratings before
buying a product on these platforms. Platforms, in turn, can adjust their sales strategies by
examining these comments. Hence, ratings and reviews serve both as a reference for other
potential buyers and as a large body of data for analyzing customer demand, which can help
platforms develop adaptive strategies. Making full use of these data can create a win-win
situation for both the buyers and the platform.
One of the biggest challenges is the complexity and diversity of the review texts [3, 4].
In this paper, we propose a novel sentiment analysis model as a text-based measure to address
this issue. We then develop a series of models that combine text-based, rating-based, and
time-based measures to pick out the most informative ratings and reviews to track.
We also construct a novel evaluation framework to quantify the reputation of each product
and predict its potential success or failure. Then, we analyze the correlation among runs of
consecutive identical star ratings, word descriptors, and the reputation of the products. We
apply our model to a real data set covering three types of products: pacifiers, microwaves,
and hair dryers.
Researchers have pointed out the need to study when and how online platforms should
adjust their marketing communication strategies in response to consumer reviews or ratings [5].
Based on our analysis and results, we propose several sales strategies and recommendations in
this paper.
The rest of the paper is organized as follows. In section 2, we list the main assumptions in
our model construction and introduce the notation used frequently in this paper.
In section 3, a novel Information Evaluation Model is proposed. It is built on a hybrid of the
state-of-the-art CE [6] and VADER [7] methods for sentiment analysis of the review text. We
then propose the "importance" rate, a combination of a text-based measure (i.e., our proposed
CE-VADER model) and ratings-based measures (i.e., the star rating and the helpful votes), to
indicate how informative a review and its rating are. To the best of our knowledge, we are the
first to propose a review-text-based sentiment analysis model. In section 4, we employ a difference
equation model as the backbone to measure the time pattern of each product. Moreover, the
"reputation" rate is proposed in this section to measure the growth or decline of a product's
reputation. In section 5, we employ an autoregressive (AR) model to predict the change in
reputation over future time periods and propose a fuzzy system to predict the potential success
or failure of each product. More details about the results of applying our model to the given
data can be found in sections 6, 7, and 8. The strengths and weaknesses of the proposed model
and framework are discussed in section 9. We conclude in section 10. All source code is attached
in Appendix D-I and can be easily applied to other data sets.