Abstract
Amid fierce competition and diversified market demands in today's film industry,
accurately predicting movie box office performance has become a significant concern
for both film production companies and investors. This paper applies three machine
learning methods, namely linear regression, K-nearest neighbors (KNN), and support
vector regression (SVR), to forecast movie box office performance and analyzes the
prediction results in depth.
First, we introduce the background and significance of movie box office
prediction. Box office performance not only determines the profits of production
companies but also reflects the demand of the movie market, and its uncertainty
complicates investors' decision-making. Accurate box office prediction is therefore
of great importance for the industry's development and investors' decisions.
Second, we elaborate on the predictive models and methods employed. The linear
regression model predicts box office performance by fitting a linear function that
assigns a weight to each influencing factor. The K-nearest neighbors algorithm
predicts from the box office data of similar movies, on the assumption that similar
movies perform similarly at the box office. Support vector regression fits a
regression function, which corresponds to a hyperplane in a high-dimensional feature
space, to predict box office performance.
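As a rough illustration of the first two ideas, the following minimal sketch
implements KNN regression and single-feature least squares in plain Python. The toy
data, the feature choice (budget and screen count), and the function names are
assumptions made for this example and are not taken from the study. SVR is left out
because it requires a numerical optimizer.

```python
import math

# Hypothetical toy data: (budget in $M, opening screens in thousands) -> box office in $M.
# These numbers are invented for illustration only.
TRAIN = [
    ((10.0, 1.0), 25.0),
    ((20.0, 2.0), 50.0),
    ((30.0, 3.0), 75.0),
    ((40.0, 4.0), 100.0),
]


def knn_predict(train, query, k=2):
    """Average the targets of the k training points closest to `query`."""
    by_distance = sorted(train, key=lambda row: math.dist(row[0], query))
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / len(nearest)


def fit_simple_linear(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

In practice the features would usually be scaled before computing distances, since
KNN is sensitive to the units of each feature.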
Subsequently, we describe the experimental design and data collection process in
detail. We collected data on movie genres, cast members, directors, release dates,
and other factors, then preprocessed the data and engineered features to meet the
requirements of the different models.
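One common form of such feature engineering can be sketched as follows: categorical
fields are one-hot encoded and the release date is reduced to a numeric month. The
record layout, field names, and values below are hypothetical and serve only to
illustrate the idea, not the preprocessing actually used in the study.

```python
from datetime import date

# Hypothetical records; field names and values are invented for illustration.
MOVIES = [
    {"genre": "action", "release": date(2020, 7, 3)},
    {"genre": "drama", "release": date(2021, 12, 17)},
]


def encode(movies):
    """Turn each record into a numeric vector: one-hot genre plus release month."""
    genres = sorted({m["genre"] for m in movies})
    rows = []
    for m in movies:
        one_hot = [1.0 if m["genre"] == g else 0.0 for g in genres]
        rows.append(one_hot + [float(m["release"].month)])
    return genres, rows
```

Calling `encode(MOVIES)` returns the sorted genre vocabulary together with one numeric
row per movie, which is a form all three models can consume.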
Following that, we present the experimental results and analysis. By comparing
the predictive performance of different models, we find that linear regression performs
well in certain cases, while KNN and SVR are more accurate in others. Additionally,