2023 MCM/ICM Outstanding Winner Paper, Problem C, Team 2311035

Problem Chosen: C

2023 MCM/ICM Summary Sheet

Team Control Number: 2311035

Winners in Wordle

Summary

Wordle is a phenomenally popular online game whose release drew intense public attention. Although the game looks small, the data behind it is rich and meaningful. Capturing and understanding this information will help the New York Times better design and operate Wordle.

We built three models to complete the tasks. Model I uses an LSTM to forecast the future number of reported scores. Model II uses seven XGBoost regressors to predict the percentage distribution for a given word. Model III classifies words by difficulty using an SVM with an RBF kernel. Based on these three models, we offer some advice to help improve Wordle. The specific details are as follows:

Model I: LSTM is an improved recurrent neural network designed to handle the long-range dependencies that plain recurrent networks struggle with. We trained the model on the preprocessed series of reported-score counts and used an iterative method to predict the number up to March 1st, 2023. After 150 independent training runs, the prediction interval is [20745.72, 22914.74]. Additionally, a linear regression of the hard-mode proportion on word attributes shows no correlation between the hard-mode ratio and the target word.
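The iterative prediction described for Model I can be sketched as a loop that feeds each one-step forecast back into the input window. This is only an illustration under stated assumptions: the summary does not give the network architecture, the window length, or the rule used to turn the 150 runs into an interval. Here a plain moving average stands in for the trained LSTM, and `recursive_forecast`, `interval_from_runs`, the `window` parameter, and the "mean plus or minus k standard deviations" rule are all hypothetical choices. The numbers are toy values, not Wordle data.

```python
import statistics
from typing import Callable, List, Sequence, Tuple


def recursive_forecast(
    history: Sequence[float],
    predict_next: Callable[[Sequence[float]], float],
    window: int,
    steps: int,
) -> List[float]:
    """Forecast `steps` values ahead, feeding each prediction back as input."""
    series = list(history)
    forecasts: List[float] = []
    for _ in range(steps):
        # The one-step model only ever sees the most recent `window` values,
        # which after the first step include its own earlier predictions.
        nxt = predict_next(series[-window:])
        forecasts.append(nxt)
        series.append(nxt)
    return forecasts


def interval_from_runs(final_values: Sequence[float], k: float = 2.0) -> Tuple[float, float]:
    """Assumed rule: mean +/- k sample standard deviations across independent runs."""
    centre = statistics.mean(final_values)
    spread = statistics.stdev(final_values)
    return (centre - k * spread, centre + k * spread)


def mean_predictor(recent: Sequence[float]) -> float:
    """Stand-in for a trained one-step model (NOT an LSTM): average of the window."""
    return sum(recent) / len(recent)


history = [30.0, 28.0, 26.0, 24.0]  # toy series
forecast = recursive_forecast(history, mean_predictor, window=2, steps=3)
print(forecast)  # [25.0, 24.5, 24.75]

# Each independent training run would contribute one final-day forecast.
low, high = interval_from_runs([10.0, 12.0, 14.0])  # toy values
```

In a real setup, `predict_next` would wrap a trained network, and each of the independent runs would produce its own forecast for the target date before the interval is computed. Because later steps depend on earlier predictions, errors can accumulate over the horizon.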

Model II: To get the percentage distribution for a given day associated with a specific word, we trained seven separate XGBoost models. The R² of our model is 0.68, and testing indicates that its predictions carry low uncertainty. We apply the model to "EERIE" and obtain a predicted percentage distribution, which shows that EERIE should be considered a problematic word.
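To make the Model II setup more concrete, the sketch below shows two pieces that surround the regressors: turning a word into numeric attributes, and rescaling seven independently predicted percentages so they sum to 100. Both are assumptions for illustration. The summary does not list the features the team used, and it does not say whether the seven outputs were renormalized; `word_features` and `normalize_distribution` are hypothetical helpers, and the raw values passed to the second one are made-up numbers, not model output.

```python
from collections import Counter
from typing import Dict, List, Sequence

VOWELS = set("AEIOU")

# One regressor per outcome: solved in 1..6 tries, or not solved (X).
BUCKETS = ("1", "2", "3", "4", "5", "6", "X")


def word_features(word: str) -> Dict[str, int]:
    """Hypothetical hand-crafted attributes of a five-letter word."""
    letters = word.upper()
    counts = Counter(letters)
    return {
        "n_distinct": len(counts),                               # distinct letters
        "n_vowels": sum(1 for ch in letters if ch in VOWELS),    # vowel positions
        "max_repeat": max(counts.values()),                      # most repeated letter
        "n_repeated_letters": sum(1 for c in counts.values() if c > 1),
    }


def normalize_distribution(raw: Sequence[float]) -> List[float]:
    """Clip negatives to zero, then rescale so the seven percentages sum to 100."""
    clipped = [max(0.0, float(v)) for v in raw]
    total = sum(clipped)
    if total == 0:
        raise ValueError("all predicted percentages are zero")
    return [100.0 * v / total for v in clipped]


features = word_features("EERIE")
print(features)
# {'n_distinct': 3, 'n_vowels': 4, 'max_repeat': 3, 'n_repeated_letters': 1}

raw = [-1, 1, 5, 10, 15, 12, 7]  # made-up raw outputs, one per bucket
print(dict(zip(BUCKETS, normalize_distribution(raw))))
```

The feature values for EERIE (only three distinct letters, one of them used three times) give one plausible reason why a model could flag it as unusual. The rescaling step is included because seven regressors trained separately are not constrained to produce values that sum to 100.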

Model III: We quantified the difficulty of words by an unequally weighted average of the percentage distribution and divided them into three levels: easy, medium, and hard. We then used the labeled data to fit an SVM model with an RBF kernel, obtaining an accuracy score of 0.6556 and an F1 score of 0.6634. The classification result for EERIE is hard, consistent with the result of Model II.
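The labeling step in Model III can be illustrated with a small sketch covering only the difficulty score and the three-level split, not the SVM itself. The summary does not state the weights or the cut points, so the weights 1 to 7 (a heavier weight for more tries, with 7 for an unsolved puzzle) and the tercile thresholds below are assumptions, and the three distributions are made-up examples rather than data from the paper.

```python
from typing import List, Sequence

# Assumed weights: number of tries, with 7 standing for "not solved" (X).
WEIGHTS = (1, 2, 3, 4, 5, 6, 7)


def difficulty_score(percentages: Sequence[float], weights: Sequence[float] = WEIGHTS) -> float:
    """Weighted average of the seven-bucket percentage distribution."""
    total = sum(percentages)
    return sum(w * p for w, p in zip(weights, percentages)) / total


def label_by_terciles(scores: Sequence[float]) -> List[str]:
    """Split scores into three roughly equal groups: easy, medium, hard."""
    ranked = sorted(scores)
    n = len(ranked)
    low_cut = ranked[n // 3]
    high_cut = ranked[(2 * n) // 3]
    labels = []
    for s in scores:
        if s < low_cut:
            labels.append("easy")
        elif s < high_cut:
            labels.append("medium")
        else:
            labels.append("hard")
    return labels


# Made-up distributions over (1, 2, 3, 4, 5, 6, X), each summing to 100.
distributions = [
    [0, 10, 30, 40, 15, 5, 0],
    [0, 2, 10, 30, 35, 18, 5],
    [0, 0, 5, 15, 30, 30, 20],
]
scores = [difficulty_score(d) for d in distributions]
labels = label_by_terciles(scores)
print(labels)  # ['easy', 'medium', 'hard']
```

Labels produced this way would then serve as the targets for a classifier trained on word attributes. Any monotone weighting that penalizes later guesses more would give a similar ordering, which is why the specific weights here should be read as a placeholder.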

Beyond these three models, we also made some interesting observations from the dataset, one of which concerns the differences between human thinking and machine learning.

Finally, we wrote a letter to the New York Times Wordle editor presenting our models, results, and advice. We hope this letter will become a valuable reference for the further development of Wordle.

Keywords: Wordle; LSTM; Recursive Regression; XGBoost; Feature Engineering
