MCM (Mathematical Contest in Modeling) Problem and Solution: Predicting Word Results.pdf
Problem Chosen: C
2023 MCM/ICM Summary Sheet
Team Control Number: 2321756
Summary
Wordle is a popular game played around the world. We predicted the number of total players
and hard mode players on March 1, 2023, and the score distribution of the word "EERIE". Our
model utilized the data provided by the New York Times, including information about the daily
number of hard mode players and total players, the score distributions, and words for each day.
We decoupled our models for player distribution and word difficulty. To justify the separation,
we investigated the dependence of the player population on time and word difficulty.
Plotting the data, we found that the total player population only depends on time and behaves
like 1/t. In addition, we found a static ratio of 0.1 between the population of hard-mode players
and total players. Both populations showed a periodicity of around 200 days. Based on the model,
we predict that on March 1st the total number of players will be about 17630, and
the number of hard mode players will be around 1778.
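The 1/t behaviour and the fixed hard-mode ratio can be illustrated with a small least-squares fit. This is only a sketch under stated assumptions: the functional form a + b/t, the function names, and the synthetic data are hypothetical, the origin of t is not specified in this summary, and the roughly 200-day periodicity is left out.

```python
def fit_inverse_time(days, players):
    """Ordinary least-squares fit of players ~ a + b / t (assumed form)."""
    xs = [1.0 / t for t in days]          # regress on x = 1/t
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(players) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, players))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_players(t, a, b, hard_ratio=0.1):
    """Return (total players, hard-mode players) on day t.

    hard_ratio=0.1 is the static ratio quoted in the summary.
    """
    total = a + b / t
    return total, hard_ratio * total


# Synthetic, made-up data (NOT the contest data), generated from a=10000, b=2e6.
days = [50, 100, 150, 200, 250, 300]
players = [10000 + 2_000_000 / t for t in days]

a, b = fit_inverse_time(days, players)
total, hard = predict_players(400, a, b)
print(round(total), round(hard))
```

With real data, `days` would be the day index of each record and `players` the reported daily totals.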
Meanwhile, our model for word difficulty was dependent only on the properties of the word,
not on the number of players. We determined four factors that influence a word’s difficulty: the
number of repeated letters, similarity to other words in the data set, rarity, and use of uncommon
letters.
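The summary does not give the exact definitions of these factors, so the following is only a sketch of how three of them might be quantified. All function names and definitions here are assumptions made for illustration; word rarity is omitted because it would need an external word-frequency source.

```python
from collections import Counter


def repeated_letters(word):
    """Number of letter slots that repeat an earlier letter (assumed definition)."""
    return len(word) - len(set(word))


def hamming_neighbors(word, word_list):
    """Count words in word_list that differ from `word` in exactly one position."""
    return sum(
        1
        for other in word_list
        if len(other) == len(word)
        and sum(a != b for a, b in zip(word, other)) == 1
    )


def letter_frequencies(word_list):
    """Fraction of words in word_list that contain each letter."""
    counts = Counter(ch for w in word_list for ch in set(w))
    return {ch: c / len(word_list) for ch, c in counts.items()}


def uncommon_letter_score(word, freqs):
    """Mean of (1 - frequency) over the distinct letters of `word`."""
    letters = set(word)
    return sum(1.0 - freqs.get(ch, 0.0) for ch in letters) / len(letters)


# Tiny illustrative word list, not the contest data set.
words = ["crane", "crate", "grate", "eerie", "mummy"]
freqs = letter_frequencies(words)
print(repeated_letters("eerie"), hamming_neighbors("crate", words))
```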
We quantitatively defined these parameters, normalized them to the same scale, and fit them
with linear coefficients to build our model for the word difficulty. Using this hybrid model for
word difficulty, we determined that the word "EERIE" would have a mean score of about 4.796.
This difficulty number was then fed into an experimentally fit beta distribution to predict the
probability distribution for guessing "EERIE" in any number of tries.
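The pipeline described above (normalize, combine linearly, feed into a beta distribution) can be sketched as follows. The details are assumptions rather than the paper's fitted model: min-max scaling, the mapping of a mean score onto the unit interval, the `concentration` value, and the seven bins (1 to 6 tries plus a failure bin) are all hypothetical choices.

```python
import math


def min_max_normalize(values):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def predicted_mean_score(features, weights, intercept):
    """Linear model: intercept + sum of weight * normalized feature."""
    return intercept + sum(w * f for w, f in zip(weights, features))


def beta_pdf(x, alpha, beta):
    """Beta density at x in (0, 1), computed in log space for stability."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(
        log_norm + (alpha - 1) * math.log(x) + (beta - 1) * math.log(1 - x)
    )


def score_distribution(mean_score, concentration=20.0, n_bins=7):
    """Discretize a beta density into n_bins score bins.

    Bin k (k = 1..n_bins) is centred at (k - 0.5) / n_bins, so a score s maps
    to x = (s - 0.5) / n_bins. `concentration` (alpha + beta) is a made-up
    value, not a fitted one.
    """
    m = (mean_score - 0.5) / n_bins
    alpha, beta = m * concentration, (1 - m) * concentration
    dens = [beta_pdf((k - 0.5) / n_bins, alpha, beta) for k in range(1, n_bins + 1)]
    total = sum(dens)
    return [d / total for d in dens]


# 4.796 is the mean score quoted in the summary for "EERIE".
probs = score_distribution(4.796)
mean_check = sum(k * p for k, p in enumerate(probs, start=1))
print([round(p, 3) for p in probs])
```

Evaluating the density at bin centres and renormalizing is a simple midpoint approximation; integrating the density over each bin would be a natural refinement.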
Keywords: Wordle; Difficulty
Team # 2321756 Page 1 of 22
Contents
1 Content
  1.1 Background Information
  1.2 Problem description
2 Analysis of the Problem
  2.1 Population Data Analysis
  2.2 Word Difficulty Analysis
    2.2.1 Repeated Letters
    2.2.2 Word Similarity
    2.2.3 Word Rarity
    2.2.4 Uncommon Letters
  2.3 Data Preprocessing
  2.4 Platform
3 Calculating and Simplifying the Model
  3.1 Population Predictions
  3.2 Probability Distribution Functions of Guessing a Word
  3.3 Variable Table
4 The Model Results
  4.1 Population Fitting
  4.2 Hard Mode Dependence on Word Difficulty
  4.3 Fitness of Word Features and Expected Value
  4.4 Distribution Fitting with Beta
  4.5 Extrapolation to EERIE
5 Validating the Model
6 Conclusions
7 Letter
8 References
1 Content
1.1 Background Information
Wordle is a popular puzzle currently offered daily by the New York Times. Players try to solve
the puzzle by guessing a five-letter word in six tries or fewer, receiving feedback with every guess.
The player’s score is the number of guesses used, with lower scores being better. Each guess in
standard Wordle must be a 5-letter English word. Guesses that are not recognized as words by
the game are not allowed. Wordle continues to grow and versions of the game are now available
in over 60 languages.
Figure 1: Example Solution of Wordle Puzzle from July 21, 2022
Players can opt to make the game more difficult by participating in Hard Mode, which requires
that once a player has found a correct letter in a word (the tile is yellow or green), that letter
must be used in subsequent guesses.
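The feedback and Hard Mode rules described above can be sketched in a few lines. This is an illustrative sketch, not the game's implementation: the function names and the 'G'/'Y'/'-' encoding are hypothetical, and the Hard Mode check enforces only what the text states (revealed letters must reappear), without constraining tile positions.

```python
from collections import Counter


def feedback(guess, answer):
    """Return one character per tile: 'G' green, 'Y' yellow, '-' gray."""
    result = ["-"] * len(guess)
    remaining = Counter()
    # First pass: mark greens and count answer letters not matched in place.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
        else:
            remaining[a] += 1
    # Second pass: mark yellows, consuming unmatched answer letters so that
    # repeated letters in the guess are not over-credited.
    for i, g in enumerate(guess):
        if result[i] == "-" and remaining[g] > 0:
            result[i] = "Y"
            remaining[g] -= 1
    return "".join(result)


def hard_mode_ok(next_guess, prev_guess, prev_feedback):
    """True if next_guess reuses every letter revealed as green or yellow."""
    required = Counter(
        ch for ch, f in zip(prev_guess, prev_feedback) if f in "GY"
    )
    available = Counter(next_guess)
    return all(available[ch] >= n for ch, n in required.items())


fb = feedback("crane", "eerie")
print(fb, hard_mode_ok("eerie", "crane", fb))
```

The two-pass structure matters for words with repeated letters such as "EERIE": greens are settled first, and each remaining answer letter can justify at most one yellow tile.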
1.2 Problem description
We have been asked by the New York Times to develop a model based on past data
(January 7th, 2022 to December 31st, 2022) on the daily total number of players, the number of hard
mode players, and the score distribution, in order to extrapolate word difficulty, player population,
and the percentage of games played in hard mode to a specific day: March 1st, 2023.