# playerElo 2019
Open the playerElo App here: https://jrichey.shinyapps.io/playerelo/
View playerElo published on FanGraphs here: https://community.fangraphs.com/playerelo-factoring-strength-of-schedule-into-player-analysis/
###### *For the following article, all numbers are updated to September 10th, 2019.*
###### *Data sourced from Baseball Savant, RotoWire, Baseball Reference, Retrosheet, and BigDataBall.*
## Abstract
With the sabermetric revolution in MLB, a plethora of new statistics have entered the mainstream, and a growing number of fantasy owners, ballclubs, and regular fans are turning to these new statistical methods for player analysis. However, I propose that even advanced metrics such as wOBA, FIP, xwOBA, xFIP, and wRC+ are all missing a crucial element needed to accurately represent player performance thus far. The playerElo system is able to reveal in aggregate the effects of previously unconsidered aspects of the game. Using an Elo ranking system determined by run-value calculations for all major league baseball players, the model incorporates context-dependent analysis and quality of competition to produce a proper evaluation of batters and pitchers. This enables playerElo to appropriately credit pitchers, especially relievers, for their true impact on the game, particularly when called upon in disadvantageous situations. Additionally, playerElo does not allow relative team strength, which confounds common counting statistics, to influence the evaluation of a player. The model is a holistic approach to the assessment of major league players and has significant implications for player projections during free agency and player acquisition.
## Introduction
Consider the following comparison between Freddie Freeman (29) and Carlos Santana (33). Both players were starters for the 2019 All-Star teams of their respective leagues and are enjoying breakout seasons, beyond their usual high production level, with nearly identical statistics across the board.
| Player | PA | wOBA | xwOBA | wRC+ |
| :-: | :-: | :-: | :-: | :-: |
| Freeman, 1B | 643 | 0.398 | 0.396 | 144 |
| Santana, 1B | 624 | 0.389 | 0.371 | 141 |
*Data from FanGraphs, Baseball Savant.*
However, I argue there is an underlying statistic that makes Santana’s success less impressive and Freeman’s worthy of MVP consideration: the quality of the pitchers each has faced. The Atlanta Braves’ division, the NL East, contains the respectable pitching competition of the Mets (11th league-wide in ERA), Nationals (12th), Phillies (17th), and Marlins (21st). Contrast this with the competition of the Cleveland Indians in the AL Central: the Twins (8th), White Sox (23rd), Royals (26th), and Tigers (28th). Over his first 500 plate appearances, Santana faced a top-15 pitcher (ranked by FIP) just 15 times, compared to 43 times for Freeman. wRC+ controls for park effects and the current run environment, while xwOBA takes into account quality of contact, but no modern sabermetric addresses the problem that Freeman and Santana post near-equal statistics despite widely different qualities of competition. Thus, I present the modeling system of playerElo.
## Methodology
Inspired by Arpad Elo’s rating system for zero-sum games like chess, as well as FiveThirtyEight’s use of an Elo modeling scheme for MLB team ratings and season-wide predictions, playerElo treats all at-bats as events and maintains a running power ranking of all MLB batters and pitchers. The system uses expected run values over the 24 possible base-out states. Run values are calculated for each at-bat event by subtracting the run expectancy of the beginning state from that of the ending state and adding the runs scored.
`Run Value of Play = RE End State - RE Beginning State + Runs Scored`
The following run expectancy matrix presents the expected runs scored for the remainder of the inning, given the current run environment, baserunners, and number of outs. Data is sourced from all at-bats from 2016-2018, and expected run values are rounded to the second decimal place. For example, a grand slam hit with one out would shift the run expectancy from 1.54 to 0.27 and score four runs, so the run value of the play would be 2.73.
| 1B | 2B | 3B | 0 outs | 1 out | 2 outs |
| :-: | :-: | :-: | :-: | :-: | :-: |
| -- | -- | -- | 0.51 | 0.27 | 0.11 |
| 1B | -- | -- | 0.88 | 0.52 | 0.22 |
| -- | 2B | -- | 1.15 | 0.69 | 0.32 |
| -- | -- | 3B | 1.39 | 0.97 | 0.36 |
| 1B | 2B | -- | 1.45 | 0.93 | 0.44 |
| 1B | -- | 3B | 1.77 | 1.20 | 0.48 |
| -- | 2B | 3B | 1.97 | 1.40 | 0.56 |
| 1B | 2B | 3B | 2.21 | 1.54 | 0.75 |
*Data from Retrosheet, 2016-2018.*
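
As a minimal sketch of the run-value formula, the matrix above can be stored as a lookup table keyed by base occupancy and outs. The names `RUN_EXPECTANCY`, `run_expectancy`, and `run_value` are illustrative and are not taken from the project's R code. The sketch also assumes that a play ending the inning (three outs) leaves a run expectancy of zero.

```python
# Run expectancy keyed by (runner on 1B, runner on 2B, runner on 3B),
# with one value each for 0, 1, and 2 outs. Values are copied from the
# matrix above (Retrosheet, 2016-2018).
RUN_EXPECTANCY = {
    (False, False, False): (0.51, 0.27, 0.11),
    (True, False, False): (0.88, 0.52, 0.22),
    (False, True, False): (1.15, 0.69, 0.32),
    (False, False, True): (1.39, 0.97, 0.36),
    (True, True, False): (1.45, 0.93, 0.44),
    (True, False, True): (1.77, 1.20, 0.48),
    (False, True, True): (1.97, 1.40, 0.56),
    (True, True, True): (2.21, 1.54, 0.75),
}


def run_expectancy(bases, outs):
    """Expected runs for the rest of the inning from a base-out state.

    Assumption: a third out ends the inning, so the expectancy is zero.
    """
    if outs >= 3:
        return 0.0
    return RUN_EXPECTANCY[bases][outs]


def run_value(start_bases, start_outs, end_bases, end_outs, runs_scored):
    """Run Value of Play = RE End State - RE Beginning State + Runs Scored."""
    return (
        run_expectancy(end_bases, end_outs)
        - run_expectancy(start_bases, start_outs)
        + runs_scored
    )


loaded = (True, True, True)
empty = (False, False, False)

# The grand slam with one out from the text: 0.27 - 1.54 + 4.
grand_slam = run_value(loaded, 1, empty, 1, 4)
print(round(grand_slam, 2))  # 2.73
```

The same function gives negative values for outs. A strikeout that ends the inning with the bases loaded, for instance, costs the full 0.75 runs of expectancy.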
The model begins with a calibration year of 2018, and for 2019, players begin with their previous season’s ending playerElo, regressed slightly to the mean. If a player did not record a single plate appearance or batter faced in 2018, such as Vladimir Guerrero Jr. or Chris Paddack, they are assigned a baseline playerElo of 1000 (the 2018 calibration year began every player at 1000). For every at-bat, given the current base-out state, an expected run value for both the batter and pitcher is calculated, based on quadratic formulas of the historic performance of players of that caliber in the given situation. The dependency of the Elo formula on the base-out state ensures the model is context-dependent, meaning it incorporates the fact that a bases-loaded double is far more valuable than a double with the bases empty. However, it also takes into account that runs were more likely to be scored in the former situation than in the latter.
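
The season carry-over can be sketched as follows. The baseline of 1000 comes from the text, but the article only says ratings are regressed "slightly", so the `REGRESSION` fraction below is a hypothetical placeholder rather than the model's actual setting. The function name `starting_elo` is likewise illustrative.

```python
BASELINE_ELO = 1000.0  # from the text: every player began 2018 at 1000
REGRESSION = 0.25      # hypothetical fraction; the article only says "slightly"


def starting_elo(previous_elo=None, baseline=BASELINE_ELO, regression=REGRESSION):
    """Return a player's rating at the start of a new season.

    Players with no rating from the previous season (for example, 2019
    rookies) start at the baseline. Everyone else is pulled part of the
    way back toward the baseline.
    """
    if previous_elo is None:
        return baseline
    return previous_elo + regression * (baseline - previous_elo)


print(starting_elo(1100.0))  # 1075.0
print(starting_elo(920.0))   # 940.0
print(starting_elo())        # 1000.0
```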
It is important to note that playerElo is a raw batting statistic and does not evaluate overall production, meaning stolen bases are not factored into the ranking system. Additionally, while the model does not take defense into account, it also does not count stolen bases or passed balls negatively against a pitcher, and likewise does not count changes in game state due to wild pitches positively for a batter. Once an expected run value is synthesized from the current state and the playerElo of the batter and the pitcher, park factor and home field advantage adjustments (if applicable) are made, and the expected run value of the play is then compared to the true run value of the outcome. The playerElo ratings of both the batter and the pitcher are then updated according to the difference between the true run value and the expected run value. For example, if an excellent pitcher strikes out a mediocre batter, the batter will not lose much Elo, and the pitcher will not gain much Elo. Likewise, if a below-average batter does extremely well against a top pitcher, there will be a far greater change in the Elo of both players. Errors are also taken into account and will prevent a positive run value from counting against a pitcher or positively for a batter.
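
The update step can be illustrated with a deliberately simplified sketch. It is not the author's formula: the article uses state-specific quadratic fits plus park and home-field adjustments, whereas this sketch substitutes a single linear expectation in the Elo gap. The constants `K` and `ELO_SLOPE` and both function names are hypothetical. Run values here are measured relative to the base-out state, as in the run-value formula above.

```python
K = 10.0           # hypothetical: Elo points exchanged per run of surprise
ELO_SLOPE = 0.001  # hypothetical: expected runs per Elo point of batter edge


def expected_run_value(batter_elo, pitcher_elo, slope=ELO_SLOPE):
    """Linear stand-in for the article's quadratic, state-specific formulas."""
    return slope * (batter_elo - pitcher_elo)


def update_elos(batter_elo, pitcher_elo, actual_run_value, k=K):
    """Shift both ratings by the gap between actual and expected run value.

    The exchange is zero-sum: whatever the batter gains, the pitcher loses.
    """
    surprise = actual_run_value - expected_run_value(batter_elo, pitcher_elo)
    return batter_elo + k * surprise, pitcher_elo - k * surprise


# A leadoff strikeout is worth 0.27 - 0.51 = -0.24 runs.
strikeout = -0.24

# Excellent pitcher vs. mediocre batter: the out was largely expected.
mismatch_batter, mismatch_pitcher = update_elos(950.0, 1100.0, strikeout)

# Evenly matched players: the same out moves the ratings more.
even_batter, even_pitcher = update_elos(1000.0, 1000.0, strikeout)

print(round(950.0 - mismatch_batter, 2), round(1000.0 - even_batter, 2))
```

Under these assumed constants, the mediocre batter loses fewer points for striking out against the excellent pitcher than an evenly matched batter would, which is the behavior the paragraph above describes.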
Refer to the Technical Appendix at the end of the README.md for further details regarding the playerElo methodology.
## Player Analysis
![playerElo Top 25](https://user-images.githubusercontent.com/22247220/64912297-1fca5580-d6fb-11e9-988d-b4f9442d5576.png)
It is interesting to note that Nolan Arenado and Edwin Encarnacion do particularly well in the model, even with park factor adjustments. This can be attributed to the difficulty of schedule of the Rockies and Yankees, who face tough pitching competition in the NL West and AL East respectively. The average pitching Elo faced by Arenado and Encarnacion is 1010.5 and 1009.6 (20th and 25th highest overall). Quality of contact does leave room to be desired for Arenado; however, playerElo does not incorporate statistics like exit velocity and launch angle in its calculations, and thus the model is a better reflection of on-field performance than underlying swing metrics. In contrast to Arenado and Encarnacion, Yordan Álvarez has played incredibly since being called up in June but has faced some of the easiest competition i