Bandit Algorithm Simulations for Online Learning (downloadable .zip resource) - CSDN Library

99 files in total

png: 54

test: 7

base: 7

49 views · Uploaded 2023-04-16 19:42:50 · 7.96MB ZIP

用于在线学习的Bandit算法模拟___下载.zip (99 files)

bandit_simulations-master

scripts

contextual

CABs

simulations_onlineadvertising.R 3KB

MABs

ts_bootstrap.R 1KB

mabs.R 1024B

epsilongreedy.R 584B

python

contextual_bandits

data

ml-100k

u3.test 387KB

u.occupation 193B

u3.base 1.51MB

README 7KB

u2.base 1.51MB

u5.base 1.51MB

u.genre 202B

u1.test 383KB

u.info 36B

mku.sh 643B

u5.test 388KB

u4.base 1.51MB

u.item 231KB

ub.test 182KB

u.data 1.89MB

ub.base 1.71MB

ua.base 1.71MB

u.user 22KB

u2.test 386KB

allbut.pl 716B

u1.base 1.51MB

u4.test 388KB

ua.test 182KB

news_dataset.txt 2.04MB

analysis

linUCB disjoint implementation and analysis.md 20KB

linUCB hybrid implementation and analysis.md 35KB

img

paper_hybrid_stddev_defn.png 7KB

paper_eqn_2.png 3KB

paper_eqn_5.png 9KB

hybrid_simulation_alpha_0.5.png 12KB

hybrid_simulation_alpha_1.0.png 11KB

paper_stddev_defn.png 2KB

hybrid_simulation_avg_reward.png 21KB

paper_hybrid_argmax.png 23KB

simulation_alpha_0.5.png 11KB

simulation_alpha_1.5.png 12KB

paper_disjoint_algo.png 66KB

hybrid_simulation_alpha_0.25.png 12KB

ucb_eqn.png 2KB

compare_disjoint_hybrid.png 18KB

ucb_example1.png 6KB

paper_theta_simplified.png 2KB

paper_b_defn.png 872B

paper_hybrid_eqn.png 4KB

caltech_linucb_lect_hybrid.png 128KB

paper_mean_defn.png 2KB

caltech_linucb_lect.png 229KB

paper_A_defn.png 2KB

ucb_example.png 6KB

paper_eqn_3.png 3KB

hybrid_simulation_alpha_1.5.png 11KB

paper_hybrid_algo.png 157KB

simulation_alpha_1.0.png 12KB

notebooks

avg_hybrid_ctr.pickle 5KB

hybrid_ctr.pickle 0B

disjoint_ctr.pickle 1.26MB

LinUCB_disjoint.ipynb 60KB

avg_disjoint_ctr.pickle 5KB

LinUCB_hybrid.ipynb 228KB

multiarmed_bandits

analysis

ts.md 13KB

softmax.md 13KB

eps-greedy.md 12KB

ucb.md 13KB

img

bayes_rule_denom.png 1KB

bayes_rule_likelihood.png 2KB

cum-reward_5-arms_0dot1-0dot9_ts.png 27KB

cum-reward_5-arms_0dot1-0dot9_epsg.png 42KB

cum-reward_5-arms_0dot8-0dot9_epsg.png 32KB

cum-regret_5-arms_0dot8-0dot9_ts.png 24KB

cum-regret_5-arms_0dot8-0dot9_ucb.png 26KB

rate-best-arm_5-arms_0dot8-0dot9_epsg.png 51KB

rate-best-arm_5-arms_0dot8-0dot9_ucb.png 31KB

rate-best-arm_5-arms_0dot8-0dot9_ts.png 30KB

softmax_eqn.png 2KB

bayes_rule_posterior.png 6KB

cum-reward_5-arms_0dot8-0dot9_soft.png 32KB

cum-reward_5-arms_0dot1-0dot9_ucb.png 27KB

ucb_eqn.png 2KB

cum-regret_5-arms_0dot8-0dot9_epsg.png 38KB

rate-best-arm_5-arms_0dot1-0dot9_soft.png 48KB

bayes_rule.png 2KB

rate-best-arm_5-arms_0dot1-0dot9_epsg.png 55KB

rate-best-arm_5-arms_0dot1-0dot9_ts.png 28KB

cum-regret_5-arms_0dot8-0dot9_soft.png 36KB

rate-best-arm_5-arms_0dot1-0dot9_ucb.png 36KB

cum-reward_5-arms_0dot8-0dot9_ucb.png 27KB

rate-best-arm_5-arms_0dot8-0dot9_soft.png 43KB

cum-reward_5-arms_0dot1-0dot9_soft.png 40KB

bayes_rule_prior.png 3KB

cum-reward_5-arms_0dot8-0dot9_ts.png 27KB

notebooks

analysis.ipynb 2.64MB

.gitignore 52B

README.md 3KB

.Rhistory 20KB

bandit_simulations.Rproj 205B

# Bandit_simulations

Bandit algorithm simulations and analysis for online learning.

This repo is part of my interest in learning more about optimisation for online learning algorithms, which are heavily centred on bandit theory.

Based on what I understand, there are different types of bandit problems:

- __Multi-armed bandits:__ The arms are indistinguishable from one another except through their underlying reward functions. With multiple arms, the objective is to identify the arm with the highest expected reward via online learning, which is a classic explore-versus-exploit problem.
- __Contextual bandits:__ Bandits with features (aka context) that interact differently with different actions. Different contextual features call for different actions to obtain the reward. This can be viewed as a classification problem: given the input features (the context), which "action" is the right class to choose so as to obtain high accuracy/reward?

This repo is split into Python and R:

- Python:
    - __Phase 1 (MAB analysis):__ Implementations of selected Multi-Armed Bandit algorithms for experimentation.
    - __Phase 2 (CB analysis):__ Implementation of contextual bandit algorithms, starting with LinUCB Disjoint and LinUCB Hybrid, based on [A Contextual-Bandit Approach to Personalized News Article Recommendation](https://arxiv.org/pdf/1003.0146.pdf).
    - __Phase 3 (CB analysis):__ Use the `vowpal wabbit` package for online-learning contextual bandit simulations.
- R:
    - __Phase 4 (MAB & CB analysis):__ Use the R package `contextual`, which has a comprehensive ecosystem of algorithms and policies.

## Analysis and Code Implementation

__Phase 1 MAB analysis includes:__

- [Epsilon Greedy](https://github.com/kfoofw/bandit_simulations/blob/master/python/multiarmed_bandits/analysis/eps-greedy.md)
- [SoftMax](https://github.com/kfoofw/bandit_simulations/blob/master/python/multiarmed_bandits/analysis/softmax.md)
- [UCB](https://github.com/kfoofw/bandit_simulations/blob/master/python/multiarmed_bandits/analysis/ucb.md)
- [Thompson Sampling](https://github.com/kfoofw/bandit_simulations/blob/master/python/multiarmed_bandits/analysis/ts.md)

__Phase 2 CB analysis (currently ongoing):__

- [LinUCB Disjoint Implementation and Analysis with a Dataset](https://github.com/kfoofw/bandit_simulations/blob/master/python/contextual_bandits/analysis/linUCB%20disjoint%20implementation%20and%20analysis.md)
- [LinUCB Hybrid Implementation and Analysis with a MovieLens Dataset for Recommender Systems](https://github.com/kfoofw/bandit_simulations/blob/master/python/contextual_bandits/analysis/linUCB%20hybrid%20implementation%20and%20analysis.md)

## Special Mention

A portion of the MAB code is based on the book ["Bandit Algorithms for Website Optimization"](https://www.oreilly.com/library/view/bandit-algorithms-for/9781449341565/) by John Myles White.

Microsoft's `vowpal wabbit` package for Python can be found in this [GitHub repo](https://github.com/VowpalWabbit/vowpal_wabbit).

The R package `contextual` can be found in this [GitHub repo](https://github.com/Nth-iteration-labs/contextual).
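To make the explore-versus-exploit trade-off in the multi-armed bandit description above concrete, here is a minimal epsilon-greedy sketch. It is not the repo's own code. It assumes Bernoulli arms (each pull pays 1 or 0), and the function name `run_epsilon_greedy`, its parameters, and the five example reward probabilities are all hypothetical choices made for illustration.

```python
import random


def run_epsilon_greedy(true_probs, epsilon, horizon, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms (illustrative sketch only).

    true_probs: assumed success probability of each arm (hidden from the policy).
    epsilon:    probability of exploring a uniformly random arm on each step.
    horizon:    number of pulls to simulate.
    Returns (pull counts per arm, estimated mean reward per arm, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms    # how many times each arm was pulled
    values = [0.0] * n_arms  # running mean of observed rewards per arm
    total_reward = 0

    for _ in range(horizon):
        if rng.random() < epsilon:
            # Explore: pick any arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far
            # (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda a: values[a])

        # Draw a Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_probs[arm] else 0

        # Incremental update of the running mean for that arm.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return counts, values, total_reward


if __name__ == "__main__":
    # Hypothetical setup: four weak arms and one strong arm.
    counts, values, total = run_epsilon_greedy(
        [0.1, 0.1, 0.1, 0.1, 0.9], epsilon=0.1, horizon=5000
    )
    print("pulls per arm:    ", counts)
    print("estimated rewards:", [round(v, 3) for v in values])
    print("total reward:     ", total)
```

A larger `epsilon` spends more pulls on exploration and converges on the best arm sooner, but keeps paying for random pulls afterwards. With `epsilon=0` the policy never explores and stays on whichever arm it tried first.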
