# Trying out time-series models: the Ali Music popularity trend prediction competition
https://tianchi.shuju.aliyun.com
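## File overview
Season 1 of the competition used two datasets, one after the other. The first dataset is for predicting the song play counts of 50 artists, and the programs for it are in the folder `s1d1`; the second is for predicting the song play counts of 100 artists, and its code is in the folder `s1d2`.
* Download link for the first dataset: https://pan.baidu.com/s/1i4ORGjF
* Download link for the second dataset: https://pan.baidu.com/s/1hrF0sok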
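# First dataset (50 artists), folder `s1d1`
## Runtime dependencies
* Ubuntu
* Python 2.7
* Install pip, then use pip to install the Python libraries
* ipython (Python development environment)
* matplotlib (for plotting charts)
* numpy (for data processing)
* pandas (for data processing)
* scikit-learn (for learning) [installation guide](http://www.bogotobogo.com/python/scikit-learn/scikit-learn_install.php)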
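## data folder
Holds the two data files `mars_tianchi_songs.csv` and `mars_tianchi_user_actions.csv`.
* Download link: https://pan.baidu.com/s/1i4ORGjF
* Download the two data files and put them in the `data` folder; zip files must be extracted to obtain the csv files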
## Running ipython
Start ipython from the shell:
```
ipython --pylab
```
## pp.py
Code for data preprocessing and statistics.
In ipython, enter
```
%run pp.py
```
to start the data preprocessing and statistics.
### Output
```
===start generate date rank==================================
# 245 days in total, from 20150301 to 20151031
date num 245
# index assigned to each date
rank to date : {0: '20150301', 1: '20150302', 2: '20150303', 3: '20150304', 4: '20150305', 5: '20150306', 6: '20150307', 7: '20150308', 8: '20150309', 9: '20150310', 10: '20150311', 11: '20150312', 12: '20150313', 13: '20150314', 14: '20150315', 15: '20150316', 16: '20150317', 17: '20150318', 18: '20150319', 19: '20150320', 20: '20150321', 21: '20150322', 22: '20150323', 23: '20150324', 24: '20150325', 25: '20150326', 26: '20150327', 27: '20150328', 28: '20150329', 29: '20150330', 30: '20150331', 31: '20150401', 32: '20150402', 33: '20150403', 34: '20150404', 35: '20150405', 36: '20150406', 37: '20150407', 38: '20150408', 39: '20150409', 40: '20150410', 41: '20150411', 42: '20150412', 43: '20150413', 44: '20150414', 45: '20150415', 46: '20150416', 47: '20150417', 48: '20150418', 49: '20150419', 50: '20150420', 51: '20150421', 52: '20150422', 53: '20150423', 54: '20150424', 55: '20150425', 56: '20150426', 57: '20150427', 58: '20150428', 59: '20150429', 60: '20150430', 61: '20150501', 62: '20150502', 63: '20150503', 64: '20150504', 65: '20150505', 66: '20150506', 67: '20150507', 68: '20150508', 69: '20150509', 70: '20150510', 71: '20150511', 72: '20150512', 73: '20150513', 74: '20150514', 75: '20150515', 76: '20150516', 77: '20150517', 78: '20150518', 79: '20150519', 80: '20150520', 81: '20150521', 82: '20150522', 83: '20150523', 84: '20150524', 85: '20150525', 86: '20150526', 87: '20150527', 88: '20150528', 89: '20150529', 90: '20150530', 91: '20150531', 92: '20150601', 93: '20150602', 94: '20150603', 95: '20150604', 96: '20150605', 97: '20150606', 98: '20150607', 99: '20150608', 100: '20150609', 101: '20150610', 102: '20150611', 103: '20150612', 104: '20150613', 105: '20150614', 106: '20150615', 107: '20150616', 108: '20150617', 109: '20150618', 110: '20150619', 111: '20150620', 112: '20150621', 113: '20150622', 114: '20150623', 115: '20150624', 116: '20150625', 117: '20150626', 118: '20150627', 119: '20150628', 120: '20150629', 121: '20150630', 122: '20150701', 
123: '20150702', 124: '20150703', 125: '20150704', 126: '20150705', 127: '20150706', 128: '20150707', 129: '20150708', 130: '20150709', 131: '20150710', 132: '20150711', 133: '20150712', 134: '20150713', 135: '20150714', 136: '20150715', 137: '20150716', 138: '20150717', 139: '20150718', 140: '20150719', 141: '20150720', 142: '20150721', 143: '20150722', 144: '20150723', 145: '20150724', 146: '20150725', 147: '20150726', 148: '20150727', 149: '20150728', 150: '20150729', 151: '20150730', 152: '20150731', 153: '20150801', 154: '20150802', 155: '20150803', 156: '20150804', 157: '20150805', 158: '20150806', 159: '20150807', 160: '20150808', 161: '20150809', 162: '20150810', 163: '20150811', 164: '20150812', 165: '20150813', 166: '20150814', 167: '20150815', 168: '20150816', 169: '20150817', 170: '20150818', 171: '20150819', 172: '20150820', 173: '20150821', 174: '20150822', 175: '20150823', 176: '20150824', 177: '20150825', 178: '20150826', 179: '20150827', 180: '20150828', 181: '20150829', 182: '20150830', 183: '20150831', 184: '20150901', 185: '20150902', 186: '20150903', 187: '20150904', 188: '20150905', 189: '20150906', 190: '20150907', 191: '20150908', 192: '20150909', 193: '20150910', 194: '20150911', 195: '20150912', 196: '20150913', 197: '20150914', 198: '20150915', 199: '20150916', 200: '20150917', 201: '20150918', 202: '20150919', 203: '20150920', 204: '20150921', 205: '20150922', 206: '20150923', 207: '20150924', 208: '20150925', 209: '20150926', 210: '20150927', 211: '20150928', 212: '20150929', 213: '20150930', 214: '20151001', 215: '20151002', 216: '20151003', 217: '20151004', 218: '20151005', 219: '20151006', 220: '20151007', 221: '20151008', 222: '20151009', 223: '20151010', 224: '20151011', 225: '20151012', 226: '20151013', 227: '20151014', 228: '20151015', 229: '20151016', 230: '20151017', 231: '20151018', 232: '20151019', 233: '20151020', 234: '20151021', 235: '20151022', 236: '20151023', 237: '20151024', 238: '20151025', 239: '20151026', 240: 
'20151027', 241: '20151028', 242: '20151029', 243: '20151030', 244: '20151031'}
===end generate date rank==================================
===start load songs==================================
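# number of songs
songs num 10842
songs_id_to_songinfo num 10842
# number of artists
artist num 50
# number of language types
language type num 9
# number of artist gender types (male, female, band)
artist gender num 3
# number of songs released by the artist with index k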
k th artist songs num {0: 579, 1: 310, 2: 97, 3: 50, 4: 84, 5: 291, 6: 16, 7: 313, 8: 110, 9: 50, 10: 28, 11: 172, 12: 118, 13: 22, 14: 685, 15: 322, 16: 722, 17: 112, 18: 55, 19: 243, 20: 38, 21: 154, 22: 1861, 23: 146, 24: 11, 25: 134, 26: 180, 27: 150, 28: 124, 29: 89, 30: 117, 31: 86, 32: 38, 33: 621, 34: 67, 35: 152, 36: 60, 37: 57, 38: 12, 39: 176, 40: 675, 41: 203, 42: 568, 43: 274, 44: 25, 45: 197, 46: 75, 47: 10, 48: 44, 49: 119}
It takes 0.094105 s to load songs
===end load songs===================================
===start user statistics==================================
# number of users
user num 349946
# number of songs that have been played, downloaded, or collected
song num that has action 10278
# number of user action types: play, download, collect
action type num 3
It takes 17.047225 s to do user statistics
===end user statistics===================================
===start action statistics==================================
# run further statistics
It takes 138.617669 s to do action statistics
===end actions statistics===================================
```
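The date ranking at the top of this output can be reproduced with a few lines of standard-library Python. The following is only a minimal sketch of the idea, not the code in `pp.py`; the function name `build_date_rank` and the reverse mapping `date_to_rank` are hypothetical names introduced for illustration.

```python
from datetime import date, timedelta


def build_date_rank(start, end):
    """Map consecutive day indices to 'YYYYMMDD' strings (both ends inclusive)."""
    n_days = (end - start).days + 1
    rank_to_date = {}
    for k in range(n_days):
        rank_to_date[k] = (start + timedelta(days=k)).strftime("%Y%m%d")
    # Reverse lookup, handy when bucketing actions by day (hypothetical helper).
    date_to_rank = {d: k for k, d in rank_to_date.items()}
    return rank_to_date, date_to_rank


# Date range taken from the output above: 20150301 to 20151031.
rank_to_date, date_to_rank = build_date_rank(date(2015, 3, 1), date(2015, 10, 31))
print("date num %d" % len(rank_to_date))
```

Working with integer day indices instead of date strings lets each daily series be stored as a plain array of fixed length.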
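### Plotting
* `plotJA(j)` shows the play, download, and collect counts of the j-th artist
```
plotJA(0)
```
The result is shown below
![](s1d1/pic/artist_trend_0.png)
* `plotJS(j)` shows the play, download, and collect counts of the j-th song
```
plotJS(0)
```
The result is shown below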
![](s1d1/pic/song_trend_0.png)
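The per-artist curves plotted above are daily counts of each action type. As a rough sketch of how such series could be accumulated (this is not the code in `pp.py`), the example below uses a tiny hand-made record list in place of the CSV files. The names `song_to_artist`, `actions`, and `daily_counts` are hypothetical, and the encoding 1 = play, 2 = download, 3 = collect is an assumption about the data.

```python
from collections import defaultdict

# Hypothetical miniature data standing in for the two CSV files.
song_to_artist = {"s1": 0, "s2": 0, "s3": 1}  # song id -> artist index
# Each record: (song id, action type, day index).
# Assumed encoding: 1 = play, 2 = download, 3 = collect.
actions = [
    ("s1", 1, 0), ("s2", 1, 0), ("s1", 1, 1),
    ("s1", 2, 1), ("s3", 1, 2), ("s3", 3, 2),
]


def daily_counts(actions, song_to_artist, n_days):
    """Return counts[artist][action_type] as a list with one slot per day."""
    counts = defaultdict(lambda: {t: [0] * n_days for t in (1, 2, 3)})
    for song_id, action_type, day in actions:
        artist = song_to_artist.get(song_id)
        if artist is None:
            continue  # action on a song that is missing from the song table
        counts[artist][action_type][day] += 1
    return counts


counts = daily_counts(actions, song_to_artist, n_days=3)
print(counts[0][1])  # daily play counts of artist 0
```

Each of the three lists per artist can then be passed to a plotting call to obtain one curve per action type.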
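## le.py
Code for learning and testing.
* In ipython, run the learning and testing code (le.py invokes pp.py itself)
```
%run le.py
```
* Run `fitJU(j)` to fit a polynomial to the play counts of the j-th user and plot the result
```
fitJU(0)
```
The result is shown below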
![](s1d1/pic/artist_trend_predict_0.png)
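Polynomial fitting of a daily play-count series can be sketched with numpy's `polyfit` and `polyval`. This is an illustration under stated assumptions rather than the implementation in `le.py`: the helper name `fit_and_extrapolate` and the synthetic linear series are made up for the example.

```python
import numpy as np


def fit_and_extrapolate(plays, degree, horizon):
    """Fit a polynomial of the given degree to a daily series and
    evaluate it on the `horizon` days that follow the series."""
    days = np.arange(len(plays))
    coeffs = np.polyfit(days, plays, degree)  # least-squares fit
    future_days = np.arange(len(plays), len(plays) + horizon)
    return np.polyval(coeffs, future_days)


# Synthetic series (hypothetical): 100 plays on day 0, growing by 2 per day.
plays = np.array([100.0 + 2.0 * d for d in range(10)])
pred = fit_and_extrapolate(plays, degree=1, horizon=3)
print(pred)
```

Higher degrees follow the training window more closely but tend to diverge quickly outside it, so low degrees are usually the safer choice when extrapolating weeks ahead.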
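## test(degree)
The `test(degree)` function uses the play counts of the first 4 months to predict the play counts of the following two months, compares the predictions with the true values, and computes the overall F value, implementing the F-value formula shown in the figure below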
![](F.png)
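For readers who cannot see the figure, here is a hedged sketch of an F value of this kind. It assumes the form commonly quoted for this competition: for each artist, sigma is the root mean square of the relative daily errors, phi is the square root of the artist's total true plays, and F sums `(1 - sigma) * phi` over artists. The figure above remains the authoritative definition; the function name `f_value` is hypothetical, and all true counts are assumed to be positive.

```python
import math


def f_value(pred, true):
    """pred and true are lists of per-artist daily play-count lists.

    Assumed formula (check against the figure):
      sigma_j = sqrt(mean_k(((S_jk - T_jk) / T_jk) ** 2))
      phi_j   = sqrt(sum_k(T_jk))
      F       = sum_j((1 - sigma_j) * phi_j)
    True counts T_jk are assumed to be strictly positive.
    """
    total = 0.0
    for s_j, t_j in zip(pred, true):
        rel_sq = [((s - t) / float(t)) ** 2 for s, t in zip(s_j, t_j)]
        sigma = math.sqrt(sum(rel_sq) / len(rel_sq))
        phi = math.sqrt(sum(t_j))
        total += (1.0 - sigma) * phi
    return total


true = [[100, 100, 100, 100]]
perfect = f_value(true, true)                   # no error: F equals phi
noisy = f_value([[110, 90, 110, 90]], true)     # 10% relative error every day
print(perfect, noisy)
```

Under this form, artists with more plays carry more weight, and an artist whose relative error exceeds 100% contributes negatively to the total.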
# Second dataset (100 artists), folder `s1d2`
## data folder
Holds the two data files `p2_mars_tianchi_songs.csv` and `p2_mars_tianchi_user_actions.csv`.
* Download link: https