# Concept Drift Datasets v1.0
## Background
**Concept drift** describes unforeseeable changes in the underlying distribution of streaming data over time[1]. The concept drift problem arises in many real-world situations, such as sensor drift and changes of operating mode[2][3]. Detecting concept drift promptly and accurately is important for judging system state and supporting decisions[4]. To better test and evaluate concept drift detection algorithms, we have built datasets with known drift types and drift time points, hoping to support the development of concept drift detection.
## Usage
- To use the datasets in the project directly, download them and load them with the pandas library.
- Example:
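```python
import pandas as pd

# Replace 'xxxxxx' with the directory that holds the downloaded CSV file.
data = pd.read_csv('xxxxxx/nonlinear_gradual_chocolaterotation_noise_and_redunce.csv')
data = data.values   # convert the DataFrame to a NumPy array
X = data[:, 0:5]     # first five columns: attributes
Y = data[:, 5]       # sixth column: class label
```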
- Alternatively (**recommended**), download *DatasetsInput.py* and import the `Datasets` class, as shown in *DatasetsInput_main.py*.
- Example:
```python
from DatasetsInput import Datasets

Data = Datasets()
# The method returns the attributes X and the labels Y of the chosen dataset.
X, Y = Data.CNNS_Nonlinear_Gradual_ChocolateRotation()
```
- To regenerate a dataset and import it directly, download *DataStreamGenerator.py*, place it in the same directory as your code, and import the class.
- Example:
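```python
from DataStreamGenerator import DataStreamGenerator

# Two classes, two attributes, 100000 samples, with noise and redundant variables enabled.
C = DataStreamGenerator(class_count=2, attribute_count=2, sample_count=100000, noise=True, redunce_variable=True)
# Generate the RollingTorus stream with sudden drift.
X, Y = C.Nonlinear_Sudden_RollingTorus(plot=True, save=True)
```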
- To modify the source code, download *DataStreamGenerator.py* and edit it directly.
## Dataset Introduction
We provide four categories of datasets: *linear*, *rotating cake*, *rotating chocolate* and *rolling torus*. Each category covers four types of drift: *Abrupt*, *Sudden*, *Gradual* and *Recurrent*. Users can choose whether to plot the sample distribution in real time, save the pictures, and generate videos of how the samples change. Users can also choose whether to add noise or redundant variables. See the figure below for a more detailed introduction. Note that each dataset name in the figure is also the name of the corresponding method of the class.
<div align=center><img src="https://github.com/songqiaohu/pictureandgif/blob/main/QQ%E5%9B%BE%E7%89%8720230105200626.png"/></div>
### Linear
In the dataset *Linear*, the decision boundary is a straight line. We simulate the change of the decision boundary by rotating this line. Users can freely choose the position of the rotation axis within [-10, 10]×[-10, 10].
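The rotating boundary can be sketched as follows. This is a minimal illustration and not the code in *DataStreamGenerator.py*: the function name `label_linear`, its parameters, the sampling range, and the choice of which side of the line is class 1 are all assumptions made for the example.

```python
import numpy as np

def label_linear(X, angle, axis=(0.0, 0.0)):
    """Assign class 1 to points on the left of a line through `axis`.

    The line has direction (cos(angle), sin(angle)). Hypothetical helper,
    not part of the repository's API.
    """
    X = np.asarray(X, dtype=float)
    # Normal vector pointing to the left of the line's direction.
    normal = np.array([-np.sin(angle), np.cos(angle)])
    return ((X - np.asarray(axis, dtype=float)) @ normal > 0).astype(int)

# Samples drawn uniformly from [-10, 10] x [-10, 10] (assumed for this example).
rng = np.random.default_rng(0)
X = rng.uniform(-10, 10, size=(1000, 2))

# Rotating the line by 90 degrees changes the concept: some labels flip.
y_old = label_linear(X, angle=0.0)
y_new = label_linear(X, angle=np.pi / 2)
print("fraction of labels changed:", np.mean(y_old != y_new))
```

Changing `angle` in one step or in many small steps over the stream is one way to picture the difference between a sudden and a gradual rotation.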
- Data distribution display:
<div align=center><img src="https://github.com/songqiaohu/pictureandgif/blob/main/figure_linear_gradual_rotation_noise_and_redunce.gif?raw=true" width="320px" height="240px"/>
<img src="https://github.com/songqiaohu/pictureandgif/blob/main/figure_linear_sudden_rotation_noise_and_redunce.gif?raw=true" width="320px" height="240px"/>
</div>
<p align="center">(a)Gradual                (b)Sudden</p>
<div align=center><img src="https://github.com/songqiaohu/pictureandgif/blob/main/figure_linear_recurrent_rotation_noise_and_redunce.gif?raw=true" width="320px" height="240px" alt="Recurrent"/>
<img src="https://github.com/songqiaohu/pictureandgif/blob/main/figure_linear_abrupt_noise_and_redunce.gif?raw=true" width="320px" height="240px"/>
</div>
<p align="center">(c)Recurrent                (d)Abrupt</p>
### CakeRotation
In the dataset *CakeRotation*, the disk is divided into numbered angular areas: samples in odd-numbered areas belong to one class, while samples in even-numbered areas belong to the other class. We simulate concept drift by rotating the disk, so the angular range of each area changes during the rotation. **If you need datasets with more than two classes, you can take the area index modulo the number of classes instead of using odd and even numbers[5].**
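A minimal sketch of this labeling rule is shown below. It is an illustration under stated assumptions, not the repository's implementation: the helper `label_cake`, the default of eight equal angular areas, and the parameter names are hypothetical.

```python
import numpy as np

def label_cake(X, rotation=0.0, n_sectors=8, n_classes=2):
    """Label 2-D points by the index of the angular area they fall in.

    `rotation` is the current rotation of the disk in radians. With
    n_classes=2 this is the odd/even rule; larger values give the
    modulus-based multi-class variant. Hypothetical helper.
    """
    X = np.asarray(X, dtype=float)
    # Polar angle of each sample relative to the rotated disk, in [0, 2*pi).
    theta = (np.arctan2(X[:, 1], X[:, 0]) - rotation) % (2 * np.pi)
    width = 2 * np.pi / n_sectors
    # The extra modulo guards against theta rounding up to exactly 2*pi.
    sector = np.floor(theta / width).astype(int) % n_sectors
    return sector % n_classes

# Three points at 22.5, 67.5 and 112.5 degrees on the unit circle.
angles = np.deg2rad([22.5, 67.5, 112.5])
P = np.column_stack([np.cos(angles), np.sin(angles)])

print(label_cake(P))                        # labels before the disk rotates
print(label_cake(P, rotation=np.pi / 4))    # labels after a 45 degree rotation
print(label_cake(P, n_classes=3))           # three-class variant
```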
- Data distribution display:
<div align=center><img src="https://github.com/songqiaohu/pictureandgif/blob/main/figure_nonlinear_gradual_cakerotation_noise_and_redunce.gif?raw=true" width="320px" height="240px"/>
<img src="https://github.com/songqiaohu/pictureandgif/blob/main/figure_nonlinear_sudden_cakerotation_noise_and_redunce.gif?raw=true" width="320px" height="240px"/>
</div>
<p align="center">(a)Gradual                (b)Sudden</p>
<div align=center><img src="https://github.com/songqiaohu/pictureandgif/blob/main/figure_nonlinear_recurrent_cakerotation_noise_and_redunce.gif?raw=true" width="320px" height="240px" alt="Recurrent"/>
<img src="https://github.com/songqiaohu/pictureandgif/blob/main/figure_nonlinear_abrupt_cakerotation_noise_and_redunce.gif?raw=true" width="320px" height="240px"/>
</div>
<p align="center">(c)Recurrent                (d)Abrupt</p>
### ChocolateRotation
In the dataset *ChocolateRotation*, samples in areas with odd *x+y* belong to one class, while samples in areas with even *x+y* belong to the other class. We simulate concept drift by rotating the chocolate plate: a rotation matrix gives the coordinates of the samples in the new coordinate system, and the samples are then reclassified. **If you need datasets with more than two classes, you can take *x+y* modulo the number of classes instead of using odd and even numbers.**
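The rotate-then-reclassify step can be sketched as below. This is an illustration only: the helper `label_chocolate`, the unit cell size, and the use of the integer parts of the rotated coordinates as the "area" index are assumptions, not details confirmed by the repository.

```python
import numpy as np

def label_chocolate(X, angle=0.0, cell_size=1.0, n_classes=2):
    """Checkerboard labels in the coordinate system of the rotated plate.

    `angle` is the plate's rotation in radians. Hypothetical helper.
    """
    X = np.asarray(X, dtype=float)
    c, s = np.cos(angle), np.sin(angle)
    # Rotation matrix mapping world coordinates to plate coordinates.
    to_plate = np.array([[c, s],
                         [-s, c]])
    Xp = X @ to_plate.T
    # Integer cell indices along the plate's two axes.
    cells = np.floor(Xp / cell_size).astype(int)
    # Parity of x+y for two classes, modulus for more classes.
    return (cells[:, 0] + cells[:, 1]) % n_classes

P = np.array([[0.5, 0.5], [1.5, 0.5], [0.5, 1.5]])
print(label_chocolate(P))                      # plate not rotated
print(label_chocolate(P, angle=np.pi / 2))     # plate rotated by 90 degrees
```

The samples themselves never move; only their coordinates in the plate's frame change, which is why the same point can receive a different label after the rotation.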
- Data distribution display:
<div align=center><img src="https://github.com/songqiaohu/pictureandgif/blob/main/nonlinear_gradual_chocolaterotation_noise_and_redunce.gif?raw=true" width="320px" height="240px"/>
<img src="https://github.com/songqiaohu/pictureandgif/blob/main/figure_nonlinear_sudden_chocolaterotation_noise_and_redunce.gif?raw=true" width="320px" height="240px"/>
</div>
<p align="center">(a)Gradual                (b)Sudden</p>
<div align=center><img src="https://github.com/songqiaohu/pictureandgif/blob/main/figure_nonlinear_recurrent_chocolaterotation_noise_and_redunce.gif?raw=true" width="320px" height="240px" alt="Recurrent"/>
<img src="https://github.com/songqiaohu/pictureandgif/blob/main/figure_nonlinear_abrupt_chocolaterotation_noise_and_redunce.gif?raw=true" width="320px" height="240px"/>
</div>
<p align="center">(c)Recurrent                (d)Abrupt</p>
### RollingTorus
In the dataset *RollingTorus*, two tori of the same size are placed close together, and the samples in each torus belong to a different class. A third torus then rolls over them at a constant speed, and the samples of the first two tori that it overlaps switch to the opposite class. **If you need a dataset with an imbalanced number of samples per class, you can adjust the initial torus radii[6].**
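The label-flipping rule can be pictured with the rough sketch below. It rests on several assumptions that the description does not confirm: each torus is treated as a 2-D ring (annulus), the rolling ring moves along the x-axis, and all names, radii, and positions are hypothetical.

```python
import numpy as np

def in_ring(X, center, r_inner, r_outer):
    """Boolean mask of points whose distance to `center` lies in [r_inner, r_outer]."""
    diff = np.asarray(X, dtype=float) - np.asarray(center, dtype=float)
    d = np.linalg.norm(diff, axis=1)
    return (d >= r_inner) & (d <= r_outer)

def label_rolling_torus(X, base_labels, t, start=(-10.0, 0.0), speed=0.1,
                        r_inner=1.0, r_outer=3.0):
    """Flip the binary labels of samples covered by the rolling ring at time t."""
    # Constant-speed motion along the x-axis (assumed direction).
    center = np.array([start[0] + speed * t, start[1]])
    y = np.asarray(base_labels).copy()
    covered = in_ring(X, center, r_inner, r_outer)
    y[covered] = 1 - y[covered]
    return y

# Two samples, one from each fixed ring (assumed centres at x = -3 and x = 3).
X = np.array([[-1.0, 0.0], [5.0, 0.0]])
base = np.array([0, 1])
print(label_rolling_torus(X, base, t=0))     # rolling ring still far away
print(label_rolling_torus(X, base, t=70))    # rolling ring covers the first sample
```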
- Data distribution display:
<div align=center><img src="https://github.com/songqiaohu/pictureandgif/blob/main/figure_nonlinear_gradual_rollingtorus_noise_and_redunce.gif?raw=true" width="320px" height="240px"/>
<img src="https://github.com/songqiaohu/pictureandgif/blob/main/figure_nonlinear_sudden_rollingtorus_noise_and_redunce.gif?raw=true" width="320px" height="240px"/>
</div>
<p align="center">(a)Gradual                (b)Sudden</p>
<div align=center><img src="https://github.com/songqiaohu/pictureandgif/blob/main/figure_nonlinear_recurrent_rollingtorus_noise_and_redunce.gif?raw=tr
These are the concept drift datasets we made. The data and the corresponding interfaces are open source and free to use: https://github.com/songqiaohu/THU-Concept-Drift-Datasets-v1.0