# *k*-Shape: Efficient and Accurate Clustering of Time Series
*k*-Shape is a highly accurate and efficient unsupervised method for ***univariate*** and ***multivariate*** time-series clustering. *k*-Shape appeared at the ***ACM SIGMOD 2015*** conference, where it was selected as one of the two ***best papers*** and received the inaugural ***2015 ACM SIGMOD Research Highlight Award***. An extended version appeared in the ***ACM TODS 2017*** journal. Since then, *k*-Shape has achieved state-of-the-art performance on both ***univariate*** and ***multivariate*** time-series datasets: it is among the fastest and most accurate time-series clustering methods and ranks in the top positions of established benchmarks with 100+ datasets.
*k*-Shape has been widely adopted across scientific areas (e.g., computer science, social science, space science, engineering, econometrics, biology, neuroscience, and medicine), Fortune 100-500 enterprises (e.g., Exelon, Nokia, and many financial firms), and organizations such as the European Space Agency.
If you use *k*-Shape in your project or research, cite the following two papers:
* [ACM SIGMOD 2015](https://www.paparrizos.org/papers/PaparrizosSIGMOD15.pdf)
* [ACM TODS 2017](https://www.paparrizos.org/papers/PaparrizosTODS17.pdf)
## References
> "k-Shape: Efficient and Accurate Clustering of Time Series"<br/>
> John Paparrizos and Luis Gravano<br/>
> 2015 ACM SIGMOD International Conference on Management of Data (**ACM SIGMOD 2015**)<br/>
```bibtex
@inproceedings{paparrizos2015k,
title={{k-Shape: Efficient and Accurate Clustering of Time Series}},
author={Paparrizos, John and Gravano, Luis},
booktitle={Proceedings of the 2015 ACM SIGMOD international conference on management of data},
pages={1855--1870},
year={2015}
}
```
> "Fast and Accurate Time-Series Clustering"<br/>
> John Paparrizos and Luis Gravano<br/>
> ACM Transactions on Database Systems (**ACM TODS 2017**), volume 42(2), pages 1-49<br/>
```bibtex
@article{paparrizos2017fast,
title={{Fast and Accurate Time-Series Clustering}},
author={Paparrizos, John and Gravano, Luis},
journal={ACM Transactions on Database Systems (ACM TODS)},
volume={42},
number={2},
pages={1--49},
year={2017}
}
```
## Acknowledgements
We thank [Teja Bogireddy](https://github.com/bogireddytejareddy) for his valuable help on this repository.
# *k*-Shape's Matlab Repository
This repository contains the Matlab implementation of *k*-Shape. For the Python version, see the [kshape-python repository](https://github.com/thedatumorg/kshape-python).
## Data
To ease reproducibility, we share our results over two established benchmarks:
* The UCR Univariate Archive, which contains 128 univariate time-series datasets.
    * Download all 128 preprocessed datasets [here](https://www.thedatum.org/datasets/UCR2022_DATASETS.zip).
* The UAE Multivariate Archive, which contains 28 multivariate time-series datasets.
    * Download the first 14 preprocessed datasets [here](https://www.thedatum.org/datasets/UAE2022_DATASETS_1.zip).
    * Download the remaining 14 preprocessed datasets [here](https://www.thedatum.org/datasets/UAE2022_DATASETS_2.zip).
For the preprocessing steps, see the [UCRArchiveFixes repository](https://github.com/thedatumorg/UCRArchiveFixes).
## Usage
### Univariate Example
```
$ matlab
> Datasets = {'Coffee'};
> DS = LoadUCRdataset(Datasets{1});
> [labels, centroids] = kShape_univariate(DS.Data, length(DS.ClassNames));
```
### Multivariate Example
```
$ matlab
> Datasets = {'ERing'};
> DS = LoadUAEdataset(Datasets{1});
> [labels, centroids] = kShape_multivariate(DS.Data, length(DS.ClassNames));
```
Check the [Univariate](https://github.com/thedatumorg/kshape-matlab/blob/main/DEMO_univariate.m) and [Multivariate](https://github.com/thedatumorg/kshape-matlab/blob/main/DEMO_multivariate.m) code examples for benchmarking on the UCR and UAE datasets, respectively.
## Results
The following tables contain the average Rand Index (RI), Adjusted Rand Index (ARI), and Normalized Mutual Information (NMI) accuracy values over 10 runs for *k*-Shape on the univariate and multivariate datasets.
Note: We collected the results using a single-core implementation.
Server Specifications: Dual Intel(R) Xeon(R) Silver 4116 (24 cores/48 HT), 2.10 GHz, 196GB RAM.
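
As a rough illustration of what the Rand Index (RI) measures, the following Matlab sketch computes the fraction of time-series pairs on which a ground-truth partition and a predicted partition agree (both place the pair in the same cluster, or both place it in different clusters). The function name `rand_index_sketch` and its arguments are hypothetical and are not taken from this repository's code; the sketch assumes Matlab R2016b or later (for implicit expansion) and at least two labeled series.

```matlab
% rand_index_sketch.m
% Hypothetical helper for illustration only; not the repository's implementation.
function ri = rand_index_sketch(labelsTrue, labelsPred)
    % Force column vectors so the comparisons below yield n-by-n matrices.
    labelsTrue = labelsTrue(:);
    labelsPred = labelsPred(:);
    n = numel(labelsTrue);
    assert(n == numel(labelsPred), 'Label vectors must have equal length.');

    % Entry (i,j) is true when series i and j share a label.
    sameTrue = labelsTrue == labelsTrue';
    samePred = labelsPred == labelsPred';

    % Consider each unordered pair once (strict upper triangle).
    pairs = triu(true(n), 1);

    % Pairs on which the two partitions agree.
    agree = sum(sameTrue(pairs) == samePred(pairs));

    ri = agree / nchoosek(n, 2);
end
```

For example, `rand_index_sketch([1 1 2 2], [1 1 2 3])` evaluates to 5/6: of the six pairs, only the pair formed by the third and fourth series is grouped together in one partition and split in the other. Building the full n-by-n agreement matrices keeps the sketch short but uses quadratic memory, so a contingency-table formulation would be preferable for large datasets.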
### Results on the 128 univariate datasets:
| Datasets | RI | ARI | NMI | Runtime (secs) |
|:-----------------------:|:------------:|:------------:|:------------:|:-----------:|
| ACSF1 | 0.720130 | 0.133853 | 0.38816 | 16.44156 |
| Adiac | 0.950243 | 0.245107 | 0.5885544 | 65.73705 |
| AllGestureWiimoteX | 0.8312724 | 0.097974 | 0.206865 | 44.08482 |
| AllGestureWiimoteY | 0.8322620 | 0.1298562 | 0.2612072 | 41.394241 |
| AllGestureWiimoteZ | 0.8305639 | 0.0805551 | 0.1834998 | 36.600462 |
| ArrowHead | 0.623006 | 0.17425450 | 0.2533444 | 1.3054324 |
| BME | 0.623202 | 0.1905601 | 0.2877219 | 0.4999676 |
| Beef | 0.6586440 | 0.093608 | 0.27548189 | 0.8396471 |
| BeetleFly | 0.52217948 | 0.04438771 | 0.05563189 | 0.536824 |
| BirdChicken | 0.557179 | 0.1147453 | 0.1115865 | 0.3971603 |
| CBF | 0.8754116 | 0.7241217 | 0.76718 | 2.6086939 |
| Car | 0.662184 | 0.135845 | 0.2161395 | 2.8708315 |
| Chinatown | 0.5275538 | 0.043759 | 0.016924 | 0.3752007 |
| ChlorineConcentration | 0.5261843 | -0.0009891 | 0.0007648 | 45.2590362 |
| CinCECGTorso | 0.63144 | 0.0627054 | 0.105833 | 118.6877229 |
| Coffee | 0.7746103 | 0.549642 | 0.5130821 | 0.1543948 |
| Computers | 0.5296809 | 0.05959965 | 0.0573684 | 4.7628411 |
| CricketX | 0.86968 | 0.17770 | 0.358468 | 18.7591078 |
| CricketY | 0.8716223 | 0.202953 | 0.372466 | 20.3061207 |
| CricketZ | 0.8708478 | 0.181479 | 0.366086 | 23.5766044 |
| Crop | 0.922896 | 0.2378824 | 0.4379565 | 2016.38332 |
| DiatomSizeReduction | 0.919138 | 0.8000443 | 0.82079 | 2.0050213 |
| DistalPhalanxOutlineAgeGroup | 0.7089805 | 0.40880 | 0.3327341 | 1.8217814 |
| DistalPhalanxOutlineCorrect | 0.4994557 | -0.0010303 | 2.97467e-05 | 0.9867624 |
| DistalPhalanxTW | 0.861218 | 0.66677209 | 0.5412476 | 2.6783893 |
| DodgerLoopDay | 0.7667177 | 0.2080549 | 0.403120 | 1.537474 |
| DodgerLoopGame | 0.5592195 | 0.118973 | 0.1007804 | 0.4339152 |
| DodgerLoopWeekend | 0.883705 | 0.7639901 | 0.726488 | 0.4244024 |
| ECG200 | 0.615723 | 0.221028 | 0.1355204 | 0.3517059 |
| ECG5000 | 0.794273 | 0.5789588 | 0.551086 | 62.80226 |
| ECGFiveDays | 0.8450622 | 0.69024 | 0.65035860 | 1.9896606 |
| EOGHorizontalSignal | 0.8621825 | 0.22106 | 0.3988588 | 76.3076898 |
| EOGVerticalSignal | 0.8712630 | 0.1987407 | 0.3630311 | 136.628252 |
| Earthquakes | 0.515463 | 0.002441935 | 0.00365934 | 5.8951894 |
| ElectricDevices | 0.699713 | 0.08102712 | 0.1900975 | 798.8596981 |
| EthanolLevel | 0.622721 | 0.0032826 | 0.0076 | 63.510865 |
| FaceAll | 0.914647 | 0.446507 | 0.621303 | 77.628496 |
| FaceFour | 0.756274 | 0.37390466 | 0.459848 | 0.666998 |
| FacesUCR | 0.905414 | 0.407250 | 0.602981 | 82.0091669 |
| FiftyWords | 0.951268 | 0.353808 | 0.646822 | 77.2777564 |
| Fish | 0.78469 | 0.1885622 | 0.31931 | 7.698090 |
| FordA | 0.5729417 | 0.14588 | 0.108051 | 392.9051991 |
| FordB | 0.512885 | 0.025769 | 0.0192114 | 338.176240 |
| FreezerRegularTrain | 0.638638 | 0.277277 | 0.211358 | 21.8496562 |
| FreezerSmallTrain | 0.63912 | 0.2782464 | 0.2121770 | 20.948636 |
| Fungi | 0.8383608 | 0.370585 | 0.7441787 | 2.4766722 |
| GestureMidAirD1 | 0.944996 | 0.2924181 | 0.630078 | 20.7455286 |
| GestureMidAirD2 | 0.945983 | 0.32512 | 0.668287 | 17.3725095 |
| GestureMidAirD3 | 0.93191 | 0.1287144 | 0.462995 | 20.0660984 |
| GesturePebbleZ1 | 0.882812 | 0.58672 | 0.672185 | 5.3699548 |
| GesturePebbleZ2 | 0.865687 | 0.531216 | 0.627707 | 5.6506422 |
| GunPoint | 0.497487 | -0.005050 | 0.0 | 0.27812729 |
| GunPointAgeSpan | 0.532133 | 0.06442548 | 0.0534333 | 0.8154745 |
| GunPointMaleVersusFemale | 0.7919389 | 0.583864 | 0.574584 | 0.9656176 |
| GunPointOldVersusYoung | 0.518569 | 0.0371419 | 0.02792863 | 0.8558587 |
| Ham |