SDWPF: A Dataset for Spatial Dynamic Wind Power Forecasting
Challenge at KDD Cup 2022
Jingbo Zhou¹∗, Xinjiang Lu¹∗, Yixiong Xiao¹∗, Jiantao Su³, Junfu Lyu⁴, Yanjun Ma², Dejing Dou¹
¹Baidu Research, ²Baidu Inc., ³Longyuan Power Group Corp. Ltd., ⁴Tsinghua University
{zhoujingbo, luxinjiang, xiaoyixiong, mayanjun02, doudejing}@baidu.com,
³12091329@chnenergy.com.cn, ⁴lvjf@mail.tsinghua.edu.cn
ABSTRACT
The variability of wind power supply can present substantial
challenges to incorporating wind power into a grid system. Thus,
Wind Power Forecasting (WPF) has been widely recognized as one of
the most critical issues in wind power integration and operation.
There has been an explosion of studies on wind power forecasting
problems in the past decades. Nevertheless, handling the WPF
problem well remains challenging, since high prediction accuracy
is always demanded to ensure grid stability and security of supply.
We present a unique Spatial Dynamic Wind Power Forecasting
dataset: SDWPF, which includes the spatial distribution of wind
turbines as well as the dynamic context factors. Most existing
datasets, by contrast, cover only a small number of wind turbines
and lack the locations and context information of the turbines at
a fine-grained time scale. SDWPF provides the wind power data of
134 wind turbines from a wind farm over half a year, together with
their relative positions and internal statuses. We use this dataset
to launch the Baidu KDD Cup 2022 to examine the limits of current
WPF solutions. The dataset is released at
https://aistudio.baidu.com/aistudio/competition/detail/152/0/datasets.
ACM Reference Format:
Jingbo Zhou, Xinjiang Lu, Yixiong Xiao, Jiantao Su, Junfu Lyu,
Yanjun Ma, Dejing Dou. 2022. SDWPF: A Dataset for Spatial Dynamic
Wind Power Forecasting Challenge at KDD Cup 2022. In Proceedings of
The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
(Baidu KDD Cup 2022). ACM, New York, NY, USA, 4 pages.
https://doi.org/10.1145/nnnnnnn.nnnnnnn
1 INTRODUCTION
Wind Power Forecasting (WPF) aims to accurately estimate the
wind power supply of a wind farm at different time scales. Wind
power is a clean and safe source of renewable energy, but it
cannot be produced consistently, which leads to high variability.
Such variability can present substantial challenges to incorporating
wind power into a grid system. To maintain the balance between
electricity generation and consumption, the fluctuation of wind
power requires power substitution from other sources that might not
be available at short notice (for example, it usually takes at least
6 hours to fire up a coal plant). Thus, WPF has been widely
recognized as one of the most critical issues in wind power
integration and operation. There has been an explosion of studies on
wind power forecasting problems in the data mining and machine
learning community. Nevertheless, handling the WPF problem well
remains challenging, since high prediction accuracy is always
demanded to ensure grid stability and security of supply.
∗ Equal contribution.
Baidu KDD Cup 2022, Mar. 16 – Jul. 17, 2022. © 2022 Association for Computing Machinery.
We present a unique Spatial Dynamic Wind Power Forecasting
dataset: SDWPF, which includes the spatial distribution of wind
turbines as well as dynamic context factors such as temperature,
weather, and turbine internal status. Many existing datasets and
competitions, by contrast, treat WPF as a time series prediction
problem without knowing the locations and context information
of wind turbines.
SDWPF is obtained from real-world data from Longyuan
Power Group Corp. Ltd. (the largest wind power producer in China
and Asia). Two features distinguish this competition task from
previous WPF competition settings: 1) Spatial distribution: this
competition provides the relative locations of all wind turbines
in a wind farm for modeling the spatial correlation among wind
turbines. 2) Dynamic context: the weather conditions and turbine
internal status detected by each wind turbine are provided to
facilitate the forecasting task.
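As a rough illustration of how relative turbine locations could feed a spatial model, the sketch below builds a k-nearest-neighbor graph from planar coordinates. It is not the method used by the challenge baseline. The function name `k_nearest_neighbors`, the dictionary layout, and the four toy coordinates are all assumptions made for this example; they are not taken from SDWPF.

```python
import math


def k_nearest_neighbors(positions, k):
    """Return, for each turbine id, the ids of its k closest turbines.

    positions: dict mapping a turbine id to its relative (x, y) location.
    This layout is hypothetical and chosen only for the example.
    """
    neighbors = {}
    for tid, (x, y) in positions.items():
        # Euclidean distance from this turbine to every other turbine.
        dists = [
            (math.hypot(x - ox, y - oy), other)
            for other, (ox, oy) in positions.items()
            if other != tid
        ]
        dists.sort()
        neighbors[tid] = [other for _, other in dists[:k]]
    return neighbors


# Made-up coordinates (arbitrary units) for four hypothetical turbines.
toy_positions = {
    1: (0.0, 0.0),
    2: (100.0, 0.0),
    3: (0.0, 300.0),
    4: (1000.0, 1000.0),
}

toy_graph = k_nearest_neighbors(toy_positions, k=2)
```

A neighbor list of this kind is one possible input for models that share information between nearby turbines. Other choices, such as a distance threshold or weights that decay with distance, would serve the same purpose.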
Aiming to examine the limits of WPF methods, we use the
SDWPF dataset to launch the Baidu KDD Cup 2022 Challenge. SDWPF
contains the wind power data obtained from the Supervisory
Control And Data Acquisition (SCADA) system of a wind farm
with 134 wind turbines. The dataset provides information
about the wind, temperature, turbine angle, and historical wind
power. The time range of the dataset is over half a year. We also
provide a baseline for this dataset¹. An introduction to the
challenge can be found on the Baidu KDD Cup 2022 website², and the
dataset can be downloaded after registration³.
2 RELATED WORK
Wind power forecasting (WPF) has been extensively investigated
over the past decades [4, 5, 13, 15]. According to the spatial scale
of the wind power, the problem can be categorised as forecasting for
a single wind turbine, a wind farm, or a group of wind farms [9]. The
dataset of this challenge belongs to the wind farm scale. A number of
carefully designed models have been proposed for the WPF problem at
various spatial and temporal scales, based on statistical models
[12, 13], machine learning methods [8, 17], and deep learning
methods [6, 14].
¹ https://github.com/PaddlePaddle/PaddleSpatial/tree/main/apps/wpf_baseline_gru
² https://aistudio.baidu.com/aistudio/competition/detail/152/0/introduction
³ https://aistudio.baidu.com/aistudio/competition/detail/152/0/datasets