[Free] Electricity consumption data and electricity consumption forecasting model, xgboost resource - CSDN Library

22 files in total

csv: 13

xml: 4

zip: 1

Electricity consumption forecasting

Points required: 0 · 100 views · Uploaded 2024-05-13 10:22:26 · 21.21MB ZIP

Energy.zip (22 files)

Energy

main.py 3KB

EnergyData

DEOK_hourly.csv 1.49MB

NI_hourly.csv 1.55MB

est_hourly.paruqet 3.51MB

AEP_hourly.csv 3.24MB

COMED_hourly.csv 1.76MB

PJME_hourly.csv 3.88MB

DUQ_hourly.csv 3.07MB

EKPC_hourly.csv 1.16MB

archive.zip 11.42MB

PJM_Load_hourly.csv 900KB

DOM_hourly.csv 3.06MB

DAYTON_hourly.csv 3.12MB

FE_hourly.csv 1.62MB

PJMW_hourly.csv 3.69MB

pjm_hourly_est.csv 12.12MB

.idea

workspace.xml 3KB

misc.xml 310B

Energy.iml 291B

inspectionProfiles

profiles_settings.xml 174B

modules.xml 271B

.gitignore 50B

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import xgboost as xgb
from sklearn.metrics import mean_squared_error

color_pal = sns.color_palette()
plt.style.use('fivethirtyeight')

# Load the hourly PJME load data and index it by timestamp
df = pd.read_csv('./EnergyData/PJME_hourly.csv')
df = df.set_index('Datetime')
df.index = pd.to_datetime(df.index)

df.plot(style='.', figsize=(15, 5), color=color_pal[0], title='PJME Energy Use in MW')
plt.show()

# Chronological split: train on data before 2015, test on 2015 onward
train = df.loc[df.index < '01-01-2015']
test = df.loc[df.index >= '01-01-2015']

fig, ax = plt.subplots(figsize=(15, 5))
train.plot(ax=ax, label='Training Set', title='Data Train/Test Split')
test.plot(ax=ax, label='Test Set')
ax.axvline('01-01-2015', color='black', ls='--')
ax.legend(['Training Set', 'Test Set'])
plt.show()

# Zoom in on a single week of data
df.loc[(df.index > '01-01-2010') & (df.index < '01-08-2010')] \
    .plot(figsize=(15, 5), title='Week Of Data')
plt.show()


def create_features(df):
    """
    Create time series features based on time series index.
    """
    df = df.copy()
    df['hour'] = df.index.hour
    df['dayofweek'] = df.index.dayofweek
    df['quarter'] = df.index.quarter
    df['month'] = df.index.month
    df['year'] = df.index.year
    df['dayofyear'] = df.index.dayofyear
    df['dayofmonth'] = df.index.day
    df['weekofyear'] = df.index.isocalendar().week
    return df


df = create_features(df)

# Distribution of load by hour of day and by month
fig, ax = plt.subplots(figsize=(10, 8))
sns.boxplot(data=df, x='hour', y='PJME_MW')
ax.set_title('MW by Hour')
plt.show()

fig, ax = plt.subplots(figsize=(10, 8))
sns.boxplot(data=df, x='month', y='PJME_MW', palette='Blues')
ax.set_title('MW by Month')
plt.show()

train = create_features(train)
test = create_features(test)

FEATURES = ['dayofyear', 'hour', 'dayofweek', 'quarter', 'month', 'year']
TARGET = 'PJME_MW'

X_train = train[FEATURES]
y_train = train[TARGET]

X_test = test[FEATURES]
y_test = test[TARGET]

# 'reg:squarederror' replaces the deprecated alias 'reg:linear'
reg = xgb.XGBRegressor(base_score=0.5, booster='gbtree',
                       n_estimators=1000,
                       early_stopping_rounds=50,
                       objective='reg:squarederror',
                       max_depth=3,
                       learning_rate=0.01)
reg.fit(X_train, y_train,
        eval_set=[(X_train, y_train), (X_test, y_test)],
        verbose=100)

fi = pd.DataFrame(data=reg.feature_importances_,
                  index=reg.feature_names_in_,
                  columns=['importance'])
fi.sort_values('importance').plot(kind='barh', title='Feature Importance')
plt.show()

# Attach test-set predictions to the full frame for plotting
test['prediction'] = reg.predict(X_test)
df = df.merge(test[['prediction']], how='left', left_index=True, right_index=True)
ax = df[['PJME_MW']].plot(figsize=(15, 5))
df['prediction'].plot(ax=ax, style='.')
plt.legend(['Truth Data', 'Predictions'])
ax.set_title('Raw Data and Predictions')
plt.show()

ax = df.loc[(df.index > '04-01-2018') & (df.index < '04-08-2018')]['PJME_MW'] \
    .plot(figsize=(15, 5), title='Week Of Data')
df.loc[(df.index > '04-01-2018') & (df.index < '04-08-2018')]['prediction'] \
    .plot(style='.')
plt.legend(['Truth Data', 'Prediction'])
plt.show()

score = np.sqrt(mean_squared_error(test['PJME_MW'], test['prediction']))
print(f'RMSE Score on Test set: {score:0.2f}')

# Ten test-set days with the largest mean absolute error
test['error'] = np.abs(test[TARGET] - test['prediction'])
test['date'] = test.index.date
print(test.groupby(['date'])['error'].mean().sort_values(ascending=False).head(10))
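
The last few lines of the script report an RMSE and then rank calendar days by mean absolute error. As a rough illustration of what those two quantities compute, here is a minimal standard-library sketch. The helper names `rmse` and `worst_days` and the toy values are hypothetical and are not part of the archive or of the PJME data.

```python
import math
from collections import defaultdict
from datetime import datetime


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def worst_days(timestamps, actual, predicted, n=10):
    """Return the n calendar dates with the largest mean absolute error."""
    errors_by_date = defaultdict(list)
    for ts, a, p in zip(timestamps, actual, predicted):
        errors_by_date[ts.date()].append(abs(a - p))
    daily_mean = {d: sum(e) / len(e) for d, e in errors_by_date.items()}
    # Sort dates from largest to smallest mean absolute error
    return sorted(daily_mean.items(), key=lambda item: item[1], reverse=True)[:n]


# Hypothetical toy values for illustration only (two hours on each of two days)
toy_times = [
    datetime(2018, 4, 1, 0), datetime(2018, 4, 1, 1),
    datetime(2018, 4, 2, 0), datetime(2018, 4, 2, 1),
]
toy_actual = [100.0, 110.0, 120.0, 130.0]
toy_predicted = [103.0, 106.0, 120.0, 120.0]

toy_rmse = rmse(toy_actual, toy_predicted)
toy_worst = worst_days(toy_times, toy_actual, toy_predicted, n=1)
print(f"RMSE: {toy_rmse:0.2f}")
print(toy_worst)
```

In the toy data the second day has the larger mean absolute error because of a single 10-unit miss, which is the kind of day the script's final `groupby` ranking surfaces.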
