Datasets & Models

Dataset

The available datasets are listed in the data folder, with metadata provided in data/metadata.xlsx. Below are some of the datasets currently included in the library:

Ausgrid Solar Home Datasets

This dataset has been widely used in net load forecasting research since 2016. It includes data from 300 solar-equipped households within the Ausgrid network in Sydney. The original source is here.

Dataset Name | Description
ds1_ahsd.csv | Ausgrid Solar Home Dataset: aggregated data from 300 households in the Ausgrid network
ds4_ashd_with_weather.csv | ds1 enhanced with temperature, relative humidity, and wind speed data
ds13_ashd_with_cloud_solcast.csv | ds4 further enriched with cloud data from Solcast

Australia Energy Data Platform (AEDP) Datasets

These datasets were compiled by UNSW Sydney using data from Solar Analytics and Wattwatchers. Sensitive information such as customer addresses, names, and NMIs has been removed. The original source is here.

Dataset Name | Description
ds10_aedp_cluster2_30min.csv | AEDP dataset for Cluster 2 at 30-minute resolution
ds11_aedp_cluster2_30min_with_weather.csv | ds10 enhanced with weather data including temperature, humidity, etc.

Ausgrid Zone Substation (ZS) Datasets – Mascot

Unlike the household-focused datasets above, this dataset covers a zone substation, which serves a mix of residential, commercial & industrial (C&I), and major customers. The original source is here.

Dataset Name | Description
ds14_ausgrid_zs_mascot.csv | Zone Substation data for Mascot
ds15_ausgrid_zs_mascot_30min_with_weather.csv | ds14 enhanced with weather data at 30-minute resolution

Model

Forecasting Models Overview

Model ID | Model Name | Short Description
m1_naive | Naive | Forecast equals the last observed value
m2_snaive | Seasonal Naive | Forecast equals the value from the same season in the previous cycle
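
The two baseline rules above can be sketched in a few lines of plain Python. This is a minimal illustration only, not the library's own implementation: the function names, the `season_length` parameter, and the toy values are assumptions made for the example.

```python
from typing import List, Sequence


def naive_forecast(history: Sequence[float], horizon: int) -> List[float]:
    """Repeat the last observed value for every step of the horizon."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


def seasonal_naive_forecast(
    history: Sequence[float], horizon: int, season_length: int
) -> List[float]:
    """Repeat the most recent full season across the horizon."""
    if len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_season = list(history[-season_length:])
    # Step h (0-based) reuses the value observed at the same position
    # in the most recent season.
    return [last_season[h % season_length] for h in range(horizon)]


# Toy series with a season of length 4 (hypothetical values).
load = [1.0, 2.0, 3.0, 4.0, 1.5, 2.5, 3.5, 4.5]
print(naive_forecast(load, 3))              # [4.5, 4.5, 4.5]
print(seasonal_naive_forecast(load, 6, 4))  # [1.5, 2.5, 3.5, 4.5, 1.5, 2.5]
```

For 30-minute data with a daily cycle, `season_length` would be 48.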

Statistical Models

Model ID | Model Name | Short Description
m3_ets | ETS | Exponential smoothing model with error, trend, and seasonality components
m4_arima | ARIMA | Autoregressive Integrated Moving Average model for time series forecasting
m5_sarima | SARIMA | Seasonal ARIMA model with seasonal components
m6_lr | Linear Regression | Predicts future values using a linear combination of input features
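
To give a feel for the exponential smoothing family, the sketch below implements simple exponential smoothing, the most basic member of that family (a level only, with no trend or seasonality components). It is an illustration under stated assumptions, not how `m3_ets` is implemented in the library; the function name, the `alpha` parameter, and the sample values are hypothetical.

```python
from typing import List, Sequence


def ses_forecast(history: Sequence[float], horizon: int, alpha: float) -> List[float]:
    """Simple exponential smoothing: a flat forecast at the final smoothed level."""
    if not history:
        raise ValueError("history must contain at least one observation")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    # Initialise the level with the first observation (one common choice).
    level = history[0]
    for y in history[1:]:
        # New level = weighted average of the new observation and the old level.
        level = alpha * y + (1.0 - alpha) * level
    return [level] * horizon


series = [10.0, 12.0, 11.0, 13.0]  # hypothetical values
print(ses_forecast(series, 2, alpha=0.5))  # [12.0, 12.0]
```

A larger `alpha` weights recent observations more heavily; `alpha=1.0` reduces to the naive forecast.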

Machine Learning Models

Model ID | Model Name | Short Description
m7_ann | ANN | Basic Artificial Neural Network with one hidden layer
m8_dnn | Deep Neural Network | ANN with more than one hidden layer
m9_rt | Regression Tree | Decision tree model for regression tasks
m10_rf | Random Forest | Ensemble of regression trees for improved accuracy and robustness
m11_svr | Support Vector Regression | Uses support vectors to perform regression with a margin of tolerance
m12_rnn | Recurrent Neural Network | Neural network with feedback loops for sequential data
m13_lstm | Long Short-Term Memory | RNN variant designed to capture long-term dependencies
m14_gru | Gated Recurrent Unit | Simplified LSTM with fewer parameters
m15_transformer | Transformer | Attention-based model for sequence modeling without recurrence
m16_prophet | Prophet | Time series model developed by Facebook for business forecasting
m17_xgb | XGBoost | Gradient boosting framework optimized for speed and performance
m18_nbeats | N-BEATS | Deep learning model for univariate time series forecasting

Hyperparameter

The available models and their hyperparameters are listed in config/model_hyperparameters.ipynb. The values provided there are those most commonly used in the academic literature; they are not necessarily optimal for a given dataset.
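
Since the provided values are not guaranteed to be optimal, one common way to choose a value is to compare candidates on a hold-out portion of the series. The sketch below shows that idea in plain Python. Everything in it is an assumption made for illustration: `select_by_holdout`, the moving-average stand-in forecaster, and the sample values are hypothetical and are not part of the library or its notebook.

```python
from typing import Callable, List, Sequence, Tuple

Forecaster = Callable[[List[float], int, int], List[float]]


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def select_by_holdout(
    series: Sequence[float],
    candidates: Sequence[int],
    forecaster: Forecaster,
    holdout: int,
) -> Tuple[int, float]:
    """Return the candidate value with the lowest MAE on the last `holdout` points."""
    train, valid = list(series[:-holdout]), list(series[-holdout:])
    scored = [
        (mean_absolute_error(valid, forecaster(train, holdout, c)), c)
        for c in candidates
    ]
    best_error, best_value = min(scored)
    return best_value, best_error


def moving_average_forecast(history: List[float], horizon: int, window: int) -> List[float]:
    """Stand-in forecaster (hypothetical): mean of the last `window` observations."""
    recent = history[-window:]
    return [sum(recent) / len(recent)] * horizon


series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 19.0, 19.0]  # hypothetical values
best_window, best_mae = select_by_holdout(
    series, [1, 2, 6], moving_average_forecast, holdout=2
)
print(best_window, best_mae)  # 2 0.0
```

The same pattern applies to any model in the tables above by swapping in a forecaster that accepts the hyperparameter being tuned.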