How to add a model
Model Structure
All models are stored in the notebooks/model/ folder. Each model has its own .ipynb file, so its implementation is easy to find and read.
Each model typically defines two functions: one for training and one for generating forecasts. Shared tasks such as train-test splitting, 10-fold cross-validation, evaluation, and plotting are handled in notebooks/config/general_functions.ipynb, so each model file contains only model-specific code.
Training Function
The training function fits the model on the training set predictors and target values. It returns a single object, model, which holds everything needed to produce forecasts later (in the ANN example below, a dictionary that wraps the trained network).
Testing Function
The testing function takes the trained model together with the training and testing predictors. It returns two DataFrames: one with the forecasts for the training set and one with the forecasts for the testing set.
Evaluation and plotting are managed by notebooks/config/general_functions.ipynb.
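As a minimal sketch of this two-function contract, consider a hypothetical baseline model named m0_mean, which is not part of the repository and is used here only for illustration. The training function returns a single model object, and the forecast function returns two DataFrames indexed like their inputs. The final RMSE line merely stands in for the evaluation step; the metrics actually computed in general_functions.ipynb are not shown in this guide.

```python
import numpy as np
import pandas as pd


def train_model_m0_mean(hyperparameter, train_df_X, train_df_y):
    """Hypothetical baseline: remember the mean of the training target."""
    # No hyperparameters are needed; the argument is kept for a uniform signature.
    model = {"y_mean": float(train_df_y.values.mean())}
    return model


def produce_forecast_m0_mean(model, train_df_X, test_df_X):
    """Return forecasts for the train and test sets as two DataFrames."""
    y_mean = model["y_mean"]
    # Each forecast keeps the index of the predictors it was made from.
    train_df_y_hat = pd.DataFrame({"y_hat": y_mean}, index=train_df_X.index)
    test_df_y_hat = pd.DataFrame({"y_hat": y_mean}, index=test_df_X.index)
    return train_df_y_hat, test_df_y_hat


# Toy data (made up for this sketch): six daily observations, split 4 / 2.
idx = pd.date_range("2024-01-01", periods=6, freq="D")
X = pd.DataFrame({"lag_1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}, index=idx)
y = pd.DataFrame({"y": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]}, index=idx)
train_df_X, test_df_X = X.iloc[:4], X.iloc[4:]
train_df_y, test_df_y = y.iloc[:4], y.iloc[4:]

model = train_model_m0_mean({}, train_df_X, train_df_y)
train_df_y_hat, test_df_y_hat = produce_forecast_m0_mean(model, train_df_X, test_df_X)

# Stand-in for the shared evaluation step: RMSE on the test set.
test_rmse = float(np.sqrt(((test_df_y["y"] - test_df_y_hat["y_hat"]) ** 2).mean()))
```

Because both functions only exchange DataFrames and one model object, the shared code never needs to know how a particular model works internally.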
How to Add a Model
To add a new model, follow these three steps:
1. Create a new model file in the notebooks/model/ folder
For example, to add a model named new_model, create a file called m19_new_model.ipynb. In this file, define the training and testing functions and name them train_model_m19_new_model and produce_forecast_m19_new_model.
You can refer to existing models for examples, such as the ANN model:
    import os
    import random

    import numpy as np
    import pandas as pd
    import torch
    import torch.nn as nn
    import torch.optim as optim


    def train_model_m7_ann(hyperparameter, train_df_X, train_df_y):
        '''Train an artificial neural network (ANN) for point forecasting.

        Args:
            hyperparameter (df) : hyperparameter values of the model: seed, hidden_size,
                activation_function, learning_rate, solver, and epochs
            train_df_X (df) : features matrix for training
            train_df_y (df) : target matrix for training

        Returns:
            model (dictionary) : trained model, stored under the key "model_ann"
        '''
        # UNPACK HYPERPARAMETER
        # Set random seed for reproducibility
        def set_seed(seed):
            random.seed(seed)
            os.environ["PYTHONHASHSEED"] = str(seed)
            np.random.seed(seed)
            torch.manual_seed(seed)
            torch.cuda.manual_seed(seed)
            torch.backends.cudnn.deterministic = True

        seed = int(hyperparameter['seed'])
        # Cast to int in case the values are stored as floats
        hidden_size = int(hyperparameter['hidden_size'])
        activation_function = hyperparameter['activation_function']
        learning_rate = hyperparameter['learning_rate']
        solver = hyperparameter['solver']
        epochs = int(hyperparameter['epochs'])

        # Use proper format for X and y
        X = torch.tensor(train_df_X.values, dtype=torch.float32)
        y = torch.tensor(train_df_y.values, dtype=torch.float32).view(-1, 1)

        # Define the ANN model: one hidden layer followed by a linear output layer
        class ANNModel(nn.Module):
            def __init__(self, input_size, hidden_size, output_size):
                super(ANNModel, self).__init__()
                self.fc1 = nn.Linear(input_size, hidden_size)
                self.fc2 = nn.Linear(hidden_size, output_size)
                self.relu = nn.ReLU()  # Activation function

            def forward(self, x):
                x = self.fc1(x)
                if activation_function == 'relu':
                    x = self.relu(x)
                elif activation_function == 'sigmoid':
                    x = torch.sigmoid(x)
                else:
                    # Any other value falls back to tanh
                    x = torch.tanh(x)
                x = self.fc2(x)
                return x

        # Model initialization
        input_size = X.shape[1]
        output_size = y.shape[1]
        set_seed(seed)
        model_ann = ANNModel(input_size, hidden_size, output_size)
        if solver == 'adam':
            optimizer = optim.Adam(model_ann.parameters(), lr=learning_rate)
        elif solver == 'sgd':
            optimizer = optim.SGD(model_ann.parameters(), lr=learning_rate)
        else:
            raise ValueError('Solver not found')

        # Loss function
        criterion = nn.MSELoss()  # Mean Squared Error loss for regression

        # TRAIN MODEL
        # Training loop (full-batch: every epoch uses the whole training set)
        for epoch in range(epochs):
            model_ann.train()

            # Forward pass
            output = model_ann(X)
            loss = criterion(output, y)

            # Backward pass
            optimizer.zero_grad()
            loss.backward()

            # Update weights
            optimizer.step()

            if epoch % 10 == 0:
                print(f'Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}')

        # PACK MODEL
        model = {"model_ann": model_ann}
        return model


    def produce_forecast_m7_ann(model, train_df_X, test_df_X):
        """Create forecast at the train and test set using the trained model

        Args:
            model (dictionary): all parameters of the trained model
            train_df_X (df): predictors of train set
            test_df_X (df): predictors of test set

        Returns:
            train_df_y_hat (df) : forecast result at train set
            test_df_y_hat (df) : forecast result at test set
        """
        # UNPACK MODEL
        model_ann = model["model_ann"]

        # PREPARE FORMAT
        train_df_X_tensor = torch.tensor(train_df_X.values, dtype=torch.float32)
        test_df_X_tensor = torch.tensor(test_df_X.values, dtype=torch.float32)

        # PRODUCE FORECAST
        # Switch model to evaluation mode for inference
        model_ann.eval()

        # TRAIN SET FORECAST
        with torch.no_grad():  # Disable gradient calculation to save memory
            train_df_y_hat_tensor = model_ann(train_df_X_tensor)

        # TEST SET FORECAST
        with torch.no_grad():  # Disable gradient calculation to save memory
            test_df_y_hat_tensor = model_ann(test_df_X_tensor)

        # Create DataFrames of result (convert tensors to NumPy arrays first)
        train_df_y_hat = pd.DataFrame(train_df_y_hat_tensor.numpy(), index=train_df_X.index, columns=['y_hat'])
        test_df_y_hat = pd.DataFrame(test_df_y_hat_tensor.numpy(), index=test_df_X.index, columns=['y_hat'])
        return train_df_y_hat, test_df_y_hat
You can add any model you’ve developed or proposed, as long as it can be trained using the training set features and target values. The remaining two steps register the new functions in the shared dispatch code.
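To illustrate the same unpack, train, and pack pattern outside PyTorch, here is a minimal sketch of what m19_new_model.ipynb could contain. Ridge regression is used purely as a stand-in for a new model, and the hyperparameter name alpha and the dictionary key weights are assumptions made for this example, not names defined by the repository.

```python
import numpy as np
import pandas as pd


def train_model_m19_new_model(hyperparameter, train_df_X, train_df_y):
    """Train a ridge regression (hypothetical stand-in for a new model)."""
    # UNPACK HYPERPARAMETER ('alpha' is an assumed name for this sketch)
    alpha = float(hyperparameter["alpha"])

    # Use proper format for X and y, adding a column of ones for the intercept
    X = np.column_stack([np.ones(len(train_df_X)), train_df_X.values])
    y = train_df_y.values.reshape(-1, 1)

    # TRAIN MODEL: solve (X'X + alpha * I) w = X'y, intercept left unpenalized
    penalty = alpha * np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    weights = np.linalg.solve(X.T @ X + penalty, X.T @ y)

    # PACK MODEL
    model = {"weights": weights}
    return model


def produce_forecast_m19_new_model(model, train_df_X, test_df_X):
    """Create forecasts for the train and test sets using the trained model."""
    # UNPACK MODEL
    weights = model["weights"]

    def predict(df_X):
        # Same design matrix layout as in training: intercept column first
        X = np.column_stack([np.ones(len(df_X)), df_X.values])
        return pd.DataFrame(X @ weights, index=df_X.index, columns=["y_hat"])

    return predict(train_df_X), predict(test_df_X)


# Usage on toy data (made up for this sketch) where y = 2x + 1
hyperparameter = pd.Series({"alpha": 0.0})
train_df_X = pd.DataFrame({"x": [0.0, 1.0, 2.0, 3.0]})
train_df_y = pd.DataFrame({"y": [1.0, 3.0, 5.0, 7.0]})
test_df_X = pd.DataFrame({"x": [4.0, 5.0]}, index=[4, 5])

model = train_model_m19_new_model(hyperparameter, train_df_X, train_df_y)
train_df_y_hat, test_df_y_hat = produce_forecast_m19_new_model(model, train_df_X, test_df_X)
```

Packing the fitted parameters into a dictionary mirrors the ANN example and leaves room to store extra items, such as scalers, without changing the function signatures.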
2. Update the train_model function in notebooks/config/general_functions.ipynb
This file contains utility functions, including train_model, which dispatches training based on the selected model.
Add a new condition like:
    elif model_name == 'm19_new_model':
        model = train_model_m19_new_model(hyperparameter, train_df_X, train_df_y)
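For context, the dispatch inside train_model might look roughly like the sketch below. The signature of train_model, the placeholder model m1_example, the stub training functions, and the error raised for unknown names are all assumptions made for illustration; the actual function in general_functions.ipynb may differ.

```python
def train_model_m1_example(hyperparameter, train_df_X, train_df_y):
    # Placeholder standing in for an existing model's training function
    return {"name": "m1_example"}


def train_model_m19_new_model(hyperparameter, train_df_X, train_df_y):
    # Placeholder standing in for the new model's training function
    return {"name": "m19_new_model"}


def train_model(model_name, hyperparameter, train_df_X, train_df_y):
    """Call the training function that matches model_name (assumed signature)."""
    if model_name == 'm1_example':
        model = train_model_m1_example(hyperparameter, train_df_X, train_df_y)
    elif model_name == 'm19_new_model':
        # New branch added for the new model
        model = train_model_m19_new_model(hyperparameter, train_df_X, train_df_y)
    else:
        raise ValueError(f"Unknown model name: {model_name}")
    return model


model = train_model('m19_new_model', {}, None, None)
```

The string compared against model_name has to match the name used when selecting the model, so a typo in either place means the new branch is never reached.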
3. Update the produce_forecast function in notebooks/config/general_functions.ipynb
Similarly, update the produce_forecast function by adding:
    elif model_name == 'm19_new_model':
        train_df_y_hat, test_df_y_hat = produce_forecast_m19_new_model(model, train_df_X, test_df_X)
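The forecast dispatch can be sketched in the same way. As before, the signature of produce_forecast, the placeholder model m1_example, the stub forecast functions, and the "level" key are assumptions for illustration only and are not taken from general_functions.ipynb.

```python
import pandas as pd


def _constant_forecast(df_X, value):
    # Helper for the stubs below: one constant forecast per row of df_X
    return pd.DataFrame({"y_hat": value}, index=df_X.index)


def produce_forecast_m1_example(model, train_df_X, test_df_X):
    # Placeholder standing in for an existing model's forecast function
    return _constant_forecast(train_df_X, 0.0), _constant_forecast(test_df_X, 0.0)


def produce_forecast_m19_new_model(model, train_df_X, test_df_X):
    # Placeholder standing in for the new model's forecast function
    level = model["level"]
    return _constant_forecast(train_df_X, level), _constant_forecast(test_df_X, level)


def produce_forecast(model_name, model, train_df_X, test_df_X):
    """Call the forecast function that matches model_name (assumed signature)."""
    if model_name == 'm1_example':
        train_df_y_hat, test_df_y_hat = produce_forecast_m1_example(model, train_df_X, test_df_X)
    elif model_name == 'm19_new_model':
        # New branch added for the new model
        train_df_y_hat, test_df_y_hat = produce_forecast_m19_new_model(model, train_df_X, test_df_X)
    else:
        raise ValueError(f"Unknown model name: {model_name}")
    return train_df_y_hat, test_df_y_hat


# Toy predictors (made up for this sketch)
train_df_X = pd.DataFrame({"x": [1.0, 2.0, 3.0]})
test_df_X = pd.DataFrame({"x": [4.0, 5.0]}, index=[3, 4])
train_df_y_hat, test_df_y_hat = produce_forecast(
    'm19_new_model', {"level": 3.0}, train_df_X, test_df_X
)
```

Using the same model name in both dispatch functions keeps training and forecasting paired, so the object returned by one branch is always consumed by its matching forecast function.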