In a forward chaining setting, we retrain/update the model every horizon time units (days, weeks, months, etc.)
and test it on the next horizon time units, i.e. the data that will be added at the next training step.
We call these training/testing steps iterations. We start at iteration 0, where only the first timepoint is included.
At iteration 1, timepoints 1 to 1 + horizon are included.
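As a minimal sketch of this mapping, the iteration of each timepoint could be computed as shown below. The helper fc_iteration_sketch is hypothetical and is not the package's implementation of get_fc_iteration; it assumes integer timepoints starting at 1.

```r
# Hypothetical helper illustrating the description above; not the package code.
# Assumes timepoints are integers starting at 1 and horizon is a positive integer.
fc_iteration_sketch <- function(t, horizon) {
  stopifnot(all(t >= 1), horizon >= 1)
  # t = 1 falls in iteration 0; each later block of `horizon`
  # timepoints belongs to the next iteration.
  (t - 2) %/% horizon + 1
}

fc_iteration_sketch(1:7, horizon = 2)
# 0 1 1 2 2 3 3
```

With horizon = 2, iteration 1 adds timepoints 2 and 3, so its training data cover timepoints 1 to 1 + horizon.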
Usage
get_fc_iteration(t, horizon)
split_fc_dataset(df, it)
detail_fc_training(df, horizon)
get_fc_training_iteration(it_test)
Value
get_fc_iteration returns a vector corresponding to iteration numbers
split_fc_dataset returns a named list of dataframes (Training and Testing)
get_fc_training_iteration returns a vector of unique training iteration numbers
detail_fc_training returns a dataframe with columns: Iteration, N, Proportion, LastTime
Details
get_fc_iteration associates a vector of timepoints t with the corresponding iterations.
split_fc_dataset splits the dataset df into a training set and a testing set, and computes the prediction horizon and the last available score in the test set.
get_fc_training_iteration identifies the training iterations for which the test set is not empty.
detail_fc_training derives training data characteristics for each iteration in df, including the number of training observations, the proportion of training observations out of the total number of observations, and the maximum timepoint in the training set.
Time=t is in Iteration=i means that:
Time=t is in the new training data of iteration i
Time=t is not in the training data of iterations < i
Time=t is in the testing data of iteration i-1
Time=t is in the training data of iterations >= i
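The rules above can be sketched as a train/test split for a given iteration. This is only an illustration of the membership rules: split_sketch is a hypothetical helper, not the package's split_fc_dataset, and it leaves out the prediction horizon and last available score that the real function computes.

```r
# Hypothetical helper; assumes `df` has a numeric `Iteration` column.
split_sketch <- function(df, it) {
  list(
    # Training data of iteration `it`: everything introduced up to `it`
    Training = df[df$Iteration <= it, , drop = FALSE],
    # Testing data of iteration `it`: what iteration `it + 1` will add
    Testing = df[df$Iteration == it + 1, , drop = FALSE]
  )
}

toy <- data.frame(Time = 1:7, Iteration = c(0, 1, 1, 2, 2, 3, 3))
toy_split <- split_sketch(toy, it = 1)
toy_split$Training$Time  # 1 2 3
toy_split$Testing$Time   # 4 5
```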
Examples
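h <- 2  # horizon
df <- get_index2(t_max = rpois(10, 10))
df$Score <- rnorm(nrow(df))  # simulated scores
# Assign each timepoint to its iteration
df$Iteration <- get_fc_iteration(df$Time, h)
# Training and testing sets at iteration 1
sp <- split_fc_dataset(df, 1)
# Training iterations for which the test set is not empty
train_it <- get_fc_training_iteration(df$Iteration)
# Training data characteristics for each iteration
fc_char <- detail_fc_training(df, h)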