Time Series Regression Experiments
In this notebook we show how to run a simple time series regression experiment using the tsml-eval
package. Time series regression is the task of predicting a continuous target value from a time series.
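As a rough illustration of the task (not part of tsml-eval), the sketch below builds a hypothetical toy problem in plain Python in which each case is a short series paired with a continuous target. It then scores a baseline that ignores the series and always predicts the mean of the training targets. The data and the function names `fit_mean_baseline` and `predict_mean_baseline` are made up for this example.

```python
from statistics import fmean

# Hypothetical toy data: each case is one univariate series with a float target.
X_train = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
y_train = [2.0, 4.0, 6.0]
X_test = [[4.0, 8.0, 12.0]]
y_test = [8.0]


def fit_mean_baseline(y):
    """Return the training-target mean, the only thing this baseline learns."""
    return fmean(y)


def predict_mean_baseline(mean_value, X):
    """Predict the same constant for every series, ignoring its values."""
    return [mean_value for _ in X]


mean_value = fit_mean_baseline(y_train)
y_pred = predict_mean_baseline(mean_value, X_test)
abs_errors = [abs(t - p) for t, p in zip(y_test, y_pred)]
print(y_pred, abs_errors)
```

A constant baseline of this kind is a convenient reference point: a regressor that actually uses the series values should be expected to do better.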
[ ]:
import numpy as np
import pandas as pd
from aeon.benchmarking import get_estimator_results
from aeon.datasets import load_regression
from aeon.regression import DummyRegressor
from aeon.visualisation import plot_critical_difference
from sklearn.metrics import mean_squared_error
from tsml.datasets import load_minimal_gas_prices
from tsml_eval.evaluation.storage import load_regressor_results
from tsml_eval.experiments import (
    experiments,
    get_regressor_by_name,
    run_regression_experiment,
)
[ ]:
X_train, y_train = load_minimal_gas_prices(split="train")
X_test, y_test = load_minimal_gas_prices(split="test")
[ ]:
# get_regressor_by_name can be used to find various regressors by string, but
# any aeon, tsml or sklearn regressor can be used in the experiment function
regressor = get_regressor_by_name("DummyRegressor")
# record memory usage every 0.1 seconds; this is set only to keep the notebook
# fast and does not need to be changed for normal usage
experiments.MEMRECORD_INTERVAL = 0.1
run_regression_experiment(
    X_train,
    y_train,
    X_test,
    y_test,
    regressor,
    "./generated_results/",
    dataset_name="GasPrices",
    resample_id=0,
)
A function is also available to load the dataset as well as run an experiment, see load_and_run_regression_experiment in tsml_eval.experiments.
Both experiment functions will output a results file in the {results_dir}/{regressor_name}/Predictions/{dataset_name}/
directory. These files can be loaded individually, or used as a collection in the evaluation
module. See the evaluation notebook for more details.
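The directory layout above can be sketched with pathlib. The helper name `expected_results_file` is hypothetical and not part of tsml-eval, and the `testResample{resample_id}.csv` file name is an assumption inferred from the path loaded later in this notebook.

```python
from pathlib import Path


def expected_results_file(results_dir, regressor_name, dataset_name, resample_id=0):
    """Build the path where a test results file is expected (hypothetical helper)."""
    return (
        Path(results_dir)
        / regressor_name
        / "Predictions"
        / dataset_name
        / f"testResample{resample_id}.csv"  # assumed file name pattern
    )


path = expected_results_file("./generated_results/", "DummyRegressor", "GasPrices")
print(path.as_posix())
print(path.exists())  # True only once the experiment has written its results
```

Checking `path.exists()` before loading is a cheap way to tell a missing experiment run apart from a malformed results file.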
[ ]:
rr = load_regressor_results(
    "./generated_results/DummyRegressor/Predictions/GasPrices/testResample0.csv"
)
print(rr.predictions)
print(rr.mean_squared_error)
print(rr.root_mean_squared_error)
print(rr.mean_absolute_percentage_error)
print(rr.r2_score)
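For reference, the statistics printed above follow standard definitions, sketched here in plain Python on made-up numbers. This illustrates the formulas only and is not tsml-eval's implementation; the `y_true` and `y_pred` values are hypothetical.

```python
import math

# Hypothetical targets and predictions, for illustration only.
y_true = [2.0, 4.0, 6.0, 8.0]
y_pred = [3.0, 4.0, 5.0, 8.0]
n = len(y_true)

# Mean squared error and its square root.
mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
rmse = math.sqrt(mse)

# Mean absolute percentage error as a fraction (undefined for zero targets).
mape = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n

# R^2: one minus the residual sum of squares over the total sum of squares.
mean_true = sum(y_true) / n
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
ss_tot = sum((t - mean_true) ** 2 for t in y_true)
r2 = 1.0 - ss_res / ss_tot

print(mse, rmse, mape, r2)
```

Lower is better for the three error measures, while a higher R^2 is better; a model that always predicts the mean of the evaluated targets scores an R^2 of zero.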
A common use case is comparing a new algorithm against published benchmark results. The tsml group stores its publication results, and aeon provides a
function to load them. An example of this is shown below for regression.
[ ]:
reg = DummyRegressor()
datasets = [
    "CardanoSentiment",
    "Covid3Month",
    "FloodModeling1",
    "FloodModeling2",
    "NaturalGasPricesSentiment",
]
# find RMSE for each of our datasets on our estimator
results = {}
for d in datasets:
    train_X, train_y = load_regression(d, split="train")
    test_X, test_y = load_regression(d, split="test")
    reg = reg.fit(train_X, train_y)
    y_pred = reg.predict(test_X)
    # RMSE as the square root of MSE; the squared=False argument was
    # deprecated in scikit-learn 1.4
    results[d] = np.sqrt(mean_squared_error(test_y, y_pred))
results
[ ]:
benchmarks = ["InceptionE", "FreshPRINCE", "DrCIF"]
res = get_estimator_results(
    datasets=datasets, estimators=benchmarks, task="regression", measure="rmse"
)
res
[ ]:
res["Dummy"] = results
table = pd.DataFrame(res)
table
[ ]:
plt, _ = plot_critical_difference(
    np.array(table), list(table.columns), lower_better=True
)
plt.show()
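A critical difference diagram places each estimator at its average rank across datasets. As a minimal sketch of that ranking step only (the statistical tests used to group estimators are omitted), the plain-Python code below ranks estimators per dataset, with lower scores ranked better and ties sharing the mean rank. The scores and the helper `rank_row` are hypothetical and are not taken from aeon.

```python
def rank_row(scores):
    """Rank scores ascending (1 = lowest), giving tied scores their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # extend the group while the next score ties with the current one
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # mean of 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Hypothetical RMSE scores: one row per dataset, one column per estimator.
estimators = ["A", "B", "C"]
scores = [
    [0.30, 0.20, 0.40],
    [0.50, 0.50, 0.10],
    [0.25, 0.15, 0.35],
]

per_dataset_ranks = [rank_row(row) for row in scores]
average_ranks = {
    name: sum(r[j] for r in per_dataset_ranks) / len(per_dataset_ranks)
    for j, name in enumerate(estimators)
}
print(average_ranks)
```

Averaging ranks rather than raw scores keeps a single dataset with unusually large errors from dominating the comparison.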