load_and_run_clustering_experiment

tsml_eval.experiments.load_and_run_clustering_experiment(problem_path, results_path, dataset, clusterer, row_normalise=False, n_clusters=None, clusterer_name=None, resample_id=0, build_test_file=False, write_attributes=False, att_max_shape=0, benchmark_time=True, overwrite=False, predefined_resample=False, combine_train_test_split=False)

Load a dataset and run a clustering experiment.

Loads a dataset, runs a basic clustering experiment for a <dataset>/<clusterer>/<resample> combination, and writes the results to CSV file(s) at the given location.

Parameters:
problem_path: str

Location of problem files, full path.

results_path: str

Location to write the results to. Any required directories will be created.

dataset: str

Name of the problem. The files must be <problem_path>/<dataset>/<dataset>+"_TRAIN.ts" and the equivalent "_TEST.ts" file.

clusterer: BaseClusterer

Clusterer to be used in the experiment.

row_normalise: bool, default=False

Whether to normalise the data rows (time series) prior to fitting and predicting.

n_clusters: int or None, default=None

Number of clusters to use if the clusterer has an n_clusters parameter. If None, the clusterer's default is used. If -1, the number of classes in the dataset is used.

clusterer_name: str or None, default=None

Name of clusterer used in writing results. If None, the name is taken from the clusterer.

resample_id: int, default=0

Seed for resampling. If set to 0, the default train/test split from file is used. Also used in output file name.

build_test_file: bool, default=False

Whether to generate test files. If True, the clusterer will assign clusters to the loaded test data.

benchmark_time: bool, default=True

Whether to benchmark the hardware used with a simple function and write the results. This will typically take ~2 seconds, but is hardware dependent.

overwrite: bool, default=False

If False, results are only built when no result file is already present. If True, any existing results are overwritten.

predefined_resample: bool, default=False

Read a predefined resample from file instead of performing a resample. If True, the file name must include the resample_id at the end of the dataset name, i.e. <problem_path>/<dataset>/<dataset>+<resample_id>+"_TRAIN.ts".

combine_train_test_split: bool, default=False

Whether the train/test split should be combined. If True then the train/test split is combined into a single train set. If False then the train/test split is used as normal.
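
The file-naming rules for dataset and predefined_resample, and the handling of n_clusters, can be illustrated with a small sketch. The two helpers below, expected_data_files and resolve_n_clusters, are hypothetical and not part of tsml_eval. They only restate the parameter descriptions above and make no claim about how the library implements them. The dataset name "GunPoint" and the "data" directory are example values.

```python
from pathlib import Path


def expected_data_files(problem_path, dataset, resample_id=0, predefined_resample=False):
    """Return the (train, test) .ts paths implied by the parameter descriptions.

    Hypothetical helper for illustration only, not a tsml_eval function.
    """
    # With a predefined resample, the resample id is appended to the dataset name.
    stem = f"{dataset}{resample_id}" if predefined_resample else dataset
    folder = Path(problem_path) / dataset
    return folder / f"{stem}_TRAIN.ts", folder / f"{stem}_TEST.ts"


def resolve_n_clusters(n_clusters, y):
    """Restate the documented n_clusters rules (hypothetical helper).

    None -> None (keep the clusterer's default),
    -1   -> number of distinct class labels in y,
    else -> the value as given.
    """
    if n_clusters is None:
        return None
    if n_clusters == -1:
        return len(set(y))
    return n_clusters


# Default split: <problem_path>/<dataset>/<dataset>_TRAIN.ts
train, test = expected_data_files("data", "GunPoint")
print(train.as_posix())  # data/GunPoint/GunPoint_TRAIN.ts

# Predefined resample 3: the id follows the dataset name.
train3, _ = expected_data_files("data", "GunPoint", resample_id=3, predefined_resample=True)
print(train3.as_posix())  # data/GunPoint/GunPoint3_TRAIN.ts

# n_clusters=-1 uses the number of classes found in the labels.
print(resolve_n_clusters(-1, [0, 1, 1, 0]))  # 2
```

Checking that the paths returned by a helper like this exist before calling the experiment function can catch naming mistakes early, particularly when predefined_resample is True.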