gloria.Gloria.load_data

Gloria.load_data(toml_path=None, **kwargs)

Load and configure the time-series input data for the fit method.

Reads a CSV file that must contain at least two columns: a timestamp column and a metric column, named according to self.timestamp_name and self.metric_name, respectively. The timestamp column is converted to a series of pd.Timestamp values, and the metric column is cast to the dtype kind given by dtype_kind.
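The conversion described above can be approximated with plain pandas. This is a minimal sketch, not Gloria's implementation: the function load_csv_sketch, the KIND_TO_DTYPE mapping, and the column names "ds" and "y" are hypothetical and chosen only for illustration.

```python
import io

import pandas as pd

# Hypothetical mapping from a NumPy dtype kind to a concrete dtype.
# The widths Gloria actually uses are not stated here; these are assumptions.
KIND_TO_DTYPE = {"u": "uint64", "i": "int64", "f": "float64", "b": "bool"}


def load_csv_sketch(source, timestamp_name, metric_name, dtype_kind="f"):
    """Read a CSV, parse the timestamp column, and cast the metric column."""
    data = pd.read_csv(source)

    # Both columns are required, mirroring the description above.
    missing = {timestamp_name, metric_name} - set(data.columns)
    if missing:
        raise KeyError(f"Missing required columns: {sorted(missing)}")

    # Convert timestamps to pd.Timestamp values and cast the metric.
    data[timestamp_name] = pd.to_datetime(data[timestamp_name])
    data[metric_name] = data[metric_name].astype(KIND_TO_DTYPE[dtype_kind])
    return data


# In-memory CSV so the example runs without a file on disk.
csv_text = "ds,y\n2024-01-01,3\n2024-01-02,5\n"
frame = load_csv_sketch(io.StringIO(csv_text), "ds", "y", dtype_kind="f")
```

Here the integer values in the y column end up as floats because dtype_kind is "f", and the ds column holds pd.Timestamp values.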

Parameters:
  • toml_path (Optional[Union[str, Path]], optional) – Path to a TOML file whose [load_data] section overrides the model defaults. Ignored when None.

  • source (Union[str, Path]) – Location of the CSV file to load the input data from. This key must be provided.

  • dtype_kind (str, optional) – Desired kind of the metric column, given as a NumPy dtype kind code ("u" unsigned int, "i" signed int, "f" float, "b" boolean). If omitted, the metric column is cast to float.

  • self (Self)

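  • kwargs (dict[str, Any]) – Keyword overrides for the keys described above (source, dtype_kind). These take the highest precedence; see Notes.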

Returns:

data – The preprocessed dataframe, ready for modelling.

Return type:

pandas.DataFrame

Notes

The configuration of the load_data method via source and dtype_kind is composed in four layers, each one overriding the previous:

  1. Model defaults - the baseline configuration with defaults given above.

  2. Global TOML file - key-value pairs in the [load_data] table of the TOML file passed to Gloria.from_toml() if the current Gloria instance was created this way.

  3. Local TOML file - key-value pairs in the [load_data] table of the TOML file provided for toml_path.

  4. Keyword overrides - additional arguments supplied directly to the method take highest precedence.
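The four-layer precedence can be pictured as successive dictionary updates. The following is a minimal sketch under stated assumptions, not Gloria's actual code: resolve_load_data_config and MODEL_DEFAULTS are hypothetical names, the two table arguments stand in for already-parsed [load_data] tables, and the file paths are invented.

```python
from typing import Any, Mapping, Optional

# Layer 1: model defaults. Only dtype_kind has a documented default (float);
# source has none and must come from a later layer.
MODEL_DEFAULTS = {"dtype_kind": "f"}


def resolve_load_data_config(
    global_table: Optional[Mapping[str, Any]] = None,
    local_table: Optional[Mapping[str, Any]] = None,
    **kwargs: Any,
) -> dict:
    """Merge the configuration layers, later layers overriding earlier ones."""
    config = dict(MODEL_DEFAULTS)
    # Layers 2 to 4: global TOML table, local TOML table, keyword overrides.
    for layer in (global_table, local_table, kwargs):
        if layer:
            config.update(layer)
    # Assumption: it is enough for any one layer to supply source.
    if "source" not in config:
        raise ValueError("'source' must be provided by at least one layer")
    return config


# Hypothetical [load_data] tables, as they might look after TOML parsing.
global_table = {"source": "data/global.csv", "dtype_kind": "i"}
local_table = {"source": "data/local.csv"}

resolved = resolve_load_data_config(global_table, local_table, dtype_kind="u")
```

In this example source resolves to "data/local.csv" because the local table overrides the global one, and dtype_kind resolves to "u" because the keyword argument overrides every TOML layer.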