Decomposition Types#

Gloria models are based on the well-known decomposition of time-series \(Y(t)\) as a function of time \(t\) into trend \(T(t)\), seasonality \(S(t)\), and a residual error term \(E(t)\). There are two major ways to perform the decomposition:

  • Additive decomposition: \(Y(t) = T(t) + S(t) + E(t)\)

  • Multiplicative decomposition: \(Y(t) = T(t) \times S(t) \times E(t)\)

Additive models assume independent components, i.e. they simply add up. Multiplicative models on the other hand can be used to introduce interactions, eg. when seasonalities grow with the trend.

In the last section on model selection, we saw how different distribution models support various data types and bounds. In this tutorial, we explore how models inherently support different decomposition types, making an explicit selection between additive and multiplicative modes unnecessary.

Note

Forcing a certain decomposition type can even violate the bounds of distribution models. Think of the gamma distribution with lower bound of zero. With vanishing trend \(T(t) = 0\) and a simple sinusoidal seasonality \(S(t) = \sin(t)\), an additive model would cause \(Y(t)\) to be negative half the time.

Comparison of Types#

To illustrate additive vs. multiplicative behaviour, we take a look at a well-known example showing multiplicative behaviour, namely the Air Passengers data set available at Kaggle. The following code fits and plots the data with a poisson model for count data.

Note

It is worth mentioning that the data set is sampled on a monthly basis. As the length of a month varies, Gloria cannot handle monthly data directly. Instead we resample the data using the average length of a month.

import pandas as pd
from gloria import Gloria, cast_series_to_kind

# Read the data
url = "https://raw.githubusercontent.com/e-dyn/gloria/main/scripts/data/real/AirPassengers.csv"
data = pd.read_csv(url)


# Rename columns for convenience
data.rename(
    {"Month": "date", "#Passengers": "passengers"}, axis=1, inplace=True
)

# Convert to datetime
data["date"] = pd.to_datetime(data["date"])

# Make a new monthly column with equidistant spacing
freq = pd.Timedelta(f"{365.25/12}d")
data["date"] = pd.Series(
    pd.date_range(
        start=data["date"].min(), periods=len(data["date"]), freq=freq
    )
)

# Cast to unsigned data type so it is digestible by count models
data["passengers"] = cast_series_to_kind(data["passengers"], "u")

# Set up the model
m = Gloria(
    model="poisson",              # <-- change to "normal" for comparison
    metric_name="passengers",
    timestamp_name="date",
    sampling_period=freq,
    n_changepoints=0,
)

# Add observed seasonalities
m.add_seasonality("yearly", "365.25 d", 4)

# Fit the model to the data
m.fit(data)

# Plot
m.plot(m.predict(), include_legend=True)

The result is shown in the plot below. We can see an exponentially growing trend, nicely tracing the data average. Also, the oscillations scale with the trend as one expects from a multiplicative model.

Fitting the air passengers data with a multiplicative decomposition model

The next image shows the result of the same code, but using the normal distribution. The fit is much worse, showing a strictly linear trend and constant seasonality amplitudes.

Fitting the air passengers data with an additive decomposition model