With the new pandas 3.0.0 release, Prophet raises the following error when model.fit() is called on a model with additional regressors.
Traceback (most recent call last):
File "/home/user/repos/repo/libs/proj/src/scripts/forecast.py", line 45, in <module>
m.fit(train)
~~~~~^^^^^^^
File "/home/user/repos/repo/.venv/lib/python3.13/site-packages/prophet/forecaster.py", line 1220, in fit
model_inputs = self.preprocess(df, **kwargs)
File "/home/user/repos/repo/.venv/lib/python3.13/site-packages/prophet/forecaster.py", line 1141, in preprocess
self.make_all_seasonality_features(self.history))
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
File "/home/user/repos/repo/.venv/lib/python3.13/site-packages/prophet/forecaster.py", line 849, in make_all_seasonality_features
component_cols, modes = self.regressor_column_matrix(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
seasonal_features, modes
^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/home/user/repos/repo/.venv/lib/python3.13/site-packages/prophet/forecaster.py", line 901, in regressor_column_matrix
component_cols = pd.crosstab(
~~~~~~~~~~~^
components['col'], components['component'],
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
).sort_index(level='col')
^
File "/home/user/repos/repo/.venv/lib/python3.13/site-packages/pandas/core/reshape/pivot.py", line 1099, in crosstab
df = DataFrame(data, index=common_idx)
File "/home/user/repos/repo/.venv/lib/python3.13/site-packages/pandas/core/frame.py", line 769, in __init__
mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy)
File "/home/user/repos/repo/.venv/lib/python3.13/site-packages/pandas/core/internals/construction.py", line 447, in dict_to_mgr
return arrays_to_mgr(arrays, columns, index, dtype=dtype, consolidate=copy)
File "/home/user/repos/repo/.venv/lib/python3.13/site-packages/pandas/core/internals/construction.py", line 117, in arrays_to_mgr
arrays, refs = _homogenize(arrays, index, dtype)
~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/repos/repo/.venv/lib/python3.13/site-packages/pandas/core/internals/construction.py", line 555, in _homogenize
val = val.reindex(index)
File "/home/user/repos/repo/.venv/lib/python3.13/site-packages/pandas/core/series.py", line 5525, in reindex
return super().reindex(
~~~~~~~~~~~~~~~^
index=index,
^^^^^^^^^^^^
...<5 lines>...
copy=copy,
^^^^^^^^^^
)
^
File "/home/user/repos/repo/.venv/lib/python3.13/site-packages/pandas/core/generic.py", line 5476, in reindex
return self._reindex_axes(
~~~~~~~~~~~~~~~~~~^
axes, level, limit, tolerance, method, fill_value
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
).__finalize__(self, method="reindex")
^
File "/home/user/repos/repo/.venv/lib/python3.13/site-packages/pandas/core/generic.py", line 5498, in _reindex_axes
new_index, indexer = ax.reindex(
~~~~~~~~~~^
labels, level=level, limit=limit, tolerance=tolerance, method=method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/home/user/repos/repo/.venv/lib/python3.13/site-packages/pandas/core/indexes/base.py", line 4253, in reindex
raise ValueError("cannot reindex on an axis with duplicate labels")
ValueError: cannot reindex on an axis with duplicate labels
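For context, the final ValueError in the traceback can be triggered in isolation, without Prophet. The following is a minimal sketch, not the code path Prophet actually takes: it only shows that Series.reindex refuses to align a Series whose index contains repeated labels to a different index. The helper name reindex_error and the sample values are made up for illustration, and whether this is exactly what happens inside pd.crosstab here is an assumption.

```python
import pandas as pd


def reindex_error(series, new_index):
    """Return the ValueError message raised by reindex, or None if it succeeds."""
    try:
        series.reindex(new_index)
    except ValueError as exc:
        return str(exc)
    return None


# A Series whose index repeats the label 1 (illustrative data only).
s = pd.Series([10, 20, 30], index=[0, 1, 1])

# Aligning to a different index fails because label 1 is ambiguous.
message = reindex_error(s, [0, 1, 2])
print(message)

# With a fresh, unique index the same reindex call goes through.
unique_message = reindex_error(s.reset_index(drop=True), [0, 1, 2])
print(unique_message)
```

If the frame passed to pd.crosstab inside regressor_column_matrix carries repeated index labels, that would be consistent with the error above, but this has not been confirmed here.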
Here is a minimal script to reproduce the error:
import pandas as pd
from prophet import Prophet

if __name__ == "__main__":
    prophet_model = Prophet(
        weekly_seasonality=False,
        daily_seasonality=False,
        changepoint_prior_scale=0.1,
        seasonality_prior_scale=0.1,
        holidays_prior_scale=0.1,
    )
    # Register the additional regressors that trigger the error.
    prophet_model.add_regressor("is_weekend")
    prophet_model.add_regressor(
        "holiday_PH",
    )
    for index in range(7):
        prophet_model.add_regressor(f"dow_{index}")

    h = pd.DataFrame(
        {
            "ds": pd.to_datetime(
                [
                    "2024-06-27",
                    "2024-06-28",
                    "2024-06-29",
                    "2024-06-30",
                    "2024-07-01",
                ]
            ),
            "y": [42, 45, 38, 35, 47],
            "is_weekend": [0, 0, 1, 1, 0],
            "holiday_PH": [0, 0, 0, 0, 0],
            "dow_0": [0, 0, 0, 0, 1],
            "dow_1": [0, 0, 0, 0, 0],
            "dow_2": [0, 0, 0, 0, 0],
            "dow_3": [1, 0, 0, 0, 0],
            "dow_4": [0, 1, 0, 0, 0],
            "dow_5": [0, 0, 1, 0, 0],
            "dow_6": [0, 0, 0, 1, 0],
        }
    )
    future_end_date = pd.to_datetime("2024-07-07")

    # Train on the first three days; forecast the remaining rows.
    train = h[
        (h.ds >= pd.to_datetime("2024-06-27"))
        & (h.ds < pd.to_datetime("2024-06-30"))
    ]
    future = h[
        (h.ds >= pd.to_datetime("2024-06-30")) & (h.ds <= future_end_date)
    ]

    m = prophet_model
    m.fit(train)  # raises the ValueError shown above on pandas 3.0.0
    forecasts = m.predict(pd.DataFrame(future))
The issue seems to arise only when additional regressors are added, and the same code works fine with pandas 2.3.3. Is there a workaround that preserves the additional regressors?
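Until a fix is available, one way to make the failure easier to diagnose is to check the installed pandas version before fitting. The sketch below is only a guard based on the two data points in this report (3.0.0 fails, 2.3.3 works). The function names are hypothetical, and treating every 3.x release as affected is an assumption.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string):
    """Extract the leading major number from a version string such as '3.0.0'."""
    return int(version_string.split(".")[0])


def pandas_may_break_regressors(version_string):
    """Assumption from this report: pandas 3.x fails, 2.x works."""
    return major_version(version_string) >= 3


try:
    installed = version("pandas")
except PackageNotFoundError:
    installed = None  # pandas is not installed in this environment

if installed is not None and pandas_may_break_regressors(installed):
    print(
        f"pandas {installed} detected: Prophet.fit() with additional "
        "regressors may raise 'cannot reindex on an axis with duplicate labels'."
    )
```

Running the script in an environment with pandas 2.3.3 prints nothing, which matches the behaviour described above.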