Statsmodels: NaN in the components returned by sm.tsa.seasonal_decompose

Created on 17 Aug 2017  ·  4 Comments  ·  Source: statsmodels/statsmodels

Hi there,
I ran into an issue when using the seasonal_decompose function from the statsmodels package in Python. I am using Python 2.7.13. The data looks like this:
date
2016-01-01    20.086905
2016-02-01    20.071920
2016-03-01    20.149253
2016-04-01    20.045424
2016-05-01    20.049403
2016-06-01    20.066260
2016-07-01    20.003315
2016-08-01    20.022434
2016-09-01    20.063003
2016-10-01    19.989281
2016-11-01    20.005214
2016-12-01    20.209121
2017-01-01    20.027342
2017-02-01    19.941969
2017-03-01    20.050094
2017-04-01    19.956648
2017-05-01    19.969304
2017-06-01    19.977306
2017-07-01    19.943466
Name: Values, dtype: float64

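The code I used is this:

import matplotlib.pyplot as plt
import statsmodels.api as sm

# data is the monthly pandas Series shown above
res = sm.tsa.seasonal_decompose(data)
resplot = res.plot()
plt.show()

# Pull the individual components out of the result object
trend = res.trend
seasonal = res.seasonal
residual = res.resid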

When I access the individual components of the result, they contain a lot of NaN values, as shown below.
I don't understand what happened. I found some blog posts that discuss the freq and filt parameters, but I don't really understand how to set them, and I'm not sure whether they are the cause.
Thank you very much.

Trend component:
date
2016-01-01 NaN
2016-02-01 NaN
2016-03-01 NaN
2016-04-01 NaN
2016-05-01 NaN
2016-06-01 NaN
2016-07-01 20.060979
2016-08-01 20.053083
2016-09-01 20.043537
2016-10-01 20.035706
2016-11-01 20.028669
2016-12-01 20.021626
2017-01-01 20.015425
2017-02-01 NaN
2017-03-01 NaN
2017-04-01 NaN
2017-05-01 NaN
2017-06-01 NaN
2017-07-01 NaN
Name: Values, dtype: float64

Seasonal component:
date
2016-01-01 NaN
2016-02-01 NaN
2016-03-01 NaN
2016-04-01 NaN
2016-05-01 NaN
2016-06-01 NaN
2016-07-01 NaN
2016-08-01 NaN
2016-09-01 NaN
2016-10-01 NaN
2016-11-01 NaN
2016-12-01 NaN
2017-01-01 NaN
2017-02-01 NaN
2017-03-01 NaN
2017-04-01 NaN
2017-05-01 NaN
2017-06-01 NaN
2017-07-01 NaN
Name: Values, dtype: float64

Residual component:
date
2016-01-01 NaN
2016-02-01 NaN
2016-03-01 NaN
2016-04-01 NaN
2016-05-01 NaN
2016-06-01 NaN
2016-07-01 NaN
2016-08-01 NaN
2016-09-01 NaN
2016-10-01 NaN
2016-11-01 NaN
2016-12-01 NaN
2017-01-01 NaN
2017-02-01 NaN
2017-03-01 NaN
2017-04-01 NaN
2017-05-01 NaN
2017-06-01 NaN
2017-07-01 NaN
Name: Values, dtype: float64

Original Data
date
2016-01-01 20.086905
2016-02-01 20.071920
2016-03-01 20.149253
2016-04-01 20.045424
2016-05-01 20.049403
2016-06-01 20.066260
2016-07-01 20.003315
2016-08-01 20.022434
2016-09-01 20.063003
2016-10-01 19.989281
2016-11-01 20.005214
2016-12-01 20.209121
2017-01-01 20.027342
2017-02-01 19.941969
2017-03-01 20.050094
2017-04-01 19.956648
2017-05-01 19.969304
2017-06-01 19.977306
2017-07-01 19.943466
Name: Values, dtype: float64

FAQ comp-tsa

All 4 comments

I think this is all a consequence of your time series being too short. You need at least two and maybe more full cycles to estimate a seasonal pattern.
AFAIR, if the data is monthly, then by default a 12 month annual seasonality is assumed. The freq keyword can be used to override the default seasonal cycle length.

Half a cycle at each end is lost and set to NaN because the filter currently has no special handling of endpoints. However, the all-NaN seasonal component most likely comes from the shortness of the time series.
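
To make the endpoint loss concrete, here is a minimal sketch in plain Python of a centered 2x12 moving average applied to the 19 values from the question. It assumes the default trend filter for an even period of 12 is this kind of centered average, with half weight on the two outermost points of a 13-point window. The helper name centered_ma is hypothetical, and this is not statsmodels' own code.

```python
import math

# The 19 monthly values from the question (2016-01 through 2017-07).
values = [
    20.086905, 20.071920, 20.149253, 20.045424, 20.049403, 20.066260,
    20.003315, 20.022434, 20.063003, 19.989281, 20.005214, 20.209121,
    20.027342, 19.941969, 20.050094, 19.956648, 19.969304, 19.977306,
    19.943466,
]


def centered_ma(x, period=12):
    """Centered moving average; positions without a full window stay NaN.

    Hypothetical helper for illustration. For an even period the window
    spans period + 1 points, with half weight on the two outermost ones.
    """
    half = period // 2
    out = [math.nan] * len(x)
    for t in range(half, len(x) - half):
        window = x[t - half:t + half + 1]
        if period % 2 == 0:
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        out[t] = total / period
    return out


trend = centered_ma(values, period=12)
n_missing = sum(math.isnan(v) for v in trend)
print(n_missing)           # number of NaN entries in the trend
print(round(trend[6], 6))  # trend value for 2016-07
```

With 19 points and a 13-point window, only 7 positions (2016-07 through 2017-01) have a full window, which lines up with the 7 non-NaN trend values in the question.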

I don't know and never checked what the minimum length for seasonal_decompose is, but unless there are several cycles, there is no way to distinguish seasonal fluctuations from other components including noise.
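
One way to see why 19 monthly points are not enough: after the centered filter drops half a cycle at each end, count how many detrended observations remain for each position in the 12-month cycle. The sketch below is plain Python and assumes the classical approach of averaging the detrended values per cycle position. The function name detrended_counts is hypothetical and is not taken from statsmodels.

```python
def detrended_counts(nobs, period=12):
    """Count usable detrended observations per position in the cycle.

    Hypothetical helper: assumes a centered filter that leaves the first
    and last period // 2 points without a trend value.
    """
    half = period // 2
    counts = [0] * period
    for t in range(half, nobs - half):
        counts[t % period] += 1
    return counts


short = detrended_counts(19)       # the series length from the question
two_cycles = detrended_counts(24)  # exactly two full years

print(short)
print(two_cycles)
```

With 19 points, five cycle positions (February through June) have no detrended value at all, so no seasonal average can be formed for them. Under these assumptions, two full cycles is the shortest length at which every position gets at least one detrended value.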

Hi, is there any way to avoid losing half a cycle at the right end?
I need this to check whether the rightmost point (the current measurement) is anomalous relative to the history.

Thanks.
Zaven.

@znavoyan This has been changed in the current master. Trend extrapolation now allows optionally filling in the NaNs at the front and end that were previously lost. See #4007 and #4197.

It still requires the minimum length to estimate the seasonal component as pointed out in this issue.

seasonal_decompose now requires two complete cycles, which is the same as R.
