I experienced a weird issue with tsfresh while working as usual within the Jupyter Lab/Notebook environment. Reproducing the example from the documentation, the call to
selected_features = tsfresh.extract_relevant_features(ts, y, column_id='id', column_sort='time')
seems to run to completion (100% progress), then outputs a large number of warnings and never returns. This means that no more notebook cells can be executed.
Feature Extraction: 100%|██████████| 38/38 [00:04<00:00, 7.91it/s]
WARNING:tsfresh.utilities.dataframe_functions:The columns ['F_x__agg_linear_trend__f_agg_"max"__chunk_len_50__attr_"intercept"'
'F_x__agg_linear_trend__f_agg_"max"__chunk_len_50__attr_"rvalue"'
'F_x__agg_linear_trend__f_agg_"max"__chunk_len_50__attr_"slope"' ...
'T_z__fft_coefficient__coeff_9__attr_"imag"'
'T_z__fft_coefficient__coeff_9__attr_"real"'
'T_z__spkt_welch_density__coeff_8'] did not have any finite values. Filling with zeros.
WARNING:tsfresh.feature_selection.relevance:Infered classification as machine learning task
WARNING:tsfresh.feature_selection.relevance:[test_feature_significance] Feature F_x__agg_linear_trend__f_agg_"max"__chunk_len_10__attr_"stderr" is constant
WARNING:tsfresh.feature_selection.relevance:[test_feature_significance] Feature F_x__agg_linear_trend__f_agg_"max"__chunk_len_50__attr_"intercept" is constant
WARNING:tsfresh.feature_selection.relevance:[test_feature_significance] Feature F_x__agg_linear_trend__f_agg_"max"__chunk_len_50__attr_"rvalue" is constant
WARNING:tsfresh.feature_selection.relevance:[test_feature_significance] Feature F_x__agg_linear_trend__f_agg_"max"__chunk_len_50__attr_"slope" is constant
WARNING:tsfresh.feature_selection.relevance:[test_feature_significance] Feature F_x__agg_linear_trend__f_agg_"max"__chunk_len_50__attr_"stderr" is constant
....
It seems to be some kind of interaction between tsfresh and Jupyter, because the issue does not appear when the same code is run via the IPython shell.
Could the large number of log messages be a problem? Is it possible to deactivate them? One indication that the output may be an issue for Jupyter is that the Jupyter Notebook process keeps printing: IOStream.flush timed out.
Hi @clstaudt,
we have a show_warnings parameter in the extract_relevant_features function, but that only suppresses the warnings during feature extraction, not during the subsequent selection.
It's an easy PR; I will take care of that later.
In the meantime you can catch the warnings with
import warnings
warnings.simplefilter("ignore")
Put that snippet at the top of your notebook, where your imports are.
Just out of curiosity, what kind of time series are you using tsfresh on?
Thanks. Will try that out and get back with the result. I tried suppressing the warnings before in a similar way, with no effect.
Just out of curiosity, what kind of time series are you using tsfresh on?
I am preparing a report on the applicability and readiness of tsfresh for a client. I am using freely available data sets for now, and I have just discovered the Kepler exoplanet time series data, which seems like an interesting example. I'm throwing tsfresh at it right now. Any recommendations for data on which tsfresh is likely to work well are appreciated.
import warnings
warnings.simplefilter("ignore")
^^ does not suppress the output for me
(In the meantime, I built a workaround that serializes the arguments with pickle, calls tsfresh in a script, and reads back the result. I'd rather work without this hack.)
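For anyone who needs the same stopgap, here is a minimal sketch of that kind of workaround. It is not the code used above; the helper name call_in_subprocess and the worker script are made up for illustration, and it assumes the arguments and the return value can be pickled.

```python
import pickle
import subprocess
import sys
import tempfile
from pathlib import Path

# Source of the worker, run in a fresh interpreter: load the pickled call,
# execute it, and pickle the return value to the output path.
_WORKER = """
import importlib, pickle, sys
in_path, out_path = sys.argv[1], sys.argv[2]
with open(in_path, "rb") as f:
    module_name, func_name, args, kwargs = pickle.load(f)
func = getattr(importlib.import_module(module_name), func_name)
with open(out_path, "wb") as f:
    pickle.dump(func(*args, **kwargs), f)
"""


def call_in_subprocess(module_name, func_name, *args, **kwargs):
    """Run module_name.func_name(*args, **kwargs) in a separate Python process.

    Hypothetical helper: arguments and result travel through pickle files.
    """
    with tempfile.TemporaryDirectory() as tmp:
        in_path = Path(tmp) / "call.pkl"
        out_path = Path(tmp) / "result.pkl"
        with open(in_path, "wb") as f:
            pickle.dump((module_name, func_name, args, kwargs), f)
        # Capture the child's stdout/stderr so its log output never
        # reaches the notebook's output stream.
        subprocess.run(
            [sys.executable, "-c", _WORKER, str(in_path), str(out_path)],
            check=True,
            capture_output=True,
        )
        with open(out_path, "rb") as f:
            return pickle.load(f)
```

In the notebook, the call would then look like call_in_subprocess("tsfresh", "extract_relevant_features", ts, y, column_id="id", column_sort="time"). The cost is a full interpreter start-up and a round trip through pickle on every call.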
Strangely, the issue appears with the Robot execution failures dataset but not with the Kepler dataset. This makes it look more like a bug in tsfresh, not a problem with Jupyter.
^^ does not suppress the output for me
I forgot the context manager; it's actually
import warnings
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    selected_features = tsfresh.extract_relevant_features(ts, y, column_id='id', column_sort='time')
I forgot the context manager
Makes no difference. This is unexpected.
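One possible explanation, judging only from the WARNING:tsfresh... prefix of the messages in the first post: that prefix matches the default format of Python's logging module, and records emitted through logging are not affected by filters from the warnings module. The sketch below illustrates this with the standard library alone; the logger name demo.relevance is made up and nothing here touches tsfresh itself.

```python
import io
import logging
import warnings

# Stand-in logger; the name "demo.relevance" is invented for this example.
logger = logging.getLogger("demo.relevance")
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
logger.addHandler(handler)
logger.propagate = False  # keep the record out of the root logger's handlers

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("ignore")
    warnings.warn("raised through the warnings module")  # dropped by the filter
    logger.warning("Feature x is constant")              # still emitted

captured = stream.getvalue()
print(captured, end="")
```

The warnings.warn call is silenced, while the logger.warning record still reaches the handler. If the messages in question come from logging, only a change to the logger's level or handlers would hide them.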
I had the same issue with another multivariate time series data set, namely https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/#turbofan
Can confirm as well, both with the robot example and with a custom dataset. Using the provided context manager does not suppress warnings in jupyterlab, ipython, or the default interpreter. Executing a cell in jupyterlab with either the select_features or extract_relevant_features function results in a hang: the function completes but does not return control of the kernel. HTH.
Python 3.5.2
ipython 6.4.0
jupyter-notebook 5.5.0
jupyter-lab 0.34.8
Got the same problem with the select_features function. Could anybody help? :(
Got the same problem with the select_features function. Could anybody help? :(
@zhouwubai: I have not tried it in Jupyter, but in your main script which imports tsfresh (e.g. main.py) you can do
import tsfresh
import logging
logging.getLogger('tsfresh').setLevel(logging.ERROR)
to ignore ALL warnings coming from the module (might not be what you want).
Other than that, I took a brief look at the code and I don't think there's a variable to disable warnings in the feature selection part.
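If silencing the logger for the whole session is too broad, the same idea can be scoped to a single call. The following is a minimal sketch using only the standard library; logger_level is a hypothetical helper, not part of tsfresh.

```python
import logging
from contextlib import contextmanager


@contextmanager
def logger_level(name, level):
    """Temporarily set the level of the logger called `name`, then restore it."""
    logger = logging.getLogger(name)
    previous = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        # Restore the old level even if the wrapped call raises.
        logger.setLevel(previous)
```

Usage would be to wrap the call in "with logger_level('tsfresh', logging.ERROR):" so that only that call runs with warnings hidden. Child loggers such as tsfresh.feature_selection.relevance pick up the level from the tsfresh logger as long as they have no level of their own.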
@khdlim Thank you. It works perfectly now.
The warning filter is currently broken because of the way we use the multiprocessing package. This will hopefully be fixed soon.
However, glad you found a solution!