OS: Ubuntu 18.04.1 LTS
I keep getting the following error:
File "<ipython-input-122-1f1001a26d12>", line 1, in <module>
tsfresh.extract_features(df, column_id = "ID")
File "/home/matej/anaconda3/lib/python3.6/site-packages/tsfresh/feature_extraction/extraction.py", line 152, in extract_features
distributor=distributor)
File "/home/matej/anaconda3/lib/python3.6/site-packages/tsfresh/feature_extraction/extraction.py", line 233, in _do_extraction
function_kwargs=kwargs)
File "/home/matej/anaconda3/lib/python3.6/site-packages/tsfresh/utilities/distribution.py", line 149, in map_reduce
result = list(itertools.chain.from_iterable(result))
File "/home/matej/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 1002, in __iter__
for obj in iterable:
File "/home/matej/anaconda3/lib/python3.6/multiprocessing/pool.py", line 735, in next
raise value
TypeError: Cannot cast array data from dtype('float64') to dtype('<U32') according to the rule 'safe'
when I run:
import tsfresh
import pandas as pd
import numpy as np
a = np.arange(10)
df = pd.DataFrame(a,columns = ["value"])
ids = [1,1,1,1,1,5,5,5,5,5]
df['ID'] = ids
tsfresh.extract_features(df, column_id = "ID")
The same error occurs when I try to run it with real data.
Hi @matej14086
I tried to reproduce the issue with the versions you reported, but the code snippet above runs fine for me. I tested with Python 3.6.4 and Python 3.5, with tsfresh==0.11.0 and tsfresh==0.11.1, along with numpy 1.14.1 through numpy 1.16.0.
The error suggests that numpy is complaining that the data is a string rather than numeric. Could you try running this in a normal Python terminal (not IPython) and check that you get the same result? Perhaps there is something unseen in a copy and paste. If you still get errors, could you also report which version of pandas you are using?
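For anyone else reporting here, a minimal sketch of one way to collect the relevant package versions in one go. It assumes Python 3.8+ (for `importlib.metadata`, which is not available on the Python 3.6 used in the original report), and the helper name `installed_versions` is made up for illustration; it is not part of tsfresh or pandas.

```python
from importlib import metadata


def installed_versions(names):
    """Return {package name: installed version string, or None if not installed}.

    Hypothetical helper for bug reports; only reads package metadata,
    it does not import the packages themselves.
    """
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Example usage when filing a report:
# for name, version in installed_versions(["pandas", "numpy", "tsfresh"]).items():
#     print(name, version)
```

On older interpreters, printing `pd.__version__`, `np.__version__` and `tsfresh.__version__` by hand gives the same information.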
I downgraded pandas from version 0.24.0 to 0.23.4 and now everything works fine, so this is a pandas issue.
Thanks for the help @earthgecko :)
Had the same issue
I created an issue about this on the pandas project: https://github.com/pandas-dev/pandas/issues/25087
Thanks for spotting this, will limit pandas version to 0.23.4
Same issue; it works with pandas 0.23.4.
So, using pandas 0.23.4 fixes it for you?
@MaxBenChrist yes it does! :)
I am having the same issue; the only problem is that I already have pandas 0.23.4...
Can you please post the output of pandas.show_versions() here?
I am also having the same issue. My pandas version is 0.24.2. Is there any solution other than downgrading?
Aside: for time series of length < 1000, it seems to work for me (at least).
Same issue; I can confirm that downgrading to 0.23.4 fixes it. Love tracing pandas bugs, I somehow already found 2 this quarter.
Same issue, solved by downgrading to pandas 0.23.4. My tsfresh version is 0.11.2 and my numpy version is 1.16.3.
The example works perfectly well for me with
>>> pd.__version__, np.__version__, tsfresh.__version__
('0.25.1', '1.16.2', '0.12.0')
We have excluded the "bad" pandas versions and are now on a newer one.
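As a rough illustration of what "excluding the bad versions" could look like on the user side, here is a minimal sketch of a version check. It rests on an assumption drawn only from the reports in this thread: 0.24.0 and 0.24.2 failed while 0.23.4 and 0.25.1 worked, so the whole 0.24.x series is treated as affected. One commenter also saw the error on 0.23.4, so this is not a definitive list, and it is not the constraint tsfresh itself uses. Both function names are hypothetical.

```python
def parse_version(version):
    """Turn a version string such as '0.24.2' into a tuple of ints, e.g. (0, 24, 2).

    Non-numeric suffixes (as in '0.25.0rc1') are cut off at the first non-digit.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_reported_affected(pandas_version):
    """True if the version falls in the 0.24.x series.

    Assumption: the affected range is inferred from this thread only.
    """
    return parse_version(pandas_version)[:2] == (0, 24)


# Example usage:
# import pandas as pd
# if is_reported_affected(pd.__version__):
#     print("pandas", pd.__version__, "was reported to trigger the TypeError")
```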