Tsfresh: Tsfresh features for unsupervised clustering

Created on 29 Apr 2020 · 7 Comments · Source: blue-yonder/tsfresh

Dear All,

I have extracted the features using tsfresh for accelerometer signal data.
Can anyone please advise which clustering algorithm to use with tsfresh features?

Regards,
Sukanya

All 7 comments

Hi @Sukanya2191 !
It seems no one in the community has any real ideas to share here!
Unfortunately, I cannot really help you either, other than to say that you will need to try it out (as with all ML methods) and probably do some grid search. I would not expect clustering tsfresh features to be any different from any other ML clustering problem.

In the meantime, did you maybe already come up with a solution that you want to share?
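
To make the "try it out and grid-search" advice a bit more concrete, here is a minimal sketch using scikit-learn. It is only an illustration under assumptions: the feature matrix is synthetic (three made-up groups standing in for what tsfresh's `extract_features` would return), and KMeans with a silhouette-score search over the number of clusters is just one possible choice, not a recommendation from the tsfresh maintainers. Real tsfresh output may also contain NaN values that need handling before clustering.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for a tsfresh feature matrix
# (rows = time series, columns = extracted features).
rng = np.random.default_rng(0)
centers = np.array(
    [[0, 0, 0, 0], [10, 10, 10, 10], [-10, 10, -10, 10]], dtype=float
)
X = np.vstack([c + rng.normal(size=(50, 4)) for c in centers])
# Put the columns on very different scales, as extracted features often are.
X *= np.array([1.0, 100.0, 0.01, 5.0])

# Scale first: KMeans is distance based, so unscaled features would dominate.
X_scaled = StandardScaler().fit_transform(X)

# Small grid search over the number of clusters, scored by silhouette.
scores = {}
for k in range(2, 7):
    candidate = KMeans(n_clusters=k, n_init=10, random_state=0)
    scores[k] = silhouette_score(X_scaled, candidate.fit_predict(X_scaled))

best_k = max(scores, key=scores.get)
labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X_scaled)
print("best k:", best_k)
```

The same loop can be repeated with other algorithms or other scoring functions; the point is only that the features are treated like any other tabular data.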

Hi @nils-braun

I am attempting some clustering and will share the method. In the meantime, is there a way to extract relevant features when y is not defined? Just wondering if I am missing something or if y is required to compute relevance. Thanks!

Hi @e5k !
That would be much appreciated - thanks!

No, it is impossible to extract relevant features without knowing the target. Not because it is not implemented in tsfresh, but because it is not possible: when the target is (yet) unknown, the relevance of a feature is undefined (think about it this way: a feature can be relevant for one target, but irrelevant for another. If the target is unknown, how should we know whether it is relevant or not?).
That being said, if you only want to know whether the features describe your data well, for example, you could try to create a forecasting task out of your data. Your target would then be the value at the next time step, and you could use this for feature selection. However, depending on your real problem/target, that might not help.
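
The forecasting idea above can be sketched roughly as follows. This is a hedged illustration, not tsfresh's own selection routine: the signal is synthetic, the handful of window features are computed by hand, and scikit-learn's `f_regression` is swapped in as a simple stand-in for a relevance test. The window length and the feature names are made up for the example.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import f_regression

# ASSUMPTION: synthetic signal standing in for one accelerometer channel.
rng = np.random.default_rng(0)
t = np.arange(600)
series = np.sin(2 * np.pi * t / 50) + 0.1 * rng.normal(size=t.size)

window = 10  # hypothetical window length
rows, targets = [], []
for end in range(window, len(series)):
    w = series[end - window:end]
    rows.append({
        "mean": w.mean(),
        "std": w.std(),
        "last": w[-1],
        "slope": np.polyfit(np.arange(window), w, 1)[0],
        "noise": rng.normal(),  # deliberately uninformative feature
    })
    # The surrogate target is the value right after the window.
    targets.append(series[end])

X = pd.DataFrame(rows)
y = np.asarray(targets)

# Univariate F-test of each feature against the surrogate target;
# smaller p-values suggest a stronger (linear) relationship.
_, p_values = f_regression(X, y)
relevance = pd.Series(p_values, index=X.columns).sort_values()
print(relevance)
```

Features that rank well here are only relevant for predicting the next value, which, as said above, may or may not line up with the clustering you actually care about.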

To cluster time series, you could have a look at pyts and tslearn. The same person, @johannfaouzi, is behind both packages, and in this paper there is an introduction to and comparison of several time series packages in Python. tslearn does the clustering using classical ML algorithms, and it looks to me like it applies them directly to the time series without extracting features first.

@rpanai Thanks a lot for these references Roberto

pyts is focused on time series classification only and does not provide utilities for clustering. tslearn has a clustering submodule with several algorithms.

@e5k Did you have a chance to try out clustering with tsfresh, or did you use one of the libraries proposed in the other comments?
