Joss-reviews: [REVIEW]: tsfeaturex: An R Package for Automating Time Series Feature Extraction

Created on 26 Feb 2019  ·  72 Comments  ·  Source: openjournals/joss-reviews

Submitting author: @nelsonroque (Nelson Roque)
Repository: https://github.com/nelsonroque/tsfeaturex
Version: v.0.3.10
Editor: @xuanxu
Reviewer: @acolum, @aj2duncan
Archive: 10.5281/zenodo.3235466

Status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/aa9198d80b72aecc2418ad94f4e7ab1a"><img src="http://joss.theoj.org/papers/aa9198d80b72aecc2418ad94f4e7ab1a/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/aa9198d80b72aecc2418ad94f4e7ab1a/status.svg)](http://joss.theoj.org/papers/aa9198d80b72aecc2418ad94f4e7ab1a)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@acolum & @aj2duncan, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.theoj.org/about#reviewer_guidelines. Any questions/concerns please let @xuanxu know.

Please try to complete your review in the next two weeks.

Review checklist for @acolum

Conflict of interest

Code of Conduct

General checks

  • [x] Repository: Is the source code for this software available at the repository url?
  • [x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • [x] Version: v.0.3.10
  • [x] Authorship: Has the submitting author (@nelsonroque) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • [x] Installation: Does installation proceed as outlined in the documentation?
  • [x] Functionality: Have the functional claims of the software been confirmed?
  • [x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • [x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • [x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • [x] Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • [x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • [x] Authors: Does the paper.md file include a list of authors with their affiliations?
  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

Review checklist for @aj2duncan

Conflict of interest

Code of Conduct

General checks

  • [x] Repository: Is the source code for this software available at the repository url?
  • [x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • [x] Version: v.0.3.10
  • [x] Authorship: Has the submitting author (@nelsonroque) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • [x] Installation: Does installation proceed as outlined in the documentation?
  • [x] Functionality: Have the functional claims of the software been confirmed?
  • [x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • [x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • [x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • [x] Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • [x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • [x] Authors: Does the paper.md file include a list of authors with their affiliations?
  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?
accepted published recommend-accept review

All 72 comments

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @acolum, it looks like you're currently assigned as the reviewer for this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands
Attempting PDF compilation. Reticulating splines etc...

Here are some things you can ask me to do:

# List Whedon's capabilities
@whedon commands

# List of editor GitHub usernames
@whedon list editors

# List of reviewers together with programming language preferences and domain expertise
@whedon list reviewers

EDITORIAL TASKS

# Compile the paper
@whedon generate pdf

# Compile the paper from alternative branch
@whedon generate pdf from branch custom-branch-name

# Ask Whedon to check the references for missing DOIs
@whedon check references

Hi @nelsonroque, I'm having a little trouble with the example that you've got in the README.md of your repo. When I run

library(tsfeaturex)

# for reproducibility of this example
set.seed(516)

# create test data
dat <- data.frame(expand.grid(day=c(1:7),id=c(1:100)))
dat$y <- rnorm(nrow(dat),5,1.5)
dat$y[1:3] <- NA # introduce NAs to check

# run function
out.list <- extract_features(df=dat,group_var="id",value_var="y",features="all")

I get the error

 Error in packageVersion("featuRe") : package ‘featuRe’ not found 

I will keep working on my checklist but wanted to flag this up just now.

@nelsonroque ☝️

@nelsonroque — Did you get the initial comments to start working on?

@acolum, @aj2duncan — What's your status with the continued review of this submission?

There has actually been progress made, in that having noticed the package has been updated, I reinstalled it and reran the example code. This time I get a different error.

The example code currently reads

# load library
library(tsfeaturex)

# for reproducibility of this example
set.seed(516)

# create test data
dat <- data.frame(expand.grid(day=c(1:7),id=c(1:100)))
dat$y <- rnorm(nrow(dat),5,1.5)
dat$y[1:3] <- NA # introduce NAs to check

# run function
out.list <- extract_features(df=dat,group_var="id",value_var="y",features="all")

# convert list to data.frame (MapReduce)
final.df <- features_to_df(out.list, data.format="wide")

# get feature correlations
cor.df <- feature_correlations(final.df, data.format="wide")

# view results
View(final.df)

This gives errors about group_var and id_var in the calls to features_to_df() and feature_correlations() (the calls that create final.df and cor.df). However, if I add these arguments so the code becomes

# load library
library(tsfeaturex)

# for reproducibility of this example
set.seed(516)

# create test data
dat <- data.frame(expand.grid(day=c(1:7),id=c(1:100)))
dat$y <- rnorm(nrow(dat),5,1.5)
dat$y[1:3] <- NA # introduce NAs to check

# run function
out.list <- extract_features(df=dat,group_var="id",value_var="y",features="all")

# convert list to data.frame (MapReduce)
final.df <- features_to_df(out.list, data.format="wide", group_var = "id")

# get feature correlations
cor.df <- feature_correlations(final.df, data.format="wide", id_var = "id")

# view results
View(final.df)

The code runs and I only get a warning of:

Warning message:
In cor(.) : the standard deviation is zero

As the package is now working I will continue with the review. @nelsonroque could you please update the example code so it runs without error.
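The warning above is what base R's cor() emits when at least one column is constant. The following is a minimal sketch of that behaviour in base R only; the column names are hypothetical, and whether a constant feature is really what triggers the warning in the README example is an assumption, not something confirmed in this thread.

```r
# Illustrative data: one column is constant, so its standard deviation is zero.
# Column names are hypothetical and not taken from tsfeaturex output.
set.seed(516)
feat <- data.frame(
  constant_feature = rep(1, 10),
  varying_feature  = rnorm(10)
)

# cor() cannot compute a correlation involving the constant column:
# the off-diagonal entries are NA and R warns
# "the standard deviation is zero". The warning is suppressed here
# only to keep the example output quiet.
cor_mat <- suppressWarnings(cor(feat))

# Find zero-variance columns before correlating, so they can be
# inspected or dropped deliberately.
is_constant <- vapply(feat, function(x) sd(x, na.rm = TRUE) == 0, logical(1))
constant_cols <- names(feat)[is_constant]
print(constant_cols)
```

If this is the cause, the warning is harmless, but it may be worth mentioning in the README so users are not surprised by it.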

I am so sorry @labarba @aj2duncan @arfon I am just seeing this now. I had noticed the error pointed out post submission. I will be sure to update the examples in the next day. Again sorry for my delay.

@labarba @aj2duncan @arfon the example code has been modified to run without error. Thank you for your time and feedback!

@nelsonroque I’ve gone back through things, and although I still need to investigate the functionality in more detail, I do have a few points for you to consider and some queries you could maybe help me out with.

In the paper:

  • I think there needs to be a doi added for the Christ et al paper
  • I also think that R and any packages that you mention should be cited too.
  • Finally (@acolum, I'd be much obliged for your opinion here), I’m not sure about the use of “long format and wide format” for a non-specialist user. @arfon, @labarba perhaps you could provide some guidance for me here.

In the documentation:

  • I can’t find a statement of need, although there is a very good one in the paper.
  • you’ve got two issue formats but I wonder what you think about whether the reporting of issues could be highlighted in the readme? Also, where does a user go for help?
  • You don’t seem to have used any automated testing (please do correct me if I’m wrong)? How do the users know if the code is working as intended?
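As a rough illustration of the automated-testing point above, here is a minimal sketch of the kind of check that could be run on a single feature. The function feature_range() is hypothetical and stands in for any one feature; it is not part of tsfeaturex, and the checks use base R's stopifnot() rather than a test framework.

```r
# Hypothetical feature function, used only for illustration.
# Returns the range (max - min) of a series, ignoring missing values.
feature_range <- function(x) {
  if (all(is.na(x))) {
    return(NA_real_)  # nothing to summarise
  }
  diff(range(x, na.rm = TRUE))
}

# Checks against values that are easy to verify by hand.
stopifnot(
  feature_range(c(1, 4, 2)) == 3,     # 4 - 1
  feature_range(c(NA, 2, 7)) == 5,    # NA is ignored, 7 - 2
  is.na(feature_range(c(NA, NA)))     # all-missing input gives NA
)
```

The same assertions could be wrapped in a test framework such as testthat and run on every commit, so users can see that each feature behaves as documented.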

@nelsonroque I second @aj2duncan's suggestions and can add a few more.

In the paper under the Figures section, I think you should elaborate more on your example. It would be beneficial to see an example (with code) of how tsfeaturex can calculate different features and characterize differences between two time series with the same mean but different shapes/peaks. I also think that you should clarify with a diagram and/or more detailed description what long and wide format data means for this package. This would benefit both specialist and non-specialist users.

In the documentation, the dependencies in the description file should also be added to the readme.
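To make the long versus wide distinction raised above concrete, here is a small sketch using base R's reshape(). The column names and numbers are made up for illustration and are not output from tsfeaturex; the package's own long and wide layouts may differ in detail.

```r
# Wide format: one row per individual, one column per measurement.
# All values are illustrative.
wide <- data.frame(
  id           = c(1, 2),
  feature_mean = c(5.1, 4.8),
  feature_sd   = c(1.4, 1.6)
)

# Long format: one row per individual per measurement.
long <- reshape(
  wide,
  direction = "long",
  idvar     = "id",
  varying   = c("feature_mean", "feature_sd"),
  v.names   = "value",
  timevar   = "feature",
  times     = c("feature_mean", "feature_sd")
)
rownames(long) <- NULL

print(wide)  # 2 rows, 3 columns
print(long)  # 4 rows: columns id, feature, value
```

Both tables hold the same four numbers; only the layout changes, which is the point a diagram in the paper would need to convey.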

@aj2duncan @acolum thank you for all your feedback. I have addressed all points raised, including a new draft of the paper, with updated references, additions and changes to readme.md, a new docs/ folder on root with changelog and feature list in markdown format.

@acolum I had a question regarding your comment: "It would be beneficial to see an example (with code) of how tsfeaturex can calculate different features and characterize differences between two time series with the same mean but different shapes/peaks." I can easily simulate multiple data streams to have a common mean, and then calculate all features for each 'person' -- but how would you like me to visualize this? A graph like the one that exists now, with perhaps a data table underneath, showing each feature (currently would be 82 feature columns), and highlighting cells that are different?

@nelsonroque After more thought, I've realized that including the example in my comment wouldn't be necessary given all of the other detailed documentation you've included.

Overall, I really liked all of the changes you've recently made to your package, documentation, and paper. With these improvements, I was able to check off everything on my review list and am recommending this paper for publication.

Thanks for making all those changes @nelsonroque. I agree entirely with @acolum and I can now too recommend this for publication.

Thank you @acolum and @aj2duncan!

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@whedon check references

Attempting to check references...

```Reference check summary:

OK DOIs

MISSING DOIs

  • None

INVALID DOIs

  • None
    ```

@nelsonroque Some needed minor changes and typos I found in the compiled pdf:

  • In the Statement of need section: I think this sentence needs rewording to make it clearer:
    In raw, form the 2.5 quintillion bytes of raw data generated daily are difficult to interpret -- noisy time-series
  • The Zenodo archive will be linked on the first page of the paper, so we ask that you don't add an additional link in the text or references. Can you edit the (Roque, 2019) citation out?
  • In the Functionality section: There is an orphan closing parenthesis in ...separately for each burst)
  • Figures 1 and 2 are misplaced among the text. Please write each of them in its own paragraph.

You can also regenerate the pdf anytime with @whedon generate pdf if you want to give the paper further proof-reads.

@xuanxu thank you for your feedback. I've addressed points 1-3 and will be testing formatting of Figures 1 / 2 in generated pdf.

@whedon generate pdf

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@xuanxu the figures are shifting to new locations, and not nesting under the heading as I have it in syntax, see below. Should I place Figures heading after Acknowledgements heading?

# Figures
Figure 1 depicts example `wide`(top) and `long`(bottom) data structures for a dataset containing two (2) measurements from two (2) individuals.  Notice that there is one row for each individual in the `wide` format, and two (2) rows for each individual in the `long` format, one for each column.

![Figure 1. Flexible data structure output -- request `long` or `wide` format](datashape.PNG "Figure 1. Flexible data structure output -- request `long` or `wide` format")

Figure 2 depicts sample time series data from two participants, both with mean value of 5. You will notice, although they have identical means, the shape of the time series, and locations of peaks is different. `tsfeaturex` calculates features to better characterize differences such as these.

![Figure 2. The mean doesn't tell the whole story](figure.png "Figure 2. The mean does not fully describe the time-series.")
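The idea behind Figure 2 can be sketched in a few lines of base R. The two series below are invented for illustration, and the summaries used (standard deviation, range, and a simple peak count) are generic stand-ins, not the feature definitions implemented in tsfeaturex.

```r
# Two illustrative series of equal length, both with mean 5.
smooth_series <- c(3, 4, 5, 6, 7, 5, 5)
spiky_series  <- c(2, 8, 2, 8, 2, 8, 5)

# Count interior local maxima: points where the series rises and then falls.
count_peaks <- function(x) {
  d <- diff(x)
  sum(d[-length(d)] > 0 & d[-1] < 0)
}

# Summarise each series with a few simple features.
summarise_series <- function(x) {
  c(mean = mean(x), sd = sd(x), range = diff(range(x)), n_peaks = count_peaks(x))
}

features <- rbind(
  smooth = summarise_series(smooth_series),
  spiky  = summarise_series(spiky_series)
)
print(features)
```

The mean column is identical for both rows, while the other columns differ, which is the distinction the figure is meant to show.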

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

the figures are shifting to new locations

I don't know why that's happening.
Maybe the image is relocated if there is not enough space left in that page? :thinking:

Thank you! I think it is fixed now -- the latest proof places figures centered on a single page and doesn't break up other sections.

@nelsonroque Great! Next steps:

  • Since version 0.3.6 there have been almost 60 commits with changes made during the review process. Can you release a new version of tsfeaturex?
  • Then please make a Zenodo archive and report the DOI here

@xuanxu thank you! I've released version 0.3.10 on Github and created a Zenodo archive:
DOI: 10.5281/zenodo.3235466

@aj2duncan @arfon @labarba thank you all for your feedback along this process!

@whedon set v.0.3.10 as version

OK. v.0.3.10 is the version.

@nelsonroque please edit the Zenodo archive metadata (title and author list) to match the paper (you can also add your ORCID).

@xuanxu I added a .zenodo.json to the root of the repo on Github with updated author / title -- but not seeing this information in the release on Zenodo (0.3.10). I will create a new release now that should have the updated info.

@xuanxu here is what the .zenodo.json looks like -- not sure where I went wrong:
https://github.com/nelsonroque/tsfeaturex/compare/v.0.3.10...master

@nelsonroque You can edit the metadata (change title and authors) directly on the Zenodo website.

@xuanxu sorry for the trouble -- I got it! https://zenodo.org/record/3235466#.XPAJHohKiUk

@xuanxu sorry for the trouble

No trouble at all

I got it!

:tada:

@whedon set 10.5281/zenodo.3235466 as archive

OK. 10.5281/zenodo.3235466 is the archive.

This is ready!
Pinging EIC for publication: @openjournals/joss-eics

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

Hi @nelsonroque, just a few minor changes needed for the paper:

Hi @kyleniemeyer absolutely, all changes have been made!

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@nelsonroque it looks like the affiliation info is still not quite complete; really I'm just looking for city, state, country (like what you'd typically see following department/center and university)

@kyleniemeyer sorry I was adding affiliation on Zenodo side. I have added it to the paper.md

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@whedon accept

Attempting dry run of processing paper acceptance...

```Reference check summary:

OK DOIs

  • 10.1016/j.neucom.2018.03.067 is OK
  • 10.1037/a0014173 is OK
  • 10.18637/jss.v014.i06 is OK
  • 10.5334/jors.123 is OK
  • 10.1016/j.physrep.2011.05.003 is OK

MISSING DOIs

  • None

INVALID DOIs

  • None
    ```

Check final proof :point_right: https://github.com/openjournals/joss-papers/pull/723

If the paper PDF and Crossref deposit XML look good in https://github.com/openjournals/joss-papers/pull/723, then you can now move forward with accepting the submission by compiling again with the flag deposit=true e.g.
@whedon accept deposit=true

@whedon accept deposit=true

Doing it live! Attempting automated processing of paper acceptance...

🐦🐦🐦 👉 Tweet for this paper 👈 🐦🐦🐦

🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨

Here's what you must now do:

  1. Check final PDF and Crossref metadata that was deposited :point_right: https://github.com/openjournals/joss-papers/pull/724
  2. Wait a couple of minutes to verify that the paper DOI resolves https://doi.org/10.21105/joss.01279
  3. If everything looks good, then close this review issue.
  4. Party like you just published a paper! 🎉🌈🦄💃👻🤘

    Any issues? notify your editorial technical team...

Congrats @nelsonroque on your submission's publication in JOSS! Thanks to @acolum and @aj2duncan for reviewing and @xuanxu for editing!

:tada::tada::tada: Congratulations on your paper acceptance! :tada::tada::tada:

If you would like to include a link to your paper from your README use the following code snippets:

Markdown:
[![DOI](http://joss.theoj.org/papers/10.21105/joss.01279/status.svg)](https://doi.org/10.21105/joss.01279)

HTML:
<a style="border-width:0" href="https://doi.org/10.21105/joss.01279">
  <img src="http://joss.theoj.org/papers/10.21105/joss.01279/status.svg" alt="DOI badge" >
</a>

reStructuredText:
.. image:: http://joss.theoj.org/papers/10.21105/joss.01279/status.svg
   :target: https://doi.org/10.21105/joss.01279

We need your help!

Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us please consider doing either one (or both) of the following:
