Joss-reviews: [REVIEW]: rGUIDANCE – alignment confidence score computation in R

Created on 26 Mar 2019 · 63 Comments · Source: openjournals/joss-reviews

Submitting author: @FranzKrah (Franz-Sebastian Krah)
Repository: https://github.com/FranzKrah/rGUIDANCE
Version: 1.0
Editor: @karthik
Reviewer: @shaunpwilkinson
Archive: 10.5281/zenodo.2654302

Status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/b8b70cb72a3ec8331353795e3f83cfca"><img src="http://joss.theoj.org/papers/b8b70cb72a3ec8331353795e3f83cfca/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/b8b70cb72a3ec8331353795e3f83cfca/status.svg)](http://joss.theoj.org/papers/b8b70cb72a3ec8331353795e3f83cfca)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@shaunpwilkinson , please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.theoj.org/about#reviewer_guidelines. Any questions/concerns please let @karthik know.

✨ Please try and complete your review in the next two weeks ✨

Review checklist for @shaunpwilkinson

Conflict of interest

Code of Conduct

General checks

  • [x] Repository: Is the source code for this software available at the repository url?
  • [x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • [x] Version: Does the release version given match the GitHub release (1.0)?
  • [x] Authorship: Has the submitting author (@FranzKrah) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • [x] Installation: Does installation proceed as outlined in the documentation?
  • [x] Functionality: Have the functional claims of the software been confirmed?
  • [x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • [x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • [x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • [x] Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • [x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • [x] Authors: Does the paper.md file include a list of authors with their affiliations?
  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?
accepted published recommend-accept review

All 63 comments

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @shaunpwilkinson it looks like you're currently assigned as the reviewer for this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands

Attempting PDF compilation. Reticulating splines etc...

@shaunpwilkinson Please follow instructions here and work through the checklist. Let me know if you have any questions. You can use Whedon to generate a new pdf anytime with @whedon generate pdf or check references with @whedon check references

@whedon check references

Attempting to check references...

```
Reference check summary:

OK DOIs

  • 10.1111/j.1461-0248.2009.01314.x is OK
  • 10.1186/1471-2105-7-471 is OK
  • 10.1371/journal.pone.0018093 is OK
  • 10.1093/sysbio/syv033 is OK
  • 10.1093/bioinformatics/btu033 is OK
  • 10.1093/molbev/msq066 is OK
  • 10.1093/nar/gkv318 is OK
  • 10.1142/9789812776136_0003 is OK
  • 10.1186/s12862-018-1229-7 is OK
  • 10.1016/j.patrec.2005.10.010 is OK
  • 10.1093/bioinformatics/bti623 is OK

MISSING DOIs

  • None

INVALID DOIs

  • None
```

I checked the proof version and I am not sure why but the Figure captions are not there...
Franz

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

Franz: Thanks for catching that. Your figure captions render locally for me but we'll work to get this sorted out before publication.

@shaunpwilkinson 👋 I see we have no checked items yet on your checklist. When do you think you might get to this review? Thanks!

Hi @labarba very sorry about the delay on this, I'm back now and working on this review, will have it done either this weekend or very early next week. Thanks for your patience, and apologies again - S

Thank you @shaunpwilkinson!

This is a very helpful package that fills a critical gap in the R workflow for phylogenetic analysis. The accompanying paper is clear and well written, and an example is included demonstrating the use of the package to improve the resolution of a phylogeny by factoring in column-wise alignment uncertainty.
The tutorial needs some refining and additional notes to help users through the process without encountering issues, particularly when calling the third-party executables (outlined below). The example also involves downloading sequences from NCBI, which means that the set of sequences downloaded may change from one day to the next. This caused me a few issues and required some troubleshooting. For example, the outgroup _Helvella aestivalis_ was not included in the returned sequence set when I ran the script, so the raxml calls produced an error. Also, the sequence names in the alignment caused raxml to fail with a rather enigmatic error message (it would be worth noting in the tutorial that the sequence names need to be truncated, must not include special metacharacters, and must all be unique). To avoid these issues I would suggest either including a pre-downloaded sequence data file with the package or using an existing sequence dataset such as the woodmouse alignment (ape package).

There is also a vital line missing from the example script: run(wd = wd) should be added before all_clusters <- read_phylota(wd).

Some other minor issues are included below:

The full list of dependencies should be included in the tutorial, along with links to the various repositories, optionally some basic installation instructions for Windows, Mac and Linux, and additional annotation/notes should be included to make it clear where paths to executables need to be modified. Some tips on finding the executables within the file system may also be helpful for some users.

The latest version of the ips package on CRAN is v0.0.7, but the rGUIDANCE package depends on v0.0.11. Please either add devtools::install_github("heibl/ips") to the example script or include the ips github repo in the 'Remotes' field of the DESCRIPTION file.

The example script includes local directories that won't apply for other users, e.g. wd <- 'Documents/PhD/proj/low_priority/phylotaR/helvella'. I would advise just using '[path-to-directory]' or something similar. Alternatively rOpenSci may have a standard for setting working directory in example scripts. @karthik may be able to advise on this?

Consider changing the number of threads in the example to something more manageable for non-server users. I would advise changing the ncore argument to 1 for the purposes of the tutorial.

str_extract needs the stringr package installed; this should be added to the example script. In this and other calls it would be safer to use the double-colon operator to specify the package, since the function called may depend on the order that the various packages were loaded. I would also specify the full method where applicable too, for example use w <- phylogram::as.dendrogram.phylo(w) instead of w <- as.dendrogram(w).

On a final note, I found the manuscript provided a really helpful and intuitive guide to using R for phylogenetic analysis. In case of interest (and at the risk of a plug) the aphid package also does multiple sequence alignment within R and doesn't require any third party programs.

I hope this helps - thanks for the opportunity to review. I'm looking forward to including this package in my own workflows.

All the best,
Shaun

Thanks @shaunpwilkinson!
@FranzKrah Can you please make these fixes and respond here? I opened a couple of issues on your repo.

Hi Karthik,

I copy-pasted Shaun's comments below and answered each. We addressed all the comments, which improved the package! Thanks!

Cheers,
Franz

This is a very helpful package that fills a critical gap in the R workflow for phylogenetic analysis. The accompanying paper is clear and well written, and an example is included demonstrating the use of the package to improve the resolution of a phylogeny by factoring in column-wise alignment uncertainty.
The tutorial needs some refining and additional notes to help users through the process without encountering issues, particularly when calling the third-party executables (outlined below). The example also involves downloading sequences from NCBI, which means that the set of sequences downloaded may change from one day to the next. This caused me a few issues and required some troubleshooting. For example, the outgroup Helvella aestivalis was not included in the returned sequence set when I ran the script, so the raxml calls produced an error. Also, the sequence names in the alignment caused raxml to fail with a rather enigmatic error message (it would be worth noting in the tutorial that the sequence names need to be truncated, must not include special metacharacters, and must all be unique). To avoid these issues I would suggest either including a pre-downloaded sequence data file with the package or using an existing sequence dataset such as the woodmouse alignment (ape package).
>> Thanks. We added a cautionary note at the beginning of the tutorial regarding the third-party executables. There we also provide URLs to the program websites where they can be downloaded.
Further, it is correct that the code might run into trouble given that sequences are constantly updated on GenBank. Therefore, we revised the tutorial and provided a pre-downloaded sequence data file, so that users are able to run guidance without errors.
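
The naming constraints mentioned above (truncated, free of metacharacters, unique) can be sketched in base R. This is a minimal illustration, not code from rGUIDANCE or ips: the helper name sanitize_labels, the default max_len of 50 and the example labels are all hypothetical, and the exact set of characters a given RAxML build rejects should be checked against its manual.

```r
# Hypothetical helper (not part of rGUIDANCE or ips): make sequence labels
# safer for programs that are strict about taxon names.
sanitize_labels <- function(labels, max_len = 50) {
  # Replace every run of characters other than letters, digits and
  # underscores with a single underscore
  clean <- gsub("[^A-Za-z0-9_]+", "_", labels)
  # Drop underscores left at the start or end by the replacement
  clean <- gsub("^_+|_+$", "", clean)
  # Truncate to at most max_len characters
  clean <- substr(clean, 1, max_len)
  # Make repeats unique by appending _1, _2, ...
  # (note: the suffix can push a label slightly past max_len)
  make.unique(clean, sep = "_")
}

# Made-up labels for illustration only
labels <- c("Helvella lacunosa (voucher: H123)",
            "Helvella lacunosa (voucher: H123)",
            "Helvella crispa|ITS")
print(sanitize_labels(labels, max_len = 20))
```

For an alignment held as a matrix (for example a DNAbin matrix), the same helper could be applied with rownames(aln) <- sanitize_labels(rownames(aln)), assuming the labels are stored as row names.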

There is also a vital line missing from the example script: run(wd = wd) should be added before all_clusters <- read_phylota(wd).

>> Thanks. We added this line

Some other minor issues are included below:

The full list of dependencies should be included in the tutorial, along with links to the various repositories, optionally some basic installation instructions for Windows, Mac and Linux, and additional annotation/notes should be included to make it clear where paths to executables need to be modified. Some tips on finding the executables within the file system may also be helpful for some users.
>> Thanks. We added URLs to the external programs and briefly introduced the most important R packages used. We also added information on what executables are and how to specify them, and included URLs to tutorials on how to find executables on Windows or Mac.

The latest version of the ips package on CRAN is v0.0.7, but the rGUIDANCE package depends on v0.0.11. Please either add devtools::install_github("heibl/ips") to the example script or include the ips github repo in the 'Remotes' field of the DESCRIPTION file.
>> We added Remotes: github::heibl/ips to the DESCRIPTION file
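
For readers unfamiliar with the field, the relevant part of a DESCRIPTION file might look like the fragment below. This is a sketch only: whether ips is listed under Imports or Depends in rGUIDANCE, and the exact version constraint, are assumptions based on the version number quoted above. The Remotes field is read by devtools/remotes when installing from GitHub.

```
Imports:
    ips (>= 0.0.11)
Remotes:
    github::heibl/ips
```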

The example script includes local directories that won't apply for other users, e.g. wd <- 'Documents/PhD/proj/low_priority/phylotaR/helvella'. I would advise just using '[path-to-directory]' or something similar. Alternatively rOpenSci may have a standard for setting working directory in example scripts. @karthik may be able to advise on this?
>> Sorry, we changed this to "[path-to-directory]"

Consider changing the number of threads in the example to something more manageable for non-server users. I would advise changing the ncore argument to 1 for the purposes of the tutorial.
>> We changed this to 1 thread.
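
One way to keep a tutorial script portable on both of the points above is sketched here. It is an illustration rather than the tutorial's actual code: the directory name "rguidance-example" is made up, and using tempdir() is an alternative to the "[path-to-directory]" placeholder the authors chose.

```r
# Working directory under the session's temporary directory, so the script
# does not depend on any user-specific path (directory name is hypothetical)
wd <- file.path(tempdir(), "rguidance-example")
dir.create(wd, recursive = TRUE, showWarnings = FALSE)

# Default to a single thread, as suggested in the review
ncore <- 1L

# detectCores() can return NA on some platforms; fall back to 1 in that case
max_cores <- parallel::detectCores()
if (is.na(max_cores)) max_cores <- 1L

# Never request more threads than the machine reports
ncore <- min(ncore, max_cores)
```

Users on larger machines can then raise ncore deliberately instead of inheriting a server-sized default.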

str_extract needs the stringr package installed; this should be added to the example script. In this and other calls it would be safer to use the double-colon operator to specify the package, since the function called may depend on the order that the various packages were loaded. I would also specify the full method where applicable too, for example use w <- phylogram::as.dendrogram.phylo(w) instead of w <- as.dendrogram(w).
>> Thanks. We added the package name using the double-colon operator where applicable.
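
The masking problem behind this recommendation can be reproduced with base R alone. The sketch below defines a deliberately conflicting function named filter (hypothetical, for illustration only) and shows that the namespace-qualified call still reaches stats::filter. The final lines apply the same idea to stringr::str_extract, guarded so the snippet also runs where stringr is not installed; the example string is made up.

```r
# A user-defined function that happens to share a name with stats::filter
filter <- function(x, ...) stop("masked: this is not stats::filter")

x <- c(1, 2, 3, 4, 5)

# The unqualified call picks up whichever definition comes first on the
# search path, here the masking function above
unqualified <- tryCatch(
  filter(x, rep(1 / 3, 3)),
  error = function(e) conditionMessage(e)
)

# The qualified call always reaches the intended function
# (a centred three-point moving average)
qualified <- stats::filter(x, rep(1 / 3, 3))

# Same idea for the call named in the review, guarded in case stringr
# is not installed
if (requireNamespace("stringr", quietly = TRUE)) {
  genus <- stringr::str_extract("Helvella_lacunosa", "^[A-Za-z]+")
}
```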

On a final note, I found the manuscript provided a really helpful and intuitive guide to using R for phylogenetic analysis. In case of interest (and at the risk of a plug) the aphid package also does multiple sequence alignment within R and doesn't require any third party programs.
>> We looked into this and would be happy to include it in future versions. As far as we could see, it is currently not possible to use aphid::align because it does not allow a guide tree to be specified. This is, however, necessary within the GUIDANCE algorithm. Maybe you can change this and we can include it in a later version.

@whedon check references

Attempting to check references...

```
Reference check summary:

OK DOIs

  • 10.1111/j.1461-0248.2009.01314.x is OK
  • 10.1186/1471-2105-7-471 is OK
  • 10.1371/journal.pone.0018093 is OK
  • 10.1093/sysbio/syv033 is OK
  • 10.1093/bioinformatics/btu033 is OK
  • 10.1093/molbev/msq066 is OK
  • 10.1093/nar/gkv318 is OK
  • 10.1142/9789812776136_0003 is OK
  • 10.1186/s12862-018-1229-7 is OK
  • 10.1016/j.patrec.2005.10.010 is OK
  • 10.1093/bioinformatics/bti623 is OK

MISSING DOIs

  • None

INVALID DOIs

  • None
```

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@FranzKrah The captions are there now (after the conclusions).

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

Everything looks good to me. @shaunpwilkinson Did you skip checking off on the installation instructions list item for a good reason? If not please check that off and I can proceed to acceptance. 🙏

@karthik done

@whedon accept

No archive DOI set. Exiting...

Oops. @FranzKrah Can you please archive the package on Zenodo and post a DOI here?

Sorry, wasn't aware of that...

10.5281/zenodo.2653906

@whedon set 10.5281/zenodo.2653906 as archive

OK. 10.5281/zenodo.2653906 is the archive.

@FranzKrah Almost there. Can you edit the metadata on Zenodo to reflect you as the author (rather than just your GitHub handle)? And while you're there, also check other fields (esp title) to make sure they are also correct.

@karthik I am not sure how to do this. I tried changing the .zenodo.json file but got an error...

@karthik Ok. My name is now changed in the metadata in Zenodo. I hope this is what you meant...

@FranzKrah One more change (i.e. the title). It currently has your GitHub handle in the software title too. Can you fix that as well? Current citation:

Franz Sebastian Krah, & Christoph Heibl. (2019, April 30). FranzKrah/rGUIDANCE v1.0.8 (Version v1.0.8). Zenodo. http://doi.org/10.5281/zenodo.2654272

Otherwise the citation for the software archive will also have your Github handle rather than just the name of your software. See a correct example

@karthik I changed the title but I cannot get rid of "FranzKrah/rGUIDANCE: ".
I already tried to include a .zenodo.json file in the root of my GitHub repo but this creates errors in Zenodo...
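
For reference, a minimal .zenodo.json might look like the sketch below. It assumes Zenodo's GitHub integration reads this file from the repository root when a release is made and that it uses Zenodo's deposit metadata field names (title, upload_type, creators). The title value is an assumption about what the archive should be called, and the author names are taken from the citation and submission details in this thread. Malformed JSON or unrecognised field values may be one source of the errors mentioned above, though the thread does not say what the errors were.

```json
{
  "title": "rGUIDANCE",
  "upload_type": "software",
  "creators": [
    { "name": "Krah, Franz-Sebastian" },
    { "name": "Heibl, Christoph" }
  ]
}
```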

@FranzKrah Can you just create a new Zenodo release from scratch? And make sure all the metadata look correct. Then I can update it here and proceed with accepting? Otherwise you'll have an incorrect citation.

@karthik No I do not see how this would be possible. I think it is ok as is...

@whedon set 10.5281/zenodo.2654302 as archive

OK. 10.5281/zenodo.2654302 is the archive.

@FranzKrah When you go to the above linked DOI, click edit on the top right, then edit the metadata. Remove your GitHub user/repo from the title and save. Then it will generate a new DOI. You can post that here.

@karthik I changed the title and saved but DOI is the same:
10.5281/zenodo.2654302

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@whedon accept

Attempting dry run of processing paper acceptance...

```
Reference check summary:

OK DOIs

  • 10.1111/j.1461-0248.2009.01314.x is OK
  • 10.1186/1471-2105-7-471 is OK
  • 10.1371/journal.pone.0018093 is OK
  • 10.1093/sysbio/syv033 is OK
  • 10.1093/bioinformatics/btu033 is OK
  • 10.1093/molbev/msq066 is OK
  • 10.1093/nar/gkv318 is OK
  • 10.1142/9789812776136_0003 is OK
  • 10.1186/s12862-018-1229-7 is OK
  • 10.1016/j.patrec.2005.10.010 is OK
  • 10.1093/bioinformatics/bti623 is OK

MISSING DOIs

  • None

INVALID DOIs

  • None
```

PDF failed to compile for issue #1350 with the following error:

% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed

0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 13 0 13 0 0 143 0 --:--:-- --:--:-- --:--:-- 142
sh: 0: getcwd() failed: No such file or directory
sh: 0: getcwd() failed: No such file or directory
pandoc: 10.21105.joss.01350.pdf: openBinaryFile: does not exist (No such file or directory)
Looks like we failed to compile the PDF

@arfon Can you take a look at the error and proceed with accepting when resolved?

@whedon accept

Attempting dry run of processing paper acceptance...

Check final proof :point_right: https://github.com/openjournals/joss-papers/pull/648

If the paper PDF and Crossref deposit XML look good in https://github.com/openjournals/joss-papers/pull/648, then you can now move forward with accepting the submission by compiling again with the flag deposit=true e.g.
@whedon accept deposit=true

```
Reference check summary:

OK DOIs

  • 10.1111/j.1461-0248.2009.01314.x is OK
  • 10.1186/1471-2105-7-471 is OK
  • 10.1371/journal.pone.0018093 is OK
  • 10.1093/sysbio/syv033 is OK
  • 10.1093/bioinformatics/btu033 is OK
  • 10.1093/molbev/msq066 is OK
  • 10.1093/nar/gkv318 is OK
  • 10.1142/9789812776136_0003 is OK
  • 10.1186/s12862-018-1229-7 is OK
  • 10.1016/j.patrec.2005.10.010 is OK
  • 10.1093/bioinformatics/bti623 is OK

MISSING DOIs

  • None

INVALID DOIs

  • None
```

@whedon accept deposit=true

Doing it live! Attempting automated processing of paper acceptance...

🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨

Here's what you must now do:

  1. Check final PDF and Crossref metadata that was deposited :point_right: https://github.com/openjournals/joss-papers/pull/649
  2. Wait a couple of minutes to verify that the paper DOI resolves https://doi.org/10.21105/joss.01350
  3. If everything looks good, then close this review issue.
  4. Party like you just published a paper! 🎉🌈🦄💃👻🤘

    Any issues? notify your editorial technical team...

:tada::tada::tada: Congratulations on your paper acceptance! :tada::tada::tada:

If you would like to include a link to your paper from your README use the following code snippets:

Markdown:
[![DOI](http://joss.theoj.org/papers/10.21105/joss.01350/status.svg)](https://doi.org/10.21105/joss.01350)

HTML:
<a style="border-width:0" href="https://doi.org/10.21105/joss.01350">
  <img src="http://joss.theoj.org/papers/10.21105/joss.01350/status.svg" alt="DOI badge" >
</a>

reStructuredText:
.. image:: http://joss.theoj.org/papers/10.21105/joss.01350/status.svg
   :target: https://doi.org/10.21105/joss.01350

This is how it will look in your documentation:

DOI

We need your help!

Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us please consider doing either one (or both) of the following:
