Joss-reviews: [REVIEW]: rGUIDANCE – alignment confidence score computation in R

Created on 26 Mar 2019 · 63 comments · Source: openjournals/joss-reviews

Submitting author: @FranzKrah (Franz-Sebastian Krah)
Repository: https://github.com/FranzKrah/rGUIDANCE
Version: 1.0
Editor: @karthik
Reviewer: @shaunpwilkinson
Archive: 10.5281/zenodo.2654302

Status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/b8b70cb72a3ec8331353795e3f83cfca"><img src="http://joss.theoj.org/papers/b8b70cb72a3ec8331353795e3f83cfca/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/b8b70cb72a3ec8331353795e3f83cfca/status.svg)](http://joss.theoj.org/papers/b8b70cb72a3ec8331353795e3f83cfca)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@shaunpwilkinson , please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

Make sure you're logged in to your GitHub account
Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.theoj.org/about#reviewer_guidelines. Any questions/concerns please let @karthik know.

✨ Please try and complete your review in the next two weeks ✨

Review checklist for @shaunpwilkinson

Conflict of interest

[x] As the reviewer I confirm that I have read the JOSS conflict of interest policy and that there are no conflicts of interest for me to review this work.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the repository url?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Version: Does the release version given match the GitHub release (1.0)?
[x] Authorship: Has the submitting author (@FranzKrah) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Authors: Does the paper.md file include a list of authors with their affiliations?
[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

accepted published recommend-accept review

Most helpful comment

Hi @labarba very sorry about the delay on this, I'm back now and working on this review, will have it done either this weekend or very early next week. Thanks for your patience, and apologies again - S

shaunpwilkinson on 27 Apr 2019

👍2

All 63 comments

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @shaunpwilkinson it looks like you're currently assigned as the reviewer for this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands

whedon on 26 Mar 2019

Attempting PDF compilation. Reticulating splines etc...

whedon on 26 Mar 2019

:point_right: Check article proof :page_facing_up: :point_left:

whedon on 26 Mar 2019

@shaunpwilkinson Please follow instructions here and work through the checklist. Let me know if you have any questions. You can use Whedon to generate a new pdf anytime with @whedon generate pdf or check references with @whedon check references

karthik on 26 Mar 2019

@whedon check references

karthik on 26 Mar 2019

Attempting to check references...

whedon on 26 Mar 2019

```
Reference check summary:

OK DOIs

10.1111/j.1461-0248.2009.01314.x is OK
10.1186/1471-2105-7-471 is OK
10.1371/journal.pone.0018093 is OK
10.1093/sysbio/syv033 is OK
10.1093/bioinformatics/btu033 is OK
10.1093/molbev/msq066 is OK
10.1093/nar/gkv318 is OK
10.1142/9789812776136_0003 is OK
10.1186/s12862-018-1229-7 is OK
10.1016/j.patrec.2005.10.010 is OK
10.1093/bioinformatics/bti623 is OK

MISSING DOIs

None

INVALID DOIs

None
```

whedon on 26 Mar 2019

👍1

I checked the proof and the figure captions are missing; I am not sure why.
Franz

FranzKrah on 27 Mar 2019

@whedon generate pdf

karthik on 28 Mar 2019

Attempting PDF compilation. Reticulating splines etc...

whedon on 28 Mar 2019

:point_right: Check article proof :page_facing_up: :point_left:

whedon on 28 Mar 2019

Franz: Thanks for catching that. Your figure captions render locally for me but we'll work to get this sorted out before publication.

karthik on 28 Mar 2019

@shaunpwilkinson 👋 — I see we have no checked items yet on your checklist. When do you think you might get to this review? Thanks!

labarba on 15 Apr 2019

👍1

shaunpwilkinson on 27 Apr 2019

👍2

Thank you @shaunpwilkinson!

karthik on 27 Apr 2019

👍1

This is a very helpful package that fills a critical gap in the R workflow for phylogenetic analysis. The accompanying paper is clear and well written, and an example is included demonstrating the use of the package to improve the resolution of a phylogeny by factoring in column-wise alignment uncertainty.
The tutorial needs some refining and additional notes to help users through the process without encountering issues, particularly when calling the third-party executables (outlined below). The example also involves downloading sequences from NCBI, which means that the set of sequences downloaded may change from one day to the next. This caused me a few issues and required some troubleshooting. For example, the outgroup _Helvella aestivalis_ was not included in the returned sequence set when I ran the script, so the raxml calls produced an error. Also, the sequence names in the alignment caused raxml to fail with a rather enigmatic error message (it would be worth noting in the tutorial that the sequence names need to be truncated, must not include special metacharacters, and must all be unique). To avoid these issues I would suggest either including a pre-downloaded sequence data file with the package or using an existing sequence dataset such as the woodmouse alignment (ape package).
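
As a rough illustration of the naming constraint described above (truncated, free of special metacharacters, unique), a minimal base R sketch might look like the following. The helper name `sanitize_seq_names`, the example names, and the length limit are all assumptions made for illustration; none of them comes from rGUIDANCE or RAxML, and the actual limit should be checked against the RAxML documentation.

```r
# Hypothetical helper (not part of rGUIDANCE, ips, or ape): make sequence
# names short, free of special characters, and unique.
sanitize_seq_names <- function(x, max_len = 50L) {
  # Replace every run of characters other than letters, digits, or "_".
  clean <- gsub("[^A-Za-z0-9_]+", "_", x)
  # Drop leading and trailing underscores left behind by the replacement.
  clean <- gsub("^_+|_+$", "", clean)
  # Truncate, reserving four characters for a suffix such as "_123".
  clean <- substr(clean, 1L, max_len - 4L)
  # Append "_1", "_2", ... to any name that occurs more than once.
  make.unique(clean, sep = "_")
}

# Made-up names resembling GenBank-style labels (assumed, for illustration).
raw_names <- c(
  "Helvella aestivalis (voucher: H-123)",
  "Helvella lacunosa|ITS",
  "Helvella lacunosa|ITS"
)
clean_names <- sanitize_seq_names(raw_names, max_len = 30L)
print(clean_names)
```

The cleaned vector could then be assigned back to the alignment's row names before the tree search. The reserved suffix width only covers up to 999 duplicates of one name, which is an arbitrary choice in this sketch.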

There is also a vital line missing from the example script: run(wd = wd) should be added before all_clusters <- read_phylota(wd).

Some other minor issues are included below:

The latest version of the ips package on CRAN is v0.0.7, but the rGUIDANCE package depends on v0.0.11. Please either add devtools::install_github("heibl/ips") to the example script or include the ips github repo in the ‘Remotes’ field of the DESCRIPTION file.

The example script includes local directories that won’t apply for other users – e.g. wd <- 'Documents/PhD/proj/low_priority/phylotaR/helvella'. I would advise just using '[path-to-directory]' or something similar. Alternatively rOpenSci may have a standard for setting working directory in example scripts. @karthik may be able to advise on this?

Consider changing the number of threads in the example to something more manageable for non-server users. I would advise changing the ncore argument to 1 for the purposes of the tutorial.

str_extract needs the stringr package installed; this should be added to the example script. In this and other calls it would be safer to use the double-colon operator to specify the package, since the function called may depend on the order in which the various packages were loaded. I would also specify the full method where applicable, for example use w <- phylogram::as.dendrogram.phylo(w) instead of w <- as.dendrogram(w).
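
To illustrate the masking concern behind the double-colon suggestion, here is a minimal sketch using only base R. It does not use stringr or phylogram; the masking function is defined by hand purely for demonstration and stands in for a same-named function from a later-loaded package.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# A later definition with the same name masks stats::sd, much as a
# package attached later can mask a function from an earlier one.
sd <- function(x) "masked"

# The bare name resolves to whichever definition is found first.
masked_result <- sd(x)

# The double-colon form always calls the version exported by stats.
explicit_result <- stats::sd(x)

print(masked_result)
print(explicit_result)

# Remove the demonstration function so the bare name works again.
rm(sd)
```

The same reasoning applies to calls such as stringr::str_extract in the tutorial: the qualified form gives the same result regardless of the order in which packages were attached.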

On a final note, I found the manuscript provided a really helpful and intuitive guide to using R for phylogenetic analysis. In case of interest (and at the risk of a plug) the aphid package also does multiple sequence alignment within R and doesn't require any third party programs.

I hope this helps - thanks for the opportunity to review. I'm looking forward to including this package in my own workflows.

All the best,
Shaun

shaunpwilkinson on 28 Apr 2019

Thanks @shaunpwilkinson!
@FranzKrah Can you please make these fixes and respond here? I opened a couple of issues on your repo.

karthik on 28 Apr 2019

Hi Karthik,

I copy-pasted Shaun's comments below and responded to each. We addressed all the comments, which improved the package! Thanks!

Cheers,
Franz

This is a very helpful package that fills a critical gap in the R workflow for phylogenetic analysis. The accompanying paper is clear and well written, and an example is included demonstrating the use of the package to improve the resolution of a phylogeny by factoring in column-wise alignment uncertainty.
The tutorial needs some refining and additional notes to help users through the process without encountering issues, particularly when calling the third-party executables (outlined below). The example also involves downloading sequences from NCBI, which means that the set of sequences downloaded may change from one day to the next. This caused me a few issues and required some troubleshooting. For example, the outgroup Helvella aestivalis was not included in the returned sequence set when I ran the script, so the raxml calls produced an error. Also, the sequence names in the alignment caused raxml to fail with a rather enigmatic error message (it would be worth noting in the tutorial that the sequence names need to be truncated, must not include special metacharacters, and must all be unique). To avoid these issues I would suggest either including a pre-downloaded sequence data file with the package or using an existing sequence dataset such as the woodmouse alignment (ape package).
>> Thanks. We added a cautionary note at the beginning of the tutorial regarding the third-party executables. There we also provide URLs to the program websites where they can be downloaded as well.
Further, it is correct that the code might run into trouble given that sequences are constantly updated on GenBank. Therefore, we revised the tutorial and provided a pre-downloaded sequence data file, so that users are able to run guidance without errors.

There is also a vital line missing from the example script: run(wd = wd) should be added before all_clusters <- read_phylota(wd).

>> Thanks. We added this line

Some other minor issues are included below:

The full list of dependencies should be included in the tutorial, along with links to the various repositories, optionally some basic installation instructions for Windows, Mac and Linux, and additional annotation/notes should be included to make it clear where paths to executables need to be modified. Some tips on finding the executables within the file system may also be helpful for some users.
>> Thanks. We added URLs to the external programs and briefly introduced the most important R packages used. We also added information on what executables are and how to specify them, and included URLs to tutorials on how to find executables on Windows or Mac.

The latest version of the ips package on CRAN is v0.0.7, but the rGUIDANCE package depends on v0.0.11. Please either add devtools::install_github("heibl/ips") to the example script or include the ips github repo in the ‘Remotes’ field of the DESCRIPTION file.
>> We added Remotes: github::heibl/ips to the DESCRIPTION file

The example script includes local directories that won’t apply for other users – e.g. wd <- 'Documents/PhD/proj/low_priority/phylotaR/helvella'. I would advise just using '[path-to-directory]' or something similar. Alternatively rOpenSci may have a standard for setting working directory in example scripts. @karthik may be able to advise on this?
>> Sorry, we changed this to “[path-to-directory]”

Consider changing the number of threads in the example to something more manageable for non-server users. I would advise changing the ncore argument to 1 for the purposes of the tutorial.
>> We changed this to 1 thread.

str_extract needs the stringr package installed; this should be added to the example script. In this and other calls it would be safer to use the double-colon operator to specify the package, since the function called may depend on the order in which the various packages were loaded. I would also specify the full method where applicable, for example use w <- phylogram::as.dendrogram.phylo(w) instead of w <- as.dendrogram(w).
>> Thanks. We added the package name using the double-colon operator where applicable.

On a final note, I found the manuscript provided a really helpful and intuitive guide to using R for phylogenetic analysis. In case of interest (and at the risk of a plug) the aphid package also does multiple sequence alignment within R and doesn't require any third party programs.
>> We looked into this and would be happy to include it in future versions. As far as we could see, it is currently not possible to use aphid::align because it does not allow a guide tree to be specified, which the GUIDANCE algorithm requires. Perhaps you could change this so that we can include it in a later version.
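
For readers following the ips version mismatch discussed in this response (CRAN v0.0.7 versus the required v0.0.11), a small check along the following lines could be run before the tutorial. This is only a sketch: the helper name `needs_newer` is hypothetical, and the block prints the install command suggested in the review instead of running it.

```r
# Hypothetical helper: TRUE if a package is missing or older than required.
needs_newer <- function(pkg, min_version) {
  # A package that is not installed always needs installing.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(TRUE)
  }
  # Compare the installed version with the required minimum.
  utils::packageVersion(pkg) < package_version(min_version)
}

# Report, without installing anything, whether ips needs the GitHub version.
if (needs_newer("ips", "0.0.11")) {
  message("ips >= 0.0.11 not found; consider devtools::install_github(\"heibl/ips\")")
}
```

With the Remotes field added to DESCRIPTION, installing rGUIDANCE through devtools or remotes should resolve this dependency without a manual check.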

FranzKrah on 29 Apr 2019

@whedon check references

karthik on 29 Apr 2019

Attempting to check references...

whedon on 29 Apr 2019

```
Reference check summary:

OK DOIs

10.1111/j.1461-0248.2009.01314.x is OK
10.1186/1471-2105-7-471 is OK
10.1371/journal.pone.0018093 is OK
10.1093/sysbio/syv033 is OK
10.1093/bioinformatics/btu033 is OK
10.1093/molbev/msq066 is OK
10.1093/nar/gkv318 is OK
10.1142/9789812776136_0003 is OK
10.1186/s12862-018-1229-7 is OK
10.1016/j.patrec.2005.10.010 is OK
10.1093/bioinformatics/bti623 is OK

MISSING DOIs

None

INVALID DOIs

None
```

whedon on 29 Apr 2019

@whedon generate pdf

karthik on 29 Apr 2019

Attempting PDF compilation. Reticulating splines etc...

whedon on 29 Apr 2019

:point_right: Check article proof :page_facing_up: :point_left:

whedon on 29 Apr 2019

@FranzKrah The captions are there now (after the conclusions).

karthik on 29 Apr 2019

@whedon generate pdf

FranzKrah on 29 Apr 2019

Attempting PDF compilation. Reticulating splines etc...

whedon on 29 Apr 2019

:point_right: Check article proof :page_facing_up: :point_left:

whedon on 29 Apr 2019

Everything looks good to me. @shaunpwilkinson Did you skip checking off the installation instructions list item for a good reason? If not, please check that off and I can proceed to acceptance. 🙏

karthik on 29 Apr 2019

@karthik done

shaunpwilkinson on 29 Apr 2019

👍1

@whedon accept

karthik on 29 Apr 2019

No archive DOI set. Exiting...

whedon on 29 Apr 2019

Oops. @FranzKrah Can you please archive the package on Zenodo and post a DOI here?

karthik on 29 Apr 2019

Sorry, wasn't aware of that...

10.5281/zenodo.2653906

FranzKrah on 29 Apr 2019

@whedon set 10.5281/zenodo.2653906 as archive

karthik on 30 Apr 2019

OK. 10.5281/zenodo.2653906 is the archive.

whedon on 30 Apr 2019

@FranzKrah Almost there. Can you edit the metadata on Zenodo to reflect you as the author (rather than just your GitHub handle)? And while you're there, also check other fields (esp title) to make sure they are also correct.

karthik on 30 Apr 2019

@karthik I am not sure how to do this. I tried changing the .zenodo.json file but got an error...

FranzKrah on 30 Apr 2019

@karthik Ok. My name is now changed in the metadata in Zenodo. I hope this is what you meant...

FranzKrah on 30 Apr 2019

@FranzKrah One more change (i.e. the title). It currently has your GitHub handle in the software title too. Can you fix that as well? Current citation:

Franz Sebastian Krah, & Christoph Heibl. (2019, April 30). FranzKrah/rGUIDANCE v1.0.8 (Version v1.0.8). Zenodo. http://doi.org/10.5281/zenodo.2654272

Otherwise the citation for the software archive will also have your Github handle rather than just the name of your software. See a correct example

karthik on 30 Apr 2019

@karthik I changed the title but I cannot get rid of "FranzKrah/rGUIDANCE: ".
I already tried to include a .zenodo.json file in the root of my GitHub repo but this creates errors in Zenodo...

FranzKrah on 30 Apr 2019

@FranzKrah Can you just create a new Zenodo release from scratch? And make sure all the metadata look correct. Then I can update it here and proceed with accepting? Otherwise you'll have an incorrect citation.

karthik on 30 Apr 2019

@karthik No, I do not see how this would be possible. I think it is ok as is...

FranzKrah on 30 Apr 2019

@whedon set 10.5281/zenodo.2654302 as archive

karthik on 30 Apr 2019

OK. 10.5281/zenodo.2654302 is the archive.

whedon on 30 Apr 2019

@FranzKrah When you go to the above linked DOI, click edit on the top right, then edit the metadata. Remove your GitHub user/repo from the title and save. Then it will generate a new DOI. You can post that here.

karthik on 30 Apr 2019

@karthik I changed the title and saved but the DOI is the same:
10.5281/zenodo.2654302

FranzKrah on 30 Apr 2019

👍1

@whedon generate pdf

karthik on 30 Apr 2019

Attempting PDF compilation. Reticulating splines etc...

whedon on 30 Apr 2019

:point_right: Check article proof :page_facing_up: :point_left:

whedon on 30 Apr 2019

@whedon accept

karthik on 30 Apr 2019

Attempting dry run of processing paper acceptance...

whedon on 30 Apr 2019

```
Reference check summary:

OK DOIs

10.1111/j.1461-0248.2009.01314.x is OK
10.1186/1471-2105-7-471 is OK
10.1371/journal.pone.0018093 is OK
10.1093/sysbio/syv033 is OK
10.1093/bioinformatics/btu033 is OK
10.1093/molbev/msq066 is OK
10.1093/nar/gkv318 is OK
10.1142/9789812776136_0003 is OK
10.1186/s12862-018-1229-7 is OK
10.1016/j.patrec.2005.10.010 is OK
10.1093/bioinformatics/bti623 is OK

MISSING DOIs

None

INVALID DOIs

None
```

whedon on 30 Apr 2019

PDF failed to compile for issue #1350 with the following error:

% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed

0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 13 0 13 0 0 143 0 --:--:-- --:--:-- --:--:-- 142
sh: 0: getcwd() failed: No such file or directory
sh: 0: getcwd() failed: No such file or directory
pandoc: 10.21105.joss.01350.pdf: openBinaryFile: does not exist (No such file or directory)
Looks like we failed to compile the PDF

whedon on 30 Apr 2019

@arfon Can you take a look at the error and proceed with accepting when resolved?

karthik on 30 Apr 2019

@whedon accept

arfon on 30 Apr 2019

Attempting dry run of processing paper acceptance...

whedon on 30 Apr 2019

Check final proof :point_right: https://github.com/openjournals/joss-papers/pull/648

If the paper PDF and Crossref deposit XML look good in https://github.com/openjournals/joss-papers/pull/648, then you can now move forward with accepting the submission by compiling again with the flag deposit=true e.g.
@whedon accept deposit=true

whedon on 30 Apr 2019

```
Reference check summary:

OK DOIs

10.1111/j.1461-0248.2009.01314.x is OK
10.1186/1471-2105-7-471 is OK
10.1371/journal.pone.0018093 is OK
10.1093/sysbio/syv033 is OK
10.1093/bioinformatics/btu033 is OK
10.1093/molbev/msq066 is OK
10.1093/nar/gkv318 is OK
10.1142/9789812776136_0003 is OK
10.1186/s12862-018-1229-7 is OK
10.1016/j.patrec.2005.10.010 is OK
10.1093/bioinformatics/bti623 is OK

MISSING DOIs

None

INVALID DOIs

None
```

whedon on 30 Apr 2019

@whedon accept deposit=true

arfon on 30 Apr 2019

Doing it live! Attempting automated processing of paper acceptance...

whedon on 30 Apr 2019

🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨

Here's what you must now do:

Check final PDF and Crossref metadata that was deposited :point_right: https://github.com/openjournals/joss-papers/pull/649
Wait a couple of minutes to verify that the paper DOI resolves https://doi.org/10.21105/joss.01350
If everything looks good, then close this review issue.
Party like you just published a paper! 🎉🌈🦄💃👻🤘

Any issues? notify your editorial technical team...

whedon on 30 Apr 2019

:tada::tada::tada: Congratulations on your paper acceptance! :tada::tada::tada:

If you would like to include a link to your paper from your README use the following code snippets:

Markdown:
[![DOI](http://joss.theoj.org/papers/10.21105/joss.01350/status.svg)](https://doi.org/10.21105/joss.01350)

HTML:
<a style="border-width:0" href="https://doi.org/10.21105/joss.01350">
  <img src="http://joss.theoj.org/papers/10.21105/joss.01350/status.svg" alt="DOI badge" >
</a>

reStructuredText:
.. image:: http://joss.theoj.org/papers/10.21105/joss.01350/status.svg
   :target: https://doi.org/10.21105/joss.01350

This is how it will look in your documentation:

We need your help!

Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us please consider doing either one (or both) of the following:

Volunteering to review for us sometime in the future. You can add your name to the reviewer list here: http://joss.theoj.org/reviewer-signup.html
Making a small donation to support our running costs here: https://numfocus.salsalabs.org/donate-to-joss

whedon on 30 Apr 2019
