Joss-reviews: [REVIEW]: SeroTools: a Python package for Salmonella serotype data analysis

Created on 7 Aug 2020 · 55 Comments · Source: openjournals/joss-reviews

Submitting author: @jdbaugher (Joseph D Baugher)
Repository: https://github.com/CFSAN-Biostatistics/SeroTools
Version: 0.2.1
Editor: @majensen
Reviewer: @amoeba, @Maghnuso
Archive: 10.5281/zenodo.4015335

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/5760d608a6b50f1bbab641e2089fed77"><img src="https://joss.theoj.org/papers/5760d608a6b50f1bbab641e2089fed77/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/5760d608a6b50f1bbab641e2089fed77/status.svg)](https://joss.theoj.org/papers/5760d608a6b50f1bbab641e2089fed77)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@amoeba & @Maghnuso, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @majensen know.

✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨

Review checklist for @amoeba

Conflict of interest

  • [x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • [x] Repository: Is the source code for this software available at the repository url?
  • [x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • [x] Contribution and authorship: Has the submitting author (@jdbaugher) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • [x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines

Functionality

  • [x] Installation: Does installation proceed as outlined in the documentation?
  • [x] Functionality: Have the functional claims of the software been confirmed?
  • [x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • [x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • [x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • [x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • [x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • [x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • [x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • [x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

Review checklist for @Maghnuso

Conflict of interest

  • [x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • [x] Repository: Is the source code for this software available at the repository url?
  • [x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • [x] Contribution and authorship: Has the submitting author (@jdbaugher) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • [x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines

Functionality

  • [x] Installation: Does installation proceed as outlined in the documentation?
  • [x] Functionality: Have the functional claims of the software been confirmed?
  • [x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • [x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • [x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • [x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • [x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • [x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • [x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • [x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

All 55 comments

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @amoeba, @Maghnuso it looks like you're currently assigned to review this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' at https://github.com/openjournals/joss-reviews
  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf
Reference check summary:

OK DOIs

- 10.1038/s41598-020-61254-1 is OK
- 10.1128/genomeA.00215-15 is OK
- 10.3389/fmicb.2020.00549 is OK
- 10.3389/fmicb.2019.02554 is OK
- 10.1016/j.fm.2020.103452 is OK
- 10.1016/j.resmic.2009.10.002 is OK
- 10.3389/fmicb.2018.02993 is OK
- 10.1016/j.resmic.2014.07.004 is OK
- 10.1128/JCM.00008-15 is OK
- 10.1017/s0022172400034677 is OK
- 10.3389/fmicb.2019.01591 is OK
- 10.1007/978-1-4939-9000-9_17 is OK
- 10.1128/AEM.02265-19 is OK
- 10.1128/JCM.00190-19 is OK
- 10.1128/AEM.00165-19 is OK
- 10.3389/fmicb.2017.01044 is OK
- 10.1371/journal.pone.0147101 is OK
- 10.1128/JCM.00323-15 is OK
- 10.1128/AEM.01746-19 is OK

MISSING DOIs

- https://doi.org/10.1016/b978-0-12-387730-7.00020-6 may be missing for title: Methods in Microbiology.

INVALID DOIs

- None

@whedon commands

@Maghnuso do you read me, over?

@Maghnuso one more time

Loud and clear

Hey @majensen & @jdbaugher: I sat down to start my review and ran into two things I think it'd be good to address before I get farther in:

1) The submission is _not_ licensed using an OSI-approved license but is licensed with a fairly permissive but custom license. @jdbaugher Is my assessment right and is there any chance you have the ability to change the licensing here given the origin of the work?

2) Re: "Substantial scholarly effort", my take is that this submission is right at the breakpoint per the submission guidelines. I see the previous discussion over in https://github.com/openjournals/joss-reviews/issues/2503 which seems to imply we're all good but I wanted to check. @majensen can you give me a 👍 if you think the submission satisfies the "Substantial scholarly effort" criterion specifically? The project, as assessed via the git commit history, appears to be under or near three months of effort and the core of the software (serotools.py) is only 629 lines according to cloc.

Thanks @amoeba. The license is a good catch- @jdbaugher, this is a hard requirement for JOSS; can you review Open Source Initiative-approved licenses to see if you would be willing to apply one of these? If there are special circumstances to discuss, please reach out to me directly (maj -dot- fortinbras -at- gmail -dot- com).

As for scholarly effort, I am satisfied with this work on that score, because lines of code belie the slog through data and the grokking of standard nomenclature to create a means for enabling others to avoid having to do same. That IMO is scholarly work that does not show up in a code base. On the user side, this work has the potential to enable scholarly efforts that may not otherwise be attempted, for example, by labs that don't have the expertise to develop such a tool.
So @amoeba 👍 !

@amoeba @jdbaugher I've finally reviewed the license.

First, I state that I am not an attorney, and the following is merely an interpretation based on the question of appropriateness for JOSS publication, and does not represent an opinion on any legal matter.

The SeroTools license as presented is not unusual in government-created open software, in my professional experience -- and also has an accepted precedent in JOSS here.

However, there is a stronger argument for this, in that the SeroTools license can be construed as a rather less detailed version of an OSI-approved license, the NASA Open Source Agreement v1.3 (NASA-1.3). This is written in a general format that allows any Government Agency to use it. The SeroTools license, in my view, can be interpreted as NASA-1.3, with FDA as the agency, @jdbaugher as the contact, section 3B using the 3rd alternative paragraph (no copyright), section 3F using the second alternative paragraph (voluntary user registration), and sections 4A and B clearly covering the "no warranty" and "no government endorsement" points in the SeroTools license.

In light of this, I am prepared to go forward with the assumption that the license requirement has been met, and will leave it to an associate editor-in-chief to complain.

Thanks for diving into this, @majensen. Your analysis seems reasonable to me so I'll go ahead and check this off. At this point my review is nearly an Accept and just waits on @jdbaugher's last changes Re: https://github.com/CFSAN-Biostatistics/SeroTools/issues/1. Will keep you updated.

Thanks @amoeba and @majensen for identifying and resolving the potential issues of substantial scholarly effort and whether or not the license requirement has been met. I concur on both as they currently stand. My apologies too for joining the process a little late, planning to evaluate SeroTools today and tomorrow. Best, Maghnus

Hey @majensen, my review is now an Accept as all items on my checklist are completed.

@amoeba Thanks as usual for your careful review!

Hi @jdbaugher and @majensen, I have reviewed the SeroTools package and manuscript. It's very clearly laid out and as a microbiologist with very little experience using Python or similar packages, I largely found it easy to install and execute. I commend the effort you have made to make this a useful and accessible tool for both computational and non-computational biologists. The paper is well written and SeroTools is clearly a valuable addition for those who wish to analyze serotyping data using the WKL scheme that is also amenable to updates. During my use of the software, I did come across a few small issues that might also be encountered by other biologists like me with limited coding/command line skills:

  1. For the 'cluster' command, it was not clear to me that I could not input the commands directly, but rather had to create a .txt file of the tab-delimited commands. When I did this, there was also an error that the input commands were not separated by one tab, so I had to correct that, too (I use a Mac, in case that is part of the issue). Some clear instruction here would be helpful to get novice users past these small barriers.
  2. The following links did not work for me (at Docs » Repository):
    An Excel spreadsheet (White-Kauffman-LeMinor-Scheme.xlsx)
    A tab-delimited text file (White-Kauffman-LeMinor_scheme.tsv)

SeroTools is very useful software and, after a little practice, was easy to use for a biologist with little to no experience with command-line prompts. The minimally congruent designation is an important consideration and addition to the other, more intuitive designations that are possible.

@whedon remind @majensen in 4 days

Reminder set for @majensen in 4 days

Thanks @Maghnuso for your time and expertise! I have corrected the broken documentation hyperlinks at https://serotools.readthedocs.io/en/latest/repository.html. Thanks for catching those! I am not sure how to further clarify the usage instructions for the cluster command documented at https://serotools.readthedocs.io/en/latest/usage.html#cluster.

@jdbaugher I think all that is needed here (based on discussions with @Maghnuso) is connecting the dots a little more. That is, to explicitly say that the data in the code box under the "Input" header is actually _in_ a text file, and then that the text file's name is what goes in the command

serotools cluster -i <input_file>

This seems _almost_ obvious - but we found that for microbiologists just getting started in computational analysis, the current doc wasn't quite enough.

I think you could almost do

Input File: test.txt

   cluster1    Dunkwa
   cluster1    Dunkwa
   cluster1    Utah
   cluster2    Hull

$ serotools cluster -i test.txt
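
For readers who would rather generate the input file with a script, here is a minimal sketch. It assumes only what this thread states: the input is a plain text file with one cluster ID and one serovar name per line, separated by exactly one tab. The helper names `write_cluster_input` and `find_malformed_lines` are hypothetical and are not part of SeroTools.

```python
import csv

# Example rows from the comment above: (cluster ID, serovar name).
ROWS = [
    ("cluster1", "Dunkwa"),
    ("cluster1", "Dunkwa"),
    ("cluster1", "Utah"),
    ("cluster2", "Hull"),
]


def write_cluster_input(path, rows):
    """Write one 'cluster<TAB>serovar' pair per line (hypothetical helper)."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)


def find_malformed_lines(path):
    """Return 1-based numbers of non-empty lines without exactly one tab."""
    bad = []
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            text = line.rstrip("\r\n")
            if text and text.count("\t") != 1:
                bad.append(number)
    return bad


if __name__ == "__main__":
    write_cluster_input("test.txt", ROWS)
    # An empty list means every line has exactly one tab separator.
    print(find_malformed_lines("test.txt"))
```

The file name written here is then what follows `-i` in `serotools cluster -i test.txt`. The check is useful because spaces typed in place of a tab look identical in many editors.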

@majensen @Maghnuso Ok understood. I have updated the Usage documentation with clarification as discussed - https://serotools.readthedocs.io/en/latest/usage.html#cluster

Thanks @majensen for clarifying my request and @jdbaugher for the changes in usage documentation, they will help the likes of me get full value from the software! All of my boxes are checked and I'm happy to recommend "Accept". -Maghnus

@whedon check references

@whedon generate pdf

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1038/s41598-020-61254-1 is OK
- 10.1128/genomeA.00215-15 is OK
- 10.3389/fmicb.2020.00549 is OK
- 10.3389/fmicb.2019.02554 is OK
- 10.1016/j.fm.2020.103452 is OK
- 10.1016/j.resmic.2009.10.002 is OK
- 10.1016/S0580-9517(08)70355-6 is OK
- 10.3389/fmicb.2018.02993 is OK
- 10.1016/j.resmic.2014.07.004 is OK
- 10.1128/JCM.00008-15 is OK
- 10.1017/s0022172400034677 is OK
- 10.3389/fmicb.2019.01591 is OK
- 10.1007/978-1-4939-9000-9_17 is OK
- 10.1128/AEM.02265-19 is OK
- 10.1128/JCM.00190-19 is OK
- 10.1128/AEM.00165-19 is OK
- 10.3389/fmicb.2017.01044 is OK
- 10.1371/journal.pone.0147101 is OK
- 10.1128/JCM.00323-15 is OK
- 10.1128/AEM.01746-19 is OK

MISSING DOIs

- None

INVALID DOIs

- None

@jdbaugher we're closing in. I've looked over the paper itself. The text looks fine to me.

Can you consider the following improvements to the bibliography:

  • Please format the text in the references so that all Latin binomials are appropriately capitalized/italicized (e.g. _Escherichia coli_, _Salmonella_), and serovar names (e.g. Lubbock) are capitalized.
  • Please format same so that proper names (e.g. White-Kauffmann-Le Minor) and acronyms (e.g. UGA) are capitalized.
  • Henriksen ref is pretty long in the tooth. That's ok, but I would maybe include refs to (e.g.) the Pneumo, Salmonella, E. coli chapters in Man Clin Microbiol. The point is, conveying that serotypes are not obsolete, although they may be assessed in new, non-serological ways.

thx

@majensen I have updated the bibliography to correct (hopefully) all of the issues mentioned (and caught some other conversion issues) and modified the first paragraph as suggested. Definitely looking better! Thanks!

@whedon generate pdf

I'm sorry human, I don't understand that. You can see what commands I support by typing:

@whedon commands

@jdbaugher oops, I'm jumping the gun here. Can I first ask you to create an archive of your repo using Zenodo, FigShare or similar? Then please report the DOI of the archive back here. Let me know if you need any help with this.

@majensen Zenodo archive created.
DOI: 10.5281/zenodo.4015335
Version: 0.2.1

@whedon set version as 0.2.1

I'm sorry human, I don't understand that. You can see what commands I support by typing:

@whedon commands

@whedon set 0.2.1 as version

OK. 0.2.1 is the version.

@whedon set 10.5281/zenodo.4015335 as archive

OK. 10.5281/zenodo.4015335 is the archive.

Thanks @jdbaugher - could you do one thing, and name the archive with the name of the paper "SeroTools: a Python package for Salmonella serotype data analysis"? This is strongly desired by JOSS.

@majensen I edited the 'title' of the archive as requested. Is it set up correctly now?

Perfect, let's pull the handle...

@whedon accept

Attempting dry run of processing paper acceptance...
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1038/s41598-020-61254-1 is OK
- 10.1128/genomeA.00215-15 is OK
- 10.3389/fmicb.2020.00549 is OK
- 10.3389/fmicb.2019.02554 is OK
- 10.1016/j.fm.2020.103452 is OK
- 10.1016/j.resmic.2009.10.002 is OK
- 10.1016/S0580-9517(08)70355-6 is OK
- 10.3389/fmicb.2018.02993 is OK
- 10.1016/j.resmic.2014.07.004 is OK
- 10.1128/JCM.00008-15 is OK
- 10.1128/9781555817381.ch36 is OK
- 10.1017/s0022172400034677 is OK
- 10.1128/9781555817381.ch22 is OK
- 10.1128/9781555817381.ch37 is OK
- 10.3389/fmicb.2019.01591 is OK
- 10.1007/978-1-4939-9000-9_17 is OK
- 10.1128/AEM.02265-19 is OK
- 10.1128/JCM.00190-19 is OK
- 10.1128/AEM.00165-19 is OK
- 10.3389/fmicb.2017.01044 is OK
- 10.1371/journal.pone.0147101 is OK
- 10.1128/JCM.00323-15 is OK
- 10.1128/AEM.01746-19 is OK

MISSING DOIs

- None

INVALID DOIs

- None

:wave: @openjournals/joss-eics, this paper is ready to be accepted and published.

Check final proof :point_right: https://github.com/openjournals/joss-papers/pull/1711

If the paper PDF and Crossref deposit XML look good in https://github.com/openjournals/joss-papers/pull/1711, then you can now move forward with accepting the submission by compiling again with the flag deposit=true e.g.
@whedon accept deposit=true

@whedon accept deposit=true

Doing it live! Attempting automated processing of paper acceptance...

๐Ÿฆ๐Ÿฆ๐Ÿฆ ๐Ÿ‘‰ Tweet for this paper ๐Ÿ‘ˆ ๐Ÿฆ๐Ÿฆ๐Ÿฆ

๐Ÿšจ๐Ÿšจ๐Ÿšจ THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! ๐Ÿšจ๐Ÿšจ๐Ÿšจ

Here's what you must now do:

  1. Check final PDF and Crossref metadata that was deposited :point_right: https://github.com/openjournals/joss-papers/pull/1712
  2. Wait a couple of minutes to verify that the paper DOI resolves https://doi.org/10.21105/joss.02556
  3. If everything looks good, then close this review issue.
  4. Party like you just published a paper! 🎉🌈🦄💃👻🤘

    Any issues? Notify your editorial technical team...

@amoeba, @Maghnuso - many thanks for your reviews here and to @majensen for editing this submission ✨

@jdbaugher - your paper is now accepted into JOSS :zap::rocket::boom:

:tada::tada::tada: Congratulations on your paper acceptance! :tada::tada::tada:

If you would like to include a link to your paper from your README use the following code snippets:

Markdown:
[![DOI](https://joss.theoj.org/papers/10.21105/joss.02556/status.svg)](https://doi.org/10.21105/joss.02556)

HTML:
<a style="border-width:0" href="https://doi.org/10.21105/joss.02556">
  <img src="https://joss.theoj.org/papers/10.21105/joss.02556/status.svg" alt="DOI badge" >
</a>

reStructuredText:
.. image:: https://joss.theoj.org/papers/10.21105/joss.02556/status.svg
   :target: https://doi.org/10.21105/joss.02556

This is how it will look in your documentation:

DOI

We need your help!

Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us please consider doing either one (or both) of the following:

@jdbaugher - congrats! This is nice software that should be really useful in the biz. @Maghnuso - you're now officially a bioinformatician, congrats. @amoeba - appreciate your help once again from the Land of the Midnight Sun.

Thanks everyone! @arfon @majensen @amoeba @Maghnuso

Congratulations @jdbaugher! And many thanks @majensen.
