Joss-reviews: [REVIEW]: GFAKluge: A C++ library and command line utilities for the Graphical Fragment Assembly formats

Created on 16 Nov 2018 · 75 Comments · Source: openjournals/joss-reviews

Submitting author: @edawson (Eric Dawson)
Repository: https://github.com/edawson/gfakluge
Version: 1.1.2
Editor: @mgymrek
Reviewer: @sjackman
Archive: 10.5281/zenodo.2546721

Status


Status badge code:

HTML: <a href="http://joss.theoj.org/papers/d731f6dfc6b77013caaccfd8333c684a"><img src="http://joss.theoj.org/papers/d731f6dfc6b77013caaccfd8333c684a/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/d731f6dfc6b77013caaccfd8333c684a/status.svg)](http://joss.theoj.org/papers/d731f6dfc6b77013caaccfd8333c684a)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@sjackman, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.theoj.org/about#reviewer_guidelines. Any questions/concerns please let @mgymrek know.

✨ Please try and complete your review in the next two weeks ✨

Review checklist for @sjackman

Conflict of interest

Code of Conduct

General checks

  • [x] Repository: Is the source code for this software available at the repository url?
  • [x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • [x] Version: 1.1.2
  • [x] Authorship: Has the submitting author (@edawson) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • [x] Installation: Does installation proceed as outlined in the documentation?
  • [x] Functionality: Have the functional claims of the software been confirmed?
  • [x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • [x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • [x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • [x] Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • [x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • [x] Authors: Does the paper.md file include a list of authors with their affiliations?
  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

All 75 comments

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @sjackman it looks like you're currently assigned as the reviewer for this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository, which means that with GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' at https://github.com/openjournals/joss-reviews

  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands
Attempting PDF compilation. Reticulating splines etc...

Installation: Does installation proceed as outlined in the documentation?

You can build libgfakluge and the command line gfak utilities by typing make in the repo.

On macOS with Apple clang, make fails with error: 'omp.h' file not found, because gfakluge requires OpenMP, and Apple clang does not provide OpenMP. Please add OpenMP to the list of dependencies. Please include instructions in README.md for building from source on macOS.

brew install gcc@8
make CXX=g++-8

GFAKluge [@GFAKluge] is a set of command line utilities

@online{GFAKluge,
  author = {Eric T. Dawson and Richard Durbin},
  title = {GFAKluge},
  year = 2018,
  url = {https://github.com/edawson/gfakluge}
}

I don't believe it's typical to cite the software repository from the paper.

@mgymrek Opinion?

Please mention the homepage and license in the Summary. For example

Homepage: https://github.com/edawson/gfakluge
License: MIT

GFAKluge facilitates interprogram exchange by providing a high-level C++ API for developers

To use GFAKluge in your program, you'll need to add a few lines to your code. First, add the necessary include line to your C++ code: #include "gfakluge.hpp"
Next, make sure that the library is on the proper system paths and compile line:

GFAKluge would be more easily used as a library if

  1. it were a header-only library (so it does not require compilation and -L… -l…)
  2. make install and the Brew package installed that header

To make gfakluge a header-only library, merge gfakluge.hpp and gfakluge.cpp into a single file and remove gfakluge.cpp. Make every method an inline method by either defining it within class GFAKluge or decorating the definition with inline if it's defined outside the class.

Header-only libraries are much easier to use in practice than libraries that require linking. They avoid the challenges of distributing shared libraries, whose details vary between operating systems.
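As a rough illustration of the header-only pattern described above, here is a minimal sketch. The class SegmentCounter and its members are hypothetical names invented for this example; they are not part of the GFAKluge API. The sketch shows the two ways a method ends up inline: defined inside the class body, or defined outside it with the inline keyword.

```cpp
// segment_counter.hpp: hypothetical header-only class (illustration only,
// not GFAKluge code). Everything lives in the header; there is no .cpp file.
#ifndef SEGMENT_COUNTER_HPP
#define SEGMENT_COUNTER_HPP

#include <cstddef>
#include <string>

class SegmentCounter {
public:
    // Defined inside the class body, so it is implicitly inline.
    void add(const std::string& sequence) {
        ++segments_;
        bases_ += sequence.size();
    }

    std::size_t segments() const { return segments_; }

    // Declared here, defined after the class in this same header.
    std::size_t bases() const;

private:
    std::size_t segments_ = 0;
    std::size_t bases_ = 0;
};

// Defined outside the class body: marked inline so that the header can be
// included from several translation units without duplicate-definition
// errors at link time.
inline std::size_t SegmentCounter::bases() const { return bases_; }

#endif  // SEGMENT_COUNTER_HPP
```

A program using such a header needs only the #include line and an include path at compile time, with no -L or -l options. Out-of-class member definitions take plain inline, while free functions in a header may use either inline or static inline.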

Please add a make install target to Makefile.

PREFIX=/usr/local

install:
    install gfak $(DESTDIR)$(PREFIX)/bin/
    install src/gfakluge.hpp $(DESTDIR)$(PREFIX)/include/

and converting between legacy GFA formats.

Please be more specific. I suggest "and converting between GFA versions 1 and 2."

To our knowledge, GFAKluge is the only publicly available software package that can consume and produce both GFA1 and GFA2

GfaPy is able to consume and produce both GFA1 and GFA2.
https://doi.org/10.1093/bioinformatics/btx398

RGFA handles only GFA1 (I believe). I suggest citing it.
https://doi.org/10.7717/peerj.2681

abyss-todot (now a misnomer, since it handles multiple formats) included with ABySS 2 produces both GFA1 and GFA2 (with the command line options --gfa1 and --gfa2), and consumes both with some format limitations (it does not handle all record types).
https://doi.org/10.1101/gr.214346.116

See https://github.com/GFA-spec/GFA-spec#implementations

the gfak convert tool

I'd suggest typesetting this as…

the `gfak convert` tool

Ditto `vg msga`

We see the command line utilities as being useful to the development community in the short term.

The need for file conversion will likely never go away. Older tools that support only GFA1 may never be updated to support GFA2. Newer tools may support producing only GFA2 and not GFA1. To make these tools interoperate, a conversion tool is needed. That need is likely not temporary.

Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).

Please list in the manuscript the currently available commands.

   convert: Convert between GFA 0.1 <-> 1.0 <-> 2.0
   diff:    Determine whether two GFA files have identical graphs
   extract: Convert the S lines of a GFA file to FASTA format.
   fillseq: Add sequences from a FASTA file to S lines.
   ids:     Coordinate the ID spaces of multiple GFA graphs.
   concat:  Merge GFA graphs (without ID collisions).
   sort:    Print a GFA file in HSLP / HSEFGUO order.
   stats:   Get assembly statistics (e.g. N50) for a GFA file.
   subset:  Extract the subgraph between two IDs in a graph.
   trim:    Remove elements from a GFA graph.

Please give command line examples of three or more commands.
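For readers unfamiliar with the N50 statistic mentioned for the stats command, here is a minimal sketch of how it is commonly computed from a list of sequence lengths. The function name n50 and its signature are assumptions made for this example. It is not GFAKluge's implementation, and gfak stats may compute or report the value differently.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical helper (not part of the GFAKluge API).
// N50 is the length L of the shortest sequence in the smallest set of
// longest sequences that together cover at least half of all bases.
// Returns 0 for an empty input.
inline std::uint64_t n50(std::vector<std::uint64_t> lengths) {
    if (lengths.empty()) {
        return 0;
    }

    // Visit sequences from longest to shortest.
    std::sort(lengths.begin(), lengths.end(), std::greater<std::uint64_t>());

    std::uint64_t total = 0;
    for (std::uint64_t len : lengths) {
        total += len;
    }

    // Accumulate until at least half of the total is covered. Comparing
    // 2 * running with total avoids rounding when total is odd.
    std::uint64_t running = 0;
    for (std::uint64_t len : lengths) {
        running += len;
        if (2 * running >= total) {
            return len;
        }
    }
    return lengths.back();  // Not reached: the loop returns by its last step.
}
```

For lengths 2, 3, 4, 5 and 6 the total is 20; the two longest sequences (6 and 5) cover 11 bases, which is at least half, so the N50 is 5.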

Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?

Is the API documented? If not, please document the API using Doxygen or a similar tool. A one-line summary of the purpose of each public method is sufficient. Please post the rendered HTML API documentation online using GitHub Pages or similar.

Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?

Please include a make check rule in the Makefile that runs the automated tests.

Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

I've ticked this off as complete, since I think it's self-evident how to use GitHub to submit issues and pull requests. If you like, you could add Reporting bugs and Contributing sections to README.md, or add a CONTRIBUTING.md file.

@whedon commands

Here are some things you can ask me to do:

# List Whedon's capabilities
@whedon commands

# List of editor GitHub usernames
@whedon list editors

# List of reviewers together with programming language preferences and domain expertise
@whedon list reviewers

# Compile the paper
@whedon generate pdf

🚧 🚧 🚧 Experimental Whedon features 🚧 🚧 🚧

# Compile the paper from a custom git branch
@whedon generate pdf from branch custom-branch-name

My review is complete.

Thanks for these helpful comments @sjackman.

To summarize for @edawson:

  • Please address the comments regarding installation and documentation.
  • I agree no need to cite the software repository in the paper.
  • There is no need to include extensive documentation of the different commands in the manuscript itself. However, please be sure these are well documented, e.g. in the README or manual.
  • For completeness it would be helpful to add a CONTRIBUTING.md file.

@edawson: I wanted to check in on this and see if you were able to make the requested changes.

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@mgymrek I believe I've just addressed the last of these. I've included my checklist, which I pulled out from @sjackman's comments.

  • [X] Fix install instructions
  • [X] Remove citation to repo in summary
  • [X] Mention the homepage and license in the summary
  • [X] Make GFAKluge a header-only library
  • [X] Add a make install target to makefile
  • [X] Change the wording of "legacy GFA formats"
  • [X] Reference GfaPy, RGFA, and ABySS as other programs that handle GFA (GfaPy and ABySS support both GFA1 and GFA2)
  • [X] Fix typesetting of convert
  • [X] Fix typesetting of vg msga
  • [X] Change wording to reflect that GFA2 and GFA1 interconversion needs are not temporary
  • [X] List the currently available commands in the manuscript.
  • [X] Give command line examples of three or more commands
  • [X] Mention the API documentation in the manuscript
  • [X] Add a make test target in the makefile that runs automated tests
  • [X] Add a contributing section to the Readme

For the authors of the reference "Gene Myers, Jason Chin, & Durbin, R. (2015)", I suggest adding Shaun Jackman (myself), Heng Li, and Giorgio Gonnella based on this text:

Jason Chin, Richard Durbin, and myself (Gene Myers) found ourselves together at a workshop meeting in Dagstuhl Germany and hammered out an initial proposal for an assembly format. We started with GFA 1 and proceeded to build a more comprehensive design around it. After extensive revision and discussion on Github with the GFA group including Shaun Jackman, Heng Li, and Giorgio Gonnella, we arrived at GFA 2.0.

https://github.com/gfa-spec/gfa-spec/#gfa-20-graphical-fragment-assembly-gfa2-format-specification-20

convert: Convert between GFA 0.1 <-> 1.0 <-> 2.0

What is GFA 0.1, and how does it differ from GFA 1.0?

available in the interface.md file file.

There's a typo here: file file.

My review is complete.

Thanks @sjackman, if all the items have been completed could you just update the checklist above?

@sjackman I've addressed your remaining comments. We consider the original GFA proposal by Heng Li to be "GFA 0.1," and it's supported as a legacy format that predates any real specification. Thanks for your review!

@whedon generate pdf

We consider the original GFA proposal by Heng Li to be "GFA 0.1," and it's supported as a legacy format that predates any real specification.

Are there any tangible technical differences between Heng's proposal GFA 0.1 and GFA 1.0?

Off the top of my head, the only things that come to mind are the x and a lines sometimes seen in early minimap2 runs and the W walk line used in place of the P path line syntax.

Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).

Examples of various commands are included in the examples.md file.

Can you please make examples.md a link to that file?

Please set the GitHub website link to the appropriate documentation. Click the button Edit next to the description A C++ library and utilities for manipulating the Graphical Fragment Assembly format. I recommend GitHub Pages if you have no documentation web site currently.

Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?

Examples of the C++ API are included in the interface.md file.

Can you please make interface.md a link to that file?

Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?

$ make check
prove test/gfa_test.t
test/gfa_test.t .. test/gfa_test.t: line 5: ./bash-tap/bash-tap-bootstrap: No such file or directory
test/gfa_test.t: line 7: plan: command not found
test/gfa_test.t: line 10: is: command not found
test/gfa_test.t: line 14: is: command not found
test/gfa_test.t: line 17: is: command not found
test/gfa_test.t: line 20: is: command not found
test/gfa_test.t: line 23: is: command not found
test/gfa_test.t: line 26: is: command not found
Processed 1 graphs...
Done.
test/gfa_test.t: line 29: is: command not found
Merging 2 graphs...
test/gfa_test.t: line 32: is: command not found
test/gfa_test.t .. Dubious, test returned 127 (wstat 32512, 0x7f00)
No subtests run 

Test Summary Report
-------------------
test/gfa_test.t (Wstat: 32512 Tests: 0 Failed: 0)
  Non-zero exit status: 127
  Parse errors: No plan found in TAP output
Files=1, Tests=0,  1 wallclock secs ( 0.04 usr  0.01 sys +  0.06 cusr  0.06 csys =  0.17 CPU)
Result: FAIL
make: *** [Makefile:40: check] Error 1

Should the file bash-tap/bash-tap-bootstrap be included in the source code, or are further instructions needed for the automated tests?

You may want to consider using Autotools (autoconf and automake) if you're interested. Not a requirement though.

make install installs gfakluge.hpp, which depends on tinyfa.hpp and pliib.hpp, but those files are not installed by make install.

I've updated brewsci/bio/gfakluge to 1.1.0. See PR https://github.com/brewsci/homebrew-bio/pull/547
It'd be awesome if you'd like to submit PRs to bump the version for future releases. I'd be happy to show you around Homebrew/Linuxbrew and Brewsci/bio if you like. To open a PR for a new version release, you can do it in the GitHub web editor if you like. You need only update the url and sha256.

  • [X] make examples.md a link to that file
  • [X] set the GitHub website link to the appropriate documentation (now links to README / GitHub Pages)
  • [X] make interface.md a link to that file
  • [X] bash-tap/bash-tap-bootstrap be included in the source code (made bash-tap a git submodule, meaning it's now downloaded with git clone --recursive)
  • [X] You may want to consider using Autotools (I'm going to run through some tutorials and see if I can incorporate this - thanks for the suggestion!)
  • [X] make install installs pliib.hpp and tinyfa.hpp

I've updated brewsci/bio/gfakluge to 1.1.0.

Awesome! I was trying to do this before but struggling. I'd definitely appreciate the help learning how to bump a version and cut a fresh release on homebrew.

Awesome! I was trying to do this before but struggling. I'd definitely appreciate the help learning how to bump a version and cut a fresh release on homebrew.

Open an issue or PR next time over at Brewsci/bio, and if you run into any trouble, I'll be happy to walk you through it.

Thanks @sjackman, if all the items have been completed could you just update the checklist above?

@mgymrek All checked off, Melissa! ✅

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

Thanks @sjackman!

@edawson could you (1) create a Zenodo archive and report the DOI in this thread and (2) check over the proof that was just generated by whedon?

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@mgymrek I had to make a few changes to the bibliography that I missed earlier, but the latest proof is now good. The DOI is 10.5281/zenodo.2546721. Thanks (and thanks again, @sjackman)!

Congrats, Eric! 🍾

@whedon set 10.5281/zenodo.2546721 as archive

OK. 10.5281/zenodo.2546721 is the archive.

@whedon set 1.1.2 as version

OK. 1.1.2 is the version.

thanks, congrats @edawson! and thanks @sjackman for the helpful review

@whedon accept

Attempting dry run of processing paper acceptance...

Check final proof :point_right: https://github.com/openjournals/joss-papers/pull/449

If the paper PDF and Crossref deposit XML look good in https://github.com/openjournals/joss-papers/pull/449, then you can now move forward with accepting the submission by compiling again with the flag deposit=true e.g.
@whedon accept deposit=true

@whedon accept deposit=true

Doing it live! Attempting automated processing of paper acceptance...

🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨

Here's what you must now do:

  1. Check final PDF and Crossref metadata that was deposited :point_right: https://github.com/openjournals/joss-papers/pull/450
  2. Wait a couple of minutes to verify that the paper DOI resolves https://doi.org/10.21105/joss.01083
  3. If everything looks good, then close this review issue.
  4. Party like you just published a paper! 🎉🌈🦄💃👻🤘

Any issues? Notify your editorial technical team...

@sjackman - many thanks for your review here and to @mgymrek for editing this submission ✨

@edawson - your paper is now accepted into JOSS :zap::rocket::boom:

:tada::tada::tada: Congratulations on your paper acceptance! :tada::tada::tada:

If you would like to include a link to your paper from your README use the following code snippets:

Markdown:
[![DOI](http://joss.theoj.org/papers/10.21105/joss.01083/status.svg)](https://doi.org/10.21105/joss.01083)

HTML:
<a style="border-width:0" href="https://doi.org/10.21105/joss.01083">
  <img src="http://joss.theoj.org/papers/10.21105/joss.01083/status.svg" alt="DOI badge" >
</a>

reStructuredText:
.. image:: http://joss.theoj.org/papers/10.21105/joss.01083/status.svg
   :target: https://doi.org/10.21105/joss.01083


We need your help!

Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us please consider doing either one (or both) of the following:

The URL https://doi.org/10.21105/joss.01083 resolves to JOSS, but doi.org does not yet have a Bibtex entry. I imagine it will soon enough.

$ curl -sLH "Accept: text/bibliography; style=bibtex" "https://doi.org/10.21105/joss.01083"
Resource not found.

The URL https://doi.org/10.21105/joss.01083 resolves to JOSS, but doi.org does not yet have a Bibtex entry. I imagine it will soon enough.

Yeah, it can take a few hours for the DOI metadata to propagate properly.
