Submitting author: @betteridiot (Marcus)
Repository: https://github.com/betteridiot/bamnostic
Version: 0.8.4
Editor: @pjotrp
Reviewers: @luizirber, @peterjc
Author instructions
Thanks for submitting your paper to JOSS @betteridiot. The JOSS editor (shown at the top of this issue) will work with you on this issue to find a reviewer for your submission before creating the main review issue.
@betteridiot if you have any suggestions for potential reviewers then please mention them here in this thread. In addition, this list of people have already agreed to review for JOSS and may be suitable for this submission.
Editor instructions
The JOSS submission bot @whedon is here to help you find and assign reviewers and start the main review. To find out what @whedon can do for you type:
@whedon commands
Hello human, I'm @whedon. I'm here to help you with some common editorial tasks. @pjotrp it looks like you're currently assigned as the editor for this paper :tada:
For a list of things I can do to help you, just type:
@whedon commands
Attempting PDF compilation. Reticulating splines etc...
👋 @pjotrp - the submitting author suggested you as the handling editor. Feel free to nominate someone else if you think there's a better fit with a different editor.
Unofficially: I love that the JOSS bot is called whedon.
I can be the editor. Pretty cool software.
@remills As last author
As instructed, using the provided list of potential reviewers, I think the following would be well suited and qualified if they were selected to review this submission (in no particular order):
@ctb
@afrubin
@conradstack
@jean997
@wdecoster
@Adrianzo
@luizirber
I can review it, but would recommend to wait til Monday to see if anyone else wants to review it =]
I am actually asking @peterjc to be reviewer because he wrote a BAM parser himself in Python. I am waiting for Peter's response. We can have multiple reviewers, even so.
For the sake of complete transparency, a portion of my BgzfReader is a heavily modified version of the BGZF reader that @peterjc wrote. There is even attribution within that module to it. Furthermore, early on, I had consulted @peterjc about the project.
Yes, @betteridiot got in touch with me about BAMnostic earlier this year - I've not been able to make time for any detailed input, but did suggest switching to a standard OSI approved open source license (since done).
My BGZF code referred to is in Biopython, see https://github.com/biopython/biopython/blob/master/Bio/bgzf.py
That BGZF code is now dual licensed (3-clause BSD or the legacy Biopython License Agreement), so having a derivative of that code here in BAMnostic under the 3-clause BSD license is absolutely fine from my point of view.
My pure-Python SAM/BAM code written on top of this still exists on branches, currently the latest code is here:
https://github.com/peterjc/biopython/tree/SamBam2015/Bio/Sequencing/SamBam
I have not had the time to polish this up to release ready status, and was wary of the continuously evolving nature of the SAM/BAM/CRAM file formats turning that into a continuous support burden.
I would be technically able to review this, but the timing is awkward - I am away next week for a conference etc. Unless you mind waiting until mid July 2018, it would be better to ask someone else. Sorry.
Thanks @peterjc. I think we can afford to wait a few weeks and we would appreciate your review.
Hi @peterjc, are you in a position to review? Or should we appoint someone else?
It is now mid July, I am now on leave, but setting the time aside to play with the code will be harder than I'd expected (e.g. verifying the behaviour vs pysam).
One thing on the text: "This is a significant limitation as no other Python implementation (besides pysam) can perform random access operation on BAM files." Maybe insert the rider "published" or "released", as there are several others: my old Python code can do this but has not been formally published or released, and there are also https://github.com/JohnLonginotto/pybam ("A simple, 100% python, BAM file reader.") and some work at https://github.com/nijibabulu/pypysam/
As a potential user I would also want to know about plans for CRAM support (logged as https://github.com/betteridiot/bamnostic/issues/4), CSI indexing (https://github.com/betteridiot/bamnostic/issues/3), and BAM output (https://github.com/betteridiot/bamnostic/issues/5).
I know this is a lot to ask, and perhaps out of scope for JOSS, but as a user I would worry about the long term support as the SAM/BAM file format continues to evolve (e.g. the overdue fix for long CIGAR strings last year, important with long reads - logged as https://github.com/betteridiot/bamnostic/issues/6). In contrast, by building on top of the well supported HTSlib, pysam gets a lot of this maintenance work "for free", but a re-implementation would not (as I wrote earlier, this was a factor in me not formally releasing my own Python SAM/BAM code a few years back).
If you can find another relevant reviewer able to give you a quick turn around, please go ahead. Or are my comments here already close to what JOSS wants from a reviewer?
Thank you greatly for your points. Regardless of @pjotrp's decision to keep you on as the reviewer, you made great points that I would like to take the opportunity to address.
Your points regarding BAM output and long CIGAR strings are definitely revisions that can (and should) be made to bamnostic. The CIGAR string revision is still a relatively recent change, and was not written into bamnostic yet. Thank you for pointing this out, and that pull request should come very soon. The BAM output point is a bigger issue, but still fair. For the purposes of the original project that birthed bamnostic, BAM output was not desired, and therefore fell victim to oversight. However, due to incorporating @peterjc's BGZF module, the framework for connecting the pieces is there. They just have not been formally implemented. Making this change should be relatively straightforward (much like you stated in the respective issue thread).
I also agree with you that CRAM support would be a great boon for bamnostic. However, like you stated in betteridiot/bamnostic#4, it is a much more complex format that would require a non-trivial expenditure of time and resources to write up. Not saying that this feature is not in the future of bamnostic, but we believe that the present form of bamnostic could still be a very impactful tool for current pipelines and analysis. The same applies to CSI indexing. While it is a successor to BAI indexing, it has not permeated the domain to the degree that BAI has yet.
With regard to the phrasing: "This is a significant limitation as no other Python implementation (besides pysam) can perform random access operation on BAM files.", your point is fair that "published" should be added.
Additionally, in the vein of JOSS's purpose, this is an open-source project as well, and contributions are not only welcomed, they are encouraged. If the community would like to add features into bamnostic, pull requests are fully appreciated.
Thanks @peterjc for your comments! We can have multiple reviewers so I'll add @luizirber to take the review on from here as the primary reviewer, if he does not mind.
@whedon assign @luizirber as reviewer
OK, the reviewer is @luizirber
@whedon assign @pjotrp as editor
OK, the editor is @pjotrp
@whedon add @peterjc as reviewer
OK, @peterjc is now a reviewer
@whedon commands
Here are some things you can ask me to do:
# List all of Whedon's capabilities
@whedon commands
# Assign a GitHub user as the sole reviewer of this submission
@whedon assign @username as reviewer
# Add a GitHub user to the reviewers of this submission
@whedon add @username as reviewer
# Remove a GitHub user from the reviewers of this submission
@whedon remove @username as reviewer
# List of editor GitHub usernames
@whedon list editors
# List of reviewers together with programming language preferences and domain expertise
@whedon list reviewers
# Change editorial assignment
@whedon assign @username as editor
# Set the software archive DOI at the top of the issue e.g.
@whedon set 10.0000/zenodo.00000 as archive
# Open the review issue
@whedon start review
🚧 🚧 🚧 Experimental Whedon features 🚧 🚧 🚧
# Compile the paper
@whedon generate pdf
@whedon start review magic-word=bananas
OK, I've started the review over in https://github.com/openjournals/joss-reviews/issues/826. Feel free to close this issue now!