Joss-reviews: [REVIEW]: h5preserve

Created on 14 Feb 2018 · 23 comments · Source: openjournals/joss-reviews

Submitting author: @aragilar (James Tocknell)
Repository: https://github.com/h5preserve/h5preserve
Version: v0.15.0
Editor: @arfon
Reviewer: @mattpitkin
Archive: 10.5281/zenodo.1179292

Status

status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/2f5fa0cbfc075383d7764d271889b6ea"><img src="http://joss.theoj.org/papers/2f5fa0cbfc075383d7764d271889b6ea/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/2f5fa0cbfc075383d7764d271889b6ea/status.svg)](http://joss.theoj.org/papers/2f5fa0cbfc075383d7764d271889b6ea)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@mattpitkin, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.theoj.org/about#reviewer_guidelines. Any questions/concerns please let @arfon know.

Conflict of interest

Code of Conduct

General checks

  • [x] Repository: Is the source code for this software available at the repository url?
  • [x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • [x] Version: Does the release version given match the GitHub release (v0.15.0)?
  • [x] Authorship: Has the submitting author (@aragilar) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • [x] Installation: Does installation proceed as outlined in the documentation?
  • [x] Functionality: Have the functional claims of the software been confirmed?
  • [x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • [x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)?
  • [x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • [x] Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • [x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • [x] Authors: Does the paper.md file include a list of authors with their affiliations?
  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?
Labels: accepted, published, recommend-accept, review

All 23 comments

Hello human, I'm @whedon. I'm here to help you with some common editorial tasks. @mattpitkin it looks like you're currently assigned as the reviewer for this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository, which means that with GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

watching

  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands
Attempting PDF compilation. Reticulating splines etc...

@mattpitkin - please carry out your review in this issue by updating the checklist above and giving feedback in this issue. The reviewer guidelines are available here: http://joss.theoj.org/about#reviewer_guidelines

Any questions/concerns please let me know.

Hi @aragilar - could you include explicit instructions for installation in the README and documentation? E.g., something like:

install the software using:

pip install h5preserve

or by cloning this repository and running:

python setup.py install --user

Could you also add a requirements file?

Having downloaded h5preserve using pip, I'm trying the example shown here, and in both Python 2.7.12 and 3.5.2 I get the following error:

In [6]: my_cool_experiment = Experiment(np.array([1,2,3,4,5]), 10)

In [7]: with open("my_data_file.hdf5", new_registry_list(registry)) as f:
   ...:     f["cool experiment"] = my_cool_experiment
   ...: 
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-f3958ebfc497> in <module>()
----> 1 with open("my_data_file.hdf5", new_registry_list(registry)) as f:
      2     f["cool experiment"] = my_cool_experiment

TypeError: file() argument 2 must be string, not RegistryContainer

If I edit it to use with open("my_data_file.hdf5", "w") as f: I get the following error:

In [10]: with open("my_data_file.hdf5", "w") as f:
    ...:     f["cool experiment"] = my_cool_experiment
    ...: 
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-10-63e87c258186> in <module>()
      1 with open("my_data_file.hdf5", "w") as f:
----> 2     f["cool experiment"] = my_cool_experiment

TypeError: 'file' object does not support item assignment

Do you know what the problem could be?

@mattpitkin That error is due to me not checking that the code runs; it should be h5open, not open (I've updated the docs). That example also used an older version of the API, where dictionaries could be used (I dropped that, as there were a bunch of problems with it).
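
For reference, a minimal sketch of the corrected quick-start usage, assuming the decorator-based registry API described in the h5preserve docs (the Experiment class and registry names follow the example in question; exact details may differ from the shipped example):

import numpy as np
from h5preserve import open as h5open, Registry, new_registry_list, DatasetContainer

registry = Registry("experiment")

class Experiment:
    def __init__(self, data, time_started):
        self.data = data
        self.time_started = time_started

# dumper: converts an Experiment into something h5py can write
@registry.dumper(Experiment, "Experiment", version=1)
def _exp_dump(experiment):
    return DatasetContainer(
        data=experiment.data,
        attrs={"time started": experiment.time_started},
    )

# loader: rebuilds an Experiment from the stored dataset
@registry.loader("Experiment", version=1)
def _exp_load(dataset):
    return Experiment(
        data=dataset["data"],
        time_started=dataset.attrs["time started"],
    )

my_cool_experiment = Experiment(np.array([1, 2, 3, 4, 5]), 10)

with h5open("my_data_file.hdf5", new_registry_list(registry)) as f:
    f["cool experiment"] = my_cool_experiment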

I've also added more explicit instructions and referred to https://packaging.python.org/ as needed (I was avoiding explicit instructions, as I've noticed far too many projects refer to old methods, e.g. python setup.py install or easy_install).

I haven't added a requirements.txt file for the dependencies of h5preserve: h5preserve is a library, not an application, so it shouldn't have a requirements.txt, as per https://caremad.io/posts/2013/07/setup-vs-requirement/. If you meant for the tests, tox handles all of that.
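
For libraries, the usual pattern is to declare loose runtime dependencies in the packaging metadata rather than pinning them in a requirements.txt. A generic illustration of that pattern (this is not h5preserve's actual setup.py, and the package name here is a placeholder):

from setuptools import setup, find_packages

setup(
    name="mylibrary",          # placeholder name
    version="0.1.0",
    packages=find_packages(),
    # a library declares broad runtime dependencies here;
    # an application would instead pin exact versions in requirements.txt
    install_requires=[
        "h5py",
        "numpy",
    ],
)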

Thanks for these additions. Now that you've removed explicit Python 2 support, could you update the version on PyPI so that the badge on the README page no longer lists Python 2.7?

Regarding the automated testing, I'm not really familiar with tox, but what does it mean when the output says

================================ 94 passed, 43 xfailed in 0.78 seconds ================================

Is it ok/expected that 43 xfail?

Just a minor typo thing - the links to the repo in the "Contributing" doc page point to github.org rather than github.com, so they don't work.

Fixed the URLs, added a test to the CI to make sure they don't break, and uploaded a new release to PyPI (so everything should now show Python 3 only).

The xfails come from pytest: it means the test is expected to fail (and hence "passes"); if they passed, then there'd be a problem. (I think most of those xfailing tests are stubs for unit tests I haven't had time to implement yet; most of the code is covered by integration tests.)
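
For context, a stub test marked with pytest's xfail looks roughly like this (the test name here is hypothetical, not taken from the h5preserve test suite):

import pytest

# pytest reports this as "xfailed" rather than "failed",
# so the suite still passes while the stub remains unimplemented
@pytest.mark.xfail(reason="unit test not written yet")
def test_dumper_roundtrip():
    raise NotImplementedError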

I just have a couple more questions/requests:

Is it generally the case that you would define the registry.loader and registry.dumper functions outside of the class they refer to, or could/should you have them as methods of the class, called e.g. dump and load? The latter seems to make more sense to me, but maybe I'm thinking of these methods in the wrong way.

Would it be possible to give explicit examples of creating your own RegistryContainer, and of using the GroupContainer, as only the DatasetContainer is used in the current example? Also, would you be able to show an example of versioning in action, where you create two different versions, as I'm not entirely sure what it means at the moment?

A couple of minor typos to fix in the paper.md file:

  • change "...data structures produce by some modelling..." to "...data structures produced by some modelling..."
  • change "...design flaws in pickle have brought up..." to "...design flaws in pickle have been brought up..."

Can you add the doi = {10.1109/MCSE.2011.37} to the van_der_walt_numpy_2011 reference in the paper.bib file?

@mattpitkin I've added the paper changes, and I've added more examples; do they make sense to you?
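
As a rough sketch of the kind of GroupContainer example being asked about (the Simulation class and attribute names here are invented for illustration; the exact API details are in the h5preserve docs):

from h5preserve import Registry, GroupContainer

registry = Registry("simulation")

class Simulation:
    def __init__(self, initial_conditions, results):
        self.initial_conditions = initial_conditions
        self.results = results

# each keyword argument to GroupContainer becomes a child of the HDF5 group
@registry.dumper(Simulation, "Simulation", version=1)
def _sim_dump(sim):
    return GroupContainer(
        attrs={"code name": "example"},
        initial_conditions=sim.initial_conditions,
        results=sim.results,
    )

@registry.loader("Simulation", version=1)
def _sim_load(group):
    return Simulation(
        initial_conditions=group["initial_conditions"],
        results=group["results"],
    )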

Whilst there's nothing stopping you from making the dumpers/loaders static methods on the class, there are a few reasons not to:

  1. Sometimes it's hard (or inconvenient) to add methods to classes: e.g. namedtuples, or other people's classes.
  2. Subclasses will use the same method for loading/dumping unless you override it. This is in effect the same problem that pickle has (it has options for controlling dumping via a bunch of methods: https://docs.python.org/3/library/pickle.html#pickle-inst): you lose control over what you dump/load (which is why you can choose not to use the built-in dumpers/loaders).
  3. Versioning becomes much harder: you either end up with n different functions, which is the same as the existing API, or you have a whole bunch of if statements, which you can screw up.

I'll try to translate that into an advice section, but once you try to implement the system as suggested in https://www.youtube.com/watch?v=7KnfGDajDQw, you do come to the conclusion that separate functions just work better (though it took me two previous iterations of the code to learn this; it's definitely not obvious).
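
To illustrate the versioning point, a sketch of how two versions of a dumper/loader pair can coexist as separately registered functions, assuming the same decorator-based registry, Experiment class and DatasetContainer as in the earlier sketch (not the project's exact examples):

# version 1 only stored the data array
@registry.dumper(Experiment, "Experiment", version=1)
def _exp_dump_v1(experiment):
    return DatasetContainer(data=experiment.data)

# version 2 also records when the experiment started;
# the registry keeps both dumpers side by side
@registry.dumper(Experiment, "Experiment", version=2)
def _exp_dump_v2(experiment):
    return DatasetContainer(
        data=experiment.data,
        attrs={"time started": experiment.time_started},
    )

# one loader per stored version, so old files remain readable
@registry.loader("Experiment", version=1)
def _exp_load_v1(dataset):
    return Experiment(data=dataset["data"], time_started=None)

@registry.loader("Experiment", version=2)
def _exp_load_v2(dataset):
    return Experiment(
        data=dataset["data"],
        time_started=dataset.attrs["time started"],
    )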

@aragilar - thanks, the examples look good to me, so I'm happy to sign things off.

@arfon - I'm happy for this to be accepted.

@aragilar - At this point could you make an archive of the reviewed software in Zenodo/figshare/other service and update this thread with the DOI of the archive? I can then move forward with accepting the submission.

@whedon set 10.5281/zenodo.1179292 as archive

OK. 10.5281/zenodo.1179292 is the archive.

@mattpitkin many thanks for your review here ✨

@aragilar - your paper is now accepted into JOSS and your doi is https://doi.org/10.21105/joss.00581 :zap: :rocket: :boom:

:tada::tada::tada: Congratulations on your paper acceptance! :tada::tada::tada:

If you would like to include a link to your paper from your README use the following code snippet:

[![DOI](http://joss.theoj.org/papers/10.21105/joss.00581/status.svg)](https://doi.org/10.21105/joss.00581)

This is how it will look in your documentation:

DOI

We need your help!

Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us, please consider doing one (or both) of the following:
