Joss-reviews: [REVIEW]: SMACT: Semiconducting Materials by Analogy and Chemical Theory

Created on 1 Apr 2019 · 33Comments · Source: openjournals/joss-reviews

Submitting author: @dandavies99 (D. W. Davies)
Repository: https://github.com/WMD-group/SMACT/
Version: v2.1
Editor: @usethedata
Reviewer: @symmy596, @utf
Archive: 10.5281/zenodo.3242315

Status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/7efd2f2ad60d25bdccee3fbd3fc11448"><img src="http://joss.theoj.org/papers/7efd2f2ad60d25bdccee3fbd3fc11448/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/7efd2f2ad60d25bdccee3fbd3fc11448/status.svg)](http://joss.theoj.org/papers/7efd2f2ad60d25bdccee3fbd3fc11448)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@symmy596 & @utf, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

Make sure you're logged in to your GitHub account
Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.theoj.org/about#reviewer_guidelines. Any questions/concerns please let @usethedata know.

✨ Please try and complete your review in the next two weeks ✨

Review checklist for @symmy596

Conflict of interest

[x] As the reviewer I confirm that I have read the JOSS conflict of interest policy and that there are no conflicts of interest for me to review this work.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the repository url?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Version: v2.1
[x] Authorship: Has the submitting author (@dandavies99) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Authors: Does the paper.md file include a list of authors with their affiliations?
[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

Review checklist for @utf

Conflict of interest

[x] As the reviewer I confirm that I have read the JOSS conflict of interest policy and that there are no conflicts of interest for me to review this work.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the repository url?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Version: v2.1
[x] Authorship: Has the submitting author (@dandavies99) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Authors: Does the paper.md file include a list of authors with their affiliations?
[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

accepted published recommend-accept review

Source

whedon

Most helpful comment

As an associate editor in chief, I don't think this conflict should prevent you from reviewing the submission, but we do greatly appreciate you mentioning and recording it.

danielskatz on 2 Apr 2019

👍2

All 33 comments

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @symmy596, it looks like you're currently assigned as the reviewer for this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

watching

You may also like to change your default settings for this watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands

whedon on 1 Apr 2019

Attempting PDF compilation. Reticulating splines etc...

whedon on 1 Apr 2019

:point_right: Check article proof :page_facing_up: :point_left:

whedon on 1 Apr 2019

I am happy and indeed keen to review SMACT.

However I would like to flag that I know the lead author Dan as we have worked in the same department (Bath University) in the past, although he has since moved on. With that said we have had no collaborations, have no current plans for future collaborations. I feel there is no conflict of interest, however I do want to flag it now in the interest of transparency.

Update - Dan left Bath in October.

symmy596 on 2 Apr 2019

As an associate editor in chief, I don't think this conflict should prevent you from reviewing the submission, but we do greatly appreciate you mentioning and recording it.

danielskatz on 2 Apr 2019

👍2

I would like to start by saying that SMACT is a really nice piece of work that has a clear research use. The authors have done a great job.

Reviewer comments

General Comments

The README on PyPi has not rendered correctly. PyPi expects readme files to be rst files however you can get away with markdown by putting - long_description_content_type=’text/markdown into your setup.py. See https://github.com/symmy596/SurfinPy/blob/master/setup.py for an example.
It is not clear from the README what to do if you have any general quaetions that may not be suited to the issue tracker. It needs to be made clear in the README how a user can contact the developers with questions. This falls under point 3 of the community guidelines section of the review criteria.
Two of the repo badges have not rendered correctly.
There are automated tests however there is no mention of them in the README. A sentence explaining that tests exist and an explanation of how to run them is needed.

Examples

The examples work as expected, however I feel that some of the descriptions within the notebooks are a bit thin and could be fleshed out to provide more detail. Some of your future users may be completely new to this field and anything that you can do in your examples to help them get started should be encouraged.

There is no real background provided in the notebooks. In the Inverse_formate_perovskite example it is clear what you are doing but there is no real explanation of why you are doing it. Someone who comes across SMACT is more likely to start using it if they can immediately see how it can fit into their research. A sentence or two that explains why an example exists helps a user to see how they could use it for their own research.
The "next step" is missing in some examples. For example in the cation_mutation example how could a user take the information and convert it into something useful e.g. a POSCAR.
The documentation does a good job at explaining the workflow. In the docs you point to the Generate_compositons_list.ipynb and say that this is an example of generating a pandas dataframe which could be used as an input for ML. It would be useful to users if you provided an example of how to actually do that, even if this was just a code block within the markdown (As I guess it could be computationally intensive).
It is reasonable to expect a certain level of background knowledge; however as these are tutorials I think it is better to provide more background than you think you need. For example, in Inverse_formate_perovskites you mention the electronegativity order test and the Goldschmidt tolerance factor. It may be worth providing definitions of these terms (and others like it in other examples) in order to help users who have not come across these terms before.
In the Inverse_formate_perovskites tutorial you mention that the methodology can be carried out in fewer lines of code but that the method has been broken down for clarity. This is a great idea from a learning standpoint however as a user will typically be using the "fewer lines of code" method I think that method should be provided. It could be put into a code block within the markdown.

symmy596 on 15 Apr 2019

Thanks for the really useful feedback @symmy596 and with many helpful tips. We have tried to address all your points as best we can through changes to the repository.

General

We have added the long_description_content_type flag to get the README to render correctly on PyPi for the next release.
We have added developer contact details of to the README.
It appears that since submitting the repo to JOSS, two of the badge providers had developed problems (Hitcount and Badginator). The former is working again. As fun as the Badginator badge is, it is in no way essential so we have just removed it. 🤷‍♂️
Regarding automated tests: There were already instructions under Code contributions on how to run them, but we have now made this more obvious by adding a separate section.

Examples

Thanks for looking at these in detail and for the useful comments. We have worked on the accessibility of all the examples including:

We have fleshed out the READMEs for the example notebooks to provide some extra background, images and links to resources for defining key terms (e.g. inverse formate perovskite example).
The cation mutation example now includes a demo of how the structure objects can be written out to a cif file.
The Generate_compositions_lists example now includes a Next steps section which has a code block demonstrating featurisation for machine learning applications. Instead of including an actual machine learning section -- which would be quite long and isn't really the focus of SMACT -- we have provided a link to a full example in another repo for the interested user.
We now demonstrate both the long (beginner) and short (advanced) approaches fully in the inverse formate preovskite example. This also revealed a bug regarding how ionic radii are read in so that has also been fixed!

dandavies99 on 18 Apr 2019

I have finished reviewing the paper and the repository. Overall, SMACT is a clean and well documented package that has many useful features. It will be of use to many researchers in the materials informatics and materials design communities.

I have a number of suggestions for improving the package.

Documentation and examples

The readme could be adjusted to make the functionality of the package more obvious. The statement of need is rather broad and not too revealing, whereas the contents section details what the code components do but not how they fit together. Overall, there is a lot of functionality which is often not obviously related (e.g., the distortion module). In my opinion, the readme would be better if it a) clearly listed the features of the code and b) gave specific use cases with corresponding links to examples (and maybe a paper).
The readme would benefit from a more prominent link to the online documentation. I actually had to search the page for the word "doc" to find it.
I found the examples page in the online docs to give the clearest explanation of the functionality of the code. I would direct users there first and suggest reading the other examples afterwards.
There are examples in both the package repository and a separate repository. The separation doesn't seem entirely clear to me. I would consolidate the examples to avoid confusion. On that note, the readme lists both example locations but the Introduction page of the online docs just lists the separate repo.
The Requirements section of the getting started page is a bit funky.
There is a typo in the "Inverse_formate_perovskites.ipynb" example: "senseibe" (or maybe this is a new type of perovskite?)
Some of the examples use the matminer str_to_composition function. Seeing this reminded me that these functions are deprecated and I was supposed to remove them in December last year. You should update the examples to use the StrToComposition conversion featurizers instead.

SMACT package

These are more general comments that do not necessarily need to be addressed but which occurred to me during the review.

Personally I find smact_test a bad function name. It isn't clear what the function does from the name alone and actually suggests it is something to do with testing. Could this be called smact_filter instead (or better charge_balance_filter)?
In many of the examples, I see the paradigm:

for element in search:
    with open(os.path.join(data_directory, 'shannon_radii.csv'),'r') as f:
        reader = csv.reader(f)

As you are often looping over ~100 elements, to constantly reload the file seems excessive. It would be better if all the additional data files were just json dictionaries. The data could be loaded once and then just accessed using shannon_radii[el][coordination].

Having the data files as dictionaries might also remove the need for the complex data_loader caching functionality. The data folder is only 200 KB, so all the tables could be loaded into memory when the package is imported (in __init__.py). The data would then just be module level dictionaries that are accessed by all element/species objects.

utf on 12 May 2019

Thank you both @symmy596 and @utf for your time.

@utf, thanks for the the thorough review and useful comments. We have tried to address all the points by making these changes:

We agree that the separation between the examples in the repo and the workflows in the separate workflow is actually not that helpful. We have moved all examples into the examples folder in the repo and provided a separate README there to signpost to each example.
We have re-worked the main README:
- providing an obvious link to the docs near the top of the page, as well as links to examples.
- We have changed the old “Usage” section to a “Getting started” section, again signposting the docs as a first port of call, and that further examples can be found in the exampled directory.
- We have provided a more in-depth list of “Code features” with links to examples and publications, as suggested.
We have brought the Requirements section of the docs up to date with the Requirements in the README and fixed the (very well-spotted) ‘sensible’ typo. Sadly, not a new type of perovskite 😞
We updated the matminer StrToComposition featurizer in the solaroxides and counting examples.
We have changed the smact_test function to smact_filter in the screening module (and in all relevant examples and tests), as we agree this better reflects its purpose. We chose this over anything to do with charge_balance as the charge neutrality and electronegativity order filters may be applied independently if desired.
We agree that continually reopening files within loops should be avoided. We have kept the paradigm as it is for now in the formate perovskite example in an attempt to make it clear where exactly the data is coming from. We have, however, also added a note and pseudo-code snippet explaining how this should be done for larger production runs.
Thank you for the great idea regarding loading the data into memory when the package is imported. This is a discussion that some of the developers have been having recently and it is likely that we will simplify the way in which data is loaded in future versions of SMACT.

dandavies99 on 23 May 2019

Hi @dandavies99 the new readme is a lot clearer and is very easy to follow. Really nice work.

@danielskatz I’m happy that all my comments have been thoroughly addressed.

utf on 23 May 2019

I am also happy that all of my comments have been addressed.

symmy596 on 23 May 2019

Thanks both, that's great! @usethedata, should I create an archive on Zenodo?

dandavies99 on 27 May 2019

:wave: Hey @dandavies99...

Letting you know, @usethedata is currently OOO until Saturday, June 1st 2019. :heart:

ooo[bot] on 27 May 2019

@usethedata — Looks like this submission is ready for a recommendation?

labarba on 9 Jun 2019

@dandavies99 Yes, please go ahead and create the Zenodo archive. Post the DOI here when you're done.

Sorry for the delays on my part. I was back on 6/1, but getting dug out from under took longer than I'd hoped.

usethedata on 10 Jun 2019

Before you make the archive, @dandavies99, we're going to need a new tagged release of the software repository, after all revisions.

[x] Please list the latest version tag here, to update the metadata
[x] Archive the reviewed software in Zenodo
[x] Check the Zenodo deposit has the correct metadata, this includes the title (should match the paper title) and author list (make sure the list is correct and people who made a minor commit are not on it)
[x] Please list the Zenodo DOI of the archived version here.

labarba on 10 Jun 2019

No problem, @usethedata.
Thanks, @labarba.

The latest version is v2.1
This has now been archived on Zenodo with the correct metadata -->

dandavies99 on 10 Jun 2019

@whedon set v2.1 as version

labarba on 10 Jun 2019

OK. v2.1 is the version.

whedon on 10 Jun 2019

@whedon set 10.5281/zenodo.3242315 as archive

labarba on 10 Jun 2019

OK. 10.5281/zenodo.3242315 is the archive.

whedon on 10 Jun 2019

@whedon accept

labarba on 10 Jun 2019

Attempting dry run of processing paper acceptance...

whedon on 10 Jun 2019

```Reference check summary:

OK DOIs

10.1016/j.chempr.2016.09.010 is OK
https://doi.org/10.1016/0022-3697(64)90176-3 is OK
https://doi.org/10.1016/0022-3697(58)90050-7 is OK
10.1021/cm400893e is OK
10.1021/ja204670s is OK
10.1039/C8FD00032H is OK
10.1039/TF9292500253 is OK
10.1103/PhysRevLett.33.1088 is OK
https://doi.org/10.1016/j.commatsci.2018.05.018 is OK
https://doi.org/10.1016/j.commatsci.2012.10.028 is OK
10.1038/s41586-018-0337-2 is OK
10.1186/1758-2946-3-33 is OK

MISSING DOIs

None

INVALID DOIs

None
```

whedon on 10 Jun 2019

Check final proof :point_right: https://github.com/openjournals/joss-papers/pull/741

If the paper PDF and Crossref deposit XML look good in https://github.com/openjournals/joss-papers/pull/741, then you can now move forward with accepting the submission by compiling again with the flag deposit=true e.g.
@whedon accept deposit=true

whedon on 10 Jun 2019

@dandavies99 : could you merge my stylistic fix here?
https://github.com/WMD-group/SMACT/pull/20

labarba on 10 Jun 2019

Sure thing @labarba.

dandavies99 on 10 Jun 2019

@whedon accept deposit=true

labarba on 10 Jun 2019

Doing it live! Attempting automated processing of paper acceptance...

whedon on 10 Jun 2019

🐦🐦🐦 👉 Tweet for this paper 👈 🐦🐦🐦

whedon on 10 Jun 2019

🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨

Here's what you must now do:

Check final PDF and Crossref metadata that was deposited :point_right: https://github.com/openjournals/joss-papers/pull/742
Wait a couple of minutes to verify that the paper DOI resolves https://doi.org/10.21105/joss.01361
If everything looks good, then close this review issue.
Party like you just published a paper! 🎉🌈🦄💃👻🤘

Any issues? notify your editorial technical team...

whedon on 10 Jun 2019

Congratulations, @dandavies99, your JOSS paper is published!

Big thanks to our editor: @usethedata, and reviewers: @symmy596, @utf — we appreciate your contribution to JOSS 🙏

labarba on 10 Jun 2019

🎉1

:tada::tada::tada: Congratulations on your paper acceptance! :tada::tada::tada:

If you would like to include a link to your paper from your README use the following code snippets:

Markdown:
[![DOI](http://joss.theoj.org/papers/10.21105/joss.01361/status.svg)](https://doi.org/10.21105/joss.01361)

HTML:
<a style="border-width:0" href="https://doi.org/10.21105/joss.01361">
  <img src="http://joss.theoj.org/papers/10.21105/joss.01361/status.svg" alt="DOI badge" >
</a>

reStructuredText:
.. image:: http://joss.theoj.org/papers/10.21105/joss.01361/status.svg
   :target: https://doi.org/10.21105/joss.01361

This is how it will look in your documentation:

We need your help!

Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us please consider doing either one (or both) of the the following:

Volunteering to review for us sometime in the future. You can add your name to the reviewer list here: http://joss.theoj.org/reviewer-signup.html
Making a small donation to support our running costs here: https://numfocus.salsalabs.org/donate-to-joss

whedon on 10 Jun 2019

Was this page helpful?

0 / 5 - 0 ratings

Related issues

[REVIEW]: Pyret: A Python package for analysis of neurophysiology data

whedon · 12Comments

[REVIEW]: gwdegree: A Shiny App to Aid Interpretation of Geometrically-Weighted Degree Estimates in Exponential Random Graph Models

whedon · 8Comments

[REVIEW]: tidytext: Text Mining and Analysis Using Tidy Data Principles in R

whedon · 10Comments

[REVIEW]: The Experiment Factory: Reproducible Experiment Containers

whedon · 12Comments

[REVIEW]: GRUPO: Gauging Research University Publication Output

whedon · 10Comments