Joss-reviews: [REVIEW]: JavaPermutationTools: A Java Library of Permutation Distance Metrics

Created on 16 Sep 2018 · 57 Comments · Source: openjournals/joss-reviews

Submitting author: @cicirello (Vincent A Cicirello)
Repository: https://github.com/cicirello/JavaPermutationTools
Version: 1.0
Editor: @yochannah
Reviewers: @ChrisJefferson, @motib
Archive: 10.5281/zenodo.1478192

Status


Status badge code:

HTML: <a href="http://joss.theoj.org/papers/e2477e28f623a8a87046a8831c7c55ef"><img src="http://joss.theoj.org/papers/e2477e28f623a8a87046a8831c7c55ef/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/e2477e28f623a8a87046a8831c7c55ef/status.svg)](http://joss.theoj.org/papers/e2477e28f623a8a87046a8831c7c55ef)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@ChrisJefferson & @motib, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.theoj.org/about#reviewer_guidelines. Any questions/concerns please let @yochannah know.

✨ Please try and complete your review in the next two weeks ✨

Review checklist for @ChrisJefferson

Conflict of interest

Code of Conduct

General checks

  • [x] Repository: Is the source code for this software available at the repository url?
  • [x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • [x] Version: Does the release version given match the GitHub release (1.2)?
  • [x] Authorship: Has the submitting author (@cicirello) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • [x] Installation: Does installation proceed as outlined in the documentation?
  • [x] Functionality: Have the functional claims of the software been confirmed?
  • [x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • [x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • [x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • [x] Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • [x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • [x] Authors: Does the paper.md file include a list of authors with their affiliations?
  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

Review checklist for @motib

Conflict of interest

Code of Conduct

General checks

  • [x] Repository: Is the source code for this software available at the repository url?
  • [x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • [x] Version: Does the release version given match the GitHub release (1.0)?
  • [x] Authorship: Has the submitting author (@cicirello) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • [x] Installation: Does installation proceed as outlined in the documentation?
  • [x] Functionality: Have the functional claims of the software been confirmed?
  • [x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • [x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • [x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • [x] Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • [x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • [x] Authors: Does the paper.md file include a list of authors with their affiliations?
  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?
Labels: accepted, published, recommend-accept, review

All 57 comments

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @ChrisJefferson, it looks like you're currently assigned as the reviewer for this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository, which means, with GitHub's default behaviour, you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' at https://github.com/openjournals/joss-reviews

  2. You may also like to change your default settings for watching repositories in your GitHub profile: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands
Attempting PDF compilation. Reticulating splines etc...

@motib Just replying to your question from the pre-review issue.... JUnit tests don't have main methods. They're executed differently (either in an IDE or with a runner, such as the one within ant). In this case, I have the ant build script set up to run all of the JUnit tests during the build.

When you run ant to build the project, you'll see the JUnit test results among the output. The relevant lines are labeled as JUnit. After each JUnit test class is executed, ant will output a line indicating the number of test cases, some other summary data, and whether any failed and why.

The ant build script is also configured to terminate if any tests fail. So you'll only get a successful build if they all pass.

There are also example programs in the examples and replication folders, which you can run from the command line (see the readmes in those folders).

Thanks, @cicirello, I eventually got it to work.

I suggest that the readme explain how to compile and run the programs using only "javac" and "java" in directories not part of the distribution. I imagine that some researchers in algorithms are inexperienced programmers who are not familiar with the entire ecosystem (e.g. ant), and who might need help with nested packages. But you know your community well, and if you think that I am misjudging their level of software expertise, feel free to leave everything as it is.

After reviewing according to the review criteria, I recommend "accept with minor revisions". Suggestions follow:

JOSS paper

The paper is too long (the author guidelines state 250-1000 words). If you still think that a long paper is necessary, please give _subsections_ according to the JOSS guidelines:

  • A summary describing the high-level functionality and purpose of the software for a diverse, non-specialist audience
  • A clear statement of need that illustrates the purpose of the software
  • Mentions (if applicable) of any ongoing research projects using the software or recent scholarly publications enabled by it

Given the guideline of "non-specialist audience", in addition to the research papers, there should be a reference to a textbook presentation or tutorial on permutation algorithms. (My 1974! edition of Aho, Hopcroft and Ullman only has one exercise on permutations. Cormen, Leiserson and Rivest might have more.)

Example usage

OK, but consider the suggestion I wrote in the previous post.

API documentation

The link in the local copy of "index.html" doesn't work. I could only access it through the link on "https://jpt.cicirello.org/".

The API itself is excellent, but the guidelines recommend "including example inputs and outputs". This doesn't need to be (shouldn't be?) in the API, but I suggest that more example programs be included in the project.

@whedon commands

Here are some things you can ask me to do:

# List Whedon's capabilities
@whedon commands

# List of editor GitHub usernames
@whedon list editors

# List of reviewers together with programming language preferences and domain expertise
@whedon list reviewers

# Compile the paper
@whedon generate pdf

🚧 🚧 🚧 Experimental Whedon features 🚧 🚧 🚧

# Compile the paper from a custom git branch
@whedon generate pdf from branch custom-branch-name

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@motib I've made changes and/or added things based on your comments. I believe I've accounted for almost all of your suggestions. Here's a summary of what I've done, and what I haven't gotten to yet.

First, list of what I've done so far based on your suggestions:

(1) In the build directory, in addition to the ant build.xml, there is now a Windows batch script that uses only javac (to compile), jar (to generate the jar of the library), and java (to execute the JUnit tests). If someone has the Java JDK installed, they will already have all of these.

(2) I edited the sections of the readme for the repository to document how to build. There are now two sections: one for ant, and one for this alternative process.

(3) I added a section to the readme for the repository explaining how to run the example programs (although I do still direct them to the readme in the example folders for more details).

(4) I fixed the bug in the index.html for the docs, where the link to the API docs was broken if viewed locally (I had a leading / in the link, which was fine when viewed via the web, but not when viewed locally).

(5) I added a sentence in the paper, referring to Knuth's book series for general coverage of permutation related algorithms. The CLRS algorithms book you suggested actually doesn't have much on permutations.

(6) I reorganized and edited the text of the paper in various ways; e.g., there are now multiple sections, including one on statement of need.

(7) In editing the text of the paper, I also cut down its length slightly. It is now just under the upper end of the suggested length range, at 998 words (including title, author, keywords, etc.).

Next, what I haven't addressed (yet).... I think the only one of your suggestions that I haven't addressed yet is adding more examples. Are there any parts of the library that you think would especially benefit from more examples? Anyway, I'll probably add more examples on Wednesday (I teach most of tomorrow).

@cicirello Really nice now.

I suggest one example for each of the five top-level packages. You could probably do this with almost no work by adapting the JUnit test programs to be standalone programs in the examples directory.

No rush. Tomorrow is a holiday here, so I won't get to it again before Thursday.

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@motib I added example programs for each of the top level packages, as you suggested. I also trimmed down the text of the paper a bit more (and made the tables look nicer).

Thanks, @cicirello.

I ran the programs with only one problem: the ieeetevc2016.FDC program ran out of memory (heap space). I added "-Xms1g" to the "java" command and it ran, although it took a long time. (I vaguely remembered that such a parameter exists, although it doesn't appear in the "java" usage message, so I had to google it....)

Please add to the readmes and program comments an indication of how much memory and time is needed. (At work I am stuck with a 32-bit 3GHz 4GB computer, but, again, others might also have primitive computers!!)

@motib Thanks for catching the potential memory issue with that one experiment replication program. That one is very memory intensive. There were no problems on our server with the default heap allocation, but the default is probably higher on a server JVM.

Anyway, I added a paragraph about this to both the repository's readme and the readme in the replication folder (also mentioning that it runs long), indicating to use the -Xms1g parameter if an out-of-memory error is thrown (and explicitly showing the full command line).
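For anyone unsure whether a JVM's default heap is big enough before reaching for heap-size flags, the ceiling can be queried from Java itself. The snippet below is an illustrative standalone check using the standard Runtime API, not part of the library:

public class HeapCheck {
    public static void main(String[] args) {
        // Maximum heap the JVM will attempt to use (the ceiling that
        // heap-size flags raise); reported here in MiB.
        long maxMiB = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("max heap: " + maxMiB + " MiB");
    }
}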

@cicirello In my generation we ran programs on free-standing computers, not servers. I worked at a university computer center with 250 KILO-bytes of disk space to be shared among everyone! I still find it hard to digest that a small program needs a gigabyte of memory :-).

@yochannah I recommend acceptance of this submission.

Some initial comments:

Most methods don't have checks for invalid inputs -- personally I would have some kind of checked/debugging mode which checked that (for example) permutations are valid. For at least one method I think this is more serious -- Permutation::toInteger can easily overflow (particularly as it calculates n!), but the result could still be used to build a new, invalid permutation.
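To make the overflow concrete: 12! = 479,001,600 fits in a 32-bit int, but 13! does not, so any int-based factorial accumulation (which a method like toInteger presumably performs) breaks at length 13. A minimal standalone demonstration, not code from the library:

public class FactorialOverflowDemo {
    public static void main(String[] args) {
        int f = 1;
        for (int n = 1; n <= 13; n++) {
            try {
                f = Math.multiplyExact(f, n); // throws on int overflow
                System.out.println(n + "! = " + f);
            } catch (ArithmeticException e) {
                System.out.println(n + "! overflows int"); // first occurs at n = 13
            }
        }
    }
}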

The xor-swap trick, used in various places in Permutation, is both less clear and slower than just using a temporary int variable.
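As a generic illustration (not code from the library) of why the temporary is preferable, note that the xor trick also silently fails when both indices refer to the same element:

public class SwapDemo {
    // xor-swap: avoids a temporary, but is opaque, no faster under a
    // modern JIT, and zeroes the element when i == j.
    static void xorSwap(int[] a, int i, int j) {
        a[i] ^= a[j];
        a[j] ^= a[i];
        a[i] ^= a[j];
    }

    // The obvious version: correct for all i and j, including i == j.
    static void tempSwap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        int[] a = {5, 7};
        xorSwap(a, 0, 0);
        System.out.println(a[0]); // prints 0 -- the value 5 was destroyed
        int[] b = {5, 7};
        tempSwap(b, 0, 0);
        System.out.println(b[0]); // prints 5
    }
}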

It is unclear if you intend to support permutations on 0 points. The tests don't seem to check them, but the code doesn't forbid them.

For many of the distances, there are not many tests checking a permutation against itself, or the result on identity, 1-point, or 0-point permutations.

@ChrisJefferson Thank you for your very useful feedback. Here's a list of what I've done to address your comments. I think I've covered all of your comments here.

1) Permutation::toInteger was modified to throw an exception if the permutation length is greater than 12 (to avoid int overflow), and the comments were edited to document this. A similar change was made to ReversalDistance, which has a similar limit.

2) On the potential for invalid inputs to the various methods of Permutation with index parameters.... Rather than checking, I've now edited the documentation to note the potential for ArrayIndexOutOfBoundsException.

3) Related to the above, the JUnit tests for the Permutation class test that the constructors generate valid Permutations. All but one of the methods of the Permutation class that change the permutation guarantee, if implemented correctly, that the result is a valid Permutation. I have now removed the one method that did not have this guarantee (the set method). I was only using set in one other class (ReversalDistance), where there was an alternative way to do the same thing, so I decided it wasn't worth the potential risk of misuse.

4) Eliminated the xor-swap trick.

5) I hadn't explicitly thought about 0-length permutations. I edited the test cases (both the tests of the Permutation class and the tests of the distance metrics) to test with 0-length permutations. There was one distance metric that didn't handle that case correctly, so I fixed that too (everything else handled the 0-length permutation correctly).

6) On distance tests.... All distance metrics already had tests for pairs of identical permutations (see the identicalPermutations method in that test class). However, I added to it, testing with two references to the same permutation object. The 1-length case was also already tested there; there is only one permutation of length 1, { 0 }, which is at distance 0 from itself. I have now also edited the tests to check the distance between the 0-length permutation and itself (and fixed 2 distance metrics that threw exceptions on the 0-length case); a sketch of this kind of edge-case check appears after this list. Thanks for catching that I neglected 0-length permutations.

7) This one isn't based on your comments: I had a couple classes in a package org.cicirello.sequences that are really just utility classes that serve the purpose of easing the implementation of the distance metrics in org.cicirello.sequences.distance, and which otherwise wouldn't apply outside of that context. So I moved those utility classes into the org.cicirello.sequences.distance package, and changed access to package-private.
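As a rough sketch of the edge-case checks described in item 6 above -- assuming the Permutation(int[]) constructor discussed later in this review and a distance method on the metric classes; class and method names here are illustrative, and the library's actual JUnit test classes differ:

import org.cicirello.permutations.Permutation;
import org.cicirello.permutations.distance.ExactMatchDistance;

public class EdgeCaseDistanceCheck {
    public static void main(String[] args) {
        ExactMatchDistance d = new ExactMatchDistance();
        // Lengths 0 and 1 are the degenerate cases discussed above.
        Permutation empty = new Permutation(new int[0]);
        Permutation single = new Permutation(new int[] {0});
        // A permutation should always be at distance 0 from itself.
        System.out.println(d.distance(empty, empty));   // expect 0
        System.out.println(d.distance(single, single)); // expect 0
    }
}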

Hey @ChrisJefferson! Quick status check - have you had any chance to check the paper? ๐Ÿ˜„

Sorry, I got distracted.

I am very happy with the whole library now, except for one issue (although it's a big one in some sense).

Personally, I don't like functions like the (Permutation p, int length) permutation constructor, which can produce invalid permutations if given incorrect input data -- I find it's too easy to accidentally make a slip-up and end up with an invalid permutation. In my code I would tend not to have such functions (or at the least, label them somehow, with 'unsafe' in their name), to make users think hard before using them.

Now, I wouldn't want to reject out of hand on that basis, as I realise the trade-off between speed and safety is a personal one (as I've got older, I've found no amount of speed is worth the occasions when I have spent literal weeks tracking down a bug in a massive software project).

Therefore my only request is that either these constructors are made safe (possibly with an Assert, or some other system which can be disabled if you really don't want to pay the cost of checking that the permutation is valid -- although I don't know if asserts are trendy/acceptable in Java nowadays?), or alternatively that you clearly document that some constructors can produce invalid permutations if given incorrect input data (this is technically already there, but could be made ALL CAPS, so no-one misses it).

@ChrisJefferson The constructor with parameters Permutation(Permutation p, int length) will now only generate valid permutations. I redefined its behavior (see the updated documentation of this constructor) slightly, but just enough that it could be implemented such that any parameter values can be handled. Also replaced the old test cases for this constructor to test the new behavior.

Related to this, I updated the documentation of the constructor with parameters Permutation(int[] p) to document the exception that is thrown if it is passed an array that contains duplicates or element values outside the interval [0,n). It was already doing that validation check, but the exception wasn't documented in the javadoc comments. It now is.

So now there are no constructors or methods that produce invalid permutations, and parameter values that would lead to such are checked with exceptions thrown and documented with @throws.
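For readers following along, the validity check behind such exceptions can be written in a few lines. This is a generic sketch of the idea, not the library's actual implementation:

public class PermutationCheck {
    // Returns true iff p contains each value in [0, p.length) exactly once,
    // i.e., p is a valid permutation of 0..n-1.
    static boolean isValidPermutation(int[] p) {
        boolean[] seen = new boolean[p.length];
        for (int e : p) {
            if (e < 0 || e >= p.length || seen[e]) return false;
            seen[e] = true;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidPermutation(new int[] {2, 0, 1})); // true
        System.out.println(isValidPermutation(new int[] {2, 2, 1})); // false (duplicate)
    }
}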

Excellent, thanks.

I am now completely happy with this package.

This is not at all a requirement, but I notice when looking on GitHub (for example https://github.com/cicirello/JavaPermutationTools/blob/master/src/org/cicirello/permutations/Permutation.java#L92) that the formatting looks weird. Mixed tabs+spaces? Or a tab size different from GitHub's?

@ChrisJefferson I fixed the weird formatting. I think I had edited that one at some point from a different computer than usual, with its IDE configured differently (spaces vs tabs).

Okay - @ChrisJefferson, if you're happy with the package, can you run through the checklist at the top of this issue and make sure you're happy to check them all? Thanks! Once that's done we can move on to publishing this.

My only issue is that there have been some commits, as part of the review, which should be part of a new release.

@ChrisJefferson @yochannah I just created a new release to include all commits made during the review process.

@cicirello This looks good! Once @ChrisJefferson has completed his check boxes I'll give everything a once-over and then we can hopefully publish!

I am now happy with everything.

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@cicirello - At this point could you make an archive of the reviewed software in Zenodo/figshare/other service and update this thread with the DOI of the archive?

@yochannah Here's the Zenodo DOI:
https://doi.org/10.5281/zenodo.1478192

OK. 10.5281/zenodo.1478192 is the archive.

@arfon ready to accept! 💯

@whedon accept

Attempting dry run of processing paper acceptance...

Check final proof :point_right: https://github.com/openjournals/joss-papers/pull/50

If the paper PDF and Crossref deposit XML look good in https://github.com/openjournals/joss-papers/pull/50, then you can now move forward with accepting the submission by compiling again with the flag deposit=true e.g.
@whedon accept deposit=true

@whedon accept deposit=true

Doing it live! Attempting automated processing of paper acceptance...

🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨

Here's what you must now do:

  1. Check final PDF and Crossref metadata that was deposited :point_right: https://github.com/openjournals/joss-papers/pull/51
  2. Wait a couple of minutes to verify that the paper DOI resolves https://doi.org/10.21105/joss.00950
  3. If everything looks good, then close this review issue.
  4. Party like you just published a paper! 🎉🌈🦄💃👻🤘

    Any issues? notify your editorial technical team...

@ChrisJefferson, @motib - many thanks for your reviews here and to @yochannah for editing this submission ✨

@cicirello - your paper is now accepted into JOSS :zap: :rocket: :boom:

:tada::tada::tada: Congratulations on your paper acceptance! :tada::tada::tada:

If you would like to include a link to your paper from your README use the following code snippets:

Markdown:
[![DOI](http://joss.theoj.org/papers/10.21105/joss.00950/status.svg)](https://doi.org/10.21105/joss.00950)

HTML:
<a style="border-width:0" href="https://doi.org/10.21105/joss.00950">
  <img src="http://joss.theoj.org/papers/10.21105/joss.00950/status.svg" alt="DOI badge" >
</a>

reStructuredText:
.. image:: http://joss.theoj.org/papers/10.21105/joss.00950/status.svg
   :target: https://doi.org/10.21105/joss.00950


We need your help!

Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us please consider doing either one (or both) of the following:
