Joss-reviews: [REVIEW]: Akmedoids R package for generating directionally-homogeneous clusters of longitudinal data sets

Created on 23 Jun 2020 · 42Comments · Source: openjournals/joss-reviews

Submitting author: @MAnalytics (Monsuru Adepeju)
Repository: https://github.com/MAnalytics/akmedoids
Version: v0.1.5
Editor: @karthik
Reviewer: @karthik, @njtierney
Archive: Pending

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/6b1b3bbe3259b3a3a42c4024de638a08"><img src="https://joss.theoj.org/papers/6b1b3bbe3259b3a3a42c4024de638a08/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/6b1b3bbe3259b3a3a42c4024de638a08/status.svg)](https://joss.theoj.org/papers/6b1b3bbe3259b3a3a42c4024de638a08)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@karthik & @njtierney , please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

Make sure you're logged in to your GitHub account
Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @karthik know.

✨ Please try and complete your review in the next six weeks ✨

Review checklist for @karthik

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the repository url?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@MAnalytics) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

Review checklist for @njtierney

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[ ] Repository: Is the source code for this software available at the repository url?
[ ] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[ ] Contribution and authorship: Has the submitting author (@MAnalytics) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

[ ] Installation: Does installation proceed as outlined in the documentation?
[ ] Functionality: Have the functional claims of the software been confirmed?
[ ] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[ ] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[ ] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[ ] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[ ] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[ ] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[ ] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[ ] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[ ] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[ ] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[ ] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[ ] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

review

Source

whedon

👀1 👍1

Most helpful comment

Thanks for your reponse, @MAnalytics !

I will reply shortly.

njtierney on 16 Nov 2020

❤1 👍1

All 42 comments

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @karthik, @njtierney it looks like you're currently assigned to review this paper :tada:.

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

watching

You may also like to change your default settings for this watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf

whedon on 23 Jun 2020

Reference check summary:

OK DOIs

- 10.1007/s00180-009-0178-4 is OK
- 10.1007/s10940-014-9228-3 is OK
- 10.1111/j.1745-9125.2004.tb00541.x is OK
- 10.1007/s10940-016-9301-1 is OK
- 10.4324/9781315089256-19 is OK
- 10.1177/1088767909337701 is OK
- 10.1093/bjc/azx022 is OK
- 10.1016/0167-9473(92)90042-e is OK
- 10.1080/03610917408548446 is OK
- 10.1016/0377-0427(87)90125-7 is OK

MISSING DOIs

- None

INVALID DOIs

- None

whedon on 23 Jun 2020

:point_right: Check article proof :page_facing_up: :point_left:

whedon on 23 Jun 2020

@whedon generate pdf

karthik on 23 Jun 2020

👀1

:point_right: Check article proof :page_facing_up: :point_left:

whedon on 23 Jun 2020

Just touching base here, I'll review this article this week!

njtierney on 5 Jul 2020

👀1 👍1

Just touching base here, I'll review this article this week!

Thank you very much. I appreciate it.

MAnalytics on 5 Jul 2020

An update: I am in the process of review right now, but I have a few suggestions, collating these together will take me a bit longer than anticipated, I'll touch base later this week.

njtierney on 14 Jul 2020

👍2 👀1

An update: I am in the process of review right now, but I have a few suggestions, collating these together will take me a bit longer than anticipated, I'll touch base later this week.

@karthik @njtierney I am looking forward to this. Regards.

MAnalytics on 27 Jul 2020

👀1

Initial review of Akmedoids R package for generating directionally-homogeneous clusters of longitudinal data sets

Here are some general heuristic checks. Particularly problematic issues are using 1:nrow and importing full packages.

── Good Practice akmedoids 

It is good practice to

  ✖ write unit tests for all functions, and all package code
    in general. 0% of code lines are covered by test cases.

    R/akmedoids.clust.R:31:NA
    R/akmedoids.clust.R:35:NA
    R/akmedoids.clust.R:36:NA
    R/akmedoids.clust.R:40:NA
    R/akmedoids.clust.R:43:NA
    ... and 658 more lines

  ✖ use '<-' for assignment instead of '='. '<-' is the
    standard, and R users and developers are used it and it is easier
    to read your code for them if you use '<-'.

    R/akmedoids.clust.R:77:8
    R/akmedoids.clust.R:83:6
    R/akmedoids.clust.R:245:14
    R/akmedoids.clust.R:272:23
    R/akmedoids.clust.R:295:23
    ... and 22 more lines

  ✖ avoid long code lines, it is bad for readability. Also,
    many people prefer editor windows that are about 80 characters
    wide. Try make your lines shorter than 80 characters

    R/akmedoids.clust.R:3:1
    R/akmedoids.clust.R:4:1
    R/akmedoids.clust.R:5:1
    R/akmedoids.clust.R:6:1
    R/akmedoids.clust.R:7:1
    ... and 106 more lines

  ✖ avoid 1:length(...), 1:nrow(...), 1:ncol(...),
    1:NROW(...) and 1:NCOL(...) expressions. They are error prone and
    result 1:0 if the expression on the right hand side is zero. Use
    seq_len() or seq_along() instead.

    R/akmedoids.clust.R:77:10
    R/akmedoids.clust.R:81:22
    R/akmedoids.clust.R:89:34
    R/dataImputation.R:109:8
    R/outlierDetect.R:67:11
    ... and 16 more lines

  ✖ not import packages as a whole, as this can cause name
    clashes between the imported packages. Instead, import only the
    specific functions you need.
  ✖ avoid 'T' and 'F', as they are just variables which are
    set to the logicals 'TRUE' and 'FALSE' by default, but are not
    reserved words and hence can be overwritten by the user.  Hence,
    one should always use 'TRUE' and 'FALSE' for the logicals.

    R/statPrint.R:NA:NA

Please add a language field to your DESCRIPTION file
Please check your package for spelling mistakes. Not all of these are mistakes but I see a few.

Potential Spelling mistakes


DESCRIPTION does not contain 'Language' field. Defaulting to 'en-US'.
  WORD              FOUND IN
’Whitespaces’ wSpaces.Rd:19
                  akmedoids-vignette.Rmd:365
agentlans         elbowPoint.Rd:21
al                akmedoids.clust.Rd:39
                  statPrint.Rd:27,57
                  description:2
                  README.md:7
                  akmedoids-vignette.Rmd:358,428,526,567,605
alphaLabel        alphaLabel.Rd:19
analysing         README.md:3
Appl              akmedoids.clust.Rd:40
behavioural       akmedoids-vignette.Rmd:126
Calinski          akmedoids.clust.Rd:18,24,41
                  description:6
Caliński         README.md:104
categorise        statPrint.Rd:27
                  akmedoids-vignette.Rmd:536
centroids         akmedoids.clust.Rd:27
charater          statPrint.Rd:19
Commun            akmedoids.clust.Rd:41
complementarity   akmedoids-vignette.Rmd:504
Comput            akmedoids.clust.Rd:40
corresp           dataImputation.Rd:17,18,19,20,26
customised        statPrint.Rd:43
                  akmedoids-vignette.Rmd:578
denomin           rates.Rd:14
doi               description:2,7,10
et                akmedoids.clust.Rd:39
                  statPrint.Rd:27,57
                  description:2
                  README.md:7
                  akmedoids-vignette.Rmd:358,428,526,567,605
GBTM              akmedoids-vignette.Rmd:126
Genolini          akmedoids.clust.Rd:39
                  description:2
ggplot            description:9
github            elbowPoint.Rd:21
Harabasz          akmedoids.clust.Rd:24,41
                  README.md:104
Harabatz          akmedoids.clust.Rd:18
                  description:6
http              akmedoids.clust.Rd:39
imput             akmedoids-vignette.Rmd:194
initialisation    akmedoids.clust.Rd:14
                  akmedoids-vignette.Rmd:126
jss               description:2
jstatsoft         akmedoids.clust.Rd:39
judgement         akmedoids-vignette.Rmd:504
kmeans            README.md:7
                  akmedoids-vignette.Rmd:126
Kmedoids          title:1
kml               akmedoids.clust.Rd:39
knitr             akmedoids-vignette.Rmd:428
maximisation      akmedoids-vignette.Rmd:126
md                NEWS.md:2
medoid            statPrint.Rd:25,27
                  akmedoids-vignette.Rmd:526,534,536,567
minimised         akmedoids-vignette.Rmd:126
minimises         akmedoids-vignette.Rmd:126
modelling         akmedoids-vignette.Rmd:126
normalise         akmedoids-vignette.Rmd:262
optimality        akmedoids.clust.Rd:18
param             akmedoids.clust.Rd:21
png               akmedoids-vignette.Rmd:428
propotion         props.Rd:14
quartile          statPrint.Rd:27
recognise         akmedoids-vignette.Rmd:194
recognises        akmedoids.clust.Rd:12
                  dataImputation.Rd:12
                  outlierDetect.Rd:13
                  props.Rd:12
                  removeRowsN.Rd:12
                  statPrint.Rd:23
                  akmedoids-vignette.Rmd:163
removeRowNA       NEWS.md:13
Rousseeuw         akmedoids.clust.Rd:40
                  description:5
                  README.md:102
sep               akmedoids-vignette.Rmd:428
showplots         NEWS.md:12
spacebar          wSpaces.Rd:19
                  akmedoids-vignette.Rmd:365
statPrints        NEWS.md:12
traj              rates.Rd:12,14
vartheta          akmedoids-vignette.Rmd:526
ve                akmedoids-vignette.Rmd:567
victimisation     akmedoids-vignette.Rmd:126
visualisation     statPrint.Rd:43
                  description:8
                  akmedoids-vignette.Rmd:578
visualise         akmedoids-vignette.Rmd:362
Wickham           description:9
www               akmedoids.clust.Rd:39

Your tests are on a folder called test NOT tests but uses testthat infrastructure. How are you running the tests? Also your tests are not passing and barely cover any of the code. Please fix your tests and coverage before we can proceed.

karthik on 29 Jul 2020

👍1

@karthik Thanks for the efforts. I will address all these issues asap.

MAnalytics on 30 Jul 2020

👀1

Dear @karthik @njtierney

This is to inform you that I have now addressed all the issues raised above.
I know you must be very busy, especially during this time period, I will greatly appreciate it if you can promptly review these corrections so that we can proceed to the next stage.
Thank you very much.
MA

Summary of revisions:

Included unit tests for all function, and all package codes.
Imported only required library functions (instead of whole packages).
Change all assignment signs '=' to '<-'
Short code line (<80 characters wide) implemented for all functions
Used seq_len( ) function to prevent errors associated with 1:nrow( ), 1:length( ), ....
'TRUE' and 'FALSE' utilized instead of 'T' and 'F', respectively.
Language field included in the DESCRIPTION file
All spelling errors addressed.

MAnalytics on 19 Aug 2020

Some of these issues still remain. Here are a few examples.

https://github.com/MAnalytics/akmedoids/blob/master/R/akmedoids.clust.R#L153

https://github.com/MAnalytics/akmedoids/blob/master/R/akmedoids.clust.R#L167

https://github.com/MAnalytics/akmedoids/blob/master/R/dataImputation.R#L175

karthik on 19 Aug 2020

👍1

@karthik I really appreciate your prompt review.
The highlighted issues have now been resolved. Thanks, MA

MAnalytics on 19 Aug 2020

There are still several more of these issues. Please look through your code where you do 1:nrow or 1:ncol. If the dimensions are 0 you will get silent errors that will go uncaught.

https://github.com/MAnalytics/akmedoids/blob/master/R/statPrint.R#L126
https://github.com/MAnalytics/akmedoids/blob/master/R/statPrint.R#L210
https://github.com/MAnalytics/akmedoids/blob/master/R/statPrint.R#L261
https://github.com/MAnalytics/akmedoids/blob/master/R/statPrint.R#L321

karthik on 19 Aug 2020

👍1

@karthik I have gone through all the functions and corrected these issues accordingly.
Thanks, MA

MAnalytics on 19 Aug 2020

Running googpractice::gp() I found a few missed cases:

I note that 1:length is used here:

https://github.com/MAnalytics/akmedoids/blob/master/tests/testthat/test_rates.R#L19

FYI There are minor also some style changes that can be fixed

using = instead of <-
using T and F instead of TRUE and FALSE
using trailing ;

But I can resume my review from here, these are minor things. I would recommend running googpractice::gp()

njtierney on 20 Aug 2020

👍1

@njtierney Thank you very much for the suggestion of googpractice::gp( )
I have now used the package to fix all the issues.
Thanks.

MAnalytics on 20 Aug 2020

👀1

@njtierney @karthik All long code issues have also been fixed. In other words, all code lines are now 80 or less character wide.
I hope to hear from you soonest.
Thanks, MA

MAnalytics on 21 Aug 2020

👀1

@karthik @njtierney Please, can I have a rough idea of when the review will be completed? We are using this work to cite your journal in another publication.
Hope to hear from you soon.
MA

MAnalytics on 22 Aug 2020

@MAnalytics I appreciate your enthusiasm and desire to see this review completed asap. But please let us complete our review as we manage other commitments during the pandemic. We'll update the thread when ready. There is no need to reply constantly on this thread until action is needed. Thanks for your patience.

karthik on 22 Aug 2020

Hi all! I am the rotating associate editor in chief this week. I see that nothing has happened on this review for over a month. I'd like to check in and see where we are at. My brief perusal of this review issue looks like items have been brought up by reviewers @karthik and @njtierney and responded to be the author @MAnalytics. However, many review boxes remain unchecked. What is the next step in this review?

kthyng on 5 Oct 2020

👀1

Thanks for checking in @kthyng. It's been on our list to finish this. We'll complete the reviews this week. Appreciate the nudge. :pray:

karthik on 5 Oct 2020

👀1

@MAnalytics Can you please add these files

Non-standard files/directories found at top level:
    'cran-comments.md' 'paper.bib' 'paper.md'

to your ..Rbuildignore

Please fix devtools installation instructions in your vignette. Currently it is:

devtools::install_github("manalytics/packages/akmedoids")

(remove packages).

karthik on 21 Oct 2020

@whedon check references

karthik on 21 Oct 2020

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1007/s00180-009-0178-4 is OK
- 10.1007/s10940-014-9228-3 is OK
- 10.1111/j.1745-9125.2004.tb00541.x is OK
- 10.1007/s10940-016-9301-1 is OK
- 10.4324/9781315089256-19 is OK
- 10.1177/1088767909337701 is OK
- 10.1093/bjc/azx022 is OK
- 10.1016/0167-9473(92)90042-e is OK
- 10.1080/03610917408548446 is OK
- 10.1016/0377-0427(87)90125-7 is OK

MISSING DOIs

- None

INVALID DOIs

- None

whedon on 21 Oct 2020

Please also cite R in the text "A full demonstration is provided in the package vignette of how to deploy Akmedoids to examine long-term relative exposure to crime in R"

R citation: https://cran.r-project.org/doc/FAQ/R-FAQ.html#Citing-R

karthik on 21 Oct 2020

@MAnalytics Once these minor issues are fixed and @njtierney signs off, I'm good to finish up the final steps.

karthik on 21 Oct 2020

👍1

@karthik @njtierney Thanks.
I have now addressed these highlighted issues.
MA.

MAnalytics on 21 Oct 2020

👍1

@whedon check references

karthik on 21 Oct 2020

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1007/s00180-009-0178-4 is OK
- 10.1007/s10940-014-9228-3 is OK
- 10.1111/j.1745-9125.2004.tb00541.x is OK
- 10.1007/s10940-016-9301-1 is OK
- 10.4324/9781315089256-19 is OK
- 10.1177/1088767909337701 is OK
- 10.1093/bjc/azx022 is OK
- 10.1016/0167-9473(92)90042-e is OK
- 10.1080/03610917408548446 is OK
- 10.1016/0377-0427(87)90125-7 is OK

MISSING DOIs

- None

INVALID DOIs

- None

whedon on 21 Oct 2020

@njtierney Please, what step should I take next? Thanks.

MAnalytics on 23 Oct 2020

@MAnalytics I have checked in with Nick and he will respond in the next few days.

karthik on 27 Oct 2020

👍1

@karthik @njtierney Thank you very much. I look forward to it.

MAnalytics on 27 Oct 2020

Thank you for your patience everyone, here is my review.

I have a few concerns regarding the functionality of the software and how easy it is to use for a data analysis. I appreciate that perhaps these points might at first appearance appear to be superficial, however I think that making these improvements will make the user experience better, and improve the package overall - I myself would love to use some of this work in my longitudinal data analysis, however at the moment I think that there are some improvements that can be made to the code.

I am very happy to provide more pointers/advice/detail if you have any questions about my review or comments on the code.

Thank you again for your submission, I enjoyed reading it, and appreciate your strong effort in making this project open source.

Review checklist for [@njtierney]

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the repository url?
[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
[ ] Contribution and authorship: Has the submitting author ([@MAnalytics]) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
- @MAnalytics appears to have been a major contributor to the software, although there are a lot of commits from geoMADE, but this does not appear to be an account (https://github.com/geoMADE), so I assume that these commits are from @MAnalytics and are named as geoMADE somehow.

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[ ] Functionality: Have the functional claims of the software been confirmed?

One common theme I've seen in this package is that there is a lot of code being printed out to the console when functions are being run, and there is no way to silence this. For example:

```{r example-clust}
example_clust <- akmedoids.clust(all_traj, k=c(3,10), crit = "Silhouette")

[1] "solution of k = 3 determined!"
[1] "solution of k = 4 determined!"
[1] "solution of k = 5 determined!"
[1] "solution of k = 6 determined!"
[1] "solution of k = 7 determined!"
[1] "solution of k = 8 determined!"
[1] "solution of k = 9 determined!"
[1] "solution of k = 10 determined!"
[1] "Suggested optimal solution contains 3 clusters. See the plot for further examination!"

These print messages are useful, but there should be a `verbose` or `quiet` argument to suppress these. This information should be also stored in the output of the object. 

This function also opened a graphic window and plot. There is no way for the user to suppress this. The output should be stored in the object, and if you really want to show the plot to uses, have an option to print the plot when running the function, perhaps with an argument like `show_plot = TRUE`.

Reading the documentation of the package, the function returns different output based on the value of K. The documentation reads:

> #' @return If \code{k} is a vector of two numbers (see param.
#' \code{k} details above), the output is a graphical plot of
#' the quality scores of the cluster solutions. If \code{k} is
#' an exact integer number of clusters, the function returns
#' trajectory labels indicating the group membership of the
#' corresponding trajectory in the \code{traj} object.

I think that this makes the software harder to use, as there is an implied workflow that could be made simpler and easier to use. 

It seems then that the next step is to run the same code again but with `k` = 3. I am not sure why this needs to be rerun, shouldn't this be stored as the output? For example, a vector of classes for each value of `k`.

Running the function again with `k = 3`, I get the following:

```{r example-clust-3}
example_clust_3 <- akmedoids.clust(all_traj, k=c(3), crit = "Silhouette")
example_clust_3

Finally to summarise the information we run:

```{r}

get the optimal cluster solution

clustr <- akmedoids.clust(all_traj, k=3)
clustr <- as.vector(clustr$memberships)$alphabetic_Labels

get group statistics

descrStats <- statPrint(clustr, all_traj,
id_field = FALSE, showplots=FALSE)

print(descrStats)

I believe that this can be improved in a few ways.

Firstly, writing 

```r
clustr <- akmedoids.clust(all_traj, k=3)
clustr <- as.vector(clustr$memberships)$alphabetic_Labels

should be avoided as you are overwriting the object.

Secondly, the user should not need to perform special data manipulation (such as accessing a nested list with (as.vector(clustr$memberships)$alphabetic_Labels) in order to summarise the data.

This could be improved by writing a special print method for the software. This would mean that printing clustr out to the console, e.g.,

clustr <- akmedoids.clust(all_traj, k=3)
clustr

Outputting the clustr object in the console, it could return the information that statPrint returns. You can read more about creating custom print methods here - https://adv-r.hadley.nz/s3.html, https://blog.earo.me/2019/11/03/practical-guide-to-s3/.

It seems that there is a workflow implied:

Test multiple different values of k with akmedoids.clust
- explore the plots of the optimal solution based on a certain criterion.
- decide on ideal value of K
Run clustering for the optimally identified value of k
- using akmedoids.clust again
summarise this information using statPrint

I think that this could be improved in the following way.

Provide a function for exploring different values of K
Provide a function for identifying the cluster labels
Provide good print methods, so the output that is given to the console could be delivered via the print method.

I suggestion is the output of this function should contain possible memberships for

I also note that statPrint is noted as performing two tasks - (1) generating statistics of groups, and (2), creating plots.

In my opinion the output of statPrint and akmedoids.clust functions should be separated into two functions, or at least provide options for the plots to be turned off. If they are turned into functions, I recommend:

One function for identifying clusters or summary statistics that return a data frame output or a list.
And one function that focuses on creating graphics.

I have some other comments on the functions in this package:

akmedoids::alphaLabel()
- I'm not sure if the output of this needs to be a list, it could instead just be the vector of characters.
- Is it possible to provide an example of the use of this function?
akmedoids::dataImputation()
- see note in "common issues" about printing messages
akmedoids::elbowPoint()
- In example documentation, the function is applied to the plot (which is great!) but is not demonstrated in other contexts. I suggest providing an example outside of plotting as well.

akmedoids::outlierDetect()

This prints out several messages and objects, see below note on messages.

akmedoids::props()
- states output is a matrix, but is a data.frame
akmedoids::rates()
- This function doesn't return a matrix, instead returning a vector of common IDs. I am not sure what these mean, as their behaviour is not documented in the helpfile.
akmedoids::removeRowsN()
- Could the documentation clarify how is this different to na.omit()?
akmedoids::wSpaces()
- The example for this function doesn't contain whitespace characters
- Could the documentation explain how this is different to trimws function in R?
akmedoids::statPrint()
- It is very neat that this function can generate a nice looking ggplot, however it would be great if that plot was instead a separate function

Common issues

plots all open new graphics devices rather than using standard print methods, which can be problematic when trying to include them in documents
no way to silence console output - suggest adding a quiet or verbose option, and to use message when printing messages to the console.
- e.g., the rates, akmedoids.clust functions
[X] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Yes.

Documentation

[ ] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?

The statement in the README says:

identify clusters of trajectories with similar long-term linear trends over time, providing an improved cluster identification as compared with the classic kmeans algorithm

[X] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.

Yes

[ ] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).

Although the author provides instructions in the README, but these contain no output of the function. This should be updated to make the README a README.Rmd function. See usethis::use_readme_rmd() as an approach on doing this.

In addition, the README simulates some example data for use - this should be encapsulated as an example dataset in the package that can be loaded.

[ ] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
example documentation has unnecessary traj <- traj line in "examples"
The datasets are not described in the helpfiles, and state that they are matrices when they are data frames. This data should be documented more thoroughly (see https://r-pkgs.org/data.html#documenting-data for details on how to document data, and https://github.com/jforbes14/eechidna/blob/master/R/data.R for an example documentation of data).
[X] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?

There are tests

[X] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[X] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[X] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[X] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[X] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[X] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

njtierney on 30 Oct 2020

@njtierney Thank you very much for taking the time to provide the above reviews. I will address the issues highlighted asap. Thanks once again.

MAnalytics on 31 Oct 2020

Dear @njtierney (and @karthik),

We greatly appreciate you taking the time to review our paper.

We have addressed all the issues raised by @njtierney, and we hope that the paper will now be accepted for publication. Please, find our response to each of the issues below.

Kind Regards,

Monsuru (@MAnalytics)

#-----------------------------------

Thank you for your patience everyone, here is my review.

I have a few concerns regarding the functionality of the software and how easy it is to use for a data analysis. I appreciate that perhaps these points might at first appearance appear to be superficial, however I think that making these improvements will make the user experience better, and improve the package overall - I myself would love to use some of this work in my longitudinal data analysis, however at the moment I think that there are some improvements that can be made to the code.

I am very happy to provide more pointers/advice/detail if you have any questions about my review or comments on the code.

Thank you again for your submission, I enjoyed reading it, and appreciate your strong effort in making this project open source.

Review checklist for [@njtierney]

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the repository url?

[x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?

[ ] Contribution and authorship: Has the submitting author ([@MAnalytics]) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

## ------------------------------------------
1. Reviewer's comment:
## ------------------------------------------

@MAnalytics appears to have been a major contributor to the software, although there are a lot of commits from geoMADE, but this does not appear to be an account (https://github.com/geoMADE), so I assume that these commits are from @MAnalytics and are named as geoMADE somehow.

## ------------------------------------------
Response:
## ------------------------------------------

@MAnalytics is the same account as @geoMADE - the account name was changed some time ago.

## ------------------------------------------
2. Reviewer's comment:
## ------------------------------------------

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?

[ ] Functionality: Have the functional claims of the software been confirmed?

One common theme I've seen in this package is that there is a lot of code being printed out to the console when functions are being run, and there is no way to silence this. For example:
example_clust <- akmedoids.clust(all_traj, k=c(3,10), crit = "Silhouette")
returns
[1] "solution of k = 3 determined!"
[1] "solution of k = 4 determined!"
[1] "solution of k = 5 determined!"
[1] "solution of k = 6 determined!"
[1] "solution of k = 7 determined!"
[1] "solution of k = 8 determined!"
[1] "solution of k = 9 determined!"
[1] "solution of k = 10 determined!"
[1] "Suggested optimal solution contains 3 clusters. See the plot for further examination!"
These print messages are useful, but there should be a verbose or quiet argument to suppress these. This information should be also stored in the output of the object.

This function also opened a graphic window and plot. There is no way for the user to suppress this. The output should be stored in the object, and if you really want to show the plot to uses, have an option to print the plot when running the function, perhaps with an argument like show_plot = TRUE.

Reading the documentation of the package, the function returns different output based on the value of K. The documentation reads:

' @return If \code{k} is a vector of two numbers (see param.

' \code{k} details above), the output is a graphical plot of

' the quality scores of the cluster solutions. If \code{k} is

' an exact integer number of clusters, the function returns

' trajectory labels indicating the group membership of the

' corresponding trajectory in the \code{traj} object.

I think that this makes the software harder to use, as there is an implied workflow that could be made simpler and easier to use.

It seems then that the next step is to run the same code again but with k = 3. I am not sure why this needs to be rerun, shouldn't this be stored as the output? For example, a vector of classes for each value of k.

Running the function again with k = 3, I get the following:
example_clust_3 <- akmedoids.clust(all_traj, k=c(3), crit = "Silhouette")
example_clust_3
Finally to summarise the information we run:
#get the optimal cluster solution
clustr <- akmedoids.clust(all_traj, k=3)
clustr <- as.vector(clustr$memberships)$alphabetic_Labels

##get group statistics
descrStats <- statPrint(clustr, all_traj,
                       id_field = FALSE, showplots=FALSE)

print(descrStats)
I believe that this can be improved in a few ways.

Firstly, writing
clustr <- akmedoids.clust(all_traj, k=3)
clustr <- as.vector(clustr$memberships)$alphabetic_Labels
should be avoided as you are overwriting the object.

Secondly, the user should not need to perform special data manipulation (such as accessing a nested list with (as.vector(clustr$memberships)$alphabetic_Labels) in order to summarise the data.

This could be improved by writing a special print method for the software. This would mean that printing clustr out to the console, e.g.,
clustr <- akmedoids.clust(all_traj, k=3)
clustr
Outputting the clustr object in the console, it could return the information that statPrint returns. You can read more about creating custom print methods here - https://adv-r.hadley.nz/s3.html, https://blog.earo.me/2019/11/03/practical-guide-to-s3/.

It seems that there is a workflow implied:

Test multiple different values of k with akmedoids.clust

explore the plots of the optimal solution based on a certain criterion.

decide on ideal value of K

Run clustering for the optimally identified value of k

using akmedoids.clust again

summarise this information using statPrint

I think that this could be improved in the following way.

Provide a function for exploring different values of K

Provide a function for identifying the cluster labels

Provide good print methods, so the output that is given to the console could be delivered via the print method.

I suggestion is the output of this function should contain possible memberships for

I also note that statPrint is noted as performing two tasks - (1) generating statistics of groups, and (2), creating plots.

In my opinion the output of statPrint and akmedoids.clust functions should be separated into two functions, or at least provide options for the plots to be turned off. If they are turned into functions, I recommend:

One function for identifying clusters or summary statistics that return a data frame output or a list.

And one function that focuses on creating graphics.

## ------------------------------------------
Response:
## ------------------------------------------

Thank you very much for taking the time to explore these functions and suggesting ways to improve their functionalities.

First, in order to improve the usability of the package, we make the following changes to the akmedoids.clust and statPrint functions, and how they can be deployed.

Change of names: We changed the names of akmedoids.clust and statPrint functions to more intuitive names: akClustr and print_akStats (a special print function) functions, respectively.
Workflow: The above two functions can now be deployed to perform clustering and visualize the results, respectively, following two simple steps as follows:

(a) Use akClustr function to generate the solutions for different values of k. For example,

```

aksolution <- akClustr(trajectry, k = c(3,7), verbose = FALSE, quality_Plot=FALSE)

To print the results out to the console, a user can just type:

aksolution

(This will print the solutiond for all values of `k` and also indicates which amongst them is the optimal solution – i.e. `$optimal_k`).

As seen above, the function now includes `verbose` argument to suppress the messages during processing, and a `quality_Plot` argument to control the plot of quality criteria. The output of `akClustr` function is exported as object of class `akObject`. 

(***b***) Given an object of class `akObject` (as in the above), use `print_akStats` function to generate the properties (i.e. descriptive and change statistics) of any specified `k`-cluster solution (i.e. optimal or not). In the example above, the properties of the `k =4` solution for example can be printed as:

print_akStats(aksolution, k = 4, type="lines", y.scaling="fixed", show_Plots = TRUE)
``Note that the function now includesshow_Plots` argument for controlling the clusters plots.

## ------------------------------------------
3. Reviewer's comment:
## ------------------------------------------

I have some other comments on the functions in this package:

akmedoids::alphaLabel()

I'm not sure if the output of this needs to be a list, it could instead just be the vector of characters.

Is it possible to provide an example of the use of this function?

## ------------------------------------------
Response:
## ------------------------------------------

The output is actually a vector of characters.

The goal of akmedoids::alphaLabel() function is to allow the transformation of any numeric cluster labels (particularly of external packages, such as kml) into alphabetic labels. The goal is to allow a comparison of results from such packages (e.g. kml) with the results from our own package. We have now included an @example usage of this type of scenario in the alphaLabel() function, by showing how the numeric cluster labels of kml::affectIndiv() function can be transformed into alphabetic labels. We deployed a real-life Time-at-risk for the adjudicated Toronto Youth datasets (Nielsen, J. 2018).

The alphaLabel() function was originally used within our akClustr function (see line 331 of akClustr function).

## ------------------------------------------
4. Reviewer's comment:
## ------------------------------------------

akmedoids::dataImputation()

see note in "common issues" about printing messages

## ------------------------------------------
Response:
## ------------------------------------------

A verbose argument has now been added to the akmedoids::dataImputation() function to control the output messages.

## ------------------------------------------
5. Reviewer's comment:
## ------------------------------------------

akmedoids::elbowPoint()

In example documentation, the function is applied to the plot (which is great!) but is not demonstrated in other contexts. I suggest providing an example outside of plotting as well.

## ------------------------------------------
Response:
## ------------------------------------------

We have now modified the @example of the elbowPoint() function by using a real-life data set (i.e. Time-at-risk for the adjudicated Toronto Youth datasets). We deploy akClustr function to demonstrate how the elbowPoint() function helps the Silhouette criterion to determine the optimality of k values.

## ------------------------------------------
6. Reviewer's comment:
## ------------------------------------------

akmedoids::outlierDetect()

This prints out several messages and objects, see below note on messages.

## ------------------------------------------
Response:
## ------------------------------------------
A verbose argument is added to control the print out messages.

## ------------------------------------------
7. Reviewer's comment:
## ------------------------------------------

akmedoids::props()

states output is a matrix, but is a data.frame

## ------------------------------------------
Response:
## ------------------------------------------

Thanks. Corrected accordingly.

## ------------------------------------------
8. Reviewer's comment:
## ------------------------------------------

akmedoids::rates()

This function doesn't return a matrix, instead returning a vector of common IDs. I am not sure what these mean, as their behaviour is not documented in the helpfile.

## ------------------------------------------
Response:
## ------------------------------------------

In our checks the function seems to be working fine. Please kindly see the @example provided under the function.

## ------------------------------------------
9. Reviewer's comment:
## ------------------------------------------

akmedoids::removeRowsN()

Could the documentation clarify how is this different to na.omit()?

## ------------------------------------------
Response:
## ------------------------------------------

We have now added the following statement in the description of the removeRowsN() function:

"... . The function is also able to remove records withInfentries, distinguishing it from the popularna.omit()function in R".
>
## ------------------------------------------
10. Reviewer's comment:
## ------------------------------------------*
>

akmedoids::wSpaces()

The example for this function doesn't contain whitespace characters

Could the documentation explain how this is different to trimws function in R?

## ------------------------------------------
Response:
## ------------------------------------------
We have added a new sample data traj_wSpaces.rda which contains whitespaces in order to demonstrate the functionality of this function. Please, see the @example. Note: the wSpaces() function is found to perform the same task as the trimws() function.

## ------------------------------------------
11. Reviewer's comment:
## ------------------------------------------

akmedoids::statPrint()

It is very neat that this function can generate a nice looking ggplot, however it would be great if that plot was instead a separate function

## ------------------------------------------
Response:
## ------------------------------------------
Thanks for the comment. Please, see above. Please, also note that akmedoids::statPrint() has now been changed to akmedoids::print_akStats()

## ------------------------------------------
12. Reviewer's comment:
## ------------------------------------------

Common issues

plots all open new graphics devices rather than using standard print methods, which can be problematic when trying to include them in documents

no way to silence console output - suggest adding a quiet or verbose option, and to use message when printing messages to the console.

e.g., the rates, akmedoids.clust functions

## ------------------------------------------
Response:
## ------------------------------------------
The Common issues have all been addressed (please, see above)

[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Yes.

Documentation

[ ] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?

The statement in the README says:

identify clusters of trajectories with similar long-term linear trends over time, providing an improved cluster identification as compared with the classic kmeans algorithm

[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.

Yes

## ------------------------------------------
13. Reviewer's comment:
## ------------------------------------------

[ ] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).

Although the author provides instructions in the README, but these contain no output of the function. This should be updated to make the README a README.Rmd function. See usethis::use_readme_rmd() as an approach on doing this.

In addition, the README simulates some example data for use - this should be encapsulated as an example dataset in the package that can be loaded.

## ------------------------------------------
Response:
## ------------------------------------------

Thank you for the suggestion.

We have now provided a README.Rmd, addressing the concerns above. We also update the package vignette accordingly. Specifically, we detailed the following in the README.Rmd document:

Installation instructions
- Sample codes for simulating example trajectory groups,
- Codes to visualize simulated trajectory groups.
- Code to cluster the groups, print cluster properties, and plot the clusters.

## ------------------------------------------
14. Reviewer's comment:
## ------------------------------------------

[ ] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?

example documentation has unnecessary traj <- traj line in "examples"

The datasets are not described in the helpfiles, and state that they are matrices when they are data frames. This data should be documented more thoroughly (see https://r-pkgs.org/data.html#documenting-data for details on how to document data, and https://github.com/jforbes14/eechidna/blob/master/R/data.R for an example documentation of data).

## ------------------------------------------
Response:
## ------------------------------------------

Unnecessary lines have been removed.

We have now followed the example in https://github.com/jforbes14/eechidna/blob/master/R/data.R and provide detailed documentation of all the sample data sets included in the package. Please, see ./R/data.R

Other additional revision to note:

We renamed population.rda as popl.rda as there is a clash with the population.rda in one of the dependency packages.

[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?

There are tests

[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?

[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?

[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?

[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

MAnalytics on 16 Nov 2020

Thanks for your reponse, @MAnalytics !

I will reply shortly.

njtierney on 16 Nov 2020

❤1 👍1

@njtierney Ok, Thanks

MAnalytics on 16 Nov 2020

review response

Thank you for your response! I'm looking forward to using this in my own work.

I greatly appreciate your effort in addressing my points, for readability, I will address the relevant points freshly below:

Function arguments should have consistent names

This package uses camelCase for it's functions names, (akClustr, alphaLabel, etc)
Some of the function arguments use mixed names, such as the function, print_akStats has the arguments, "akObject", "k", "reference", "N.quant",, "show_Plots", "type", "y.scaling". These arguments should be either
- all camelCase (akObject, yScaling, nQuant), or
- all lower case separated by an underscore (e.g., ak_object, y_scaling)
It should not be a mix of changing case and separator, as this is hard for the user to predict (e.g., N.quant, show_Plots)

Print methods

This isn't required to be addressed for the review, but it might be helpful for your own work - your function, print_akStats can be written in a way so that when you print an object created by akClustr, it will do the nice printing when output into the console. Again, not needed for this review, but worth knowing about.

Separate functions for plotting

The plot methods currently open up a new device when plots are shown, from this code:

  if(show_Plots==TRUE){
    #plot
    flush.console()
    dev.new(width=3, height=3)
    print(plt)
  }

This code could instead be:

  if(show_Plots==TRUE){
    print(plt)
  }

as there isn't a need to flush the console or create a specific device

I also recommend that this plotting function is separated, perhaps as a plot_akStats function. This can then be called inside your print method if you prefer, but having just the function to plot the information would be really nice.

regarding `akmedoids::rates()`

I stated in my previous review:

This function doesn't return a matrix, instead returning a vector of common IDs. I am not sure what these mean, as their behaviour is not documented in the helpfile.

Your response:

In our checks the function seems to be working fine. Please kindly see the @example provided under the function.

When I run the example code, I get this output:

$common_ids
[1] "E01012628" "E01004768" "E01004803" "E01004804" "E01004807" "E01004808"
[7] "E01004788" "E01004790" "E01004805"

$noncommon_ids_from_trajectories
[1] "E01004806"

$noncommon_ids_from_denominator
[1] "E01004809" "E01004899"

$rates_data
  location_ids X2001 X2002 X2003 X2004 X2005 X2006 X2007 X2008 X2009
1    E01012628  3.60  2.86  1.00  2.29  4.67  4.00  1.00  4.00  2.00
2    E01004768  4.00  6.40  3.63  2.29  2.00  3.56  0.80  2.18  0.00
3    E01004803  2.86  2.48  4.53  3.35  3.75  3.30  3.29  4.34  2.00
4    E01004804  2.29  1.60  0.00  4.71  0.89  1.26  2.40  2.29  2.91
5    E01004807  5.09  1.92  5.14  1.55  0.94  1.69  2.40  1.12  0.70
6    E01004808  3.76  2.42  5.00  3.61  4.80  2.21  3.09  3.56  1.85
7    E01004788  0.67  1.36  1.82  1.90  2.40  1.84  1.33  2.35  2.00
8    E01004790  6.40  3.64  5.33  2.46  0.57  2.67  3.00  0.47  0.44
9    E01004805  6.00  0.00  2.00  4.00  2.00  0.00  2.00  8.00  0.00

In the documentation, can you describe what, $common_ids $noncommon_ids_from_trajectories, and $noncommon_ids_from_denominator are?

Regarding the README.Rmd

A very minor thing - the README.Rmd file uses html_document instead of github_document (commented out here) - this should then generate the R output that can be seen in the README.md
Not necessary for this review at all, but the authors might also be interested in looking at the pkgdown R package for automatically generating websites for R packages. See https://pkgdown.r-lib.org/ for more details.

njtierney on 17 Nov 2020

👍1

@njtierney, Thank you very much for your efforts. I will address these issues asap.

MAnalytics on 18 Nov 2020

Dear @njtierney (and @karthik)

We thank you once again for your efforts. Please, find below our responses.

@MAnalytics

--------------------------------------------------------------------------------

review response

Thank you for your response! I'm looking forward to using this in my own work.

I greatly appreciate your effort in addressing my points, for readability, I will address the relevant points freshly below:

----------------------------------------------------------------------

1. Reviewer's comment:

----------------------------------------------------------------------

Function arguments should have consistent names

This package uses camelCase for it's functions names, (akClustr, alphaLabel, etc)

Some of the function arguments use mixed names, such as the function, print_akStats has the arguments, "akObject", "k", "reference", "N.quant",, "show_Plots", "type", "y.scaling". These arguments should be either

all camelCase (akObject, yScaling, nQuant), or

all lower case separated by an underscore (e.g., ak_object, y_scaling)

It should not be a mix of changing case and separator, as this is hard for the user to predict (e.g., N.quant, show_Plots)

----------------------------------------------------------------------

Response:

----------------------------------------------------------------------

Thank you for the observations. We have now decided to use lowercase in all the cases, i.e. both the names and arguments of the functions are now changed to lowercases.
>

----------------------------------------------------------------------

2. Reviewer's comment:

----------------------------------------------------------------------

Print methods

This isn't required to be addressed for the review, but it might be helpful for your own work - your function, print_akStats can be written in a way so that when you print an object created by akClustr, it will do the nice printing when output into the console. Again, not needed for this review, but worth knowing about.

----------------------------------------------------------------------

Response:

----------------------------------------------------------------------

Thank you very much for your suggestion. We will explore the print functionalities later.

----------------------------------------------------------------------

3. Reviewer's comment:

----------------------------------------------------------------------

Separate functions for plotting

The plot methods currently open up a new device when plots are shown, from this code:
  if(show_Plots==TRUE){
    #plot
    flush.console()
    dev.new(width=3, height=3)
    print(plt)
  }
This code could instead be:
  if(show_Plots==TRUE){
    print(plt)
  }
as there isn't a need to flush the console or create a specific device

I also recommend that this plotting function is separated, perhaps as a plot_akStats function. This can then be called inside your print method if you prefer, but having just the function to plot the information would be really nice.

----------------------------------------------------------------------

Response:

----------------------------------------------------------------------

>
The plot argument has been modified accordingly.

As suggested, we have now added a plot_akstats function for plotting. This function is called inside the print_akstats function to provide a limited plot functionalities. We then recommended (see @param show_plots) that users deploy plot_akstats for more plot options. The vignette, README.Rmd, test_units, etc. are modified accordingly.

----------------------------------------------------------------------

4. Reviewer's comment:

----------------------------------------------------------------------

regarding akmedoids::rates()

I stated in my previous review:

This function doesn't return a matrix, instead returning a vector of common IDs. I am not sure what these mean, as their behaviour is not documented in the helpfile.

Your response:

In our checks the function seems to be working fine. Please kindly see the @example provided under the function.

When I run the example code, I get this output:
$common_ids
[1] "E01012628" "E01004768" "E01004803" "E01004804" "E01004807" "E01004808"
[7] "E01004788" "E01004790" "E01004805"

$noncommon_ids_from_trajectories
[1] "E01004806"

$noncommon_ids_from_denominator
[1] "E01004809" "E01004899"

$rates_data
  location_ids X2001 X2002 X2003 X2004 X2005 X2006 X2007 X2008 X2009
1    E01012628  3.60  2.86  1.00  2.29  4.67  4.00  1.00  4.00  2.00
2    E01004768  4.00  6.40  3.63  2.29  2.00  3.56  0.80  2.18  0.00
3    E01004803  2.86  2.48  4.53  3.35  3.75  3.30  3.29  4.34  2.00
4    E01004804  2.29  1.60  0.00  4.71  0.89  1.26  2.40  2.29  2.91
5    E01004807  5.09  1.92  5.14  1.55  0.94  1.69  2.40  1.12  0.70
6    E01004808  3.76  2.42  5.00  3.61  4.80  2.21  3.09  3.56  1.85
7    E01004788  0.67  1.36  1.82  1.90  2.40  1.84  1.33  2.35  2.00
8    E01004790  6.40  3.64  5.33  2.46  0.57  2.67  3.00  0.47  0.44
9    E01004805  6.00  0.00  2.00  4.00  2.00  0.00  2.00  8.00  0.00
In the documentation, can you describe what, $common_ids $noncommon_ids_from_trajectories, and $noncommon_ids_from_denominator are?

----------------------------------------------------------------------

Response:

----------------------------------------------------------------------

>
We appreciate your very keen observation. We have now dealt with this issue.

First, we change these output names to more meaningful ones, i.e.

$common_ids remains as $common_ids
$noncommon_ids_from_trajectories changed to $ids_unique_to_traj_data
$noncommon_ids_from_denominator changed to $ids_unique_to_denom_data
$rates_data changed to $rates_estimates

Second, we then provide the description of each variable (see under @return section)

----------------------------------------------------------------------

5. Reviewer's comment:

----------------------------------------------------------------------

Regarding the README.Rmd

A very minor thing - the README.Rmd file uses html_document instead of github_document (commented out here) - this should then generate the R output that can be seen in the README.md

Not necessary for this review at all, but the authors might also be interested in looking at the pkgdown R package for automatically generating websites for R packages. See https://pkgdown.r-lib.org/ for more details.

----------------------------------------------------------------------

Response:

----------------------------------------------------------------------

>
The README.md now renders nicely in the browser. Please, see here: https://github.com/MAnalytics/akmedoids/blob/master/README.md

Thank you very much for the recommendation of pkgdown material. We will look into this for our package.

MAnalytics on 18 Nov 2020

Was this page helpful?

0 / 5 - 0 ratings

Related issues

[REVIEW]: GRUPO: Gauging Research University Publication Output

whedon · 10Comments

[PRE REVIEW]: Kindel: indel-aware consensus for nucleotide sequence alignments

whedon · 12Comments

[REVIEW]: J2suscep: Calculation of magnetic exchange coupling and temperature dependence of magnetic susceptibility

whedon · 8Comments

[REVIEW]: IBCAO_py: A matplotlib library for using the International Bathymetric Chart of the Arctic Ocean with cartopy and matplotlib

whedon · 12Comments

[REVIEW]: Noisyopt

whedon · 6Comments

Joss-reviews: [REVIEW]: Akmedoids R package for generating directionally-homogeneous clusters of longitudinal data sets

Status

Reviewer instructions & questions

Review checklist for @karthik

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @njtierney

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Most helpful comment

All 42 comments

Initial review of Akmedoids R package for generating directionally-homogeneous clusters of longitudinal data sets

Review checklist for [@njtierney]

Conflict of interest

Code of Conduct

General checks

Functionality

get the optimal cluster solution

get group statistics

Common issues

Documentation

Software paper

Review checklist for [@njtierney]

Conflict of interest

Code of Conduct

General checks

Functionality

' @return If \code{k} is a vector of two numbers (see param.

' \code{k} details above), the output is a graphical plot of

' the quality scores of the cluster solutions. If \code{k} is

' an exact integer number of clusters, the function returns

' trajectory labels indicating the group membership of the

' corresponding trajectory in the \code{traj} object.

Common issues

Documentation

Software paper

review response

Function arguments should have consistent names

Print methods

Separate functions for plotting

regarding akmedoids::rates()

Regarding the README.Rmd

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------

review response

----------------------------------------------------------------------

1. Reviewer's comment:

----------------------------------------------------------------------

Function arguments should have consistent names

----------------------------------------------------------------------

Response:

----------------------------------------------------------------------

----------------------------------------------------------------------

2. Reviewer's comment:

----------------------------------------------------------------------

Print methods

----------------------------------------------------------------------

Response:

----------------------------------------------------------------------

----------------------------------------------------------------------

3. Reviewer's comment:

----------------------------------------------------------------------

Separate functions for plotting

----------------------------------------------------------------------

Response:

----------------------------------------------------------------------

----------------------------------------------------------------------

4. Reviewer's comment:

----------------------------------------------------------------------

regarding akmedoids::rates()

----------------------------------------------------------------------

Response:

regarding `akmedoids::rates()`

regarding `akmedoids::rates()`