Submitting author: @dmey (D. Meyer)
Repository: https://github.com/dmey/synthia
Version: 1.0.0
Editor: @oliviaguest
Reviewer: Pending
Managing EiC: Kyle Niemeyer
:warning: JOSS reduced service mode :warning:
Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.
Author instructions
Thanks for submitting your paper to JOSS @dmey. Currently, there isn't a JOSS editor assigned to your paper.
The author's suggestion for the handling editor is @arfon.
@dmey if you have any suggestions for potential reviewers then please mention them here in this thread (without tagging them with an @). In addition, the people on this list have already agreed to review for JOSS and may be suitable for this submission (please start at the bottom of the list).
Editor instructions
The JOSS submission bot @whedon is here to help you find and assign reviewers and start the main review. To find out what @whedon can do for you type:
@whedon commands
Hello human, I'm @whedon, a robot that can help you with some common editorial tasks.
For a list of things I can do to help you, just type:
@whedon commands
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
@whedon generate pdf
Software report (experimental):
github.com/AlDanial/cloc v 1.84  T=0.12 s (354.6 files/s, 60453.2 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
SVG                              1              0              0           4607
Python                          20            316            361            839
Markdown                         7             89              0            115
Jupyter Notebook                 4              0            390             81
YAML                             3             11              5             71
CSS                              1              7              7             61
TeX                              1              5              0             47
reStructuredText                 3             38             66             41
INI                              1              0              0              2
HTML                             1              0              0              2
-------------------------------------------------------------------------------
SUM:                            42            466            829           5866
-------------------------------------------------------------------------------
Statistical information for the repository '40a53db89b90e75a2c9bfb3d' was
gathered on 2020/10/24.
The following historical commit information, by author, was found:
Author           Commits  Insertions  Deletions  % of changes
Thomas Nagler          2         137         34          7.47
dmey                   9        1766        353         92.53
Below are the number of rows from each author that have survived and are still
intact in the current revision:
Author           Rows  Stability  Age  % in comments
Thomas Nagler      95       69.3  0.0          28.42
dmey             1421       80.5  0.0           9.85
PDF failed to compile for issue #2779 with the following error:
Can't find any papers to compile :-(
@whedon generate pdf from branch joss-paper
Attempting PDF compilation from custom branch joss-paper. Reticulating splines etc...
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@whedon query scope
Submission flagged for editorial review.
Hi @dmey, thanks for your submission to JOSS. Due to the relatively small size of your software package, the editorial board is going to take a closer look at whether it falls within our scope.
Hi @kyleniemeyer, many thanks for letting me know. In case this is relevant to the board, this package has been used in two papers (currently in preparation) which I am planning to submit in the next few weeks. Furthermore, the tool is novel in its approach, well written, and likely to be cited by future machine learning (ML) groups.
@kyleniemeyer just as a clarification to my previous message: as I am going to upload the scientific papers that make use of/cite Synthia to arXiv in a couple of weeks while their peer review takes place, I can update this thread with links to those papers. Originally, I thought that this was going to be discussed during review, but I am more than happy to wait here if that will make it easier to show the novelty and contribution of this tool to the community.
I'm having a look at the paper regarding the scope query requested by @kyleniemeyer. Is it normal that the paper is extremely short? The GitHub PDF only contains "Summary" and "Acknowledgments" sections. It seems rather incomplete, and I wonder if this is an unintentional mistake.
@VivianePons thanks for looking into this. My understanding is that the summary paper needs to be very short (abstract-like), as it is meant only as a summary of the motivation and purpose of the tool, and because the purpose of the review is to review the software rather than the paper, as is done in more traditional journals. I have checked again at https://joss.readthedocs.io/en/latest/submitting.html and it says that the summary paper should be between 250 and 1000 words. I am more than happy to extend it, especially given that my first draft was much longer and I cut it down considerably at submission to make it more to the point.
Indeed, papers are rather short but they are still a bit more furnished. Look at our example paper: https://joss.readthedocs.io/en/latest/submitting.html#example-paper-and-bibliography
In particular, papers should contain a "Statement of need" section, which is missing in your case. You can also have other sections such as "Features" and "Examples".
You can browse through our recent publications to give you an idea.
@VivianePons many thanks for clarifying this, please allow me to make the necessary changes as advised.
@whedon check references from branch joss-paper
Attempting to check references... from custom branch joss-paper
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.1201/b17116 is OK
- 10.1109/DSAA.2016.49 is OK
MISSING DOIs
- None
INVALID DOIs
- None
In particular, I would like to understand what your software adds specifically in terms of implementation. Considering the small amount of code, we might fear that it is mainly a python wrapper to some other tools like vinecopulib. Could you give us some information regarding this aspect?
Could you give us some information regarding this aspect?
@dmey - could you elaborate? This will help us make our editorial scope decision.
@arfon -- may I give you my response by early next week?
No problem!
@VivianePons apologies for the delay, but I have had no time to look at this yet. Could I get back to you sometime in the next week? Thanks.
@whedon generate pdf from branch joss-paper
Attempting PDF compilation from custom branch joss-paper. Reticulating splines etc...
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@arfon @VivianePons @kyleniemeyer and @danielskatz many thanks for allowing me to get back to you this week. We have recently extended the documentation, added more examples, and reworded the paper to address what I think were your main concerns. In the last two releases we have also added a couple of new features, namely the handling of discrete and categorical data, which brings the number of lines of pure Python code to 1097 (please see the cloc output below).
With regards to your individual questions --
In particular, I would like to understand what your software adds specifically in terms of implementation.
@VivianePons thanks for raising this -- looking at the paper and repository with fresh eyes, I can see how this was unclear. I have now made changes to the repository, paper and website and hope that they make the purpose clearer. With regards to your specific question, Synthia can currently be used to model univariate and multivariate data, parameterize marginals with empirical and parametric methods, and apply manipulations such as stretching and uniformization (I have added a summary at https://dmey.github.io/synthia/features.html). For multivariate data we support three types of methods: fPCA, parametric (Gaussian) copula, and vine copula models. We provide a pure Python implementation of the first two and rely on vinecopulib for the third. Recently we have also added the capability to handle discrete and categorical data when using vine copulas.
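To make the Gaussian copula idea with empirical marginals concrete, here is a minimal NumPy/SciPy sketch. It is only an illustration under stated assumptions and is not Synthia's implementation or API: the function names `fit_gaussian_copula` and `sample_gaussian_copula` are hypothetical, and the marginals are handled with plain ranks and empirical quantiles.

```python
import numpy as np
from scipy import stats


def fit_gaussian_copula(data):
    """Estimate a Gaussian copula correlation matrix.

    `data` has shape (n_samples, n_features). Hypothetical helper,
    not part of any library.
    """
    n = data.shape[0]
    # Ranks per column, scaled into the open interval (0, 1).
    ranks = np.apply_along_axis(stats.rankdata, 0, data)
    u = ranks / (n + 1)
    # Map the pseudo-observations to standard normal scores.
    z = stats.norm.ppf(u)
    return np.corrcoef(z, rowvar=False)


def sample_gaussian_copula(data, corr, n_samples, rng):
    """Draw new samples using empirical marginals taken from `data`."""
    d = data.shape[1]
    # Correlated standard normal draws carry the dependence structure.
    z = rng.multivariate_normal(np.zeros(d), corr, size=n_samples)
    u = stats.norm.cdf(z)
    # Invert each marginal with the empirical quantile function.
    columns = [np.quantile(data[:, j], u[:, j]) for j in range(d)]
    return np.column_stack(columns)


rng = np.random.default_rng(0)
x = rng.normal(size=500)
data = np.column_stack([x, 2 * x + rng.normal(scale=0.5, size=500)])
corr = fit_gaussian_copula(data)
synthetic = sample_gaussian_copula(data, corr, 200, rng)
```

Because the marginals are empirical, the generated values stay within the range of the original data; a parametric marginal would be needed to extrapolate beyond it.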
Considering the small amount of code, we might fear that it is mainly a python wrapper to some other tools like vinecopulib. Could you give us some information regarding this aspect?
We have tried to write Synthia succinctly: according to the cloc tool, it currently has 1097 lines of pure Python code (see below). The use of vinecopulib is important, but it is not a required dependency, and it is marked as optional in our installation instructions (see https://dmey.github.io/synthia/installation.html). The amount of code that corresponds to the integration with vinecopulib is very small, about 20-30 lines. Furthermore, although vinecopulib does play an important role in Synthia, its purpose is limited to the generation of vines, not data generation in general.
Synthia presents a new method for generating multidimensional data in Python using fPCA together with Gaussian and vine copula models, natively handles multidimensional arrays and datasets (essential in computational sciences), and brings the parametrization and manipulation of univariate distributions into a single tool. I therefore believe the paper is within scope.
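As a rough illustration of how an optional dependency can be handled in Python, the sketch below checks whether a backend is installed before exposing the feature that needs it. This is an assumption-laden example, not Synthia's actual code: the helper names, the model labels, and the module name `pyvinecopulib` used in the check are illustrative only.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module called `name` can be imported."""
    return importlib.util.find_spec(name) is not None


# Assumption: the optional backend would be importable under this name.
HAS_VINE_BACKEND = has_module("pyvinecopulib")


def available_models():
    """List model types (hypothetical labels) usable in this environment."""
    models = ["fpca", "gaussian_copula"]  # no optional dependency needed
    if HAS_VINE_BACKEND:
        models.append("vine_copula")  # only offered when the backend exists
    return models


print(available_models())
```

With this pattern the core functionality keeps working when the optional package is absent, and only the feature that depends on it is hidden.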
The scope of the journal (https://joss.readthedocs.io/en/latest/submitting.html) indicates that [our bold]:
JOSS publishes articles about research software. This definition includes software that: solves complex modeling problems in a scientific context (physics, mathematics, biology, medicine, social science, neuroscience, engineering); supports the functioning of research instruments or the execution of research experiments; extracts knowledge from large data sets; offers a mathematical library, or similar.
JOSS publishes articles about software that represent substantial scholarly effort on the part of the authors. Your software should be a significant contribution to the available open source software that either enables some new research challenges to be addressed or makes addressing research challenges significantly better (e.g., faster, easier, simpler)
I cited a paper which is going to be submitted in the next 10 days. I will let you know as soon as it has been deposited so that I can update the reference. Apologies for the long text, but I thought it would be best to address everything in one long comment.
As a side note, I think there is a small issue with typesetting the figures in the paper (Table 1). Would it be possible to reduce the text size or change the width by a little so that the code blocks display as one-liners? Otherwise I could move them to a different layout.
Output from the cloc command (local run, commit id: 0da044afc3c6d7bad0b60f54dcf21ba2fb6374be).
54 text files.
54 unique files.
21 files ignored.
github.com/AlDanial/cloc v 1.74  T=0.52 s (69.8 files/s, 4497.8 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          21            364            369           1097
Markdown                         9            117              0            206
YAML                             3             11              5             71
CSS                              1              7              7             61
INI                              1              0              0              2
HTML                             1              0              0              2
-------------------------------------------------------------------------------
SUM:                            36            499            381           1439
-------------------------------------------------------------------------------
@whedon check repository
Software report (experimental):
github.com/AlDanial/cloc v 1.84  T=0.10 s (498.5 files/s, 83480.9 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
SVG                              1              0              0           4607
Python                          21            365            370           1102
Markdown                        10            112              0            198
Jupyter Notebook                 7              0            911            177
YAML                             3             11              5             71
CSS                              1              7              7             61
TeX                              1              5              0             47
reStructuredText                 3             37             68             40
INI                              1              0              0              2
HTML                             1              0              0              2
-------------------------------------------------------------------------------
SUM:                            49            537           1361           6307
-------------------------------------------------------------------------------
Statistical information for the repository '1f383df63cf604807d3377a9' was
gathered on 2020/11/16.
The following historical commit information, by author, was found:
Author           Commits  Insertions  Deletions  % of changes
Maik Riechert          2         242         37         10.05
Thomas Nagler          2         137         34          6.16
dmey                  19        1927        398         83.78
Below are the number of rows from each author that have survived and are still
intact in the current revision:
Author           Rows  Stability  Age  % in comments
Maik Riechert     241       99.6  0.1           2.90
Thomas Nagler      87       63.5  0.8          24.14
dmey             1509       78.3  0.0           8.88
@openjournals/dev - any comments on this question from the author:
As a side note, I think there is a small issue with typesetting the figures in the paper (Table 1). Would it be possible to reduce the text size or change the width by a little so that the code blocks display as one-liners? Otherwise I could move them to a different layout.
👋 @oliviaguest - would you be willing to edit this for JOSS?
@whedon invite @oliviaguest as editor
@oliviaguest has been invited to edit this submission.
I am really inundated with work at the moment, so on the proviso I can start (looking for reviewers, etc.) next week, sure. ☺️
Sure, that's fine!
@whedon assign @oliviaguest as editor
OK, the editor is @oliviaguest