Joss-reviews: [REVIEW]: Sentiment Analysis of Twitter Data (SAoTD)

Created on 4 Jun 2018 · 58 Comments · Source: openjournals/joss-reviews

Submitting author: @evan-l-munson (Evan Munson)
Repository: https://github.com/evan-l-munson/SAoTD
Version: 0.2.0
Editor: @arfon
Reviewer: @kbenoit
Archive: 10.5281/zenodo.2578973

Status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/e6002792b44f50039afc22dbe3d4a086"><img src="http://joss.theoj.org/papers/e6002792b44f50039afc22dbe3d4a086/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/e6002792b44f50039afc22dbe3d4a086/status.svg)](http://joss.theoj.org/papers/e6002792b44f50039afc22dbe3d4a086)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@kbenoit, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.theoj.org/about#reviewer_guidelines. Any questions/concerns please let @leeper know.

Review checklist for @kbenoit

Conflict of interest

Code of Conduct

General checks

  • [x] Repository: Is the source code for this software available at the repository url?
  • [x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
    No, but many R packages using standard licenses only name the license in the DESCRIPTION file. The README.md states "All code is licensed GL"; this should be "GPL-x".
  • [ ] Version: 0.2.0
    No: GitHub release is 6 commits behind the master. Master has v1.0.0.
  • [x] Authorship: Has the submitting author (@evan-l-munson) made major contributions to the software?
    Yes, apparently all.
  • [x] Does the full list of paper authors seem appropriate and complete?
    More than complete, as the 2nd through last author have an unknown contribution to the software. They have not made any GitHub commits, and there is no record of them having authored any of the functions (for instance via @author).

Functionality

  • [x] Installation: Does installation proceed as outlined in the documentation?
  • [x] Functionality: Have the functional claims of the software been confirmed?
  • [x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • [x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • [x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • [x] Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • [ ] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • [x] Authors: Does the paper.md file include a list of authors with their affiliations?
  • [ ] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
    No, the paper focuses more on what the package does than providing an explicit statement of need. This could be made more clear (and at little cost or effort).
  • [x] References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?
accepted published recommend-accept review

All 58 comments

Hello human, I'm @whedon. I'm here to help you with some common editorial tasks. @kbenoit it looks like you're currently assigned as the reviewer for this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' at https://github.com/openjournals/joss-reviews
  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands

Attempting PDF compilation. Reticulating splines etc...

Review

[Checklist moved above]

Comments

On the paper

This paper describes an R package that provides a workflow for analyzing sentiment and topics in Twitter text, wrapping around packages such as twitteR, tidytext, and topicmodels. The package contains a number of useful analytic functions for looking at Twitter data, and these are clearly demonstrated in the vignette. Some of these could be useful for other forms of text, but the package is specifically designed to work with Twitter data, including not just the import of this data but also working with Twitter-specific handles such as hashtags and usernames.

This paper makes a nice, short article that should be published, but could be improved by addressing a few relatively minor issues:

  • A few sentences at the outset describing the ideal use of the package and how it fits into other text analysis workflows using R would be welcome.
  • The package presumes a knowledge of the tidyverse approach to R, which is fine, but the paper (and package documentation) could mention this more explicitly.
  • Should the lexicon used for sentiment analysis be called the "Bing lexicon" or the Hu and Liu (2004) sentiment dictionary?
  • "Twitter" and "Tweet" should be capitalized in the text.

On the package

The package works, and I have seen much worse source code in widely-used R packages published on CRAN. These are some suggestions for improving the code and the package, not necessarily linked to the paper and whether it should be published. (I leave that to the editor to decide.)

_Naming_. This is a matter of preference, although there are some emerging guidelines designed to reduce the chaos in the R world. This package combines capitalized object names with lower-cased object names, and uses function names containing . (e.g. PosNeg.Words()), which is generally discouraged because dotted names are ambiguous under the S3 dispatch system. Do the functions really need to be capitalized? The naming is also inconsistent: Word.Corr() is "object.verb" but Number.Topics() is "verb.object".
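The dispatch ambiguity mentioned above can be seen in a short sketch. The class name `words` and the helper below are made up for illustration and are not taken from the package.

```r
# A helper with a dotted name. The author may only mean "a summary of words",
# but the name has exactly the shape of an S3 method: summary() for class "words".
summary.words <- function(object, ...) {
  "summary.words() was called"
}

# Any object that happens to carry the (hypothetical) class "words" ...
x <- structure(list(tokens = c("good", "bad")), class = "words")

# ... is routed to the helper by the summary() generic, intended or not.
dispatched <- summary(x)
```

Names without dots, such as `summarise_words()`, avoid this kind of accidental dispatch.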

The package name itself runs contrary to this advice from Hadley Wickham:

Avoid using both upper and lower case letters: doing so makes the package name hard to type and even harder to remember.

_Unnecessary C++ code._ Why is there a function rcpp_hello_world()? This looks like demonstration code that should be removed.

_Data copyright issues._ Can you distribute the data in raw_tweets? That material is the copyright of the authors of the Tweets. At the least, it may require some attribution. It would be worth reviewing the Twitter terms of service about this.

_Data object loading_. Set LazyData: true in DESCRIPTION (the field that controls lazy loading of datasets), otherwise code like the (not run) examples for Scores() will not work.

_(non-)Object orientation_. None of the functions use generics and method dispatch; instead, each checks the class of its input objects using conditionals inside the function (e.g. here). This makes extending the package harder, in addition to being more error-prone. Furthermore, the function names are very generic, such as BoxPlot() or Tidy(). Other packages have functions named tidy(), but they are defined for specific object classes. I suggest using more distinctive names to differentiate this package's functions from those found in other packages, and/or method dispatch for specific object classes. Simply capitalizing the function names is likely to confuse some users.
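A minimal sketch of the generic-plus-methods pattern suggested above. The names `tidy_tweets` and the `text` column are hypothetical and are not the package's actual API.

```r
# Hypothetical generic; dispatches on the class of x instead of using if/else checks.
tidy_tweets <- function(x, ...) {
  UseMethod("tidy_tweets")
}

# Method for data frames: lower-case the (assumed) text column.
tidy_tweets.data.frame <- function(x, ...) {
  stopifnot("text" %in% names(x))
  x$text <- tolower(x$text)
  x
}

# Method for plain character vectors: wrap them and reuse the data-frame method.
tidy_tweets.character <- function(x, ...) {
  tidy_tweets(data.frame(text = x, stringsAsFactors = FALSE), ...)
}

# Fallback that fails with an informative message for unsupported classes.
tidy_tweets.default <- function(x, ...) {
  stop("tidy_tweets() has no method for class ", class(x)[1], call. = FALSE)
}

result <- tidy_tweets(c("Hello World", "R is FUN"))
```

Supporting a new input type then means adding one method rather than editing the conditionals inside every function.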

_Code organization_. Nearly all functions are in a single long .R file called Function.R. Splitting this into smaller files would make the code organization clearer.

_Examples_. Most examples are not run, due to the difficulties of connecting to the Twitter API using authentication. But this is not true for any functions that only use raw_tweets, such as Bigram(), Bigram.Network(), BoxPlot(), etc. Furthermore, this code does not run as written, because raw_tweets is not lazy loaded. In addition, there is no need to load the package in the examples (using library(SAoTD)) because the help pages should only be accessible if the package is already loaded.

_Tests_. The file tests/testthat/test_Acquire.R contains Twitter authentication keys. These should be removed (and changed, since they will remain visible in the git history).
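One common way to keep keys out of the repository is to read them from environment variables at test time. The sketch below assumes four made-up variable names; neither the names nor the helper functions come from the package.

```r
# Read Twitter credentials from environment variables instead of hard-coding them.
# The variable names are hypothetical; pick whatever fits your setup.
twitter_credentials <- function() {
  vars <- c(
    consumer_key    = "TWITTER_CONSUMER_KEY",
    consumer_secret = "TWITTER_CONSUMER_SECRET",
    access_token    = "TWITTER_ACCESS_TOKEN",
    access_secret   = "TWITTER_ACCESS_SECRET"
  )
  values <- Sys.getenv(vars, unset = NA, names = FALSE)
  names(values) <- names(vars)
  if (anyNA(values)) {
    return(NULL)  # at least one credential is not configured
  }
  as.list(values)
}

# TRUE only when every credential is available in the environment.
has_twitter_credentials <- function() {
  !is.null(twitter_credentials())
}
```

In a testthat file, a call such as `testthat::skip_if_not(has_twitter_credentials())` at the top of a test would then skip the API tests on machines where the variables are not set.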

Excellent review, @kbenoit! Thanks so much!

@evan-l-munson Can you address the issues raised in the review - particularly the missing checked items from the review checklist and the other useful suggestions raised in the review?

Thank you for the review.

I will work on those notes/corrections as soon as I get a chance (just moved my family across the United States and started a new job).

Evan Munson

@evan-l-munson Just a nudge on this.

@evan-l-munson Just another nudge on this.

Thank you for the reminder. I have not forgotten.

Hi @evan-l-munson - please try and get to these updates when you get a chance.

I have made about half the corrections suggested above. In the next week, I hope to rename my functions to better fit standard naming conventions. I am also looking at the test dataset for the copyright issues, along with a couple of the other issues.

Good evening @arfon, I think I made all the requested corrections. Please let me know if you need anything else corrected/adjusted. Thanks for the patience and assistance in this process.

:wave: @kbenoit - please come and take another look at this submission when you get a chance, the author has made some updates based on your feedback.

@whedon assign @arfon as editor

@arfon Happy to do so.

Hi @kbenoit - have you had a chance to take another look at this submission?

I've had a chance to look at the package again, and I am pleased to report that it and the paper are much improved. @evan-l-munson has done a very good job of addressing my concerns above, if not as good a job of summarizing in a PR or memo what these changes were 😉 .

_Package_. The code organization is much better (and will be easier to maintain or for potential contributors to absorb). I also like the function naming much better than before - the function index looks more tidy and sensible now.

There is still unnecessary C++ code in /src, which creates the function rcpp_hello_world() in the package index. This should simply be removed.

_Paper._ The paper does a much better job now of explaining the package and its purpose, and the vignette rounds this out nicely.

Subject to the Hello World change, :+1:.

Thanks @kbenoit. @evan-l-munson - please make these final changes to your package and we can move forward accepting this submission.

Gentlemen, I missed your email by accident. I appreciate these comments and will try to get them corrected/accounted for this weekend or sometime during the holidays. Thank you!

Gentlemen, good morning. I have corrected the rcpp_hello_world() function. I was running through everything to make sure it was working properly before I gave you the final word. Everything was looking good until I tried to view the vignette. For some reason, the vignette is not found when using utils::vignette("saotd"). I will troubleshoot that and hopefully get that working. Thanks!

Thanks for the update @evan-l-munson

👋 @evan-l-munson — How is it going? Have you been able to troubleshoot the issue? Give us an update when you can. Thanks!

Good morning, I appreciate your patience with me. Finishing up this package is turning out to be more challenging and time-consuming than I anticipated. I was working on my vignette issue yesterday and am struggling to fix what I am seeing. Everything seems to be built properly; however, after I re-download and load the package from my Git page, the vignettes are not found. I have used both utils::vignette('saotd') and utils::browseVignettes('saotd') but receive the same error message: No vignettes found by utils::browseVignettes("saotd"). I have also sent the Git repository to a friend, who experienced the same issue. As I look at the R package structure, I think I have everything correct (and I have been looking at it for months, trying to correct it), but obviously something is incorrect since the vignettes are not being found. If you have some insight as to what might be happening, I would appreciate any thoughts you might have so I can complete this. Thanks!

If you have some insight as to what might be happening, I would appreciate any thoughts you might have so I can complete this. Thanks!

I'm not sure sorry. Perhaps @kbenoit has some thoughts on this?

@arfon I have another colleague looking into the issue for me. I am hoping they will get back to me this week. I have run out of ideas on my end and am not sure why I cannot view the vignettes after I re-download the package from GitHub. If the vignette issue isn't a big one I would say the package is ready for submission, but if that is a big issue, I will continue to work on it. Thanks!

@evan-l-munson I just tried building your package and it fails with

> devtools::build_vignettes()
Building saotd vignettes
... 
Error : processing vignette 'SAoTD.Rmd' failed with diagnostics:
Failed to locate the ‘weave’ output file (by engine ‘knitr::rmarkdown’) for vignette with name ‘SAoTD’. The following files exist in directory ‘.’: ‘saotd.html’, ‘saotd.R’, ‘SAoTD.Rmd’
Error: processing vignette 'SAoTD.Rmd' failed with diagnostics:
Failed to locate the ‘weave’ output file (by engine ‘knitr::rmarkdown’) for vignette with name ‘SAoTD’. The following files exist in directory ‘.’: ‘saotd.html’, ‘saotd.R’, ‘SAoTD.Rmd’

but when I changed the vignette name to test.Rmd (all lowercase) it succeeded. I suggest that you change the vignette filename to lowercase (the title can still be anything you want) and hopefully the problem will be solved. It seems related to https://stackoverflow.com/questions/27338970/vignette-creation-on-package-build-fails-with-the-error-failed-to-locate-the-w.

Also I don't think you need Rcpp at all. Suggest you delete /src, R/RcppExports.R, remove Rcpp from the Imports section of DESCRIPTION, remove the line LinkingTo: Rcpp, and delete the imports in R/SAoTD.R. They are doing nothing for you - I suspect they came from a package starter boilerplate tool.

@kbenoit I think I have renamed the vignette but still need to run through some checks. I'm not sure if the SAoTD.Rmd is a holdover from when I changed the name of the package to saotd.

I haven't had a chance to work on removing the rcpp items but hope to get to that later this weekend.

Again thanks for the help!

@kbenoit I fixed the vignette issue finally, which of course took longer than I expected. I went through and removed all Rcpp items in the package and ended up breaking everything. This weekend, I added all the Rcpp items back into the package, and now the package is passing all tests again and working properly.

There is no C++ in the package, so the "breaking" parts were probably due to incomplete removal or failure to rebuild the parts fully when you roxygenized.

Actually I just fixed it for you, https://github.com/evan-l-munson/saotd/pull/8.

@kbenoit Again thank you for the help with the rcpp items. I have merged your correction into the master repository. I retested everything and your corrections have passed all checks. I think this should be good to go. Thank you again for your help!

@arfon with a couple critical pointers from @kbenoit I think that I have finally corrected the last couple of items of concern he had for my package. Is there anything else that you need from me with this package submission? Thanks!

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

Hi @evan-l-munson. I've made some slight tweaks to your paper in https://github.com/evan-l-munson/saotd/pull/9 - let me know what you think.

In addition, please make sure you:

  • Update the version number based on the feedback from this review.
  • Add something to your README making it clear how users can contribute (if appropriate).

@arfon, Good evening. I have completed your suggested edits/additions and pushed to GitHub. If there is anything else, let me know and I will get working on it asap. Thanks for the help!

@whedon generate pdf

Attempting PDF compilation. Reticulating splines etc...

@evan-l-munson thanks. Please could you clarify what version the package is now at? Given the modifications during the review it would seem appropriate to make a new release and archive in Zenodo.

Once you've done this I'm happy to proceed with accepting this submission.

@arfon, I bumped the package to version 0.2.0. I will work on resubmitting to Zenodo within the next day or two. Will let you know when I get that done. Thanks!

@arfon, I released the 0.2.0 version to Zenodo this morning and have updated the DOI badge. Is there anything else you need? Thanks!

@whedon set 10.5281/zenodo.2578973 as archive

OK. 10.5281/zenodo.2578973 is the archive.

@whedon set 0.2.0 as version

OK. 0.2.0 is the version.

@whedon accept

Attempting dry run of processing paper acceptance...

```
Reference check summary:

OK DOIs

MISSING DOIs

  • None

INVALID DOIs

  • None
```

Check final proof :point_right: https://github.com/openjournals/joss-papers/pull/529

If the paper PDF and Crossref deposit XML look good in https://github.com/openjournals/joss-papers/pull/529, then you can now move forward with accepting the submission by compiling again with the flag deposit=true e.g.
@whedon accept deposit=true

@whedon accept deposit=true

Doing it live! Attempting automated processing of paper acceptance...

🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨

Here's what you must now do:

  1. Check final PDF and Crossref metadata that was deposited :point_right: https://github.com/openjournals/joss-papers/pull/530
  2. Wait a couple of minutes to verify that the paper DOI resolves https://doi.org/10.21105/joss.00764
  3. If everything looks good, then close this review issue.
  4. Party like you just published a paper! 🎉🌈🦄💃👻🤘

    Any issues? notify your editorial technical team...

:tada::tada::tada: Congratulations on your paper acceptance! :tada::tada::tada:

If you would like to include a link to your paper from your README use the following code snippets:

Markdown:
[![DOI](http://joss.theoj.org/papers/10.21105/joss.00764/status.svg)](https://doi.org/10.21105/joss.00764)

HTML:
<a style="border-width:0" href="https://doi.org/10.21105/joss.00764">
  <img src="http://joss.theoj.org/papers/10.21105/joss.00764/status.svg" alt="DOI badge" >
</a>

reStructuredText:
.. image:: http://joss.theoj.org/papers/10.21105/joss.00764/status.svg
   :target: https://doi.org/10.21105/joss.00764

This is how it will look in your documentation:

DOI

We need your help!

Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us please consider doing either one (or both) of the following:

@kbenoit - many thanks for your review of this submission ✨

@evan-l-munson - your paper is now accepted into JOSS :zap::rocket::boom:

@arfon and @kbenoit thank you both so much for your help! Take care!
