I received an email from CRAN today that one of my packages was producing errors on some platforms (see results page linked below). I traced this back to what I think is a new issue in dplyr related to the ts class. As you can see in the reprex below, after applying ts in mutate_at the resulting columns are listed as class "numeric" rather than "ts".
FYI, the reprex below was produced on Windows with R 3.6.3. The issue does not occur on macOS but does seem to occur on most Linux systems that CRAN uses for testing.
https://cran.r-project.org/web/checks/check_results_radiant.data.html
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(testthat)
#>
#> Attaching package: 'testthat'
#> The following object is masked from 'package:dplyr':
#>
#> matches
dat <- mutate_at(mtcars, .vars = c("mpg", "cyl"), .funs = ts, start = c(1971, 1), frequency = 52)
expect_equal(dat$mpg, ts(mtcars$mpg, start = c(1971, 1), frequency = 52))
#> Error: dat$mpg not equal to ts(mtcars$mpg, start = c(1971, 1), frequency = 52).
#> Classes differ: numeric is not ts
sapply(dat, class)
#> mpg cyl disp hp drat wt qsec vs
#> "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
#> am gear carb
#> "numeric" "numeric" "numeric"
This is consistent with [.ts which drops the class intentionally. The ts class is not designed to be used as a vector or a column. We could make it work with proper vctrs methods, but they would have to be contributed.
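To make the `[.ts` point concrete, here is a minimal base R sketch (no dplyr or vctrs involved). The values are just the first few `mtcars$mpg` entries, used for illustration; the variable names `x` and `sub` are not from the thread.

```r
# Base R only: build a small weekly series, then subset it by position.
x <- ts(c(21, 22.8, 21.4, 18.7, 18.1), start = c(1971, 1), frequency = 52)
class(x)
#> [1] "ts"

# Positional subsetting dispatches to `[.ts`, which returns a bare vector.
sub <- x[2:4]
class(sub)
#> [1] "numeric"

# Both the class and the tsp attribute are gone.
attributes(sub)
#> NULL
```

Any code path that subsets a ts column by position, even with an index covering every row, would therefore be expected to hand back a plain numeric vector.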
Thanks for your quick reply @lionel-. My package (radiant.data) has been on CRAN, with a dplyr dependency, for years now. Can you tell me a bit more about (1) what changed in dplyr and (2) why it works fine on some platforms but not others? Finally, could you point me to more information on what adding "vctrs methods" would entail to make this work with dplyr again?
Thanks
dplyr now uses vctrs, which is more principled about vector types. As far as I know ts objects never worked properly with dplyr, and were mostly using undefined behaviour. The platforms that are not failing yet are likely still running vctrs 0.3.0 instead of vctrs 0.3.1. The latter properly falls back to [.ts for subsetting instead of creating corrupt objects.
To get started you could read https://vctrs.r-lib.org/articles/s3-vector.html, https://vctrs.r-lib.org/reference/howto-faq-coercion.html, and https://vctrs.r-lib.org/reference/vec_proxy.html. This won't be a simple task: the ts class has never been subsettable or combinable, so there are no existing base methods to build on. You'll need to figure out (1) the structure and semantics of the ts class, (2) how to preserve and update attributes after data manipulations, and (3) how to translate these findings into vctrs methods (essentially proxy, restore, ptype2, and cast methods). I'd be happy to help you with this.
Thanks for the clarification @lionel-. I now see what you mean by subsetting a ts object dropping the class information. I'm still not sure why that applies to this mutate_at setting however, because I was only applying a transformation and no subsetting was involved.
Adding the appropriate vctrs methods from scratch sounds fairly complex unless it can build on the work in tsibble (@earowang) somehow.
FYI, tsibble supports dplyr and vctrs methods.
@romainfrancois Could the vec_unchop() call in mutating verbs be avoided when the data frame is ungrouped?
@earowang Are you suggesting that if I want to "mutate" one or more columns to class "ts" that I convert the whole dataframe to a tsibble?
Minimal reprex:
library(dplyr, warn.conflicts = FALSE)
df <- data.frame(x = 1:5)
df %>%
  mutate(x = ts(x, start = c(1971, 1), frequency = 52)) %>%
  .$x
#> [1] 1 2 3 4 5
Created on 2020-06-22 by the reprex package (v0.3.0)
As @lionel- says, the simplest way to fix this specific issue would be to call vec_unchop() only if there is more than one group. (Alternatively we could make vec_unchop() do less work if there's only one group, but that feels like an inconsistency that doesn't belong in a low-level package.) I've done that in #5348, but I don't have a strong feeling about whether this fix is worth it, since the grouped and ungrouped behaviour will now be different (albeit with the grouped behaviour consistent with c.ts()).
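For reference, a small base R sketch of what combining ts objects with `c()` does. This assumes no attached package has registered its own `c()` method for ts; the names `a`, `b`, `combined` and `single` are illustrative only, and the sketch does not call vec_unchop() itself.

```r
# Two consecutive pieces of a weekly series, as grouped results might be.
a <- ts(1:3, start = c(1971, 1), frequency = 52)
b <- ts(4:6, start = c(1971, 4), frequency = 52)

# c() keeps the values but drops every attribute except names,
# so the class and the time index are lost.
combined <- c(a, b)
class(combined)
#> [1] "integer"

# The same happens with a single piece, which is the ungrouped analogue.
single <- c(a)
class(single)
#> [1] "integer"
```

Seen this way, skipping the recombination step for ungrouped data keeps the ts class only because nothing is combined, while the grouped path still ends up with a plain vector.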