Hi all,
The outcome of `duration()` is inconsistent when vectors are used:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
tibble(age = 2, unit = "week") %>% mutate(duration(age, units = unit))
#> # A tibble: 1 x 3
#> age unit `duration(age, units = unit)`
#> <dbl> <chr> <Duration>
#> 1 2 week 1209600s (~2 weeks)
tibble(age = 4, unit = "year") %>% mutate(duration(age, units = unit))
#> # A tibble: 1 x 3
#> age unit `duration(age, units = unit)`
#> <dbl> <chr> <Duration>
#> 1 4 year 126144000s (~4 years)
tibble(age = c(2, 4), unit = c("week", "year")) %>% mutate(duration(age, units = unit))
#> # A tibble: 2 x 3
#> age unit `duration(age, units = unit)`
#> <dbl> <chr> <Duration>
#> 1 2 week 1209600s (~2 weeks)
#> 2 4 year 126144000s (~4 years)
tibble(age = c(2, 4), unit = c("week", "week")) %>% mutate(duration(age, units = unit))
#> Error: Invalid unit name: week
tibble(age = c(2, 4), unit = c("year", "year")) %>% mutate(duration(age, units = unit))
#> Error: Invalid unit name: year
tibble(age = c(2, 4), unit = "week") %>% mutate(duration(age, units = unit))
#> Error: Invalid unit name: week
tibble(age = c(2, 4), unit = "year") %>% mutate(duration(age, units = unit))
#> Error: Invalid unit name: year
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#> setting value
#> version R version 3.6.0 (2019-04-26)
#> os macOS Mojave 10.14.5
#> system x86_64, darwin15.6.0
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Berlin
#> date 2019-07-02
#>
#> ─ Packages ──────────────────────────────────────────────────────────────
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.0)
#> backports 1.1.4 2019-04-10 [1] CRAN (R 3.6.0)
#> callr 3.2.0 2019-03-15 [1] CRAN (R 3.6.0)
#> cli 1.1.0 2019-03-19 [1] CRAN (R 3.6.0)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.0)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 3.6.0)
#> devtools 2.0.2 2019-04-08 [1] CRAN (R 3.6.0)
#> digest 0.6.19 2019-05-20 [1] CRAN (R 3.6.0)
#> dplyr * 0.8.2 2019-06-29 [1] CRAN (R 3.6.0)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.0)
#> fansi 0.4.0 2018-10-05 [1] CRAN (R 3.6.0)
#> fs 1.3.1 2019-05-06 [1] CRAN (R 3.6.0)
#> glue 1.3.1 2019-03-12 [1] CRAN (R 3.6.0)
#> highr 0.8 2019-03-20 [1] CRAN (R 3.6.0)
#> htmltools 0.3.6 2017-04-28 [1] CRAN (R 3.6.0)
#> knitr 1.23 2019-05-18 [1] CRAN (R 3.6.0)
#> lubridate * 1.7.4 2018-04-11 [1] CRAN (R 3.6.0)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.0)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 3.6.0)
#> pillar 1.4.2 2019-06-29 [1] CRAN (R 3.6.0)
#> pkgbuild 1.0.3 2019-03-20 [1] CRAN (R 3.6.0)
#> pkgconfig 2.0.2 2018-08-16 [1] CRAN (R 3.6.0)
#> pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.6.0)
#> prettyunits 1.0.2 2015-07-13 [1] CRAN (R 3.6.0)
#> processx 3.3.1 2019-05-08 [1] CRAN (R 3.6.0)
#> ps 1.3.0 2018-12-21 [1] CRAN (R 3.6.0)
#> purrr 0.3.2 2019-03-15 [1] CRAN (R 3.6.0)
#> R6 2.4.0 2019-02-14 [1] CRAN (R 3.6.0)
#> Rcpp 1.0.1 2019-03-17 [1] CRAN (R 3.6.0)
#> remotes 2.1.0 2019-06-24 [1] CRAN (R 3.6.0)
#> rlang 0.4.0 2019-06-25 [1] CRAN (R 3.6.0)
#> rmarkdown 1.13 2019-05-22 [1] CRAN (R 3.6.0)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.6.0)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.0)
#> stringi 1.4.3 2019-03-12 [1] CRAN (R 3.6.0)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 3.6.0)
#> testthat 2.1.1 2019-04-23 [1] CRAN (R 3.6.0)
#> tibble 2.1.3 2019-06-06 [1] CRAN (R 3.6.0)
#> tidyselect 0.2.5 2018-10-11 [1] CRAN (R 3.6.0)
#> usethis 1.5.0 2019-04-07 [1] CRAN (R 3.6.0)
#> utf8 1.1.4 2018-05-24 [1] CRAN (R 3.6.0)
#> vctrs 0.1.0 2018-11-29 [1] CRAN (R 3.6.0)
#> withr 2.1.2 2018-03-15 [1] CRAN (R 3.6.0)
#> xfun 0.8 2019-06-25 [1] CRAN (R 3.6.0)
#> yaml 2.2.0 2018-07-25 [1] CRAN (R 3.6.0)
#> zeallot 0.1.0 2018-01-28 [1] CRAN (R 3.6.0)
#>
#> [1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library
Created on 2019-07-02 by the reprex package (v0.3.0)
It would be great if the vectorised form always worked.
PS: why does `year` work when the documentation says "Units larger than weeks are not used due to their variability."? I am happy it does, though.
Alex
Hi @courtiol,
This is a common issue with non-vectorized functions used inside `mutate()`. There's a simple solution: add `dplyr::rowwise()`, like so:
tibble(age = c(2, 4), unit = c("week", "week")) %>% dplyr::rowwise() %>% mutate(duration(age, units = unit))
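If `rowwise()` is not an option, the same one-call-per-row idea can be sketched without dplyr. This is only a workaround under stated assumptions: `duration_each()` is a hypothetical helper, not a lubridate function, and it relies on the single-unit calls that work in the report above.

```r
library(lubridate, warn.conflicts = FALSE)

# Hypothetical helper (not part of lubridate): call duration() once per
# element, so each call only ever sees a single unit name.
duration_each <- function(num, units) {
  # Allow one unit for all values, or one unit per value.
  stopifnot(length(units) %in% c(1L, length(num)))
  units <- rep_len(units, length(num))
  secs <- vapply(
    seq_along(num),
    function(i) as.numeric(duration(num[i], units = units[i])),
    numeric(1)
  )
  # Rebuild a single Duration vector from the collected seconds.
  dseconds(secs)
}

duration_each(c(2, 4), c("week", "week"))
```

Because the helper returns an ordinary Duration vector, it can also be called inside `mutate()` in place of `duration()`.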
Hi @yogat3ch, thanks for your feedback.
I would be OK with `duration()` not being a vectorised function, but my examples show that it behaves inconsistently even though the alternative inputs have the same dimensions and class.
As such, `lubridate::duration()` seems to violate a core tidy principle.
The issue also occurs outside a `mutate()` call, so I am not sure that the problem falls under "common issue".
```{r}
lubridate::duration(c(1,2), units = c("days", "weeks"))
#> [1] "86400s (~1 days)" "1209600s (~2 weeks)"
lubridate::duration(c(1,2), units = c("weeks", "weeks"))
#> Error: Invalid unit name: weeks
```
Since `dplyr::rowwise()` is labelled as `questioning` with respect to its status, I also think it would be best not to have to rely on that and ideally offer instead true vectorisation support.
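To make concrete what element-wise behaviour could look like, here is a minimal sketch that multiplies each value by a per-unit number of seconds. It is an illustration only, not how lubridate is implemented: `secs_per_unit` and `duration_lookup()` are hypothetical names, and the table deliberately covers only fixed-length units up to weeks.

```r
library(lubridate, warn.conflicts = FALSE)

# Seconds per unit, restricted to units with a fixed length.
secs_per_unit <- c(second = 1, minute = 60, hour = 3600,
                   day = 86400, week = 604800)

# Hypothetical helper: element-wise durations from values and unit names.
duration_lookup <- function(num, units) {
  key <- sub("s$", "", units)               # accept "weeks" as well as "week"
  idx <- match(key, names(secs_per_unit))   # match() handles repeated names
  if (anyNA(idx)) {
    stop("Invalid unit name: ",
         paste(unique(units[is.na(idx)]), collapse = ", "))
  }
  dseconds(num * unname(secs_per_unit[idx]))
}

duration_lookup(c(1, 2), c("days", "weeks"))
duration_lookup(c(1, 2), c("weeks", "weeks"))
```

With this approach, repeated unit names and mixed unit names go through exactly the same code path, which is the consistency I am asking for.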
What do you think?
Best,
Alex
Hi Alex @courtiol,
I think the unexpected behavior definitely merits the attention of the lubridate developers, so thank you for posting it here. They seem to be pretty inundated with issues, though, so I wanted to offer the workaround so that you can move forward with whatever you're working on.
That surprises me about `rowwise`, as that function is essential when writing custom `mutate` functions: a function that handles a single case is definitely simpler to write and plug into `mutate` with a call to `rowwise` than having to write every custom function for `mutate` as vectorised.
I definitely don't think they should sunset `rowwise`.
Thanks for the response!
@courtiol could you please provide a minimal reprex, focussing on the problem you're experiencing? Please eliminate the use of dplyr and don't include session info.
Hi @hadley, I did as you suggested:
works <- data.frame(age = c(2, 4), unit = c("year", "week"))
works$age <- lubridate::duration(works$age, units = works$unit)
doesnotwork <- data.frame(age = c(2, 4), unit = c("year", "year"))
doesnotwork$age <- lubridate::duration(doesnotwork$age, units = doesnotwork$unit)
#> Error: Invalid unit name: year
Created on 2019-11-20 by the reprex package (v0.3.0)
Even more minimal reprex 😄
library(lubridate, warn.conflicts = FALSE)
duration(1:2, c("seconds", "minutes"))
#> [1] "1s" "120s (~2 minutes)"
duration(1:2, c("seconds", "seconds"))
#> Error: Invalid unit name: seconds
Created on 2019-11-20 by the reprex package (v0.3.0)
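The two calls above can also be wrapped in a small check that is easy to rerun against other lubridate versions. This is just a sketch: `duration_ok()` is a hypothetical helper that reports whether a call returns or raises an error, and no output is shown because the result for repeated units depends on the installed version.

```r
library(lubridate, warn.conflicts = FALSE)

# Hypothetical helper: TRUE if duration() returns, FALSE if it raises an error.
duration_ok <- function(num, units) {
  tryCatch({
    duration(num, units)
    TRUE
  }, error = function(e) FALSE)
}

cases <- list(
  distinct = c("seconds", "minutes"),
  repeated = c("seconds", "seconds")
)

# One logical per case, named after the case.
result <- vapply(cases, function(u) duration_ok(1:2, u), logical(1))
result
```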