When `nest()` is called on a `grouped_df`, the grouping variable, i.e. `group_vars(df)`, overrides the user-specified variable passed to `nest()`.
In the reprex below, `df` is grouped by `cyl`; however, when `nest(df, -gear)` is called, the `cyl` variable overrides the deliberately specified nesting variable (`gear`). Further, `gear` is removed entirely from the nested tibbles themselves and disappears.
reprex::reprex({
library(dplyr)
library(tidyr)
library(tibble)
#' Unexpected result calling `tidyr::nest(-gear)` if upstream `dplyr::group_by(cyl)`.
df <- as_tibble(mtcars) %>%
group_by(cyl)
group_vars(df)
#' The grouping variable (cyl) over-rides user-defined nesting variable (gear).
#' Additionally, 'gear' is removed from the 'data' tibbles.
ndf <- nest(df, -gear)
ndf
# gear disappears completely!
"gear" %in% names(ndf)
"gear" %in% names(ndf$data[[1]]) # inspect the columns of a nested tibble, not the names of the list
#' Expected result (compare tibble dims to those above):
df %>%
ungroup() %>% # remove group_var: cyl
nest(-gear)
})
This is a follow up to #551.
Can you revise the above but insert what `reprex::reprex()` leaves on your clipboard instead, i.e. the rendered reprex?
Crap! That's what I did originally and thought better of it ;(
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
library(tibble)
Unexpected result calling `tidyr::nest(-gear)` if upstream `dplyr::group_by(cyl)`.
df <- as_tibble(mtcars) %>%
group_by(cyl)
group_vars(df)
#> [1] "cyl"
The grouping variable (cyl) overrides the user-defined nesting variable (gear).
Additionally, 'gear' is removed from the 'data' tibbles.
ndf <- nest(df, -gear)
ndf
#> # A tibble: 3 x 2
#> cyl data
#> <dbl> <list>
#> 1 6 <tibble [7 × 9]>
#> 2 4 <tibble [11 × 9]>
#> 3 8 <tibble [14 × 9]>
# gear disappears completely!
"gear" %in% names(ndf)
#> [1] FALSE
"gear" %in% names(ndf$data[[1]])
#> [1] FALSE
Expected result (compare tibble dims to those above):
df %>%
ungroup() %>% # remove group_var: cyl
nest(-gear)
#> # A tibble: 3 x 2
#> gear data
#> <dbl> <list>
#> 1 4 <tibble [12 × 10]>
#> 2 3 <tibble [15 × 10]>
#> 3 5 <tibble [5 × 10]>
Created on 2019-05-01 by the reprex package (v0.2.1)
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#> setting value
#> version R version 3.5.2 (2018-12-20)
#> os macOS Mojave 10.14.4
#> system x86_64, darwin15.6.0
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz America/Denver
#> date 2019-05-01
#>
#> ─ Packages ──────────────────────────────────────────────────────────────
#> package * version date lib source
#> assertthat 0.2.0 2017-04-11 [1] CRAN (R 3.5.0)
#> backports 1.1.3 2018-12-14 [1] CRAN (R 3.5.0)
#> callr 3.1.1 2018-12-21 [1] CRAN (R 3.5.0)
#> cli 1.1.0 2019-03-19 [1] RSPM (R 3.5.2)
#> crayon 1.3.4 2017-09-16 [1] RSPM (R 3.5.2)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 3.5.0)
#> devtools 2.0.1 2018-10-26 [1] CRAN (R 3.5.1)
#> digest 0.6.18 2018-10-10 [1] CRAN (R 3.5.0)
#> dplyr * 0.8.0.1 2019-02-15 [1] RSPM (R 3.5.2)
#> evaluate 0.13 2019-02-12 [1] CRAN (R 3.5.2)
#> fansi 0.4.0 2018-10-05 [1] CRAN (R 3.5.0)
#> fs 1.2.6 2018-08-23 [1] CRAN (R 3.5.0)
#> glue 1.3.1 2019-03-12 [1] CRAN (R 3.5.2)
#> highr 0.7 2018-06-09 [1] CRAN (R 3.5.0)
#> htmltools 0.3.6 2017-04-28 [1] CRAN (R 3.5.0)
#> knitr 1.22 2019-03-08 [1] CRAN (R 3.5.2)
#> magrittr 1.5 2014-11-22 [1] RSPM (R 3.5.2)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 3.5.0)
#> pillar 1.3.1 2018-12-15 [1] CRAN (R 3.5.0)
#> pkgbuild 1.0.2 2018-10-16 [1] CRAN (R 3.5.0)
#> pkgconfig 2.0.2 2018-08-16 [1] CRAN (R 3.5.0)
#> pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.5.0)
#> prettyunits 1.0.2 2015-07-13 [1] CRAN (R 3.5.0)
#> processx 3.2.1 2018-12-05 [1] CRAN (R 3.5.0)
#> ps 1.3.0 2018-12-21 [1] CRAN (R 3.5.0)
#> purrr 0.3.2 2019-03-15 [1] RSPM (R 3.5.2)
#> R6 2.4.0 2019-02-14 [1] CRAN (R 3.5.2)
#> Rcpp 1.0.0 2018-11-07 [1] CRAN (R 3.5.0)
#> remotes 2.0.2 2018-10-30 [1] CRAN (R 3.5.0)
#> rlang 0.3.4 2019-04-07 [1] RSPM (R 3.5.2)
#> rmarkdown 1.11 2018-12-08 [1] CRAN (R 3.5.0)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.5.0)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.5.0)
#> stringi 1.4.3 2019-03-12 [1] CRAN (R 3.5.2)
#> stringr 1.4.0 2019-02-10 [1] RSPM (R 3.5.2)
#> testthat 2.0.1 2018-10-13 [1] CRAN (R 3.5.0)
#> tibble * 2.1.1 2019-03-16 [1] RSPM (R 3.5.2)
#> tidyr * 0.8.3 2019-03-01 [1] RSPM (R 3.5.2)
#> tidyselect 0.2.5 2018-10-11 [1] CRAN (R 3.5.0)
#> usethis 1.4.0 2018-08-14 [1] CRAN (R 3.5.0)
#> utf8 1.1.4 2018-05-24 [1] CRAN (R 3.5.0)
#> withr 2.1.2 2018-03-15 [1] CRAN (R 3.5.0)
#> xfun 0.5 2019-02-20 [1] CRAN (R 3.5.2)
#> yaml 2.2.0 2018-07-25 [1] CRAN (R 3.5.0)
#>
#> [1] /Users/sfield/r_libs
#> [2] /Library/Frameworks/R.framework/Versions/3.5/Resources/library
The current version of `tidyr` is, in fact, doing what you want. The way you pass the variables to be nested has changed: they must now be named. The result is not yet printed nicely, but it is exactly what you want: `gear` is left out of the nested data, and each of the resulting nested data frames is grouped by `cyl`:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
library(tibble)
df <- as_tibble(mtcars) %>%
group_by(cyl)
group_vars(df)
#> [1] "cyl"
ndf <- nest(df, -gear)
#> Warning: All elements of `...` must be named.
#> Did you want `data = c(mpg, cyl, disp, hp, drat, wt, qsec, vs, am, carb)`?
ndf
#> # A tibble: 3 x 2
#> gear
#> <dbl>
#> 1 4
#> 2 3
#> 3 5
#> # … with 1 more variable: data <list<df[,10]>>
ndf$data[[1]]
#> # A tibble: 12 x 10
#> # Groups: cyl [3]
#> mpg cyl disp hp drat wt qsec vs am carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.62 16.5 0 1 4
#> 2 21 6 160 110 3.9 2.88 17.0 0 1 4
#> 3 22.8 4 108 93 3.85 2.32 18.6 1 1 1
#> 4 24.4 4 147. 62 3.69 3.19 20 1 0 2
#> 5 22.8 4 141. 95 3.92 3.15 22.9 1 0 2
#> 6 19.2 6 168. 123 3.92 3.44 18.3 1 0 4
#> 7 17.8 6 168. 123 3.92 3.44 18.9 1 0 4
#> 8 32.4 4 78.7 66 4.08 2.2 19.5 1 1 1
#> 9 30.4 4 75.7 52 4.93 1.62 18.5 1 1 2
#> 10 33.9 4 71.1 65 4.22 1.84 19.9 1 1 1
#> 11 27.3 4 79 66 4.08 1.94 18.9 1 1 1
#> 12 21.4 4 121 109 4.11 2.78 18.6 1 1 2
Created on 2019-05-04 by the reprex package (v0.2.1)
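The warning in the output above asks for the nesting specification to be named. As a minimal sketch, assuming a tidyr release that supports the named syntax (1.0.0 or later, to my knowledge), the call could be written as follows. The `ungroup()` step is optional and is included here only so that the nested tibbles do not carry the upstream `cyl` grouping.

```r
library(dplyr)
library(tidyr)
library(tibble)

df <- as_tibble(mtcars) %>%
  group_by(cyl)

# Drop the upstream grouping, then name the list-column explicitly.
# `data = -gear` nests every column except `gear` into a column called `data`.
ndf <- df %>%
  ungroup() %>%
  nest(data = -gear)

names(ndf)                 # columns of the nested result
names(ndf$data[[1]])       # columns kept inside the first nested tibble
```

Naming the argument also controls the name of the list-column, so `nest(cars = -gear)` would produce a `cars` column instead of `data`.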
By "current" you mean "dev" I assume? I was using CRAN v0.8.3. Thank you!