Tidyr: Calling `nest()` on a `grouped_df`: grouping variable overrides nesting variable

Created on 1 May 2019 · 5 Comments · Source: tidyverse/tidyr

When nest() is called on a grouped_df, the grouping variable, i.e. group_vars(df), overrides the variable the user passes to nest().
In the reprex below, df is grouped by cyl. When nest(df, -gear) is called, cyl replaces the deliberately specified nesting variable (gear) as the key column of the result. Further, gear disappears entirely: it is in neither the top-level result nor the nested tibbles.

reprex::reprex({
  library(dplyr)
  library(tidyr)
  library(tibble)

  #' Unexpected result calling `tidyr::nest(-gear)` if upstream `dplyr::group_by(cyl)`.
  df <- as_tibble(mtcars) %>%
    group_by(cyl)

  group_vars(df)

  #' The grouping variable (cyl) over-rides user-defined nesting variable (gear).
  #' Additionally, 'gear' is removed from the 'data' tibbles.
  ndf <- nest(df, -gear)
  ndf

  # gear disappears completely!
  "gear" %in% names(ndf)
  "gear" %in% names(ndf$data[[1]])   # check the nested tibble, not the list-column

  #' Expected result (compare tibble dims to those above):
  df %>%
    ungroup() %>%    # remove group_var: cyl
    nest(-gear)
})

This is a follow up to #551.

All 5 comments

Can you revise the above but insert what reprex::reprex() leaves on your clipboard instead, i.e. the rendered reprex?

Crap! That's what I did originally and thought better of it ;(

Reprex Output

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)
library(tibble)

Unexpected result calling tidyr::nest(-gear) if upstream dplyr::group_by(cyl).

df <- as_tibble(mtcars) %>%
  group_by(cyl)

group_vars(df)
#> [1] "cyl"

The grouping variable (cyl) over-rides user-defined nesting variable (gear).
Additionally, 'gear' is removed from the 'data' tibbles.

ndf <- nest(df, -gear)
ndf
#> # A tibble: 3 x 2
#>     cyl data             
#>   <dbl> <list>           
#> 1     6 <tibble [7 × 9]> 
#> 2     4 <tibble [11 × 9]>
#> 3     8 <tibble [14 × 9]>

# gear disappears completely!
"gear" %in% names(ndf)
#> [1] FALSE
"gear" %in% names(ndf$data)
#> [1] FALSE

Expected result (compare tibble dims to those above):

df %>%
  ungroup() %>%    # remove group_var: cyl
  nest(-gear)
#> # A tibble: 3 x 2
#>    gear data              
#>   <dbl> <list>            
#> 1     4 <tibble [12 × 10]>
#> 2     3 <tibble [15 × 10]>
#> 3     5 <tibble [5 × 10]>

Created on 2019-05-01 by the reprex package (v0.2.1)

Session info

devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.5.2 (2018-12-20)
#>  os       macOS Mojave 10.14.4        
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/Denver              
#>  date     2019-05-01                  
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package     * version date       lib source        
#>  assertthat    0.2.0   2017-04-11 [1] CRAN (R 3.5.0)
#>  backports     1.1.3   2018-12-14 [1] CRAN (R 3.5.0)
#>  callr         3.1.1   2018-12-21 [1] CRAN (R 3.5.0)
#>  cli           1.1.0   2019-03-19 [1] RSPM (R 3.5.2)
#>  crayon        1.3.4   2017-09-16 [1] RSPM (R 3.5.2)
#>  desc          1.2.0   2018-05-01 [1] CRAN (R 3.5.0)
#>  devtools      2.0.1   2018-10-26 [1] CRAN (R 3.5.1)
#>  digest        0.6.18  2018-10-10 [1] CRAN (R 3.5.0)
#>  dplyr       * 0.8.0.1 2019-02-15 [1] RSPM (R 3.5.2)
#>  evaluate      0.13    2019-02-12 [1] CRAN (R 3.5.2)
#>  fansi         0.4.0   2018-10-05 [1] CRAN (R 3.5.0)
#>  fs            1.2.6   2018-08-23 [1] CRAN (R 3.5.0)
#>  glue          1.3.1   2019-03-12 [1] CRAN (R 3.5.2)
#>  highr         0.7     2018-06-09 [1] CRAN (R 3.5.0)
#>  htmltools     0.3.6   2017-04-28 [1] CRAN (R 3.5.0)
#>  knitr         1.22    2019-03-08 [1] CRAN (R 3.5.2)
#>  magrittr      1.5     2014-11-22 [1] RSPM (R 3.5.2)
#>  memoise       1.1.0   2017-04-21 [1] CRAN (R 3.5.0)
#>  pillar        1.3.1   2018-12-15 [1] CRAN (R 3.5.0)
#>  pkgbuild      1.0.2   2018-10-16 [1] CRAN (R 3.5.0)
#>  pkgconfig     2.0.2   2018-08-16 [1] CRAN (R 3.5.0)
#>  pkgload       1.0.2   2018-10-29 [1] CRAN (R 3.5.0)
#>  prettyunits   1.0.2   2015-07-13 [1] CRAN (R 3.5.0)
#>  processx      3.2.1   2018-12-05 [1] CRAN (R 3.5.0)
#>  ps            1.3.0   2018-12-21 [1] CRAN (R 3.5.0)
#>  purrr         0.3.2   2019-03-15 [1] RSPM (R 3.5.2)
#>  R6            2.4.0   2019-02-14 [1] CRAN (R 3.5.2)
#>  Rcpp          1.0.0   2018-11-07 [1] CRAN (R 3.5.0)
#>  remotes       2.0.2   2018-10-30 [1] CRAN (R 3.5.0)
#>  rlang         0.3.4   2019-04-07 [1] RSPM (R 3.5.2)
#>  rmarkdown     1.11    2018-12-08 [1] CRAN (R 3.5.0)
#>  rprojroot     1.3-2   2018-01-03 [1] CRAN (R 3.5.0)
#>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 3.5.0)
#>  stringi       1.4.3   2019-03-12 [1] CRAN (R 3.5.2)
#>  stringr       1.4.0   2019-02-10 [1] RSPM (R 3.5.2)
#>  testthat      2.0.1   2018-10-13 [1] CRAN (R 3.5.0)
#>  tibble      * 2.1.1   2019-03-16 [1] RSPM (R 3.5.2)
#>  tidyr       * 0.8.3   2019-03-01 [1] RSPM (R 3.5.2)
#>  tidyselect    0.2.5   2018-10-11 [1] CRAN (R 3.5.0)
#>  usethis       1.4.0   2018-08-14 [1] CRAN (R 3.5.0)
#>  utf8          1.1.4   2018-05-24 [1] CRAN (R 3.5.0)
#>  withr         2.1.2   2018-03-15 [1] CRAN (R 3.5.0)
#>  xfun          0.5     2019-02-20 [1] CRAN (R 3.5.2)
#>  yaml          2.2.0   2018-07-25 [1] CRAN (R 3.5.0)
#> 
#> [1] /Users/sfield/r_libs
#> [2] /Library/Frameworks/R.framework/Versions/3.5/Resources/library

The current version of tidyr is, in fact, doing what you want. The way you pass the variables to be nested has changed: they must now be named, which is why the call below warns. The result is not yet printed nicely, but it is exactly what you want: gear is left out of the nested data and each of the resulting nested data frames is grouped by cyl:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)
library(tibble)

df <- as_tibble(mtcars) %>%
  group_by(cyl)

group_vars(df)
#> [1] "cyl"

ndf <- nest(df, -gear)
#> Warning: All elements of `...` must be named.
#> Did you want `data = c(mpg, cyl, disp, hp, drat, wt, qsec, vs, am, carb)`?
ndf
#> # A tibble: 3 x 2
#>    gear
#>   <dbl>
#> 1     4
#> 2     3
#> 3     5
#> # … with 1 more variable: data <list<df[,10]>>

ndf$data[[1]]
#> # A tibble: 12 x 10
#> # Groups:   cyl [3]
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  carb
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1  21       6 160     110  3.9   2.62  16.5     0     1     4
#>  2  21       6 160     110  3.9   2.88  17.0     0     1     4
#>  3  22.8     4 108      93  3.85  2.32  18.6     1     1     1
#>  4  24.4     4 147.     62  3.69  3.19  20       1     0     2
#>  5  22.8     4 141.     95  3.92  3.15  22.9     1     0     2
#>  6  19.2     6 168.    123  3.92  3.44  18.3     1     0     4
#>  7  17.8     6 168.    123  3.92  3.44  18.9     1     0     4
#>  8  32.4     4  78.7    66  4.08  2.2   19.5     1     1     1
#>  9  30.4     4  75.7    52  4.93  1.62  18.5     1     1     2
#> 10  33.9     4  71.1    65  4.22  1.84  19.9     1     1     1
#> 11  27.3     4  79      66  4.08  1.94  18.9     1     1     1
#> 12  21.4     4 121     109  4.11  2.78  18.6     1     1     2

Created on 2019-05-04 by the reprex package (v0.2.1)
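
For anyone who wants to avoid the warning above, here is a minimal sketch of the named form it suggests. It assumes tidyr >= 1.0.0 (the release in which the named-argument interface is available) and drops the upstream grouping explicitly, so the result does not depend on how nest() treats a grouped_df. The column name `data` is simply the one the warning proposes.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Assumption: tidyr >= 1.0.0, where columns to nest are given as a named argument.
df <- as_tibble(mtcars) %>%
  group_by(cyl)

# Remove the upstream grouping, then nest every column except gear
# into a list-column called `data`.
ndf <- df %>%
  ungroup() %>%
  nest(data = -gear)

names(ndf)          # gear is kept as the key column next to `data`
dim(ndf$data[[1]])  # rows and columns of the first nested tibble
```

Because the grouping is removed first, cyl ends up as an ordinary column inside each nested tibble rather than as a key column.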

By "current" you mean "dev" I assume? I was using CRAN v0.8.3. Thank you!
