Tidyr: Weird output with nest()

Created on 4 Jul 2019  ·  10 Comments  ·  Source: tidyverse/tidyr

I am not sure if this is the right repo to post this issue, but I am getting weird output when using the nest() function.

library(tidyverse)

mtcars %>%
  group_by(cyl) %>%
  nest()
#>   cyl
#> 1   6
#> 2   4
#> 3   8
#>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     data
#> 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     21.000, 21.000, 21.400, 18.100, 19.200, 17.800, 19.700, 160.000, 160.000, 258.000, 225.000, 167.600, 167.600, 145.000, 110.000, 110.000, 110.000, 105.000, 123.000, 123.000, 175.000, 3.900, 3.900, 3.080, 2.760, 3.920, 3.920, 3.620, 2.620, 2.875, 3.215, 3.460, 3.440, 3.440, 2.770, 16.460, 17.020, 19.440, 20.220, 18.300, 18.900, 15.500, 0.000, 0.000, 1.000, 1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 0.000, 0.000, 0.000, 0.000, 1.000, 4.000, 4.000, 3.000, 3.000, 4.000, 4.000, 5.000, 4.000, 4.000, 1.000, 1.000, 4.000, 4.000, 6.000
#> 2                                                                                                                                                                                                                                                   22.800, 24.400, 22.800, 32.400, 30.400, 33.900, 21.500, 27.300, 26.000, 30.400, 21.400, 108.000, 146.700, 140.800, 78.700, 75.700, 71.100, 120.100, 79.000, 120.300, 95.100, 121.000, 93.000, 62.000, 95.000, 66.000, 52.000, 65.000, 97.000, 66.000, 91.000, 113.000, 109.000, 3.850, 3.690, 3.920, 4.080, 4.930, 4.220, 3.700, 4.080, 4.430, 3.770, 4.110, 2.320, 3.190, 3.150, 2.200, 1.615, 1.835, 2.465, 1.935, 2.140, 1.513, 2.780, 18.610, 20.000, 22.900, 19.470, 18.520, 19.900, 20.010, 18.900, 16.700, 16.900, 18.600, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000, 0.000, 0.000, 1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 4.000, 4.000, 4.000, 4.000, 4.000, 4.000, 3.000, 4.000, 5.000, 5.000, 4.000, 1.000, 2.000, 2.000, 1.000, 2.000, 1.000, 1.000, 1.000, 2.000, 2.000, 2.000
#> 3 18.700, 14.300, 16.400, 17.300, 15.200, 10.400, 10.400, 14.700, 15.500, 15.200, 13.300, 19.200, 15.800, 15.000, 360.000, 360.000, 275.800, 275.800, 275.800, 472.000, 460.000, 440.000, 318.000, 304.000, 350.000, 400.000, 351.000, 301.000, 175.000, 245.000, 180.000, 180.000, 180.000, 205.000, 215.000, 230.000, 150.000, 150.000, 245.000, 175.000, 264.000, 335.000, 3.150, 3.210, 3.070, 3.070, 3.070, 2.930, 3.000, 3.230, 2.760, 3.150, 3.730, 3.080, 4.220, 3.540, 3.440, 3.570, 4.070, 3.730, 3.780, 5.250, 5.424, 5.345, 3.520, 3.435, 3.840, 3.845, 3.170, 3.570, 17.020, 15.840, 17.400, 17.600, 18.000, 17.980, 17.820, 17.420, 16.870, 17.300, 15.410, 17.050, 14.500, 14.600, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 1.000, 1.000, 3.000, 3.000, 3.000, 3.000, 3.000, 3.000, 3.000, 3.000, 3.000, 3.000, 3.000, 3.000, 5.000, 5.000, 2.000, 4.000, 3.000, 3.000, 3.000, 4.000, 4.000, 4.000, 2.000, 2.000, 4.000, 2.000, 4.000, 8.000

Created on 2019-07-04 by the reprex package (v0.3.0)

Session info

devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.0 (2019-04-26)
#>  os       Linux Mint 19.1             
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language en_CA:en                    
#>  collate  en_CA.UTF-8                 
#>  ctype    en_CA.UTF-8                 
#>  tz       America/Toronto             
#>  date     2019-07-04                  
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                          
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 3.6.0)                  
#>  backports     1.1.4      2019-04-10 [1] CRAN (R 3.6.0)                  
#>  broom         0.5.2      2019-04-07 [1] CRAN (R 3.6.0)                  
#>  callr         3.3.0      2019-07-04 [1] CRAN (R 3.6.0)                  
#>  cellranger    1.1.0      2016-07-27 [1] CRAN (R 3.6.0)                  
#>  cli           1.1.0      2019-03-19 [1] CRAN (R 3.6.0)                  
#>  colorspace    1.4-1      2019-03-18 [1] CRAN (R 3.6.0)                  
#>  crayon        1.3.4      2017-09-16 [2] CRAN (R 3.5.2)                  
#>  desc          1.2.0      2018-05-01 [2] CRAN (R 3.5.2)                  
#>  devtools      2.0.2      2019-04-08 [1] CRAN (R 3.6.0)                  
#>  digest        0.6.20     2019-07-04 [1] CRAN (R 3.6.0)                  
#>  dplyr       * 0.8.2      2019-06-29 [1] CRAN (R 3.6.0)                  
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 3.6.0)                  
#>  forcats     * 0.4.0      2019-02-17 [1] CRAN (R 3.6.0)                  
#>  fs            1.3.1      2019-05-06 [1] CRAN (R 3.6.0)                  
#>  generics      0.0.2      2018-11-29 [1] CRAN (R 3.6.0)                  
#>  ggplot2     * 3.2.0      2019-06-16 [1] CRAN (R 3.6.0)                  
#>  glue          1.3.1      2019-03-12 [1] CRAN (R 3.6.0)                  
#>  gtable        0.3.0      2019-03-25 [1] CRAN (R 3.6.0)                  
#>  haven         2.1.1      2019-07-04 [1] CRAN (R 3.6.0)                  
#>  highr         0.8        2019-03-20 [1] CRAN (R 3.6.0)                  
#>  hms           0.4.2      2018-03-10 [1] CRAN (R 3.6.0)                  
#>  htmltools     0.3.6      2017-04-28 [1] CRAN (R 3.6.0)                  
#>  httr          1.4.0      2018-12-11 [2] CRAN (R 3.5.2)                  
#>  jsonlite      1.6        2018-12-07 [1] CRAN (R 3.6.0)                  
#>  knitr         1.23       2019-05-18 [1] CRAN (R 3.6.0)                  
#>  lattice       0.20-38    2018-11-04 [1] CRAN (R 3.6.0)                  
#>  lazyeval      0.2.2      2019-03-15 [1] CRAN (R 3.6.0)                  
#>  lubridate     1.7.4      2018-04-11 [1] CRAN (R 3.6.0)                  
#>  magrittr      1.5        2014-11-22 [2] CRAN (R 3.5.2)                  
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 3.6.0)                  
#>  modelr        0.1.4      2019-02-18 [1] CRAN (R 3.6.0)                  
#>  munsell       0.5.0      2018-06-12 [1] CRAN (R 3.6.0)                  
#>  nlme          3.1-140    2019-05-12 [4] CRAN (R 3.6.0)                  
#>  pillar        1.4.2      2019-07-04 [1] Github (r-lib/pillar@71bc9d4)   
#>  pkgbuild      1.0.3      2019-03-20 [1] CRAN (R 3.6.0)                  
#>  pkgconfig     2.0.2      2018-08-16 [1] CRAN (R 3.6.0)                  
#>  pkgload       1.0.2      2018-10-29 [2] CRAN (R 3.5.2)                  
#>  prettyunits   1.0.2      2015-07-13 [2] CRAN (R 3.5.2)                  
#>  processx      3.4.0      2019-07-03 [1] CRAN (R 3.6.0)                  
#>  ps            1.3.0      2018-12-21 [2] CRAN (R 3.5.2)                  
#>  purrr       * 0.3.2      2019-03-15 [1] CRAN (R 3.6.0)                  
#>  R6            2.4.0      2019-02-14 [1] CRAN (R 3.6.0)                  
#>  Rcpp          1.0.1      2019-03-17 [1] CRAN (R 3.6.0)                  
#>  readr       * 1.3.1      2018-12-21 [1] CRAN (R 3.6.0)                  
#>  readxl        1.3.1      2019-03-13 [1] CRAN (R 3.6.0)                  
#>  remotes       2.1.0      2019-06-24 [1] CRAN (R 3.6.0)                  
#>  rlang         0.4.0      2019-06-25 [1] CRAN (R 3.6.0)                  
#>  rmarkdown     1.13       2019-05-22 [1] CRAN (R 3.6.0)                  
#>  rprojroot     1.3-2      2018-01-03 [2] CRAN (R 3.5.2)                  
#>  rvest         0.3.4      2019-05-15 [1] CRAN (R 3.6.0)                  
#>  scales        1.0.0      2018-08-09 [1] CRAN (R 3.6.0)                  
#>  sessioninfo   1.1.1      2018-11-05 [2] CRAN (R 3.5.2)                  
#>  stringi       1.4.3      2019-03-12 [1] CRAN (R 3.6.0)                  
#>  stringr     * 1.4.0      2019-02-10 [1] CRAN (R 3.6.0)                  
#>  testthat      2.1.1      2019-04-23 [1] CRAN (R 3.6.0)                  
#>  tibble      * 2.1.3      2019-06-06 [1] CRAN (R 3.6.0)                  
#>  tidyr       * 0.8.3.9000 2019-07-02 [1] Github (tidyverse/tidyr@d9ecc8f)
#>  tidyselect    0.2.5      2018-10-11 [1] CRAN (R 3.6.0)                  
#>  tidyverse   * 1.2.1      2017-11-14 [1] CRAN (R 3.6.0)                  
#>  usethis       1.5.1      2019-07-04 [1] CRAN (R 3.6.0)                  
#>  vctrs         0.2.0      2019-07-04 [1] Github (r-lib/vctrs@5edf2c0)    
#>  withr         2.1.2      2018-03-15 [2] CRAN (R 3.5.2)                  
#>  xfun          0.8        2019-06-25 [1] CRAN (R 3.6.0)                  
#>  xml2          1.2.0      2018-01-24 [1] CRAN (R 3.6.0)                  
#>  yaml          2.2.0      2018-07-25 [1] CRAN (R 3.6.0)                  
#>  zeallot       0.1.0      2018-01-28 [1] CRAN (R 3.6.0)                  
#> 
#> [1] /home/pmassicotte/R/x86_64-pc-linux-gnu-library/3.6
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library

All 10 comments

This is just how a data.frame object is printed; it is not weird.
Append %>% as_tibble() and you should see what I guess you want.

This is with the dev version, and it happens even after converting to a tibble.

https://github.com/r-lib/pillar/issues/169#issuecomment-508551648

Here are a few more examples suggesting that something in tidyr affects how list columns are printed in tibbles.
Note that group_nest() works as expected.

library(tidyr)
packageVersion("tidyr")
#> [1] '0.8.3.9000'
library(dplyr)

mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  as_tibble()
#> # A tibble: 3 x 2
#>     cyl            data
#>   <dbl> <list<df[,10]>>
#> 1     6        [7 x 10]
#> 2     4       [11 x 10]
#> 3     8       [14 x 10]

mtcars %>%
  group_nest(cyl)
#> # A tibble: 3 x 2
#>     cyl data              
#>   <dbl> <list>            
#> 1     4 <tibble [11 x 10]>
#> 2     6 <tibble [7 x 10]> 
#> 3     8 <tibble [14 x 10]>

mtcars %>%
  as_tibble() %>%
  group_by(cyl) %>%
  nest() %>%
  as_tibble()
#> # A tibble: 3 x 2
#>     cyl            data
#>   <dbl> <list<df[,10]>>
#> 1     6        [7 x 10]
#> 2     4       [11 x 10]
#> 3     8       [14 x 10]

Created on 2019-07-05 by the reprex package (v0.3.0.9000)

I got it.

The problem is that nest.tbl_df now returns a plain data.frame, whereas in tidyr 0.8.3 the result had class "tbl_df" "tbl" "data.frame".

I'll dig into the source of nest.tbl_df.

tidyr 0.8.3.9000

library(tidyr)
library(dplyr, warn.conflicts = FALSE)


mtcars %>%
  as_tibble %>%
  group_by(cyl) %>%
  nest() %>%
  class
#> [1] "data.frame"


mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  class
#> [1] "data.frame"

Created on 2019-07-05 by the reprex package (v0.3.0)

tidyr 0.8.3

library(tidyr)
library(dplyr, warn.conflicts = FALSE)
mtcars %>%
  as_tibble %>%
  group_by(cyl) %>%
  nest() %>%
  class
#> [1] "tbl_df"     "tbl"        "data.frame"
mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  class
#> [1] "tbl_df"     "tbl"        "data.frame"

Created on 2019-07-05 by the reprex package (v0.3.0)

The problem is that vctrs::vec_cbind() returns a data.frame when the first argument is a grouped_df and the second is a data.frame.
This is what happens at the last step of mtcars %>% group_by(cyl) %>% nest().

I filed an issue in vctrs.

You can reinstall vctrs to fix:

remotes::install_github("r-lib/vctrs")

Related to #649

The broader question here is to consider what happens when a grouped data frame and a data frame are passed through vec_cbind(), as is done at the end of the nest.tbl_df call.

When dplyr implements vec_ptype2.grouped_df, I think that will override Lionel's changes in vctrs and force a grouped data frame to be returned from nest(), since the vec_ptype2(<grouped_df>, <data_frame>) is likely another <grouped_df>.

Should nest.grouped_df return another grouped data frame?
(The CRAN version does not, but I like the idea of retaining the grouping info because then grouped_df %>% nest() %>% unnest() returns another grouped_df.)

  • If so, when dplyr implements the ptype methods things will work correctly.
  • If not, nest.grouped_df probably needs to call ungroup() before passing on to nest.tbl_df
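
Until that question is settled, user code may not want to depend on which class nest() returns for grouped input. The following is a minimal sketch of a user-side approach, not the tidyr implementation: it inspects the class instead of relying on the printed output, then drops the grouping and converts to a tibble explicitly. The names nested and nested_plain are only illustrative, and the class reported for nested is assumed to vary with the installed tidyr, dplyr and vctrs versions.

```r
library(dplyr, warn.conflicts = FALSE)
library(tidyr)

# Nest the non-grouping columns of each cyl group into a list column.
nested <- mtcars %>%
  group_by(cyl) %>%
  nest()

# Check the class directly; what this reports depends on package versions.
class(nested)

# Drop any grouping and coerce to a tibble, so later steps see the same
# class whether nest() returned a data.frame, a tibble or a grouped_df.
nested_plain <- nested %>%
  ungroup() %>%
  as_tibble()

class(nested_plain)
```

Calling ungroup() on an object that is not grouped is harmless, so the same pipeline can be used regardless of what nest() hands back.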

I tend to agree that we should retain group information as much as possible. However, changing this might cause breakage.

Duplicate of #649
