library(tidyverse)
tibble(list = list(NULL, tibble(x = 1))) %>%
  unnest()
#> Error: Each column must either be a list of vectors or a list of data frames [list]
I would have expected:
#> # A tibble: 2 x 1
#> x
#> <dbl>
#> 1 NA
#> 2 1.00
I've been thinking about this issue too. To me, it feels like this fits in with the discussion happening over at #358 ...
I have a use case for this where:
I have multiple datasets in a clinical trial. Some have one or more rows per subject; others may have zero rows for a subject. Specifically, lab measurements of blood concentrations have at least one row per subject, while adverse events (i.e., side effects) have zero or more rows per subject.
The number of rows in each dataset is the number of observations, which matters for adverse events: imputing an empty row would complicate downstream processing, because counting adverse events would no longer be a simple row count.
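To illustrate the counting problem, here is a minimal sketch with made-up data. The `ae_imputed` table and its values are hypothetical and not taken from the trial; it only assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical data: subject 2 has no adverse events,
# represented here by an imputed placeholder row with AE = NA
ae_imputed <- tibble(
  SUBJID = c(1, 1, 2),
  AE     = c("nausea", "headache", NA)
)

# A plain row count reports one event for subject 2, which is wrong
naive <- count(ae_imputed, SUBJID)

# Every count now has to exclude the placeholder explicitly
correct <- ae_imputed %>%
  group_by(SUBJID) %>%
  summarise(n_events = sum(!is.na(AE)))
```

With zero rows for subject 2 instead of a placeholder, the plain row count would already be right for every subject that appears, which is why keeping the nested data empty is preferable here.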
What I want to do is make nested datasets for both, merge them by subject number, and be able to unnest either individually later. As an example:
library(tidyverse)
#> -- Attaching packages ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- tidyverse 1.2.1 --
#> v ggplot2 2.2.1 v purrr 0.2.5
#> v tibble 1.4.2 v dplyr 0.7.5
#> v tidyr 0.8.1 v stringr 1.3.1
#> v readr 1.1.1 v forcats 0.3.0
#> Warning: package 'tidyr' was built under R version 3.4.4
#> Warning: package 'purrr' was built under R version 3.4.4
#> Warning: package 'dplyr' was built under R version 3.4.4
#> Warning: package 'stringr' was built under R version 3.4.4
#> -- Conflicts ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- tidyverse_conflicts() --
#> x dplyr::filter() masks stats::filter()
#> x dplyr::lag() masks stats::lag()
d_adverse <-
  data.frame(SUBJID=1,
             AE="nausea") %>%
  as_tibble() %>%
  nest(-SUBJID, .key="adverse")
d_lab <-
  data.frame(SUBJID=1:2,
             labname="cholesterol") %>%
  as_tibble() %>%
  nest(-SUBJID, .key="lab")
#> Warning: package 'bindrcpp' was built under R version 3.4.4
d_total <- full_join(d_adverse, d_lab)
#> Joining, by = "SUBJID"
d_total %>%
  select(-lab) %>%
  unnest()
#> Error: Each column must either be a list of vectors or a list of data frames [adverse]
Hi @billdenney, if you need a temporary workaround, perhaps modifying the adverse list column to replace NULL with an empty tibble could help. Unnesting then returns the original adverse event counts:
library(tidyverse)
d_total %>%
  mutate(adverse = map_if(adverse, is.null, ~ tibble())) %>%
  select(-lab) %>%
  unnest()
#> # A tibble: 1 x 2
#> SUBJID AE
#> <dbl> <fct>
#> 1 1 nausea
Supporting NULL values seems reasonable to me.
Fixed with a quick hack; will hopefully naturally fall out when I rewrite unnest() to use vctrs.
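For readers on a later tidyr, the sketch below shows how NULL elements behave there, as far as I know. It assumes a tidyr release whose unnest() has the `cols` and `keep_empty` arguments (1.0.0 or later, to my knowledge); the column name `nested` is only illustrative.

```r
library(tidyr)
library(tibble)

# Illustrative data: one NULL element and one single-row tibble
df <- tibble(id = 1:2, nested = list(NULL, tibble(x = 1)))

# Default: a NULL element has size 0, so its row is dropped
dropped <- unnest(df, cols = nested)

# keep_empty = TRUE keeps that row and fills the unnested column with NA
kept <- unnest(df, cols = nested, keep_empty = TRUE)
```

The default suits the adverse-event use case above, where a subject without events should contribute zero rows; `keep_empty = TRUE` gives the NA row shown as the expected output in the original report.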