Dplyr: bind_rows rejects lists with additional class names

Created on 12 Jul 2017 · 7 Comments · Source: tidyverse/dplyr

When a list carries an additional class name, bind_rows does not recognize the object as a list and generates the following error:

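Error in bind_rows_(x, .id) :
  Argument 1 must be a data frame or a named atomic vector, not a my_class/list

# Two small data frames with the same columns
df1 <- data.frame(a = 1, b = 2)
df2 <- data.frame(a = 3, b = 4)

# l1 is a plain list; l2 is the same list with an extra class name prepended
l1 <- list(df1, df2)
l2 <- structure(l1, class = c("my_class", "list"))
class(l1)  # "list"
class(l2)  # "my_class" "list"

dplyr::bind_rows(l1)  # works
dplyr::bind_rows(l2)  # fails with the error above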

All 7 comments

Thanks. Could you please describe your use case?

I have defined a data model within a package which consists of a list of date-stamped dataframes (i.e. one dataframe for each day). My package-specific data model is basically a list, but I have also tagged it with another class name for internal tracking/consistency purposes.

Sometimes I need to flatten the list objects into a single dataframe.

As shown in my example above, bind_rows works if the object has the class list alone, but it fails when an additional class name is added.

Another option for this kind of data would be a nested tibble; there you would just call unnest() instead of bind_rows(). The test in bind_rows() currently errs on the side of caution. Can you define an as.data.frame() method for your objects and use that before binding?
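
A minimal sketch of the as.data.frame() suggestion, reusing the my_class example from the report. The method body is an assumption about what such a method could do (strip the extra class and bind the elements); it is not code from dplyr or from the reporter's package.

library(dplyr)

# Same objects as in the original report
df1 <- data.frame(a = 1, b = 2)
df2 <- data.frame(a = 3, b = 4)
l2 <- structure(list(df1, df2), class = c("my_class", "list"))

# Hypothetical S3 method for the reporter's class: drop the class
# attribute so bind_rows() sees a plain list, then bind the elements
as.data.frame.my_class <- function(x, ...) {
  dplyr::bind_rows(unclass(x))
}

# Dispatches on "my_class" and returns one data frame with all rows
flat <- as.data.frame(l2)

Keeping the conversion in a method means callers never have to know that the object is a list underneath.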

We're working on a better way; take a look at the "Methods" section in http://adv-r.hadley.nz/s3.html.

Hi,
Similarly, passing the result of by() (a set of data frames) to bind_rows() no longer works in 0.7.2, although it worked in 0.5.0. I agree it's not the best way to do this, but it was very useful when bind_rows() was used as a replacement for the do.call(rbind, ...) pattern and not all of the code had been converted to dplyr yet.

library(dplyr)

data("iris")

bind_rows(
  by(iris, iris$Species, function(ir) {
    data.frame(species=ir$Species[1], dummy=mean(ir$Sepal.Length))
  })
)

It is still possible to do it like this:

bind_rows(
  unclass(by(iris, iris$Species, function(ir) {
    data.frame(species=ir$Species[1], dummy=mean(ir$Sepal.Length))
  }))
)

Maybe we should just treat lists that don't inherit from data frames as lists... It's just not clear what the right behaviour is with objects.

For new code, yes, but from a compatibility point of view...
I used an S3 redefinition of bind_rows to get a bind_rows.by() and avoid modifying all the code.
by() could probably be treated as an exception...
Shouldn't bind_rows() accept a wide range of list-like objects (for robustness) and so limit compatibility issues?
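
One way such a compatibility shim might look, as a sketch only. bind_rows_compat() is a hypothetical name, not a dplyr function, and this is not the commenter's actual code. It assumes that stripping every attribute except the names from a non-data-frame list is acceptable for the caller.

library(dplyr)

# Hypothetical wrapper: reduce classed list-like objects (such as the
# result of by()) to a plain named list before calling bind_rows()
bind_rows_compat <- function(x, ...) {
  if (is.list(x) && !is.data.frame(x)) {
    nms <- names(x)          # for a by() result these are the group labels
    x <- unclass(x)
    attributes(x) <- NULL    # drop dim, dimnames, call, etc.
    names(x) <- nms
  }
  dplyr::bind_rows(x, ...)
}

per_species <- by(iris, iris$Species, function(ir) {
  data.frame(species = ir$Species[1], dummy = mean(ir$Sepal.Length))
})

res <- bind_rows_compat(per_species)

Legacy call sites could then call the wrapper instead of bind_rows(), leaving the by() calls themselves unchanged.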

I think changing this behaviour is likely to have far-reaching consequences, and I don't think it's worth it given that you can easily unclass() the input.
