Sf: cbind of sf and df

Created on 2 Dec 2016 · 19 Comments · Source: r-spatial/sf

Hi all,
calling cbind on an sf object and a data.frame creates a weird matrix object:
dd <- cbind(sf,df)

But if we do it this way, it works:

dd <- st_sf(data.frame(sf,df))

Since cbind.sf does the above internally, I think a minor change could make cbind.sf more flexible.

All 19 comments

Please try a pull request, it is hard for me to see what you suggest to change otherwise.

It turns out the problem is not in sf but in the way cbind dispatches on sf objects:

sf <- st_read(system.file("shape/nc.shp", package="sf"), "nc", crs = 4267)
df <- data.frame( field1=1:nrow(sf) , field2=1:nrow(sf))

weird_obj <- cbind(sf,df)
new_sf <- cbind.sf(sf,df)  #works

cbind is an internal generic function and I don't understand why cbind has this behaviour!

Since cbind has no named arguments before ..., it can't use standard method dispatch.
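To see why ordinary S3 dispatch does not apply here, it can help to inspect cbind itself. The sketch below uses only base R; the class name "toy" and the objects x and r_same are hypothetical stand-ins and not part of sf.

```r
# cbind's first formal is "...", so there is no single argument to dispatch on.
arg_names <- names(formals(cbind))
print(arg_names)

# The body goes straight to internal code instead of calling UseMethod().
print(body(cbind))

# Hypothetical toy class standing in for sf; the method only reports that it ran.
cbind.toy <- function(..., deparse.level = 1) "cbind.toy called"
x <- structure(list(a = 1:2), class = "toy")

# When every classed argument resolves to the same method, that method is used.
r_same <- cbind(x, x)
print(r_same)
```

The internal rules for arguments of different classes are described in the Dispatch section of ?cbind; this thread reports that cbind(sf, df) ends up in the default matrix code rather than in cbind.sf.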

Any workaround?

Call cbind.sf directly?

dplyr has bind_rows() and bind_cols(). Maybe sf could use a similar interface, or something like sf_bind_rows() and sf_bind_cols() with a similar set of strict conditions?

base:::rbind works for binding rows of sf objects, and cbind.sf or dplyr:::bind_cols work for binding columns. Do we need other aliases for the cbind case, or for both, in order to hide this from the user (as @etiennebr suggests)?

I don't think additional aliases are necessary. Using well-known interfaces such as cbind is more intuitive and quicker for the end user to pick up. However, I think cbind.sf should be documented, because not all R users are aware of how method dispatch works in R. I am not a dplyr user, but if that bind_cols function is not exported (which the notation suggests), an alias would be appropriate, again because not everybody knows how to access a non-exported function.

cbind and rbind methods seem to work; can we close this?

Still not working. cbind(sf, df) creates an object of class matrix with mode list. I am using the CRAN version, though.

But rbind(sf, sf) works, right?

For cbind, I suggest exporting an st_bind_cols that explicitly calls sf:::cbind.sf, then.

rbind(sf, sf) works fine. st_bind_cols is better than the cryptic cbind.sf.

dplyr::bind_cols now works with sf and non-sf objects, but this might only be true for the (pre) 0.6.0 version on github.com/tidyverse/

Please make it work with sfc as well (i.e. when an sfc is passed instead of an sf):

st_bind_cols(st_sfc(st_point(c(1,1))) , a= 1, data.frame(b=2) )

Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) :
cannot coerce class "c("sfc_POINT", "sfc")" to a data.frame

The following change is needed:

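# Pick the binding function from the class of the first argument.
st_bind_cols <- function(...) {
  arg1 <- list(...)[[1]]

  if (inherits(arg1, "sf"))
    cbind.sf(...)
  else if (inherits(arg1, "sfc"))
    cbind.sfc(...)
}

# For an sfc first argument, let st_sf() assemble the columns.
cbind.sfc <- function(...) {
  sf::st_sf(...)
}

st_bind_cols(st_sfc(st_point(c(1, 1))), a = 1, data.frame(b = 2))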

A better approach might be to add:

as.data.frame.sfc = function(x, ...) {
    ret = data.frame(row.names = seq_along(x))
    ret$geometry = x
    ret
}

dplyr::bind_cols(sf, data.frame) works, but the result is sub-optimal: the columns from the data.frame are not placed before the geometry column, as I had expected. Is this intended behavior? MWE (dplyr 0.7-4, sf 0.5-4):

sf <- st_read(system.file("shape/nc.shp", package="sf"), "nc", crs = 4267)
df <- data.frame(field1 = 1:nrow(sf) , field2 = 1:nrow(sf))

new_sf <- dplyr::bind_cols(sf, df)
names(new_sf)

 [1] "AREA"      "PERIMETER" "CNTY_"     "CNTY_ID"   "NAME"      "FIPS"      "FIPSNO"    "CRESS_ID"  "BIR74"     "SID74"     "NWBIR74"   "BIR79"     "SID79"    
[14] "NWBIR79"   "geometry"  "field1"    "field2" 

Why does the order of columns matter? I would expect bind_cols to keep them in the order of appearance.

It matters, for example, when viewing sf objects in the RStudio environment pane or print()-ing them, since you're usually not interested in looking directly at the geometry column in those cases. Also, I thought that the sf convention was to have the geometry column last? And yes, that's the expected behavior of bind_cols. I think I was hinting at providing an sf method which puts geometry last.

Sounds like a GUI issue rather than an R issue. Since bind_cols is not a generic, there is nothing sf can do for you here, AFAICT.
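If the column order matters for display, one option is to move the geometry column to the end by reordering the column names after binding. The sketch below uses a plain data.frame as a stand-in so that it runs without sf; the helper name move_col_last and the sample columns are hypothetical, and it is an assumption that column subsetting on an sf object keeps the requested order in the same way.

```r
# Hypothetical helper: return df with the column `col` moved to the last position.
move_col_last <- function(df, col = "geometry") {
  stopifnot(col %in% names(df))
  df[c(setdiff(names(df), col), col)]
}

# Plain data.frame stand-in for the bind_cols() result (a real sf object
# would hold an sfc list-column in `geometry`).
d <- data.frame(NAME = c("Ashe", "Alleghany"),
                geometry = c("POINT (0 0)", "POINT (1 1)"),
                field1 = 1:2,
                field2 = 1:2)

reordered <- move_col_last(d)
print(names(reordered))
```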
