I'm not sure if this is an SF or dplyr problem, so posting here first.
I'm updating my package to support the major breaking changes in dplyr 1.0.
I use dplyr::bind_rows to bind a list of sf data frames into a single data frame.
Simple example:
xy1 <- st_linestring(matrix(runif(18), ncol = 2))
xy1 <- st_as_sf(data.frame(id = 1, geometry = st_as_sfc(list(xy1))))
xy2 <- st_linestring(matrix(runif(18), ncol = 2))
xy2 <- st_as_sf(data.frame(id = 2, geometry = st_as_sfc(list(xy2))))
res_xy <- dplyr::bind_rows(list(xy1, xy2))
This works with both the CRAN and GitHub version of dplyr.
But if I use an XYZ linestring:
xyz1 <- st_linestring(matrix(runif(18), ncol = 3))
xyz1 <- st_as_sf(data.frame(id = 1, geometry = st_as_sfc(list(xyz1))))
xyz2 <- st_linestring(matrix(runif(18), ncol = 3))
xyz2 <- st_as_sf(data.frame(id = 2, geometry = st_as_sfc(list(xyz2))))
res_xyz <- dplyr::bind_rows(list(xyz1, xyz2))
I get an error:
Error in st_sfc(x, crs = st_crs(to), precision = st_precision(to)) :
found multiple dimensions: XYZ XY
This is not true as all the geometries are LINESTRING Z, and this error does not occur when I use the CRAN version of dplyr.
Session Info:
> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] dplyr_0.8.99.9002 sf_0.9-3 opentripplanner_0.2.1.0 testthat_2.3.2
loaded via a namespace (and not attached):
[1] Rcpp_1.0.4.6 compiler_3.6.3 pillar_1.4.4 prettyunits_1.1.1
[5] remotes_2.1.1 class_7.3-15 tools_3.6.3 googlePolylines_0.7.2
[9] pkgbuild_1.0.8 jsonlite_1.6.1 checkmate_2.0.0 lifecycle_0.2.0
[13] tibble_3.0.1 pkgconfig_2.0.3 rlang_0.4.6 cli_2.0.2
[17] DBI_1.1.0 rstudioapi_0.11 parallel_3.6.3 curl_4.3
[21] e1071_1.7-3 httr_1.4.1 withr_2.2.0 generics_0.0.2
[25] vctrs_0.3.0 grid_3.6.3 classInt_0.4-3 rprojroot_1.3-2
[29] tidyselect_1.0.0 glue_1.4.0 R6_2.4.1 processx_3.4.2
[33] pbapply_1.4-2 fansi_0.4.1 callr_3.4.3 purrr_0.3.4
[37] magrittr_1.5 units_0.6-6 backports_1.1.6 ps_1.3.2
[41] ellipsis_0.3.0 assertthat_0.2.1 geodist_0.0.3 KernSmooth_2.23-16
[45] crayon_1.3.4
@Robinlovelace does this affect any of your packages?
@lionel- any ideas?
does this affect any of your packages?
Not at present, the slopes package is the only one that explicitly uses xyz coordinates and that has no dplyr dependencies. FYI I did notice lots various sf things crashing when I tested the dev version of dplyr for stplanr which is now protected from the incoming changes, but didn't report them because that was a while ago an the underlying vctrs work seemed to be in flux. Now is a great time to fix issues like this to ensure dplyr/sf compatibility. Let us know @lionel- and others if we can help with any more testing or use cases.
I was going to take a look. I'll do it now.
@Robinlovelace Any failing test cases that you think might have different causes of failures would be great!
@mem48 I'm working on improving base fallbacks in vctrs so we can just use c.sfc() instead of vctrs methods. Also FYI we're delaying the 1.0 release two more weeks.
@lionel- thanks for the update. Sorry I got a bit panicked that my package would break and be pulled from CRAN. Glad to hear a solution is in the works.
@edzer I think you can close this since the fix will come from our side, and we're testing this specific case.
Thanks so much, @hadley & @lionel- !
@lionel- I tested your commits to vctrs and this does not appeare to be fixed with lists longer than 2. Using my original example
# works
res_xyz2 <- dplyr::bind_rows(list(xyz1, xyz2))
# fails
res_xyz3 <- dplyr::bind_rows(list(xyz1, xyz2, xyz1))
Error in st_geometry.sf(x) :
attr(obj, "sf_column") does not point to a geometry column.
Did you rename it, without setting st_geometry(obj) <- "newname"?
A new error message, so perhaps a new problem?
@mem48 Confirmed thanks, I see what's going on. Sorry this might have to wait until after Wednesday, but I'll hold vctrs 0.3.1 so I can fix this.
Thanks, at the moment I'm advising my users not to update dplyr but that will become harder over time.
As far as I know bind_rows() with sf data frames didn't work with dplyr 0.8.5 either, so that's not a regression. Do you know about issues for other functionality? Any reprexes would be helpful, becuase it allows us to grow our set of unit tests.
So bind_rows() does not work perfectly in dplyr 0.8.5 with sf dataframes, but it can be made to work if you make a few assumptions.
The place my package is having problems is here
The relevant bit of code is below.
suppressWarnings(results_routes <- dplyr::bind_rows(results_routes))
results_routes <- as.data.frame(results_routes)
results_routes$geometry <- sf::st_sfc(results_routes$geometry)
results_routes <- sf::st_sf(results_routes)
sf::st_crs(results_routes) <- 4326
results_routes is a list of sf data frames, in this case, I know that for every data frame in the list the CRS, column names / types, geometry type etc is always the same. So this trick works. It creates warnings, hence the suppressWarnings but it works.
I've seen this used quite widely, it is useful when iterating over each row of an sf data frame, If you use a for loop or lapply you often end up with a list of 1 row data frames.
@lionel- I've had an email from CRAN giving me until the 15th to fix the problem, or my package will be pulled from CRAN. Do you think vctrs 0.3.1 will be out before then? Otherwise, I'm going to have to find a way to remove dplyr from my package dependencies.
@edzer If this is not fixed soon, it would be worth reopening this issue, in case others come across the same problem.
@mem48 yes — we are literally working on this as hard as we possible can.
@mem48 just to be clear, this will be fixed by vctrs 0.3.1 which will be coming out later this week, but you have made a bad situation for yourself. It's a bad idea to silence warnings that tell you what you're doing is dangerous.
@hadley as a general rule I would agree. But my options were limited. Not using bind_rows would have left me using do.call(rbind) which is so slow as to make the package unusable. While not suppressing the warning would leave my package users with lots of warnings that they wouldn't understand.