It seems st_bbox(x) returns the bounding box of the entire x object (which makes sense). In my case, I would like to get the bounding box of each feature in x. For example, I can do
library(sf)
nc <- st_read(system.file("shape/nc.shp", package="sf"))
bboxes <- list()
for(i in seq_len(nrow(nc))) bboxes <- c(bboxes, list(st_bbox(nc[i, ])))
But this is obviously slow for large feature sets. Is there a better way and, if not, would it be difficult to vectorize st_bbox() at the C++ level? Thanks!
> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.2
Small tip - maybe it would be helpful in your case. You should not grow an object if you know its future size. The code below should be substantially faster for large datasets:
bboxes2 <- vector("list", nrow(nc))
for(i in seq_len(nrow(nc))) bboxes2[[i]] <- st_bbox(nc[i, ])
Update:
An even faster approach, though it drops the CRS:
nc2 = st_geometry(nc)
bboxes3 <- vector("list", length(nc2))
for(i in seq_len(length(nc2))) bboxes3[[i]] <- st_bbox(nc2[[i]])
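If the CRS matters, one possible variation (a sketch, not benchmarked here) is to subset the geometry column with single brackets. `geom[i]` stays an `sfc` object and so keeps its CRS, whereas `geom[[i]]` is a bare `sfg` without one. The names `geom` and `bboxes_crs` are just illustrative, and this variant is likely slower than the `[[i]]` version above because each subset rebuilds an `sfc`:

```r
library(sf)
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
geom <- st_geometry(nc)

# Single-bracket subsetting returns a length-one sfc that still carries the CRS,
# so the resulting bbox keeps it too.
bboxes_crs <- vector("list", length(geom))
for (i in seq_along(geom)) bboxes_crs[[i]] <- st_bbox(geom[i])

# Check that the CRS of a per-feature bbox matches the CRS of the layer
st_crs(bboxes_crs[[1]]) == st_crs(nc)
```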
Comparison using a medium-sized dataset:
library(sf)
#> Linking to GEOS 3.6.1, GDAL 2.1.4, proj.4 4.9.3
## mock up some point data
n = 1e4
set.seed(234)
dat = data.frame(x = runif(n), y = runif(n))
pts = st_as_sf(dat, coords = c("x", "y"))
get_bbox1 = function(x){
bboxes <- list()
for(i in seq_len(nrow(x))) bboxes <- c(bboxes, list(st_bbox(x[i, ])))
return(bboxes)
}
get_bbox2 = function(x){
bboxes2 <- vector("list", nrow(x))
for(i in seq_len(nrow(x))) bboxes2[[i]] <- st_bbox(x[i, ])
return(bboxes2)
}
get_bbox3 = function(x){
x = st_geometry(x)
bboxes3 <- vector("list", length(x))
for(i in seq_len(length(x))) bboxes3[[i]] <- st_bbox(x[[i]])
return(bboxes3)
}
system.time(get_bbox1(pts))
#> user system elapsed
#> 26.992 0.019 27.280
system.time(get_bbox2(pts))
#> user system elapsed
#> 23.446 0.005 23.675
system.time(get_bbox3(pts))
#> user system elapsed
#> 0.699 0.000 0.705
...and a larger one:
library(sf)
#> Linking to GEOS 3.6.1, GDAL 2.1.4, proj.4 4.9.3
## mock up some point data
n = 1e5
set.seed(234)
dat = data.frame(x = runif(n), y = runif(n))
pts = st_as_sf(dat, coords = c("x", "y"))
get_bbox1 = function(x){
bboxes <- list()
for(i in seq_len(nrow(x))) bboxes <- c(bboxes, list(st_bbox(x[i, ])))
return(bboxes)
}
get_bbox2 = function(x){
bboxes2 <- vector("list", nrow(x))
for(i in seq_len(nrow(x))) bboxes2[[i]] <- st_bbox(x[i, ])
return(bboxes2)
}
get_bbox3 = function(x){
x = st_geometry(x)
bboxes3 <- vector("list", length(x))
for(i in seq_len(length(x))) bboxes3[[i]] <- st_bbox(x[[i]])
return(bboxes3)
}
system.time(get_bbox1(pts))
#> user system elapsed
#> 607.953 2.174 620.526
system.time(get_bbox2(pts))
#> user system elapsed
#> 265.432 3.197 272.193
system.time(get_bbox3(pts))
#> user system elapsed
#> 6.807 0.031 6.901
@Nowosad This is great, thanks! I still think it'd be nice to have a native, vectorized version of st_bbox() in the sf package, but your solution works great until that day comes (if it does).
@Nowosad I appreciate your help solving issues here.
@ben519 feel free to write a PR for this in case it is worth the speed up.
The non-for-loop version of this is
lapply(st_geometry(nc), st_bbox)
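If a list of bbox objects is awkward to work with, the per-feature results can be collapsed into a plain numeric matrix. This is only a sketch: it drops the CRS (as the `[[`-based loop does), and the name `bb` is illustrative.

```r
library(sf)
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# One row per feature; as.numeric() strips the bbox class and CRS,
# leaving the values in the order xmin, ymin, xmax, ymax.
bb <- t(vapply(st_geometry(nc), function(g) as.numeric(st_bbox(g)), numeric(4)))
colnames(bb) <- c("xmin", "ymin", "xmax", "ymax")

head(bb)
```

A matrix like this can be bound onto the attribute table or used for quick filtering by extent.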
Thanks @edzer. I had tried versions of lapply(), sapply(), and apply() with no luck but didn't think about casting my sf object to type sfc first. Much appreciated.