Sf: st_sample should inherit polygon attributes

Created on 26 Mar 2019  ยท  11Comments  ยท  Source: r-spatial/sf

Hello, st_buffer results in polygons that inherit the attributes of the points used to create them. It would be handy if st_sample would do the same but in reverse (i.e., points would inherit the attributes of polygons they were sampled within). I can't figure out how to do that manually, especially when a sampled point is contained within several polygons (but I only want it to inherit the attributes of the polygon it was sampled within). Thanks for any advice! Mark

Most helpful comment

n = c(1,2,3)
st_sf(id = rep(seq_along(n), n), geom = p1)

All 11 comments

Hi @markyxe can you provide a reproducible example to illustrate the point?

Hi Robin. Thanks for your response. Here's an example:

nc = st_read(system.file("shape/nc.shp", package="sf"))
p1 = st_sample(nc[1:3,], c(1,2,3),exact=T)
plot(st_geometry(nc)[1:3])
plot(p1,add=T)
nc #has attributes
p1 #has geometry but no attributes; each point should inherit attributes of
the polygon it was created within

I'd like each randomly created point to inherit the attributes of the
polygon it was created within. In this example, I could use a spatial join
to do that, but in my dataset I have overlapping polygons so a spatial join
will not work (i.e., randomly created points fall within multiple polygons).

Ideally, st_sample would have an argument to make this easy (something
like: inheritAttributes=TRUE) but in the meantime I'd appreciate any help
coming up with a manual solution.

Thank you!

Mark

On Tue, Mar 26, 2019 at 1:02 AM Robin notifications@github.com wrote:

Hi @markyxe https://github.com/markyxe can you provide a reproducible
example to illustrate the point?

โ€”
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/r-spatial/sf/issues/1014#issuecomment-476499935, or mute
the thread
https://github.com/notifications/unsubscribe-auth/Ap_1IwNOwWiAV6bHQBzIdeN18j-Hi3Luks5vacYAgaJpZM4cKfXv
.

Hello. Does anyone know how to copy attributes from polygons to points randomly created with them using st_sample? Please see commented example above. Thank you!

n = c(1,2,3)
st_sf(id = rep(seq_along(n), n), geom = p1)

Hi Edzer, thank you for your help. I tried working with your example, but I couldn't figure out how to use it to solve my problem. I've used st_sample() to create random points in an sf object containing polygons, and I'd like the points to inherit the attributes of the specific polygon they were created within. Would you show me how this example can help me do that? Thank you.

Note that I changed my comment, using seq_along.

If n is passed as the value for argument size of st_sample and specifies the number of points per polygon, as it does in your example, then rep(seq_along(n), n) gives you a set of IDs that point to the polygon each of the points was sampled from. This is what I used to create an annotated point set.

This is how you can copy the attributes from nc to the points. It's called a spatial join:

library(sf)
#> Linking to GEOS 3.5.1, GDAL 2.1.2, PROJ 4.9.3
nc = st_read(system.file("shape/nc.shp", package = "sf"))
#> Reading layer `nc' from data source `/usr/local/lib/R/site-library/sf/shape/nc.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 100 features and 14 fields
#> geometry type:  MULTIPOLYGON
#> dimension:      XY
#> bbox:           xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
#> epsg (SRID):    4267
#> proj4string:    +proj=longlat +datum=NAD27 +no_defs
p1 = st_sample(nc[1:3, ], c(1, 2, 3), exact = TRUE)
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
plot(st_geometry(nc)[1:3])
plot(p1, add = TRUE)

names(nc)
#>  [1] "AREA"      "PERIMETER" "CNTY_"     "CNTY_ID"   "NAME"     
#>  [6] "FIPS"      "FIPSNO"    "CRESS_ID"  "BIR74"     "SID74"    
#> [11] "NWBIR74"   "BIR79"     "SID79"     "NWBIR79"   "geometry"
names(p1)
#> NULL
p1_sf = st_sf(p1)
p1_joined = st_join(p1_sf, nc)
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
names(p1_joined)
#>  [1] "AREA"      "PERIMETER" "CNTY_"     "CNTY_ID"   "NAME"     
#>  [6] "FIPS"      "FIPSNO"    "CRESS_ID"  "BIR74"     "SID74"    
#> [11] "NWBIR74"   "BIR79"     "SID79"     "NWBIR79"   "geometry"

Created on 2019-03-28 by the reprex package (v0.2.1)

Session info

devtools::session_info()
#> โ”€ Session info โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
#>  setting  value                       
#>  version  R version 3.5.3 (2019-03-11)
#>  os       Debian GNU/Linux 9 (stretch)
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       Etc/UTC                     
#>  date     2019-03-28                  
#> 
#> โ”€ Packages โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
#>  package     * version date       lib source                           
#>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 3.5.3)                   
#>  backports     1.1.3   2018-12-14 [1] CRAN (R 3.5.3)                   
#>  callr         3.2.0   2019-03-15 [1] CRAN (R 3.5.3)                   
#>  class         7.3-15  2019-01-01 [2] CRAN (R 3.5.3)                   
#>  classInt      0.3-1   2018-12-18 [1] CRAN (R 3.5.3)                   
#>  cli           1.1.0   2019-03-19 [1] CRAN (R 3.5.3)                   
#>  crayon        1.3.4   2017-09-16 [1] CRAN (R 3.5.3)                   
#>  curl          3.3     2019-01-10 [1] CRAN (R 3.5.3)                   
#>  DBI           1.0.0   2018-05-02 [1] CRAN (R 3.5.3)                   
#>  desc          1.2.0   2018-05-01 [1] CRAN (R 3.5.3)                   
#>  devtools      2.0.1   2018-10-26 [1] CRAN (R 3.5.3)                   
#>  digest        0.6.18  2018-10-10 [1] CRAN (R 3.5.3)                   
#>  dplyr         0.8.0.1 2019-02-15 [1] CRAN (R 3.5.3)                   
#>  e1071         1.7-1   2019-03-19 [1] CRAN (R 3.5.3)                   
#>  evaluate      0.13    2019-02-12 [1] CRAN (R 3.5.3)                   
#>  fs            1.2.7   2019-03-19 [1] CRAN (R 3.5.3)                   
#>  glue          1.3.1   2019-03-12 [1] CRAN (R 3.5.3)                   
#>  highr         0.8     2019-03-20 [1] CRAN (R 3.5.3)                   
#>  htmltools     0.3.6   2017-04-28 [1] CRAN (R 3.5.3)                   
#>  httr          1.4.0   2018-12-11 [1] CRAN (R 3.5.3)                   
#>  knitr         1.22    2019-03-08 [1] CRAN (R 3.5.3)                   
#>  lwgeom        0.1-6   2019-03-26 [1] Github (r-spatial/lwgeom@b11624d)
#>  magrittr      1.5     2014-11-22 [1] CRAN (R 3.5.3)                   
#>  memoise       1.1.0   2017-04-21 [1] CRAN (R 3.5.3)                   
#>  mime          0.6     2018-10-05 [1] CRAN (R 3.5.3)                   
#>  pillar        1.3.1   2018-12-15 [1] CRAN (R 3.5.3)                   
#>  pkgbuild      1.0.3   2019-03-20 [1] CRAN (R 3.5.3)                   
#>  pkgconfig     2.0.2   2018-08-16 [1] CRAN (R 3.5.3)                   
#>  pkgload       1.0.2   2018-10-29 [1] CRAN (R 3.5.3)                   
#>  prettyunits   1.0.2   2015-07-13 [1] CRAN (R 3.5.3)                   
#>  processx      3.3.0   2019-03-10 [1] CRAN (R 3.5.3)                   
#>  ps            1.3.0   2018-12-21 [1] CRAN (R 3.5.3)                   
#>  purrr         0.3.2   2019-03-15 [1] CRAN (R 3.5.3)                   
#>  R6            2.4.0   2019-02-14 [1] CRAN (R 3.5.3)                   
#>  Rcpp          1.0.1   2019-03-17 [1] CRAN (R 3.5.3)                   
#>  remotes       2.0.2   2018-10-30 [1] CRAN (R 3.5.3)                   
#>  rlang         0.3.2   2019-03-21 [1] CRAN (R 3.5.3)                   
#>  rmarkdown     1.12    2019-03-14 [1] CRAN (R 3.5.3)                   
#>  rprojroot     1.3-2   2018-01-03 [1] CRAN (R 3.5.3)                   
#>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 3.5.3)                   
#>  sf          * 0.7-3   2019-02-21 [1] CRAN (R 3.5.3)                   
#>  stringi       1.4.3   2019-03-12 [1] CRAN (R 3.5.3)                   
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 3.5.3)                   
#>  testthat      2.0.1   2018-10-13 [1] CRAN (R 3.5.3)                   
#>  tibble        2.1.1   2019-03-16 [1] CRAN (R 3.5.3)                   
#>  tidyselect    0.2.5   2018-10-11 [1] CRAN (R 3.5.3)                   
#>  units         0.6-2   2018-12-05 [1] CRAN (R 3.5.3)                   
#>  usethis       1.4.0   2018-08-14 [1] CRAN (R 3.5.3)                   
#>  withr         2.1.2   2018-03-15 [1] CRAN (R 3.5.3)                   
#>  xfun          0.5     2019-02-20 [1] CRAN (R 3.5.3)                   
#>  xml2          1.2.0   2018-01-24 [1] CRAN (R 3.5.3)                   
#>  yaml          2.2.0   2018-07-25 [1] CRAN (R 3.5.3)                   
#> 
#> [1] /usr/local/lib/R/site-library
#> [2] /usr/local/lib/R/library

The process is described in a number of freely accessible resources, including here: https://geocompr.robinlovelace.net/spatial-operations.html#spatial-joining

I think that resolves the issue, meaning this can be closed. However, if there are not downsides of adding an inherit_attributes argument to st_sample(), I see know reason not to do so. I don't think it is the same as st_buffer() though, because buffering involves a 1-to-1 match between the input and output features. I could imagine scenarios where it could be useful, but understand the need to resist 'feature creep'. Thoughts @edzer?

P.s. here's a result showing it does capture the correct attribute data:

p1_joined["NAME"]
Simple feature collection with 6 features and 1 field
geometry type:  POINT
dimension:      XY
bbox:           xmin: -81.27795 ymin: 36.27591 xmax: -80.59918 ymax: 36.53381
epsg (SRID):    4267
proj4string:    +proj=longlat +datum=NAD27 +no_defs
       NAME                   geometry
1      Ashe POINT (-81.27795 36.41372)
2 Alleghany POINT (-81.01464 36.53381)
3 Alleghany POINT (-81.02903 36.44819)
4     Surry POINT (-80.68799 36.27591)
5     Surry POINT (-80.73566 36.39051)
6     Surry POINT (-80.59918 36.37627)

This will still not tell you from which polygon a particular point was sampled when the polygons overlap and a point falls in more than one polygon.

Robin, thanks for showing me the spatial join! Edzer is correct though, this won't quite work for my data because I have overlapping polygons. Sorry for any confusion caused by my reproducible example not having overlapping polygons...

Edzer, thanks for your solution, which seems to work quite well at least some of the time. However, I am now running into another problem. When I request 100 points from st_sample, sometimes I get exactly 100 points, but usually I get < or > than 100 even when I set the exact=T argument. When I get 100 points, your solution works well because the number of rows in my polygon layer matches the number of rows in the points layer. But when I get < or > 100 points, they don't match and I get "Error...arguments imply differing number of rows". Do you know what I'm doing wrong or why the exact=T argument isn't working? Thank you!

And...I just updated to the latest version of sf and exact=T is now working just fine, so thanks!

Was this page helpful?
0 / 5 - 0 ratings