Sf: Odd behavior of st_intersection when using multipolygons

Created on 10 Dec 2018  路  4Comments  路  Source: r-spatial/sf

I'm finding that the order in which I contruct a MULTIPOLYGON affects the behavior of st_intersection.

When I construct the multipolygon using st_multipolygon(list(list(square_out), list(square_small))) the behavior is as expected but when I use st_multipolygon(list(list(square_out, square_small))) I end up with only the first element (square_out) getting intersected. Is this expected behavior?

See below:

library(sf); library(mapview); library(tidyverse)
#> Linking to GEOS 3.6.1, GDAL 2.2.3, PROJ 4.9.3
square <- cbind(c(0,0, 1, 1, 0), c(0, 1, 1, 0, 0))
square_out <- square + 1
square_small <- square*0.5


poly_1 <- st_multipolygon(list(list(square))) %>% 
  st_sfc() %>% 
  st_sf()
poly_2a <- st_multipolygon(list(list(square_out), list(square_small))) %>% 
  st_sfc() %>% 
  st_sf()
poly_2b <- st_multipolygon(list(list(square_out, square_small))) %>% 
  st_sfc() %>% 
  st_sf()

plot1 <- poly_1 %>% 
  ggplot() + 
  geom_sf(fill = "blue", alpha = 0.4) + 
  geom_sf(data = poly_2a, fill = "red", alpha = 0.4) + 
  theme_bw()

plot2 <- poly_1 %>% 
  ggplot() + 
  geom_sf(fill = "blue", alpha = 0.4) + 
  geom_sf(data = poly_2b, fill = "red", alpha = 0.4) + 
  theme_bw()

# Both appear to be legitimate:
cowplot::plot_grid(plot1, plot2, nrow = 1)

st_intersection(poly_2a, poly_1) # works
#> Simple feature collection with 1 feature and 0 fields
#> geometry type:  GEOMETRYCOLLECTION
#> dimension:      XY
#> bbox:           xmin: 0 ymin: 0 xmax: 1 ymax: 1
#> epsg (SRID):    NA
#> proj4string:    NA
#>                         geometry
#> 1 GEOMETRYCOLLECTION (POINT (...
st_intersection(poly_2b, poly_1) # fails
#> Simple feature collection with 1 feature and 0 fields
#> geometry type:  POINT
#> dimension:      XY
#> bbox:           xmin: 1 ymin: 1 xmax: 1 ymax: 1
#> epsg (SRID):    NA
#> proj4string:    NA
#>      geometry
#> 1 POINT (1 1)

Created on 2018-12-10 by the reprex package (v0.2.1.9000)

Most helpful comment

I don't know. Perhaps others with more experience in how long sf::st_is_valid takes to run might be able to answer? But I feel like being able to create and store invalid geometries is very helpful when working with "real world data" that so very often contains invalid geometries that first need to be fixed prior to analysis, so I wouldn't want functions like st_multipolygon to throw an error and refuse to create invalid geometries.

All 4 comments

Sorry if I'm missing something obvious or I'm just plain wrong, but it looks to me like poly_2b is an invalid geometry. For instance, on my system, sf::st_is_valid(poly_2b) returns FALSE. If my understanding is correct, this is because the second polygon in poly_2b is actually defined to be a hole inside the first polygon. In other words, when you create a st_polygon with multiple polygons in the input list (which AFAIK is what you're doing with poly_2b), the first polygon should be the "outer border" and the subsequent polygons should define the "borders of the holes" inside the first polygons (apologies - I don't know the correct terminology). But here, the hole (i.e. the second polygon) in poly_2b is located outside the "border polygon (i.e. the first polygon).

Here's some code that might help with understanding how st_polygon and st_multipolygon interpret input arguments. In the plots below, the "holes" are in coloured in white and the "filled in geometries" are coloured in "grey".

# load package
library(sf)

# create raw polygon coordinate data
outer <- matrix(c(0,0,10,0,10,10,0,10,0,0),ncol=2, byrow=TRUE)
hole1 <- matrix(c(1,1,1,2,2,2,2,1,1,1),ncol=2, byrow=TRUE)
hole2 <- matrix(c(5,5,5,6,6,6,6,5,5,5),ncol=2, byrow=TRUE)

# create polygon with two holes
pl <- st_polygon(list(outer, hole1, hole2))

# create a multipolygon that contains one polygon with two holes
mpl <- st_multipolygon(list(list(outer, hole1, hole2)))

# create a multipolygon that contains three polygons, each with no holes
mpl2 <- st_multipolygon(list(list(outer), list(hole1), list(hole2)))

# plot the geometries to visually compare
par(mfrow = c(1, 3))
plot(pl, col = "grey50", main = "pl")
plot(mpl, col = "grey50", main = "mpl")
plot(mpl2, col = "grey50", main = "mpl2")

plot

Fantastic - I get it!

@jeffreyhanson Should st_multipolygon check this sort of validity at construction? Or is that too burdensome?

I don't know. Perhaps others with more experience in how long sf::st_is_valid takes to run might be able to answer? But I feel like being able to create and store invalid geometries is very helpful when working with "real world data" that so very often contains invalid geometries that first need to be fixed prior to analysis, so I wouldn't want functions like st_multipolygon to throw an error and refuse to create invalid geometries.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

dpprdan picture dpprdan  路  4Comments

dkyleward picture dkyleward  路  4Comments

matthewpaulking picture matthewpaulking  路  4Comments

duleise picture duleise  路  3Comments

kendonB picture kendonB  路  3Comments