Shapely: ops.split inconsistent behavior with LineStrings and MultiLineStrings

Created on 11 Apr 2019  路  5Comments  路  Source: Toblerity/Shapely

Expected behavior and actual behavior & steps to reproduce the problem

Splitting a linestring A with an intersecting linestring B does not produce the expected output (returns A). Splitting A with a multilinestring containing B and C (C intersecting A as well) produces the desired output (3 segments). This seems inconsistent.

from shapely.ops import split
from shapely.geometry import LineString, MultiLineString

A = LineString([(0, 0), (10, 0)])
B = LineString([(5, 0), (5, 5)])
C = LineString([(1, -1), (1, 1)])
assert (split(A, B).wkt == "GEOMETRYCOLLECTION (LINESTRING (0 0, 10 0))")
# Does not split
# Expected: "GEOMETRYCOLLECTION (LINESTRING (0 0, 5 0), LINESTRING (5 0, 10 0))"

assert (A.intersection(B).wkt == "POINT (5 0)")
# Although A and B do intersect

assert (split(A, MultiLineString([B, C])).wkt ==
        "GEOMETRYCOLLECTION (LINESTRING (0 0, 1 0), LINESTRING (1 0, 5 0), LINESTRING (5 0, 10 0))")
# OK - A is split by both B and C - but A was not split by B in the previous example.

Operating system

Ubuntu 18.04.2 LTS

Shapely version and provenance

Python 3.6.7 - Shapely 1.6.4.post2 (installed from PyPI using pip)

Most helpful comment

I think that would work fine, yes. And the behavior makes sense to me.

For a bit, I thought we could filter the goems in a MultiLineString splitter for those crossing line and take the difference of only those. But that would not solve the case where a single LineString splitter hooks back to touch the line it crosses at another point. I think the only way to get consistent behavior here while using difference for the heavy lifting is to split touching linear geometries.

I suspect that maybe crosses was used so the function returned early when only the boundary of line touches the splitter. In fact this case is tested for. But our current case, where the boundary of splitter touches line, is not in the tests, and may have been a blind spot. However, I'm a Shapely native without much experience with other simple feature libraries. And I don't know whether only splitting when crossing is standard behavior for linear geometries.

It was 5 years ago, but if @georgeouzou remembers maybe he could confirm or explain why crosses was used?

We can have our cake and eat it too if we want to return early in the case tested for, but carry on with splitting in this issue's case. That would require using relate directly. But I doubt that there would be a large performance difference either way.

All 5 comments

Ok. I see the problem here.

Visually, those linestrings look like this:

image

When you split the horizontal (A) by the large vertical on top (B) which only touches A, nothing happens. But splitting A by C does work because C crosses A.

Now, when you split A by MultiLineString([B, C]), it splits into three linestrings, not two as you would expect. I agree that this seems inconsistent.

The code causing this is actually pretty accessible (permalink).

    @staticmethod
    def _split_line_with_line(line, splitter):
        """Split a LineString with another (Multi)LineString or (Multi)Polygon"""

        # if splitter is a polygon, pick it's boundary
        if splitter.type in ('Polygon', 'MultiPolygon'):
            splitter = splitter.boundary

        assert(isinstance(line, LineString))
        assert(isinstance(splitter, LineString) or isinstance(splitter, MultiLineString))

        if splitter.crosses(line):
            # The lines cross --> return multilinestring from the split
            return line.difference(splitter)
        elif splitter.relate_pattern(line, '1********'):
            # The lines overlap at some segment (linear intersection of interiors)
            raise ValueError('Input geometry segment overlaps with the splitter.')
        else:
            # The lines do not cross --> return collection with identity line
            return [line]

You can see that the splitter is required to cross line for any splitting to commence. But once that checkpoint is passed, a simple call to difference is made to generate the split geometries. But difference does not discriminate between crossing and merely touching! So if at any point splitter crosses line, then all merely touching points will be split on.

@Jeremiah-England thanks for digging into this! It seems like, for lines at least, we could change the crosses test to a touches test and get less surprising results? What do you think about that?

I think that would work fine, yes. And the behavior makes sense to me.

For a bit, I thought we could filter the goems in a MultiLineString splitter for those crossing line and take the difference of only those. But that would not solve the case where a single LineString splitter hooks back to touch the line it crosses at another point. I think the only way to get consistent behavior here while using difference for the heavy lifting is to split touching linear geometries.

I suspect that maybe crosses was used so the function returned early when only the boundary of line touches the splitter. In fact this case is tested for. But our current case, where the boundary of splitter touches line, is not in the tests, and may have been a blind spot. However, I'm a Shapely native without much experience with other simple feature libraries. And I don't know whether only splitting when crossing is standard behavior for linear geometries.

It was 5 years ago, but if @georgeouzou remembers maybe he could confirm or explain why crosses was used?

We can have our cake and eat it too if we want to return early in the case tested for, but carry on with splitting in this issue's case. That would require using relate directly. But I doubt that there would be a large performance difference either way.

@Jeremiah-England my memory is a little cloudy on this one, but seeing the code again i can confirm that this was a blind spot. The code should also test for the current case to be complete. Could you continue on with the changes?

Hey @georgeouzou, thanks for taking the time to look at that!

And, yes, if it's OK with everyone I would like to hammer out this case and submit a PR at some point.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

akadouri picture akadouri  路  4Comments

mikedh picture mikedh  路  6Comments

jGaboardi picture jGaboardi  路  5Comments

LostFan123 picture LostFan123  路  5Comments

ispmarin picture ispmarin  路  3Comments