Openstreetmap-carto: Lake Ontario does not render at zoom 5 or below

Created on 8 Jun 2015 · 57 comments · Source: gravitystorm/openstreetmap-carto

Lake Ontario does not render at zoom 5 or below. It looks fine at zoom 6 or above. I think that this is a bug in the OSM map style because I think that the data is fine.

[screenshot: water]


All 57 comments

This is nothing special about Lake Ontario - no lakes are rendered at z<6. This is not a bug, but it would of course still be good to show lakes at the lower zooms. It is difficult to do that in an efficient and good-looking way, though. See also #754.

Lakes Erie, Huron, Superior and Michigan render properly. Why doesn't Lake Ontario?

Those are mapped as coastline - see

http://wiki.openstreetmap.org/wiki/Tag:natural%3Dcoastline#What_about_lakes.3F
http://www.openstreetmap.org/changeset/28625595

Maybe a short-term workaround would be to render lakes at a somewhat lower zoom, e.g. at zoom 4 or higher. At zoom 3 the "Toronto" label covers where Lake Ontario should be, so the lack of Lake Ontario isn't that noticeable. At zoom 5 Lake Ontario currently looks absolutely awful.


link: http://www.openstreetmap.org/?mlat=44.024&mlon=-78.926#map=6/44.024/-78.926

@pnorman - Would it be a significant performance impact to render lakes also on z5, maybe also on z4?

@pnorman - Would it be a significant performance impact to render lakes also on z5, maybe also on z4?

Probably. It'd mean another sequential scan, and I'm not sure how big the result set would be. That would depend on pixel size and way_area filtering.
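For intuition about what "pixel size and way_area filtering" means here, this sketch (assuming 256-pixel tiles in Web Mercator; the one-pixel cutoff is an illustrative choice, not the style's actual constant) shows the ground size of a pixel at the equator and the smallest area worth fetching at each low zoom:

```python
# Web Mercator world size in projected metres (the units of osm2pgsql's way_area)
WORLD = 40075016.686  # circumference of the Earth at the equator, in metres

def pixel_size(zoom: int, tile_px: int = 256) -> float:
    """Ground size of one pixel at the equator, in projected metres."""
    return WORLD / (tile_px * 2 ** zoom)

def min_way_area(zoom: int, pixels: float = 1.0) -> float:
    """Smallest projected area (m^2) a feature needs to cover `pixels` pixels."""
    return pixels * pixel_size(zoom) ** 2

for z in (4, 5, 6):
    print(f"z{z}: pixel = {pixel_size(z):,.0f} m, "
          f"1-pixel cutoff = {min_way_area(z) / 1e6:,.0f} km^2")
```

At z5 a pixel is roughly 4.9 km on a side, so a one-pixel cutoff corresponds to lakes of roughly 24 km² and up; each zoom step down quadruples that.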

Why not simply make a shapefile with generalized versions of the big lakes and use it for rendering at low zooms? They could easily be extracted from OSM data, the shapefile would not be very big, and you already depend on external shapefiles anyway.

If you only show the larger lakes there is little gain compared to a plain way_area threshold - both performance-wise and quality-wise. And a high size cutoff is a bad idea leading to ugly results; see #754.

If you however want to produce a generalized rendering based on all water areas, even the small ones, you are dealing with a huge volume of data at the start, even if it boils down to a small file in the end (the data size ratio is on the order of 1:100). This is not something you can do as a background job overnight.

Shameless plug: @joto and I have been considering this as something that would be nice to add to openstreetmapdata.com. It would surely be useful for a lot of applications, but whether it happens depends on financing the development work and processing resources.

I'm against adding dependencies on additional preprocessed sources. In most stylesheets I'd have no problem with it, but then again, in most stylesheets I'd probably use Natural Earth at zoom 5.

Fair enough - note however that this ultimately means you'd never get a really good quality rendering at the lower zooms. AFAIK @gravitystorm has repeatedly stated that one of his aims is to improve the look of the map in the starting view when you first visit openstreetmap.org - which happens to be z=5 at the moment:

http://www.openstreetmap.org/#map=5/51.500/-0.100

Improving this depends a lot on the use of preprocessed data - not so much for the lakes, but quite definitely for boundaries and place names.

FYI: more of the Great Lakes have recently been changed from coastline to lakes in the dataset, which caused some confusion:
https://www.reddit.com/r/openstreetmap/comments/3z7w9d/where_are_the_great_lakes/

Bummer ... I spent a few days sketching coastline ways around Lake Ontario, but now we've moved backwards.

Well, I have a hack that may work: just render the name= tags for those lakes as a lake. You keep the performance but render the important lakes.

The Great Lakes are too large to ignore at lower zoom levels. It seems really odd for such huge lakes to appear out of nowhere at z6.

[screenshot: untitled2]

IMO this is not any worse than the Greenland and Antarctic ice appearing out of nowhere at the same zoom level.

As discussed previously, a good solution to this problem will require a preprocessing approach. @pnorman has voiced his opinion against such an approach here. In #754 the poor and lazy solution of size-based filtering has been (IMO rightfully) rejected because of its arbitrariness.

Even if a practically usable preprocessing solution is not currently available - and I don't know if and when I will have the time to work on one, see https://github.com/gravitystorm/openstreetmap-carto/issues/1604#issuecomment-119102118 above - it would probably be helpful if the maintainers could agree on how they see this in principle: either closing this as wontfix, reopening #754, or keeping this open with the aim of solving it with preprocessed data.

In #754 the poor and lazy solution of size based filtering has been (IMO rightfully) rejected because of its arbitrariness.

This could be one of those cases where the perfect is the enemy of the good. There will always be arbitrary thresholds and heuristics, even in the preprocessing solutions. That's not inherently bad.

this is not any worse than the Greenland and Antarctic ice appearing out of nowhere at the same zoom level.

@imagico land/water relationship is more fundamental to how we view the face of this world than ice. This is more akin to showing the Caspian Sea as land when zoomed out; it causes a very basic discrepancy on the map.

The problems of size-based filtering have been discussed in depth; no real need to repeat them here. The advantages of a preprocessing approach less so, but you can take my word that, when done well, it is much less arbitrary - if you want, you can essentially try to mimic the appearance of rendering the full-detail data minus the AGG rendering artefacts that would otherwise dominate at these scales.

Probably obvious, but the likelihood of a good but somewhat complex solution being developed diminishes significantly if a bad solution is accepted as good enough - so, using your words: good enough is the enemy of great.

Also, in this style you should always keep in mind the effect such decisions have on mapping activities. way_area filtering with a larger-than-subpixel threshold would create a significant incentive to map for the renderer by merging waterbodies to push the area up and get your pet lake to show up at a lower zoom level.

@planemad - that is obviously a question of subjective preferences; let's focus on the issue itself and not on whether it is more important than other things.

... not to mention Rio de la Plata's wide mouth 'becoming' dry land between Argentina and Uruguay...

I'd be happy to consider a pre-processing step for water areas at low zoom. But if we're going to do it, then it needs to be done properly, namely:

  • Dissolves adjacent polygons (no simple way_area check) - as @imagico says, way_area filtering incentivises poor mapping
  • Handles all water body types - I don't like people feeling obliged to tag large lakes as coastline purely because of rendering problems
  • Is frequently updated
  • Is open source

I'd also like it if the approach can be easily adapted to other low-zoom features, like forests. These suffer from way_area-based filtering when mapped in detail. When a large forest is mapped (correctly) as many polygons with small road-width-sized gaps between them, naive way_area filtering makes the forest disappear. But the main reason for easy adaptation is to ensure we don't end up with shapefile X created through method A, Y by method B, Z by C, etc., with different software or update frequencies or whatnot.

I may have a fix: let's tag all large inland lakes with lake=great_lake; this would include Lake Victoria and Lake Baikal. The criteria for great_lake are that the lake is visible from space, is large enough for ferries and fishing boats, and is large enough to affect the weather. All of these lakes meet the criteria, and this would make the tag easy to match in the style while giving a useful semantic differentiation. Openstreetmap-carto could simply render lake=great_lake as a lake at the lower zoom levels.

I may have a fix, let's tag all large inland lakes

Please discuss this on a tagging list or forum, not the style issue tracker.

IMO this is not any worse than the Greenland and Antarctic ice appearing out of nowhere at the same zoom level.

It took me a while to see the difference:

  1. The ice appears everywhere at z6, but most of the water is visible from z0, so it is sometimes inconsistent and unexpected that some quite big water areas are not visible just because they lie inland.
  2. More importantly, we get human settlements, roads etc. from z5 - but not the much bigger water areas. It's evident that something big is missing there, and this is a bug in the map to be fixed. z0-z4 don't have this problem, so they can use a more general solution.

Since we probably can't avoid 1., and some limits are needed, the crucial thing would be to start rendering big inland waters at z5.

Are there any further ideas for this? IMHO the lack of big water areas in low zooms is really annoying...
I can understand the reluctance of relying on pre-processed data from another source, but is there any better alternative?

Maybe we could do a trick like this:

  • Take the (100?) largest lakes on earth and find their wikidata IDs.
  • Use this list to render everything that has natural=water and a matching wikidata tag.

In this scenario, we would not rely on external sources after the initial list of wikidata IDs has been generated. No maintenance would be needed on that list, except if we wanted to expand it (or some catastrophic event creates a new giant lake somewhere on earth ;) ). Wikidata IDs should hopefully be stable, and are more suitable than a list of names (because of possible spelling changes). Tagging missing wikidata IDs is easy.
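As an illustration of the mechanism only (not working code from the style), the allowlist could act as a simple render-time filter on the wikidata tag; the IDs and rows below are made-up placeholders:

```python
# Render a water polygon at low zoom only if its wikidata tag is on a fixed
# allowlist of the largest lakes. IDs here are placeholders; in practice the
# list would hold ~100 real wikidata IDs.
LOW_ZOOM_LAKES = {"Q1000001", "Q1000002"}

# Made-up rows standing in for rendering-database features.
rows = [
    {"natural": "water", "wikidata": "Q1000001", "name": "Big Lake"},
    {"natural": "water", "wikidata": "Q4999999", "name": "Small Pond"},
    {"natural": "water", "wikidata": None, "name": "Untagged Lake"},
]

# Only allowlisted water features survive the low-zoom filter.
visible = [r["name"] for r in rows
           if r["natural"] == "water" and r["wikidata"] in LOW_ZOOM_LAKES]
print(visible)  # -> ['Big Lake']
```

In the actual style this would be a WHERE clause in the low-zoom water layer's SQL rather than Python, but the selection logic is the same.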

A list of the largest lakes and reservoirs is here:
https://de.wikipedia.org/wiki/Liste_der_größten_Seen
(the linked English page is much shorter)

Another aspect of the above idea: it does not matter whether the lake is split into parts or not, as long as every part has the wikidata ID. Of course, using one relation for the whole lake is better.

Some disadvantages:

  • the threshold is always arbitrary.
  • the raw geometric data may still not be very usable for the cleanest rendering at those zooms.

Another possible threshold could be the way_area of the smallest of the top 100. This would make us independent of additional tags.

Another possible threshold could be way_area of the smallest from top 100.

How long would that database query take?

The main issue is performance. Perhaps now that we have the requirement that all rendered features must be at least 0.1 pixel (if I remember correctly), performance is not so much of an issue anymore. This should be tested.

We have partial indexes on large ways (way_area_z6) and water. Postgres could use either for this query, but a sequential scan might end up being fastest.

Lac Saint-Jean is the last of the top 100. It has an area of 1,053 km² and ~396 way_pixels on z6, according to the Kosmtik Data Inspector. On z5 I couldn't pinpoint the exact value, but it's somewhere between 99 and 100 (with a limit of 99 it's still visible, with 100 it disappears) and it looks like this (in the middle):

[screenshot: 9p71h_wg]
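The way_pixels figures above can be roughly reproduced from first principles. This is a sketch, assuming way_area is the Web-Mercator-projected area (which inflates the true area by roughly 1/cos²(latitude)) and 256-pixel tiles:

```python
import math

WORLD = 40075016.686  # equatorial circumference in metres

def way_pixels(true_area_km2: float, lat_deg: float, zoom: int) -> float:
    """Approximate osm-carto's way_pixels: the Mercator-projected area of a
    feature divided by the projected area of one pixel at the given zoom."""
    pixel = WORLD / (256 * 2 ** zoom)  # pixel edge in projected metres
    mercator_area = true_area_km2 * 1e6 / math.cos(math.radians(lat_deg)) ** 2
    return mercator_area / pixel ** 2

# Lac Saint-Jean: ~1,053 km^2 at ~48.6 degrees N
print(way_pixels(1053, 48.6, 6))  # roughly 400, near the ~396 measured above
print(way_pixels(1053, 48.6, 5))  # roughly 100, right at the proposed cutoff
```

The ~48.6°N latitude is an approximation for the lake's centre; the small gap to the measured ~396 comes from that and from rounding the lake's area.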

The code is quite simple and would supersede #2864 if approved (after changes, if requested).

  • I didn't apply any Mercator correction, so some bigger lakes near the equator might not be shown, but I guess with ~100 way_pixels it doesn't really matter, because we would cover all the big waters anyway and the rest is just a nice add-on.
  • I skipped landuse=reservoir and waterway=riverbank in the low-zoom layer - the first can also be tagged as natural=water, and riverbanks are not relevant here (Río de la Plata is tagged as natural=water).
  • I also made labels appear from z5, because the common 3,000 way_pixels limit is safe. I'm not sure how many such names would appear on z5, but at least the Caspian Sea would be labelled.

Comments?

Rounding a lot and forgetting about Mercator for a moment (100 way_pixels ≈ 1,000 km² on z5, 100 lakes), it looks to me that we could fix the earlier levels too just by extending this scheme, if I'm not oversimplifying:

  • z4 - 4k km² - 35 lakes
  • z3 - 16k km² - 15 lakes
  • z2 - 64k km² - 3 lakes (Caspian, Superior, Victoria)
  • z0-z1 - 256k km² - Caspian only, but we already show it currently, so no change
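The thresholds above follow a simple rule: each zoom step down doubles the pixel edge, so the area needed for ~100 way_pixels quadruples. A quick sketch (ignoring Mercator distortion, as above; the z0-z1 row in the list caps the series at the z1 value):

```python
# Area needed for ~100 way_pixels at each zoom: the pixel edge doubles per
# zoom step down, so the area threshold quadruples.
BASE_Z, BASE_KM2 = 5, 1000   # ~100 way_pixels corresponds to ~1,000 km^2 on z5

def threshold_km2(zoom: int) -> int:
    return BASE_KM2 * 4 ** (BASE_Z - zoom)

for z in range(5, -1, -1):
    print(f"z{z}: {threshold_km2(z):>9,} km^2")
```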

Tuning:

  1. For cultural reasons I would make z2 show more or less the same as z3, because the Great Lakes together make a well-known shape.
  2. Another tweak could be showing Ladoga and Onega as a well-known pair on z3 too (with a 9k km² limit it would still be just 17 lakes instead of 15, so not a big change).
  3. We could also apply the z3 limit to z0-z2, because this would resolve the Argentina-Uruguay border bug (the Río de la Plata area is ~35k km²), since we show country borders from z1 (they look broken there) and capitals from z4 (only Buenos Aires is visible in this case, and it looks strange as an inland city).
  4. We might also go with all 100 biggest lakes from z0, because there just aren't that many of them.

I believe that with just a dozen of the biggest objects we have full control over the rendering, and we don't have to stick to numerical rules if they contradict other expectations. I'm fully aware that it's a lot of assumptions and exceptions, but at low zoom levels it's more about picking things than anything else, because there are not enough objects to make solid rules.

After having thought a bit more about it, I suspect the performance problems were mainly caused by drawing too many objects, not by the runtime of the SQL queries. So I suspect that (now that we have defined a minimum pixel size) we can simply lower the minimum zoom level at which water areas are shown, and everything should work fine.

So I just quickly tested changing z >= 6 to >= 3 in water.mss line 46. The performance impact is huge in the sense that it increases rendering times at these zoom levels severalfold. However, there aren't many tiles at these zoom levels, so z3-5 still finish rendering in a few minutes. The worst metatiles in each zoom level were:

x=16 y=8 z=5 render_time=440 seconds (instead of 127)
x=8 y=0 z=4 render_time=421 seconds (instead of 56)
x=0 y=0 z=3 render_time=380 seconds

I don't know how much these tiles will hurt without an SSD, but as they are only rerendered after a style update, this would be acceptable for me.

Which code branch did you use?

I tested z0+ and it looks like z4-z5 work as expected; the only problem was performance in North America - it ate up all my memory until I changed the factor from 0.01 to 10 in project.mml:

AND way_area > 10*!pixel_width!::real*!pixel_height!::real

But it looks like z0-z3 show pre-rendered land with borders, and I can't see the new water areas.

Bear in mind that even though the large lakes only cover a few pixels at these zoom levels, they still have thousands of nodes. So they'll take up large amounts of RAM and processing time to draw. Simplification might help with this.
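As an illustration of what simplification buys, here is a minimal Ramer-Douglas-Peucker implementation - the same algorithm behind PostGIS's ST_Simplify - which drops vertices that deviate from a chord by less than a tolerance:

```python
import math

def rdp(points, epsilon):
    """Ramer-Douglas-Peucker simplification: recursively keep only points
    that deviate from the start-end chord by more than `epsilon`
    (in the same units as the coordinates)."""
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]
    dx, dy = x2 - x1, y2 - y1
    norm = math.hypot(dx, dy) or 1.0
    # find the interior point farthest from the chord
    dmax, imax = 0.0, 0
    for i, (x, y) in enumerate(points[1:-1], 1):
        d = abs(dy * (x - x1) - dx * (y - y1)) / norm
        if d > dmax:
            dmax, imax = d, i
    if dmax <= epsilon:
        return [points[0], points[-1]]          # everything is within tolerance
    left = rdp(points[:imax + 1], epsilon)      # recurse on both halves
    right = rdp(points[imax:], epsilon)
    return left[:-1] + right

# A jagged shoreline collapses to its endpoints once detail is sub-epsilon:
line = [(0, 0), (1, 0.1), (2, -0.1), (3, 0.05), (4, 0)]
print(rdp(line, 0.5))  # -> [(0, 0), (4, 0)]
```

In practice one would call ST_Simplify (or a pre-simplified table) with a zoom-dependent tolerance of about one pixel in projected metres, rather than simplifying in Python.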

Good point, I hadn't considered that.

Do we have any standard tools to simplify? BTW - the memory problem didn't occur in other places on Earth. I also wonder whether the partial index for water is used at all zoom levels?

I'm still puzzled why z3 does not show me the water.

I'm still puzzled why z3 does not show me the water.

Did you adapt the minzoom in the .mml, in this line?

Thanks! I made more mistakes than just this one while drafting, but it helped me find the solution.

It seems that the big hidden problem is the subpixel accuracy in our SQL code. I don't know what the factor of 0.01 is needed for, but with 1 the rendering is similar, if not the same. This change, however, allows us to skip all the artificial limits and show every water area from z0 without any other optimizations (like shape simplification). My system just stopped exhausting its memory and became responsive even in the worst cases.
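To put the two factors in perspective, here is a sketch of the effective area cutoff implied by a filter of the form `way_area > factor * !pixel_width! * !pixel_height!` at a few zoom levels (assuming 256-pixel Web Mercator tiles):

```python
# Effective area cutoff implied by `way_area > factor * pixel_w * pixel_h`.
WORLD = 40075016.686  # equatorial circumference in metres

def cutoff_km2(zoom: int, factor: float) -> float:
    pixel = WORLD / (256 * 2 ** zoom)   # pixel edge in projected metres
    return factor * pixel * pixel / 1e6  # threshold in km^2

for z in (5, 8, 12):
    print(f"z{z:>2}: factor 0.01 -> {cutoff_km2(z, 0.01):10.4f} km^2, "
          f"factor 1 -> {cutoff_km2(z, 1.0):10.4f} km^2")
```

With the factor at 1, anything smaller than one pixel is filtered out in the database instead of being handed to the renderer, which is consistent with the memory relief reported above; 0.01 keeps features a hundred times smaller than that.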

If you agree that getting rid of sub-pixel accuracy makes sense, I propose to:

  1. Prepare a PR from my testing code to replace #2864.
  2. Prepare a PR applying the same fix to the other layers (namely: landcover-low-zoom, landcover, landcover-area-symbols, buildings, buildings-major and nature-reserve-boundaries), because it's probably a performance bottleneck at all the medium and high zoom levels.

A demo-rendering would be very welcome for that.

I had some strange problem with exporting from Kosmtik and I've had enough investigating for today, so I will try again later. @rrzefox, it would be great if you could test the performance with my current code and update the testing layer, so we could compare the results.

Just a screenshot of z6 near the Great Lakes with a lot of small water areas (I'm using a water-tags extract to keep the database import time manageable) - click to see the full scale:

[screenshot: no-subpixel-select-biglakes-z6]

Which code branch did you use?

That was simply 4.3.0 with the zoom level changed from 6 to 3 in water.mss and from 4 to 3 in project.mml.

@rrzefox It would be great if you could test the performance with my current code and update the testing layer, so we could compare the results.

This has been deployed now (for reference: this has 1 as the pixel factor).
With that, z0-3 each took 400 seconds to render. On z4, the worst tile was 490 seconds; on z5, 256 seconds (but most are very fast).
Note that only z0-7 have been fully rerendered; at other zoom levels you may need to dirty tiles. All zoom levels should show the new rendering by now (30 September, 11:50).

Thanks a lot!

Direct comparison is available (as always) here:
http://bl.ocks.org/math1985/raw/af7a602c222dbf1ff1a2c0d84ed755b7/#3.00/18.42/1.85

Water labels are different due to #2845, which has now been dropped. Some new areas lack labels because of a previous deployment, so they should be rerendered.

I hope we will have a Mapnik release with a label placement fix soon (https://github.com/mapnik/mapnik/pull/3771#issuecomment-331976391) and some other improvements are possible (https://github.com/mapnik/mapnik/issues/3550), so we would get even better water areas.

I see some differences in northern Poland on z6 between the 0.01 and 1 pixel renderings. I prefer the (new) 1 pixel rendering. In other areas or zoom levels I don't immediately notice differences.

So it would be great if we can have a performance improvement and rendering improvement in one.

I would be very interested to see what would be the effect of applying the 1 pixel change to other landuse too.

Water labels are different due to #2845, which is now dropped.

This should not be visible on any rerendered tiles anymore, I removed that patch.

Would it not be easier to remove the old file cache, then? Especially when spotting small differences like 0.01 vs 1 px at the high zoom levels, it is really hard to tell whether tiles are old or new.

This is a tileserver that is actually used by people, so I cannot just remove all cached tiles, because that would make it unusable :) But it does already rerender tiles in the background (currently at Z8) and will continue to do so, but that'll simply take a long time. So you can already look at Z0-7 now, and if you need a specific tile in another zoomlevel, feel free to dirty that - otherwise just wait, it will have rerendered all tiles eventually.

This is a tileserver that is actually used by people

Great! Could you share some real-life statistics, then, to compare the old and new rendering times, memory and disk I/O usage?

Tweaking the other layers should be straightforward and even more interesting, because they deal with landcover and buildings (plus nature reserves), and we have a lot of those!

I see some differences in northern Poland on z6 between the 0.01 and 1 pixel renderings.

When I zoom Firefox to 300% (the maximum available) I see some smaller lakes missing in this area (south of Koszalin):

http://bl.ocks.org/math1985/raw/af7a602c222dbf1ff1a2c0d84ed755b7/#9.00/53.7320/16.6433

BTW - @rrzefox you have some problems with fonts used, especially here in China:

http://bl.ocks.org/math1985/raw/af7a602c222dbf1ff1a2c0d84ed755b7/#7.00/25.414/99.502

The main issue is performance. Perhaps now that we have the requirement that all rendered features must be at least 0.1 pixel (if I remember correctly), performance is not so much of an issue anymore. This should be tested.

I guess adding this limit was an improvement compared to the lack of any area measuring before, and 0.01 was probably chosen just in case, so as not to break anything.

I think we could test the database performance change alone (real-life statistics are more complex), but how can we do it? Directly from Postgres with some script for a given location, or via Mapnik? I would like to check it on my system.

I have another performance-related question: is it possible to make Postgres read from two instances of a "gis" database in parallel? Would it require any changes to the osm-carto code, and what speedup could I expect?

I get the feeling this discussion is somewhat drifting off-topic, so I'll try to keep it short. Feel free to skip reading everything but the last sentence.

Great! Could you share some real-life statistics then to compare the old and new rendering time, memory and disc IO usage?

That's the problem with a production server: Tile rendering times differ a lot, depending on things like: how many and which other rendering requests run at the same time, or even what happens to be in the disk cache. The only thing you can really see is if a change alters the rendering times significantly, e.g. as in this case, where the rendering time for Z0 increases by a factor of >100 (without this change, the one Z0 tile that exists for the planet rendered in 0.5 seconds!). The same goes for memory usage and all that. If you want numbers that are more precise than "it's in the same order of magnitude", it's better to measure that on an extra instance only used for benchmarking.

BTW - @rrzefox you have some problems with fonts used, especially here in China

This is running the standard Ubuntu 16.04 versions of the fonts, with noto-emoji installed manually. My best guess is that the OSM tileservers use a newer version of the fonts, or these are renderd<->tirex differences. As I don't care much about China (that is not where this tileserver is usually used), I prefer to stick with the fonts packaged by Ubuntu, even if they may be outdated.

Last but not least: all tiles on the "demo" should now have rerendered, so compare away.

There are lots of differences visible in the glaciers near the border around here: http://bl.ocks.org/math1985/raw/af7a602c222dbf1ff1a2c0d84ed755b7/#8.00/43.000/42.387 (also at lower zoom levels). However, the differences disappear as expected when you zoom in, so it's really just getting rid of one-pixel noise at the lower zoom levels, and I don't think that's a bad thing.

I was surprised to learn where the Caspian Sea is located, according to the current rendering.
[screenshot: caspiansea]

@rrzefox This is probably a result of #1465

I was surprised to learn where the Caspian Sea is located, according to the current rendering.

Strange - this is an error on z5+, but on z4 it's placed properly:

http://bl.ocks.org/math1985/raw/af7a602c222dbf1ff1a2c0d84ed755b7/#4.00/40.49/-277.64

Anyway, some future Mapnik release should fix it soon.
