iD: Some generic name validation is too strict

Created on 22 Feb 2019 · 7 Comments · Source: openstreetmap/iD

I have added the name "Bacolor Municipal Hall" to a feature tagged amenity=townhall and I get the following validation warning with a button inviting me to remove the name:

Town Hall has the generic name "Bacolor Municipal Hall"

But as you can see, the name is a _proper_ proper name and is not a generic name.

validation

All 7 comments

@seav The generic name validation uses the discardNames regex list from filters.json in the name-suggestion-index project. It looks like it flags any name containing "municipal". We should probably change that to only flag exact matches.
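To make the difference concrete, here is a minimal Python sketch of substring matching versus exact matching. The pattern `municipal` is a stand-in inferred from the behaviour described above, not the actual entry in filters.json, and the helper names are hypothetical. iD itself is written in JavaScript, and case-insensitive matching is an assumption.

```python
import re

# Hypothetical stand-in for one discardNames entry; the real filters.json
# pattern may differ.
GENERIC_PATTERN = "municipal"


def flags_substring(name: str) -> bool:
    """Flag the name if the pattern occurs anywhere in it (unanchored search)."""
    return re.search(GENERIC_PATTERN, name, re.IGNORECASE) is not None


def flags_exact(name: str) -> bool:
    """Flag the name only if the whole name matches the pattern."""
    return re.fullmatch(GENERIC_PATTERN, name, re.IGNORECASE) is not None


if __name__ == "__main__":
    for name in ["Bacolor Municipal Hall", "Municipal"]:
        print(name, flags_substring(name), flags_exact(name))
```

With the unanchored search, "Bacolor Municipal Hall" is flagged; with the exact match, only a feature literally named "Municipal" would be.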

Keep in mind that the generic name validation uses imperfect heuristics and you can ignore it! There are bars named "Bar" after all.

Yup, I know that I can ignore the warnings, much as I handle validation errors flagged by JOSM. But I am an experienced OSM mapper after all. I fear that newbie OSM mappers who use iD may think that what they are doing is wrong because of these overly strict warnings.

@seav I agree with you 100%. We should fix this case. I was just letting anyone who reads this thread know it's okay to ignore warnings 😅

> The generic name validation uses the discardNames regex list from filters.json in the name-suggestion-index project

Does iD import the filters directly, or were they just copied as a starting point? Blindly importing them is a bad idea: their purpose is not to detect generic names but to discard all non-brand names.

Other cases that will happily generate false positives from a quick look: "Bank spółdzielczy" (matches "^bank(| spółdzielczy)$"), everything from "^(central|city|europa|grand|palace|park|royal)(\\s)?hotel$" rule, "okręgowa stacja kontroli pojazdów" (matches "^(okręgowa\\s)?stacja kontroli pojazdów$") and many other non-brand non-generic ones...
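These matches can be checked with a short Python sketch. The three patterns are the ones quoted in this comment, with the JSON escaping (`\\s`) undone. The helper name and the hotel test names are made up for illustration, and case-insensitive matching is an assumption about how the validation applies the patterns.

```python
import re
from typing import Optional

# Patterns quoted in the comment, with JSON escaping (\\s) reduced to \s.
PATTERNS = [
    r"^bank(| spółdzielczy)$",
    r"^(central|city|europa|grand|palace|park|royal)(\s)?hotel$",
    r"^(okręgowa\s)?stacja kontroli pojazdów$",
]

# Assumption: the validation matches case-insensitively.
COMPILED = [re.compile(p, re.IGNORECASE) for p in PATTERNS]


def matching_pattern(name: str) -> Optional[str]:
    """Return the first pattern that matches the name, or None if none do."""
    for regex in COMPILED:
        if regex.search(name):
            return regex.pattern
    return None


if __name__ == "__main__":
    for name in ["Bank spółdzielczy", "Grand Hotel",
                 "okręgowa stacja kontroli pojazdów"]:
        print(name, "->", matching_pattern(name))
```

Each of these names is matched even though the patterns are fully anchored, so the problem here is the choice of patterns rather than substring matching.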

I would suggest maintaining them as a separate list, perhaps together with a list of filters excluded as too zealous, and adapted to make importing new ones easy.
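One way such a separate list could work is sketched below in Python. Everything here is hypothetical: the list names, the helper functions, and the `^bar$` pattern are invented for illustration, and only the hotel pattern comes from this thread. The idea is to keep importing the upstream patterns while subtracting a locally maintained exclusion set.

```python
import re

# Hypothetical upstream list, standing in for discardNames from filters.json.
# The hotel pattern is quoted in this thread; "^bar$" is made up for illustration.
UPSTREAM_DISCARD_NAMES = [
    r"^bar$",
    r"^(central|city|europa|grand|palace|park|royal)(\s)?hotel$",
]

# Hypothetical local exclusions: upstream patterns judged too zealous for
# generic-name detection, even if they are fine for discarding non-brands.
EXCLUDED_PATTERNS = {
    r"^(central|city|europa|grand|palace|park|royal)(\s)?hotel$",
}


def build_generic_name_patterns(upstream, excluded):
    """Keep upstream patterns, in order, except those explicitly excluded."""
    return [p for p in upstream if p not in excluded]


def is_generic(name, patterns):
    """True if any retained pattern matches (case-insensitive, an assumption)."""
    return any(re.search(p, name, re.IGNORECASE) for p in patterns)


GENERIC_NAME_PATTERNS = build_generic_name_patterns(
    UPSTREAM_DISCARD_NAMES, EXCLUDED_PATTERNS
)
```

New upstream patterns would then be picked up automatically on the next import, and only the exclusion set would need manual review.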

> Other cases that will happily generate false positives from a quick look: "Bank spółdzielczy" (matches "^bank(| spółdzielczy)$"), everything from "^(central|city|europa|grand|palace|park|royal)(\\s)?hotel$" rule, "okręgowa stacja kontroli pojazdów" (matches "^(okręgowa\\s)?stacja kontroli pojazdów$") and many other non-brand non-generic ones...

Yes, the filters.json list isn't really designed to be a name filter for iD; we are just using it for now.

We could create a separate list, but it might be faster and more useful to just tweak the filters.json list to move more of the _specific_ names, like those hotels, from discardNames (which iD repurposes) to discardKeys.

The same validation warning appeared for "Municipalidad de Gonzales Chaves" in Buenos Aires, Argentina.
Is that because the word "Municipalidad" (town hall) contains the string "Municipal"?


> but it might be faster and more useful to just tweak the filters.json list to move more of the _specific_ names, like those hotels, from discardNames (which iD repurposes) to discardKeys.

☝️ I did this. I'm going to close for now, but we can always adjust the filter some more if it turns out to still be too aggressive.
