web planner - ignore apostrophe character in online search text
In Progress
I was searching for "Saint John's Point, Donegal, Ireland". It is not in the list in the 1st screen capture because (apparently) I left out the apostrophe character (') - duh.
When I type the exact name as shown on the map on the right-hand side (found with Google Maps by the way), the online search has the correct place at the top of the list.
Other equivalent search items would be
- St Johns Point
- St John's Point
Hello Andrew. This will sound silly, but the problem here is that we don't know that the name of the lighthouse is in English. The vast majority of names in the OSM dataset we use do not specify their language, so we cannot use language-specific mechanisms. The apostrophe is not the problem; that is treated correctly. The problem is that 'john' is not equal to 'johns' in general. In English, yes, as with (rock, rocks); in other languages, no. The proper, world-wide solution is close to impossible, I am afraid. But some assumptions based on geography could be used in Europe, the USA, etc. That is a point to consider.
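To illustrate the mismatch described above, here is a minimal sketch in Python. It is not the actual search code: the `tokens` and `matches` helpers are hypothetical, and it assumes the indexer lowercases names and drops a possessive 's before comparing tokens exactly.

```python
import re

def tokens(text: str) -> list[str]:
    """Lowercase, drop a possessive 's, and split into letter-only tokens.

    Assumption: this mimics an indexer that handles the apostrophe
    but applies no language-specific stemming.
    """
    text = text.lower().replace("\u2019", "'")  # treat the curly apostrophe like '
    text = re.sub(r"'s\b", "", text)            # "john's" -> "john"
    return re.findall(r"[^\W\d_]+", text)       # runs of letters only

def matches(query: str, name: str) -> bool:
    """True if every query token appears verbatim among the name tokens."""
    name_tokens = set(tokens(name))
    return all(t in name_tokens for t in tokens(query))

# With the apostrophe, the query token is "john" and matches the stored name.
print(matches("Saint John's Point", "Saint John's Point"))
# Without it, the query token is "johns", which is a different string.
print(matches("Saint Johns Point", "Saint John's Point"))
```

Under these assumptions the apostrophe itself is handled, yet the second query still fails, because nothing tells the matcher that 'johns' and 'john' are the same word.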
>The proper, world-wide solution is close to impossible
It would appear Google Maps have done the impossible?
And maybe this is just my simplistic, English-speaking quick-hack solution, but if the apostrophe character were removed from the search string, would the match not have been made? Surely that would be better than the current frustrating situation? Ironically (sadly), since the introduction of Locus online search in LM4.17 I've been using Google Maps more & more rather than less & less.
No, this simplistic hack a) is already done early in the chain of events anyway, and b) does not address the problem.
The problem here is the absence of what is called 'stemming' or 'lemmatization'. 'Boulders' has the same stem as 'boulder'; 'Johns' is the same as 'John'. This is of course done properly in cases where we know for certain that the language used is en, de, es, fr or one of a few others. But in most cases the language is unknown (the mappers did not tell us specifically), so the 's' is not recognized as just a plural form. I agree it would be appropriate to assume en is used in England and de in Germany. If we use OSM data, there is no other option really. But doing this language geo-localization, or even recognition from the searched string, in India or Africa? Very hard imho. This problem is unfortunately much deeper than some 'bug' to fix. The first step in a solution: building a map of the most used languages per region. Then decisions like: yes, en in England is a safe assumption. India: better leave this map blank. And so on.
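The region-map idea could be sketched roughly as follows. Everything here is hypothetical and for illustration only: the `REGION_LANGUAGE` table, the country codes chosen, and the deliberately naive `stem_en` rule (a real system would use a proper stemmer such as Snowball rather than stripping a trailing 's').

```python
from typing import Optional

# Hypothetical map of "safe" language assumptions per country code.
# None means: leave blank, do not guess a language for this region.
REGION_LANGUAGE: dict[str, Optional[str]] = {
    "GB": "en",   # en in England is treated as a safe assumption
    "DE": "de",
    "IN": None,   # many languages in use, so no stemming is applied
}

def stem_en(token: str) -> str:
    """Very naive English stemmer: strip one trailing plural 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token

def normalize(token: str, country: str) -> str:
    """Stem the token only when the region's language is assumed known."""
    token = token.lower()
    if REGION_LANGUAGE.get(country) == "en":
        return stem_en(token)
    return token  # unknown or unsupported language: compare exactly

def same_word(a: str, b: str, country: str) -> bool:
    """Compare two tokens after region-dependent normalization."""
    return normalize(a, country) == normalize(b, country)

print(same_word("johns", "john", "GB"))  # stemming applied in an "en" region
print(same_word("johns", "john", "IN"))  # no language assumed, exact compare
```

The design choice in this sketch is that a missing or blank entry falls back to exact matching, so a wrong language guess can never merge unrelated words in regions where no assumption is safe.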