Grammar rules: limits? #251

Open
Penegal opened this issue Jun 18, 2018 · 12 comments

Comments

@Penegal

Penegal commented Jun 18, 2018

Hello, there.

While testing OSRM in French, I noticed that it has no support for elision, which is particularly noticeable in this language. For instance, OSRM says "Tourner légèrement à droite en direction de Épinal", whereas correct French would be "Tourner légèrement à droite en direction d’Épinal". Fixing this would require adapting the translated text according to rules applied to the OSM names. Would this be possible using grammar rules?

Another related question: in some cases OSRM says, for instance, "Tourner à gauche sur Rue de Jarménil (D 159b)", but the correct French sentence would be "Tourner à gauche sur la rue de Jarménil (D 159b)", with a definite article (the way name doesn't include one) and a lowercase first letter. That would greatly improve the readability and user-friendliness of instructions, but would it be possible and desirable?

I'm willing to help with these, at least for the regex part; this is my first time on this project and I'm not sure I could write such a feature entirely from scratch.

Regards.

@yuryleb
Contributor

yuryleb commented Jun 18, 2018

Yes, this looks possible and desirable. I could propose adding a "grammar case" (say, elision) for French, adding this elision option to all way_name keywords, and prototyping a few expressions for elisions and articles. I hope this could work.

@yuryleb
Contributor

yuryleb commented Jun 18, 2018

Actually, your first issue requires changing the whole source phrase, not only the way_name value, so it is outside the scope of grammar rules 😦

Maybe it's possible to apply elision directly in the source phrase, like below?
"destination": "Tourner à droite en direction d’{destination}"

BTW, the French translation on Transifex also looks out of sync with the current languages/translations/fr.json content, which makes it much harder to work on a new French override script that adds grammar options. Maybe you could fill in the missing translations on Transifex first, or maybe @benjamintd, @patjouk, @guillaumerose could help with this?

@Penegal
Author

Penegal commented Jun 19, 2018

I don't think it will be possible to apply elision directly in source phrases, at least not always, as it only applies before vowels and before some words starting with an h. It would take a bit of regex to detect the cases where elision applies.
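
A minimal sketch of such a detection, for illustration only. The helper names are hypothetical and not part of the project, and the mute-h list is a two-entry placeholder: a real implementation would need a curated list of names whose initial h allows elision.

```javascript
// Hypothetical helpers, not part of osrm-text-instructions.
// Vowels (including accented ones) that trigger elision; treating every
// initial "y" as a vowel is a simplification.
const VOWEL_START = /^[aeiouyàâäéèêëîïôöùûüœæ]/;

// Illustrative list only: names whose initial "h" is assumed to be mute.
const MUTE_H_PREFIXES = ["hyères", "hérouville"];

function needsElision(name) {
  const lower = name.toLowerCase();
  return VOWEL_START.test(lower) ||
    MUTE_H_PREFIXES.some((prefix) => lower.startsWith(prefix));
}

function prefixWithDe(name) {
  // Elide "de" to "d’" only when the name calls for it.
  return needsElision(name) ? "d’" + name : "de " + name;
}

console.log(prefixWithDe("Épinal"));    // d’Épinal
console.log(prefixWithDe("Marseille")); // de Marseille
```

Names starting with an h that are not in the list fall back to the plain "de" form.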

@yuryleb
Contributor

yuryleb commented Jun 19, 2018

OK, then we need another set of rules to post-process the content of the whole final phrase. Actually, this is also necessary for Russian (#240) 😉

@benjamintd

@yuryleb, would it make sense to add the article (de or d') inside the way name grammar case?

We would have something like:
"destination": "Tourner à droite en direction {destination:article}"
which would give:

  • "Marseille": "Tourner à droite en direction de Marseille"
  • "Épinal": "Tourner à droite en direction d'Épinal" (words that start with vowels + some 'H' words)
  • "Le Havre": "Tourner à droite en direction du Havre" (names that start with "Le ")
  • "Les Ulis": "Tourner à droite en direction des Ulis" (names that start with "Les ")
  • "Rue de Jarménil": "Tourner à droite en direction *de la rue de Jarménil*" (rue, avenue and other street classifications)

But then I don't know what to do with cases like:

  • "Rue de Jarménil": "Tourner à gauche sur *la rue de Jarménil*", which does not require the "de".

How do we separate those two cases in the grammar file?

If we don't restrict ourselves to the grammar JSON file model, we can consider @yuryleb's solution of post-processing the whole sentence.
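
One possible way to keep the two cases apart, sketched under assumptions: each placeholder carries its own grammar option, so "en direction" and "sur" can request different forms of the same name. The function `fillTemplate`, the option name `definite`, and the single "rue" rule are hypothetical illustrations, not the project's actual grammar model.

```javascript
// Hypothetical sketch: expand "{token}" and "{token:option}" placeholders.
// `grammar` maps an option name to a function that rewrites the value.
function fillTemplate(template, values, grammar) {
  return template.replace(/\{(\w+)(?::(\w+))?\}/g, (match, token, option) => {
    if (!(token in values)) return match; // leave unknown tokens untouched
    const value = values[token];
    const rewrite = option ? grammar[option] : undefined;
    return rewrite ? rewrite(value) : value;
  });
}

const lowerFirst = (s) => s.charAt(0).toLowerCase() + s.slice(1);

// Assumed option names; only the "rue" case is handled here.
const grammar = {
  // After "en direction": preposition plus article.
  article: (name) =>
    /^rue\s/i.test(name) ? "de la " + lowerFirst(name) : "de " + name,
  // After "sur": definite article only.
  definite: (name) =>
    /^rue\s/i.test(name) ? "la " + lowerFirst(name) : name,
};

console.log(fillTemplate(
  "Tourner à droite en direction {destination:article}",
  { destination: "Rue de Jarménil" }, grammar));
console.log(fillTemplate(
  "Tourner à gauche sur {way_name:definite}",
  { way_name: "Rue de Jarménil" }, grammar));
```

With this split, the source phrase decides which form it needs, and the name itself is stored only once.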

@Penegal
Author

Penegal commented Jun 19, 2018

@benjamintd: you could detect such cases with case-insensitive regexes on the destination string, evaluated in the following order like a switch statement (the first matching case stops evaluation):

  1. if it begins with le, use du instead of de and remove the first word of the destination string;
  2. if it begins with les, use des instead of de and remove the first word of the destination string;
  3. if it begins with la, downcase the first letter of the destination string;
  4. if it begins with rue, use de la and downcase the first letter of the destination string;
  5. if it begins with avenue, use de l’ and downcase the first letter of the destination string;
  6. [other street classifications here]
  7. if it begins with a vowel or an elision h, use d’ instead of de.

This also removes unnecessary initial capitals, which in French are used only at the start of sentences and for proper names, and which currently make OSRM instructions harder to read: "Tourner à droite sur la Rue de Jarménil" is generally an error, as the correct sentence would be "Tourner à droite sur la rue de Jarménil".
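
The ordered list above could be sketched as a first-match-wins rule table. This is an illustration only: the names `ARTICLE_RULES` and `withArticle` are hypothetical, "other street classifications" are omitted, and the last rule covers vowels only (words with an elision h would need a separate word list).

```javascript
// Hypothetical sketch of the ordered rules; the first matching rule wins.
const lowerFirst = (s) => s.charAt(0).toLowerCase() + s.slice(1);

const ARTICLE_RULES = [
  // 1. "Le X"  -> "du X" (first word removed)
  { test: /^le\s+(.*)$/i, apply: (m) => "du " + m[1] },
  // 2. "Les X" -> "des X" (first word removed)
  { test: /^les\s+(.*)$/i, apply: (m) => "des " + m[1] },
  // 3. "La X"  -> "de la X" (first letter downcased)
  { test: /^la\s+/i, apply: (m) => "de " + lowerFirst(m.input) },
  // 4. "Rue X" -> "de la rue X"
  { test: /^rue\s+/i, apply: (m) => "de la " + lowerFirst(m.input) },
  // 5. "Avenue X" -> "de l’avenue X"
  { test: /^avenue\s+/i, apply: (m) => "de l’" + lowerFirst(m.input) },
  // 7. Leading vowel -> "d’X" (mute-h names not handled in this sketch)
  { test: /^[aeiouyàâäéèêëîïôöùûü]/i, apply: (m) => "d’" + m.input },
];

function withArticle(name) {
  for (const rule of ARTICLE_RULES) {
    const m = name.match(rule.test);
    if (m) return rule.apply(m);
  }
  return "de " + name; // default: plain preposition
}

for (const name of ["Marseille", "Épinal", "Le Havre", "Les Ulis",
                    "Rue de Jarménil", "Avenue Foch"]) {
  console.log(name + " -> " + withArticle(name));
}
```

Because the patterns for "le" and "les" require whitespace after the article, "Les Ulis" is not caught by the "le" rule, and "Avenue" is handled by its own rule before the generic vowel rule is reached.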

@yuryleb
Contributor

yuryleb commented Jun 19, 2018

@benjamintd, actually, it's a great idea to just "move" the prepended article into way_name and/or destination; then we can work within the current grammar rules model 👍
@Penegal, I suppose all these rules could easily be expressed as regular expressions.

I have already prototyped and published a first dummy implementation of the French "grammar"; just give me some time to change it according to your proposals.

@yuryleb
Contributor

yuryleb commented Jun 20, 2018

@Penegal, @benjamintd, please review my second commit in #252, especially the proposed list of street status names (I collected them earlier for translation in my Garmin Russian TTS voices project).

I had to add new elision rules specifically for handling the destination keyword, to insert the proper du/des/de article/preposition. Please note that this de is now no longer necessary before {destination} in the Transifex translations and could be removed there.

I haven't actually tested routes with destination yet. Do you know any places where destination is filled in on roads/junctions?

@Penegal
Author

Penegal commented Jun 20, 2018

@yuryleb: maybe we should wait for your PR to be merged before updating translations?

@yuryleb
Contributor

yuryleb commented Jun 21, 2018

Yes, since we have started to change the French translation strings too, it's better to finish the PR first. Fortunately, Transifex seems to allow uploading the translation JSON back (just don't forget to remove :article and the other grammar options we added before uploading; they will be inserted automatically by the languages\overrides\fr.js script).
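
A small sketch of how the grammar options could be stripped from a translation object before upload, assuming placeholders have the form {token:option}. The function name `stripGrammarOptions` and the sample object are hypothetical; this is not an existing project script.

```javascript
// Hypothetical helper: recursively rewrite "{token:option}" as "{token}"
// in every string of a parsed translation JSON object.
function stripGrammarOptions(node) {
  if (typeof node === "string") {
    return node.replace(/\{(\w+):\w+\}/g, "{$1}");
  }
  if (Array.isArray(node)) {
    return node.map(stripGrammarOptions);
  }
  if (node !== null && typeof node === "object") {
    return Object.fromEntries(
      Object.entries(node).map(([key, value]) => [key, stripGrammarOptions(value)])
    );
  }
  return node; // numbers, booleans, null: returned unchanged
}

// Made-up fragment shaped like a translation entry.
const sample = {
  turn: {
    destination: "Tourner à droite en direction {destination:article}",
    name: "Tourner à droite sur {way_name}",
  },
};

console.log(JSON.stringify(stripGrammarOptions(sample), null, 2));
```

Placeholders without an option, such as {way_name}, pass through untouched, so the function can be run on the whole file.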

@Penegal
Author

Penegal commented Jun 21, 2018

@yuryleb: you can try these interchanges:

I worked on them this morning, so if map.project-osrm.org uses the OSM.org DB, you should be able to test on them. BTW, I saw cases where lane assist wasn't working even though the turn:lanes data were available. Is there a condition for lane assist to be enabled, such as a highway=motorway_junction node where the ways connect, or is the map.project-osrm.org data lagging behind the OSM.org DB?

@yuryleb
Contributor

yuryleb commented Jun 22, 2018

It seems destination processing also works:
[Screenshots: french destination1, french destination2, french destination3]
The incorrect turn:lanes processing is mostly an osrm-backend issue, as in the last screenshot.
