Convert translation properties to XML #6524

leonardehrenfried · 2025-03-10T13:09:49Z

Summary

Before this PR we used .properties files in ISO-8859-1 encoding as this used to the the only encoding that you could use when OTP was started. It has been possible for many years to use UTF-8 as the encoding but IntelliJ has the old encoding pretty hardcoded and doesn't auto-detect it. You can configure Intellij but every developer would have to do it.

This leads to the following unfortunate situation:

Java can also read properties files in XML encoding which allows you to specify the encoding in the file itself.

This is what I have done with this PR, so it now looks like this:

Merging translation bundles

We had one message bundle called WayProperties and one for "internal" stuff. I combined them into a single one called "translations".

codecov · 2025-03-10T13:43:19Z

Codecov Report

Attention: Patch coverage is 88.23529% with 2 lines in your changes missing coverage. Please review.

Project coverage is 70.17%. Comparing base (6fd4d48) to head (426fabe).
Report is 26 commits behind head on dev-2.x.

✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
.../framework/resources/XmlResourceBundleControl.java	86.66%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             dev-2.x    #6524   +/-   ##
==========================================
  Coverage      70.17%   70.17%           
- Complexity     18310    18315    +5     
==========================================
  Files           2082     2083    +1     
  Lines          77217    77228   +11     
  Branches        7832     7832           
==========================================
+ Hits           54185    54198   +13     
- Misses         20259    20260    +1     
+ Partials        2773     2770    -3

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

.idea/codeStyles/Project.xml

application/src/main/java/org/opentripplanner/framework/resources/XMLResourceBundleControl.java

abyrd · 2025-03-14T08:07:03Z

The underlying issue here seems to be displaying and editing properties files (not loading or using them), and this seems to be the only motivation for switching to the XML format. As far as I can see, the sole advantage of the new format is that it can self-describe its encoding. In all other respects it is less convenient to read and edit, at least directly in an editor without tooling, as it contains a lot of tag repetition and angle-bracket line noise. Were there other reasons for preferring XML over properties? @leonardehrenfried I think you mentioned editing tools during the meeting. Do people generally use such tools, and do some of them work better with XML, or are we mostly targeting manual editing in an IDE or text editor?

During the development meeting, I mentioned the possibility of using the Unicode byte order mark (BOM, not to be confused with Bill Of Materials in Maven artifacts) to signal to both the IDE and Java that the file was UTF-8 encoded. There was some discussion of whether this was allowed or appropriate, as UTF-8 does not have any byte order ambiguity. I just checked the Unicode spec and it seems to clearly permit starting a UTF-8 stream with a byte order mark, as a way to show to a high degree of certainty that a file is UTF-8 encoded Unicode. https://www.unicode.org/faq/utf_bom.html#BOM

I tried prepending a UTF-8 BOM to some properties files, and unfortunately by default IntelliJ still tries to decode them as ISO Latin-1. IntelliJ apparently doesn't try to guess or detect the encoding and relies on encoding options in the project settings.

On the consuming side, it appears that one can unambiguously assert the encoding of Properties in code by using the Properties.load method that takes a Reader instead of an InputStream. This is what we do in method CustomDocumentation#loadCustomDocumentationFromPropertiesFile for example. But in other places we're relying on ResourceBundle to do the loading. This should not pose a problem since Java 9 (released in September 2017, so over 8 years ago) and JEP 226 originally proposed in 2014, the default encoding for Properties and so for PropertyResourceBundles is UTF-8. So in fact the underlying problem is not with Java, nor with Properties, but with one specific IDE (IntelliJ) that has incorrect defaults and is assuming an encoding that is fully obsolete.

The OpenTripPlanner project already requires all developers to set certain configuration options for code style, and the repo even contains configuration specific to the IDE in question (https://github.com/opentripplanner/OpenTripPlanner/tree/dev-2.x/.idea). Is it actually prohibitively complex to ask developers to change one setting (properties encoding) in the relatively infrequent case they're going to be editing translations?

Interestingly though, the IntelliJ documentation https://www.jetbrains.com/help/idea/properties-files.html claims that IntelliJ uses UTF-8 by default. Is there something in our own project repo that is causing the IDE to select ISO Latin-1?

Taking a step back, note that the only reason we're even confronting encoding problems is because we're using ResourceBundles and specifically PropertyResourceBundles to supply translations. It would also be possible to define the translations in ListResourceBundles (in Java code that is unambiguously UTF-8) or even in Java Maps or in some extremely simple custom format (e.g. each line is a key, followed by a space, followed by the text).

But the first and simplest line of attack here is to determine why IntelliJ is not applying its own default of UTF-8. Maybe there's something in our Maven POM or other configuration files in the OTP repo.

leonardehrenfried · 2025-03-14T08:11:49Z

I really, really don't want to fiddle around with Intellij settings for this very unimportant feature. I want it to Just Work and not spend more time on it.

If the thought of using XML is so undesirable to you, I will close this PR and it will remain as it is. People will have to find their own solutions.

abyrd · 2025-03-14T08:24:51Z

My sense is that an incorrect default in one IDE does not seem like a good reason to migrate project files to a less common format that is in other ways less convenient to work with.

This is simply a bug in IntelliJ, and a bug can be worked around with a single change in the project settings. The bug is tracked here and I'll add a comment there asking them to fix it:
https://youtrack.jetbrains.com/issue/IDEA-301047/IJ-detect-ISO-8859-1-for-properties-files-instead-of-using-UTF-8-as-default-property-encoding-for-Java-9-and-higher

abyrd · 2025-03-14T08:36:18Z

I have posted a comment on the IntelliJ issue tracker asking them to fix their encoding defaults.
https://youtrack.jetbrains.com/issue/IDEA-301047/IJ-detect-ISO-8859-1-for-properties-files-instead-of-using-UTF-8-as-default-property-encoding-for-Java-9-and-higher#focus=Comments-27-11715475.0-0

leonardehrenfried requested a review from a team as a code owner March 10, 2025 13:09

leonardehrenfried added the Improvement A functional improvement label Mar 10, 2025

leonardehrenfried force-pushed the xml-props branch from c75ce70 to 4899a53 Compare March 10, 2025 13:57

leonardehrenfried added 5 commits March 10, 2025 14:58

Switch to Properties to XML

2b386b2

Switch Messages bundle to XML

28486d0

Remove conversion code

a4f1b32

Use correct name

c7d43f9

Rename bundle to 'translations'

0011f76

leonardehrenfried force-pushed the xml-props branch from 4899a53 to 0011f76 Compare March 10, 2025 13:58

optionsome reviewed Mar 10, 2025

View reviewed changes

.idea/codeStyles/Project.xml Outdated Show resolved Hide resolved

application/src/main/java/org/opentripplanner/framework/resources/XMLResourceBundleControl.java Outdated Show resolved Hide resolved

leonardehrenfried added 2 commits March 10, 2025 16:33

Use lowercase

b84c5d8

Revert changes to .idea

426fabe

leonardehrenfried requested a review from optionsome March 10, 2025 15:39

optionsome requested a review from abyrd March 11, 2025 10:25

t2gran added this to the 2.8 (next release) milestone Mar 12, 2025

leonardehrenfried closed this Mar 14, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Convert translation properties to XML #6524

Convert translation properties to XML #6524

Uh oh!

leonardehrenfried commented Mar 10, 2025 •

edited

Loading

Uh oh!

codecov bot commented Mar 10, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

abyrd commented Mar 14, 2025

Uh oh!

leonardehrenfried commented Mar 14, 2025 •

edited

Loading

Uh oh!

abyrd commented Mar 14, 2025

Uh oh!

abyrd commented Mar 14, 2025

Uh oh!

Uh oh!

Convert translation properties to XML #6524

Convert translation properties to XML #6524

Uh oh!

Conversation

leonardehrenfried commented Mar 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Merging translation bundles

Uh oh!

codecov bot commented Mar 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Uh oh!

abyrd commented Mar 14, 2025

Uh oh!

leonardehrenfried commented Mar 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

abyrd commented Mar 14, 2025

Uh oh!

abyrd commented Mar 14, 2025

Uh oh!

Uh oh!

leonardehrenfried commented Mar 10, 2025 •

edited

Loading

codecov bot commented Mar 10, 2025 •

edited

Loading

leonardehrenfried commented Mar 14, 2025 •

edited

Loading