[core] i18n: Fresh Start #1074

confused-Techie · 2024-08-09T03:13:48Z

This PR builds off of the fantastic work by @meadowsys in #715 to attempt getting Pulsar up and running with translation support.

While some aspects of this PR are inspired or directly borrowed from Meadows work, a lot of what was helpful were the initial research stages done, and being able to implement many of those ideas directly.

But this PR now represents a completely functional translatable Pulsar and community packages implementation. But to dive a little deeper, it's important to mention each individual way of text appearing in Pulsar and how we translate it:

UI: This is the easiest, able to be translated with the i18n API.
Settings: These are translatable via LocaleLabels
Context Menu: Translatable via LocaleLabels
Menu: Translatable via LocaleLabels
Command Palette: Does not support Translation: I've opted to keep the command-palette untouched, since commands are directly created by the names of the commands, in this current form I fear we couldn't implement translations without breaking changes, so I thought it was better to leave an attempt at that as it's own PR completely.

Getting Started

To implement translations in a community project (or Pulsar for that matter) is to simply add a locales folder in the root of the project. Within this folder should be a collection of files for your translations named like so: package-name.locale.json|cson.

When your project is initialized, just like for menus, this file will be read automatically. In the case of Pulsar the i18n.initialize() function reads Pulsar's locales file automatically.

As this file is read it'll take a look at the locale for each file and find the ones that may apply to the user. Because during load we have no awareness of the completeness of each file, we want to load as many as possible, while still making sure not to do wasted work. So we look at all possible languages the user may load at some point, and include any locales that are on that list.

From here all files loaded are available via i18n.strings (although this should not be accessed directly). This key value store is then used to match all translations. Accessible via a keypath like pulsar.context-menu.core:undo.

Translating Strings

For each string that's being translated we have the full support of the ICU Message Syntax, which allows plurals, replacements and much more, this is all provided by Intl-MessageFormat.

Keep in mind that for items that need additional properties passed to them, such as replacement values, these can only be used when translating via the i18n API, as extra properties cannot be passed via LocaleLabels.

Methods of Translation

There's a few different ways that get a string translated, so lets take a look at all of them.

LocaleLabel

In some cases, it's impossible to access the i18n API to translate a string, such as files in your menus directory. Since these files are cson|json files, they cannot run JS code. To translate these items we use what I'm dubbing a LocaleLabel which is simple a keyPath that correlates to a string accessible to the i18n API, such as one stored in your locales directory, that is surrounded by %.

For example, lets say the contents of ./locales/pulsar.en.cson looked like:

'pulsar': {
  'context-menu': {
    'core:undo': 'Undo'
  }
}

And I wanted this string to appear from ./menus/win32.cson:

'context-menu': 
  'atom-text-editor, .overlayer': [
    {label: '%pulsar.context-menu.core:undo%', command: 'core:undo'}
  ]

The above will successfully translate the label of this context menu item when it appears for the user.

The LocaleLabel method of translation is supported in:

Config: Only supported on the title and description key.
Application Menu: Only supported on the label key.
Context Menu: Only supported on the label key.

API

But for all other cases of translation, when we have access to the i18n API we have more freedom.

Lets say we have ./locales/pulsar.en.cson:

'pulsar': {
  'ui': {
    'myString': 'Hello World'
  }
}

The simplest way to translate this would be:

const str = atom.i18n.t("pulsar.ui.myString");

And if we needed to pass any replacements or extra parameters to Intl-MessageFormat we would do that like:

const str = atom.i18n.t("pulsar.ui.myString", opts);

But lets say we wanted easy access to our namespace, or our package's namespace.

const t = atom.i18n.getT("pulsar");

const str = t.t("ui.myString");

This saves us from having to type the full API dozens or hundreds of times as well as the name of the package.

How does a user control translations?

To control translations you simply have two settings:

Primary Locale: This would be the most specific primary locale for the user. Such as en-US
Priority List: This is a list from highest to lowest of the most specific locales the user would like to see as an alternative.

These two values are used, along with the hardcoded default fallback of en) to construct a list of languages to display to a user. This list is created in accordance of RFC4647 "Lookup Filtering Fallback Pattern".

What this means is that for every single entry, we continuously fallback to less and less specific locales of that language before moving onto the next option.

For example:

Primary: es-MX
Priority List: zh-Hant-CN, ja-JP
Constant hardcoded fallback: en

Our priority list of locales would be:

[
  'es-MX',
  'es',
  'zh-Hant-CN',
  'zh-Hant',
  'zh',
  'ja-JP',
  'ja',
  'en'
]

How is this priority list used?

When we load data from ./locales we only ever load a locale that is present on that above list. If a user had the same list I typed above but there was ./locales/pulsar.ar.cson it would never be loaded by the system at all, because the user wouldn't encounter that language during this fallback list.

But when we ask for any individual translation of a string, no matter if via the API or a LocaleLabel, we find the string we want to translate within i18n.strings first via it's keypath, then iterate through the fallback list and return the first match we get. This means that partial translation is completely supported for every single string, given that there is always a en locale translation available. This is an important point, anytime there is translation the base translations needed that must be 100% translated should be en, not en-US or en-GB or anything else.

From the above you can see I've given tried to cover every use case and made it as easy as possible to partially translate and start small.

These changes are 100% backwards compatible, and support having a single string translated or the entire application.

Co-Authored-By: Meadowsys <[email protected]>

Because the `atom` global doesn't seem available during test runs Co-Authored-By: Meadowsys <[email protected]>

Co-Authored-By: Meadowsys <[email protected]>

savetheclocktower · 2024-08-10T03:16:46Z

Is there any urgency to get this into 1.120, or are you cool with letting it sit for a while? I think it looks fine at first glance, but I'd love to have a few weeks to play around with it if you're open to that.

confused-Techie · 2024-08-10T06:41:29Z

@savetheclocktower I'm not against this one waiting around a bit. Obviously it'd be awesome to get it in sooner rather than later, but a month seems completely reasonable

meadowsys

took quick (sleepy) look, looks good to me!

smol sidenote: the work translators have done so far can likely still be used! depending on final format, we may have to "import" manually / write a tool to change the format, but, still usable! that was probably my main worry after I lost motivation on the original one, that translators' work was for nothing ><

meadowsys · 2024-08-11T20:16:00Z

src/menu-manager.js

+      if (Array.isArray(item.submenu)) {
+        for (let y = 0; y < item.submenu.length; y++) {
+          if (this.i18n.isAutoTranslateLabel(item.submenu[y].label)) {
+            item.submenu[y].label = this.i18n.translateLabel(item.submenu[y].label);
+          }
+        }
+      }


maybe this should handle submenus recursively? (or am I mistaken, and submenus aren't possible / this already handles it?)

That's not a bad call at all, I'm not sure if there's a limit to how many submenus deep this structure could go, but I'll check to see if there's docs on the topic, since we want to make sure we don't forget anything

meadowsys · 2024-08-11T20:26:41Z

src/config-schema.js

+          primary: {
+            type: "string",
+            order: 1,
+            default: "en-US",


do you think maybe this default should just be en?

Since we are using RFC4647 Lookup Filtering Fallback Pattern it's best for your locale to be as specific as possible. Here if our primary is en-US by default, that'll automatically cover en-US and en, as that'll be the next available fallback language if en-US is not available. The same way that if someone's primary locale was zh-Hant-CN they'd still fallback quickly to just zh if no other locales were available.

So I'm not saying this should be the default necessarily, but whatever it is, is better off being as specific as possible in terms of locale. But since the majority of Atom was developed in English by Americans I assumed this would be the most logical default, as it's what's essentially already expected by most users.

You'll even notice the beginner locale file I made for Pulsar is ./locales/pulsar.en.cson. But that's still what's loaded with the primary locale of en-US due to the fallback behavior.

confused-Techie · 2024-08-12T02:01:13Z

@meadowsys Good call on importing translations after the fact.

Since after making this PR I did take a look to realize how much has already been translated, and was worried about that work going to waste.

But you are right, pulsar-edit/i18n-intermediate-sync still exists, and we could just put all of this stuff into a new file, then during the step where file structure is modified we could combine the files and move any translated keys to other keys in the final format.

If (hopefully when) we get this merged, I'll get started over there to work all that out

savetheclocktower · 2025-04-23T05:37:57Z

OK, this deserves a thorough review, but let me brain-dump some ideas in here:

The JSON/CSON file approach feels quite similar to how i18n is implemented in VS Code — a bunch of .json files that define strings, then a %magical.percent-based.syntax% to inject them into other JSON files.
Seems like, as long as something is designed to pull from a string key programmatically or with the %magical.percent-based.syntax%, any package can define any key. This makes a lot of sense — language support (even for core menus and such) can be added via community packages rather than always needing to land in core. But it does make me wonder what happens in case multiple packages want to provide the same strings. Maybe there's a unit test for it that I missed. I'd expect that latest to load wins, but it could get confusing to the user without a way for them to reason about it. (Not that I have a better idea.)
Other than interesting choices with method names (which I address a bit), I don't have any real quibbles about the API. Still, since this is a core API, I wonder if it makes sense to hide this behind a feature flag to start out, or otherwise mark it as experimental and subject to change. I think modern Tree-sitter benefited from a soft-launch like this and I worry a little bit about boxing ourselves in the way we kinda did with atom.ui. If we soft-launch it, we can theoretically play around with the API and work out if there are any pain points or design decisions we want to revisit.

I've steered clear of all the previous i18n discussions because (a) I'm not the target market; and (b) it felt like a minefield. I keep thinking about command names in particular — not just the problem of the command palette (since ultimately we can filter on whatever labels we choose, even if they're localized), but also keymaps, and the awkwardness of having the command's ID be English text no matter what. My instinct was always that, if I were truly more comfortable with a code editor in my hypothetical non-English native language, I'd be tormented by a “half-translated” Pulsar, and that half of the strings would be in English no matter what because of all the unmaintained community packages out there.

Maybe this is too cynical, though. I know that people ask about i18n on a somewhat regular basis, so maybe it's a way to get more folks engaged and contributing.

savetheclocktower

Not an approval as such, since I haven't tried it out yet — just some questions and suggestions.

spec/i18n-spec.js

src/config-schema.js

src/i18n.js

Co-authored-by: Andrew Dupont <[email protected]>

confused-Techie · 2025-04-25T05:11:18Z

@savetheclocktower Thanks a ton for the review, I'll go ahead and start incorporating that feedback.

…ameterized` This static method had to be renamed since a non-static method of `shouldIncludeLocale` already existed

confused-Techie · 2025-05-13T00:52:16Z

@savetheclocktower I've gone ahead and converted all static methods to proper camelCase to match our existing code better.

But I'd love to see what you think on the rest of it. To address some points you've made:

LocaleLabels

Like you mentioned the use of %somekey% was based on how VSCode did it. It seems to work pretty well for them, so I see no reason why it couldn't work just as well for us. And really this key structure is how all text would be translated, just a keypath provided for the API to lookup. So really any package can utilize any key they want.

Key Overrides

You do mention concern that any package could override the key path of another or one in core. But this is actually protected against.
While any package can utilize a key from any package or from core, they cannot override them. Package's keys are locked into their own namespace.
This means if my package's name is confused-Techies-cool-package and I have keys defined as:

'pulsar': {
  'ui': {
    'myString': 'Hello World'
  }
}

The the key myString is only accessible via confused-Techies-cool-package.pulsar.ui.myString.

This does ensure that translations are only ever provided as intended by the creator. But you do make a good point, that key overrides could mean faster translation of core, or other packages. So maybe we do want to allow override-able keys? Or make it configurable? It's hard to say, but since a package would have to start using translation anyway, it seems to me the only keys to get overridden would be core, it made sense to restrict this behavior.

Soft-Launch

I'm also not opposed to a soft-launch myself. But the logistics of it do seem a tad more difficult, so I'd be curious on any ideas on how we could do that.

Since to opt into translation for text would mean using the API and having that fill in the results. A soft launch would mean having both options working at the same time, which seems like it would mean having to check each time if the user has opted in, then returning based on that.

I suppose we could change the signature of t/translate to be something like:

t(keyPath, fallbackString?|opts?, opts?)

Where if a string is the second parameter (this would be the original string) then the translate method itself could check what the user has enabled, and if they haven't opted in then we return this fallback string. If they do have it enabled then we can translate the keypath as we would normally, then opts can optionally be provided as the secondary or tertiary parameter depending on if this function is supporting the opt-in fallback behavior.

Would you think something like this would safely support a more soft-launch opt-in behavior? Then maybe we just restrict community packages from it while we soft-launch?

Just like `translateLabel` if we fail to translate a label we should always return the original keyPath so the user knows something is supposed to be there, and hopefully gets a hint at what was intended. Plus hopefully this makes troubleshooting easier/faster to know there was a failure

…or this now

savetheclocktower · 2025-07-05T16:53:23Z

I think a soft launch could work like this:

Expose an experimental setting to allow users to select a locale. Default would be en-US (or en), since that's the status quo. Next would be “Use OS setting,” followed by a full list of locales. The setting description should make it abundantly clear that this will have minimal effect during the experimental period.
Pick one or two builtin packages that have many opportunities for localizable text. (For example: autocomplete-plus has lots configuration settings and descriptions.) Add translations for some common locales for those packages. (We seem to have plenty of users from France and Germany, so those seem like good initial translation targets, but that's just a gut feeling.)
Solicit feedback from a handful of users. I think enough European users idle in Discord that we can just ask for volunteers a few times and get some takers.

This would allow us to technically launch the API while making it clear that nobody should rely on it yet for their own packages, since anything would be subject to change.

I think it's fair to be cautious about this, but I definitely don't want it to become a road block or bureaucratic bottleneck. We do have prior art here in VS Code; and as long as we're not leaving out any major features that are present in VS Code's implementation, we can use their experience as proof that this approach is sound.

I think the difference between a soft launch and a full launch could be as little as one or two releases.

Taking another shot at i18n

40efe0b

Co-Authored-By: Meadowsys <[email protected]>

confused-Techie marked this pull request as draft August 9, 2024 03:14

confused-Techie and others added 11 commits August 8, 2024 20:38

Load core locales during initialization

7449b44

Co-Authored-By: Meadowsys <[email protected]>

UI translations, context-menu translations

9b9094c

Co-Authored-By: Meadowsys <[email protected]>

Menu item translations, cleanup auto-translations

9f47be8

Co-Authored-By: Meadowsys <[email protected]>

Stop using atom global

5c3b47a

Because the `atom` global doesn't seem available during test runs Co-Authored-By: Meadowsys <[email protected]>

Update specs to ensure they pass

431d802

Co-Authored-By: Meadowsys <[email protected]>

Localize: Settings Title & Description

d73cd15

Co-Authored-By: Meadowsys <[email protected]>

Add tests

b7bd819

Add LocaleLabel translation tests to context-menu and manus

977f1bd

Fix errors in tests

0807084

Update pulsar.en.cson with new format

6381419

Properly set fallback locale

0783bb8

confused-Techie marked this pull request as ready for review August 10, 2024 02:57

meadowsys reviewed Aug 11, 2024

View reviewed changes

confused-Techie added 5 commits February 9, 2025 15:58

Merge branch 'master' into localization-ct

e46ffba

Ensure context menu submenus are translated

40f84c0

Fix tests

2e1bc28

Merge branch 'master' into localization-ct

83597c5

Ensure we pass the correct parameters to I18n.ShouldIncludeLocale

78c7584

confused-Techie mentioned this pull request Feb 10, 2025

[core&packages] Begin Localization #1210

Draft

confused-Techie added 3 commits February 9, 2025 21:43

Support LocaleLabels in settings schema enum.description

a84d5ae

Merge branch 'master' into localization-ct

45e7da7

Cache formatters to speed up translations

501e482

savetheclocktower reviewed Apr 23, 2025

View reviewed changes

spec/i18n-spec.js Outdated Show resolved Hide resolved

src/config-schema.js Outdated Show resolved Hide resolved

src/i18n.js Outdated Show resolved Hide resolved

Update src/config-schema.js

625afec

Co-authored-by: Andrew Dupont <[email protected]>

confused-Techie added 3 commits May 12, 2025 17:22

camelCase localeNegotation

b868eb9

camelCase and rename ShouldInlcudeLocale => `shouldIncludeLocalePar…

df9835a

…ameterized` This static method had to be renamed since a non-static method of `shouldIncludeLocale` already existed

camelCase DoLocalesMatch => doLocalesMatch

dbb4cd0

confused-Techie added 3 commits May 12, 2025 18:03

Merge branch 'master' into localization-ct

7ee0c4a

Removed translateLabel specific fallback, as translate accounts f…

a815440

…or this now

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[core] i18n: Fresh Start #1074

[core] i18n: Fresh Start #1074

Uh oh!

confused-Techie commented Aug 9, 2024 •

edited

Loading

Uh oh!

savetheclocktower commented Aug 10, 2024

Uh oh!

confused-Techie commented Aug 10, 2024

Uh oh!

meadowsys left a comment

Uh oh!

meadowsys Aug 11, 2024

Uh oh!

confused-Techie Aug 12, 2024

Uh oh!

meadowsys Aug 11, 2024

Uh oh!

confused-Techie Aug 12, 2024

Uh oh!

confused-Techie commented Aug 12, 2024

Uh oh!

savetheclocktower commented Apr 23, 2025

Uh oh!

savetheclocktower left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

confused-Techie commented Apr 25, 2025

Uh oh!

confused-Techie commented May 13, 2025

Uh oh!

savetheclocktower commented Jul 5, 2025

Uh oh!

Uh oh!

Uh oh!

[core] i18n: Fresh Start #1074

Are you sure you want to change the base?

[core] i18n: Fresh Start #1074

Uh oh!

Conversation

confused-Techie commented Aug 9, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Getting Started

Translating Strings

Methods of Translation

LocaleLabel

API

How does a user control translations?

How is this priority list used?

Uh oh!

savetheclocktower commented Aug 10, 2024

Uh oh!

confused-Techie commented Aug 10, 2024

Uh oh!

meadowsys left a comment

Choose a reason for hiding this comment

Uh oh!

meadowsys Aug 11, 2024

Choose a reason for hiding this comment

Uh oh!

confused-Techie Aug 12, 2024

Choose a reason for hiding this comment

Uh oh!

meadowsys Aug 11, 2024

Choose a reason for hiding this comment

Uh oh!

confused-Techie Aug 12, 2024

Choose a reason for hiding this comment

Uh oh!

confused-Techie commented Aug 12, 2024

Uh oh!

savetheclocktower commented Apr 23, 2025

Uh oh!

savetheclocktower left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

confused-Techie commented Apr 25, 2025

Uh oh!

confused-Techie commented May 13, 2025

LocaleLabels

Key Overrides

Soft-Launch

Uh oh!

savetheclocktower commented Jul 5, 2025

Uh oh!

Uh oh!

confused-Techie commented Aug 9, 2024 •

edited

Loading