Enhance and Automate Translations #494

Open
@ido777

Description

Internationalization is a major strength of the Primer, but many translations are incomplete or outdated relative to the English content. This phase will modernize the translation workflow with AI assistance and better community processes.

Justification: The Primer has attracted contributors worldwide, resulting in multiple language versions. However, updates to some languages (e.g. French and Chinese) have stalled; community members have noted translation efforts being “stalled” for years. Keeping translations up to date manually is difficult as the English source evolves. By leveraging AI translation and a structured workflow, we can quickly fill gaps and reduce the burden on volunteer translators, while still ensuring quality through community review. This makes the content more accessible globally.

Implementation Steps:

  • Make Translation Gap Visible: Find an automatic way to record the latest human-approved version of each translation and to show how far it lags behind the current English content.
  • Choose AI Translation Tool: Utilize an automated translation service or open-source model to generate initial translations:
    • Options include DeepL API, Google Translate API, or open-source LibreTranslate. An advanced option is to use a GitHub Action powered by GPT for high-quality output.
    • For example, the GPT Translate GitHub Action can translate markdown files into multiple languages using AI models. This could be configured to run whenever content changes or on demand, producing a PR with updated translations and adding an automatic-translation label to the PR.
  • Set Up Workflow: Create a dedicated branch or fork for each language (or use a single translations branch) where AI-generated translations will be staged:
    • Consider using a GitHub Action workflow that triggers on updates to English docs. It would take the diff or updated section and translate it into each supported language, committing the changes (or opening PRs) in the translation branch.
    • If using a GitHub Action, ensure it is configured with proper credentials and limits (to avoid abuse or large costs). For instance, only maintainers can trigger it, or it runs on a schedule for batch updates.
    • Alternatively, contributors can run it locally with their own API key, with no central process or shared cost.
  • Community Validation: Machine translation, while much improved, can still contain errors, especially in technical terms. Implement a community review step:
    • Announce the new translations or updates in the issue tracker or discussions, tagging previous translators or native speakers. Encourage them to review the AI-generated text and make corrections via PRs or comments.
    • Possibly mark auto-translated content with a flag (HTML comment or a badge in the text) indicating “Unreviewed translation – help improve this!” so readers know to take it with caution until verified.
    • Update the existing Translations Contributing Guidelines to include the new AI-assisted process (e.g. how to run the translation action, how to review).
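
One way the "translation gap" and the "unreviewed" flag could be combined is sketched below: each translated file carries an HTML comment that records a hash of the English source it was translated from, plus a review status. The marker format, the function names, and the status labels are all assumptions made for illustration; nothing like this exists in the repository today.

```python
import hashlib
import re

# Hypothetical marker format (an assumption, not an existing repo convention), e.g.:
# <!-- translation-status: unreviewed source-sha256=<64 hex chars> -->
MARKER = re.compile(
    r"<!--\s*translation-status:\s*(reviewed|unreviewed)"
    r"\s+source-sha256=([0-9a-f]{64})\s*-->"
)


def source_digest(english_text: str) -> str:
    """Hash the English source so a translation can record which version it matches."""
    return hashlib.sha256(english_text.encode("utf-8")).hexdigest()


def make_marker(english_text: str, reviewed: bool) -> str:
    """Build the HTML comment to place at the top of a translated file."""
    status = "reviewed" if reviewed else "unreviewed"
    return f"<!-- translation-status: {status} source-sha256={source_digest(english_text)} -->"


def translation_gap(english_text: str, translated_text: str) -> str:
    """Classify a translation as 'missing-marker', 'outdated', 'unreviewed', or 'up-to-date'."""
    match = MARKER.search(translated_text)
    if match is None:
        return "missing-marker"
    status, recorded = match.groups()
    if recorded != source_digest(english_text):
        return "outdated"  # the English source changed after this translation was made
    return "up-to-date" if status == "reviewed" else "unreviewed"
```

A reviewer who approves a machine translation would flip the status to `reviewed`; any later change to the English file makes the recorded hash stale, so the translation shows up as `outdated` without manual bookkeeping. A whole-file hash is coarse, so a real setup might prefer commit SHAs or per-section hashes.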

Next Steps (not in this issue):

  • Missing Translations: For languages that the community has requested but never completed, use the AI pipeline to create an initial full draft:
    • For example, generate a French version if not complete, using AI as a starting point. This gives volunteers a base to refine, rather than starting from scratch.
    • Do this one language at a time to manage load, focusing on those with volunteer interest. (French and Turkish, among others, have been requested in issues.)
  • Backmerge Improvements: As community members review and fix translations, merge those improvements back into the main translation branch.
  • Ongoing Automation: Going forward, whenever the English master docs change, use the automated process to update translations within a short window (e.g. open PRs with changes in each language). This ensures no translation stays outdated for long without at least a machine suggestion for new content.
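
To keep automated updates small, the pipeline could re-translate only the sections of the English text that changed rather than whole files. The sketch below assumes the content is markdown split at `#`-style headings; the function names are hypothetical, and the actual translation call is deliberately left out because no tool has been chosen yet.

```python
import hashlib
import re

# Matches ATX headings ("# " through "###### ") at the start of a line.
# Limitation: a "# comment" line inside a fenced code block is also treated as a heading.
HEADING = re.compile(r"^#{1,6} ", re.MULTILINE)


def split_sections(markdown: str) -> list[str]:
    """Split a markdown document into stripped chunks, each starting at a heading."""
    starts = [m.start() for m in HEADING.finditer(markdown)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any preamble before the first heading
    bounds = starts + [len(markdown)]
    chunks = (markdown[a:b].strip() for a, b in zip(bounds, bounds[1:]))
    return [chunk for chunk in chunks if chunk]


def _digest(section: str) -> str:
    return hashlib.sha256(section.encode("utf-8")).hexdigest()


def changed_sections(old_english: str, new_english: str) -> list[str]:
    """Return sections of the new English text that do not appear verbatim in the old text."""
    seen = {_digest(s) for s in split_sections(old_english)}
    return [s for s in split_sections(new_english) if _digest(s) not in seen]
```

Only the sections returned by `changed_sections` would need to be sent to the translation service and proposed in a PR, which limits both cost and the amount of text reviewers must re-check.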
