Docusaurus: [v2] Translations approach

Created on 24 Apr 2020  ·  12 Comments  ·  Source: facebook/docusaurus

💥 Proposal

An umbrella issue to track the translations work for v2 and to guide the design and implementation of the feature. Please leave suggestions and feedback on this issue.

Some initial thoughts:

  • Translations should be a core feature built into Docusaurus core, not implemented specifically by any plugins
  • We expose a current locale value in the context. Elements which need it can read from it
  • For performance reasons, each translated website should be its own website (can be done via adding a /<locale>/ to the baseUrl). This is basically sharding the website bundle so that the main bundle doesn't contain routes for each locale. If performance isn't a concern, then probably there's no need to do this
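As a rough illustration of the sharding idea above, here is a minimal sketch that derives one site entry per locale by appending `/<locale>/` to the base URL. The `SiteShard` type and the `shardByLocale` function are hypothetical names made up for this example; they are not part of Docusaurus.

```typescript
// Hypothetical shape for illustration only; not a Docusaurus type.
interface SiteShard {
  locale: string;
  baseUrl: string;
}

// Derive one shard per locale by appending "<locale>/" to the base URL.
function shardByLocale(baseUrl: string, locales: string[]): SiteShard[] {
  // Make sure the base ends with a slash before appending the locale.
  const root = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
  return locales.map((locale) => ({ locale, baseUrl: `${root}${locale}/` }));
}

const shards = shardByLocale("/", ["en", "fr"]);
console.log(shards.map((s) => s.baseUrl).join(" "));
```

Each shard could then be built as its own website, so the bundle for one locale carries no routes for the others.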

Follow this issue for updates.

Some tools to consider using:

Issues with v1 translations

Have you read the Contributing Guidelines on issues?

Yes

proposal translations

All 12 comments

Do you plan to use an approach like react-i18n-intl? In v1 we have Crowdin for translations, but that imposes the limitation of assuming the default language is English. Is there a plan for v2 to change this architecture, so that developers can submit their own translations and a framework reads them?

There are two things we need to translate here:

  • UI and React components - Those can be powered by i18n libraries such as FBT and react-i18n-intl
  • Documentation - I presume the docs for another language would be side-by-side with the original docs:
├── introduction.md
├── introduction.jp.md
├── introduction.es.md
...
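To make the side-by-side naming concrete, here is a small sketch that splits such a file name into a document id and a locale. The `<id>.<locale>.md` convention with a two-letter locale, the `"en"` default, and the `parseDocFile` name are all assumptions for this example.

```typescript
interface DocFile {
  id: string;
  locale: string;
}

// Parse "<id>.md" or "<id>.<locale>.md" (".mdx" is accepted as well).
// Assumption: the locale suffix is exactly two lowercase letters.
function parseDocFile(fileName: string, defaultLocale = "en"): DocFile | null {
  const match = /^(.+?)(?:\.([a-z]{2}))?\.mdx?$/.exec(fileName);
  if (!match) return null; // not a Markdown document
  return { id: match[1], locale: match[2] || defaultLocale };
}

console.log(parseDocFile("introduction.jp.md"));
console.log(parseDocFile("introduction.md"));
```

One limitation of this convention: a source document whose name itself ends in a two-letter segment (say `notes.io.md`) would be read as a translation, so a real implementation would probably check the suffix against the configured locale list.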

@yangshun would #2302 need to be completed first, or will versioning stay similar enough as presented to the user?

@nebrelbug Versioning might change a bit. We might end up doing /<baseUrl>/<locale>/<docs>/<version>/<doc-path>. IMO locale should be a first class feature in Docusaurus.
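For readers who want to see the URL shape spelled out, here is a minimal sketch that assembles a path from those five parts. The `DocLocation` type and `buildDocUrl` function are hypothetical, and the final v2 routing may well differ.

```typescript
// Hypothetical description of where a document lives; names mirror the
// placeholders in /<baseUrl>/<locale>/<docs>/<version>/<doc-path>.
interface DocLocation {
  baseUrl: string;
  locale: string;
  docsRoute: string;
  version: string;
  docPath: string;
}

function buildDocUrl(loc: DocLocation): string {
  // Join everything, then drop empty segments so stray slashes collapse.
  const parts = [loc.baseUrl, loc.locale, loc.docsRoute, loc.version, loc.docPath]
    .join("/")
    .split("/")
    .filter((part) => part !== "");
  return "/" + parts.join("/");
}

console.log(
  buildDocUrl({ baseUrl: "/", locale: "en", docsRoute: "docs", version: "v2", docPath: "somedoc-name" })
);
```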

/<baseUrl>/<locale>/<docs>/<version>/<doc-path> meaning something like /en/docs/v2/somedoc-name?

@nebrelbug Sorry, I had to wrap the text in code tags for it to show. But you got it!

We would still love to be able to use Crowdin if that is possible.

There are many potential solutions for i18n in v2 that we are thinking about.

I think for regular MD docs, we should stick to integrating with a SaaS that allows uploading MD files directly, so that translators can work in full doc context, instead of translating key/value pairs.

Keeping compatibility with Crowdin is probably important to v1 users.

We also need to validate that Crowdin and other i18n SaaS providers work well with MDX (i.e. docs pages embedding React components for interactive demos). If you have feedback on that, I'm interested.

I've been exploring using Docusaurus for a multilingual site, only to realize v2 doesn't have any i18n layer. I have some experience with localizing products/documentation in various forms, including some automated localization pipelines, so here are my thoughts on how I would expect i18n to be handled:

  1. I'd prefer i18n of an open-source project to be SaaS platform agnostic, so that Docusaurus users could pick the SaaS of their choice, or stick with an open-source localization solution.
  2. MD/MDX, while not being the easiest format for localization purposes, is something Docusaurus has committed to use, so an external localization solution or process needs to be able to work with MDX as a source and produce MDX as a target.
  3. Scalability of building/deploying a multilingual site is a concern. Adding a number of languages on top of a multi-versioned site adds even more complexity. However, it's unlikely that older versions of the documentation will need to continue to be translated or regularly rebuilt. Ideally, making a versioned snapshot would mean that the current state of the site and all its current translations is frozen, built, and not rebuilt unless explicitly asked. This would potentially allow having many versions without slowing down the builds. Only the latest version of the site would be actively translated and built.
  4. For the speed of development, it is best for the developers or content creators to focus on the source language only, and build/preview only that language. This means that until explicitly asked, it should be fine to compile only the source language, and have the site functional. Ideally, docusaurus could be told to build a specific language (including live-reload scenarios), or to build all languages at once. With this, translations that are being automatically committed to the site repo (in continuous localization scenarios) would not slow down the development.
  5. For SEO, bookmarking and link sharing, the cleanest way is to have each language exposed explicitly in the URL, with the language negotiation being performed at each page level. For example:

    • Visiting / URL on a multi-lingual site would redirect to a language-specific URL (e.g. /en/, /de/, /fr/) based on available languages for that URL, and user's browser language preferences.

    • Visiting /some-folder/some-page/ would redirect the user to e.g. /en/some-folder/some-page/, /de/some-folder/some-page/ or /fr/some-folder/some-page/. An ability to provide deep links that auto-resolve to the target user's language is a must when using them in the product UI, marketing emails, and so on.

  6. Translating content takes some time, so if a page is missing in a specific language, it shouldn't break the build. If one goes to a language-specific URL and the page doesn't exist [yet], the site could show a 404 page that says "this page is not available in your language yet, but you can switch to another language it is available in: xx, yy, zz."
  7. It should be possible to allow for a partial translation of websites. In other words, a situation where a particular page doesn't exist in a particular language, is valid and should not be considered temporary. Unresolved document IDs referenced from sidebars.js must not break the build, but hide that link from the document tree, and give a warning in the console log.
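Points 5 and 6 can be sketched roughly as follows: parse the browser's `Accept-Language` header, pick the best locale in which the requested page actually exists, and otherwise report which locales do have it. Everything here is an assumption for illustration: the `pagesByLocale` shape, the function names, and the choice to fall back to a default locale when no preference matches.

```typescript
// Hypothetical input: for each locale, the page paths that exist in it.
type PagesByLocale = Record<string, string[]>;

// Parse an Accept-Language header into lowercase tags, best q-value first.
function parseAcceptLanguage(header: string): string[] {
  return header
    .split(",")
    .map((part, index) => {
      const [tag, ...params] = part.trim().split(";");
      const qParam = params.map((p) => p.trim()).find((p) => p.startsWith("q="));
      const q = qParam ? parseFloat(qParam.slice(2)) : 1;
      return { tag: tag.trim().toLowerCase(), q: Number.isNaN(q) ? 0 : q, index };
    })
    .filter((entry) => entry.tag !== "" && entry.q > 0)
    .sort((a, b) => b.q - a.q || a.index - b.index)
    .map((entry) => entry.tag);
}

// Locales in which the given page exists (used for the 404 suggestions).
function availableLocales(path: string, pages: PagesByLocale): string[] {
  return Object.keys(pages).filter((locale) => pages[locale].indexOf(path) !== -1);
}

// Resolve a generic URL such as "/some-folder/some-page/" to a localized one.
// Returns null when the page exists in no locale at all.
function resolveGenericUrl(
  path: string,
  acceptLanguage: string,
  pages: PagesByLocale,
  defaultLocale: string
): string | null {
  const available = availableLocales(path, pages);
  if (available.length === 0) return null;
  for (const tag of parseAcceptLanguage(acceptLanguage)) {
    // Try the full tag first, then its primary subtag ("de-at" -> "de").
    for (const candidate of [tag, tag.split("-")[0]]) {
      if (available.indexOf(candidate) !== -1) return `/${candidate}${path}`;
    }
  }
  // Assumed fallback: default locale if it has the page, else any locale that does.
  const fallback = available.indexOf(defaultLocale) !== -1 ? defaultLocale : available[0];
  return `/${fallback}${path}`;
}

const pages: PagesByLocale = {
  en: ["/some-folder/some-page/"],
  de: ["/some-folder/some-page/"],
  fr: [],
};
console.log(resolveGenericUrl("/some-folder/some-page/", "de-AT,de;q=0.9,en;q=0.8", pages, "en"));
```

For a language-specific URL whose page is missing, `availableLocales` gives the "xx, yy, zz" list that the 404 page could offer.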

Proposal

A set of localizable files include docusaurus.config.js, sidebars.js and content folders (blog/, docs/). Imagine that you can have them sitting together in some common subfolder, e.g. content/. So the structure becomes:

content/
      blog/
      docs/
      docusaurus.config.js
      sidebars.js
node_modules/
src/
static/
package.json
...

All localizable content needed by plugins needs to be externalized into JavaScript/TypeScript/JSON files and reside in the content/ folder as well. Ideally there would be some common format and file naming convention that all plugins are expected to use.

Running docusaurus build without parameters would compile only the site inside the default (content/) subfolder, with its local config, sidebars and content.

An external localization process would need to take the source content/ folder and create localized variants of this folder, for example, content-de/, content-fr/, etc. Running Docusaurus against a specific content folder would render that portion of the site. For example, docusaurus build content-fr would try to load content-fr/docusaurus.config.js relative to the current working directory, and resolve all the documents relative to that file.

Each content variant needs to have its unique root URL (mount point) so that the variants can be deployed to the target site independently. This could be done directly in each version of docusaurus.config.js, but it might be cleaner to have them listed in some global config. content-de/ would get a /de/ root URL, content-fr/ would map to /fr/, and so on. The default content/ folder, depending on the source language, would also get a specific root URL, e.g. /en/. Having a root URL other than / is necessary for per-URL language redirects.
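The folder-to-mount-point mapping could look something like the sketch below. It assumes the `content-<locale>` naming from this proposal with two-letter locale codes, and `rootUrlForContentDir` is a made-up helper name.

```typescript
// Map a content folder name to its root URL (mount point).
// Assumptions: folders are named "content" or "content-<locale>", and the
// source language of the default folder is passed in (defaulting to "en").
function rootUrlForContentDir(dir: string, sourceLocale = "en"): string | null {
  const name = dir.replace(/\/+$/, ""); // tolerate a trailing slash
  if (name === "content") return `/${sourceLocale}/`;
  const match = /^content-([a-z]{2})$/.exec(name);
  return match ? `/${match[1]}/` : null;
}

console.log(rootUrlForContentDir("content-de/"));
console.log(rootUrlForContentDir("content"));
```

A global config listing these mappings explicitly would avoid relying on folder names at all; the naming rule here is just the simplest thing that matches the examples.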

Making a version of a site means that the current state of these content folders needs to be duplicated. For example, when one decides to cut a version 1.x, what would be done is:

  • content folder is copied to content-1.x folder, and the root URL of the versioned copy changes from its current one, e.g. /en/, to /en/1.x/
  • content-de folder is copied to content-de-1.x with /de/1.x/ root URL
  • content-fr folder is copied to content-fr-1.x with /fr/1.x/ root URL
  • ...and so on.
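The copy plan in the list above is mechanical enough to sketch. The `ContentVariant` type and `cutVersion` function are hypothetical; the sketch only computes the target folder names and root URLs and does not copy any files.

```typescript
// Hypothetical record of one content folder and where it is mounted.
interface ContentVariant {
  dir: string;
  rootUrl: string;
}

// Compute the versioned copies for every current content variant.
function cutVersion(variants: ContentVariant[], version: string): ContentVariant[] {
  return variants.map((variant) => ({
    dir: `${variant.dir}-${version}`,
    rootUrl: `${variant.rootUrl}${version}/`,
  }));
}

const current: ContentVariant[] = [
  { dir: "content", rootUrl: "/en/" },
  { dir: "content-de", rootUrl: "/de/" },
  { dir: "content-fr", rootUrl: "/fr/" },
];
for (const copy of cutVersion(current, "1.x")) {
  console.log(`${copy.dir} -> ${copy.rootUrl}`);
}
```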

Similarly, running docusaurus deploy <content-directory> would deploy this specific directory. It would be handy to have globs supported, so that one could say docusaurus deploy * to deploy all content directories, or docusaurus deploy *-1.x to redeploy all the content directories that end with -1.x.
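A simple way to support such patterns is to translate the glob into a regular expression and filter the known content directories. This is only a sketch supporting `*`; `globToRegExp` and `selectContentDirs` are hypothetical helpers, and `docusaurus deploy <content-directory>` itself is part of this proposal, not an existing command form.

```typescript
// Convert a glob with "*" wildcards into an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters except "*", which becomes the wildcard.
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, "[^/]*") + "$");
}

// Pick the content directories whose names match the glob.
function selectContentDirs(glob: string, dirs: string[]): string[] {
  const pattern = globToRegExp(glob);
  return dirs.filter((dir) => pattern.test(dir));
}

const dirs = ["content", "content-de", "content-1.x", "content-de-1.x"];
console.log(selectContentDirs("*-1.x", dirs));
```

Note that most shells expand an unquoted `*` before the command sees it, so the pattern would typically need quoting on the command line.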

Docusaurus would likely need to have static pre-rendered maps of all files for each of the content directories so that it can do quick lookups and per-URL language-based redirects, and that language negotiation/redirect portion will be shared across all content variants. Redeploying a particular language or a particular version would update these static maps, allowing to dynamically render the language switcher on all the pages.
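One possible shape for those static maps: a lookup from page path to the locales that contain it, rebuilt partially when a single locale is redeployed. The `PageMap` type and the three functions are assumptions for this sketch, not an existing Docusaurus data structure.

```typescript
// Hypothetical static map: page path -> locales in which the page exists.
type PageMap = Record<string, string[]>;

// Build the map from per-locale page lists.
function buildPageMap(pagesByLocale: Record<string, string[]>): PageMap {
  const map: PageMap = {};
  for (const locale of Object.keys(pagesByLocale)) {
    for (const page of pagesByLocale[locale]) {
      (map[page] = map[page] || []).push(locale);
    }
  }
  return map;
}

// Redeploying one locale replaces only that locale's entries in the map.
function redeployLocale(map: PageMap, locale: string, pages: string[]): PageMap {
  const next: PageMap = {};
  for (const page of Object.keys(map)) {
    const kept = map[page].filter((l) => l !== locale);
    if (kept.length > 0) next[page] = kept;
  }
  for (const page of pages) {
    (next[page] = next[page] || []).push(locale);
  }
  return next;
}

// Entries for a language switcher on the given page.
function switcherLinks(map: PageMap, page: string): { locale: string; url: string }[] {
  return (map[page] || []).map((locale) => ({ locale, url: `/${locale}${page}` }));
}

let pageMap = buildPageMap({ en: ["/", "/intro/"], de: ["/"] });
pageMap = redeployLocale(pageMap, "de", ["/", "/intro/"]);
console.log(switcherLinks(pageMap, "/intro/"));
```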

Current workaround

Right now it should be possible to mimic the behavior I described above to some extent by creating a side project for each language variant and deploying each sub-site independently to a subdirectory of a global site. Symlinking directories like node_modules, src and some common files might help in keeping all these versions in sync. A global 404 handler might be used to dynamically switch to an available language-specific variant of the page if a generic URL is provided.

Sorry if that's a lot of text to process. :) Would love to hear your thoughts.

Thanks for your feedback @iafan. I agree with you on most parts. If you are on Discord, it could be useful for us to chat before I start working on this feature.

@slorber Of course. I DM'ed you on Discord.

Hey there. I'm closing this issue in favor of a new one: https://github.com/facebook/docusaurus/issues/3317

I'm working on i18n, and have already integrated some of the feedback from this issue.
