Multilanguage - Use any Locale now in TYPO3 v12

Benni Mack

Languages have infinite possibilities — so should your multilingual website

By default, TYPO3 uses English as the base language and ships with a mechanism to translate text strings into other languages. Strings include all text in the backend user interface (UI) as well as in the frontend, the output of your website. A common text string might be the words on a button, such as “Log In”. These words are not handled as content but are stored in external translation files. One of TYPO3’s strengths is its long-lasting migration path from previous versions, so any changes to the handling of translation files are made with maximum backwards compatibility in mind.

This article explains the basic idea of languages and locales as they relate to translation files in a CMS context, from choosing a language to its variations and their format, and how TYPO3 has gradually shifted to using locales as a better identifier of a language and its variation.

The Rigid Past - TYPO3 and Language Challenges

The admin interface and all TYPO3 plugins are English by default, and all labels reside in files following the XLIFF standard. But before v12, the way TYPO3 approached language had some quirks. TYPO3’s “supported languages” are a fixed list of “language codes” (such as “de” for German), which Kasper Skårhøj originally introduced over 20 years ago. Over time, the list has expanded to over 57 supported languages. The TYPO3 Localization Team works hard to translate the English strings and labels into these languages. In mid-2023, the translation for TYPO3 in Arabic was completed. This is a great community effort and an amazing system that works well. Every TYPO3 installation can download additional “language packs” for its extensions, even third-party extensions, in any of these supported languages.

When using German for a site language or an editor in the TYPO3 backend, the Localization system automatically checks for available language packs in the local setup. It references a “de.mylabels.xlf” file, checks for the label, and returns it. If the label doesn’t exist in the XLF file, it falls back to the English version.

However, the challenge comes when a project needs a regional variation of a language, for example, German for Switzerland. This is where “locales” come into play. A locale identifier usually consists of a language code such as “de” (ISO 639-1, two lowercase letters) and an optional two-letter country code (ISO 3166-1 alpha-2), separated by an underscore, resulting in “de_CH”. Previously, you could “add” an additional language in any format, such as “de_CH”, but you also needed to tell the system to fall back to “de” and then to English, so that you only had to override the words or sentences that differ.
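
As an illustration of that manual step, the following sketch registers “de_CH” and declares its fallback to “de”. The configuration keys are written as I recall them from the TYPO3 documentation for versions up to v11, so treat them as an assumption and verify them against the version you run. The label “Swiss German” is just an example value.

```php
<?php
// Typically placed in an additional configuration file of the installation.
// Register the custom language key together with a human-readable name.
$GLOBALS['TYPO3_CONF_VARS']['SYS']['localization']['locales']['user']['de_CH'] = 'Swiss German';

// Declare which languages "de_CH" falls back to before the English default is used.
$GLOBALS['TYPO3_CONF_VARS']['SYS']['localization']['locales']['dependencies']['de_CH'] = ['de'];
```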

In TYPO3 up to v11, once the language was manually registered and its “dependency” defined, the language system checked for “de_CH.mylabels.xlf” first, then for “de.mylabels.xlf”, and finally fell back to “mylabels.xlf”.
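
The lookup order can be sketched in a few lines of plain PHP. This is not TYPO3's implementation; the function name `candidateLabelFiles()` and the `$dependencies` array are hypothetical and only model the order described above.

```php
<?php
/**
 * Build the ordered list of label files to check for a language key.
 * $dependencies maps a language key to the keys it falls back to.
 */
function candidateLabelFiles(string $languageKey, array $dependencies, string $baseName): array
{
    // The requested key comes first, followed by its declared fallbacks.
    $chain = array_merge([$languageKey], $dependencies[$languageKey] ?? []);

    $files = [];
    foreach ($chain as $key) {
        $files[] = $key . '.' . $baseName;
    }

    // The unprefixed file holds the English default labels.
    $files[] = $baseName;

    return $files;
}

print_r(candidateLabelFiles('de_CH', ['de_CH' => ['de']], 'mylabels.xlf'));
```

A label is then taken from the first file in the list that contains it.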

For the website output, language setup is handled in the Site Configuration. One example of a setting is whether a language or locale for your website should be rendered “right to left” or “left to right”. There are a lot of different language settings in the Site Configuration, which makes it complicated for editors to use.

BCP 47 — A standardized format for the Web

Although TYPO3 lets you construct your own format to specify additional languages, there is actually a web standard for this. HTML identifies languages with IETF BCP 47 language tags. Rather than using an underscore, each language tag is composed of one or more “subtags” separated by hyphens (-). For example, Swiss German looks like “de-CH”, but a tag can also contain additional subtags, such as a script, as in “nan-Hant-TW” (that’s Min Nan Chinese written in traditional Han characters as used in Taiwan).
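
To make the structure of such a tag concrete, here is a minimal sketch in plain PHP that splits a simple tag into its language, script, and region subtags. It is deliberately simplified: variants, extensions, and private-use subtags from BCP 47 are not handled, and the function name `parseLanguageTag()` is hypothetical.

```php
<?php
/**
 * Split a simple language tag into language, script and region subtags.
 * Returns null if the tag does not match the simplified pattern.
 */
function parseLanguageTag(string $tag): ?array
{
    // Accept the legacy underscore notation ("de_CH") as well as hyphens ("de-CH").
    $normalized = str_replace('_', '-', $tag);

    $pattern = '/^(?<language>[a-z]{2,3})(?:-(?<script>[a-z]{4}))?(?:-(?<region>[a-z]{2}|[0-9]{3}))?$/i';
    if (preg_match($pattern, $normalized, $m) !== 1) {
        return null;
    }

    // Conventional casing: language lowercase, script title case, region uppercase.
    return [
        'language' => strtolower($m['language']),
        'script' => ($m['script'] ?? '') !== '' ? ucfirst(strtolower($m['script'])) : null,
        'region' => ($m['region'] ?? '') !== '' ? strtoupper($m['region']) : null,
    ];
}

var_dump(parseLanguageTag('nan-Hant-TW'));
```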

TYPO3 v12 Improvements

TYPO3 v12 ships with a list of all ISO 639 language codes, which can be combined with any country or region code, and handles the fallback automatically. The fixed list of language codes defined in TYPO3 Core is still valid, but the changes introduced in version 12 should reduce direct usage of the two-letter “language” format.

In addition, a Locale class has been introduced, which can handle both BCP 47-like formats and the previous syntax. When a translation file is referenced, the Locale is used and the fallback logic for multiple languages applies automatically.

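use TYPO3\CMS\Core\Localization\Locale;
// $languageServiceFactory is an injected TYPO3\CMS\Core\Localization\LanguageServiceFactory
$languageService = $languageServiceFactory->create(new Locale('de-CH'));
$myTranslatedString = $languageService->sL('LLL:EXT:my_extension/Resources/Private/Language/myfile.xlf:my-label');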
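
One way to picture an automatic fallback is to drop subtags from the right until only the language is left. The sketch below does this in plain PHP. It follows the general truncation idea used for language tag lookup and is an assumption for illustration, not necessarily how TYPO3 resolves fallbacks internally; `fallbackChain()` is a hypothetical helper.

```php
<?php
/**
 * Derive a fallback chain by removing subtags from the right,
 * e.g. "nan-Hant-TW" -> "nan-Hant" -> "nan".
 */
function fallbackChain(string $tag): array
{
    // Normalize the legacy underscore notation before splitting.
    $subtags = explode('-', str_replace('_', '-', $tag));

    $chain = [];
    while ($subtags !== []) {
        $chain[] = implode('-', $subtags);
        array_pop($subtags);
    }

    return $chain;
}

print_r(fallbackChain('de-CH'));
```

With a chain like this, no manual dependency needs to be declared for “de-CH” to reach the “de” labels before the English default.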

The necessary options for setting up a new language in TYPO3 v12 have also been drastically reduced, so it’s now easier than ever to configure a new language.

Most values, such as right-to-left handling and even the hreflang tags, are now derived from the locale.
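
To show what “derived from the locale” can mean in practice, here is a small plain PHP sketch that computes a text direction and an hreflang value from a locale string. The function name `deriveLanguageSettings()` and the short list of right-to-left languages are assumptions for illustration only and do not reflect TYPO3's internal logic.

```php
<?php
/**
 * Derive the hreflang value and text direction from a locale string
 * such as "de_CH.UTF-8" or "ar-EG".
 */
function deriveLanguageSettings(string $locale): array
{
    // Illustrative, incomplete list of languages written right to left.
    $rtlLanguages = ['ar', 'fa', 'he', 'ur'];

    // Drop an encoding suffix such as ".UTF-8" and normalize the separator.
    $tag = str_replace('_', '-', explode('.', $locale)[0]);

    // The first subtag is the language code.
    $language = strtolower(explode('-', $tag)[0]);

    return [
        'hreflang' => $tag,
        'direction' => in_array($language, $rtlLanguages, true) ? 'rtl' : 'ltr',
    ];
}

print_r(deriveLanguageSettings('de_CH.UTF-8'));
```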

For extension authors, TYPO3 now includes a list of all ISO 639-1 and ISO 639-2 codes (two- and three-letter language codes), including official language names, in a commonly available PHP class `\TYPO3\CMS\Core\Localization\OfficialLanguages`. The language names are translated into the TYPO3 supported languages where translations are available.

Locales all the way

TYPO3 v12 makes life easier if your project has special requirements for languages and locales. From setting up new languages and reducing configuration overhead to adding custom language-country-region combinations, everything is possible now with TYPO3 v12.

We are really happy that TYPO3 as a leading open-source content management system can solve all the needs of our clients, and we haven’t seen any comparable system with similar capabilities yet.

Do you have a special case regarding languages or translations for your website? Reach out to us!
