DEV Community

Dan Greene
Can you just combine region and language to make a locale?

As I continue to grow my knowledge of internationalization (i18n), I've had requests from internal and external customers to support something strange.

"Hey, can we support the region of "Saudi Arabia" but avoid taking on translation since translation is costly and requires hiring translators?"

Like any developer, I thought "sure, I can do this!"

As it turns out, the more important question is not “can you do it” but “should you do it?”

How I Tried To Solve This

My first attempt to solve this was to look at date-fns since many UI developers use that for formatting. If date-fns had the “en-SA” locale or if it allowed me to pass it anyway, then my journey would be over. Sadly, I quickly found out that their locale support was limited to a pretty small set of countries and it is manually maintained as opposed to taken from a continuously updated source.

So then I started to look for a way to get the most up-to-date locale information. This search brought me to the browser-supplied Intl API. It seemed pretty great (at first), except that Intl doesn’t expose the format patterns themselves, which hurts my ability to communicate to the user what they are expected to type into a date input widget. I’ve since submitted this as a feature request, but support would only come if the ECMAScript spec committee agreed to it and all browsers implemented the enhanced spec, so it would take a very long time to get a full solution.
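To illustrate the gap: Intl has no accessor that returns a pattern string, but the field order can be roughly inferred from formatToParts. The sketch below is only an approximation under stated assumptions. The function name inferDatePattern and the M/d/y letters are my own illustrative choices, not anything Intl returns.

```javascript
// Infer a rough date pattern for a locale from Intl's formatted parts.
// The letters y, M, d are an illustrative convention, not Intl output.
function inferDatePattern(locale) {
  const formatter = new Intl.DateTimeFormat(locale, {
    year: "numeric",
    month: "numeric",
    day: "numeric",
    timeZone: "UTC",
  });
  const symbols = { year: "y", month: "M", day: "d" };
  const sample = new Date(Date.UTC(2023, 7, 11)); // 11 August 2023

  // Replace each date field with its letter; keep separators as they are.
  return formatter
    .formatToParts(sample)
    .map((part) => symbols[part.type] ?? part.value)
    .join("");
}

console.log(inferDatePattern("en-US"));
console.log(inferDatePattern("en-GB"));
```

This recovers the order of the fields and the separators, but not details such as zero padding, so it is not a substitute for the real pattern data.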

After researching it a lot, I finally came to the source of truth, the CLDR. This is actually what the browsers use under the hood with Intl. By using the CLDR data directly, we have an opportunity to:

  • stay continually up-to-date with locales (since new countries/regions form and existing countries/regions change preferences)
  • get access to the formats for each locale’s dates, date/time, etc. so that we can inform screen reader users of the expected string to type (which, again, Intl does not support at this time)
  • get access to the source data so we can finally see if there is data for a locale like “en-SA”

But wait… the CLDR data doesn’t have “en-SA” either! So what should we do?
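A minimal sketch of what an exact lookup against CLDR data could look like. The sampleCldr object is hand-written for illustration and only mirrors the nesting I understand the CLDR JSON distribution to use. The function name shortDatePattern is hypothetical, and the patterns should be verified against the real CLDR release before relying on them.

```javascript
// Hand-written sample that mirrors the CLDR JSON nesting (an assumption for
// illustration). Real data would be loaded from the CLDR JSON files.
const sampleCldr = {
  main: {
    en: {
      dates: { calendars: { gregorian: { dateFormats: { short: "M/d/yy" } } } },
    },
    "ar-SA": {
      // Pattern as quoted in this article, with directional marks omitted.
      dates: { calendars: { gregorian: { dateFormats: { short: "d/M/y" } } } },
    },
  },
};

// Return the short Gregorian date pattern for an exact locale match,
// or undefined when the data has no entry. No fallback is attempted.
function shortDatePattern(cldr, locale) {
  return cldr.main[locale]?.dates?.calendars?.gregorian?.dateFormats?.short;
}

console.log(shortDatePattern(sampleCldr, "ar-SA")); // prints d/M/y
console.log(shortDatePattern(sampleCldr, "en-SA")); // prints undefined
```

The point of the design is that a missing locale yields undefined instead of quietly turning into some other locale.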

How Does CLDR Handle Missing Locales

So my immediate finding was: there is no "en-SA" locale, there is only "ar-SA." In other words, there is Saudi Arabia that speaks Arabic, but no Saudi Arabia that speaks English. This reflects reality. So why would we try to fake "en-SA"?

Let's see what would happen anyway. Luckily, the CLDR Elixir library has a very clear (but less-than-ideal) answer:

When validating a locale name, Cldr will attempt to match the requested locale name to a configured locale. Therefore Cldr.Locale.new/1 may return an {:ok, language_tag} tuple even when the locale returned does not exactly match the requested locale name. For example, the following attempts to create a locale matching the non-existent “English as spoken in Spain” locale name. Here Cldr will match to the nearest configured locale, which in this case will be “en”.

That one part is so important that it is worth repeating:

the non-existent “English as spoken in Spain” locale name

It doesn't exist.

While I like the clarity of their documentation, I am not pleased with the silent resolution that happens (i.e. "match to the nearest configured locale, which in this case will be 'en'"). So if you were to pass "en-SA", you don't get an error; you get back a different locale than the one you asked for.

So now that we know that the Elixir library for CLDR will silently resolve our fake "English Saudi Arabia" locale to English, let's see how Intl handles it.

new Intl.DateTimeFormat("en-SA").format(new Date(2023, 7, 11)) // 11 August 2023
// '8/11/2023'

As you can see, it is using the English (United States) standard for dates, where the month comes first. This is NOT how Saudi Arabia prefers dates. They use the d‏/M‏/y pattern as shown in LocalePlanet, which is nearly identical to CLDR in my experience. So Saudi Arabia expects 11/8/2023, and yet if we used our "fake locale" approach, we would accidentally be giving them 8/11/2023.

This would not be helpful to the user.
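One way to make this kind of silent fallback visible is to compare the tag you asked for with the one Intl reports back through resolvedOptions(). This is a sketch, not a guarantee: the helper name checkLocale is hypothetical, and the result for any given tag depends on the locale data shipped with your browser or Node version.

```javascript
// Report whether Intl resolved the requested tag as-is or fell back.
function checkLocale(requested) {
  // baseName drops Unicode extensions, so "en-US-u-ca-gregory" compares as "en-US".
  const requestedBase = new Intl.Locale(requested).baseName;
  const resolved = new Intl.DateTimeFormat(requested).resolvedOptions().locale;
  const resolvedBase = new Intl.Locale(resolved).baseName;

  return {
    requested: requestedBase,
    resolved: resolvedBase,
    exact: requestedBase === resolvedBase,
  };
}

console.log(checkLocale("en-US"));
console.log(checkLocale("en-SA"));
```

When exact is false, the formatter is using a different locale than the one requested, which is a good moment to warn the user instead of showing them dates in a format they did not choose.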

So What Should We Do?

Often the simplest answer is best for the user, and we have a simple answer:

Do not let your user select a locale that does not exist in the CLDR.

Instead, give them some kind of UX feedback that helps them to find a locale that most closely matches their preferences.

For example, you could even give them a questionnaire.

Which of these number formats looks most normal to you? Which of these date formats looks best to you?

Once they click on the radio buttons for that questionnaire, you can give them a list of commonly used locales that utilize those formats. For example, people who like to see days first will often select "en-GB" for English as it is spoken in Great Britain. By doing this, you've:

  • 🌈 helped your user to get the end result they want
  • 🌈 avoided misleading the user with the false sense that their region/country is supported, which prevents a great many bugs
  • 🌈 avoided writing a bunch of fancy, hacky code just to work around the problem
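The questionnaire idea could be sketched roughly like this: render one sample date and one sample number in each candidate locale, group the locales by what the user would actually see, and let the user pick the group that looks right. The candidate list and the function name groupLocalesBySample are assumptions for illustration. In practice the candidates would be locales you have confirmed exist in the CLDR.

```javascript
// Group candidate locales by how they render a sample date and number.
function groupLocalesBySample(locales) {
  const sampleDate = new Date(Date.UTC(2023, 7, 11)); // 11 August 2023
  const sampleNumber = 1234567.89;
  const groups = new Map();

  for (const locale of locales) {
    const date = new Intl.DateTimeFormat(locale, { timeZone: "UTC" }).format(sampleDate);
    const number = new Intl.NumberFormat(locale).format(sampleNumber);
    const key = `${date} | ${number}`;
    groups.set(key, [...(groups.get(key) ?? []), locale]);
  }
  return groups;
}

// Hypothetical candidate list for illustration only.
const candidates = ["en-US", "en-GB", "de-DE", "fr-FR"];
const groups = groupLocalesBySample(candidates);

for (const [sample, locales] of groups) {
  console.log(sample, "->", locales.join(", "));
}
```

Each key is a sample the user can recognize at a glance, and each value is the list of real locales that produce it.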

Ultimately, honesty is the best policy.
