www.apress.com

23/1/17

Localizing Apps: What’s Involved with Translation?

By Chris Miller

When you write an app that will support multiple languages, you have to separate the text and other assets (images, videos, audio) from the code. Having the text strings in a resource file provides two benefits for you.

First, you can have the application framework select the appropriate language resource file at runtime; you don’t need the code to pick the correct language. The application runtime will determine the user's current language and culture and will load in the best match for the locale.

The second benefit is to be able to provide just the text resources to a translator. You don't need to provide all your source code—only the resource files containing the text to translate.
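To make the "best match" idea concrete, here is a minimal sketch of how a culture fallback chain can work. It is not how any particular framework implements resource loading. The in-memory `resources` table and the `Lookup` helper are hypothetical stand-ins for real resource files, and the sketch assumes a runtime with full globalization data available.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical in-memory stand-in for per-culture resource files.
// The empty name plays the role of the default (neutral) resources.
var resources = new Dictionary<string, Dictionary<string, string>>
{
    [""]      = new Dictionary<string, string> { ["Greeting"] = "Hello" },
    ["es"]    = new Dictionary<string, string> { ["Greeting"] = "Hola" },
    ["pt-BR"] = new Dictionary<string, string> { ["Greeting"] = "Olá" },
};

// Walk from the specific culture (es-MX) to its parent (es) and finally to
// the invariant culture (empty name), returning the first match found.
string Lookup(string key, CultureInfo culture)
{
    while (true)
    {
        if (resources.TryGetValue(culture.Name, out var table) &&
            table.TryGetValue(key, out var text))
        {
            return text;
        }
        if (culture.Name.Length == 0)
        {
            return key; // nothing found anywhere: fall back to the key itself
        }
        culture = culture.Parent;
    }
}

Console.WriteLine(Lookup("Greeting", new CultureInfo("es-MX"))); // Hola
Console.WriteLine(Lookup("Greeting", new CultureInfo("pt-BR"))); // Olá
Console.WriteLine(Lookup("Greeting", new CultureInfo("fr-FR"))); // Hello
```

A user running in Mexican Spanish gets the generic Spanish text, and a user whose language has no resources at all gets the default language.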

Text translation involves more than replacing one word or set of words with another. Other languages can have different rules and usages. It's important to be aware of what needs to be translated and how languages are handled differently between cultures.

One language can have multiple dialects or even character sets. For example, the written Chinese language has two standard character sets: traditional and simplified.

Simplified Chinese was instituted by the People’s Republic of China (PRC) in the 1950s and ’60s, building on work started decades earlier. The simplified characters were created by reducing the number of strokes required to write a character and by reducing the number of variant forms of a character.

After the implementation of simplified Chinese, the term traditional Chinese has been used to describe the previous character set. Modern Unicode supports both simplified and traditional character sets.

If you are supporting mainland China (PRC) and Singapore, you need to support simplified Chinese. If you are supporting Taiwan, Hong Kong, and Macau, you need to support traditional Chinese. Adoption of simplified Chinese has spread in recent years, and supporting it is needed to make your product salable.

English-speaking countries have their own dialects of the English language. Although the dialects are very close, each has terms unique to its culture. US and Australian English use the term truck for what is called a lorry in the UK. What an American calls a cell phone, Australians and Brits call a mobile phone.

Translate Sentences, Not Words

Although it's tempting to just translate words and reuse them in multiple places, that process usually doesn't work. Other languages can have rules that may not exist in your language. Although it may seem like an extra expense to translate the same text multiple times, it's better to treat each label in your app separately. You do get some level of reuse with translations: a set of resources that can be reused is called a translation memory. The text for the OK button will be the same every time you use it, for example. But you may find that the same label on two screens could get a different translation due to the length of the word or phrase. It may be short enough on one page, but too long on another page, requiring some human guidance to pick an alternate spelling or abbreviation.

For example, let’s say you have an upload button and then a dialog that displays a message after the upload has completed. In English, you have two resource strings:

"Upload"

"The upload is complete"

Although you could build the dialog message by reusing the "Upload" string, that approach can fail in other languages. The same strings in Spanish are the following:

"Cargar"

"La carga está completa"

Upload is used as a verb when it's the button label and as a noun in the dialog message. In Spanish, the verb (cargar) and the noun (carga) are different words, so the button text cannot be reused inside the message.
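The difference between reusing a word and translating the whole sentence can be sketched in a few lines. The resource keys (`UploadButton`, `UploadCompleteMessage`) and the dictionary used as a resource table are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Spanish resource table: one entry per complete UI string.
var spanish = new Dictionary<string, string>
{
    ["UploadButton"] = "Cargar",
    ["UploadCompleteMessage"] = "La carga está completa",
};

// Fragile: reuse the button label inside a sentence template.
// The verb form ends up where a noun is required.
string fragile = string.Format(
    "La {0} está completa",
    spanish["UploadButton"].ToLowerInvariant());

// Robust: look up the complete, separately translated sentence.
string robust = spanish["UploadCompleteMessage"];

Console.WriteLine(fragile); // La cargar está completa (grammatically wrong)
Console.WriteLine(robust);  // La carga está completa
```

The second lookup costs one more translated string, but the translator sees the full sentence and can pick the correct word form.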

Dealing with Grammatical Genders

Many languages have grammatical genders. In some, nouns are masculine, feminine, or neuter (Slavic languages, Latin, Greek). Some languages have just masculine and feminine genders (Romance languages). Others merge masculine and feminine into a common gender and keep a separate neuter gender. Still others do not use grammatical gender at all (English, Afrikaans).

Grammatical gender becomes important when sentences are constructed and the noun is determined at runtime, based on some condition. If you construct a sentence in pieces, based on some action or condition, you need to make sure that the grammar is correct for each language.

For example, suppose that you have an app that is connected to the user's car and can report the status of the components of the car. Let's assume that there is some code that reports if the car is running or if the lights are on.

In English, it is the following:

The lights are XXXX 

The engine is XXXX

In Spanish, it is the following:

Las luces están XXXX 

El motor está XXXX

In Spanish, the engine of a car has the masculine gender, whereas the lights are feminine. There is also a difference with plural words, but you'll dive into pluralization in the next section.

You can deal with the gender rules in two ways. The first way is to work with sentences and treat the entire sentence as a translatable resource string. The other way is to abbreviate the text so that the definite articles are not used.

For example, consider this text:

The doors are unlocked

You can leave off the definite article and display the text this way:

Doors are unlocked

Or you can display the text as a condition:

Doors: Unlocked.
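The first approach, treating each complete sentence as its own resource string, might look like the following sketch. The key names, the `StatusText` helper, and the Spanish sentences are illustrative assumptions; a translator should confirm the actual wording. Note that in Spanish the status word itself also changes with gender and number (encendidas versus encendido), which is another reason not to assemble the sentence from pieces.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Spanish resources: one full sentence per component and state.
var spanish = new Dictionary<string, string>
{
    ["LightsOn"]  = "Las luces están encendidas",
    ["LightsOff"] = "Las luces están apagadas",
    ["EngineOn"]  = "El motor está encendido",
    ["EngineOff"] = "El motor está apagado",
};

// Build the resource key from the component and its state, then look up
// the whole sentence instead of concatenating article, noun, and status.
string StatusText(Dictionary<string, string> table, string component, bool isOn)
    => table[component + (isOn ? "On" : "Off")];

Console.WriteLine(StatusText(spanish, "Lights", true));  // Las luces están encendidas
Console.WriteLine(StatusText(spanish, "Engine", false)); // El motor está apagado
```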

Pluralization

The way languages handle plural forms of nouns can vary widely. English has two forms: one of something and then everything else. Other languages have more complicated rules. Several Asian languages (Chinese, Korean, Japanese, Vietnamese) use a single form regardless of quantity. Slavic languages typically have three forms, and the rules that define the conditions for each form can vary within the Slavic family.

The simplest way to deal with the plural rules is to display the quantity as a condition:

Apples: 4

Avoid doing this, however. Your goal is to have an app that feels natural to the user.
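A more natural alternative is to pick the noun form from a plural category computed for each language. The sketch below is a simplified illustration for non-negative whole numbers only. The Russian rule is written from the integer rules published in the Unicode CLDR plural rules, and the helper names are hypothetical. In a real app, the translated forms would come from resource files rather than from code.

```csharp
using System;

// English: "one" for exactly 1, "other" for everything else.
string EnglishApples(int n) => n == 1 ? $"{n} apple" : $"{n} apples";

// Russian plural category for non-negative integers
// (assumed from the CLDR plural rules; fractions are not handled).
string RussianCategory(int n)
{
    int mod10 = n % 10;
    int mod100 = n % 100;
    if (mod10 == 1 && mod100 != 11) return "one";
    if (mod10 >= 2 && mod10 <= 4 && (mod100 < 12 || mod100 > 14)) return "few";
    return "many";
}

// Pick the noun form that matches the category.
string RussianApples(int n) => RussianCategory(n) switch
{
    "one" => $"{n} яблоко",
    "few" => $"{n} яблока",
    _     => $"{n} яблок",
};

foreach (int n in new[] { 1, 2, 5, 11, 21, 22 })
{
    Console.WriteLine($"{EnglishApples(n)} | {RussianApples(n)}");
}
```

The values 11, 21, and 22 show why a simple "is it 1?" test is not enough: 21 takes the same form as 1, whereas 11 does not.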

Right-to-Left Support

Supporting right-to-left (RTL) languages can be a bit tricky. Xamarin.Forms, as of version 2, does not support RTL layouts. If you need to support RTL languages such as Arabic, Hebrew, or Farsi, don’t use Xamarin.Forms—at least not for the RTL languages.

Xamarin.Android and Xamarin.iOS let you program using the native UI toolkits. Android has had full support for RTL layouts since Android 4.2. If you need RTL support, make 4.2 the minimum version. You can do RTL layouts in older versions, but it's much easier with version 4.2 and up.

To enable RTL support in Android layout files, you have to do the following:

  1. Declare in the app manifest file that this app supports RTL mirroring in the view layouts by adding android:supportsRtl="true" to the <application> element.
  2. Replace any layout properties that end in left or right with start and end. For example, paddingLeft would become paddingStart. If you need to support Android versions prior to 4.2, you would have both the left/right and the start/end properties.

In iOS, the mirroring of text layout for RTL languages should be handled transparently. This functionality was added in iOS 9. Unless you are using custom controls, you shouldn't have to do anything extra to support the RTL languages.

Windows uses a property named FlowDirection to set RTL mirroring. It is set by the current culture of the device; as a developer, you shouldn't have to do anything extra to provide RTL support. If you are using images that have any form of directional bearing, you have to validate those images to make sure that they are still correct on an RTL view. If you have created your own dialogs, you have to verify that the default button placement is correct for an RTL view.
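If your own code needs to know whether a culture reads right to left (for example, to decide whether a directional image should be mirrored), .NET exposes that through the culture's TextInfo. The following is a minimal sketch; the `IsRtl` helper is a hypothetical name, and how you apply the result to a particular UI toolkit is left out. It assumes a runtime with full globalization data.

```csharp
using System;
using System.Globalization;

// Returns true when the culture's writing system runs right to left.
bool IsRtl(string cultureName)
    => new CultureInfo(cultureName).TextInfo.IsRightToLeft;

foreach (var name in new[] { "ar-SA", "he-IL", "fa-IR", "en-US" })
{
    string direction = IsRtl(name) ? "right-to-left" : "left-to-right";
    Console.WriteLine($"{name}: {direction}");
}
```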

Layout Considerations

When designing a view, try to place the labels above the text or value fields. When the view is rendered as RTL, the controls will still be correctly placed. Doing so also avoids the potential problem of having translated text being much longer than default text.

With Xamarin.Forms, the StackLayout control makes it very easy to place the controls in a vertical list. If you are using Xamarin.iOS, the vertical UIStackView accomplishes the same task. On Xamarin.Android, the LinearLayout with the orientation set to vertical allows you to group elements from top to bottom. For Universal Windows Platform (UWP), the StackPanel layout control flows the controls from top to bottom by default.

Context Is King

When sending text out to be translated, the context is very important. A word or phrase can have multiple meanings, and there may be only a single correct interpretation, based on the usage. This is what separates a professional translation job from a machine translation job.

In US English, the word trunk has several meanings; one of them refers to the storage in the back of a car. In UK English, the word boot also has multiple meanings; one of them refers to the rear car storage area.

To provide context, you can provide descriptions of the text to be translated. Screenshots of the application running in the default language can also be useful.

A word or term can be short in length in one language and much longer in another language. A skilled translator can suggest an alternative term or an abbreviation that would provide a better fit with the screen layout.

Dates and Time

Date formatting is always culture specific. Although most countries follow the Gregorian calendar, the order of the date fields and the characters used to separate them can vary wildly.

Getting the right date format for display and data entry is very important. When the day part of the date is 12 or less, you can’t tell whether the date is in the day/month/year format or the month/day/year format by just looking at the date.
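A short sketch shows how the same text turns into two different dates depending on the culture used to parse it. The sample string is made up for illustration.

```csharp
using System;
using System.Globalization;

string text = "03/04/2016";

// en-US reads the fields as month/day/year.
DateTime us = DateTime.Parse(text, new CultureInfo("en-US"));

// en-GB reads the fields as day/month/year.
DateTime gb = DateTime.Parse(text, new CultureInfo("en-GB"));

Console.WriteLine(us.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)); // 2016-03-04
Console.WriteLine(gb.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)); // 2016-04-03
```

Neither parse fails, which is what makes this kind of bug easy to miss.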

If you need to convert a date value to a string to send to a service, your best bet is to use the ISO 8601 format. The ISO 8601 standard defines a date as YYYY-MM-DD. If the date is October 1st, 2016, it is represented as 2016-10-01 in an ISO 8601 string.

The ISO 8601 standard for time is hh:mm:ss, where hh is the number of hours since midnight (00-23), mm is the number of minutes (00-59), and ss is the number of seconds (00-60).

NOTE: Seconds can go up to 60 to account for an inserted leap second. Every few years, an extra second is added to the Coordinated Universal Time (UTC) scale to keep it in sync with the rotation of the Earth.

Values containing both time and date just combine the two formats as YYYY-MM-DDThh:mm:ss. Case matters: MM and mm have two different meanings. The former is the number of the month; the latter is the number of minutes.
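In .NET, one way to produce and read back these strings is with custom format strings and the invariant culture, sketched below. One detail differs from the ISO notation: in .NET custom format strings, HH is the 24-hour clock and hh is the 12-hour clock, so HH is the specifier to use here. The sample date and time are made up for illustration.

```csharp
using System;
using System.Globalization;

var dt = new DateTime(2016, 10, 1, 14, 30, 0);

// Date only: YYYY-MM-DD.
string dateOnly = dt.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);

// Date and time: the quoted 'T' is emitted as a literal character.
const string isoFormat = "yyyy-MM-dd'T'HH:mm:ss";
string dateTime = dt.ToString(isoFormat, CultureInfo.InvariantCulture);

// Round trip: parse the string back using the same format.
DateTime parsed = DateTime.ParseExact(dateTime, isoFormat, CultureInfo.InvariantCulture);

Console.WriteLine(dateOnly);     // 2016-10-01
Console.WriteLine(dateTime);     // 2016-10-01T14:30:00
Console.WriteLine(parsed == dt); // True
```

Passing the invariant culture explicitly keeps the output independent of the device's current culture.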

The .NET Framework has standard patterns for formatting date and time strings. The "d" format string formats the date using the ShortDatePattern. When you use the date and time format strings, the .NET Framework returns string values using the correct field order and date and/or time separators. Let's take a look at some code and see how the date formatting changes for different cultures:

using System;
using System.Globalization;

var dt = new DateTime(2016, 10, 2);

// US English
System.Threading.Thread.CurrentThread.CurrentCulture = new CultureInfo("en-US");
Console.WriteLine("English (US)");
Console.WriteLine(dt.ToString("D")); // long date
Console.WriteLine(dt.ToString("d")); // short date
Console.WriteLine(dt.ToString("s")); // sortable (ISO 8601)

// Brazilian Portuguese
System.Threading.Thread.CurrentThread.CurrentCulture = new CultureInfo("pt-BR");
Console.WriteLine("\nPortuguese (Brazil)");
Console.WriteLine(dt.ToString("D"));
Console.WriteLine(dt.ToString("d"));
Console.WriteLine(dt.ToString("s"));

// German as used in Germany (the culture name is de-DE)
System.Threading.Thread.CurrentThread.CurrentCulture = new CultureInfo("de-DE");
Console.WriteLine("\nGerman (Germany)");
Console.WriteLine(dt.ToString("D"));
Console.WriteLine(dt.ToString("d"));
Console.WriteLine(dt.ToString("s"));

When you run that code, you get the following output:

English (US)
Sunday, October 2, 2016
10/2/2016
2016-10-02T00:00:00

Portuguese (Brazil)
domingo, 2 de outubro de 2016
02/10/2016
2016-10-02T00:00:00

German (Germany)
Sonntag, 2. Oktober 2016
02.10.2016
2016-10-02T00:00:00

The Long Date ("D") format shows that punctuation, spelling, and case will change based on the culture. The Short Date ("d") format shows how the ordering of the date fields and the date field separator changes. The Sortable ("s") format string follows the ISO 8601 standard, and the result is the same for each culture.

When working with multiple time zones or dealing with multiple calendars, consider using the Noda Time library. It has code for converting from one time zone to another and for converting dates from the default calendar to other calendars, such as the Hebrew, Islamic, and Coptic calendars.

About the author

Chris Miller is a Senior R&D Engineer for Tyler Technologies. He is a Microsoft MVP for .NET and is active in the community. Chris is a Microsoft Certified Professional and a Xamarin Certified Mobile Developer. See more from Chris at rajapet.com.

Want more? This article is excerpted from Cross-platform Localization for Native Mobile Apps with Xamarin by Chris Miller. Get your copy today for much more on translating and localizing apps. ISBN 978-1-4842-2465-6.