It is important to understand that Europe is not a melting pot like the U.S., where people of different cultures and religions blended together far from their native lands. Europe is politically and culturally diverse: each country has its own history, religion, national language, culture and heritage, to which it remains strongly attached. Europe consists of 520 million inhabitants in 34 countries, within which 32 languages are spoken and 23 different currencies used.
This diversity represents a real challenge for companies wanting to penetrate the European market, as they will need to localize their products for the specific needs of each country. As the former German Chancellor Willy Brandt once said: “If I’m selling to you, I speak your language. If I’m buying, dann müssen Sie Deutsch sprechen.”
However, the challenges are not only linguistic, but also cultural.
Technical Challenges Linked to Linguistics
- Text string expansion: It is not uncommon for short texts, such as the titles of software commands, to be three times as long in German as they are in English. For example, the English word “Redo” translated to German is “Wiederherstellen,” which grows from 4 characters to 16. That is four times the original length, or a 300 percent increase.
- Character sets and encodings: Legacy character encoding schemes are limited in size, and do not always cover all the characters of a specific language. For instance, encodings belonging to the ISO-8859 family represent each character with one byte, and are thus limited to 256 characters. In this family, Latin 1 covers Western European languages with characters such as é, ü and ñ, while Latin 2 covers Central and Eastern European languages with characters such as ś, ů, ť and č. The Unicode standard covers well over 100,000 characters and most scripts (writing systems) in use today. Multilanguage software should thus be adapted to Unicode to support as many languages as possible. The necessary adaptation process is by no means trivial, as not only does the software need to be adapted, so do all related legacy systems that store data.
- Keyboard character layout: Keyboard layouts vary from one country to the next, as they have been customized to the characters and symbols mostly used in a given language. Keyboard input may not be appropriate if the product is not internationalized and does not recognize other keyboards.
- Keyboard shortcuts: These will also need to be localized to make sense for a particular language. [Ctrl] O makes sense in English for “Open,” but not for its Portuguese equivalent “Abrir.”
- Alphabetical sorting order: Sorting rules for extended characters differ from language to language. In Polish, extended characters are collated after their non-extended counterparts (A, Ą, B, C, Ć, D, E, Ę, …), while in Swedish, they are placed at the end (…, X, Y, Z, Å, Ä, Ö). In Hungarian, the alphabet includes both extended characters and consonants written with two or three letters that count as a single unit. The alphabetic order thus looks something like: A, Á, B, C, CS, D, DZ, DZS, E, É, F… It goes without saying that when sorting lists of items in the graphical user interface (GUI) or in your documentation’s indexes, you need to pay close attention and have all the rules at hand.
- Abbreviations: Abbreviations are often used to save space in English documentation, but they don’t always translate into other languages, or worse, may sound like an offensive word!
- Text and audio concatenation or placeholders: Concatenation occurs when a sentence is composed of different segments of text. For instance: “do not,” “click on” and “print” could be composed as “click on print” or “do not click on print.” Well-intentioned programmers use concatenation to save space. While it makes perfect sense in their native language, it may not work for other languages which are structured differently. German, for instance, requires the verb to be at the end of the sentence. If composed in the same way, a German sentence will thus be grammatically incorrect: “Nicht klicken auf Drucken” instead of “Klicken Sie nicht auf Drucken.” These localization issues require re-engineering and are time-consuming.
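The word-order problem can be sketched in a few lines of Python. The `FRAGMENTS` and `SENTENCES` catalogs and both function names are hypothetical, invented here for illustration; the German strings are the ones quoted above. The sketch contrasts gluing translated fragments together with storing one complete sentence per language.

```python
# Hypothetical fragment catalog: each piece is translated on its own.
FRAGMENTS = {
    "en": {"do_not": "Do not", "click_on": "click on", "print": "Print"},
    "de": {"do_not": "Nicht", "click_on": "klicken auf", "print": "Drucken"},
}

# Hypothetical sentence catalog: one complete sentence per message, with a
# named placeholder that the translator is free to move within the sentence.
SENTENCES = {
    "en": {"do_not_click": "Do not click on {command}."},
    "de": {"do_not_click": "Klicken Sie nicht auf {command}."},
}


def concatenated(lang: str) -> str:
    """Glue the fragments together in English word order (the fragile way)."""
    f = FRAGMENTS[lang]
    return " ".join([f["do_not"], f["click_on"], f["print"]])


def whole_sentence(lang: str) -> str:
    """Fill a complete, separately translated sentence template."""
    command = FRAGMENTS[lang]["print"]
    return SENTENCES[lang]["do_not_click"].format(command=command)


if __name__ == "__main__":
    for lang in ("en", "de"):
        print(lang, "|", concatenated(lang), "|", whole_sentence(lang))
```

With whole sentences, the translator controls the word order, so no re-engineering is needed when a new language places the verb elsewhere.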
Similar issues occur with placeholders. If we take “%d red,” “flag” and “flags,” where %d is the placeholder for the number, it will be impossible to obtain a correct sentence in Polish, as the word “red” changes based on both the number and the gender. For illustration, we selected the Polish words for flag (flaga, feminine), armchair (fotel, masculine) and lake (jezioro, neuter).
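One way to handle this is to store a complete phrase for each plural category of each noun, instead of assembling “%d red” with a singular or plural noun. The following is a minimal sketch under stated assumptions: the function names and the `PHRASES` table are hypothetical, the Polish word forms are supplied here for illustration and do not come from the original article, and the category rule follows the one/few/many split commonly used for Polish integers.

```python
def polish_plural_category(n: int) -> str:
    """Return the plural category for a non-negative integer in Polish."""
    if n == 1:
        return "one"
    # 2-4, 22-24, 32-34, ... but not 12-14
    if n % 10 in (2, 3, 4) and n % 100 not in (12, 13, 14):
        return "few"
    return "many"


# Illustrative forms (an assumption of this sketch): the adjective "red"
# changes with both the count and the gender of the noun.
PHRASES = {
    "flaga": {    # feminine
        "one": "{n} czerwona flaga",
        "few": "{n} czerwone flagi",
        "many": "{n} czerwonych flag",
    },
    "fotel": {    # masculine
        "one": "{n} czerwony fotel",
        "few": "{n} czerwone fotele",
        "many": "{n} czerwonych foteli",
    },
    "jezioro": {  # neuter
        "one": "{n} czerwone jezioro",
        "few": "{n} czerwone jeziora",
        "many": "{n} czerwonych jezior",
    },
}


def red_things(noun: str, n: int) -> str:
    """Pick the full phrase for this noun and count, then fill in the number."""
    return PHRASES[noun][polish_plural_category(n)].format(n=n)


if __name__ == "__main__":
    for noun in PHRASES:
        for n in (1, 2, 5, 12, 22):
            print(red_things(noun, n))
```

The point of the design is that nothing is concatenated: each phrase is translated as a unit, and the code only selects which one to show.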
Technical Challenges Linked to Cultural Differences
- Numeric formats: 10,000 means the number 10 for a French person, while for an American it means 10 thousand! 100,000.99 does not mean anything in France, where decimals are separated by a comma “,” and not a dot “.” as in the U.S. In France, this number should be written 100 000,99.
- Currency: In much of continental Europe, the currency symbol is placed after the number and not before. For example, €25,000.00 is written 25 000,00 €. Currency needs to be localized to its local equivalent to be meaningful for the end user.
- Payment methods: E-commerce solutions need to account for local payment preferences. In Germany, people tend to prefer wire transfer payments to online Visa payments. In Slovakia, payments by check do not exist! In other countries, people may simply not have credit cards or know PayPal.
- Local address and telephone formats: Local address input forms have to be adapted to the country’s address format. The same goes for phone numbers, either for input or when published in the content. Locals should understand address forms straight away without having to guess whether a country code or an area code has to be included for it to work.
- Name formats: Hungarians expect to see their family name first and their first name second. A Hungarian called Zoltan (first name) Kiss (family name) would thus expect to see his name displayed as: Kiss Zoltan. However, in Germany, the Netherlands or the U.S., the first name is displayed first: Christian Meier.
- Date: 03/05/01 does not have the same meaning for all of us. It could be referring to the 3rd of May 2001, or the 5th of March 2001, or the 1st of May 2003 — and does the 01 refer to 2001 in the first two cases? In the U.S., the date format is mm/dd/yy, whereas in Europe, dd/mm/yy tends to be used. However, in Sweden, the year is placed at the start!
- Time format: Use of a.m. and p.m. instead of a 24-hour clock needs to be localized for Europe. Only a minority of Europeans will understand 8 p.m.
- Metric system: Only three countries have not officially adopted the metric system, and among them you find the U.S. Also, when speaking to British or Irish citizens, you will make far more sense if you talk about miles and pounds rather than kilometers and kilograms.
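The numeric and date conventions in the list above can be sketched in Python. The two lookup tables and the function names are hypothetical and hard-coded for illustration only; production code would normally rely on a locale database rather than hand-written rules. The Swedish pattern simply reuses the slash-separated example string from the text to show the year-first ordering.

```python
from datetime import date, datetime

# Hypothetical per-locale separators (assumption: a plain space is used for
# French digit grouping, as in the example above).
NUMBER_STYLE = {
    "en_US": {"group": ",", "decimal": "."},
    "fr_FR": {"group": " ", "decimal": ","},
}

# Hypothetical per-locale field orders for a two-digit-year, slash-separated date.
DATE_PATTERNS = {
    "en_US": "%m/%d/%y",  # month first
    "fr_FR": "%d/%m/%y",  # day first
    "sv_SE": "%y/%m/%d",  # year first
}


def format_number(value: float, locale_code: str) -> str:
    """Format a number with two decimals using the locale's separators."""
    style = NUMBER_STYLE[locale_code]
    integer_part, fraction = f"{value:,.2f}".split(".")
    return integer_part.replace(",", style["group"]) + style["decimal"] + fraction


def parse_date(text: str, locale_code: str) -> date:
    """Interpret the same digits according to the locale's field order."""
    return datetime.strptime(text, DATE_PATTERNS[locale_code]).date()


if __name__ == "__main__":
    for code in NUMBER_STYLE:
        print(code, format_number(100000.99, code))
    for code in DATE_PATTERNS:
        print(code, parse_date("03/05/01", code).isoformat())
```

The same string “03/05/01” yields three different calendar dates depending on the table entry, which is why dates should be stored in an unambiguous form and formatted only at display time.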
These illustrations show that the translation process, which entails communicating the meaning of words or sentences, only represents a subset of what we refer to as localization.
In localization, there are ways to ensure that these issues are addressed. Terminology needs to be correct and consistent. Source files have different formats, and translations should ideally be reused from one project to another and from one version to another. A great many tasks and people are necessary to accomplish all of this.
To ease the localization cycle and enhance the quality, different tools are available on the market. These tools range from terminology management systems to pure translation memory tools to comprehensive workflow management systems.
The impartial nonprofit Globalization and Localization Association maintains a Language Technology and Services Directory of professional firms, many of which focus on helping customers localize for European markets.
Janaina Wittner is Strategic Development Manager of WhP, a privately owned, international localization company. WhP is a member of the Globalization and Localization Association (GALA), a nonprofit, international industry association for the translation, internationalization, localization and globalization industry.