Localization, Localisation

Practical and concise answers to common questions in G11N, I18N and L10N


Crowdsourcing in Localisation: Next Step or Major Faux Pas?

Posted by Nick Peris on June 23, 2009

Crowdsourcing: Together Everyone Achieves More

As the Information Technology industry continues to evolve, so does the Localisation industry. The evolution of the latter, often a reaction to the former, is always a response to a specific need, supported by advances in technology, in processes, or in both.

Crowdsourcing, far from being only a buzzword, is a tangible trend born of the so-called Web 2.0 era. It has shown signs of spilling over into Localisation for some time, and the first stages of this process have been somewhat less than successful. While user-generated content, web-based applications and social networking products and websites are flourishing, crowdsourcing seems to yield controversy consistently.

So what makes Web 2.0 hip and Crowdsourcing, especially in Localisation, decidedly uncool? It is partly the age-old debate on whether the internet should be used for mercantile purposes. But it is also the very nature of Localisation and our struggle to get recognized as an integral part of the Product Development Life Cycle. We are, despite our best efforts, still seen as an unfortunate cost which gets in the way of product-to-market efforts.

Some definitions

Web 2.0 was once an empty buzzword for whatever comes next. “C’est tout simplement l’internet d’aujourd’hui (…) celui que vous et moi utilisons tous les jours” (“It is quite simply today’s internet (…) the one that you and I use every day”), said a member of the French parliament early this year (2009!), only weeks before he was expected to become State Secretary for the Digital Economy! Also used and abused as a fresh marketing slogan, Web 2.0 seems to have now gained respectability as a description of the combination of Rich Internet Applications (RIAs) and user-generated content. Importantly, ideas reminiscent of the Open Internet ethos and a stronger sense of community also feature in most definitions of Web 2.0.

Crowdsourcing describes the act of outsourcing a task to an undefined, generally large group of people. It also carries the idea of bypassing the professionals in favor of a strength-in-numbers effort.

Localisation 2.0 is a newer concept yet, partly championed by one specific LSP, which attempts to describe current trends in Localisation tools and processes, designed to respond to the exponential rate at which localisable content is generated in the Web 2.0 paradigm.

Wikipedia: a success story

The free encyclopedia that anyone can edit was created to “distribute a free encyclopedia of the highest possible quality to every single person on the planet in their own language”. Launched in January 2001 by Jimmy Wales and Larry Sanger, it has 265 localised editions with a total of over 13 million articles.

The recipe is simple: Wikipedia is a non-profit site with no advertising, where users can publish their own articles and add to or correct existing ones. Articles often differ from one language to the next, so Wikipedia is a true example of an internationalised rather than just translated website. For example, the article about Wikipedia contains a statistics table by language in its French version which does not appear in the English version.

The model of Wikipedia creates a community with a feeling of shared ownership and allows it to get the most out of its user base without ever appearing to exploit anyone. This flavor of user-generated content, of which Wikipedia is only one example, should probably not be called crowdsourcing at all, although I put it to you that it may be the only viable way to use a “crowd” as a resource: for its own interest!

Online Translators: the first signs of trouble

Most everyone uses an online dictionary. Everyone who uses a bilingual online dictionary thinks they’re great. Once you double-check your results, and are familiar enough with the languages to navigate your way through synonyms, grammatical rules, etc., they do the job. From that point of view, they are no different from their paper ancestors. Just a little more… portable.

But already a line was crossed with online translators: they created the illusion that linguistic skills are no longer required. They created the possibility for non-linguists to type a sentence in their source language and output a “translation”. While this may well be useful to a qualified translator as a reference, it should not ever be used to replace a translator.

An esteemed colleague of mine, well versed in internet searches and other smart ways to get what he wants, recently contacted me to translate “Plastical Surgery at Home” into French (I never asked why and never will…). Out of simple curiosity, I typed it into an online translator and received the suggestion “Plastical Chirurgie à l’Accueil”. This not only differed greatly from the translation I was about to suggest, it also gave me a good example of why it just doesn’t work. Because of a small error in the source text, the online translator reverted to guessing a word-by-word translation and used “Accueil”, which is an IT translation for “Home”. The suggested target translation really means that someone is offering to surgically alter your appearance behind the receptionist’s desk. Not very inviting.
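For the curious, the failure mode described above can be imitated in a few lines of Python. This is only a toy sketch of a word-by-word fallback, not how any real online translator works: the `TOY_GLOSSARY` table and the `word_by_word` function are invented for illustration.

```python
# Hypothetical toy glossary (an assumption for illustration only).
# "home" is mapped to its IT sense, as in a website's home page.
TOY_GLOSSARY = {
    "surgery": "Chirurgie",
    "at": "à",
    "home": "l'Accueil",
}


def word_by_word(sentence, glossary):
    """Translate each word on its own; unknown words are copied through unchanged."""
    translated = []
    for word in sentence.split():
        # No context, no grammar: just one lookup per word.
        translated.append(glossary.get(word.lower(), word))
    return " ".join(translated)


result = word_by_word("Plastical Surgery at Home", TOY_GLOSSARY)
print(result)
```

The misspelt “Plastical” is not in the table, so it passes through untouched, and “Home” receives whichever sense happens to be stored, with no regard for context.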

Every time I ask someone “Which Translation Memory system do you use?” and they reply “Google Translate” or “Babel Fish”, etc., it gives me the shivers!

Facebook: crossing the Rubicon

Facebook has been at the center of one or two controversies of late, and its localisation strategy could easily have become another. Whether it was pushed out of the spotlight by other issues, such as Facebook’s Terms of Use changes, or whether it was simply a smart and creative move, remains debatable.

Facebook is available in 63 languages, which is considerably more than its main competitor MySpace. Upcoming languages are expected to be Persian, Arabic, Hebrew, Syriac, Urdu, Yiddish and Divehi. It seems clear that the collaborative, volunteer effort behind this did allow faster localisation and opened it up to an array of languages which most likely would not have been deemed economically viable to localise the traditional way. And this is an important point: one of the challenges in localising Web 2.0 is keeping up with the exponentially increasing rate of content creation and the growing expectation for localised products. With the number of languages spoken in the world estimated in the thousands, how could anyone pretend to have a Global strategy and only localise their product into FIGS or even L17?

The methodology employed by Facebook also seems to hold some ground. A web-based application (Facebook Translations) is provided, and a staged plan is rolled out, beginning with Glossary Translation, continuing with Strings Translation and including post-release Error Reporting and New Features Translation. Community votes decide between alternative translations and consistency checks are run. This doesn’t sound all that unprofessional.
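To make the voting and consistency steps a little more concrete, here is a minimal Python sketch. It is not Facebook’s actual implementation, which is not described in any detail here; the function names, the sample strings and the one-entry glossary are all assumptions made up for illustration.

```python
from collections import Counter


def pick_winners(votes):
    """votes: iterable of (source_string, candidate_translation) pairs, one per vote.

    Returns the most-voted candidate for each source string.
    """
    tallies = {}
    for source, candidate in votes:
        tallies.setdefault(source, Counter())[candidate] += 1
    # most_common(1) gives the top candidate; ties go to the one seen first.
    return {src: counts.most_common(1)[0][0] for src, counts in tallies.items()}


def consistency_issues(winners, glossary):
    """Flag winners whose source uses a glossary term but whose
    translation does not contain the agreed target term."""
    issues = []
    for source, translation in winners.items():
        for term, target in glossary.items():
            if term.lower() in source.lower() and target.lower() not in translation.lower():
                issues.append((source, term, target))
    return issues


# Hypothetical sample data: three votes on one string, one vote on another.
votes = [
    ("Add a friend", "Ajouter un ami"),
    ("Add a friend", "Ajouter un ami"),
    ("Add a friend", "Ajouter un copain"),
    ("Find your friend", "Trouver votre copain"),
]
glossary = {"friend": "ami"}  # term agreed at the Glossary Translation stage

winners = pick_winners(votes)
issues = consistency_issues(winners, glossary)
print(winners)
print(issues)
```

The check is deliberately naive (plain substring matching), but it shows why translating the glossary first matters: the vote alone would happily accept “copain” for the second string, and only the glossary flags it as inconsistent.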

But the fact remains: having asked their users to translate the Facebook UI for free, Facebook are now deriving new users, and therefore new advertising revenue, from work which was donated not to them but to the Facebook community.

LinkedIn: crossing the line

Attempting to emulate projects such as Facebook’s, it would seem LinkedIn have managed to create a pretty big stir before they even got started. By all accounts, it appears that a survey was circulated to LinkedIn members who are translators, and offended most of them by the wording of its enquiries regarding alternate compensation for translation work.

The survey has now been closed but some results have been published by Nico Posner, the project manager responsible for LinkedIn’s internationalization efforts. The fact of the matter is that thousands of responses came through, and only a minority selected the category Other, which was the only outlet for translators who considered that the only suitable compensation was direct remuneration.

So what does that tell us? The professional translator community is not amused, and this survey is not a PR stunt LinkedIn will be looking to duplicate. However, even through the controversy, and the claims of bias in the way questions were asked, there is still a substantial interest in collaborative, volunteer efforts in the linguistic community. The question now is how to liberate this potential in an ethically acceptable fashion.

Google Translator Toolkit

The Google Translator Toolkit is a newcomer (actually still at beta stage). It is a free, web-based translation application which uses Machine Translation and includes TM (.tmx) and Terminology (.csv) management tools. In their own words, it is an attempt to bring the human touch back into Machine Translation.

So does it work? This tool appears to take the Facebook model one step further in the right direction: it is not designed to help translate Google for free. It is designed to help amateur and professional translators alike to collaborate, share resources, and use a TM and Terminology enabled tool for free.

While it is not comparable to any powerful native CAT tools, it does offer a viable solution: the TM sharing potential is huge, the built-in collaborative tools are the right idea, and the limited file format compatibility remains functional (extract to TMX, create Terminology Databases without expensive tools etc.).

But there is always a catch: in this case, the fact that Machine Translation remains Machine Translation. The screencap included here shows the raw output from English into French of one of our articles. It wouldn’t take a French translator long to recognize the tortured prose which time and time again comes out of such systems. If quality rather than quantity is a concern in a translation job, and if the content to translate is in any way wordy, I find it hard to believe that a translator would do better work correcting such output than they would translating in a TM + Terminology environment!

As a parting note, I will not provide any pearl of wisdom. First, because the wheels are still in motion and we’ll only fully understand what is happening to the Localisation industry once it has happened. Second, because I would like to end by inviting you to translate this article into a language of your choice, email it to LocalizationLocalisation@gmail.com and include an SAE if you would like to receive a limited edition Localization, Localisation pen.

Posted in Crowdsourcing, Globalization