Blog Archives

WtD Prague: Localisation of open source docs

This week I’m attending Write the Docs Prague. It’s super exciting to attend a European Write the Docs conference, and to be visiting the lovely city of Prague. This post contains my notes from a talk at the conference. All credit goes to the presenter; any mistakes are my own.

Zachary Sarah Corleissen’s talk was titled “Found in Translation: Lessons from a Year of Open Source Localization”.

[From Sarah Maddox, author of this blog: Localisation is the process of translating content to different languages, and of adapting the content to take regional idioms and linguistic customs into account.]

Zach’s experience comes from localising the Kubernetes docs.

Advantages of localisation

Zach discussed the advantages of localising an open source project. Localisation opens doors to a wider audience. It’s a tool to drive adoption of a product. Localisation also offers the opportunity for more people to contribute new features to a product. It therefore distributes power within the open source project.

When the Kubernetes docs managers considered localising the docs, they made some assumptions that later proved to be unfounded. For instance, they thought the localisation contributors would contribute only to their own language. That proved not to be the case. Localisation contributors update the English source as well as their own language source, and they also help other localisation teams get started. For example, the French teams help other teams get started with localisation infrastructure, and groups of related languages get together to define grammatical structures for technology-specific terms, such as “le Pod”. Thus the localisation contributors embody the best of open source contributions.

Localised pages increase the number of page views, which is a good thing for a doc set. Zach showed us some stats from Google Analytics with some impressive numbers. Each language added around 1% to the page views, which represents a big number in a doc set as large as Kubernetes.

Zach said we should also consider the support ratio that the localised docs provide. For example, there are 8 localisation contributors for the Korean docs, catering for 55,187 readers. So, 8 : 55,187 is a ratio of roughly 1 : 6,900.
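
To make the arithmetic explicit, here is a small sketch of the support ratio calculation. The function name `support_ratio` is my own invention for illustration; the two numbers are the ones quoted in the talk.

```python
def support_ratio(contributors: int, readers: int) -> float:
    """Return the number of readers served per localisation contributor."""
    if contributors <= 0:
        raise ValueError("contributors must be a positive number")
    return readers / contributors


# Figures quoted in the talk for the Korean docs.
ratio = support_ratio(8, 55_187)

# The exact value is about 6,898 readers per contributor,
# which rounds to the 1 : 6,900 mentioned above.
print(f"1 : {ratio:,.0f}")
```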


These are some of the nuggets of advice Zach shared:

  • Let each of the local teams determine for themselves how they create the localised content. That fits in best with open source philosophy, and the local teams know their local customs best.
  • The Kubernetes project does require that the localisation teams adhere to the Kubernetes code of conduct, and that the code of conduct is one of the first docs translated.
  • Bottlenecks include infrastructure, permissions, and filtering by language. You need to put solutions in place to manage these bottlenecks.
  • Trust is essential to collaboration.
  • To make a high level of mutual trust possible, make sure the boundaries are clear, and be careful with the permissions that you assign in the repository.
  • Choose a site generator that has strong multi-language support. A good one is Hugo. Jekyll makes things very difficult.
  • Filter the issues and pull requests by language. Zach doesn’t know of any good tools for this filtering. (If you know of any, shoot him a tweet.) Zach mentioned some possibilities from the Kubernetes world: Prow is a possibility but it’s a heavyweight tool for just localisation. Another option is zparnold’s language labeler.
  • Use version control to review and approve by language. Require review and approval from a user in a different company from the submitter of the pull request.
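
To illustrate the idea of filtering pull requests by language, here is a minimal sketch that derives language labels from the paths of the files changed in a pull request. This is not how Prow or zparnold’s language labeler work; it is just one way the idea could look. The directory layout (`content/<language code>/...`), the label format (`language/<code>`) and the function name are all assumptions of mine.

```python
from pathlib import PurePosixPath


def language_labels(changed_files, content_root="content"):
    """Return sorted labels for the languages touched by a set of changed files.

    Assumes a hypothetical repository layout in which each localisation
    lives under <content_root>/<language code>/.
    """
    labels = set()
    for name in changed_files:
        parts = PurePosixPath(name).parts
        # Expect at least: content root, language code, file name.
        if len(parts) >= 3 and parts[0] == content_root:
            labels.add(f"language/{parts[1]}")
    return sorted(labels)


# Example: a pull request touching Korean and French pages plus a README.
print(language_labels([
    "content/ko/docs/home.md",
    "content/fr/_index.md",
    "README.md",
]))
```

With labels like these in place, reviewers for each language can filter the pull request list down to the changes that concern them.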

Some cautionary tales:

  • Look out for raw machine-generated content.
  • Make sure the translators are not being exploited as free labour. Even if you’re not directly engaging the translators, take steps to ensure that the content is produced ethically.


I learned a lot from this session. It was especially relevant as we’re starting to consider localisation of the Kubeflow docs which I work on. Thank you Zach for a very informative session.

ASTC-NSW day 2: Preparing your documentation for translation

This weekend I attended the ASTC-NSW 2010 conference in Sydney. These are the notes that I took during a session by Sarah Forget. All the credit for the content and ideas goes to Sarah. Any mistakes are mine.

Sarah Forget presented a session on preparing your documentation for translation, titled “How to write for translation: New challenges for the writer”.

Introducing the topic

In the course of her work, Sarah has seen many mistranslations. Most of those were caused by ambiguous English structures.

The non-functional literacy rate in Australia is 46%.

You need to know who the translators are, as an audience. This helps you to know how to write for them.

Localisation is more than just translation. It includes customising the message for the target language and culture, including:

  • Formatting
  • Regional settings
  • Layout

As well as being well written, your content must be well structured, to optimise the translation process.

Getting to know how the translators work

Translators use CAT tools (computer-assisted translation tools), which include the following functionality:

  • Translation memory – Stores strings of text in the source and the target language. This helps the translator to keep consistency across their translations.
  • Terminology manager – Stores concepts (“terminological values”) in two or more languages, including synonyms, antonyms, definitions, and so on. Example of such a tool: MultiTerm.

Sarah showed us a video of a translator’s desktop with the software in action while the translator was working. It showed how the translation memory prompts the translator with existing translations for each phrase, or with translations that are similar but not exact.
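
For readers who haven’t seen a translation memory in action, here is a toy sketch of the exact and “similar but not exact” matching described above. It is not how any real CAT tool is implemented. The class name, the similarity cutoff and the sample French translation are my own assumptions, and the fuzzy matching uses Python’s standard `difflib` module.

```python
import difflib


class TranslationMemory:
    """A toy translation memory: stores source segments with their translations."""

    def __init__(self):
        self._segments = {}

    def add(self, source, target):
        """Remember the translation of one source segment."""
        self._segments[source] = target

    def lookup(self, source, cutoff=0.75):
        """Return (kind, stored_source, translation), or None if nothing is close.

        kind is "exact" for an identical segment and "fuzzy" for a similar one.
        The cutoff of 0.75 is an arbitrary choice for this sketch.
        """
        if source in self._segments:
            return ("exact", source, self._segments[source])
        close = difflib.get_close_matches(
            source, list(self._segments), n=1, cutoff=cutoff
        )
        if close:
            return ("fuzzy", close[0], self._segments[close[0]])
        return None


memory = TranslationMemory()
memory.add("Click the Save button.", "Cliquez sur le bouton Enregistrer.")

print(memory.lookup("Click the Save button."))      # an exact match
print(memory.lookup("Click the Save button now."))  # a fuzzy match
print(memory.lookup("Delete the file."))            # no match
```

The fuzzy case is where consistent wording in the source pays off: the closer a new sentence is to one already translated, the more the translator can reuse.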

The formatting and layout of the source document affects the processing done by the translation memory tool. For example, tabs can be mistaken for an end of paragraph.

If you use consistent terminology and Plain English, the terminology manager (MultiTerm) can find the match easily. As an aside, MultiTerm also educates you about what Plain English is.

Once the translator has translated a term, they can then choose to add it to the terminology manager.

The translator can save the document as a bilingual document, containing both languages so that they can continue working on it or make corrections after review. When the work is complete, they save the document in the target language.

Tips from Sarah

These are just some of the tips Sarah gave us:

1) Assume that your text will increase in size by approximately 30% after translation. To manage this expansion, add extra space in the original document, such as after paragraphs or in tables.

2) Avoid manual formatting.

  • Use the templates, styles and automated capabilities provided by your authoring software.
  • You don’t even need to include the table of contents for translation, if it’s automatically generated.
  • Don’t add spaces, line breaks and so on. Manual line breaks, tabs and spaces will limit the efficiency of your CAT tool.
  • Avoid manual hyphenation, because the conventions are different in different languages.
  • If you take care with this sort of thing, then there will be no work for you to do when you get the document back from translation.

3) Use consistent and clear terminology throughout a single document, or even better throughout the documentation of the entire company.

  • Avoid jargon, colloquialism and regional language.
  • A good way to manage this is to create a glossary, defining the terms to use and those not to use. You can then also send this glossary to the translator.

4) Write sentences that can be understood without context. For example, this sentence is almost impossible to translate: “How is it used.” The pronoun “it” needs to be replaced by a masculine, feminine or neuter pronoun, depending on the target language.

5) Ensure there is no ambiguity in the sentences. This sentence is an example of ambiguity: “Remove the part using the filter”. Do you use the filter to remove the part, or do you remove the part that uses the filter?

6) Include the articles in a sentence. Don’t leave them out, because they are often necessary to clarify meaning.

7) Don’t stack your nouns. Stacked nouns are particularly hard to translate. A comment from the floor caused some laughter there: “It’s called a ‘noun sandwich’!”

8 ) Keep the subject and verb close to each other. (Being technical writers, you’re probably wondering why there’s an extra space between the 8 and the bracket at the beginning of this line. It’s because WordPress is kindly converting 8 plus bracket to an icon of a smiley with sunglasses. 8) )

9) Use the appropriate punctuation.

10) Spell out acronyms, at least once at the beginning. Also tell your translators how to handle acronyms in the translated text.

11) Give your translators some contextual information. This affects the translation, because different terms mean different things depending on the industry or other context.

12) Send only signed-off documents to the translators. Otherwise you’ll end up paying for re-translation.

13) Create a working communication channel with your translator. They will have lots of questions to ask. This is important for a high-quality translation.
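
Some of these tips lend themselves to an automated check before you send a document off. Here is a minimal sketch covering tips 2 and 3: it flags manual tabs, runs of repeated spaces, and terms that your glossary says not to use. The function name, the messages and the sample glossary are hypothetical, and a real check would depend on your authoring tool and file format.

```python
import re


def lint_for_translation(text, banned_terms):
    """Return a list of (line_number, message) findings for one document.

    banned_terms maps each term to avoid onto the preferred term,
    as a glossary of "terms to use and those not to use" might.
    """
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Tip 2: manual tabs and repeated spaces hinder the CAT tool.
        if "\t" in line:
            findings.append((number, "manual tab"))
        if re.search(r"\S {2,}\S", line):
            findings.append((number, "repeated spaces"))
        # Tip 3: use consistent terminology from the glossary.
        for banned, preferred in banned_terms.items():
            pattern = rf"\b{re.escape(banned)}\b"
            if re.search(pattern, line, flags=re.IGNORECASE):
                findings.append(
                    (number, f"use '{preferred}' instead of '{banned}'")
                )
    return findings


sample = "Click  the button.\nLog in\tnow.\nSign in to the app."
for line_number, message in lint_for_translation(sample, {"log in": "sign in"}):
    print(f"line {line_number}: {message}")
```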

Recommended book

Sarah recommends a book: The Guide to Translation and Localization – Communicating with the Global Marketplace. You can request it from LingoSystems. They will send it free of charge.

My conclusion

This was a very useful session. At Atlassian, where I work, we don’t yet optimise our documentation for translation, but it’s something we’re going to need to do very soon. It’s also something I’m interested in personally. Thank you for all the information and tips, Sarah.

AODC day 2 – Translation and localisation

This week I’m attending the 2009 Australasian Online Documentation and Content Conference (AODC) in Melbourne. Today is the second day of the conference.

Here are some notes I took from the session on best practices for translation and localisation, by Emily Cotlier of Harris Stratex Networks. I hope these notes are useful to people who couldn’t be at the conference this year. The AODC organisers will also publish the session slides and supplementary material once the conference is over.

Translation and Localisation Best Practices

Emily started by asking how many of us work for companies that sell products in other countries, and how many of us know that our content needs translation. Most of the room said yes to both questions.

Some concepts:

  • Translation = Translating from one language to another.
  • Localisation = Aligning a product with the culture of a particular country. Some examples are the use of language, use of colour, spelling, and use of sensitive images.
  • Internationalisation = Designing a product so that it can be localised relatively easily.

Would a technical writer or technical communicator be qualified to manage a translation project? Emily says Yes. She listed the skills required to manage translation and localisation projects:

  • Research
  • Organisation of the source content
  • Scheduling of the content release, along with the translated content
  • Managing of freelancers
  • Communication
  • Global thinking

Next she gave some answers to the type of questions we have when considering such a project. Here are my notes, but please refer to Emily or other sources for up-to-date and regional information:

  • How long does it take? Emily’s experience is that a document of 1000-3000 words requires 1 week turnaround; a short manual takes 2 weeks to a month; over 50 000 words takes one to two months, depending on length and on whether more than one translator is involved. Other factors are the inclusion of graphics, review/approval process, etc.
  • How much does it cost? Recently, a complex manual cost USD 6500 (translator in China) and another cost USD 35000 (translator in USA).
  • How do you manage the project? If you outsource the work, the pros are expertise and speed; the con is a higher cost. If you do the work in-house, the pros are lower cost and retention of knowledge; the cons are the time required and the challenge.
  • Who’s going to pay? Emily says be sure to sort this out beforehand, because the cost can be quite a surprise.

We took a look at what the localisation management technology can do. Examples of such tools are Trados, the localisation module in Author-IT, and many more. They provide an interface for translating. For example, they may lay the original content on one side and the translation on the other side. They compare source with the previously-translated version and indicate any differences, via highlighting on the interface for example. Good software will provide a translation memory. This means that it “remembers” what has been translated, offers existing translations for terms in the source, and points out what has changed since.

Preparing for content localisation:

  • Use clear language. Avoid colloquialisms, passive statements and long convoluted sentences.
  • Prepare a glossary that you can give to the translators, and use it as a style guide for the rest of the translation.
  • Link to graphics rather than embedding them. They need to be translated via a graphics editor, not in the text translation tool. So remember to save them in editable format. Use numbers instead of text for callouts.

To keep your costs down, use single sourcing, and don’t send topics that haven’t changed and therefore don’t need translating again. Keep track of your files and graphics, so that you know what has changed.
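
One possible way to keep track of what has changed is to store a fingerprint of each topic at the time it was last translated, and compare it with the current source. The sketch below assumes topics are available as plain text keyed by a topic ID; the function names and the sample topics are my own, and a single-sourcing tool may well offer this kind of tracking built in.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Return a SHA-256 fingerprint of a topic's source text."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def topics_to_translate(topics, previous_fingerprints):
    """Return the IDs of topics that are new or changed since the last translation.

    topics maps topic ID -> current source text.
    previous_fingerprints maps topic ID -> fingerprint recorded at the
    last translation round.
    """
    return sorted(
        topic_id
        for topic_id, text in topics.items()
        if previous_fingerprints.get(topic_id) != fingerprint(text)
    )


# Fingerprints recorded when the docs were last sent for translation.
last_round = {
    "install": fingerprint("Run the installer."),
    "upgrade": fingerprint("Back up your data."),
}

current = {
    "install": "Run the installer.",   # unchanged
    "upgrade": "Back up first.",       # changed
    "uninstall": "Remove the app.",    # new
}

print(topics_to_translate(current, last_round))
```

Only the changed and new topics would go to the translator, which is where the cost saving comes from.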

Always assess whether extra translation costs are worth it. For example, you will need someone to do a native language review of the translated content. You may be able to ask someone in-house to do this, if there’s someone who speaks that language. Or you may have to outsource this work too.

On a software GUI and on labels in diagrams, be aware that the translation can often take up extra space on the GUI or diagram.
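
As a rough sketch of how you might check for this in advance, the code below estimates the translated length of each label and flags those that may not fit. The 30% expansion factor is an assumption (a rule of thumb, not a figure from Emily’s talk), and the function names and the character limit are hypothetical. Real expansion varies by language and by the length of the string.

```python
import math


def expanded_length(source_text: str, expansion: float = 0.30) -> int:
    """Estimate the character count after translation, rounded up."""
    return math.ceil(len(source_text) * (1 + expansion))


def labels_at_risk(labels, max_chars, expansion=0.30):
    """Return the labels whose estimated translated length exceeds max_chars."""
    return [
        label
        for label in labels
        if expanded_length(label, expansion) > max_chars
    ]


# Assume each button in a hypothetical GUI has room for 12 characters.
print(labels_at_risk(["Save changes", "Cancel"], max_chars=12))
```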

When working with translators/localisers recommended by other people, do your research to see the work they’ve done and any other references. Check the software they use to make sure it’s compatible with yours. Also ask them if they use translation memory software. Many don’t, and this can increase your costs dramatically.

A question from the floor: Who owns the translation memory?

Emily’s answer: We always make sure we get the translation memory back. This helps if you have to start using a different translator. Most translation tools can read each other’s format.

Once you’ve chosen your translator, share knowledge with them. Bring them on board, so that they can produce quality work. Spend time with them reviewing the materials and setting the expectations.

Appoint a person in your company to be the liaison person between you and the localiser, especially if you want to build a long-term relationship with them.

Tools for localising a software GUI (as opposed to documentation): Passolo; WizArt.

Remember that you need to relate the documentation to the GUI, where one or both may have been localised. So your documentation, when translated, may still need to refer to the untranslated GUI terms if the GUI has not been translated.

Managing the completed localised files: Your computer needs the international fonts/languages used in the translated content. Your online help generator must be able to handle the new language too. You also need a good archive system, to keep track of the duplicate sets of files you will now have, one for each language.

Emily also gave us very useful tips about the administrative side of things, such as keeping a job log and keeping track of the text on images for translation.

Question time produced some interesting questions and comments:

1) Are audio translations similar to document translations?

Answer: You would translate from the scripts. This is usually relatively cheap because they tend to be simple. Then there would be an additional cost if you want to record the voice audio or add subtitles to a video.

2) A hint about translating images: Export them in SVG format. That’s very easy to translate because it’s just XML. Then convert them back to raster or whatever is required for your documentation.
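
To show why SVG is easy to work with, here is a minimal sketch that swaps the text labels in an SVG image using Python’s standard XML library. The function name and the sample image are my own, and the sketch only handles text placed directly inside `<text>` elements. Text nested in `<tspan>` elements, and any resizing needed for longer labels, is left out.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def translate_svg_text(svg_source: str, translations: dict) -> str:
    """Return the SVG with each known <text> label replaced by its translation."""
    # Keep the default SVG namespace in the output instead of an ns0: prefix.
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_source)
    for element in root.iter(f"{{{SVG_NS}}}text"):
        label = (element.text or "").strip()
        if label in translations:
            element.text = translations[label]
    return ET.tostring(root, encoding="unicode")


sample_svg = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<text x="10" y="20">Start</text>'
    '<text x="10" y="40">Stop</text>'
    "</svg>"
)

# Only "Start" has a translation here, so "Stop" is left as it is.
print(translate_svg_text(sample_svg, {"Start": "Démarrer"}))
```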

A great presentation, Emily!
