Thursday, November 23, 2006

Translating a web site into Chinese and Japanese

We are a small company selling software as a service. Most of our leads come from advertising on Google, MSN and Yahoo, and although we are European, 70% of our customer base is located in the US. This is not surprising because both our web site and our advertising are only available in English and French. Since most of our competitors are located in the US too, we can assume that the cost of a lead is much higher there than elsewhere. So we have decided to tap into the great reservoir of non-English-speaking countries and have our web site translated into 8 languages, including Chinese and Japanese.

Word documents

We put our web content into Word 2003 to have it translated by professional translators. Unsurprisingly, the Chinese and Japanese documents that we received did not display properly in Word. We found on the web that we had to install the Proofing Tools for Microsoft Office to display the content, which we did, and it works. We realized later, on a computer that had downloaded the Asian language packs for Internet Explorer, that this is another way to get the Japanese and Chinese content properly displayed in Office and, unlike the Proofing Tools, it is free. In fact Word 2003 is Unicode-based, so all you need is the proper fonts.

HTML pages

The next step is to get the Word document into HTML. Apparently copying and pasting from Word to Dreamweaver works quite well, but I have had so many issues in the past with the way Word handles HTML that I preferred another way. You need to choose whether your pages will be UTF-8 encoded or will use a legacy code page (GB2312 for simplified Chinese). A code page produces a more compact file, since GB2312 stores each Chinese character in 2 bytes where UTF-8 needs 3, but this is the old way. We have decided to use UTF-8 for our entire site. In Word, save your document as “filtered HTML” and, in the Save As dialog, select “Web options”. Select the UTF-8 encoding and a simplified Chinese font. Do the same for Japanese. Then you can open your new HTML document in Dreamweaver and copy and paste reliably within Dreamweaver.
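
To get a feel for the size trade-off between a code page and UTF-8, here is a small Python sketch. It is only an illustration: the sample string and the function name encoded_sizes are made up for this example and are not taken from our site.

```python
# Compare the size of the same Chinese text in a legacy code page and in UTF-8.
# The sample string is an arbitrary illustration, not text from our site.
sample = "中文网站"  # 4 simplified Chinese characters


def encoded_sizes(text):
    """Return the byte length of text in GB2312 and in UTF-8."""
    return {
        "gb2312": len(text.encode("gb2312")),
        "utf-8": len(text.encode("utf-8")),
    }


sizes = encoded_sizes(sample)
print(sizes)  # {'gb2312': 8, 'utf-8': 12}
```

The gap only applies to the Chinese characters themselves. HTML markup is plain ASCII and takes one byte per character in both encodings, so the difference for a whole page is smaller than these numbers suggest.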

You will need to add the following meta tags to your translated HTML pages (use "ja" instead of "zh" on the Japanese pages):
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="Content-Language" content="zh" />
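
With many translated pages it is easy to forget a tag or to leave "zh" on a Japanese page. The following Python sketch shows one possible way to check a page for the two meta tags above. The names MetaCollector and check_page are hypothetical helpers written for this example, and the sample page is a made-up minimal document.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect the http-equiv meta tags of an HTML page."""

    def __init__(self):
        super().__init__()
        self.http_equiv = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("http-equiv")
        if name:
            self.http_equiv[name.lower()] = attributes.get("content") or ""


def check_page(html, expected_language):
    """Return a list of problems found in the page's meta tags."""
    collector = MetaCollector()
    collector.feed(html)
    problems = []
    content_type = collector.http_equiv.get("content-type", "")
    if "charset=utf-8" not in content_type.lower().replace(" ", ""):
        problems.append("missing UTF-8 Content-Type")
    language = collector.http_equiv.get("content-language", "")
    if language.lower() != expected_language:
        problems.append("Content-Language is not " + expected_language)
    return problems


# Made-up minimal page carrying the two tags shown above.
page = (
    '<html><head>'
    '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />'
    '<meta http-equiv="Content-Language" content="zh" />'
    '</head><body></body></html>'
)
print(check_page(page, "zh"))  # []
print(check_page(page, "ja"))  # ['Content-Language is not ja']
```

This only checks that the tags are declared. It does not verify that the file itself was actually saved as UTF-8.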

Images

We use Adobe ImageReady and Macromedia Fireworks for our images. Fireworks is less powerful, but its use of the PNG file format makes it much easier to maintain large quantities of image files because Windows Explorer displays the thumbnails. When you copy Chinese characters from Word or HTML into ImageReady, only half of them display properly and the others are replaced by question marks. The reason is that ImageReady selects the MS Gothic font by default when it should use SimHei or SimSun. These fonts are not even listed in the font drop-down list, but you can type in the font name and it works.

Finally, no matter how well you prepare your work with the translators, you will find that some strings are too long to fit your buttons, or that some last-minute changes have not been taken into account. In such cases it is much easier to get an approximate automated translation from http://www.systransoft.com/index.html.

You can check the result at http://www.velodoc.com/zh/ and http://www.velodoc.com/ja/. To download the Asian Pack, simply select View -> Encoding -> More -> Chinese Simplified in Internet Explorer.
