Say I build a product and do a good job of making it internationalizable. The product is a success, and it is localized into various languages. Then a customer in a far-away place sends an email complaining that the server fails to start up. For my convenience, he has attached the log file. Since I did a good job writing decent error and logging messages, I expect no problem diagnosing what's going wrong. But oops: since I did such a nice job internationalizing the product, the log is in some far-away language, say Japanese! Now what?
Let me make this concrete with an example. Let's say that the log contains this entry:
(F1231) Bestand bestelling-554 in lijst bestellingen kon niet geopend worden: (F2110) Een verbinding met de server 35 kon niet tot stand gebracht worden.
Looks foreign to you? (It doesn't to me, but I would have great problems if this example were in Japanese.)
Fortunately, the error codes give us a way to figure out what it says: we can look them up. However, if there are many log entries, this becomes a laborious affair. Would it be possible to obtain an English log file without tedious lookups and guesswork?
I think there are a few different approaches to this problem:
- always create two log files: an English one and a localized one
- store the log in a language-neutral form, and use log viewers that render the log in a localized form
- try to automatically "translate" the localized log file into English
The second option runs into a problem with typical code like this:
try {
    ...
    // The inner message is localized right here, when the exception is created.
    throw new Exception(msgcat.get("F2110: Could not establish connection with server {0}", serverid));
} catch (Exception ex) {
    // The outer message embeds the already-localized inner exception.
    String msg = msgcat.get("F1231: Could not open file {1} in directory {0}: {2}", dir, file, ex);
    throw new IOException(msg, ex);
}
The exception message is already localized when it is thrown, so by the time the catch block runs, no language-neutral message is left. It would have been nice if Object.toString() had a close cousin that takes the locale, toString(Locale), and if the Exception class accepted an Object instead of limiting itself to a String.
In a previous product where I had more control over the complete codebase, I approached this problem by introducing a specialized text class that supported the toString(Locale) method, and Exception classes that could return this text class. This solution was also ideal for storing text in locale-neutral form in a database, so that different customers could view the data in different locales.
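To illustrate the idea, here is a minimal sketch of such a text class. It is not the code from that product: the names LocalizableText and LocalizableException are made up for this example, and the resource bundles are hard-coded in a map (an assumption, to keep the sketch self-contained) rather than loaded from real bundle files.

```java
import java.text.MessageFormat;
import java.util.Locale;
import java.util.Map;

/** Hypothetical locale-neutral text: a message code plus arguments, rendered on demand. */
final class LocalizableText {
    // Stand-in for real resource bundles, hard-coded to keep the sketch self-contained.
    private static final Map<String, Map<String, String>> BUNDLES = Map.of(
        "en", Map.of(
            "F1231", "Could not open file {1} in directory {0}: {2}",
            "F2110", "Could not establish connection with server {0}"),
        "nl", Map.of(
            "F1231", "Bestand {1} in lijst {0} kon niet geopend worden: {2}",
            "F2110", "Een verbinding met de server {0} kon niet tot stand gebracht worden."));

    private final String code;
    private final Object[] args;

    LocalizableText(String code, Object... args) {
        this.code = code;
        this.args = args.clone();
    }

    /** The "close cousin" of Object.toString(): renders this text in the given locale. */
    String toString(Locale locale) {
        // Unknown languages fall back to English (an assumption of this sketch).
        Map<String, String> bundle = BUNDLES.getOrDefault(locale.getLanguage(), BUNDLES.get("en"));
        Object[] rendered = new Object[args.length];
        for (int i = 0; i < args.length; i++) {
            // Nested texts are rendered in the same locale; everything else via String.valueOf.
            rendered[i] = (args[i] instanceof LocalizableText)
                ? ((LocalizableText) args[i]).toString(locale)
                : String.valueOf(args[i]);
        }
        return "(" + code + ") " + new MessageFormat(bundle.get(code), locale).format(rendered);
    }

    @Override
    public String toString() {
        return toString(Locale.ENGLISH);
    }

    public static void main(String[] unused) {
        LocalizableText cause = new LocalizableText("F2110", 35);
        LocalizableText text = new LocalizableText("F1231", "bestellingen", "bestelling-554", cause);
        System.out.println(text.toString(Locale.forLanguageTag("nl")));
        System.out.println(text.toString(Locale.ENGLISH));
    }
}

/** Hypothetical exception that keeps the neutral text instead of a pre-localized String. */
class LocalizableException extends Exception {
    private final LocalizableText text;

    LocalizableException(LocalizableText text, Throwable cause) {
        super(text.toString(), cause);
        this.text = text;
    }

    LocalizableText getText() {
        return text;
    }
}
```

Because the code and arguments are kept until rendering time, the same object can be written to a localized log, an English log, or a database column, and only the last step decides on the language.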
There is a kludgy workaround: we could change the msgcat.get() method so that it returns a String that is locale-neutral rather than localized. A separate method would convert the locale-neutral String into a localized String, e.g. msgcat.convert(String, Locale). This method would have to be called every time a String is prepared for viewing, e.g. when writing a localized log.
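A rough sketch of how this workaround could look is shown below. The class name NeutralMessageCatalog, the bracket-and-bar encoding, and the hard-coded bundles are all assumptions made for the example; the sketch also assumes that plain arguments never contain '[', ']' or '|', since it does no escaping.

```java
import java.text.MessageFormat;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical catalog: get() encodes without localizing, convert() localizes later. */
final class NeutralMessageCatalog {
    // Stand-in for real resource bundles, keyed by language code.
    private static final Map<String, Map<String, String>> BUNDLES = Map.of(
        "en", Map.of(
            "F1231", "Could not open file {1} in directory {0}: {2}",
            "F2110", "Could not establish connection with server {0}"),
        "nl", Map.of(
            "F1231", "Bestand {1} in lijst {0} kon niet geopend worden: {2}",
            "F2110", "Een verbinding met de server {0} kon niet tot stand gebracht worden."));

    // An innermost token "[CODE|arg|arg...]" whose arguments contain no '[', ']' or '|'.
    private static final Pattern TOKEN =
        Pattern.compile("\\[([A-Z]\\d{4})((?:\\|[^\\[\\]|]*)*)\\]");

    /** Returns the locale-neutral encoding; nothing is looked up or translated here. */
    static String get(String code, Object... args) {
        StringBuilder sb = new StringBuilder("[").append(code);
        for (Object arg : args) {
            sb.append('|').append(arg);
        }
        return sb.append(']').toString();
    }

    /** Localizes every neutral token in the text, innermost tokens first. */
    static String convert(String neutral, Locale locale) {
        Map<String, String> bundle = BUNDLES.getOrDefault(locale.getLanguage(), BUNDLES.get("en"));
        String text = neutral;
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            String pattern = bundle.get(m.group(1));
            if (pattern == null) {
                throw new IllegalArgumentException("Unknown message code: " + m.group(1));
            }
            String argList = m.group(2);
            Object[] args = argList.isEmpty()
                ? new Object[0]
                : argList.substring(1).split("\\|", -1);
            String localized = "(" + m.group(1) + ") " + new MessageFormat(pattern, locale).format(args);
            // Replace the token and rescan, so an enclosing token becomes innermost next.
            text = text.substring(0, m.start()) + localized + text.substring(m.end());
            m = TOKEN.matcher(text);
        }
        return text;
    }

    public static void main(String[] unused) {
        String cause = get("F2110", 35);
        String neutral = get("F1231", "bestellingen", "bestelling-554", cause);
        System.out.println(neutral);
        System.out.println(convert(neutral, Locale.forLanguageTag("nl")));
        System.out.println(convert(neutral, Locale.ENGLISH));
    }
}
```

The kludge shows in the sketch as well: every String that might hold a neutral token has to pass through convert() before a user sees it, and nothing in the type system reminds you to do so.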
In the products that I am currently working on, these approaches to support locale-neutral strings are not feasible because they would require widespread changes. So let's take a look at option 3.
Given the Dutch resource bundle
F1231 = Bestand {1} in lijst {0} kon niet geopend worden: {2}
F2110 = Een verbinding met de server {0} kon niet tot stand gebracht worden.
and the English one
F1231 = Could not open file {1} in directory {0}: {2}
F2110 = Could not establish connection with server {0}
let's see if there is a way to automatically translate the message
(F1231) Bestand bestelling-554 in lijst bestellingen kon niet geopend worden: (F2110) Een verbinding met de server 35 kon niet tot stand gebracht worden.
into
(F1231) Could not open file bestelling-554 in directory bestellingen: (F2110) Could not establish connection with server 35
I think it is possible to build a tool that can do that. The tool would read in all known resource bundles (possibly by pointing it to the installation image, after which the tool would scan all jars to extract all resource bundles) and translate them into regular expressions. It would have to recognize error codes (e.g. \\([A-Z]\\d{4}\\) ) and use these to successively expand the error message into its full locale-neutral form. In the example, the neutral form is:
[F1231, {0}=bestellingen, {1}=bestelling-554, {2}=[F2110, {0}=35]]
The neutral form can then easily be converted into the localized English form.
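Here is a minimal sketch of what the core of such a tool might look like. It is only an illustration under a few assumptions: the class name Relocalizer is made up, the bundles are passed in as maps instead of being scanned from jars, and the sample log line in main is written so that it matches the Dutch bundle patterns exactly. Each bundle pattern is turned into a regular expression with one lazy capture group per placeholder, which can mis-parse messages whose arguments happen to contain the literal text of the pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical relocalization tool: localized log text -> neutral form -> target locale. */
final class Relocalizer {
    private static final Pattern CODE = Pattern.compile("^\\(([A-Z]\\d{4})\\) ");
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\d+)\\}");

    /** Neutral form: an error code plus arguments that are Strings or nested Neutrals. */
    static final class Neutral {
        final String code;
        final TreeMap<Integer, Object> args = new TreeMap<>();

        Neutral(String code) {
            this.code = code;
        }

        /** Fills the bundle's pattern for this code, rendering nested messages recursively. */
        String render(Map<String, String> bundle) {
            StringBuilder out = new StringBuilder("(" + code + ") ");
            Matcher m = PLACEHOLDER.matcher(bundle.get(code));
            while (m.find()) {
                Object arg = args.get(Integer.parseInt(m.group(1)));
                String text = (arg instanceof Neutral) ? ((Neutral) arg).render(bundle) : String.valueOf(arg);
                m.appendReplacement(out, Matcher.quoteReplacement(text));
            }
            m.appendTail(out);
            return out.toString();
        }

        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder("[").append(code);
            args.forEach((index, arg) -> sb.append(", {").append(index).append("}=").append(arg));
            return sb.append(']').toString();
        }
    }

    private final Map<String, String> source;
    private final Map<String, String> target;

    Relocalizer(Map<String, String> source, Map<String, String> target) {
        this.source = source;
        this.target = target;
    }

    /** Turns a bundle pattern into a regex; 'order' records the placeholder index of each group. */
    private static Pattern toRegex(String template, List<Integer> order) {
        StringBuilder regex = new StringBuilder();
        Matcher m = PLACEHOLDER.matcher(template);
        int last = 0;
        while (m.find()) {
            regex.append(Pattern.quote(template.substring(last, m.start()))).append("(.+?)");
            order.add(Integer.parseInt(m.group(1)));
            last = m.end();
        }
        regex.append(Pattern.quote(template.substring(last)));
        return Pattern.compile(regex.toString());
    }

    /** Returns a Neutral if the text is a known coded message, else the text itself. */
    Object parse(String text) {
        Matcher code = CODE.matcher(text);
        if (!code.find() || !source.containsKey(code.group(1))) {
            return text;
        }
        List<Integer> order = new ArrayList<>();
        Matcher m = toRegex(source.get(code.group(1)), order).matcher(text.substring(code.end()));
        if (!m.matches()) {
            return text;
        }
        Neutral neutral = new Neutral(code.group(1));
        for (int i = 0; i < order.size(); i++) {
            // An argument may itself be a coded message, so expand it recursively.
            neutral.args.put(order.get(i), parse(m.group(i + 1)));
        }
        return neutral;
    }

    /** Translates one log message; text that cannot be parsed is returned unchanged. */
    String translate(String text) {
        Object parsed = parse(text);
        return (parsed instanceof Neutral) ? ((Neutral) parsed).render(target) : text;
    }

    public static void main(String[] unused) {
        Map<String, String> nl = Map.of(
            "F1231", "Bestand {1} in lijst {0} kon niet geopend worden: {2}",
            "F2110", "Een verbinding met de server {0} kon niet tot stand gebracht worden.");
        Map<String, String> en = Map.of(
            "F1231", "Could not open file {1} in directory {0}: {2}",
            "F2110", "Could not establish connection with server {0}");
        String line = "(F1231) Bestand bestelling-554 in lijst bestellingen kon niet geopend worden: "
            + "(F2110) Een verbinding met de server 35 kon niet tot stand gebracht worden.";
        Relocalizer tool = new Relocalizer(nl, en);
        System.out.println(tool.parse(line));
        System.out.println(tool.translate(line));
    }
}
```

A real tool would cache the compiled regular expressions per error code instead of rebuilding them for every message, and it would need some way to detect which locale a log file was written in before choosing the source bundle.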
4 comments:
Aren't you essentially speaking about reverse localization? To get ASCII messages from localized ones. Why the name relocalization?
I used the term "relocalization" because the goal is to translate a set of messages from one locale to the other, e.g. from Japanese to English.
Just google for relocalization and reverse localization and you'll get totally different meanings. My point is that this may confuse readers who are already familiar with the terms.
I see your problem with the term relocalization. Thanks! I'll change it to "translation".