Saturday, October 6, 2007

Server side Internationalization made easy

Last year I wrote a blog entry on my gripes with Internationalization in Java for server side components. Sometime in January I built a few utilities for JMSJCA that make internationalization for server side components a lot easier. To make them available to a larger audience, I added the utilities to the tools collection on http://hulp.dev.java.net. What do these utilities do?

Generating resource bundles automatically

The whole point was that when writing Java code, I would like to keep internationalizable texts close to my Java code. Rather than in resource bundles, I prefer to keep texts in my Java code so that:

  1. While coding, you don't need to keep switching between a Java file and a resource bundle.
  2. No more missing messages because of typos in error identifiers, and no more obsolete messages in resource bundles.
  3. You can easily review code to make sure that error messages make sense in the context in which they appear, and you can easily check that the arguments for the error messages indeed match.

Instead of writing code like this:

        sLog.log(Level.WARNING, "e_no_match_w_pattern", new Object[] { dir, pattern, ex}, ex);

Prefer code like this:

        sLog.log(Level.WARNING, sLoc.t("E131: Could not find files with pattern {1} in directory {0}: {2}",
            dir, pattern, ex), ex);

Here's a complete example:

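import java.util.logging.Level;
import java.util.logging.Logger;

public class X {
    Logger sLog = Logger.getLogger(X.class.getName());
    Localizer sLoc = Localizer.get();

    // dir, pattern and ex are passed in as parameters so that the example compiles
    public void test(String dir, String pattern, Exception ex) {
        sLog.log(Level.WARNING, sLoc.t("E131: Could not find files with pattern {1} in directory {0}: {2}",
            dir, pattern, ex), ex);
    }
}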

Hulp has an Ant task that goes over the generated classes and extracts these phrases and writes them to a resource bundle. E.g. the above code results in this resource bundle:

# DO NOT EDIT
# THIS FILE IS GENERATED AUTOMATICALLY FROM JAVA SOURCES/CLASSES
# net.java.hulp.i18ntask.test.TaskTest.X
TEST-E131 = Could not find files with pattern {1} in directory {0}\: {2}

To use the Ant task, add something like this to your Ant script, typically between <javac> and <jar>:

<taskdef name="i18n" classname="net.java.hulp.i18n.buildtools.I18NTask" classpath="lib/net.java.hulp.i18ntask.jar"/>
<i18n dir="${build.dir}/classes" file="src/net/java/hulp/i18ntest/msgs.properties" prefix="TEST" />

How does the Ant task know what strings should be copied into the resource bundle? It uses a regular expression for that. By default it looks for strings that start with a single uppercase letter, followed by three digits, a colon and a space, which is this regular expression: [A-Z]\d\d\d: .*
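To make the matching rule concrete, here is a minimal sketch of how a string constant could be tested against the default pattern and turned into a bundle line. It is not the Ant task's actual code: the class name, the method toBundleLine and the capture groups are assumptions made for illustration, and the sketch does no properties-file escaping.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; not the Hulp Ant task's actual implementation.
class MessageExtractionSketch {
    // The default pattern from the text, with capture groups added (an assumption)
    // so that the message ID and the message text can be separated.
    static final Pattern MESSAGE = Pattern.compile("([A-Z]\\d\\d\\d): (.*)");

    /** Returns "PREFIX-ID = text" for a matching constant, or null if it does not match. */
    static String toBundleLine(String prefix, String constant) {
        Matcher m = MESSAGE.matcher(constant);
        if (!m.matches()) {
            return null; // not an internationalizable message
        }
        return prefix + "-" + m.group(1) + " = " + m.group(2);
    }

    public static void main(String[] args) {
        // A message in the new style matches and yields a bundle line
        System.out.println(toBundleLine("TEST",
            "E131: Could not find files with pattern {1} in directory {0}: {2}"));
        // An old-style message key does not match and is skipped
        System.out.println(toBundleLine("TEST", "e_no_match_w_pattern"));
    }
}
```

Strings that do not start with an ID, such as the old-style key e_no_match_w_pattern, are simply ignored.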

Getting messages out of resource bundles

With the full English message in the Java code, how is the proper localized message obtained? In the code above, this is done in this statement:

sLoc.t("E131: Could not find files with pattern {1} in directory {0}: {2}", dir, pattern, ex)

The method t takes the string, extracts the message ID from it (E131), uses the message ID plus the prefix (TEST) to look up the message in the right resource bundle, and returns the substituted text. The method t lives in class Localizer. This class needs to be declared in the package where the resource bundles are placed. It derives from net.java.hulp.i18n.LocalizationSupport. E.g.:
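The lookup described above can be sketched with standard JDK classes. This is only an illustration of the idea, not the code in LocalizationSupport: the class LookupSketch, the fallback to the English text when a key is missing, and the German sample text are all assumptions.

```java
import java.text.MessageFormat;
import java.util.ListResourceBundle;
import java.util.MissingResourceException;
import java.util.ResourceBundle;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; Hulp's LocalizationSupport may work differently.
class LookupSketch {
    private static final Pattern MESSAGE = Pattern.compile("([A-Z]\\d\\d\\d): (.*)");

    private final String prefix;
    private final ResourceBundle bundle;

    LookupSketch(String prefix, ResourceBundle bundle) {
        this.prefix = prefix;
        this.bundle = bundle;
    }

    /** Extracts the ID, looks up PREFIX-ID in the bundle, and substitutes the arguments. */
    String t(String msg, Object... args) {
        Matcher m = MESSAGE.matcher(msg);
        if (!m.matches()) {
            return MessageFormat.format(msg, args); // no ID: use the text as is
        }
        String pattern;
        try {
            pattern = bundle.getString(prefix + "-" + m.group(1));
        } catch (MissingResourceException e) {
            pattern = m.group(2); // assumed fallback: the English text from the code
        }
        return MessageFormat.format(pattern, args);
    }

    public static void main(String[] args) {
        // An in-memory bundle stands in for a translated msgs.properties file
        ResourceBundle german = new ListResourceBundle() {
            @Override
            protected Object[][] getContents() {
                return new Object[][] {
                    {"TEST-E131", "Keine Dateien mit Muster {1} im Verzeichnis {0} gefunden: {2}"}
                };
            }
        };
        LookupSketch loc = new LookupSketch("TEST", german);
        System.out.println(loc.t(
            "E131: Could not find files with pattern {1} in directory {0}: {2}",
            "/tmp", "*.xml", "access denied"));
    }
}
```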

public static class Localizer extends net.java.hulp.i18n.LocalizationSupport {
    public Localizer() {
        super("TEST");
    }
    private static final Localizer s = new Localizer();
    public static Localizer get() {
        return s;
    }
}

The class name should be Localizer so that the Ant task can be extended later to automatically detect which packages use which resource bundles.

Using the compiler to enforce internationalized code

It would be nice if the compiler could force internationalized messages to be used. To do that, Hulp includes a wrapper around java.util.logging.Logger that only takes objects of class LocalizedString instead of just String. The class LocalizedString is a simple wrapper around String. The Localizer class produces these strings. If you avoid using java.util.logging.Logger directly and use net.java.hulp.i18n.Logger instead, the compiler will force you to use internationalized texts. Here's a full example:

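import net.java.hulp.i18n.Logger;

public class X {
    Logger sLog = Logger.getLogger(X.class);
    Localizer sLoc = Localizer.get();

    // dir, pattern and ex are passed in as parameters so that the example compiles
    public void test(String dir, String pattern, Exception ex) {
        sLog.warn(sLoc.x("E131: Could not find files with pattern {1} in directory {0}: {2}",
            dir, pattern, ex), ex);
    }
}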
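For readers wondering how such a wrapper enforces this at compile time, here is a rough sketch of the idea. The classes Msg and TypedLoggerSketch are hypothetical stand-ins, not the real LocalizedString and net.java.hulp.i18n.Logger, whose APIs may differ.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for LocalizedString: a thin wrapper around String.
final class Msg {
    private final String text;

    Msg(String text) {
        this.text = text;
    }

    @Override
    public String toString() {
        return text;
    }
}

// Hypothetical stand-in for the logger wrapper; delegates to java.util.logging.
final class TypedLoggerSketch {
    private final Logger delegate;

    TypedLoggerSketch(Class<?> owner) {
        this.delegate = Logger.getLogger(owner.getName());
    }

    // Only Msg is accepted. There is no overload that takes a plain String,
    // so a call like warn("raw text", ex) does not compile.
    void warn(Msg msg, Throwable cause) {
        delegate.log(Level.WARNING, msg.toString(), cause);
    }
}
```

Because only the localizer is meant to create the wrapper objects, every message that reaches the logger has gone through the localization step.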

Logging is one area that requires internationalization; exceptions are another. Unfortunately there's no general approach to force internationalized messages in exceptions. You can only do that if you define your own exception class that takes the LocalizedString in the constructor, or define a separate exception factory that takes this string class in the factory method.
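Both options could look roughly like this. The sketch uses a hypothetical LocalizedText class in place of Hulp's LocalizedString, and the exception and factory names are made up for illustration.

```java
// Hypothetical stand-in for LocalizedString; the real class may differ.
final class LocalizedText {
    private final String text;

    LocalizedText(String text) {
        this.text = text;
    }

    @Override
    public String toString() {
        return text;
    }
}

// Option 1: an exception class whose only constructor requires a localized message.
class LocalizedException extends Exception {
    LocalizedException(LocalizedText message, Throwable cause) {
        super(message.toString(), cause);
    }
}

// Option 2: a factory for existing exception types that only accepts localized messages.
final class Exceptions {
    private Exceptions() {
    }

    static IllegalStateException illegalState(LocalizedText message, Throwable cause) {
        return new IllegalStateException(message.toString(), cause);
    }
}
```

Neither option stops code elsewhere from calling new IllegalStateException("raw text") directly, which is why this remains a convention rather than something the compiler can fully enforce.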

Download

Go to http://hulp.dev.java.net to download these utilities. The jars (the Ant task and utilities) are also hosted on the Maven repository on java.net.

2 comments:

Thomas Einwaller said...

I like your idea - another advantage would be that I do not have to restart my web application for every small text change.

Was the word "Internationaliation" in your header your intention or just a typo?

Frank Kieviet said...

Re Thomas:

About web-apps: you bring up an interesting idea where this same concept can be extended to HTML files!

Yes, I mistyped the word in the header. Thanks; I fixed it. Internationalization is a tough word to type.

Thanks for your feedback!

Frank