What is a Nested Diagnostics Context?
Let's say that we're writing a message driven bean (MDB) that we'll
deploy on Glassfish. Let's say
that the MDB's onMessage()
method grabs the payload of the message and calls into a stateless
session bean (SLSB) for processing. Let's say that the implementation
of the SLSB
calls into org.apache.xparser:
my.company.MDB > my.company.SLSB > org.apache.xparser
Let's say that he xparser
package may log some warnings if the
payload is not properly formatted. No problem so far. Now let's say
that the application is put in production together with a dozen other
applications and that many of these applications use this library. The
administrator once in a while finds these warnings in Glassfish's server.log:
[#|2007-09-17T18:36:03.247-0400|WARN|sun-appserver9.1|org.apache.xparser.Parser |_ThreadID=18; ThreadName=ConsumerMessageQueue:(1); |Encoding missing, assuming UTF-8|#]
Let's say that the administrator wants to relay this information to the developer responsible for this application. Using the category name org.apache.xparser.Parser, the administrator can find out what code is responsible (a third party component in this case), but how can the administrator find out which application is responsible for this log output?
One approach is to always log
the application name before calling into the SLSB, so that the
administrator can find the application name using the _ThreadID: he would look at the
_ThreadID of the warning,
then look for a message earlier in the log that has the same _ThreadID that identifies the
application. Not only is this
cumbersome, it's also a big problem that the application now fills up
the log with the application name just in case the SLSB would log
something.
It would be nice if somehow the MDB could associate the thread with
the application name, so that if code downstream logs anything, the log
message will be adorned with the application name:
[#|2007-09-17T18:36:03.247-0400|WARN|sun-appserver9.1|org.apache.xparser.Parser |_ThreadID=18; Context=Payrollsync; ThreadName=ConsumerMessageQueue:(1); |Encoding missing, assuming UTF-8|#]
In Log4J, this is quite simple using Log4J's NDC class: before the MDB calls
into the SLSB, it would call NDC.push("Payrollsync")
to push the context onto the stack, and after the SLSB it would call NDC.pop(). NDC stands for
Nested Diagnostic Context. It's called nested because it maintains a
stack, so that the SLSB could push another context onto the stack,
hiding the context of the MDB, and pop the context off the stack to
restore the
stack in its original state before returning. Of course each thread has
to have its own stack.
The NDC is a nice facility in Log4J. Unfortunately, in java.util.logging there's no such facility. Let's build one!
Building a Nested Diagnostic Context
The Nested Diagnostics Context will have to keep a stack per thread.
When
the logging mechanism logs something, it needs to peek at the stack
and add the top most item on the stack to the log message. The stack
needs to be accessible somehow by both the application that sets the
context
and by the logging mechanism. A complicating factor in this is that it
needs to work with both delegating-first
and self-first classloaders.
The latter is found in some web applications (special setting in sun-web.xml) and in some JBI
components. Furthermore, we would like to use this mechanism in
Glassfish and avoid having to make changes
to the Glassfish codebase. Lastly, we need to avoid making
changes to the application that would cause the application to be no
longer portable to other application servers.
Logger.getLogger("com.sun.EnterContext").fine("Payrollsync");
slsb.process(msg.getText());
Logger.getLogger("com.sun.ExitContext").fine("Payrollsync");
The loggers com.sun.EnterContext
and com.sun.ExitContext
are special loggers that we'll develop; messages written to these
loggers directly interact with the context stack. Through these special
loggers, this example will result in adding the context to any log
messages that are
produced in the slsb.process(msg)
call. On other application servers without these special loggers, this
will result in logging the
context at FINE level
before and after the call to the SLSB is made, so that one can
associate
a log message using the _ThreadID;
it will not do anything if FINE
logging is turned off.
What if we want to add more than one context parameter to the log message? For instance, what if we want to add the ID of the message that we're processing?
Logger.getLogger("com.sun.EnterContext") .log(Level.FINE, {0}={1}, {2}={3}, new Object[] {"Application", "Payrollsync", "Msgid", msg.getMessageID()});
slsb.process(msg.getText());
Logger.getLogger("com.sun.ExitContext") .log(Level.FINE, {0}={1}, {2}={3}, new Object[] {"Application", "Payrollsync", "Msgid", msg.getMessageID()});
The special logger will take the Object[] and push these on the
stack. The message string "{0}={1},
{2}={3}" is there merely for portability: if the the code is
deployed onto an
application server to which we didn't install the NDC facilities, this
will simply log the context parameters at FINE level.
Implementation
In a stand alone Java application, you would simply set your own LogManager and implement the NDC functionality there. Glassfish already comes with its own LogManager, and we don't want to override that. Rather, we want to plug in new functionality without any changes to the existing code base. Here's what we need to do:- create the special loggers com.sun.EnterContext and com.sun.ExitContext
- hookup these special loggers
- hook into the log stream to print out the context
To create the special loggers, we can simply create a new class that
derives from java.util.logging.Logger,
say EntryLogger. Next, we
need to make sure that when someone calls Logger.getLogger("com.sun.EnterContext"),
it will be this class that is returned. Without making any changes to
the LogManager, the way
that that can be accomplished is by instantiating the new EntryLogger and registering it
with the LogManager
immediately. This has to be
done before anybody calls Logger.getLogger("com.sun.EnterContext").
In other words, we should do this before any application starts. In
Glassfish there's an extensibility mechanism called LifeCycleListeners. An object
that implements this interface can be loaded by Glassfish automatically
upon startup.
Lastly, we need to find a way to add the context to the log entries
in the log. Glassfish already has a mechanism to add key-value pairs to
each log entry: when formatting a LogRecord
for printing, Glassfish calls LogRecord.getParameters()
and checks each object in the returned Object[] for objects that
implement java.util.Map
and java.util.Collection.
For objects that implement java.util.Map,
Glassfish adds the key-value pairs to the log message. For objects that
implement java.util.Collection,
Glassfish adds each entry as a String to the log message.
If each LogRecord can
somehow be intercepted before it reaches Glassfish's Formatter, the context can be
added as an extra parameter to the LogRecord's parameter list.
This can be done by adding a new java.util.logging.Handler
to the root-Logger before
Glassfish's own Handler.
For each LogRecord that
this new Handler
receives, it will inspect the Context stack and add a Map with the Context to the LogRecord. Next, the root-Logger will send the LogRecord to Glassfish's own Handler which takes care of
printing the message into the log. Once again, the LifeCycleListener
is the ideal place to register the new Handler.
Give it a spin!
You can download the jar that has these new classes and/or download the sources. Put the jar in Glassfish's lib directory. Restart the server and install the LifeCycleListener:
4 comments:
In my opinion, Maped Diagnostic Context (MDC) is better than NDC. I may be wrong but NDC requires push and pop..If a developer forgets to pop then it may cause some memory leaks. MDC how ever useds maps.
Re Rajan:
Correct, the NDC requires push and pop. Through that, a thread can go from module to module and maintain a debugging context.
Although the MDC certainly has its merits, going through multiple modules may be confusing because values are overwritten. And if values are not removed, they may confusion in subsequent calls, especially if threads can be dispatched to different applications.
Nevertheless, the same approach as described in this blog entry can be used to hookup an MDC in the jdk logging package.
Frank
An update: see http://frankkieviet.blogspot.com/2009/09/lessons-of-interesting-deadlock-problem.html
This is a nice feature in Glassfish - and is particularly useful with the JBI BPEL Service Engine.
There is a problem with this code when glassfish runs on JDK1.6.0_18 (that is not present with at least JDK1.6.0_11 or earlier). The java.util.logging.LogManager now maintains weak references to its loggers (including the NDCEntryLogger and NDCExitLogger). And so these are garbage collected by the container.
If the NDCProvider code is changed to declare the NDCLogger references as static this stops them from being garbage collected.
Post a Comment