Showing posts with label Sun. Show all posts

Saturday, July 31, 2010

GlassFish Security

Packt Publishing was kind enough to send me a copy of the book "GlassFish Security", which was released very recently.

image

It's tough to find the time to read a book cover to cover. In fact, it's been a while since I've read a book from beginning to end. Typically I'm only interested in a few chapters, which I then read. Later, when the need arises, I may get back to other chapters. It's like treating a book as an encyclopedia or a dictionary. I bet that most people read technical books that way nowadays. So it's important that a book lends itself to being read that way.

Security is a very broad field with many different topics, e.g. user authentication and authorization in web applications, integration with an external security server, web services security, and so on. Fortunately, very few people will have to deal with all these aspects at the same time. GlassFish Security covers many if not all of those aspects, and that's another reason why it should be possible to read this book like an encyclopedia.

I went back in time and tried to remember all the times that I have had anything to do with security in GlassFish and checked what the book had to say about it. For instance, to start simple, there's the issue of configuring realms and users. There's the time I tried to get JSPWiki to work on GlassFish using declarative authentication and authorization, something that goes through server.policy. More advanced, there's the time we tried to integrate GlassFish with AccessManager. On all these topics, the book delivers. There are many other topics (e.g. integration with OpenSSO, securing JMX, etc), all described in detail without wasting much space on repetition or endless code or XML listings. The book would have saved me a lot of time had I had it at the time when I needed it.

If you're using GlassFish, this book belongs on your bookshelf!

Monday, April 26, 2010

Working with java.net: tips and tricks for project owners

In the past two years I've managed the OpenESB open source project. This project is hosted on java.net. In this post I outline a number of tips and tricks I've learned while managing this project on java.net. This is useful information for my successor, but it also may be useful to other people who are managing projects on java.net.

Web presence

An important factor in the success of an open source project is its web presence: it will be the first point of contact between a potential new user and the project.

New visitors will want to see answers to these four items immediately:

  • What is it?
  • License?
  • Getting started
  • Downloads

It is important that the links to these pages are static, i.e. that they do not change. That is because a new visitor may enter the site through a link on a different site or through a search engine. For this very reason, it is especially important to keep a single downloads page rather than a page per version, so that users do end up immediately on the latest version, rather than a previous (old) version.

Looking at the front page of OpenESB, these four items are immediately apparent, and the links are permanent. Three more items are important: a visitor will judge a project on whether it is

  • Alive
    • Active mailing list
    • Recent/frequent announcements
  • Professional
    • Well organized, thought through
    • Documentation easily found and comprehensive
  • Inviting
    • New users should feel welcome to not just use the product, but also to participate in the community.

You be the judge of how well these goals have been met. Let's look at the mechanics of java.net now.

openesbsite

java.net project web site

How java.net processes HTML files

The HTML files for a project's web site are stored in the project's VCS system in a directory called www. There's a big twist here: although the HTML files are ordinary HTML files, i.e. with an <HTML>, <HEAD> and <BODY> tag, the java.net web server reads the HTML files and does a number of substitutions on the file before it writes the resulting file to the client.

Conceptually, there is a java.net template HTML file which has its own <HEAD> element, its own <BODY> element with static content, and some placeholders in these two elements for:

  • some elements of the <HEAD> element of the user defined HTML file, e.g. <TITLE>
  • the contents of the project_tools.html file
  • the contents of the <BODY> element from the user defined HTML file

As such, the project owner has to perform special tricks to have the page displayed in other ways than the standard java.net way.
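The substitution described above can be pictured with a small sketch. This is not java.net's actual code: the placeholder names (@TITLE@, @PROJECT_TOOLS@, @BODY@), the class name, and the regex-based extraction are all assumptions made purely for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateMerge {
    // Case-insensitive, dot matches newlines: the user files use upper-case tags.
    private static final Pattern TITLE = Pattern.compile("(?is)<title>(.*?)</title>");
    private static final Pattern BODY = Pattern.compile("(?is)<body[^>]*>(.*?)</body>");

    /** Returns the first captured group, or an empty string if the element is absent. */
    static String extract(Pattern pattern, String html) {
        Matcher m = pattern.matcher(html);
        return m.find() ? m.group(1).trim() : "";
    }

    /** Fills the hypothetical template placeholders from the user's HTML file. */
    static String merge(String template, String projectTools, String userHtml) {
        return template
                .replace("@TITLE@", extract(TITLE, userHtml))
                .replace("@PROJECT_TOOLS@", projectTools)
                .replace("@BODY@", extract(BODY, userHtml));
    }

    public static void main(String[] args) {
        String template = "<html><head><title>@TITLE@</title></head>"
                + "<body>@PROJECT_TOOLS@<div>@BODY@</div></body></html>";
        String userHtml = "<HTML><HEAD><TITLE>Downloads</TITLE></HEAD>"
                + "<BODY><p>Get it here</p></BODY></HTML>";
        System.out.println(merge(template, "<script src=\"menu.js\"></script>", userHtml));
    }
}
```

The point of the sketch is that anything outside the user's <TITLE> and <BODY> is discarded, which is why custom styling has to be smuggled in through project_tools.html.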

In OpenESB we have several java.net projects. All of them should have the same look and feel, i.e. the OpenESB "brand" and a common navigation. The same goes for the OpenESB wiki.

How OpenESB organizes HTML files

Each project has the same project_tools.html file. This file takes care of the following:

  • It includes custom CSS files. These files override the java.net styles.
  • It hides the standard java.net navigation bar. For admins, the navbar display can be turned on: the state is maintained in a cookie. The value in the cookie can be toggled by loading the admin.html file in OpenESB: https://open-esb.dev.java.net/admin.html.
  • It loads the menu. The menu is defined in a separate JavaScript file.
  • It displays the search box.
  • It sets an event handler that is invoked when the page's loading is complete. The event handler manipulates the layout of the page and invokes Google Analytics.

As a result, the other HTML files in OpenESB or any of its sub projects don't have to bother with the common look and feel, the menu, etc. All of that is taken care of by project_tools.html, which as mentioned should be duplicated to all the OpenESB projects.

Some (or, if you will, many) HTML files in OpenESB have another common style element: a right-hand sidebar that displays common advertising for downloading GlassFish ESB, the wealth of components, and a partner highlight. This sidebar is taken care of by a separate JavaScript file. Files that should display the sidebar must be explicitly formatted to do so: each should contain a table in which the right column loads the JavaScript of the sidebar.

Previewing HTML

The fact that the look and feel of an HTML file is now defined in project_tools.html makes it difficult to see what a file will actually look like when deployed (i.e. checked into the project's VCS) on java.net. There is a workaround: I created a directory that, when deployed on a web server, emulates the java.net environment. To edit a file with the ability to preview it without checking in the file, follow this procedure:

  1. Load the file in your browser through java.net's web server, e.g. https://open-esb.dev.java.net/Downloads.html.
  2. Save this file to the emulation directory, e.g. Tomcat/webapps/ROOT/Downloads.html
  3. Now load the file into your browser through your local web server, e.g. http://localhost:9080/Downloads.html. This should look identical to what was loaded from java.net.
  4. Locate the text in the file to be edited. Change the text and save the file. Reload the file in your browser. Repeat this step until the file looks OK.
  5. Do a diff, using a tool like Araxis Merge or Beyond Compare, between the local file and the file that was checked into VCS, e.g. diff Tomcat/webapps/ROOT/Downloads.html and open-esb/www/Downloads.html.
  6. Merge the changes into the VCS file, i.e. open-esb/www/Downloads.html. Save and check into VCS.

Here is the directory for Tomcat. If you use it for projects other than OpenESB, you can remove many of the files. Make sure to leave the branding and css directories intact. If you intend to reuse project_tools.html, don't forget to change the Google Analytics account ID.

SSL

Unfortunately, java.net uses SSL for all its files. This is unfortunate for two reasons:

  • Files do not get cached in the browser: every time a user opens his browser and goes to OpenESB, all the images, style sheets, etc are reloaded again.
  • To avoid security warnings in Internet Explorer, files that are included in an HTML file also should come through SSL. This rule is violated on the OpenESB main page (of all places!) where it loads a dynamic list of news items from a non-https server.

There's no workaround for this. CollabNet promised a long time ago that SSL would be removed but this still has not happened.

Changing the menu

The menu is defined in a JSON-like format in a file called menu.js. This file is duplicated in the wiki (see below). Make sure to use absolute URLs because the file is used from different projects, i.e. different roots.

Wiki (not on java.net)

OpenESB is not using the java.net wiki infrastructure. I don't recall why this decision was made. New projects should probably evaluate the java.net wiki infrastructure before considering setting up a wiki elsewhere.

The OpenESB wiki is hosted on a machine at Sun/Oracle, not at CollabNet. It is collocated with other wikis, e.g. GlassFish, UpdateCenter, etc. The wiki engine is JSPWiki. The templates have been adapted to look and feel like the java.net OpenESB web site.

Since java.net has / used to have a lot of downtime, and to avoid having this downtime impact the wiki, the menu.js file is duplicated on the wiki. That java.net uses SSL is another reason to duplicate the file. For the same reason, the icons displayed in the menu are also duplicated on the wiki server.

Users are managed separately from java.net. User management / restrictions are necessary to avoid spam. Spam has been a big problem on the OpenESB wiki and the other wikis collocated with it. Cleaning up spam can be done by directly manipulating the text files that make up the wiki pages.

JSPWiki security is configured to use JAAS. This fact is important if the wiki needs to be set up on a different GlassFish server. In that case make sure to update the server.policy file of GlassFish. The policy file specifies that only "validated" users can edit pages. Users are stored in a user database and a group database. The latter specifies group membership, of which the "validated" group is important. Both files are in the etc directory of the wiki directory.

To add a user, the group database needs to be updated. This can be done through the web interface, http://wiki.open-esb.java.net/EditGroup.jsp?group=Validated or can be done by editing the group file directly.

The raw data files and user management files of the wiki can be accessed through SSH.

Since the user management of the wiki is separate from java.net, users can choose a different userid on the wiki than they do on java.net. Allowing this is a mistake in retrospect: editing the wiki requires an SCA (Sun Contributor Agreement) or employment at Sun/Oracle, and having to manage/audit two sets of users instead of one is a waste of time. Again, in retrospect.

openesbwiki

java.net user management

Anybody on java.net can request developer privileges. When that happens, an email is sent by java.net to the project owner. An automated process is set up that replies automatically to the user list asking the applicant for his motivation, background, etc. This reply is specified in role-approval.policy in the www directory.

For some reason, many people ask for developer privileges. They are not known, have not contributed or communicated before, and are never heard from again.

Consequently, this automated process sounds nice, but is useless in practice. A better process is that someone first communicates on the mailing list, proves he/she is serious and able to contribute, after which the project owner grants privileges. Before privileges are granted, a signed SCA needs to be on file. This process is documented on the OpenESB web site.

The link to the membership management page is displayed in the java.net nav bar. Recall that the nav bar is hidden. To toggle display of the nav bar, load https://open-esb.dev.java.net/admin.html. Alternatively, jump to the page directly: https://open-esb.dev.java.net/servlets/ProjectMemberList.

When a user is granted commit privileges to a project, he/she also automatically gets commit privileges to the code repositories of the sub projects. E.g. granting commit privileges to open-esb also gives commit privileges to open-jbi-components, a sub project of open-esb. This is only the case if the VCS of the child project is of the same type as the parent project. Since open-jbi-components and open-esb both use CVS, open-esb committers have access to open-jbi-components. However, if the VCS system is different, e.g. openesb-builds uses SVN, separate access to this project needs to be granted next to open-esb.

java.net email management

Email lists are managed by an email list manager running in java.net. This email list manager is only partially integrated in java.net. Users can click on a button on the java.net site to subscribe to an email list. What effectively happens is that it will register the email address that is associated with that user at that moment in the list server. Similarly, a user can unsubscribe through a button, which will remove the email address associated with that user at that moment. Where does this knowledge come in handy? There is confusion if the user changes the email address associated with his java.net id between subscribing and unsubscribing. Sometimes users get so confused that they'll need help unsubscribing from the list. After all, the last thing you want is repeated posts of "how do I unsubscribe from this list?".

Here is another task with respect to email: spam. Protecting the list from spam is very important. Therefore lists should be set up as "discuss" or "moderated". A setting of "discuss" means that only "subscribed" or "allowed" posters can post. If someone who is not in the "subscribed" or "allowed" list posts a message, the message will be forwarded to the list owner(s) for review. Replying to that message will allow the message to be posted. To avoid getting too many of these "review messages", frequent posters can be added to the "allowed" list. The "allowed" list is accessible from the email management page.

The link to the email list management page is displayed in the java.net nav bar. Recall that the nav bar is hidden. To toggle display of the nav bar, load https://open-esb.dev.java.net/admin.html. Alternatively, jump to the page directly: https://open-esb.dev.java.net/servlets/ProjectMailingListList.

Nabble is a view on the mailing list. OpenESB has integrated it on the web site. Note that posters are identified by their "from" email address only, but posters on Nabble somehow do not need to be in the "allowed" or "subscribed" list. This is fortunate, but puzzling from a technical point of view.

openesbemail

This brings me to one more email list related task: once in a while, someone posts an email that he/she wants to revoke. For example, someone could accidentally send a private email to the list, or include confidential information (customer information, for example) in an email.

Of course there's no such thing as undoing the sending of an email. Subscribed users will have received the email by the time that the sender notices the mistake. However, there is an option to remove the email from Nabble and from the java.net mail browser. Note that there are two places where the email needs to be removed: both Nabble and java.net. Removing the mail there at least makes the mistake less visible and will make it harder for the mistaken email to be picked up by search engines.

java.net automation

Some of the management tasks on java.net can be automated. Kohsuke has developed an extensive set of tools that provide a Java API to the java.net web interface: https://javanettasks.dev.java.net/.

I've written a few tools to make community management easier. One such tool is used to make an inventory of all the users in the community with commit privileges. Everybody in this list should either be a Sun/Oracle employee, or should have signed an SCA. The tool loads and parses the list of signatories (from https://sca.dev.java.net/CA_signatories.htm) and compares this list with the one extracted from java.net. Discrepancies are people who are currently or have been Sun/Oracle employees. Since it is difficult to track who is no longer an employee (especially with the layoffs that used to happen every now and then), the tool can email all "discrepancies" and ask them to either submit an SCA or to verify employment by clicking a link pointing to an internal server. These tools are available on https://open-esb-build.dev.java.net/svn/open-esb-build/trunk/community-mgt/.
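The core of such an audit is a set difference. The sketch below is not the actual community-mgt tool: it assumes both lists have already been reduced to plain user names (the real tool has to extract them from web pages first), and the class name, method names, and sample names are invented for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

class ScaAudit {
    /** Assumption: names are compared case-insensitively and ignoring surrounding whitespace. */
    private static String normalize(String name) {
        return name.trim().toLowerCase(Locale.ROOT);
    }

    /** Returns committers that do not appear in the list of SCA signatories. */
    static SortedSet<String> discrepancies(Collection<String> committers,
                                           Collection<String> signatories) {
        Set<String> signed = new HashSet<>();
        for (String s : signatories) {
            signed.add(normalize(s));
        }
        SortedSet<String> missing = new TreeSet<>();
        for (String c : committers) {
            if (!signed.contains(normalize(c))) {
                missing.add(normalize(c));
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        System.out.println(discrepancies(
                Arrays.asList("alice", "Bob", "carol"),
                Arrays.asList("bob ", "dave"))); // prints [alice, carol]
    }
}
```

Everyone in the resulting set is either an employee or someone who still owes an SCA, which is exactly the group the tool then emails.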

Wednesday, April 21, 2010

OpenESB under Oracle

It's been a few months since Oracle acquired Sun. At the time there were a lot of questions about what Oracle would do with OpenESB. On Feb 15, I posted a plan to the users mailing list. What was the plan? And how is the plan coming together? How is OpenESB doing a few months after the acquisition? What will the future hold for OpenESB?

First, let's try to understand Oracle's perspective. Oracle already has an integration product: the SOA Suite. Through the BEA acquisition it got another one, and through the Sun acquisition it got yet two more: CAPS and OpenESB. Of course there's no sense in keeping high levels of investments in all these products; it makes much more sense to focus on one product. What is this strategic product? Oracle is very clear about that: Oracle's strategic SOA middleware platform is the Oracle SOA Suite. Consequently, Oracle has reduced the level of investment in OpenESB.

That was the bad news. Now for the good news. Oracle could have just pulled out from OpenESB and let it fall into a black hole. Or worse, it could have taken the site down, removed the downloads, or put other obstacles in place. It did not. It did quite the contrary.

Although Sun was the main sponsor behind OpenESB, Oracle recognized that a lot of people made investments in OpenESB, either in the community or in their own company by using OpenESB. With a feeling of responsibility and fairness, Oracle decided to help the community find a way to stand on its own, so that all these investments would bear fruit for as long as possible.

Did you invest in OpenESB? Let's look at what will change for you depending on the kind of investment you made.

What will change for you

What will change for you if you bought GlassFish ESB support? Nothing will change: you are still fully supported. The same Sun SOA Support department is still available to help on issues. Should patches be created to address issues, the full Sun SOA Sustaining department is still there to create patches.

Are you using GlassFish ESB but did not buy support? Unlike Sun, Oracle is no longer trying to sell GlassFish ESB licenses to new customers. Instead, you may rely on community support, or commercial support provided by one of the OpenESB community partners. The user mailing list is nowadays less frequented by Sun/Oracle engineers, but other community members have stepped up and are keeping the mailing list responsive. More about that below.

Are you investing in OpenESB by contributing code or other artifacts, or are you considering doing so? You will find that it has become easier to contribute. I've put together a new governance document that gives greater freedom to community contributors. Oracle can still exert influence, but that is intended to keep the peace in the community should that be necessary. Overall, you will find it easier to propose and implement new changes, to commit code, and last but not least, to influence the roadmap. This brings us to the future of OpenESB.

The future of OpenESB

The future of OpenESB revolves around OpenESB becoming an open source community that can stand on its own, i.e. without Sun or Oracle as the single major sponsor. That transformation will of course not happen overnight, and Oracle is committed to helping with this transformation. For instance, Oracle will do periodic builds and post these on the downloads site. Another commitment is that Oracle will merge patches that it makes for customers back into the open source repository.

What did Oracle do so far? Next to the governance document I already mentioned, the sources of GlassFish ESB v2.2 can now be found in the open source repository. The HL7 BC and the WLM SE are also back in the open source repository, and the binaries can be downloaded from the downloads page. The process of an automatic periodic build has incurred some delays due to technical issues, but is well on its way.

OpenESB today

How is OpenESB doing today? Let's look at the users mailing list. From the Users mailing list on Markmail, it can be seen that the list activity has declined a bit, but is not much below that of a year ago.

emailactivity

What we also can learn from Markmail is that community members who are not on the Oracle payroll are stepping up. For example, here are the most active posters for March:

emailactivity2

Another metric that we can look at is the number of users. In GlassFish ESB v2.1 we introduced a feature in which NetBeans checks for updates upon startup. By looking at the traffic to the updates-server, we can estimate how many users there are of GlassFish ESB. I defined the number of users as the number of NetBeans installations that ping the server at least three times in a time window greater than five days. As can be seen, the number of users is going up.
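To make the definition concrete, here is a minimal sketch of the counting rule. It reflects one reading of the definition (at least three pings, with more than five days between the first and the last one); the installation IDs, class name, and sample data are hypothetical, and the real analysis worked on the update server's traffic logs.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ActiveUserEstimate {
    static final int MIN_PINGS = 3;
    static final Duration MIN_SPAN = Duration.ofDays(5);

    /** An installation counts as a user if it pinged often enough over a long enough period. */
    static boolean isUser(List<Instant> pings) {
        if (pings.size() < MIN_PINGS) {
            return false;
        }
        Duration span = Duration.between(Collections.min(pings), Collections.max(pings));
        return span.compareTo(MIN_SPAN) > 0;
    }

    /** Counts the installations that satisfy isUser(). */
    static long countUsers(Map<String, List<Instant>> pingsByInstallation) {
        return pingsByInstallation.values().stream().filter(ActiveUserEstimate::isUser).count();
    }

    /** Made-up sample data: only install-a has three pings spread over more than five days. */
    static Map<String, List<Instant>> sample() {
        Instant day0 = Instant.parse("2010-03-01T00:00:00Z");
        Map<String, List<Instant>> pings = new HashMap<>();
        pings.put("install-a", Arrays.asList(day0, day0.plus(Duration.ofDays(3)), day0.plus(Duration.ofDays(7))));
        pings.put("install-b", Arrays.asList(day0, day0.plus(Duration.ofDays(1)), day0.plus(Duration.ofDays(2))));
        pings.put("install-c", Arrays.asList(day0, day0.plus(Duration.ofDays(10))));
        return pings;
    }

    public static void main(String[] args) {
        System.out.println(countUsers(sample())); // prints 1
    }
}
```

The two thresholds filter out people who installed the product, tried it once or twice, and never came back.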

useractivty

In terms of a roadmap there are no concrete proposals from the community yet, but several members have expressed interest in continuing with Fuji.

More good news: a few new committers have joined the project and are contributing code.

Moving on… next stop: Google

I've decided it's time to move on. Yesterday I put in my resignation at Oracle. I'll be starting at Google in two weeks.

Why? Was Oracle such a bad place to work? Was it such a culture shock for a Sun employee that it just could not work? Could I not stand my manager? Could I not get along with my new co-workers?

If you're expecting a bitching session about big-bad-Oracle, I have to disappoint you. In the months that I've been at Oracle, I've come to the conclusion that it is a pretty decent company to work for. I like my manager, my co-workers are intelligent and pleasant, and the culture at Oracle is very similar to what it was at Sun. So, honestly, no complaints there.

The reason for my leaving is that I've been doing SOA for 7+ years now. I've been involved in the complete lifecycle of two products: CAPS and OpenESB. Now, with my transition into Oracle, I was about to commit to a third SOA product. Before I dove into that, I did some soul searching, and realized that I was ready for something new, something in a different field.

As I was looking for a change, I looked at what interesting companies there are here in Orange County. Google and Amazon came to mind. Google is said to be the engineering Valhalla, so I submitted my resume there. And surprise, surprise, I got an offer.

It's a big jump. It's starting from the bottom again… jumping into something brand new. Will I like the change? I'm eager to find out!

feelinglucky

Thursday, February 25, 2010

I got a patent

Today I got a pleasant surprise when I got into the office: I found a big box on my desk. It turns out it was a nice engraved plaque of my patent on "Protected Concurrent FIFO" message processing.

Image1

The funny thing is that I didn't even know that the patent was awarded. After the patent application in 2004 or 2005, I don't even remember when it was, I lost track of its status. Emails to the patent lawyer went unanswered.

I googled for the text of the actual patent (US7644197 Queue management by multiple processors) and found a PDF which I copied to Mediacast.

A thank-you to Jerry Waldorf (my boss during the SeeBeyond days and co-applicant of the patent) for asking me to work on this subject, and a thank-you to Sun for the plaque!

By the way, I have often said that almost all software patents are nonsense and inhibit progress rather than promote it. I still feel this way. Does that make me a hypocrite? Not necessarily: companies are forced to submit patent applications so that they can defend themselves against patent litigation from competitors. In that sense I'm happy that Sun got another patent. And that I'm happy that my name is on it: well, yes, I guess that's an ego-thing.

Friday, January 15, 2010

How to enlist and delist XA resources: trial and error

Once in a while a problem pops up with how a middleware stack supports XA. The problem I'm discussing here is exactly how to enlist and delist XAResource-s with isSameRM() returning true. The XA spec is vague on this very topic, so this warrants some exploration.

First of all, what do I mean by "enlist and delist XAResource-s with isSameRM() returning true"?

Enlisting and delisting

When using an XA capable system, the connection needs to be enlisted in the transaction before it is used. A typical sequence on the XAResource, let's call it r1, is:

    r1.start(xid, TMNOFLAGS);
    ... // use r1
    r1.end(xid, TMSUCCESS);
    r1.commit(xid, true);

With a second resource in the mix, the call sequence may become:

    r1.start(xid1, TMNOFLAGS);
    ... // use r1
    r2.start(xid2, TMNOFLAGS);
    ... // use r2
    r2.end(xid2, TMSUCCESS);
    r1.end(xid1, TMSUCCESS);
    r1.prepare(xid1);
    r2.prepare(xid2);
    r1.commit(xid1, false);
    r2.commit(xid2, false);

The transaction is now escalated to a full two-phase transaction. But if r1 and r2 represent the same resource manager, a shortcut can be taken: one resource can "piggy back ride" on the other resource. The isSameRM() method can be used to check if this should be attempted. If it returns true, the sequence may become:

   1: // MULTIPLE ACTIVE
   2: r1.start(xid, TMNOFLAGS);
   3: r1.isSameRM(r2); // returns true
   4: r2.start(xid, TMJOIN);
   5: r2.end(xid, TMSUCCESS);
   6: r1.end(xid, TMSUCCESS);
   7: r1.commit(xid, true);

The transaction now again is a single-phase transaction, and the performance difference with the two-phase commit case is usually very significant.

This sequence doesn't work for a number of popular systems, e.g. MQSeries and Oracle. Unfortunately, the XA specification is not at all clear about what resource managers should be able to do, so some trial and error is involved.

Trial and error

How should isSameRM() be used then? By testing a number of different systems, it turns out that for some systems, there can be only one enlisted XAResource active in the transaction. If a second XAResource joins the transaction, the first one should be deactivated. Here is an example:

   1: // SINGLE ACTIVE 1
   2: r1.start(xid, TMNOFLAGS);
   3: r1.isSameRM(r2); // should return true
   4: ... // use r1
   5: r1.end(xid, TMSUSPEND);
   6:  
   7: r2.start(xid, TMJOIN);
   8: ... // use r2
   9: r2.end(xid, TMSUCCESS);
  10:  
  11: r1.start(xid, TMRESUME);
  12: ... // use r1 again
  13: r1.end(xid, TMSUCCESS);
  14:  
  15: r1.commit(xid, true);

A variation of this is to end r1 with TMSUCCESS instead of TMSUSPEND. In that case r1 should rejoin with TMJOIN instead of TMRESUME:

   1: // SINGLE ACTIVE 2
   2: r1.start(xid, TMNOFLAGS);
   3: r1.isSameRM(r2); // should return true
   4: ... // use r1
   5: r1.end(xid, TMSUCCESS);
   6:  
   7: r2.start(xid, TMJOIN);
   8: ... // use r2
   9: r2.end(xid, TMSUCCESS);
  10:  
  11: r1.start(xid, TMJOIN);
  12: ... // use r1 again
  13: r1.end(xid, TMSUCCESS);
  14:  
  15: r1.commit(xid, true);

I ran these tests on a number of systems, and here is what I found (line numbers refer to the numbered listings above):

System           isSameRM()  Test: multiple active  Test: single active 1  Test: single active 2
STCMS            yes         yes                    yes                    yes
JMQ              as of 4.4   yes                    no: throws on line 7   yes
WebSphereMQ 6    yes         no: blocks on line 4   no: blocks on line 7   yes
Derby 10.5.3.0   yes         no: blocks on line 4   yes                    yes
Oracle 11.1.0.6  yes         no: blocks on line 4   yes                    yes
MySQL 5.1        yes         no: throws on line 4   no: throws on line 5   no: throws on line 7

As can be seen, the SINGLE ACTIVE 2 code sequence works best.
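A probe along these lines can be written against the standard javax.transaction.xa interfaces. The sketch below is not the test code used for the results above: the class name, the step numbering (which counts XA calls only, so it does not match the listing's line numbers), and the fake() stand-in resource are assumptions for illustration. It also only detects calls that throw; a resource manager that blocks would need a timeout around each step.

```java
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

class SingleActive2Probe {
    /** One XA call, so that a failure can be reported by step number. */
    interface Step { void run() throws XAException; }

    /** Runs the SINGLE ACTIVE 2 sequence; returns "ok" or the step that threw. */
    static String probe(XAResource r1, XAResource r2, Xid xid) {
        List<Step> steps = new ArrayList<>();
        steps.add(() -> r1.start(xid, XAResource.TMNOFLAGS));
        steps.add(() -> { if (!r1.isSameRM(r2)) throw new XAException("not the same RM"); });
        steps.add(() -> r1.end(xid, XAResource.TMSUCCESS));
        steps.add(() -> r2.start(xid, XAResource.TMJOIN));
        steps.add(() -> r2.end(xid, XAResource.TMSUCCESS));
        steps.add(() -> r1.start(xid, XAResource.TMJOIN));
        steps.add(() -> r1.end(xid, XAResource.TMSUCCESS));
        steps.add(() -> r1.commit(xid, true));
        for (int i = 0; i < steps.size(); i++) {
            try {
                steps.get(i).run();
            } catch (XAException e) {
                return "throws at step " + (i + 1);
            }
        }
        return "ok";
    }

    /** Hypothetical in-memory stand-in for a driver's XAResource; can be told to reject TMJOIN. */
    static XAResource fake(boolean supportsJoin) {
        return (XAResource) Proxy.newProxyInstance(
                SingleActive2Probe.class.getClassLoader(), new Class<?>[] {XAResource.class},
                (proxy, method, args) -> {
                    if (method.getName().equals("start") && !supportsJoin
                            && (Integer) args[1] == XAResource.TMJOIN) {
                        throw new XAException(XAException.XAER_INVAL);
                    }
                    if (method.getReturnType() == boolean.class) return Boolean.TRUE;
                    if (method.getReturnType() == int.class) return 0;
                    return null;
                });
    }

    /** Minimal Xid for the demo; a real transaction manager supplies its own. */
    static Xid demoXid() {
        return new Xid() {
            public int getFormatId() { return 1; }
            public byte[] getGlobalTransactionId() { return new byte[] {1}; }
            public byte[] getBranchQualifier() { return new byte[] {1}; }
        };
    }

    public static void main(String[] args) {
        System.out.println(probe(fake(true), fake(true), demoXid()));   // prints ok
        System.out.println(probe(fake(true), fake(false), demoXid()));  // prints throws at step 4
    }
}
```

Against a real system, r1 and r2 would come from two XAConnection objects of the same driver instead of fake().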

As a side note, MySQL is showing surprising behavior: TMJOIN and TMSUSPEND are not supported (as documented in the MySQL documentation), so why does it bother to return true on isSameRM()? Behavior like this makes it difficult to write portable code: a container now has to provide a wrapper around the DataSource that corrects for this. It would have been much better if it simply had returned false on isSameRM().
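Such a corrective wrapper could look roughly like the sketch below. It is an illustration, not code from any particular container: the class and method names are invented, and it uses a dynamic proxy to forward every XAResource call except isSameRM(), which always answers false so that the transaction manager never attempts TMJOIN.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;

class NeverSameRm {
    /** Wraps a resource so that isSameRM() always returns false; other calls are forwarded. */
    static XAResource wrap(XAResource delegate) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("isSameRM")) {
                return Boolean.FALSE; // never invite the transaction manager to use TMJOIN
            }
            try {
                return method.invoke(delegate, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the delegate's original XAException
            }
        };
        return (XAResource) Proxy.newProxyInstance(
                NeverSameRm.class.getClassLoader(), new Class<?>[] {XAResource.class}, handler);
    }

    /** Hypothetical stand-in for a driver resource that claims every resource is the same RM. */
    static XAResource stubClaimingSameRm() {
        InvocationHandler handler = (proxy, method, args) -> {
            Class<?> type = method.getReturnType();
            if (type == boolean.class) return Boolean.TRUE;
            if (type == int.class) return 0;
            return null;
        };
        return (XAResource) Proxy.newProxyInstance(
                NeverSameRm.class.getClassLoader(), new Class<?>[] {XAResource.class}, handler);
    }

    /** Convenience for the demo: isSameRM() without the checked exception. */
    static boolean sameRm(XAResource a, XAResource b) {
        try {
            return a.isSameRM(b);
        } catch (XAException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        XAResource raw = stubClaimingSameRm();
        System.out.println("raw: " + sameRm(raw, raw));           // prints raw: true
        System.out.println("wrapped: " + sameRm(wrap(raw), raw)); // prints wrapped: false
    }
}
```

The cost of this correction is that two connections to the same database are always treated as separate branches, so the single-phase shortcut is lost for that driver.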

How does this relate to what application code can do in for instance a Java EE container? That's the topic of a different blog post.

Tuesday, January 12, 2010

Regex editor now on code.google.com

By popular demand, the sources of the Regular Expression Editor I discussed in a previous post are now available on code.google.com.

Book "SOA with Java": Rough cut now online

image

A few months ago I got a request from Satadru Roy, one of the authors of the book "SOA with Java", to write a section on Open Source ESBs for this book.

This was a good opportunity to advertise OpenESB, so I eagerly said yes. Satadru suggested that a practical example should be the core of the chapter, so I sat down with two of my colleagues, Murali Pottlapelli and Sujit Biswas, to discuss what the example would look like.

The result is a 17 page chapter that gives an overview of OpenESB. The book is now available for review on Safari. This link will probably be invalidated when the review period ends, but perhaps the link to the book itself is more permanent (the book is also on Amazon).

Tuesday, September 22, 2009

A SEED certificate

With the Oracle acquisition looming on the horizon, and the uncertainty that comes with it, it feels like many organizations within Sun are bringing projects into a state in which they can be transitioned in a clean way. Transitioned meaning adopted or abandoned.

I also find myself doing this: for instance I'm trying to put the finishing touches on JMSJCA, and trying to put Hulp in a state where other people can use it, etc.

Maybe it was in that light that a few days ago, I got a certificate from the SEED organization within Sun. SEED is a mentoring program within Sun in which I had the good fortune to participate. I also was a SEED mentor. Or perhaps I should let the certificate speak for itself.

image

Monday, September 21, 2009

Lessons of an interesting deadlock problem

Two years ago I wrote a blog entry about Nested Diagnostics Contexts in GlassFish. The approach was based on custom Logger objects. Now, two years later when running GlassFish on AIX, an issue with that turned up: a deadlock. Digging into it, there are two lessons I'd like to share: one about deadlocks, and the other one about workarounds.

Deadlock in the JDK

At the time I wrote the NDC facility, calling addLogger() and getLogger() from different threads could cause a deadlock in the JDK classes. Why is this?

This is how the code in the Sun JDK was:

public class Logger {
...
    public static synchronized Logger getLogger(String name) {
        LogManager manager = LogManager.getLogManager();
        Logger result = manager.getLogger(name);
        if (result == null) {
            result = new Logger(name, null);
            manager.addLogger(result);
            result = manager.getLogger(name);
        }
        return result;
    }
}
public class LogManager {
...
    public synchronized boolean addLogger(Logger logger) {
...
        Logger plogger = Logger.getLogger(pname);
    }
...
}

There is a problem if there are two threads calling these two methods. Say Thread 1 calls Logger.getLogger() in its application code, e.g.

public void doSomething() {
  Logger.getLogger("com.stc.test").info("doSomething!");
}

Now consider a second thread, Thread 2, that tries to register a custom logger using addLogger(), e.g.

LogManager.getLogManager().addLogger(new Logger() {
...
});

There are two locks that come into the picture: one is the lock on Logger.class, and the other one is the lock on LogManager.getLogManager(). Thread 1 simply calls Logger.getLogger(), which first locks Logger.class and, while holding this lock, calls LogManager.addLogger(). This call acquires the lock on LogManager.getLogManager(). addLogger() in turn calls Logger.getLogger(), which locks Logger.class again; that is no problem because Thread 1 already holds that lock. Now consider Thread 2: it will first lock LogManager.getLogManager() through its call to addLogger(), and while holding this lock, it will try to lock Logger.class through its call to Logger.getLogger().

So Thread 1 locks first Logger.class (call it A) and then LogManager.getLogManager() (call it B), while Thread 2 locks first B and then A. A classic deadlock situation.

This is clearly a bug in the JDK. Knowing about this bug, we as the developers of the custom logger can use a simple workaround: before calling addLogger(), first lock Logger.class. By doing that, we can guarantee that the locks are always acquired in the same order: first Logger.class and then LogManager.getLogManager().
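As a minimal sketch of that workaround: the class NdcLogger and the logger names below are hypothetical and only stand in for the real custom logger; the only JDK calls used are Logger's protected constructor and LogManager.addLogger().

```java
import java.util.logging.LogManager;
import java.util.logging.Logger;

public class LoggerRegistration {

    /** Hypothetical custom logger. Logger's constructor is protected, so a subclass is needed. */
    public static class NdcLogger extends Logger {
        public NdcLogger(String name) {
            super(name, null);
        }
    }

    /**
     * Registers the logger while holding the lock on Logger.class, so that this thread
     * takes the locks in the same order as Logger.getLogger() does in the affected JDKs:
     * first Logger.class, then the LogManager instance.
     */
    public static boolean register(Logger logger) {
        synchronized (Logger.class) {
            return LogManager.getLogManager().addLogger(logger);
        }
    }

    public static void main(String[] args) {
        // Keep a strong reference: the LogManager only holds loggers weakly.
        Logger custom = new NdcLogger("com.stc.test.ndc");
        System.out.println("registered: " + register(custom));
        custom.info("doSomething!");
    }
}
```

On a JDK that contains the fix the extra lock is harmless, because it is only held for the duration of the addLogger() call.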

This bug was actually reported by a customer, see bug report, and was fixed in 6u11 and 5.0u18. The LogManager now no longer calls Logger.getLogger(), so there's no locking of Logger.class anymore.

Deadlock in GlassFish

GlassFish installs its own LogManager, and there's some locking going on there too. It has an internal lock which it locks in addLogger():

    public boolean addLogger(Logger logger) {
        ...
        super.addLogger(logger);
        ...
        synchronized (_unInitializedLoggers) {
            ...
            doInitializeLogger(logger);
        }
    }

And this calls Logger.getLogger() again.

    protected void doInitializeLogger(Logger logger) {
        ...
        Logger newLogger =
            Logger.getLogger(logger.getName(),
                getLoggerResourceBundleName(logger.getName()));
        ...
    }

As the developer of the custom logger, we can still use the same workaround: first lock Logger.class before calling addLogger(), so that the locking sequence becomes: Logger.class, a transient lock on LogManager.getLogManager(), then a lock on _unInitializedLoggers, and finally Logger.class again.

A fix in GlassFish would simply make addLogger() synchronized. An alternative fix is to make the synchronized block smaller: it should not extend over doInitializeLogger().
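The second option, a smaller synchronized block, can be illustrated with a generic sketch. This is not the GlassFish code: the class PendingRegistry, its methods and the callback are all hypothetical, and only show the pattern of collecting work under the lock and calling out after the lock has been released.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class PendingRegistry {
    private final List<String> pending = new ArrayList<>();
    private final Consumer<String> initializer;

    /** The initializer is foreign code that may take locks of its own. */
    public PendingRegistry(Consumer<String> initializer) {
        this.initializer = initializer;
    }

    public void add(String name) {
        synchronized (pending) {
            pending.add(name);
        }
    }

    /** Exposed only so the sketch can show that the lock is not held during callbacks. */
    public boolean lockHeldByCurrentThread() {
        return Thread.holdsLock(pending);
    }

    /** Drains the pending list under the lock, then initializes each entry without it. */
    public int initializeAll() {
        List<String> snapshot;
        synchronized (pending) {
            snapshot = new ArrayList<>(pending);
            pending.clear();
        }
        for (String name : snapshot) {
            initializer.accept(name); // no lock held while calling out
        }
        return snapshot.size();
    }

    public static void main(String[] args) {
        PendingRegistry registry = new PendingRegistry(
                name -> System.out.println("initializing " + name));
        registry.add("com.stc.test");
        registry.initializeAll();
    }
}
```

Because the callback runs without the internal lock, it can no longer take part in a lock-order inversion with whatever locks the called code acquires.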

Deadlock with the IBM JDK

All was well until this was run on the IBM JDK. As it turns out, Logger.getLogger() in that JDK doesn't lock Logger.class; it locks LogManager.getLogManager(). Now there is a new deadlock: Thread 1 calls Logger.getLogger() and by doing so locks LogManager.getLogManager(), then tries to lock _unInitializedLoggers, and later tries to lock LogManager.getLogManager() again. Thread 2 calls addLogger() and by doing so first locks _unInitializedLoggers, and later tries to lock LogManager.getLogManager(). The problem is back: the sequence of locking is reversed.

Lessons learned

The first lesson is that of the deadlock itself: one should be extremely careful when locking an object and then calling into another object, especially if that object is meant to be a base class in a public API. Both Logger and LogManager may be overridden, and are clearly very public.

The second lesson is that instead of building workarounds, try to get the original bug fixed. Or at least try to have the original issue addressed while adding a workaround. In my case, I never filed a ticket against the JDK or against GlassFish. I should have known better.

... or not?

At this very moment, GlassFish v2.1.1 is about to be shipped, and we need GlassFish v2.1.1 on AIX for the upcoming CAPS release. A bug fix in GlassFish v2.1.1 is probably not going to get in. So I'm making yet another workaround: don't use a custom logger at all, so as to avoid the whole addLogger() problem. Instead, use standard Logger objects, but provide a custom filter that manipulates the log record. This will work on unpatched versions of GlassFish as well as on unpatched versions of the JDK. Still, I hope the GlassFish problem will be fixed.
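A minimal sketch of the filter approach, assuming the diagnostic context is kept in a thread-local string; the class NdcFilter and its setContext() and clearContext() helpers are hypothetical, and the actual NDC implementation may look quite different.

```java
import java.util.logging.Filter;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class NdcFilter implements Filter {
    // Hypothetical per-thread diagnostic context
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<String>();

    public static void setContext(String context) {
        CONTEXT.set(context);
    }

    public static void clearContext() {
        CONTEXT.remove();
    }

    /** Never rejects a record; it only prefixes the message with the current context. */
    @Override
    public boolean isLoggable(LogRecord record) {
        String context = CONTEXT.get();
        if (context != null) {
            record.setMessage("[" + context + "] " + record.getMessage());
        }
        return true;
    }

    public static void main(String[] args) {
        // A standard Logger: no subclass, hence no call to addLogger() from our code.
        Logger logger = Logger.getLogger("com.stc.test");
        logger.setFilter(new NdcFilter());
        setContext("order-42");
        try {
            logger.info("doSomething!");
        } finally {
            clearContext();
        }
    }
}
```

The filter is attached per logger with setFilter(), so this sketch only affects loggers that have been given the filter explicitly.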

Friday, August 28, 2009

JavaOne 2009 TS-5341: Rethinking the ESB: lessons learned challenging the limitations and pitfalls – audio recording uploaded

image

At JavaOne 2009 I gave a number of presentations. One was "Rethinking the ESB: lessons learned challenging the limitations and pitfalls" which I did with Andi Egloff. JavaOne sessions were not recorded this year, at least not by the JavaOne organization. But I used an mp3 player to record the sessions I was involved in. I finally processed the audio and uploaded it. I also uploaded a PDF of the presentation:

Thursday, August 6, 2009

Message interceptors with JMS

Interceptors are a way to add behavior to a system without directly invoking this behavior from code. The typical anti-example is that if you wanted to log the entry and exit of methods on a class, you could of course do that by adding log statements to each and every method. The downside is that you would have to change each and every method, and that all these methods would then contain repetitive code. With the interceptor approach, on the other hand, you would add an interceptor that is invoked before and after the method is invoked. In the code of the interceptor you would then add the log statement. The advantage is that the existing methods are not changed, that repetitive code is avoided, and that the logging concern is concentrated in one part of the code base rather than spread all over.

I think that this logging example is pretty dumb, but I guess it drives home the message of adding behavior non-invasively, and a separation of concerns. I'll discuss some more interesting examples in a minute.

Interceptors in EE

Interceptors have long been the domain of AOP (Aspect Oriented Programming) packages that add interceptors through byte code manipulation, either at runtime or as a post-compile step. For EE developers, things changed with the advent of EE5 in which interceptors were added to EJBs.

An interceptor can be added to all the methods or individual methods of an EJB by adding an annotation to either the class or to individual methods. In the following example, the NoConcurrency interceptor is invoked when the getUnclaimedAccounts() method is called.

@Interceptors(NoConcurrency.class)
public void getUnclaimedAccounts(String category) {...

The interceptor class is a simple Java class that has to have a method with the following signature:

@AroundInvoke
Object <METHOD>(javax.interceptor.InvocationContext) throws Exception

The interceptor method is called when a client calls an EJB method. The interceptor typically calls InvocationContext.proceed(). This will call into the EJB method (or the next interceptor, should there be one).

Interceptor classes are typically packaged as part of an EAR file: they are part of the application.

Interceptors are referenced in an EJB using the @Interceptors annotation. They can also be referenced from the EJB's deployment descriptor. In either case, the application needs to be modified to specify interceptors.

Another limitation of EJB interceptors is that they only work on EJB methods. When common behavior needs to be added to other interactions, for instance if an EJB invokes something like an outbound resource adapter, a different mechanism must be used.

Interceptors in CAPS

A Java CAPS customer had an interesting case for using interceptors. The customer had hundreds of integration applications that all received and sent JMS messages. If a message cannot be processed, the message is sent to an error handling system using JMSJCA's dead letter queue facility. An operator will monitor the dead letter queue, and may resubmit the message. If a resubmitted message is processed successfully, the error handling system needs to be notified of that fact. One way of adding this behavior would be to add a chunk of code to all MDBs. That would of course lead to a lot of duplicated code. It also leads to mixing this system level behavior with the business logic in the integration applications. This is clearly undesirable.

Because Repository based projects in Java CAPS generate all the MDB code and assemble the applications, it is difficult to make use of EJB interceptors. What is needed is a way to specify the use of interceptors on a global basis, that is outside of the applications, and preferably for all applications at the same time.

Another requirement is that the customer wanted to add common behavior not just when messages are received, but also when messages were sent from an EJB. Upon sending a message, payload validation should happen. Since EJB interceptors can only be used for EJB methods, this was another reason that EJB interceptors could not be used.

The solution was to add a new feature to JMSJCA to support interceptors.

Interceptors in JMSJCA

Interceptors in JMSJCA are invoked not only before / after a message is delivered to the MDB, but also any time a message is sent.

Rather than introducing a new Java interface, which would bring with it the complication of compile time dependencies on JMSJCA jars, the JMSJCA interceptors use the same annotations as in EJB3: any class with a method with the right signature and the @AroundInvoke annotation can serve as an interceptor.
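How such an annotation-based lookup might work can be sketched with plain reflection. How JMSJCA actually finds the method is not shown in this post, so everything below is an assumption: the nested AroundInvoke annotation is a stand-in for javax.interceptor.AroundInvoke so that the sketch compiles without EE jars, the context parameter is a plain Object instead of javax.interceptor.InvocationContext, and matching the annotation by name is just one way to avoid a compile time dependency.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class InterceptorScan {

    /** Stand-in for javax.interceptor.AroundInvoke (assumption, see lead-in). */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface AroundInvoke {
    }

    /** Hypothetical interceptor: it implements no interface, it only has an annotated method. */
    public static class AuditInterceptor {
        @AroundInvoke
        public Object audit(Object context) throws Exception {
            return "audited:" + context;
        }
    }

    /** Returns the first public one-argument method with an annotation named AroundInvoke, or null. */
    public static Method findAroundInvoke(Class<?> candidate) {
        for (Method m : candidate.getMethods()) {
            for (Annotation a : m.getAnnotations()) {
                if (a.annotationType().getSimpleName().equals("AroundInvoke")
                        && m.getParameterTypes().length == 1) {
                    return m;
                }
            }
        }
        return null;
    }

    /** Invokes the interceptor method reflectively, wrapping reflection failures. */
    public static Object call(Object interceptor, Object context) {
        Method m = findAroundInvoke(interceptor.getClass());
        if (m == null) {
            throw new IllegalArgumentException("No @AroundInvoke method on " + interceptor.getClass());
        }
        try {
            return m.invoke(interceptor, context);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(call(new AuditInterceptor(), "message-1"));
    }
}
```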

Which classes will JMSJCA consider as interceptors? For this, it uses the service lookup convention that JDK 6 exposes through java.util.ServiceLoader. This is done without introducing dependencies on JDK 6, by the way. With this mechanism, a jar that holds the interceptor should contain a special file (a provider-configuration file, not to be confused with MANIFEST.MF) that lists the class names of the interceptors. The name of this file should be META-INF/services/myinterceptor where myinterceptor is the service name of the interceptor. This name needs to be specified in the JMSJCA configuration. If no name is specified, the default name jmsjca.interceptor is used. This means that interceptors using the service name jmsjca.interceptor are automatically loaded and invoked if not overridden in the JMSJCA configuration.
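To make the file format concrete, here is a small sketch that reads the content of such a services file the way the JDK convention defines it: one class name per line, '#' starts a comment, blank lines are ignored. The class ServiceFileParser and the interceptor class names are made up for illustration. In a real deployment the content would come from resources such as META-INF/services/jmsjca.interceptor found through ClassLoader.getResources(); whether JMSJCA reads it exactly this way is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ServiceFileParser {

    /** Returns the class names listed in the given services file content. */
    public static List<String> parse(String content) {
        List<String> names = new ArrayList<String>();
        try (BufferedReader reader = new BufferedReader(new StringReader(content))) {
            String line;
            while ((line = reader.readLine()) != null) {
                int hash = line.indexOf('#'); // '#' starts a comment
                if (hash >= 0) {
                    line = line.substring(0, hash);
                }
                line = line.trim();
                if (!line.isEmpty()) {
                    names.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical content of META-INF/services/jmsjca.interceptor
        String content = "# interceptors for all applications\n"
                + "com.example.ResubmitNotifier\n"
                + "com.example.PayloadValidator   # runs on send\n";
        System.out.println(parse(content));
    }
}
```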

With this, it's easy to add interceptors to all applications running in the application server, whenever they send or receive messages through JMSJCA, without having to change any application.

For more information, see http://jmsjca.dev.java.net/ .

Sunday, March 22, 2009

The "java.lang.LinkageError: loader constraint violation" demystified

My colleague Murali Pottlapelli ran into an interesting problem the other day. He added Rhino to his BPEL Service Engine, and saw this error happen:

java.lang.LinkageError: loader constraint violation: loader (instance of <bootloader>) previously initiated loading for a different type with name "org/w3c/dom/UserDataHandler"

The weird thing was that this exception was thrown from a call to Class.getMethods() on a class shipped with the JVM!

Googling this problem revealed that a lot of people are running into this issue, often when using OSGi. Most search results referred to mailing list postings where people ran into this problem. None of the web pages properly explained what the problem was. Intuitively I felt we could solve our problem by moving a jar to a different classloader, but was that merely hiding the problem? As with my post on "How to fix the dreaded "java.lang.OutOfMemoryError: PermGen space" exception (classloader leaks)", I was convinced that understanding the problem is key. So Murali and I set out to dig deeper into this problem until we completely grasped it.

As it turns out, there are some aspects about this problem that make it very confusing:

  1. In a dynamic component system, a change in one component may cause the other components to fail in areas that used to work properly before. At the same time, the changed component appears to be working properly.
  2. The order in which components are activated determines where and how the problem shows up.
  3. The effects of the problem may show up in innocuous and seemingly unrelated calls such as Class.getMethods().

In the following sections I'll first illustrate the problem, and then explain in detail what's causing the problem.

The problem

Let's look at a model example. Instead of looking at an OSGi example or a JBI example, let's look at EE because it will be more familiar. Imagine we have an EAR file with an EJB and two WAR files. The WAR files are identical, and have a servlet that uses an EJB to log in. As such we have three classes:

public static class User {
}

public static class LoginEJB {

  static {
    System.out.println("LoginEJB loaded");
  }

  public static void login(User u) {
  }
}

public static class Servlet {

  public static void doGet() {
    User u = new User();
    System.out.println("User in " + u.getClass().getClassLoader());
    LoginEJB.login(u);
  }
}

Let's say that one WAR is configured with a self-first classloader, and the other one uses the default parent-first class loading model.

image

Now consider these three scenarios:

  1. We log in using the parent-first servlet, and then inspect the EJB with Class.getMethods(). Everything works as expected, but when we then try to log in on the second servlet, we see the linkage error.
  2. We log in using the self-first servlet. Then when we call Class.getMethods() on the EJB, this fails. Also, we can no longer log in on the parent-first servlet!
  3. We first call Class.getMethods() on the EJB. We can no longer log in using the self-first servlet, but the parent-first servlet still works.

What is going on? To explain, let's first revisit some classloader basics. If parent-first and self-first is in your daily vocabulary, feel free to skip the next section.

Self-first versus parent-first delegation

What is meant by self-first delegating classloaders? Here's the skinny on classloaders. In Java you can create your own classloader for two reasons. First, it allows multiple versions of the same class to co-exist in memory, as is often found in OSGi. Second, it allows classes to be unloaded, as is found in application servers. A classloader typically represents a set of jars that make up the module, the component, or the application. Each classloader must have a parent classloader. Hence, classloaders form a tree with the bootstrap classloader as the root. See the picture above.

When a classloader is asked to load a class, it can first ask its parent to load the class. If the parent fails to load the class, that classloader will then try to load the class. In this scheme, called parent-first class loading, common classes are always loaded by the parent classloader. This allows one application or module to talk to another application or module in the same VM.

Instead of asking the parent classloader first, a classloader can also try to find a class itself first, and only if it cannot find the class would it ask the parent classloader to find the class. A self-first classloader allows for an application to have a different version of a class than found in the parent classloader.

Classloader lab

To show what's going on, I've developed a small demo that emulates the scenario with the two WARs and the EJB. Key in this demo is a custom classloader. The constructor takes a list of classes that should be defined by that classloader, i.e. the classloader behaves as self-first for those classes, and delegates to the parent classloader for the other classes. The custom classloader is listed in the code at the bottom of this post.

This is how the system is set up: a classloader for the EJB that loads the LoginEJB and the User class, a classloader for the parent-first WAR that loads the Servlet only, and a self-first classloader that loads the Servlet and the User class.

CustomCL ejbCL = new CustomCL("EJB  ", Demo.class.getClassLoader(), "com.stc.Demo$User", "com.stc.Demo$LoginEJB");
CustomCL pfWebCL = new CustomCL("PFWeb", ejbCL, "com.stc.Demo$Servlet");
CustomCL sfWebCL = new CustomCL("SFWeb", ejbCL, "com.stc.Demo$User", "com.stc.Demo$Servlet");

The custom classloader prints all class loading requests so it can be easily seen what's happening.
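For readers who want to try this themselves, here is a minimal sketch of a classloader along those lines. It is not the CustomCL from the demo: the names SelfFirstDemo and SelfFirstLoader are made up, the log format is only loosely modelled on the output shown in this post, and the class bytes are simply read through the parent's getResourceAsStream(), which assumes the compiled classes are available as resources.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SelfFirstDemo {

    /** Sample class that can be defined twice: by the parent and by a self-first loader. */
    public static class User {
    }

    /** Self-first for the listed class names, parent-first for everything else. */
    public static class SelfFirstLoader extends ClassLoader {
        private final String label;
        private final Set<String> selfFirst;

        public SelfFirstLoader(String label, ClassLoader parent, String... selfFirstClasses) {
            super(parent);
            this.label = label;
            this.selfFirst = new HashSet<String>(Arrays.asList(selfFirstClasses));
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            synchronized (getClassLoadingLock(name)) {
                if (!selfFirst.contains(name)) {
                    System.out.println(label + ": super.loadClass(" + name + ")");
                    return super.loadClass(name, resolve);
                }
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    System.out.println(label + ": Loading " + name + " in custom classloader");
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        /** Reads the class file through the parent, but defines the class in this loader. */
        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads a class through the given loader, wrapping the checked exception. */
    public static Class<?> load(ClassLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ClassLoader parent = SelfFirstDemo.class.getClassLoader();
        String userName = User.class.getName();
        Class<?> viaParentFirst = load(new SelfFirstLoader("PFWeb", parent), userName);
        Class<?> viaSelfFirst = load(new SelfFirstLoader("SFWeb", parent, userName), userName);
        System.out.println("parent-first sees the shared User: " + (viaParentFirst == User.class));
        System.out.println("self-first sees the shared User:   " + (viaSelfFirst == User.class));
    }
}
```

The two Class objects have the same fully qualified name but different defining classloaders, which is exactly the situation in which loader constraints come into play.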

Classloading eagerness

What exactly happens when the LoginEJB class is loaded? In the demo program, the following line causes the following output:

ejbCL.loadClass("com.stc.Demo$LoginEJB", true).newInstance();

EJB  : Loading com.stc.Demo$LoginEJB in custom classloader
EJB  : super.loadclass(java.lang.Object)
EJB  : super.loadclass(java.lang.System)
EJB  : super.loadclass(java.io.PrintStream)
LoginEJB loaded

When the JVM loads the LoginEJB class, it goes over references in the class and loads those classes too: the java.lang.Object class because it's the super class of the EJB, and the java.lang.System and java.io.PrintStream class because they are used in the static block. That these "JVM" classes are loaded is in itself remarkable and shows an interesting aspect of how classloading works. "JVM" classes are not treated specially, and it is not relevant that they are already loaded in the bootstrap classloader and are used all over the place.

When the EJB classloader receives the request to load these "JVM" classes, that classloader of course delegates those requests to the parent classloader. In fact, delegation is mandatory for classes in java.*: ClassLoader.defineClass() refuses to define them and throws a SecurityException. For javax.*, delegating to the parent is the usual practice in application servers.

Something also remarkable is what is not loaded: the User class. Apparently, the fact that this class is used in a method is not enough to cause this class to be loaded when the EJB class is loaded. It is difficult to predict what classes are loaded as the result of loading a particular class. I think the spec leaves a lot of room to implementers to decide when to do so.

It's important to realize that when the EJB class is loaded, the User class is not loaded yet.

Linking classes

Next, let's use the parent-first Servlet to log in:

pfWebCL.loadClass("com.stc.Demo$Servlet", false).getMethod("doGet").invoke(null);

PFWeb: Loading com.stc.Demo$Servlet in custom classloader
PFWeb: super.loadclass(java.lang.Object)
EJB  : already loaded(java.lang.Object)
PFWeb: super.loadclass(com.stc.Demo$User)
EJB  : Loading com.stc.Demo$User in custom classloader
PFWeb: super.loadclass(java.lang.System)
EJB  : already loaded(java.lang.System)
...
Logging in with User loaded in EJB
PFWeb: super.loadclass(com.stc.Demo$LoginEJB)
EJB  : already loaded(com.stc.Demo$LoginEJB)

Ignoring the "JVM" classes, it can be seen that the Servlet causes the User class to be loaded, and that that class is loaded in the EJB classloader. Nothing unexpected here.

If subsequently the self-first Servlet gets a go, the following happens:

sfWebCL.loadClass("com.stc.Demo$Servlet", false).getMethod("doGet").invoke(null);

SFWeb: Loading com.stc.Demo$Servlet in custom classloader
SFWeb: super.loadclass(java.lang.Object)
EJB  : already loaded(java.lang.Object)
SFWeb: Loading com.stc.Demo$User in custom classloader
SFWeb: super.loadclass(java.lang.System)
...
Logging in with User loaded in SFWeb
SFWeb: super.loadclass(com.stc.Demo$LoginEJB)
EJB  : already loaded(com.stc.Demo$LoginEJB)
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
...
Caused by: java.lang.LinkageError: loader constraints violated when linking com/stc/Demo$User class
    at com.stc.Demo$Servlet.doGet(Demo.java:82)
    ... 6 more

The User class is loaded and defined in the self-first WAR classloader, and when the login() method is called, an instance of that class is passed to the EJB. Why does the linkage error happen?

When the parent-first servlet invoked the EJB, it passed in a User object. At that moment, the JVM links the reference to com.stc.Demo$User to a class instance. A class instance is identified using the fully qualified name and the classloader instance in which it was loaded. Upon invocation of login(User), the JVM will check that the class object of the User passed in, matches the class object that was linked to com.stc.Demo$User. If they don't match, the JVM will throw a LinkageError.

This linking happened on the first invocation. We can also force the linking to happen by calling LoginEJB.class.getMethods(). I can illustrate that by changing the test program to first inspect the EJB class, and then have the self-first servlet log in.

System.out.println("Loading EJB");
ejbCL.loadClass("com.stc.Demo$LoginEJB", true).newInstance();
System.out.println("Examining methods of LoginEJB");
ejbCL.loadClass("com.stc.Demo$LoginEJB", false).getMethods();
System.out.println("Logging in, self-first");
sfWebCL.loadClass("com.stc.Demo$Servlet", false).getMethod("doGet").invoke(null);

Loading EJB
EJB  : Loading com.stc.Demo$LoginEJB in custom classloader
LoginEJB loaded
Examining methods of LoginEJB
EJB  : already loaded(com.stc.Demo$LoginEJB)
EJB  : Loading com.stc.Demo$User in custom classloader
Logging in
SFWeb: Loading com.stc.Demo$Servlet in custom classloader
SFWeb: Loading com.stc.Demo$User in custom classloader
Logging in with User loaded in SFWeb
SFWeb: super.loadclass(com.stc.Demo$LoginEJB)
EJB  : already loaded(com.stc.Demo$LoginEJB)
java.lang.reflect.InvocationTargetException
Caused by: java.lang.LinkageError: loader constraints violated when linking com/stc/Demo$User class

The login fails because the LoginEJB.class.getMethods() invocation causes the com.stc.Demo$User reference to be linked with the User class loaded in the EJB classloader. When the self-first servlet invokes the method, the two class objects don't match, causing the Error to be thrown.

Small mistake results in "poisoning" a shared class

By now it should be obvious that the User class should not have been packaged in the self-first WAR. What may not be obvious yet, is that this small mistake has big consequences. If a login happens on the self-first servlet before anything else, the linking happens with the erroneous User class object from the self-first classloader. This will cause the login of the parent-first WAR to fail. It will also cause the LoginEJB.class.getMethods() invocation to fail.

System.out.println("Loading EJB");
ejbCL.loadClass("com.stc.Demo$LoginEJB", true).newInstance();
System.out.println("Logging in, self-first");
sfWebCL.loadClass("com.stc.Demo$Servlet", false).getMethod("doGet").invoke(null);
System.out.println("Examining methods of LoginEJB");
ejbCL.loadClass("com.stc.Demo$LoginEJB", false).getMethods();

Loading EJB
EJB  : Loading com.stc.Demo$LoginEJB in custom classloader
LoginEJB loaded
Logging in, self-first
SFWeb: Loading com.stc.Demo$Servlet in custom classloader
SFWeb: Loading com.stc.Demo$User in custom classloader
Logging in with User loaded in SFWeb
SFWeb: super.loadclass(com.stc.Demo$LoginEJB)
EJB  : already loaded(com.stc.Demo$LoginEJB)
Examining methods of LoginEJB
EJB  : already loaded(com.stc.Demo$LoginEJB)
EJB  : Loading com.stc.Demo$User in custom classloader
java.lang.LinkageError: Class com/stc/Demo$User violates loader constraints
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:620)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:465)
    at com.stc.Demo$CustomCL.findClass(Demo.java:36)
    at com.stc.Demo$CustomCL.loadClass(Demo.java:54)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2395)
    at java.lang.Class.privateGetPublicMethods(Class.java:2519)
    at java.lang.Class.getMethods(Class.java:1406)
    at com.stc.Demo.test1(Demo.java:98)
    at com.stc.Demo.main(Demo.java:111)

The first login with the self-first WAR has effectively poisoned the EJB, making it unusable. In everyday life, the problem is likely not so clear cut. For instance, a developer adds a jar to a component, or changes the classloading delegation model of a component, and all tests with that component may succeed. The problem may only show up in next day's build when integration tests are run. And as the example above shows, the stacktrace does not tell much about where the real cause of the error is.

Formalization and references

A formal description of loading constraints can be found in section 5.3.4 of The Java Virtual Machine Specification (available online at http://java.sun.com/docs/books/jvms/second_edition/html/ConstantPool.doc.html). In simple words, a linkage error can occur if two different classes interact with each other, and in this interaction the classes refer to types with the same symbolic name but with different class objects. In the example, the self-first servlet referred to EJB:LoginEJB.login(SFWeb:User), but the EJB's representation was EJB:LoginEJB.login(EJB:User).

Other places where linkage errors may occur are in class hierarchies. If class Derived overrides a method f(A a, B b) in class Super, A and B as seen from Super must be the same A and B as seen from Derived. References to static variables are another example.

More information can also be found in the book Inside the Java Virtual Machine by Bill Venners. Chapters from this book are also available at Artima.

Classloader lab

If you like to experiment, here's the source of the Demo program.