Frank Kieviet's Engineering Notebook: GlassFish

Saturday, July 31, 2010

GlassFish Security

Packt Publishing was kind enough to send me a copy of the book "GlassFish Security" that was released very recently.

It's tough to find the time to read a book cover to cover. In fact, it's been a while since I've read a book from beginning to end. Typically I'm only interested in a few chapters which I then read. Later, when the need arises, I may get back to other chapters. It's like treating a book like an encyclopedia or a dictionary. I bet that most people read technical books that way nowadays. So it's important that a book lends itself to being read that way.

Security is a very broad field with many many different topics, e.g. user authentication and authorization in web applications, integration with an external security server, web services security, and so on. Fortunately, very few people will have to deal with all these aspects at the same time. GlassFish Security covers many if not all of those aspects, and that's another reason why it should be possible to read this book like an encyclopedia.

I went back in time and tried to remember all the times that I have had anything to do with security in GlassFish and checked what the book had to say about it. For instance, to start simple, there's the issue of configuring realms and users. There's the time I tried to get JSPWiki to work on GlassFish using declarative authentication and authorization, something that goes through server.policy. More advanced, there's the time we tried to integrate GlassFish with AccessManager. On all these topics, the book delivers. There are many other topics (e.g. integration with OpenSSO, securing JMX, etc), all described in detail without wasting much space on repetition or endless code or XML listings. The book would have saved me a lot of time had I had it at the time when I needed it.

If you're using GlassFish, this book belongs on your bookshelf!

Monday, April 26, 2010

Working with java.net: tips and tricks for project owners

In the past two years I've managed the OpenESB open source project. This project is hosted on java.net. In this post I outline a number of tips and tricks I've learned while managing this project on java.net. This is useful information for my successor, but it also may be useful to other people who are managing projects on java.net.

Web presence

An important factor in the success of an open source project is its web presence: it will be the first point of contact between a potential new user and the project.

New visitors will want to see answers to these four items immediately:

What is it?
License?
Getting started
Downloads

It is important that the links to these pages are static, i.e. that they do not change. That is because a new visitor may enter the site through a link on a different site or through a search engine. For this very reason, it is especially important to keep a single downloads page rather than a page per version, so that users do end up immediately on the latest version, rather than a previous (old) version.

Looking at the front page of OpenESB, these four items are immediately apparent, and the links are permanent. Two more items are important: a visitor will judge a project for it being

Alive
- Active mailing list
- Recent/frequent announcements
Professional
- Well organized, thought through
- Documentation easily found and comprehensive
Inviting
- New users should feel welcome to not just use the product, but also to participate in the community.

You be the judge of how well these goals have been met. Let's look at the mechanics of java.net now.

java.net project web site

How java.net processes HTML files

The HTML files for a project's web site are stored in the project's VCS system in a directory called www. There's a big twist here: although the HTML files are ordinary HTML files, i.e. with an <HTML>, <HEAD> and <BODY> tag, the java.net web server reads the HTML files and does a number of substitutions on the file before it writes the resulting file to the client.

Conceptually, there is a java.net template HTML file which has its own <HEAD> element, its own <BODY> element with static content, and some placeholders in these two elements for:

some elements of <HEAD> element of the user defined HTML file, e.g. <TITLE>
the contents of the project_tools.html file
the contents of the <BODY> element from the user defined HTML file

As such, the project owner has to perform special tricks to have the page displayed in other ways than the standard java.net way.

In OpenESB we have several java.net projects. All of them should have the same look and feel, i.e. the OpenESB "brand" and a common navigation. The same goes for the OpenESB wiki.

How OpenESB organizes HTML files

Each project has the same project_tools.html file. This file takes care of the following:

It includes custom CSS files. These files override the java.net styles.
It hides the standard java.net navigation bar. For admins, the navbar display can be turned on: the state is maintained in a cookie. The value in the cookie can be toggled by loading the admin.html file in OpenESB: https://open-esb.dev.java.net/admin.html.
Loads the menu. The menu is defined in a separate JavaScript file.
Displays the search box
Sets an event handler that is invoked when the page's loading is complete. The event handler manipulates the layout of the page and invokes Google Analytics.

As a result, the other HTML files in OpenESB or any of its sub projects don't have to bother with the common look and feel, the menu, etc. All of that is taken care of by project_tools.html, which as mentioned should be duplicated to all the OpenESB projects.

Some, or if you will, many, HTML files in OpenESB have another common style element: a right-hand sidebar that displays common advertising for downloading GlassFish ESB, the wealth of components, and a partner highlight. This sidebar is taken care of by a separate JavaScript file. A file that should display the sidebar must be formatted explicitly to do so: it should contain a table in which the right column loads the JavaScript of the sidebar.

Previewing HTML

The fact that the look and feel of an HTML file is now defined in project_tools.html makes it difficult to see what a file will actually look like when deployed (i.e. checked in into the project's VCS) on java.net. There is a workaround: I created a directory that when deployed on a web server, emulates the java.net environment. To edit a file with the ability to preview it without checking in the file, follow this procedure:

Load the file in your browser through java.net's web server, e.g. https://open-esb.dev.java.net/Downloads.html.
Save this file to the emulation directory, e.g. Tomcat/webapps/ROOT/Downloads.html
Now load the file into your browser through your local web server, e.g. http://localhost:9080/Downloads.html. This should look identical to what was loaded from java.net.
Locate the text in the file to be edited. Change the text and save the file. Reload the file in your browser. Repeat this step until the file looks OK.
Do a diff using a tool like Araxis Merge, or Beyond Compare between the local file and the file that was checked in into VCS, e.g. diff Tomcat/webapps/ROOT/Downloads.html and open-esb/www/Downloads.html.
Merge the changes into the VCS file, i.e. open-esb/www/Downloads.html. Save and check in into VCS.

Here is the directory for Tomcat. If you use it for projects other than OpenESB, you can remove many of the files. Make sure to leave the branding and css directories intact. If you intend to reuse project_tools.html, don't forget to change the Google Analytics account ID.

SSL

Unfortunately, java.net uses SSL for all its files. This is unfortunate for two reasons:

Files do not get cached in the browser: every time a user opens his browser and goes to OpenESB, all the images, style sheets, etc are reloaded again.
To avoid security warnings in Internet Explorer, files that are included in an HTML file should also come through SSL. This rule is violated on the OpenESB main page (of all places!) where it loads a dynamic list of news items from a non-https server.

There's no workaround for this. CollabNet promised a long time ago that SSL would be removed but this still has not happened.

Changing the menu

The menu is defined in a JSON-like format in a file called menu.js. This file is duplicated in the wiki (see below). Make sure to use absolute URLs because the file is used from different projects, i.e. different roots.

Wiki (not on java.net)

OpenESB is not using the java.net wiki infrastructure. I don't recall why this decision was made. New projects should probably evaluate the java.net wiki infrastructure before considering setting up a wiki elsewhere.

The OpenESB wiki is hosted on a machine at Sun/Oracle, not at CollabNet. It is collocated with other wikis, e.g. GlassFish, UpdateCenter, etc. The wiki engine is JSPWiki. The templates have been adapted to look and feel like the java.net OpenESB web site.

Since java.net has / used to have a lot of downtime, and to avoid having this downtime impact the wiki, the menu.js file is duplicated on the wiki. That java.net uses SSL is another reason to duplicate the file. For the same reasons, the icons displayed in the menu are also duplicated on the wiki server.

Users are managed separately from java.net. User management / restrictions are necessary to avoid spam. Spam has been a big problem on the OpenESB wiki and the other wikis collocated with it. Cleaning up spam can be done by directly manipulating the text files that make up the wiki pages.

JSPWiki security is configured to use JAAS. This fact is important if the wiki needs to be set up on a different GlassFish server. In that case make sure to update the server.policy file of GlassFish. The policy file specifies that only "validated" users can edit pages. Users are stored in a user database and a group database. The latter specifies group membership, of which the "validated" group is important. Both files are in the etc directory of the wiki directory.

To add a user, the group database needs to be updated. This can be done through the web interface, http://wiki.open-esb.java.net/EditGroup.jsp?group=Validated or can be done by editing the group file directly.

The raw data files and user management files of the wiki can be accessed through SSH.

Since the user management of the wiki is separate from java.net, users can choose a different userid on the wiki than they do on java.net. Allowing this is a mistake in retrospect: editing the wiki requires an SCA (Sun Contributor Agreement) or employment at Sun/Oracle, and having to manage/audit two sets of users instead of one is a waste of time. Again, in retrospect.

java.net user management

Anybody on java.net can request developer privileges. When that happens, an email is sent by java.net to the project owner. An automated process is set up that replies automatically to the user list asking the applicant for his motivation, background, etc. This reply is specified in role-approval.policy in the www directory.

For some reason, many people ask for developer privileges. They are not known, have not contributed or communicated before, and are never heard from again.

Consequently, this automated process sounds nice, but is useless in practice. A better process is that someone first communicates on the mailing list, proves he/she is serious and able to contribute, after which the project owner grants privileges. Before privileges are granted, a signed SCA needs to be on file. This process is documented on the OpenESB web site.

The link to the membership management page is displayed in the java.net nav bar. Recall that the nav bar is hidden. To toggle display of the nav bar, load https://open-esb.dev.java.net/admin.html. Alternatively, jump to the page directly: https://open-esb.dev.java.net/servlets/ProjectMemberList.

When a user is granted commit privileges to a project, he/she also automatically gets commit privileges to the code repositories of the sub projects. E.g. granting commit privileges to open-esb also gives commit privileges to open-jbi-components, a sub project of open-esb. This is only the case if the VCS of the child project is of the same type as the parent project. Since open-jbi-components and open-esb both use CVS, open-esb committers have access to open-jbi-components. However, if the VCS system is different, e.g. openesb-builds uses SVN, separate access to this project needs to be granted next to open-esb.

java.net email management

Email lists are managed by an email list manager running in java.net. This email list manager is only partially integrated in java.net. Users can click on a button on the java.net site to subscribe to an email list. What effectively happens is that the email address associated with that user at that moment is registered in the list server. Similarly, a user can unsubscribe through a button, which removes the email address associated with that user at that moment. Where does this knowledge come in handy? Confusion arises if the user changes the email address associated with his java.net id between subscribing and unsubscribing. Sometimes users get so confused that they need help unsubscribing from the list. After all, the last thing you want is repeated posts of "how do I unsubscribe from this list?".

Here is another task with respect to email: spam. Protecting the list against spam is very important. Therefore lists should be set up as "discuss" or "moderated". A setting of "discuss" means that only "subscribed" or "allowed" posters can post. If someone who is not in the "subscribed" or "allowed" list posts a message, the message will be forwarded to the list owner(s) for review. Replying to that message will allow the message to be posted. To avoid getting too many of these "review messages", frequent posters can be added to the "allowed" list. The "allowed" list is accessible from the email management page.

The link to the email list management page is displayed in the java.net nav bar. Recall that the nav bar is hidden. To toggle display of the nav bar, load https://open-esb.dev.java.net/admin.html. Alternatively, jump to the page directly: https://open-esb.dev.java.net/servlets/ProjectMailingListList.

Nabble is a view on the mailing list. OpenESB has integrated it on the web site. Note that posters are identified by their "from" email address only, but posters on Nabble somehow do not need to be in the "allowed" or "subscribed" list. This is fortunate, but puzzling from a technical point of view.

This brings me to one more email list related task: once in a while, someone posts an email that he/she wants to revoke. For example, someone could accidentally send a private email to the list, or include confidential information (customer information, for example) in an email.

Of course there's no such thing as undoing the sending of an email. Subscribed users will have received the email by the time that the sender notices the mistake. However, there is an option to remove the email from Nabble and from the java.net mail browser. Note that there are two places where the email needs to be removed: both Nabble and java.net. Removing the mail there at least makes the mistake less visible and will make it harder for the mistaken email to be picked up by search engines.

java.net automation

Some of the management tasks on java.net can be automated. Kohsuke has developed an extensive set of tools that provide a Java API to the java.net web interface: https://javanettasks.dev.java.net/.

I've written a few tools to make community management easier. One such tool is used to make an inventory of all the users in the community with commit privileges. Everybody in this list should either be a Sun/Oracle employee, or should have signed an SCA. The tool loads and parses the list of signatories (from https://sca.dev.java.net/CA_signatories.htm) and compares this list with the one extracted from java.net. Discrepancies are people who are currently or have been Sun/Oracle employees. Since it is difficult to track who is no longer an employee (especially with the layoffs that used to happen every now and then), the tool can email all "discrepancies" and ask them to either submit an SCA or to verify employment by clicking a link pointing to an internal server. These tools are available on https://open-esb-build.dev.java.net/svn/open-esb-build/trunk/community-mgt/.
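The comparison at the heart of such an inventory is a set difference. The following is a minimal sketch, not the actual tool: the class and method names are hypothetical, and it assumes that both lists have already been fetched and reduced to plain user ids (the fetching and HTML parsing are left out).

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of the community-mgt tools.
final class CommitterAudit {

    /** Returns the committers that have no SCA on file, compared case-insensitively and sorted. */
    static Set<String> findDiscrepancies(Collection<String> committers,
                                         Collection<String> signatories) {
        Set<String> signed = new TreeSet<>();
        for (String signatory : signatories) {
            signed.add(normalize(signatory));
        }
        Set<String> discrepancies = new TreeSet<>();
        for (String committer : committers) {
            String id = normalize(committer);
            if (!signed.contains(id)) {
                discrepancies.add(id); // employee, former employee, or missing SCA
            }
        }
        return discrepancies;
    }

    private static String normalize(String id) {
        return id.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Set<String> result = findDiscrepancies(
                Arrays.asList("alice", "Bob", "carol"),   // made-up committer ids
                Arrays.asList("bob", "ALICE"));           // made-up signatory ids
        System.out.println(result); // prints [carol]
    }
}
```

Everybody in the resulting set would then be asked to either submit an SCA or verify employment.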

Wednesday, April 21, 2010

Moving on… next stop: Google

I've decided it's time to move on. Yesterday I put in my resignation at Oracle. I'll be starting at Google in two weeks.

Why? Was Oracle such a bad place to work? Was it such a culture shock for a Sun employee that it just could not work? Could I not stand my manager? Could I not get along with my new co-workers?

If you're expecting a bitching session about big-bad-Oracle, I have to disappoint you. In the months that I've been at Oracle, I've come to the conclusion that it is a pretty decent company to work for. I like my manager, my co-workers are intelligent and pleasant, and the culture at Oracle is very similar to what it was at Sun. So, honestly, no complaints there.

The reason for my leaving is that I've been doing SOA for 7+ years now. I've been involved in the complete lifecycle of two products: CAPS and OpenESB. Now, with my transition into Oracle, I was about to commit to a third SOA product. Before I dove into that, I did some soul searching, and realized that I was ready for something new, something in a different field.

As I was looking for a change, I looked at what interesting companies there are here in Orange County. Google and Amazon came to mind. Google is said to be the engineering Valhalla, so I submitted my resume there. And surprise, surprise, I got an offer.

It's a big jump. It's starting from the bottom again… jumping into something brand new. Will I like the change? I'm eager to find out!

Friday, January 15, 2010

How to enlist and delist XA resources: trial and error

Once in a while a problem pops up with how a middleware stack supports XA. The problem I'm discussing here is how exactly to enlist and delist XAResource-s with isSameRM() returning true. The XA spec is vague on this very topic, so this warrants some exploration.

First of all, what do I mean by "enlist and delist XAResource-s with isSameRM() returning true"?

Enlisting and delisting

When using an XA capable system, the connection needs to be enlisted in the transaction before it is used. A typical sequence on the XAResource, let's call it r1, is:

    r1.start(xid, TMNOFLAGS);
    ... // use r1
    r1.end(xid, TMSUCCESS);
    r1.commit(xid, true);

With a second resource in the mix, the call sequence may become:

    r1.start(xid1, TMNOFLAGS);
    ... // use r1
    r2.start(xid2, TMNOFLAGS);
    ... // use r2
    r2.end(xid2, TMSUCCESS);
    r1.end(xid1, TMSUCCESS);
    r1.prepare(xid1);
    r2.prepare(xid2);
    r1.commit(xid1, false);
    r2.commit(xid2, false);

The transaction is now escalated to a full two-phase transaction. But if r1 and r2 represent the same resource manager, a shortcut can be taken: one resource can "piggy back ride" on the other resource. The isSameRM() method can be used to check if this should be attempted. If it returns true, the sequence may become:

   1: // MULTIPLE ACTIVE
   2: r1.start(xid, TMNOFLAGS);
   3: r1.isSameRM(r2); // returns true
   4: r2.start(xid, TMJOIN);
   5: r2.end(xid, TMSUCCESS);
   6: r1.end(xid, TMSUCCESS);
   7: r1.commit(xid, true);

The transaction now again is a single-phase transaction, and the performance difference with the two-phase commit case is usually very significant.

This sequence doesn't work for a number of popular systems, e.g. MQSeries and Oracle. Unfortunately, the XA specification is not at all clear about what resource managers should be able to do, so some trial and error is involved.

Trial and error

How should isSameRM() be used then? By testing a number of different systems, it turns out that for some systems, there can be only one enlisted XAResource active in the transaction. If a second XAResource joins the transaction, the first one should be deactivated. Here is an example:

   1: // SINGLE ACTIVE 1
   2: r1.start(xid, TMNOFLAGS);
   3: r1.isSameRM(r2); // should return true
   4: ... // use r1
   5: r1.end(xid, TMSUSPEND);
   6:
   7: r2.start(xid, TMJOIN);
   8: ... // use r2
   9: r2.end(xid, TMSUCCESS);
  10:
  11: r1.start(xid, TMRESUME);
  12: ... // use r1 again
  13: r1.end(xid, TMSUCCESS);
  14:
  15: r1.commit(xid, true);

A variation of this is ending the first use of r1 with TMSUCCESS instead of TMSUSPEND. In that case r1 should later be restarted with TMJOIN instead of TMRESUME:

   1: // SINGLE ACTIVE 2
   2: r1.start(xid, TMNOFLAGS);
   3: r1.isSameRM(r2); // should return true
   4: ... // use r1
   5: r1.end(xid, TMSUCCESS);
   6:
   7: r2.start(xid, TMJOIN);
   8: ... // use r2
   9: r2.end(xid, TMSUCCESS);
  10:
  11: r1.start(xid, TMJOIN);
  12: ... // use r1 again
  13: r1.end(xid, TMSUCCESS);
  14:
  15: r1.commit(xid, true);

I ran these tests on a number of systems, and here is what I found:

System            isSameRM() returns true   Test: multiple active   Test: single active 1   Test: single active 2
STCMS             yes                       yes                     yes                     yes
JMQ               as of 4.4                 yes                     no: throws on line 7    yes
WebSphereMQ 6     yes                       no: blocks on line 4    no: blocks on line 7    yes
Derby 10.5.3.0    yes                       no: blocks on line 4    yes                     yes
Oracle 11.1.0.6   yes                       no: blocks on line 4    yes                     yes
MySQL 5.1         yes                       no: throws on line 4    no: throws on line 5    no: throws on line 7

As can be seen, the SINGLE ACTIVE 2 code sequence works best.
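For illustration, here is how a transaction manager might drive the SINGLE ACTIVE 2 sequence through the standard javax.transaction.xa.XAResource interface. This is a minimal sketch, not code from any container: the class name, the method names, and the recording stand-in resources are hypothetical, and the isSameRM() check is moved to the front for simplicity.

```java
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Hypothetical sketch; names are made up for this example.
final class SingleActiveEnlistment {

    /**
     * Runs the SINGLE ACTIVE 2 sequence on one transaction branch.
     * Returns false without starting anything if the two resources
     * do not report the same resource manager.
     */
    static boolean runSingleActive2(XAResource r1, XAResource r2, Xid xid) throws XAException {
        if (!r1.isSameRM(r2)) {
            return false; // caller falls back to separate branches and two-phase commit
        }
        r1.start(xid, XAResource.TMNOFLAGS);
        // ... use r1
        r1.end(xid, XAResource.TMSUCCESS);   // deactivate r1 before r2 joins

        r2.start(xid, XAResource.TMJOIN);
        // ... use r2
        r2.end(xid, XAResource.TMSUCCESS);

        r1.start(xid, XAResource.TMJOIN);    // rejoin with TMJOIN, not TMRESUME
        // ... use r1 again
        r1.end(xid, XAResource.TMSUCCESS);

        r1.commit(xid, true);                // one-phase commit
        return true;
    }

    /** Stand-in XAResource that records method names and always reports the same RM. */
    static XAResource recordingResource(String name, List<String> log) {
        return (XAResource) Proxy.newProxyInstance(
                SingleActiveEnlistment.class.getClassLoader(),
                new Class<?>[] {XAResource.class},
                (proxy, method, args) -> {
                    log.add(name + "." + method.getName());
                    Class<?> type = method.getReturnType();
                    if (type == boolean.class) return Boolean.TRUE;
                    if (type == int.class) return 0;
                    return null;
                });
    }

    public static void main(String[] args) throws XAException {
        List<String> log = new ArrayList<>();
        XAResource r1 = recordingResource("r1", log);
        XAResource r2 = recordingResource("r2", log);
        // A null Xid is enough for the stand-ins; a real resource needs a real Xid.
        runSingleActive2(r1, r2, null);
        log.forEach(System.out::println);
    }
}
```

Given the MySQL results above, a real container would additionally need a per-system override rather than trusting isSameRM() alone.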

As a side note, MySQL is showing surprising behavior: TMJOIN and TMSUSPEND are not supported (as documented in the MySQL documentation), so why does it bother to return true on isSameRM()? Behavior like this makes it difficult to write portable code: a container now has to provide a wrapper around the DataSource that corrects for this. It would have been much better if it simply had returned false on isSameRM().

How does this relate to what application code can do in for instance a Java EE container? That's the topic of a different blog post.

Monday, September 21, 2009

Lessons of an interesting deadlock problem

Two years ago I wrote a blog entry about Nested Diagnostics Contexts in GlassFish. The approach was based on custom Logger objects. Now, two years later when running GlassFish on AIX, an issue with that turned up: a deadlock. Digging into it, there are two lessons I'd like to share: one about deadlocks, and the other one about workarounds.

Deadlock in the JDK

At the time I wrote the NDC facility, calling addLogger() and getLogger() from different threads could cause a deadlock in the JDK classes. Why is this?

This is how the code in the Sun JDK was:

public class Logger {
    ...
    // static synchronized: takes the lock on Logger.class
    public static synchronized Logger getLogger(String name) {
        LogManager manager = LogManager.getLogManager();
        Logger result = manager.getLogger(name);
        if (result == null) {
            result = new Logger(name, null);
            manager.addLogger(result); // takes the lock on the LogManager instance
            result = manager.getLogger(name);
        }
        return result;
    }
}

public class LogManager {
    ...
    // synchronized: takes the lock on the LogManager instance
    public synchronized boolean addLogger(Logger logger) {
        ...
        Logger plogger = Logger.getLogger(pname); // takes the lock on Logger.class
    }
    ...
}

There is a problem if there are two threads calling these two methods. Say Thread 1 calls Logger.getLogger() in its application code, e.g.

public void doSomething() {
  Logger.getLogger("com.stc.test").info("doSomething!");
}

Now consider another thread, Thread 2, that tries to register a custom logger using addLogger(), e.g.

// Logger has no no-arg constructor; the name is reused from the example above
LogManager.getLogManager().addLogger(new Logger("com.stc.test", null) {
    ...
});

There are two locks that come into the picture: one is the lock on Logger.class, and the other one on LogManager.getLogManager(). Thread 1 simply calls Logger.getLogger() which first locks Logger.class, and while having this lock, will call LogManager.addLogger(). This call will get a lock on LogManager.getLogManager(). This will try to lock Logger.class again, which is no problem because it already had the lock on that. Now consider Thread 2: it will first lock LogManager.getLogManager() through its call to addLogger(), and while having this lock, it will try to lock Logger.class through its call to Logger.getLogger().

So Thread 1 first locks A (Logger.class) and then B (the LogManager instance), while Thread 2 first locks B and then A. A classic deadlock situation.

This is clearly a bug in the JDK. With the knowledge of this bug, as the developer of the custom logger, we can use a simple workaround for this problem. Before calling addLogger(), first lock Logger.class. By doing that, we can guarantee that the locks are acquired in the same order: first Logger.class and then LogManager.getLogManager().
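The workaround could look like the following minimal sketch. The class names SafeLoggerRegistration and CustomLogger are hypothetical stand-ins for the NDC logger, which is not shown here; only Logger, LogManager and addLogger() are actual JDK API.

```java
import java.util.logging.LogManager;
import java.util.logging.Logger;

// Hypothetical sketch of the lock-ordering workaround.
final class SafeLoggerRegistration {

    /** Stand-in for a custom logger; Logger's constructor is protected, so subclass it. */
    static final class CustomLogger extends Logger {
        CustomLogger(String name) {
            super(name, null);
        }
    }

    /**
     * Registers the logger while holding Logger.class, so that the lock order
     * is always Logger.class first and the LogManager instance second.
     */
    static boolean register(Logger custom) {
        synchronized (Logger.class) {
            return LogManager.getLogManager().addLogger(custom);
        }
    }

    public static void main(String[] args) {
        Logger custom = new CustomLogger("com.stc.test.custom");
        boolean added = register(custom);
        System.out.println("registered: " + added);
        // Lookups by name now return the custom instance.
        System.out.println(Logger.getLogger("com.stc.test.custom") == custom);
    }
}
```

On a JDK that already contains the fix, the extra synchronized block is harmless: it only adds a lock that addLogger() no longer needs.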

This bug was actually reported by a customer, see bug report, and was fixed in 6u11 and 5.0u18. The LogManager now no longer calls Logger.getLogger(), so there's no locking of Logger.class anymore.

Deadlock in GlassFish

GlassFish installs its own LogManager, and there's some locking going on there too. It has an internal lock which it locks in addLogger():

public boolean addLogger(Logger logger) {
    ...
    super.addLogger(logger);
    ...
    synchronized (_unInitializedLoggers) {
        ...
        doInitializeLogger(logger);
    }
}

And this calls Logger.getLogger() again.

protected void doInitializeLogger(Logger logger) {
    ...
    Logger newLogger =
        Logger.getLogger(logger.getName(),
            getLoggerResourceBundleName(logger.getName()));
    ...
}

As the developer of the custom logger, we can still use the same fix: first lock Logger.class before calling addLogger() so that the locking sequence becomes: Logger.class, a transient lock on LogManager.getLogManager(), a lock on _unInitializedLoggers, and finally a (reentrant) lock on Logger.class.

A fix in GlassFish would simply make addLogger() synchronized. An alternative fix is to make the synchronized block smaller: it should not extend over doInitializeLogger().

Deadlock with the IBM JDK

All was well until this was run on the IBM JDK. As it turns out, Logger.getLogger() doesn't lock Logger.class, it locks LogManager.getLogManager(). Now there is a new deadlock: Thread 1 calls Logger.getLogger() and by doing so, locks LogManager.getLogManager(), and then tries to lock _unInitializedLoggers and later tries to lock LogManager.getLogManager() again. Thread 2 calls addLogger() and by doing so locks first _unInitializedLoggers which later tries to lock LogManager.getLogManager(). The problem is back: the sequence of locking is reversed.

Lessons learned

The first lesson is that of the deadlock itself: one should be extremely careful when locking an object and then calling into another object, especially if that object is meant to be a base class in a public API. Both Logger and LogManager may be overridden, and are clearly very public.

The second lesson is that instead of building workarounds, try to get the original bug fixed. Or at least try to have the original issue addressed while adding a workaround. In my case, I never filed a ticket against the JDK or against GlassFish. I should have known better.

... or not?

At this very moment, GlassFish v2.1.1 is about to be shipped, and we need GlassFish v2.1.1 on AIX for the upcoming CAPS release. A bug fix in GlassFish v2.1.1 is probably not going to get in. So I'm making yet another workaround: don't use a custom logger at all, so as to avoid the whole addLogger() problem. Instead, use standard Logger objects, but provide a custom filter that manipulates the log record. This will work on unpatched versions of both GlassFish and the JDK. Still, I hope the GlassFish problem will be fixed.
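A filter-based approach could be sketched as follows. This is an illustration under assumptions, not the actual CAPS code: the class name ContextFilter, the thread-local context, and the way the message is prefixed are all hypothetical. Only java.util.logging.Filter and its isLoggable(LogRecord) method are standard JDK API.

```java
import java.util.logging.Filter;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical illustration; the class name and the thread-local context are assumptions.
final class ContextFilter implements Filter {
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    static void setContext(String context) {
        CONTEXT.set(context);
    }

    static void clearContext() {
        CONTEXT.remove();
    }

    @Override
    public boolean isLoggable(LogRecord record) {
        String context = CONTEXT.get();
        if (context != null) {
            // Caveat: this also changes the key used for resource bundle lookup.
            record.setMessage("[" + context + "] " + record.getMessage());
        }
        return true; // never suppress; the filter only decorates the record
    }

    public static void main(String[] args) {
        // A standard Logger obtained by name: no custom subclass, no addLogger() call.
        Logger logger = Logger.getLogger("com.stc.test");
        logger.setFilter(new ContextFilter());
        setContext("order-42");
        try {
            logger.info("doSomething!");
        } finally {
            clearContext();
        }
    }
}
```

Because the filter is attached to a standard Logger, none of the lock sequences described above are involved in registering it.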
