Once in a while a problem pops up with how a middleware stack supports XA. The problem I'm discussing here is how exactly to enlist and delist XAResources for which isSameRM() returns true. The XA spec is vague on this very topic, so it warrants some exploration.
First of all, what do I mean by "enlisting and delisting XAResources for which isSameRM() returns true"?
Enlisting and delisting
When using an XA capable system, the connection needs to be enlisted in the transaction before it is used. A typical sequence on the XAResource, let's call it r1, is:
   r1.start(xid, TMNOFLAGS);
   ... // use r1
   r1.end(xid, TMSUCCESS);
   r1.commit(xid, true);
With a second resource in the mix, the call sequence may become:
   r1.start(xid1, TMNOFLAGS);
   ... // use r1
   r2.start(xid2, TMNOFLAGS);
   ... // use r2
   r2.end(xid2, TMSUCCESS);
   r1.end(xid1, TMSUCCESS);
   r1.prepare(xid1);
   r2.prepare(xid2);
   r1.commit(xid1, false);
   r2.commit(xid2, false);
The transaction is now escalated to a full two-phase transaction. But if r1 and r2 represent the same resource manager, a shortcut can be taken: one resource can piggyback on the other. The isSameRM() method can be used to check whether this should be attempted. If it returns true, the sequence may become:
   1: // MULTIPLE ACTIVE
   2: r1.start(xid, TMNOFLAGS);
   3: r1.isSameRM(r2); // returns true
   4: r2.start(xid, TMJOIN);
   5: r2.end(xid, TMSUCCESS);
   6: r1.end(xid, TMSUCCESS);
   7: r1.commit(xid, true);
The transaction is once again a single-phase transaction, and the performance difference compared with the two-phase commit case is usually very significant.
This sequence doesn't work for a number of popular systems, e.g. MQSeries and Oracle. One would hope the XA specification settles the matter, but unfortunately that specification is not at all clear about what resource managers should be able to do, so some trial and error is involved.
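To make the MULTIPLE ACTIVE decision concrete, here is a minimal Java sketch of the choice a transaction manager faces when a further resource shows up. JoinEnlister and FakeResource are hypothetical names invented for this illustration; only the javax.transaction.xa types and constants are real. The fake resource merely records calls and ignores the Xid, so the sketch says nothing about how a real resource manager reacts.

```java
import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

/** Hypothetical in-memory stand-in for a real XAResource; it only records calls. */
class FakeResource implements XAResource {
    private final String name;
    private final String rmId;
    private final List<String> log;

    FakeResource(String name, String rmId, List<String> log) {
        this.name = name;
        this.rmId = rmId;
        this.log = log;
    }

    @Override public void start(Xid xid, int flags) {
        log.add(name + ".start(" + (flags == TMJOIN ? "TMJOIN" : "TMNOFLAGS") + ")");
    }
    @Override public void end(Xid xid, int flags) { log.add(name + ".end"); }
    @Override public void commit(Xid xid, boolean onePhase) { log.add(name + ".commit(" + onePhase + ")"); }
    @Override public boolean isSameRM(XAResource other) {
        return other instanceof FakeResource && ((FakeResource) other).rmId.equals(rmId);
    }
    // The remaining methods are not exercised by this sketch.
    @Override public int prepare(Xid xid) { return XA_OK; }
    @Override public void rollback(Xid xid) { }
    @Override public void forget(Xid xid) { }
    @Override public Xid[] recover(int flag) { return new Xid[0]; }
    @Override public int getTransactionTimeout() { return 0; }
    @Override public boolean setTransactionTimeout(int seconds) { return false; }
}

/** Hypothetical enlistment helper for a single transaction branch. */
class JoinEnlister {
    private final Xid xid;
    private final List<XAResource> enlisted = new ArrayList<>();

    JoinEnlister(Xid xid) { this.xid = xid; }

    /** The first resource starts the branch; later ones join only if isSameRM() agrees. */
    boolean enlist(XAResource candidate) throws XAException {
        if (enlisted.isEmpty()) {
            candidate.start(xid, XAResource.TMNOFLAGS);
        } else if (enlisted.get(0).isSameRM(candidate)) {
            candidate.start(xid, XAResource.TMJOIN);
        } else {
            return false; // different RM: the caller needs a separate branch and two-phase commit
        }
        enlisted.add(candidate);
        return true;
    }

    static String demo() {
        List<String> log = new ArrayList<>();
        JoinEnlister tx = new JoinEnlister(null); // the fake ignores the Xid
        try {
            tx.enlist(new FakeResource("r1", "rmA", log));
            tx.enlist(new FakeResource("r2", "rmA", log));
            boolean joined = tx.enlist(new FakeResource("r3", "rmB", log));
            return String.join(", ", log) + ", r3 joined: " + joined;
        } catch (XAException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling JoinEnlister.demo() yields "r1.start(TMNOFLAGS), r2.start(TMJOIN), r3 joined: false": the second resource joins the first one's branch, while the third, reporting a different resource manager, is left to the caller.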
Trial and error
How should isSameRM() be used then? Testing a number of different systems shows that for some of them, only one enlisted XAResource can be active in the transaction at a time. If a second XAResource joins the transaction, the first one has to be deactivated first. Here are two ways of doing that:
   1: // SINGLE ACTIVE 1
   2: r1.start(xid, TMNOFLAGS);
   3: r1.isSameRM(r2); // should return true
   4: ... // use r1
   5: r1.end(xid, TMSUSPEND);
   6:
   7: r2.start(xid, TMJOIN);
   8: ... // use r2
   9: r2.end(xid, TMSUCCESS);
  10:
  11: r1.start(xid, TMRESUME);
  12: ... // use r1 again
  13: r1.end(xid, TMSUCCESS);
  14:
  15: r1.commit(xid, true);

   1: // SINGLE ACTIVE 2
   2: r1.start(xid, TMNOFLAGS);
   3: r1.isSameRM(r2); // should return true
   4: ... // use r1
   5: r1.end(xid, TMSUCCESS);
   6:
   7: r2.start(xid, TMJOIN);
   8: ... // use r2
   9: r2.end(xid, TMSUCCESS);
  10:
  11: r1.start(xid, TMJOIN);
  12: ... // use r1 again
  13: r1.end(xid, TMSUCCESS);
  14:
  15: r1.commit(xid, true);

Results per system (line numbers refer to the corresponding listing):
| System | isSameRM() | Test: multiple active | Test: single active 1 | Test: single active 2 |
| STCMS | yes | yes | yes | yes | 
| JMQ | as of 4.4 | yes | no: throws on line 7 | yes | 
| WebSphereMQ 6 | yes | no: blocks on line 4 | no: blocks on line 7 | yes | 
| Derby 10.5.3.0 | yes | no: blocks on line 4 | yes | yes | 
| Oracle 11.1.0.6 | yes | no: blocks on line 4 | yes | yes | 
| MySQL 5.1 | yes | no: throws on line 4 | no: throws on line 5 | no: throws on line 7 | 
As can be seen, the SINGLE ACTIVE 2 code sequence works best: it succeeds on every system tested except MySQL.
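A container-side helper that follows the SINGLE ACTIVE 2 pattern might look like the Java sketch below. SingleActiveEnlister, recorder and flagName are hypothetical names made up for this illustration, and the dynamic-proxy recorder stands in for real resources, so it shows only the call order, not how any particular resource manager responds. The sketch assumes the caller has already confirmed via isSameRM() that the resources share a resource manager.

```java
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

/** Hypothetical helper that keeps at most one XAResource active per transaction. */
class SingleActiveEnlister {
    private final Xid xid;
    private XAResource first;  // opened the branch; used for the commit
    private XAResource active; // currently started, or null

    SingleActiveEnlister(Xid xid) { this.xid = xid; }

    /** Makes r the active resource, ending the previous one with TMSUCCESS first. */
    void activate(XAResource r) throws XAException {
        if (active == r) {
            return;
        }
        if (active != null) {
            active.end(xid, XAResource.TMSUCCESS);
        }
        r.start(xid, first == null ? XAResource.TMNOFLAGS : XAResource.TMJOIN);
        if (first == null) {
            first = r;
        }
        active = r;
    }

    /** Ends the active resource and commits the branch in one phase. */
    void commit() throws XAException {
        if (first == null) {
            return; // nothing was ever enlisted
        }
        if (active != null) {
            active.end(xid, XAResource.TMSUCCESS);
            active = null;
        }
        first.commit(xid, true);
    }

    static String flagName(int flags) {
        if (flags == XAResource.TMJOIN) return "TMJOIN";
        if (flags == XAResource.TMSUCCESS) return "TMSUCCESS";
        return "TMNOFLAGS";
    }

    /** Recording stand-in for a real resource; only start, end and commit are expected. */
    static XAResource recorder(String name, List<String> log) {
        return (XAResource) Proxy.newProxyInstance(
            SingleActiveEnlister.class.getClassLoader(),
            new Class<?>[] {XAResource.class},
            (proxy, method, args) -> {
                Object second = args[1]; // the flags, or the onePhase boolean for commit
                log.add(name + "." + method.getName() + "("
                    + (second instanceof Integer ? flagName((Integer) second) : second) + ")");
                return null; // fine for the void methods used here
            });
    }

    static String demo() {
        List<String> log = new ArrayList<>();
        XAResource r1 = recorder("r1", log);
        XAResource r2 = recorder("r2", log);
        SingleActiveEnlister tx = new SingleActiveEnlister(null); // the recorder ignores the Xid
        try {
            tx.activate(r1);
            tx.activate(r2);
            tx.activate(r1);
            tx.commit();
        } catch (XAException e) {
            throw new IllegalStateException(e);
        }
        return String.join("; ", log);
    }
}
```

The demo reproduces lines 2, 5, 7, 9, 11, 13 and 15 of the SINGLE ACTIVE 2 listing: every switch ends the active resource with TMSUCCESS before the next one starts with TMJOIN.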
As a side note, MySQL shows surprising behavior: TMJOIN and TMSUSPEND are not supported (as stated in the MySQL documentation), so why does it bother to return true from isSameRM()? Behavior like this makes it difficult to write portable code: a container now has to provide a wrapper around the DataSource that corrects for this. It would have been much better if it had simply returned false from isSameRM().
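The core of such a corrective wrapper could be sketched in Java as follows. NoJoinXAResource is a hypothetical name, and this shows only the XAResource level; a real container would also have to wrap the DataSource and connection objects that hand the resource out, which is omitted here. The trade-off is that a transaction manager seeing false will open separate branches and fall back to two-phase commit.

```java
import java.lang.reflect.Proxy;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

/** Hypothetical wrapper: delegates every call except isSameRM(). */
class NoJoinXAResource implements XAResource {
    private final XAResource delegate;

    NoJoinXAResource(XAResource delegate) { this.delegate = delegate; }

    /** Always false, so a transaction manager never attempts TMJOIN on this resource. */
    @Override public boolean isSameRM(XAResource other) { return false; }

    @Override public void start(Xid xid, int flags) throws XAException { delegate.start(xid, flags); }
    @Override public void end(Xid xid, int flags) throws XAException { delegate.end(xid, flags); }
    @Override public int prepare(Xid xid) throws XAException { return delegate.prepare(xid); }
    @Override public void commit(Xid xid, boolean onePhase) throws XAException {
        delegate.commit(xid, onePhase);
    }
    @Override public void rollback(Xid xid) throws XAException { delegate.rollback(xid); }
    @Override public void forget(Xid xid) throws XAException { delegate.forget(xid); }
    @Override public Xid[] recover(int flag) throws XAException { return delegate.recover(flag); }
    @Override public int getTransactionTimeout() throws XAException {
        return delegate.getTransactionTimeout();
    }
    @Override public boolean setTransactionTimeout(int seconds) throws XAException {
        return delegate.setTransactionTimeout(seconds);
    }

    /** Demo with a stand-in delegate that answers true to isSameRM(). */
    static String demo() {
        XAResource raw = (XAResource) Proxy.newProxyInstance(
            NoJoinXAResource.class.getClassLoader(),
            new Class<?>[] {XAResource.class},
            (proxy, method, args) -> "isSameRM".equals(method.getName()) ? Boolean.TRUE : null);
        try {
            return "raw=" + raw.isSameRM(raw) + ", wrapped=" + new NoJoinXAResource(raw).isSameRM(raw);
        } catch (XAException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

NoJoinXAResource.demo() returns "raw=true, wrapped=false", showing that the wrapper hides the delegate's answer while leaving every other call untouched.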
How does this relate to what application code can do in, for instance, a Java EE container? That's the topic of a different blog post.
 

3 comments:
That's great :) hope this leads to better compatibility between CAPS/GFESB and WebSphere MQ.
And this under-documented mystery: http://www.google.com/search?q=oracle-xa-recovery-workaround&ie=utf-8&oe=utf-8&
Re sysprv:
Indeed, this highlights a problem with GlassFish and MQSeries; a problem I built a workaround for in JMSJCA, but based on this information, the workaround can be made better performing.
Re Oracle recovery: you'd wish there was a CTK that would test XA implementations!
Frank
I really am digging up an old subject but I wanted to add my 2 cents.
Here is what the XA spec says on the subject (chapter 3.3, 'join' paragraph, page 14):
"If multiple threads use an RM on behalf of the same XID, the RM is free to serialise the threads' work in any way it sees fit. For example, an RM may block a second or subsequent thread while one is active."
As for the usefulness of isSameRM(), I've seen so many creative (ab)uses of that method and so many different implementations that I'm no longer sure what it's really supposed to do.
That's another subject I could have added over there: http://blog.bitronix.be/2011/02/why-we-need-jta-2-0/
Oh well, I'm ranting again...