Sunday, December 21, 2008

Blogger.com not a success

I’ve had great trouble getting the previous entry to appear properly.  I’m using raw HTML, but somehow when I publish, the server adds a lot of <br/> tags to my text.

I think I’ll switch to wordpress.com (http://fkieviet.wordpress.com).

The size of Java objects

Here's a blog entry / research note / note-to-self that I wrote months ago, but never found the time to publish.

At JavaOne there was an interesting BOF about Efficient XML. It made me wonder how efficient the use of DOM in Java is. To find out, I wrote a small program that measures how many bytes of RAM a Java object uses. The program counts how many objects it can allocate before it runs out of memory. Here's the class:

    public static class H {
        public H next;
        byte[] b = new byte[8];
    }
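
A minimal sketch of what such a counting program might look like (not the original program). The names AllocationCounter, countUntilOutOfMemory, bytesPerAllocation and heapBytes are made up for illustration. The helper assumes two runs on the same heap in which each allocation is known to be larger by a fixed number of bytes in the second run, so that n1 * s = n2 * (s + delta).

```java
final class AllocationCounter {

    /** Test object: one reference plus a small byte array whose size is varied between runs. */
    static class H {
        H next;
        byte[] b = new byte[8];
    }

    /** Chains H instances together until the heap is exhausted; returns how many fit. */
    static long countUntilOutOfMemory() {
        H head = null;
        long count = 0;
        try {
            while (true) {
                H h = new H();
                h.next = head;   // keep every instance reachable so nothing is collected
                head = h;
                count++;
            }
        } catch (OutOfMemoryError e) {
            head = null;         // drop the chain so the VM can recover
            return count;
        }
    }

    /**
     * Solves n1 * s = n2 * (s + delta) for s, where n1 and n2 are the counts of
     * two runs and delta is the known extra size per allocation in the second run.
     */
    static long bytesPerAllocation(long n1, long n2, long delta) {
        return Math.round((double) n2 * delta / (n1 - n2));
    }

    /** Heap space available for allocation, derived from the first run. */
    static long heapBytes(long n1, long bytesPerAllocation) {
        return n1 * bytesPerAllocation;
    }

    public static void main(String[] args) {
        System.out.println(countUntilOutOfMemory() + " allocations before running out of memory");
    }
}
```

Running it with a small, fixed heap (for example java -Xmx64m AllocationCounter) keeps the runs short and comparable. With invented counts of 1200 and 1000 allocations and a delta of 8 bytes, bytesPerAllocation gives 40 bytes and heapBytes gives 48000 bytes.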

By changing the size of the byte array and running the program again, the size of an H instance can be calculated, as well as the total memory space available for object allocation in the test program. This number can be used in subsequent runs. Running the program on this trivial class on a Windows machine yielded a size of 16 bytes for each allocation.

    public static class H {
        public H next;
    }
16 bytes

The VM apparently uses 8-byte alignment: when another reference is added to the class above, the size does not increase, but it does increase in steps of 8 bytes when more members are added:

    public static class H {
        public H next;
        public H next1;
    }
16 bytes

    public static class H {
        public H next;
        public H next1;
        public H next2;
    }
24 bytes

    public static class H {
        public H next;
        public H next1;
        public H next2;
        public H next3;
    }
24 bytes

    public static class H {
        public H next;
        public H next1;
        public H next2;
        public H next3;
        public H next4;
    }
32 bytes

Hence an empty class takes 8 bytes, and each object reference takes 4 bytes. This is much better than I intuitively expected: underneath, there should be at least a pointer to a virtual method table (4 bytes), and there should be a pointer to this object from some VM list of objects for memory management. I'd expected that list to be a linked list, and I'd expected that there would perhaps be some additional flags for garbage collection, etc. Clearly the VM does it much more efficiently than I would have done.
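
The measurements above fit a simple model: an 8-byte header, 4 bytes per reference, and the total rounded up to a multiple of 8. A small sketch of that arithmetic follows. The class and method names are made up, and the constants are only what these measurements on one VM suggest, not a statement about every VM.

```java
final class ObjectSizeModel {
    static final int HEADER_BYTES = 8;     // size measured for an empty object
    static final int REFERENCE_BYTES = 4;  // size measured per object reference
    static final int ALIGNMENT = 8;        // sizes grow in steps of 8 bytes

    /** Rounds a byte count up to the next multiple of ALIGNMENT. */
    static int roundUp(int bytes) {
        return (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** Modelled size of an object that has only reference fields. */
    static int sizeWithReferences(int referenceFields) {
        return roundUp(HEADER_BYTES + REFERENCE_BYTES * referenceFields);
    }

    public static void main(String[] args) {
        for (int refs = 0; refs <= 5; refs++) {
            System.out.println(refs + " references: " + sizeWithReferences(refs) + " bytes");
        }
    }
}
```

For one to five references the model gives 16, 16, 24, 24 and 32 bytes, matching the measured sizes.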

Similarly, the size of arrays can be measured. A char[] comes down to 12 bytes plus two times the number of characters, rounded up to an 8-byte boundary. The size of a String object is more interesting: it is 32 bytes plus two times the number of characters; the minimum size is 40 bytes. Hence, a string of 3 characters is 48 bytes, and an 8-character string is 56 bytes.
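
The three String sizes quoted above can be reproduced with a short calculation, assuming that a String is a fixed-size object that points to a separately allocated char[]. The 24-byte figure for that fixed part is an assumption, chosen because it matches the measurements; the names below are made up for illustration.

```java
final class StringSizeModel {
    static final int CHAR_ARRAY_HEADER_BYTES = 12; // measured: 12 bytes plus 2 per char
    static final int STRING_OBJECT_BYTES = 24;     // assumption: fixed part of a String

    /** Rounds a byte count up to the next multiple of 8. */
    static int roundUp8(int bytes) {
        return (bytes + 7) / 8 * 8;
    }

    /** Modelled size of a char[] holding the given number of characters. */
    static int charArraySize(int chars) {
        return roundUp8(CHAR_ARRAY_HEADER_BYTES + 2 * chars);
    }

    /** Modelled size of a String: the fixed part plus its backing char[]. */
    static int stringSize(int chars) {
        return STRING_OBJECT_BYTES + charArraySize(chars);
    }

    public static void main(String[] args) {
        System.out.println("\"\": " + stringSize(0) + " bytes");
        System.out.println("\"123\": " + stringSize(3) + " bytes");
        System.out.println("\"12345678\": " + stringSize(8) + " bytes");
    }
}
```

Under this model the empty string, a 3-character string and an 8-character string come out at 40, 48 and 56 bytes.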

Now let's take a look at XML, a DOM structure to be exact. How much does an element with a text node take, as in this XML snippet?

<root>
    <ElementName>1000000</ElementName>
    <ElementName>1000001</ElementName>
    <ElementName>1000002</ElementName>
    ...

(line breaks and indentation added for clarity; "ElementName" is a string constant)

Measuring this yields 144 bytes per node. Let's add another child element:

<root>
    <ElementName>
        <SubEl>1000000</SubEl>
    </ElementName>
    <ElementName>
        <SubEl>1000001</SubEl>
    </ElementName>
    <ElementName>
        <SubEl>1000002</SubEl>
    </ElementName>
    ...

Measuring this yields 200 bytes per repeating element. When we add an attribute the size increases again:

<root>
    <ElementName>
        <SubEl last="false">1000000</SubEl>
    </ElementName>
    <ElementName>
        <SubEl last="false">1000001</SubEl>
    </ElementName>
    <ElementName>
        <SubEl last="false">1000002</SubEl>
    </ElementName>
    ...

This increases the size to 312 bytes per repeating element.
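
For readers who want to try this themselves, here is a rough sketch of how the first DOM tree could be built with the standard JAXP API. Note that it estimates memory from the change in used heap, which is a different and less precise technique than counting allocations until memory runs out. DomFootprint, build and usedHeap are made-up names, and the printed number will vary by VM and DOM implementation.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class DomFootprint {

    /** Builds <root> with 'count' <ElementName> children, each holding one text node. */
    static Document build(int count) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElement("root");
        doc.appendChild(root);
        for (int i = 0; i < count; i++) {
            Element element = doc.createElement("ElementName");
            element.appendChild(doc.createTextNode(Integer.toString(1000000 + i)));
            root.appendChild(element);
        }
        return doc;
    }

    /** Rough used-heap reading; System.gc() is only a hint to the VM. */
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        int count = 100000;
        long before = usedHeap();
        Document doc = build(count);
        long after = usedHeap();
        System.out.println((after - before) / count + " bytes per element (rough estimate)");
        // Use the document after measuring so it is still reachable at that point.
        System.out.println(doc.getDocumentElement().getChildNodes().getLength() + " elements built");
    }
}
```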

Summarizing the results:

Item                                                            Size
new Object()                                                    8 bytes
""                                                              40 bytes
"123"                                                           48 bytes
"12345678"                                                      56 bytes
int data member                                                 4 bytes
byte data member                                                1 byte
Object reference data member                                    4 bytes
char[n]                                                         12 + 2*n bytes
<ElementName>1234567</ElementName>                              144 bytes
<ElementName><SubEl>1234567</SubEl></ElementName>               200 bytes
<ElementName><SubEl last="false">1234567</SubEl></ElementName>  312 bytes

Indeed, DOM is not very efficient with respect to memory usage. The last XML snippet was only 62 single-byte characters. The information content is actually a lot less: since ElementName, SubEl, and last are constant, they could be replaced with a reference. Without using schema information, the information content can be encoded with about 25 bytes. With JAXB it can be done with 88 bytes.
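
The 62-character figure is easy to check, and it puts the 312 bytes in perspective. A small sketch, with made-up names, that counts the serialized bytes of the last snippet (written without line breaks or indentation) and compares them with the measured DOM size:

```java
import java.nio.charset.StandardCharsets;

final class SerializedSize {
    /** The last snippet from the summary, without line breaks or indentation. */
    static final String SNIPPET =
            "<ElementName><SubEl last=\"false\">1234567</SubEl></ElementName>";

    /** Number of bytes the text occupies when encoded as UTF-8. */
    static int utf8Bytes(String xml) {
        return xml.getBytes(StandardCharsets.UTF_8).length;
    }

    /** How many bytes of DOM memory are used per byte of serialized XML. */
    static double overhead(int domBytes, String xml) {
        return (double) domBytes / utf8Bytes(xml);
    }

    public static void main(String[] args) {
        System.out.println(utf8Bytes(SNIPPET) + " bytes as text");
        System.out.println(overhead(312, SNIPPET) + " times larger as a DOM tree");
    }
}
```

In other words, the DOM representation measured here is roughly five times the size of the text it was parsed from.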

Hello world

This is my first blog entry on blogger.com. It's really a continuation of my other blog, the one that I started at http://blogs.sun.com/fkieviet. Why did I start a new blog? Blogs at blogs.sun.com are tied to Sun Microsystems, my employer. With the world in economic turmoil, Sun included, I felt it was a good idea to have a blog not tied to my employer.

A few words about me: I'm a software engineer. Building software has been my hobby since the Commodore 64. I'm glad that I've been able to turn this hobby into a full-time profession. I currently work as a Senior Staff Engineer at Sun Microsystems. There I work as a lead in the SOA/Business Integration group, and am a contributor to OpenESB, an open source platform for Integration and SOA.

It's been a long time since my last blog entry on blogs.sun.com. I've been busy -- work has been piling up, and it's difficult to justify taking time to write a blog when people are waiting on me for work to be finished. (And that's another reason to start this blog instead of continuing on blogs.sun.com.) How did I find the time to write this, then? I'm on vacation this week!

Monday, May 19, 2008

JavaOne 2008


It's a week after JavaOne 2008 now. I finally have time to post a blog entry. I was extremely busy before and during JavaOne: not only with the presentations that I gave at JavaOne, but also because the Java CAPS 6 code freeze was the week before JavaOne.

At JavaOne I gave three presentations:

For Java University (the day before JavaOne), I presented a part of Joe Boulenouar's class "How Java EE 5 and SOA Help in Architecting and Designing Robust Enterprise Applications". In my part I covered ESBs, JBI and Composite Applications.


A technical session: TS-5301 Sun Java Composite Application Platform Suite: Implementing Selected EAI Patterns. I presented this with Michael Czapski, a colleague in Sun's field organization in Australia. He's also the author of the book Java CAPS Basics: Implementing Common EAI Patterns. In this session we went over a number of EAI patterns from Hohpe and Woolf's book and showed that when you use the right Integration Middleware, you use these patterns almost without realizing it.


A Birds-of-a-feather session: BOF-6211: Transactions and Java Business Integration (JBI): More Than Java Message Service (JMS). I presented this with Murali Pottlapelli, a colleague in Monrovia. Since there was interest in the slides that we presented, and because unlike Sessions, the slides of BOFs are not made available by the JavaOne organization, you can download the slides of Transactions and JBI: More Than JMS from my blog. I also recorded the sound using my MP3 player, but the quality of the recording is pretty bad. Nevertheless, I've also uploaded the mp3 of Transactions and JBI: More Than JMS.


What's next? Now that CAPS 6 is almost out the door, we're going to focus on the next release. Even more than in the past, we'll be doing this in open source. More to come!