Sunday, November 05, 2006

Clustering - EJBs vs JMS vs POJOs

Scaling out a java application traditionally has not been easy. There are a number of technologies that allow you to distribute data across multiple JVMs, but most of them are cumbersome, high maintenance and do not scale linearly. Let's discuss the pain points involved in using three different approaches namely EJBs, JMS and POJOs.

I am going to start this article by outlining my experience as a server lead for a major services company for more than 3 years, focusing mainly on the major pain points we faced in an effort to cluster our application in a cost effective manner. The experience was a hair-losing and maddeningly frustrating one.

When I started work, it was very obvious to me that there needed to be an effort to redesign the existing architecture of the system. It was a typical consumer facing application serving web and mobile clients. The entire application was written in Java (JSE and JEE) in a typical 3-tier architecture topology. The application was deployed on 2 application server nodes. All transient and persistent data was pushed into the Database to address high availability. As it turned out, as the user-base grew, the database was the major bottleneck towards RAS and performance. We had a couple of options: pay a major database vendor an astronomical sum to buy their clustering solution, or redesign the architecture to be a high-performing RAS system. Choosing the first option was tempting, but it just meant we were pushing the real shortcomings of our architecture "under the carpet", over and above having to spend an astrnomical sum. We chose the latter.

In the next evolution of the application, we made a clear demarcation between the transient and persistent data, and used the database to store only the "System of Record" information. To distribute all the transient information (sessions, caches etc.), we took the EJB+JNDI route (this was in 2001-2002). This is what our deployment looked like. Some of the known shortcomings of EJBs are that they are too heavy-weight and make you rely on a heavy weight container. EJB3 has somewhat changed that. In terms of clustering, we found the major pain point to be the JNDI discovery.

If you implement an independent JNDI tree for each node in the cluster, the failover is developer's responsibility. This is beacause the remote proxies retrieved through a JNDI call are pinned to the local server only. If a method call to an EJB fails, the developer has to write extra code to connect to a dispatcher service, obtain another active server address, and make another call. This means extra latency.

If you have a central JNDI tree, retrieving a reference to an EJB is a two step process: first look up a home object (Interoperable Object Reference or IOR) from a name server and second pick the first server location from the several pointed by the IOR to get the home and the remote. If the first call fails, the stub needs to fetch another home or remote from the next server in the IOR and so on. This means that the name servers become a single point of failure. Also, adding another name server means every node in the cluster needs to bind its objects to this new server.

As a result, over the next three years, we moved on from EJBs and adopted the Spring Framework in our technology stack, along with a bunch of open source frameworks: Lucene for indexing our search, Hibernate as our ORM layer, role based security, Quartz for scheduling. We also dumped our proprietary Application Server for an open source counterpart. We wrote a JMS layer to keep the various blocks in the technology stack coherent across the cluster. To distribute the user session, we relied upon our application server's session replication capabilities. This is what our deployment looked like in mid 2006.

As you can see, our architecture had evolved into a stack of open source frameworks, with many different moving parts. The distribution of shared data happened through the JMS layer. The main pain points here were the overheads of serialization and maintaining the JMS layer.

The application developer had to make sure that all of these blocks were coherent across the cluster. The developer had to define points at which changes to objects were shipped across to the other node in the cluster. Getting this right was a delicate balancing act, as shipping these changes entailed serializing the relative object graph on the local disk and shipping the entire object graph across the wire. If you do this too often, the performance of the entire system will suffer, while if you do this too seldom, the business will be affected. This turned out to be quite a maintenance overhead. Dev cycles were taking longer and longer as we were spending a lot of time maintaining the JMS layer. Adding any feature meant we had to make sure that the cluster coherency wasn't broken.

Then came the time when we needed add another node into our cluster to handle increased user traffic. This is what our deployment looked like. With each additional node in the cluster, we had to tweak how often the changes were shipped across the network in order to get optimum throughtput and latency and so as not to saturate the network. The application server session replication had its own problems with regards to performance and maintenance.

The irony is that every block in our technology stack was written in pure JAVA as POJOs, and yet there was a significant overhead to distribute and maintain the state of these POJOs across multiple JVMs. One can argue that we could have taken the route of clustering our database layer. I will still argue that doing so would have pushed the problems in our architecture under the DB abstraction, which would have surfaced later as our usage grew.

I have now been working at Terracotta for a while and have a new perspective from which to look at the above problem. Terracotta offers a clustering solution at the JVM level and becomes a single infrastructure level solution for your entire technology stack. If we had access to Terracotta, this is what our deployment would have looked like. You would still use the database to store the "System of Record" information, but only that.

Terracotta allows you to write your apps as plain POJOs, and declaratively distributes these POJOs across the cluster. All you have to do is pick and choose what needs to be shared in your technology stack and make such declarations in the Terracotta XML configuration file. You just have to declare the top level object (e.g. a HashMap instance), and Terracotta figures out at runtime the entire object graph held within the top level shared object. Terracotta maintains the cluter-wide object identity at runtime. This means obj1 == obj2 will not break across the cluster. All you need to do in the app is get() and mutate, without an explicit put(). Terracotta guarantees that the cluster state is always coherent and lets you spend your time writing business logic.

Since Terracotta shares the state of POJOs at the JVM level, it is able to figure out what has changed at the byte level at runtime, and only ship the deltas across to other nodes, only when needed. The main drawback of database/filesystem session persistence centers around limited scalability when storing large or numerous objects in the HttpSession. Every time a user adds an object to the HttpSession, all of the objects in the session are serialized and written to a database or shared filesystem. Most application servers that utilize database session persistence advocate minimal use of the HttpSession to store objects, but this limits your Web application's architecture and design, especially if you are using the HttpSession to store cached user data. With Terracotta, you can store as much data in the HttpSession as is required by your app, as Terracotta will only ship across the object-level fine grained changes, not all the objects in the HttpSession. Also, Terracotta will do this only when this shared object in the HttpSession is accessed on another node (on demand), as Terracotta maintains the Object Identity across the cluster. Here are some benchmarking results.

When I looked closely at the Terracotta clustering technology after I started working here, I couldn't help but look back at those 3+ years I spent with my team trying to tackle everyday problems associated with clustering a Java application.

All the pain points I mentioned above are turned into gain points by Terracotta: no serialization, cluster-wide object identity, fine grained sharing of data and high performance.

Bottom Line: Terracotta is the only technology available today that lets you distribute POJOs as-is. Introducing Terracotta in your technology stack lets you spend your development resources on building your business application.

Links:

Terracotta web site.

Download Terracotta.

12 comments:

Agusti Pons said...

For a different approach http://www.microcalls.org

Andres Almiray said...

Nice article! first time I see the evolution of a project into the Terracota clustering solution.

Dan Pritchett said...

This techinque will work until the rate of state changes saturate the cluster back channels. What architectural patterns exist to insure the state changed is truly a small subset of the overall state? (From what you've said, the advantage of Terracotta is based on the assumption that only small subset of state actually changes).

I would be curious how well this works across data cener locations. If the architecture involves load balancing across nodes on each coast with the possibility of flapping betwen data centers occuring, how efficient is the state replication?

Michael Fasosin said...

If you'd known in 2000-1 That ATG was doing everything you're doing in 2006 with Terracotta would this have made your life better? The problem with Java is that it's so simple. Yet we make it so complicated. All that EJB for what. The amount of times I've seen people waste time writing redundant code. I'm glad you've validated what I've always known. Most people that mess about with Java today know very little about how to build enterprise scale business applications. Nice article. Thanks

Anonymous said...

Great so instead of clustering the DB you push it to terracotta. Is this method of "sweeping your architecture issues under the carpet" any different? Its a nice option to have, but it sounds more like a sales pitch. It would be good if someone of neutral position could share their experiences to know the real advantage and disadvantages.

Cost of clustering DB is not a problem to everyone as in many large enterprises, we have existing infrasture on DB farms. Does it mean that terracotta is merely a low cost alternative? And how scalable is terracotta compared to clustering DB?

GuruKool said...

Dan -

"This techinque will work until the rate of state changes saturate the cluster back channels. What architectural patterns exist to insure the state changed is truly a small subset of the overall state? (From what you've said, the advantage of Terracotta is based on the assumption that only small subset of state actually changes)."

Regardless of how much data is being changed, there are some advantages since there is no serialization involved. But yes, you will unleash the real power of Terracotta when you stick to pure simple natural java. Let me give you an example - you distribute a Map implementation through Terracotta. All you need to do is =>

synchronized (map) {
Object obj = map.get(key);
}
String str = newValue;
synchronized (obj) {
obj.setValue(str);
}

That's it, Terracotta takes care of when and where the delta of str needs to shipped, and it ships only the str delta, not the entire obj.

GuruKool said...

Dan -

"I would be curious how well this works across data cener locations. If the architecture involves load balancing across nodes on each coast with the possibility of flapping betwen data centers occuring, how efficient is the state replication?"

You mention some very good real world issues. Without naming companies, one of the largest websites in the world ran Terracotta through its paces for cross-datacenter clustering of 80K transactions / second. At fractional-scale they
concluded it will work. They are considering TC for 2007. Reasons
include:

1. Fine-grained changes which leads to pushing only the deltas...a reasonable start for WAN-needs

2. Heap-level visibility leads to pushing those changes only where needed...also a reasonable start for WAN-needs

3. Ability to stop VM-level caching and, thus, use TC for clustering which leads to pushing all changes ONLY to TC, and not amongst the cluster...a reasonable start for high scale

4. Ability to chop data sets amongst TC servers by-hand so w/ each server doing 8K transactions per second, they needed only 10, world- wide to cover the volumes (20 w/ HA)

Put 1 - 4 together and you have a non-chatty, reasonably lightweight WAN protocol, a scalable way to divide the workload, and Terracotta disbursed across the datacenters means no SPoF.

It's not perfect compared to what we plan to do over the next few releases, but it was better than anything else out there (faster than even a DB writing to a RAMDisk).

I would say watch closely the next few releases of Terracotta and you will see that these (and many more enterprise class features) are already on our roadmap.

Thanks for the comment!

GuruKool said...

"Great so instead of clustering the DB you push it to terracotta. Is this method of "sweeping your architecture issues under the carpet" any different? ...

Cost of clustering DB is not a problem to everyone as in many large enterprises, we have existing infrasture on DB farms. Does it mean that terracotta is merely a low cost alternative? And how scalable is terracotta compared to clustering DB?"

--

First of all, the goal of this post was not to do a Terracotta Vs. DB comparison. Nor was it to target any databases. The discussion is my personal view, based on my experiences and it discusses what should and should not go in a database with an architectural perspective.

Let me give you an example - You decide to store the user session in the database. Over time, the session object grows organically and there is a considerable overhead in fetching the entire object data from the database, marshalling and unmarshalling the entire object (an ORM will do this for you, but there is still a runtime performance hit), and pushing the entire object into the database.

Now let's say you have more than 80K concurrent dirty user sessions at peak usage time. Also, the session object is one of the most frequently touched objects in the application (such is the nature of transient information). How much data is flowing across the network because of this chatter? What is the latency of each round trip to the DB? Are you saturating the number of DB connections in the pool (each connection has a considerable memory overhead)? You get the idea ..

With Terracotta technology, all the session objects would already be materialized and only the deltas (e.g. last touched time or some page counter) would be shipped only where needed. Combined with some high locality of reference load balancing, this is a very scalable and powerful combination.

Now "session" is just one example - one of the most common transient data examples. If you get carried away with putting such transient information into the database, you will pay the price sooner or later. That is all I was trying to point out. Again, it is just my humble personal opinion.

Anonymous said...

We have a Java rich client (basically, a Swing desktop client) that will access our servers across the internet. We would like to support "offline" mode on the client side by replicating data between the client and server whenever there is an internet connection (we use Hibernate, and are thinking of using an embedded DB on the client for this purpose). Because the client and server will have different DB technologies, we are thinking of replicating at the Hibernate level to get its automatic DB translation. Terracota looks like it's quite good at replication, but would it be a good fit for this sort of thing? The difference here is that we are replicating between DBs, which is not how Terracota is being used in your example. The difficult part for us at this point is the actual replication. If we replicate at the DB level, we have to translate from one DB to another, plus notify Hibernate so it can invalidate its cache, etc. If we replicate at the Hibernate level, we have to deal with how to transmit object graph deltas, or otherwise splitting up the object graph. Also, we are limited to communicating with the server via client initiated HTTP requests (pull), so special protocols and ports are out.

+/ashly said...

nice article..
If you can provide a comparison /opinions on session replication across cluster [on various cathedral/bazaar servers] , it will be helpful.
what I am looking for is ->like a comparison between terracotta and websphere session management server kind of thing.

GuruKool said...

ashly,

Please refer to Terracotta white papers available at - http://www.terracotta.org/confluence/display/docs/Articles

I think you will be interested in papers titled - "Scalability Metrics: Clustering Sessions with Terracotta and Tomcat" (3rd) and "Scalability Metrics: Clustering Sessions with Terracotta and a Popular Commercial AppServer" (4th) . Thanks for the comment!

Mark said...

Interesting article. I think Terracotta is very interesting technology.

I recently used Terracotta on a big integration project by using it as a message bus.

Terracotta Server as a Message Bus

TC is very cool with how it makes objects look local, act remote, and the whole thing is kept in sync automagically.

Regards,
Mark Turansky