Spring is not designed for scalability

Monday, January 16, 2006

Most of the time, reading blogs does not give you much value, because 90% of the content is just people referencing other people. It's good to know people's opinions, but it takes some time to find the good stuff.
This blog, POJO Mojo, has really good stuff, and sometimes you can find pearls like this.
First, I really love Spring, and I love all these new lightweight frameworks that have grown up thanks to the shortcomings of J2EE. They are really excellent if you want to build web applications quickly. But they are not good if you want to build your application with scalability in mind. Basically, you have two options if you want to scale pure web applications:
a) Use an HTTP cluster. You have to trust the implementation provided by your servlet container vendor and follow some simple rules. With this solution you only replicate the content of the HTTP session, but what about the 'live' data of the application?
b) Implement an ad-hoc clustered solution for the data in the 'model'.
Most developers understand a), but only a small number can really see the complexity of b). Even experienced developers stumble when dealing with ad-hoc clustered solutions.
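The "simple rules" of option a) usually come down to keeping session attributes serializable and storing them again after every change. Here is a minimal sketch of why, assuming the container replicates attributes by Java serialization (a common approach, but container-specific). ShoppingCart and SessionReplicationSketch are hypothetical names, and replicate() only stands in for whatever the container really does:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical session attribute. It must be Serializable to be replicated.
class ShoppingCart implements Serializable {
    private static final long serialVersionUID = 1L;
    private final List<String> items = new ArrayList<>();

    void add(String item) { items.add(item); }
    List<String> items() { return items; }
}

class SessionReplicationSketch {
    // Stand-in for the container: copies an attribute to "another node"
    // by serializing and deserializing it.
    static ShoppingCart replicate(ShoppingCart cart) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(cart);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ShoppingCart) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("replication failed", e);
        }
    }

    public static void main(String[] args) {
        ShoppingCart cart = new ShoppingCart();
        cart.add("book");
        ShoppingCart replica = replicate(cart);

        cart.add("pen"); // changed after the last replication
        System.out.println(replica.items().size());         // 1: the replica is stale
        System.out.println(replicate(cart).items().size()); // 2: replicated again
    }
}
```

The replica only sees changes made before the last replication, which is why a change to a session attribute normally has to be followed by storing the attribute in the session again. None of this helps with 'live' data that is kept outside the session.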
Strictly speaking, Spring is not bad at scaling; it simply was not designed to scale.
EJB3 is really good at simplifying the development of enterprise-class applications, and the new POJO entities not only simplify development but also hide the complexity of clustering data from the developer. This is why I see EJB3 as a winner.
In the years to come we will see the rise of pervasive computing and grid computing, and only a select group of people can really handle the complexity of these solutions. So we will see more layers added to cover all this new complexity, to help ordinary people like me build scalable apps.
But who is building these products?

Posted by Diego Parrilla at Monday, January 16, 2006  

15 comments:

Really good thoughts here. However, remember that with Spring you can easily migrate up to EJB if you need to. If you find the need to remote your EJBs, Spring nearly transparently allows you to inject these services into your app, so the remoted EJB can be used as easily as any other managed bean. I think the point is that Spring is not only a logical way to wire together your app, but also lets you choose the technologies you need as demands require.

Toast said...
3:49 PM  

Right! I think Spring can be really helpful if you need to implement a Business Delegate Pattern.
Actually, we are considering it instead of an 'ad-hoc' implementation of the pattern.
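For readers who have not met the pattern, here is a minimal plain-Java sketch of a Business Delegate. All the names (AccountService, RemoteAccountService, AccountDelegate, AccountUnavailableException) are hypothetical, and the "remote" service is just an in-memory stand-in. In a Spring application the container would typically supply the remote proxy and do the wiring; that part is not shown here:

```java
import java.util.HashMap;
import java.util.Map;

// Business interface seen by the web tier (hypothetical).
interface AccountService {
    int balanceOf(String accountId);
}

// Business-level exception the web tier is allowed to see.
class AccountUnavailableException extends RuntimeException {
    AccountUnavailableException(String message, Throwable cause) {
        super(message, cause);
    }
}

// In-memory stand-in for a remote component such as a remoted EJB.
class RemoteAccountService implements AccountService {
    private final Map<String, Integer> balances = new HashMap<>();

    RemoteAccountService() { balances.put("acc-1", 100); }

    @Override
    public int balanceOf(String accountId) {
        Integer balance = balances.get(accountId);
        if (balance == null) {
            // Simulates an infrastructure-level failure.
            throw new IllegalStateException("remote call failed for " + accountId);
        }
        return balance;
    }
}

// The delegate hides the remote service and translates its failures
// into exceptions that make sense to the caller.
class AccountDelegate implements AccountService {
    private final AccountService remote;

    AccountDelegate(AccountService remote) { this.remote = remote; }

    @Override
    public int balanceOf(String accountId) {
        try {
            return remote.balanceOf(accountId);
        } catch (IllegalStateException e) {
            throw new AccountUnavailableException("Account not available: " + accountId, e);
        }
    }
}

class BusinessDelegateSketch {
    public static void main(String[] args) {
        AccountService accounts = new AccountDelegate(new RemoteAccountService());
        System.out.println(accounts.balanceOf("acc-1")); // prints 100
    }
}
```

The client only depends on AccountService, so the object behind the delegate can change from a local bean to a remote one without touching the calling code.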

Diego said...
3:58 PM  

>but what about the 'live' data of the application?

You would really enjoy Tangosol Coherence. It solves that problem, and many more.

> But who is building these products?

Morgan Stanley, Lehman, Citigroup, Putnam Investments, Capital Group Companies, Barclays Global, BNP, CIBC, IXE, Wells Fargo, Deutsche Bank and Pictet, just to name a few.

Peace.

Cameron Purdy said...
4:32 PM  

Thanks for your kind words on our blog "POJO Mojo".

Cameron is right - there are a lot of large enterprises implementing solutions like this. In our experience, people really like the transparent approach pioneered by Spring - that application code should not have any dependencies on the underlying infrastructure. The last thing people need these days are more API's...

Bob Griswold said...
4:58 PM  

I don't know if I am the only one here, but I don't fully understand your post and arguments.

EJBs are not about clustering the model more than Spring, they are about clustering the business logic.

See my full reply here

Fabien said...
5:00 PM  

Bob: Cameron is right - there are a lot of large enterprises implementing solutions like this.

... with Coherence. Sorry, I should have been more clear.

Bob: The last thing people need these days are more API's...

API's are natural to developers. We chose to support the existing Java Collections API as the basis for Coherence, which certainly aided adoption and made Coherence very easy to learn.

I'm also interested in the research that your group is doing, and I've got a lot of technical respect for the people there, including yourself.

Fabien: EJBs are not about clustering the model more than Spring, they are about clustering the business logic.

Clustering the model often provides the most throughput benefit, because it simultaneously addresses the highest-latency (performance) and highest-concurrent-load (scalability) part of an application: the data source. Also, by clustering, you can remove single points of failure (addressing High Availability).

As just one example, we were able to process roughly one half million transactions per second in an equities exchange application, by moving those transactions into a data grid composed of 2-CPU blades, instead of trying to run them in a traditional database-centric (monolithic) environment.

Peace.

Cameron said...
5:27 PM  

Hi Diego, sorry but I don't agree with you either - on a couple of different levels actually. 8-) Please see My Reply.

Jing Xue said...
7:41 PM  

This is a very confused post. Diego apparently does not realize that the EJB 3 spec consists of a container spec (Session beans as POJOs, along with pretty weak IoC and AOP capabilities), as well as the separate spec for JPA (Java Persistence API), which is the replacement for the old Entity beans.

In terms of scaling data across a cluster, both an EJB 3 solution and a Spring solution would be working with JPA (or another persistence API) and have the same scaling mechanisms available, including something like Tangosol's Coherence product working as a second-level cache behind JPA. Both the EJB and Spring solutions would have access to a 'transparent' data clustering library such as Terracotta's, although I would make the claim that usage in the Spring case would normally be easier, since the Spring app can run in any environment, while an EJB 3 app needs to run in an EJB 3 environment, with some classloading and other constraints enforced by that.

On the question of scaling access to POJO services, the general consensus is that a stateless service layer, duplicated horizontally and sitting behind a load balancer, is the most scalable approach. Both Spring and EJB 3 facilitate this scenario, although Spring will give you more choice as to a deployment environment. If there is an actual need to cluster stateless services at the level of invocations between them, which is debatable, this can be easily done outside of an EJB container by using something like WebLogic's T3 protocol. In an EJB container, Spring can also sit behind an EJB facade and take advantage of EJB remoting. The point is, there is no disadvantage to using Spring here: anywhere EJB can get clustering for services, so can Spring, along with some environments where EJB 3 can't.
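A minimal plain-Java sketch of that stateless, horizontally duplicated layer follows. The names (QuoteService, StatelessQuoteService, RoundRobinBalancer) are hypothetical, and the balancer is a toy round-robin stand-in for a real load balancer, not anything provided by Spring or EJB:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Service contract (hypothetical).
interface QuoteService {
    String quote(String symbol);
}

// Keeps no per-client state between calls, so any instance can answer
// any request. The node name is fixed configuration, not conversation state.
class StatelessQuoteService implements QuoteService {
    private final String node;

    StatelessQuoteService(String node) { this.node = node; }

    @Override
    public String quote(String symbol) {
        return symbol + " served by " + node;
    }
}

// Toy load balancer: hands each call to the next instance in turn.
class RoundRobinBalancer implements QuoteService {
    private final List<QuoteService> nodes;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<QuoteService> nodes) {
        this.nodes = new ArrayList<>(nodes);
    }

    @Override
    public String quote(String symbol) {
        int index = Math.floorMod(next.getAndIncrement(), nodes.size());
        return nodes.get(index).quote(symbol);
    }
}

class StatelessScalingSketch {
    public static void main(String[] args) {
        QuoteService service = new RoundRobinBalancer(Arrays.asList(
                new StatelessQuoteService("node-a"),
                new StatelessQuoteService("node-b")));
        System.out.println(service.quote("IBM")); // IBM served by node-a
        System.out.println(service.quote("IBM")); // IBM served by node-b
        System.out.println(service.quote("IBM")); // IBM served by node-a
    }
}
```

Because no instance remembers anything about a caller, adding capacity is just a matter of adding another instance to the list, whichever framework created the instances.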

As for Bob Griswold's statement here:

that "In some respects, they actually make the task of scaling out harder than it would have been if the application were written with full J2EE specifications and run within an expensive, mission critical application server like WebLogic Server.", Bob needs a good spanking for mixing two completely separate things: was the app written to the full J2EE specs (undefined what this is, but presumably EJB 2.1 or EJB 3)? and where is it deployed? I will make the claim that (as per the points above), in the same environment that an EJB 2.1 or 3.0 app can run, a Spring app can scale out at least as easily as the EJB 2.1 or 3.0 app. It will also scale out very nicely, thank you, in some environments where EJB apps fear to tread...

8:40 PM  

Caray... Thank you very much for your interesting opinions, I think I will answer all your thoughts later in a new post.

Diego said...
11:49 PM  

Colin, thank you very much for your long comment, but honestly, mixing T3 and Spring to make components communicate with each other makes no sense to me. Use EJBs. It is not worth reinventing the wheel.

Spring is good for simple web apps. If you are going to face a complex project, don't bet your ass only on Spring: bet it on Spring + J2EE.

Diego said...
1:00 AM  

Actually, correct usage of EJB3 is to use stateful session beans in many cases where you would have used stateless beans in J2EE. Clustering of conversational data via stateful beans can be far more efficient than clustering model data via second-level cache at the ORM layer.

This is the real value of the EJB3 model, and is a huge advance compared to the J2EE stateless facade / Spring stateless component model.

Gavin.

Anonymous said...
7:50 AM  

Bob: The last thing people need these days are more API's...

Cameron: API's are natural to developers. We chose to support the existing Java Collections API as the basis for Coherence, which certainly aided adoption and made Coherence very easy to learn.


Cameron, you know better than to mislead folks like this. Java Collections HAVE OBJECT IDENTITY. Coherence is a totally different animal. When Bob said "no new APIs" he was not referring to interfaces and documentation as much as ALL NEW BEHAVIOR that is unexpected / unnatural.
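The object identity point can be shown in a few lines of plain Java. This is only a sketch: CopyingMap is a hypothetical stand-in for any map whose values cross a process or serialization boundary, and it is not Coherence's implementation or API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical map that stores and returns copies, the way a map backed by
// another process would. The copier function stands in for the network hop.
class CopyingMap<K, V> {
    private final Map<K, V> store = new HashMap<>();
    private final UnaryOperator<V> copier;

    CopyingMap(UnaryOperator<V> copier) { this.copier = copier; }

    void put(K key, V value) { store.put(key, copier.apply(value)); }

    V get(K key) {
        V value = store.get(key);
        return value == null ? null : copier.apply(value);
    }
}

class ObjectIdentitySketch {
    public static void main(String[] args) {
        // Local java.util.Map: the map holds the very same object.
        List<String> localOrder = new ArrayList<>();
        localOrder.add("buy 100 IBM");
        Map<String, List<String>> local = new HashMap<>();
        local.put("o1", localOrder);
        localOrder.add("sell 50 SUN");
        System.out.println(local.get("o1") == localOrder); // true
        System.out.println(local.get("o1").size());        // 2

        // Copying map: same calls, but the change is lost until put() is repeated.
        List<String> remoteOrder = new ArrayList<>();
        remoteOrder.add("buy 100 IBM");
        CopyingMap<String, List<String>> remote =
                new CopyingMap<String, List<String>>(ArrayList::new);
        remote.put("o1", remoteOrder);
        remoteOrder.add("sell 50 SUN");
        System.out.println(remote.get("o1") == remoteOrder); // false
        System.out.println(remote.get("o1").size());         // 1
    }
}
```

The method names are the same in both cases, but code that relies on mutating a value in place behaves differently, which is the "new behavior" being argued about.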

Cameron: I'm also interested in the research that your group is doing, and I've got a lot of technical respect for the people there, including yourself.

I am not sure paying customers constitutes "research", but feel free to keep calling it research.

ARI ZILKA said...
7:19 PM  


"In years to come we will see the rise of the pervasive computing and grid computing. And only a selected set of people can really handle the complexity of these solutions. So we will see how more layers will cover all these new complexity, to help ordinary people like me to build scalable apps.
But who is building these products?"


Well, you'll be surprised to see your dream come true sooner than you thought.
We (GigaSpaces) have actually already gone into production at one of the first-tier banks with a trading application that uses our In-Memory Data Grid (also referred to as a cache), our Messaging Grid, which enables transaction routing and processing to the data grid, and our Parallel-Processing Grid, which enables the parallelization of the transactions.
All that was done using GigaSpaces and Spring. Spring provided the abstraction layer that enabled the application to plug in our middleware without necessarily coding to it. Using that approach they were able to achieve dynamic scalability and leverage many of our Space-Based Architecture capabilities in a relatively non-intrusive manner. It also significantly reduces the complexity involved in building such distributed applications.
So, bottom line: Spring actually enables the scalability of distributed applications by abstracting the underlying middleware implementation from the business logic. Using that approach you can very easily scale your application up or out without changing your business logic. This was key for us in delivering the "Write Once Scale Anywhere" approach, which basically means that you write your business logic once using Spring and scale anywhere using GigaSpaces. I don't think that the comparison with EJB3 is that relevant, as Spring provides much more than that.
Oh, and one last thing, which is slightly relevant to the API discussion on another thread. Even though Spring provides that level of abstraction, the business logic needs to be aware that it is running in a distributed environment, as many assumptions change once you're in that part of the world. As you can imagine, that will need to be reflected in the way you write your code. What I liked about Spring is that it provides enough flexibility to add those semantics to your POJO without being limited to a plain POJO approach.
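To make the abstraction idea concrete, here is a minimal plain-Java sketch. TradeStore, LocalTradeStore, PartitionedTradeStore and TradeService are hypothetical names, the partitioned store is only an in-process stand-in for a data grid, and nothing here is the GigaSpaces or Spring API. With Spring, the choice of store would normally live in the bean configuration instead of in main():

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// What the business logic needs from the middleware (hypothetical).
interface TradeStore {
    void save(String tradeId, int quantity);
    Integer find(String tradeId);
}

// Single-node implementation.
class LocalTradeStore implements TradeStore {
    private final Map<String, Integer> trades = new HashMap<>();

    @Override public void save(String tradeId, int quantity) { trades.put(tradeId, quantity); }
    @Override public Integer find(String tradeId) { return trades.get(tradeId); }
}

// Stand-in for a data grid: each key is routed to one partition by its hash.
class PartitionedTradeStore implements TradeStore {
    private final List<Map<String, Integer>> partitions = new ArrayList<>();

    PartitionedTradeStore(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    private Map<String, Integer> partitionFor(String tradeId) {
        return partitions.get(Math.floorMod(tradeId.hashCode(), partitions.size()));
    }

    @Override public void save(String tradeId, int quantity) { partitionFor(tradeId).put(tradeId, quantity); }
    @Override public Integer find(String tradeId) { return partitionFor(tradeId).get(tradeId); }
}

// Business logic: depends only on the interface, which is injected.
class TradeService {
    private final TradeStore store;

    TradeService(TradeStore store) { this.store = store; }

    void record(String tradeId, int quantity) {
        if (quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive");
        }
        store.save(tradeId, quantity);
    }

    int quantityOf(String tradeId) {
        Integer quantity = store.find(tradeId);
        return quantity == null ? 0 : quantity;
    }
}

class MiddlewareAbstractionSketch {
    public static void main(String[] args) {
        TradeService local = new TradeService(new LocalTradeStore());
        TradeService grid = new TradeService(new PartitionedTradeStore(4));
        local.record("t-1", 100);
        grid.record("t-1", 100);
        System.out.println(local.quantityOf("t-1")); // 100
        System.out.println(grid.quantityOf("t-1"));  // 100
    }
}
```

TradeService is identical in both cases. What the sketch cannot show is the caveat above: a real distributed store adds latency, partial failure and copy semantics that the business logic may still have to account for.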

Nati Shalom
CTO GigaSpaces
"Write Once Scale Anywhere"

Nati Shalom said...
11:45 PM  

And it is now available through the open source Spring Modules project:
https://springmodules.dev.java.net/

See also:
http://biz.yahoo.com/prnews/060615/ukw014.html?.v=57

Regards
Gershon Diner
GigaSpaces
WRITE ONCE SCALE ANYWHERE
http://www.gigaspaces.com/

Gershon Diner said...
10:48 AM  
Cameron Purdy said...
5:10 AM  
