Yes, we still need Transfer Objects with EJB3 JPA
Monday, October 09, 2006
When we started our first project with the Java Persistence API (JPA) and EJB3, we made some decisions about the design following the recommendation of 'experts' in the domain (see ONJava.com -- Standardizing Java Persistence with the EJB3 Java Persistence API, for example).
One of the decisions was to get rid of Data Transfer Objects. We took this decision because we firmly believed that the new annotated POJOs could be used as Transfer Objects between the different layers of our architecture. But it was not a good decision.
Our database model was mapped to our entity model, and about 60 classes emerged. I don't think this is a complex model, but we can say it was 'complex enough'. We found out very soon that the JPA POJOs were not good when they played the role of DTOs. These are the reasons:
1) We had to tweak the annotations to do EAGER loads of data instead of LAZY loads. Web developers building on top of the EJB3 architecture tried to use objects that were not loaded eagerly, and then they had to ask the middle layer developers to change the way the data was loaded.
2) Detached annotated POJOs were not Plain Old Java Objects, but Plain Old INSTRUMENTED Java Objects. This means that we had to put references to Hibernate classes in the web layer, for instance.
3) A business method using a set of POJOs carried the eagerly loaded POJOs as extra load, whether or not it needed them: if another business method needs one of those POJOs loaded eagerly, then every method has to load it, even the ones that don't use it. This overhead can be irrelevant in unit testing, but it is relevant in systems under stress.
4) Changes in the ER to OO mapping were propagated to the web layer, forcing the web development team to refactor their code to fix the problem.
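Reason 1 can be reproduced without any persistence provider. The following is only a plain-Java simulation, not JPA or Hibernate code: FakeSession, LazyList, Order and OrderService are hypothetical names invented for this sketch, and the exception type is a stand-in for whatever your provider throws. It shows how a collection that was not touched inside the business method becomes unusable once the object reaches the web layer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a persistence context; not a JPA class.
class FakeSession {
    private boolean open = true;
    boolean isOpen() { return open; }
    void close() { open = false; }
}

// Hypothetical lazy collection: loads on first access, and only while the session is open.
class LazyList<T> {
    private final FakeSession session;
    private final Supplier<List<T>> loader;
    private List<T> loaded; // stays null until the first successful access

    LazyList(FakeSession session, Supplier<List<T>> loader) {
        this.session = session;
        this.loader = loader;
    }

    List<T> get() {
        if (loaded == null) {
            if (!session.isOpen()) {
                throw new IllegalStateException("lazy collection accessed after the session was closed");
            }
            loaded = loader.get();
        }
        return loaded;
    }
}

// Hypothetical entity with one lazily loaded association.
class Order {
    final String number;
    final LazyList<String> lines;

    Order(String number, LazyList<String> lines) {
        this.number = number;
        this.lines = lines;
    }
}

// Hypothetical business method in the middle layer.
class OrderService {
    static Order findOrder(boolean preloadLines) {
        FakeSession session = new FakeSession();
        Order order = new Order("A-1",
                new LazyList<String>(session, () -> Arrays.asList("line 1", "line 2")));
        if (preloadLines) {
            order.lines.get(); // touch the collection while the session is still open
        }
        session.close(); // from here on the returned object is "detached"
        return order;
    }
}

class LazyLoadDemo {
    public static void main(String[] args) {
        // The middle layer loaded the lines, so the web layer can read them.
        Order preloaded = OrderService.findOrder(true);
        System.out.println("lines: " + preloaded.lines.get().size());

        // The middle layer did not load the lines, so the web layer fails.
        Order detached = OrderService.findOrder(false);
        try {
            detached.lines.get();
        } catch (IllegalStateException e) {
            System.out.println("web layer failed: " + e.getMessage());
        }
    }
}
```

In this simulation the only fix available to the web developer is to ask for the business method to be changed, which is the round trip described in reason 1.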
All these problems were enough to make us rethink the architecture and let Data Transfer Objects come back into our lives. Two new projects with EJB3 and JPA are in development, and we are taking a smarter approach: DTOs are used when dealing with data from more than one or two annotated POJOs, and annotated POJOs are used only if they can be used detached from the instrumentation tool (Hibernate in JBoss). And now things are working much better.
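As a minimal sketch of that approach, assume a screen that needs data from two entities. Customer, Invoice, CustomerSummaryDto and CustomerSummaryAssembler are hypothetical names made up for this example, and the entities are shown as plain classes; in a real project they would carry the JPA annotations. The DTO is built inside the business method, so the web layer only sees a flat, serializable object with no reference to entity or provider classes.

```java
import java.io.Serializable;
import java.util.Arrays;
import java.util.List;

// Hypothetical entity; JPA annotations omitted to keep the sketch self-contained.
class Invoice {
    final String number;
    final long amountCents;

    Invoice(String number, long amountCents) {
        this.number = number;
        this.amountCents = amountCents;
    }
}

// Hypothetical entity owning a collection of invoices.
class Customer {
    final long id;
    final String name;
    final List<Invoice> invoices;

    Customer(long id, String name, List<Invoice> invoices) {
        this.id = id;
        this.name = name;
        this.invoices = invoices;
    }
}

// DTO: immutable, serializable, and free of entity or provider types.
final class CustomerSummaryDto implements Serializable {
    private static final long serialVersionUID = 1L;

    final long customerId;
    final String customerName;
    final int invoiceCount;
    final long totalCents;

    CustomerSummaryDto(long customerId, String customerName, int invoiceCount, long totalCents) {
        this.customerId = customerId;
        this.customerName = customerName;
        this.invoiceCount = invoiceCount;
        this.totalCents = totalCents;
    }
}

// Runs in the middle layer, while the entities are still attached.
class CustomerSummaryAssembler {
    static CustomerSummaryDto toDto(Customer customer) {
        long total = 0;
        for (Invoice invoice : customer.invoices) {
            total += invoice.amountCents;
        }
        return new CustomerSummaryDto(customer.id, customer.name, customer.invoices.size(), total);
    }
}

class DtoDemo {
    public static void main(String[] args) {
        Customer customer = new Customer(7L, "ACME",
                Arrays.asList(new Invoice("INV-1", 1250L), new Invoice("INV-2", 3000L)));
        CustomerSummaryDto dto = CustomerSummaryAssembler.toDto(customer);
        System.out.println(dto.customerName + ": " + dto.invoiceCount + " invoices, " + dto.totalCents + " cents");
    }
}
```

A change in the ER to OO mapping then stays behind the assembler: only toDto has to be adapted, not the web code that reads the DTO.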
Why did all these experts suggest the deprecation of DTOs? I think for these two reasons:
1) If the number of POJOs is low, as it usually is in articles and examples, then DTOs make no sense. When the number of POJOs grows and teams of developers are on the battlefield, it becomes reasonable to use DTOs.
2) If you compare the lines of code (LOC) of a solution with DTOs against one without, the one with fewer LOC sells the technology better. One of the arguments for adopting JPA and EJB3 is that there is less code to write. Obviously, if there is no gain, why would we adopt these new technologies? In .NET, DTOs are part of the architecture (DataSets and DataTables). They have strong advantages in small projects, but in big layered architectures they can kill your project: if you want to succeed with .NET in big enterprise projects, you have to choose the 'Java style' and forget about strange artifacts like DataTables and DataSets.
EJB3 and JPA are the best thing in J2EE since its creation: simple and smart. But 'experts', please ask people on the battlefield of real-life projects before making recommendations.
10 comments:
As Michael points out in his blogs, the "experts" thought hard about this problem, and provided extended persistence contexts to solve it.
If your architecture does not support the use of extended persistence contexts, get one which does.
Gavin,
I'm not complaining about the gaps between the JPA spec and solutions with extended persistence context.
I'm complaining about the fact that the deprecation of DTOs was presented as one of the new benefits of EJB3 and JPA, and this is not true. You need another piece of software: call it SEAM, Spring or any proprietary solution.
Diego,
With regard to lazy loading, I gather your issue was that your UI developers got LazyLoadingExceptions and had to go back to the EJB developers all the time to ask for eager loading, etc.
If that is the case, do you know whether you are running at the READ_COMMITTED transaction isolation level? According to the EJB3 spec you should be by default (and for good reason).
At this isolation level, are you aware that the transaction demarcation has no effect on the queries?
That is, at READ_COMMITTED it does not matter whether the query runs in its own transaction or not; it still sees the same data.
I have written an article on this issue so perhaps you could check it out to see if it clarifies the issue.
Lazy loading and Transaction Isolation
Seam works great for a web application, but what about a Swing application that connects to a server via RMI? Are DTOs required for that architecture? Can I pre-load collections in business methods on the server before returning detached entity beans? Or is that too expensive?
Seam works great for a web app, but what about a Java Swing app that connects to a server via RMI? What about an approach whereby the business method on the server pre-loads the collections it knows are required by clients (i.e. instead of relying on the eager loading annotation)? That might provide finer granularity than always eager joining. Is that still too expensive?
Well, two years after this post, why didn't I read this before? ;) I made so many attempts to handle a model of 20 related entities with plain EJB3 entities as PO[I]JOs. Big mistake: many problems, many things learned (JAXB, divergences between JPA implementations such as TopLink Essentials and Hibernate 3, ...), but no real (or at least no stable) solution that avoids designing and coding DTOs.
So I agree that real-life objects are quite a bit more complex than those shown in experts' tutorials, and that techniques so emphatically declared deprecated are still worth using.
Pleased to read you. This post is a very good summary of my own EJB3 integration experience in applications bigger than the introductory samples, which do not fit real-world industry software.
After 2 years of EJB3 in production, do you have any particular feedback?
Thanks again for sharing this real experience.
I suggest you read "POJOs in Action" by Chris Richardson. Most of these issues can be solved by use of good design patterns.
Yes, I have read POJOs in Action and I think it's a good training book, but honestly after reading it I did not get any magic answer to my thoughts: Yes, we still need transfer objects under some circumstances, and these are very common in complex Enterprise Architectures.
Did you actually try to expose your entities through Web Services? That will make the schema available to the outside world through WSDL definitions. You will have to watch out for cyclic references and other artefacts, but your middle tier should be able to provide your web developers (browser-based, Swing or any other web-accessing layer) with the data they need.
Alternatively, if the number of entities is large, you might want to use reflection to expose your data objects instead of writing a DTO for each entity.



Hi Diego,
I think the solution to your lazy loading problems is to use a "stateful web framework" that can leverage the EJB3 extended persistence context.
I posted more details in my blog.
cheers
Michael