
Everybody has had their fill of code generation tools at some point in life, from lightweight ones (appgen, XDoclet) to more heavyweight tools like EJB generation IDEs, JSF tools and MDA 'robots de cuisine'. The lightweight ones had their use, but I find the second kind dangerous for mental health.
Brooks outlined the percentage of work invested in each step of the development process. That was back in 1975, so a little guesswork on my part had to be added to get to this chart:

Which means that it takes three times as long to tweak the code as to get to the very first implementation. This is a more-or-less valid, optimistic guess.
Calendar planning, framework selection and UML design are not affected by these tools, since you are going to do them anyway. The main point where these tools shine is the initial code generation (that 16%), and this is where most people labor under a big misconception.
Traditionally we have learned that bugs introduced in a system are easier to fix the earlier they are detected in the above stages. That is true. We have also learned, therefore, that a detailed design avoids introducing mistakes later on and helps refine requirements sooner rather than later. That is also true, provided that you are not a strong agile follower.
The inherent conclusion we assume is that visual is better. Wrong.
Most of these frameworks end up with some kind of Visual Programming Tool: because visual is better, you will supposedly be more comfortable drawing boxes and linking them with arrows than writing the equivalent code. This is not news, and not a single tool of this kind has stuck in the past. Draw a class, add five stereotypes (transactional, ehm, service, clustered, request-scoped, and, ehm, add 42 just in case), detailed attribute descriptions and relationships with half a dozen stereotypes each, and the tool will then generate Java code. This way you have to learn things twice: how to do them in Java and how to do them with the tool. Anyway, the amount of work remains the same, since you are not addressing the essence of the problem, to speak in Brooks' terms.
The project will be condemned to deal with legacy tools for its whole life. In order to switch to a newer framework you will have to wait for (a) the framework to get out of beta, and (b) the tool to start supporting it, if ever. It's common practice to start working with beta libraries that are expected to go final before the end of the project, to avoid being Dead On Arrival. Not a real issue here since everybody is still using EJB 2, right?
Debugging: a big share of any project calendar is assigned to debugging code; that is what that 50% of testing is about. This will happen no matter what your tool vendor has promised you. You cannot avoid it. This is what No Silver Bullet is all about. Now, take a minute to consider how this code was generated: one of the big selling points is that your team does not need to know anything about the concrete technology at hand and, sorry to say it out loud, this is big crap. You will have to debug the damn thing, and that is much harder if the code has been automagically generated, since nobody was thinking about readability. I recall an Eclipse plugin that generates two persistent classes per SQL table and a green Loch Ness DAO of some two hundred lines, for each table.
Your customer doesn't know a thing about what he wants. He wants something that looks like an iPod, performs like Google and has a truly intuitive interface. Expect refactoring to happen, and happen frequently. Choose tools that promote refactoring: lightweight code generation frameworks (XDoclet and such) help by removing code redundancy, but with Java 5 annotations these tools have quickly become obsolete. Every other framework I know of helps very little (if at all) with refactoring, and most of them in fact get in the way big time.
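To make the annotation point a bit more concrete, here is a minimal sketch. It assumes a made-up `@Table` annotation and a made-up `Customer` class; neither comes from any real framework. The metadata that comment-driven generators used to read from Javadoc tags is declared in the code itself and read back through reflection, so there is no generated source to keep in sync when you refactor.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class AnnotationSketch {

    // Hypothetical annotation, standing in for the information a
    // Javadoc tag in the class comment used to carry.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Table {
        String name();
    }

    // Hypothetical persistent class: the mapping is part of the code,
    // so renaming or moving the class carries the metadata along.
    @Table(name = "CUSTOMER")
    static class Customer {
        String email;
    }

    // Reads the metadata at runtime through reflection instead of
    // relying on a generation step at build time.
    static String tableNameOf(Class<?> type) {
        Table table = type.getAnnotation(Table.class);
        if (table == null) {
            throw new IllegalArgumentException(type.getName() + " is not mapped");
        }
        return table.name();
    }

    public static void main(String[] args) {
        System.out.println(tableNameOf(Customer.class));
    }
}
```

A real persistence framework does far more than this, of course; the sketch only shows where the metadata lives.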
There is a third kind of tool, comprising AppFuse, RoR scaffolding and such. A completely different kind, mostly on the "best practices" side of life, they serve as introductory architecture templates that help kickstart projects with best-of-breed frameworks already picked and assembled in the best possible way. They help you understand the available solutions and start with something that works; it's not a matter of generating code, but assembling frameworks can be a tricky, uneasy deal. In fact, the only partial success I have seen in the heavyweight code generation tools is that they also mimic the first and third types, and as such some people keep using them in spite of the pain.
So, I only know of three types of code generation tools:
- Lightweight (XDoclet, appgen, etc.): generate code from embedded comments and the like. These are being progressively replaced with annotation-based frameworks.
- Heavyweight (anything with a pretty window full of colors and buttons): avoid them like the plague. They focus on easing up to 16% of your project, leave 35% unaffected and harm the rest. The only partial success they get is by imitation of #1 and #3.
- Architectural (AppFuse, project able, Ruby stuff): pre-packaged templates with best-of-breed frameworks already linked together, easing the pain of starting a project and making the initial choices for you. A good way to go unless you have already hired The Gurus.
Just remember that none of the three will do the thinking for you. Have a happy 2007!

6 comments:
As usual you're dead right, but I cannot even stand XDoclet, and it was such a relief when I was able to replace it with annotations.
Where would you categorize parser generators like JavaCC? Maybe they'd go in the first category? Usually the code they generate isn't intended to be modified...
Good points, wrong conclusion. Take an architectural framework that gives a good starting point, define functional DSLs to describe what the apps should do and then create translators to generate the code in your language(s) of choice. If you do lots of projects, wrap this with metadata to allow a wizard to pre-generate a bunch of the functional metadata (oh, so you need an email newsletter in there - answer these 20 dependent questions and we'll prepopulate the best practice functional metadata for generating a well designed email newsletter). It's like LISP - it takes a while to get your head around it, but when you do you'll never go back to hand writing more than a handful of imperative code in a general purpose programming language.
There are lots of ways of doing application generation wrong, but when you finally grok it and do it right, it is really hard to beat.
Hi Tom,
That's a good point. Personally, I find anything that is not a recursive descent parser (developed by hand) hell to maintain. You may have the knowledge to understand it, but it's very difficult to find anyone else who does.
Top-down parsers are much easier to understand, but then again there are lots of automated checks (ambiguous grammar detection and the like) that you will miss if you try to roll your own solution.
So in my opinion yes, parser generators are the lesser evil. They are real hell to debug (especially bottom-up parsers!) but I don't see any other reasonable way of generating such code. They do not exactly fit the second category, since they do not substitute coding with visual programming, but it's the best fit I can think of.
Hi Peter,
In a certain way, this is the same question as Tom's: generating low-level code from another, higher-level language (a DSL is a specific case of this).
Since we are not talking about visual development, some of the conclusions of the post do not apply, but others do. Again, DSLs do not solve the core problem: the scope of the program remains the same, you are just solving it using a different language.
Defining a mailing list with proper OO is a matter of half a dozen lines that can be unit tested, and generated code cannot (well, it can - it just doesn't scale well). Maintaining a system like the one you describe is chaos, where (again) the code that you are maintaining depends on the version of the library used to generate the application, instead of the version I want to use now, i.e. upgrading versions does not affect already generated code.
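As a rough sketch of what I mean by a handful of hand-written, testable lines (the class and method names are made up for illustration, and a real newsletter would obviously need more than this):

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical hand-written class; names are illustrative only.
public class MailingList {

    // Addresses are stored trimmed and lower-cased, in subscription order.
    private final Set<String> subscribers = new LinkedHashSet<String>();

    // Returns false when the address is blank or already subscribed.
    public boolean subscribe(String email) {
        if (email == null || email.trim().isEmpty()) {
            return false;
        }
        return subscribers.add(email.trim().toLowerCase());
    }

    // Returns true only if the address was subscribed before the call.
    public boolean unsubscribe(String email) {
        return email != null && subscribers.remove(email.trim().toLowerCase());
    }

    public int size() {
        return subscribers.size();
    }

    public static void main(String[] args) {
        MailingList list = new MailingList();
        list.subscribe("ana@example.com");
        list.subscribe("Ana@Example.com "); // same address, ignored
        System.out.println(list.size());
    }
}
```

Each method can be checked in isolation with a couple of assertions, which is the part that gets awkward once the class is regenerated by a tool.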
Cheers to everyone!
I would also defend code-generation for cross-language problems such as build and deploy tools, or backup tools that are a combination of code and scripts.