Wednesday, June 18, 2008

A disruptive technology for code generation

OK, now I have to support my claim. Here we go.

I've been using StringBuffer.append() for code generation (like everyone else) while looking for safer alternatives (safer meaning, compile-time type-checked). I've seen all the model-to-text languages. But the rest of this post is not about them.
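To make the type-safety complaint concrete, here is a minimal sketch (the class and method names are made up for illustration). The generator compiles without complaint, yet what it emits is not the Java that was intended, and nothing catches that until the generated source is compiled.

```java
final class AppendPitfallSketch {

    /** Emits a field declaration by plain string appends (hypothetical helper). */
    static String fieldDecl(String type, String name) {
        StringBuffer sb = new StringBuffer();
        // Bug on purpose: no space is appended between the type and the name.
        // javac accepts this generator; only the generated code is broken.
        sb.append("private ").append(type).append(name).append(";");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(fieldDecl("int", "age"));
    }
}
```

Calling `fieldDecl("int", "age")` yields `private intage;`, which only fails later, when someone compiles the output.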

One way to ensure syntactic validity when generating source code relies on instantiating EMF classes that stand for Java source elements. Verbosity can be reduced using fluent interfaces instead of directly invoking EMF factories, as reported in http://bugs.eclipse.org/234003
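The fluent-interface idea can be sketched roughly as follows. This is not the EMF-based API from the bug report: `ClassNode`, `FieldNode`, `named`, `field` and `render` are hypothetical names, and the "model" is reduced to two node types. The point is only that the generator manipulates typed nodes and that text is produced in a single place.

```java
import java.util.ArrayList;
import java.util.List;

final class FluentAstSketch {

    /** A field declaration node: type and name live in separate slots. */
    static final class FieldNode {
        final String type;
        final String name;

        FieldNode(String type, String name) {
            this.type = type;
            this.name = name;
        }
    }

    /** A class declaration node with a fluent construction API. */
    static final class ClassNode {
        private final String name;
        private final List<FieldNode> fields = new ArrayList<>();

        private ClassNode(String name) {
            this.name = name;
        }

        static ClassNode named(String name) {
            return new ClassNode(name);
        }

        /** Adds a field and returns this node so calls can be chained. */
        ClassNode field(String type, String name) {
            fields.add(new FieldNode(type, name));
            return this;
        }

        /** The only place where text is assembled. */
        String render() {
            StringBuilder out = new StringBuilder();
            out.append("class ").append(name).append(" {\n");
            for (FieldNode f : fields) {
                out.append("    private ").append(f.type)
                   .append(' ').append(f.name).append(";\n");
            }
            out.append("}\n");
            return out.toString();
        }
    }

    /** Builds a small example class declaration. */
    static String example() {
        return ClassNode.named("Person")
                .field("String", "name")
                .field("int", "age")
                .render();
    }

    public static void main(String[] args) {
        System.out.print(example());
    }
}
```

A real metamodel would of course type the slots more strictly than `String`; the sketch only shows the shape of the API.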

But problems remain. Another (perceived?) problem of AST-to-AST is the laborious coding of visitors, pattern matching and rewriting, and multi-stage AST transformations (yes, model transformations). As usual, DSLs have been proposed to solve that, too. But they are DSLs, i.e., yet another language to learn. You want to learn them? This way please.

But LINQ is on the march, and it's being used for code generation. It's being used to query the input (which can be even C# API elements, using LINQ to Reflection) and it's being used to piece together the resulting code:
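As a rough Java analogue of that query-then-generate style, and only as an illustration: the sketch below uses `java.util.stream` (which arrived with Java 8, so it postdates this post and is not LINQ) to query a class via reflection and piece together an interface declaration. `Customer`, `getters` and `generateInterface` are hypothetical names.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class ReflectionQuerySketch {

    /** Hypothetical input type whose API is queried. */
    static class Customer {
        public String getName() { return "n"; }
        public int getAge() { return 0; }
        public void reset() { }
        private String getSecret() { return "s"; }
    }

    /** Query: public, zero-argument, non-void "get" methods, ordered by name. */
    static List<Method> getters(Class<?> type) {
        return Arrays.stream(type.getDeclaredMethods())
                .filter(m -> Modifier.isPublic(m.getModifiers()))
                .filter(m -> m.getName().startsWith("get"))
                .filter(m -> m.getParameterCount() == 0)
                .filter(m -> m.getReturnType() != void.class)
                // getDeclaredMethods() has no guaranteed order, so sort explicitly.
                .sorted(Comparator.comparing(Method::getName))
                .collect(Collectors.toList());
    }

    /** Pieces the query result together into an interface declaration. */
    static String generateInterface(Class<?> type) {
        String body = getters(type).stream()
                .map(m -> "    " + m.getReturnType().getSimpleName()
                        + " " + m.getName() + "();\n")
                .collect(Collectors.joining());
        return "interface " + type.getSimpleName() + "View {\n" + body + "}\n";
    }

    public static void main(String[] args) {
        System.out.print(generateInterface(Customer.class));
    }
}
```

The output is still assembled from strings here; combining such queries with typed AST nodes is what would remove the remaining string handling.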

As the ultimate test of suitability for AST-to-AST, LINQ is being used to refactor (refactor!) C# ASTs, no less. Compare with the article "Unleashing the Power of Refactoring" (that article does a great job of explaining the JDT API for refactoring, so the comparison is *not* biased toward LINQ).

LINQ is not a DSL; its declarative constructs are based on logical commonalities in data access, and they are better factored than those of model transformation DSLs. To my (informed) taste at least. And you're always free to extend LINQ with operators of your own, in a type-safe way, that is. Or perhaps what finally convinced me were all the optimizations performed out-of-the-box when evaluating LINQ:

It's a pity we have no Java-based library to execute LINQ queries, that is, queries expressed as ASTs (remember http://bugs.eclipse.org/234003 ?), although a dedicated textual syntax would be nice, too. All right, all right: even if you'll settle only for textual syntax, there's an interim solution until Java supports lambda expressions.

There's also talk about "LINQ in Scala". What I say is: it's not about syntax! (That's the easy part, in fact.) What makes a BIG difference are the optimizations supported by the engine when evaluating LINQ queries, input syntax notwithstanding.

If the above strikes a chord with you, let's join forces! I want to get rid of StringBuffer.append()! One of the grassroots movements to implement LINQ in Java is http://groups.google.de/group/jlinq and with some luck we won't need to wait for Java 7+ to enjoy the benefits of language-integrated queries.

P.S. If you need to convince your university department that the above qualifies for a master's thesis, just flash the following papers and they'll accept immediately:

(you could also slip in one of my papers ;-)

Of course there are other opinions on best practices around code generation. But I hope you'll be reminded of LINQ when tracking down some obscure bug due to StringBuffer.append() in your favorite template language :-)

Miguel
http://www.sts.tu-harburg.de/~mi.garcia