

Ontologies and Domain-Specific Modeling

September 04, 2014 17:06:47 +0300 (EEST)

A little while back a customer asked about the difference between DSM and ontologies, here's my opinion.

A Domain-Specific Modeling language has many things in common with an ontology: classes in a hierarchy, slots and rules about what values they can hold (including instances of other classes), and of course the ability to instantiate the resulting language/ontology. Creating a DSM language also has things in common with creating an ontology: domain analysis, bounding the domain, trade-offs between theoretical accuracy and practical usability, the importance of good names, etc.

However, ontologies and DSM languages differ in how and why they are used, at least for a stereotypical case:

Purpose:
  • Ontology: describing something that exists
  • DSM: designing something that will be created (often automatically from the instance)
Instantiation:
  • Ontology: only once, either globally or once by each user
  • DSM: many times by each user, to create many different things
Querying:
  • Ontology: often ask questions of the instances, like querying a database
  • DSM: rarely queried manually, but often read by generators that produce programs
Cf.:
  • Ontology: XML schema and instance
  • DSM: programming language grammar and programs
Creation UI:
  • Ontology: tree view plus a property sheet; no possibility for manual layout
  • DSM: graphical diagram, matrix or table; layout made by creator of the instance
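As a toy illustration of that contrast, here is a sketch of the DSM side in Python. Everything in it is invented for illustration (no real tool's API is shown): a tiny "metamodel" with a class, slots and a value rule, instantiated many times, with a generator reading the instances to produce code — where a typical ontology would be instantiated once and queried.

```python
# Hypothetical sketch of the DSM column above; names are invented.

class WatchApp:                          # a "class" in the DSM language
    def __init__(self, name, buttons):
        self.name = name                 # "slots", with rules on their values
        if not 1 <= len(buttons) <= 4:
            raise ValueError("a watch app has 1-4 buttons")
        self.buttons = buttons

def generate(app):
    # The generator reads the instance (the design) and produces program code.
    handlers = "\n".join(
        f"void on_{b}() {{ /* ... */ }}" for b in app.buttons)
    return f"// {app.name}\n{handlers}"

# Unlike a stereotypical ontology, the language is instantiated many times,
# once per thing to be created:
alarm = WatchApp("Alarm", ["up", "down", "set"])
timer = WatchApp("Timer", ["start", "stop"])
print(generate(alarm))
print(generate(timer))
```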


Have your language built while you wait, Code Generation 2012

March 08, 2012 15:18:45 +0200 (EET)

From 15.15-16.45 on Thursday 28 March at the Code Generation conference, I'll be leading a new kind of session called "Have your language built while you wait". 15 master craftsmen, representing 11 top language workbench tools, have volunteered their time to build languages for participants' domains:

"Imagine the scene: master craftsmen await, hands poised over mouse and keyboard, ready for you to describe your domain. Together you draft out a prototype language for that domain, seeing it grow as they implement it in their tool. If you want, they might even give you the controls and let you get the feel of things yourself. When the whistle blows after 20 minutes, the work is saved for you and you move on to another craftsman, a different tool, and maybe an entirely different approach. Think of it as high tech speed dating, but without the crushing humiliation."

  • Get a language for your domain, free and made by an expert.
  • Learn about the process of language creation and implementation.
  • Get familiar with different tools and approaches.
  • See your domain from new points of view.

The session is intended for anyone interested in seeing what a language for their domain might look like, or how the language they already have in mind would look in different tools. If you don't have a domain of your own, we'll provide a choice of familiar domains to challenge the master craftsmen with, or you can just sit in and watch the fun.

If you've registered for Code Generation, you can choose which tools you're interested in, and we'll do our best to oblige. Since each master craftsman can only see a few people, places are limited so choose quickly!


Code Generation 2011 -- fulfilling the promise of MDD

May 06, 2011 12:22:44 +0300 (EEST)

Code Generation 2011 is coming up fast: less than 3 weeks now! If you haven't already got your ticket, book now -- as I write, there's one discounted ticket left. As in previous years, the lineup is impressive: Ed Merks, Terence Parr, Jos Warmer, Johan den Haan, Pedro Molina and of course experts from MetaCase, Itemis, and Microsoft. It's great to see 10 out of 27 sessions are Experience Reports or Case Studies: Model-Driven Development is an industrial reality, and the media attention comes from real world successes, not academic theory or vendor hype.

Still, MDD has a long way to go to fulfill all its promise, and there are many misunderstandings and prejudices to be corrected. Often the best way for people to learn more is through a discussion, so I was pleased to see Johan den Haan's Goldfish Bowl on "Making Model-Driven Software Development live up to its promises" -- and happy to accept his invitation to kick things off on the topic of "To make MDD successful we need to focus on domain experts and really abstract away from code". Other suggested recipes for MDD's world domination include "marketing", "alignment with programming", and "better education".

Looks like there’s an interesting three-way divide on these MDD issues, depending on the kind of language and who supplies it:

  • vendor-supplied general purpose language (e.g. Rational Rose)
  • vendor-supplied domain-specific language (e.g. Mendix)
  • customer-created domain-specific language (e.g. with MetaEdit+)

The distinction between the first two is obviously important, but not a black and white one: more a sliding scale of domain-specificity. And between vendor-supplied and customer-created there are variations like vendor-supplied customizable, consultant-supplied, etc. (about which more in Juha-Pekka's "Build or Buy" panel at CG2011).

Probably it comes down to three main orthogonal dimensions: domain-specific vs. general purpose, problem domain vs. solution domain, and in-house language ownership vs. outsider-supplied. We could add other dimensions, e.g. text vs. graphics, which is really a sliding scale from “text as characters” through MPS and Visio to “graphical with reusable objects”. Together these dimensions give us a wide and varied space, basically encompassing what gets labelled as MDD. The space is however far from homogeneous, and certainly not evenly populated. Instead, there are lots of interesting clusters where the answers to these issues are similar, but radically different from other clusters. In that respect, there's no one recipe for MDD promotion.

For me, there’s no one recipe for MDD practice either: it depends on the focus, scope and size of the project, the abilities of the developers, and the availability of tools. But I’m pretty sure industry behavior as a whole is inefficient in having too generic languages, too much focus on the solution domain, not enough in-house language building, and too much in-house tool building. So I’m happy to preach the good news of companies creating their own problem-domain specific modeling languages with a good language workbench!


How to build good graphical DSLs

February 21, 2011 17:44:22 +0200 (EET)

Daniel L. Moody (no, not that D L Moody!) has been working for well over a decade on what makes a good graphical language. He brings together previous work in cognitive psychology (which gave us the much-used and much-misunderstood 7±2) with empirical studies of graphical language usability. I've referenced him in some of my talks (e.g. Code Generation 2009 keynote), but I've always been frustrated by the lack of a freely available, in-depth presentation of his research.

That now changes: there's a direct link to his 2009 IEEE Transactions on Software Engineering paper, "The Physics of Notations" in a news item on the web site of his current employer, Ajilon:

"In a nutshell, my paper defines a set of principles for designing cognitively effective visual notations. Visual notations have been used in software engineering since its earliest beginnings, and even before - the first one was designed by Goldstine and von Neumann in 1946 - but surprisingly, this is the first time someone has tried to do this."
Daniel says he received a phone call last month from Grady Booch, one of the architects of UML (the industry standard language for software engineering), who is now Chief Scientist at IBM.
"He [Booch] told me he loved the paper and only wished he had this when they designed UML - if so, things could have been very different."

That last comment presumably refers to Moody's testing of UML based on the principles he collected. Predictably, as a black and white language relying on rectangles and solid or dashed lines, UML doesn't do very well in the analysis. You can see the slides from his SLE'08 talk on Evaluating the Visual Syntax of UML, which also form an easier introduction to the principles than the somewhat academic article above (151 references!).

Here's a picture from the slides: see how long it takes you to find the "odd one out" in the three grids below:

Another notation researcher I've referenced frequently, e.g. in the DSM book, is Alan Blackwell, whose PhD thesis showed that pictograms are better than simple geometric symbols or photorealistic bitmaps in graphical language notations. Alan is part of the wider research community of Cognitive Dimensions of Notations, whose site also has a lot to offer the graphical language designer.


Interview on Model Driven Software Network

November 12, 2010 01:31:20 +0200 (EET)

It's always fun to see someone put on the spot... On Monday 15th November at 17:00 GMT, you get the chance to listen in as Angelo Hulshout does just that to me! It's the first of a series of 1 hour interviews run by the Model Driven Software Network, looking at where MDD is, how it got there, and where things are going. We'll also be looking at the practical issues and objections that people run into with MDD.

To listen in, you have to sign up here:
Places are limited by the software, so don't wait!

If you have any questions that you'd like Angelo to pose, you can add them as comments at that link. Try to keep them general rather than MetaEdit+ related; if you have tooling questions, our Forums would be a better place.

Edit: If there's anyone out there who doesn't know Angelo, he has a long history in the embedded, modeling and generation communities. In addition to work with Philips, ICT and ASML, he runs his own company, Delphino Consultancy in the Netherlands.


Modeling Wizards keynote

August 05, 2010 15:57:31 +0300 (EEST)

A few Modeling Wizards

I'm privileged to have been invited to give a keynote session at the Modeling Wizards masterclass in Oslo, Sept. 30 -- Oct. 2. As you can see from the pictures, there's an impressive line-up of speakers: Jean Bézivin, Krzysztof Czarnecki, Øystein Haugen and other luminaries from the field of model-driven development.

Unlike other conferences and workshops, this isn't just people submitting their own papers. As the title and line-up maybe reveal, the idea is to offer the best possible training in MDD for the participants. The three-day program offers "a set of carefully selected lectures dealing with various aspects of modeling and with a particular focus on domain-specific languages. The objective is to provide each attendee with sufficient information to understand the main issues and challenges in the use of modeling and domain-specific languages, and also to have a clear picture of the most recent advances in the field."

One thing I'm particularly happy to see is a mini-project running each afternoon, where participants will get the chance to put what they are learning into practice, with a helping hand from the speakers. I'm a big believer in the master craftsman - apprentice mode of learning, and have benefitted greatly from it myself over the years. To ensure the personal attention necessary, places are limited -- so sign up now! The price of around 887€ includes accommodation and all meals for the three days, which compares very favorably with any other training I've seen. With the timing just before MODELS 2010, you can even get two events for the price of one set of air fares!


Code Generation 2010 talks

June 14, 2010 12:11:33 +0300 (EEST)

At this year's Code Generation conference in Cambridge, we're delighted to be able to offer our hands-on session teaching Domain-Specific Modeling with MetaEdit+, which was voted best session at last year's conference. We'll build five modeling languages and generators from scratch in 2½ hours on Wednesday. If you've never created your own modeling language, or have only used things like GMF and think all tools must be similar, you have to see this!

That's a hands-on talk, but there are also wider issues: Most MDD projects fail before they even get started, or drag on but never really fulfill their promise. The reasons are as often human as technical. By knowing what tactics actually work and what don't -- many counter-intuitive or never considered by developers -- you can avoid the pitfalls and frustration of seeing your good ideas wasted.

We'll be looking at this in a new talk at Code Generation 2010: Proven best practices for successful MDD adoption (Thursday 10:45-12:00). This is intended for lead developers, team leads, architects, CTOs, managers - and also anybody trying to help or encourage other companies to adopt MDD.

The presentation will show you the roadmap for successfully introducing MDD into your organization, on both technical and human levels. Technical phases include the domain selection criteria, technical preparedness, language definition and testing, and the final roll-out. On the human level we will look at selling the idea, building mindshare, keeping momentum going, and managing organizational change. Drawing from our experience over the past fifteen years we will pinpoint the key challenges organizations usually face during MDD adoption, and offer some practical solutions to overcome them.

We will lead people through the technical tasks, people issues and process best practices in a tutorial format: teaching supported by slides, examples, cases, discussions and questions. Participants' own experiences and situations will be discussed throughout, so if you have a story to tell, do come along. We'll see if we can do a quick survey of participants to find which obstacles were most frequently encountered, and which were most serious.


Domain-Specific Modeling: MDD that works

March 17, 2010 20:04:43 +0200 (EET)

Long time no blog. Aside from working on the next version of MetaEdit+ (about which more in a later post), I've been speaking at a variety of events. The first, back in November, was for the members of the International Association of Software Architects: a webcast on Domain-Specific Modeling: MDD that works. (The link will take you to the webcast, which was recorded live. Update: Added the codec and a Flash video alternative.) The format worked surprisingly well, even for the question and answer session: the participants could submit questions via the webcast chat tool, and Juha-Pekka picked a selection of them to ask me at the end.

Something that was asked later was the size of the domain: with DSM, this is never something as broad as "embedded systems", or even "mobile phones". Normally it will be much narrower, e.g. a particular family or product line of mobile phones from a particular manufacturer, or home and auto insurance policies in the EU. Making things no broader than necessary is one of the keys to high productivity with your DSM language, and also to making it easier to build the language: stick to what you are experts in, and follow YAGNI to avoid creating extra work for yourself.

One thing that struck me when preparing for the talk was that we often show graphs of how productivity increased in various DSM cases, but we don't normally have time to provide the details of each case. The case studies are, however, available (see table below): some even with the modeling languages and generators publicly available. Most importantly, some of the cases obtained their productivity figures with scientifically rigorous experiments; not many companies have time for that, but those experiments serve to confirm the validity of engineers' estimates in other cases.

Productivity figures can seem a little abstract, so in the talk I looked at them in terms of something more concrete. Imagine that you can build one product in a certain time by coding. How much more can you build if you apply various kinds of modeling? In a previous post we already saw how with normal UML, you can actually build less than with manual coding. With MDA, the tool vendors' own figures range from 22% extra (Obeo: average cost reduction of an MDA project is 18% => 22% productivity increase) to 35% extra (OptimalJ: 30–40%). You can see from the red bars on MDA1 and MDA2 below just how little you get for your investment in tools and training — not to mention the long term nightmare of trying to maintain CIM, PIM, PSM and code in synch with only semi-automatic support. The productivity results for DSM tell a very different story: 5–10x faster, and completely automatic code generation.
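The conversion from Obeo's quoted cost reduction to a productivity increase is simple arithmetic — a sketch of the calculation only, using Obeo's 18% figure:

```python
# If the same work takes 18% less effort, the same effort produces
# 1 / (1 - 0.18) ~= 1.22 times as much: a 22% productivity increase.
def productivity_increase(cost_reduction):
    return 1 / (1 - cost_reduction) - 1

print(f"{productivity_increase(0.18):.0%}")  # 18% cheaper => ~22% more output
```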

But what is the cost of creating your own DSM language and generator? After all, the MDA tool vendors promise to provide those, so that's an extra cost for DSM. Here are the figures in person days for the cases above. I think they speak for themselves: 2–3 weeks by one person is so small it makes you cry. How many companies out there could spend 2–3 weeks to make themselves 5–10x faster?

Of course those figures demand good tools: all those cases are with MetaEdit+. Your mileage will definitely vary with other tools — and what's worse, because they make you do more work, your brain power will be directed away from creating the best possible language. At the end of the day, it's the size of the productivity improvement for the modelers that counts the most. If you can make 10 modelers 6x faster rather than 5x faster, that's worth 10 extra person years per year. With MetaEdit+, you're far more likely to get those extra person years, and you're sure to get your language ready faster. When we see the first case with GMF or DSL Tools making it into the 5–10x range, we can look at this topic again...

Referenced DSM Cases:

  • Heart rate monitor (Polar): Kärnä et al., "Evaluating the Use of DSM in Embedded UI Application Development", Procs of DSM’09 at OOPSLA, 2009.
  • Call processing services: Kelly, S., Tolvanen, J.-P., Domain-Specific Modeling: Enabling Full Code Generation, Chapter 5, Wiley, 2008. [CPL project in MetaEdit+]
  • Touch screen UI applications (Panasonic): Safa, L., "The Making of User-Interface Designer: A Proprietary DSM Tool", Procs of DSM’07 at OOPSLA, 2007.
  • Home automation: Kelly, S., Tolvanen, J.-P., Domain-Specific Modeling: Enabling Full Code Generation, Chapter 5, Wiley, 2008. [Home automation project in MetaEdit+]
  • Mobile phone applications (Nokia): MetaCase, Nokia case study, 2000.
  • Phone switch features: Weiss, D. M., Lai, C. T. R., Software Product-Line Engineering: A Family-Based Software Development Process, Addison Wesley Longman, 1999.
  • Financial web application (Pecunet): Kelly, S., Tolvanen, J.-P., Domain-Specific Modeling: Enabling Full Code Generation, Chapter 6, Wiley, 2008; MetaCase, Pecunet case study, 2001; IASA Architect Skills Library: Domain-Specific Modeling, 2007. [Insurance project in MetaEdit+]


Worst Practices for Domain-Specific Modeling

October 12, 2009 16:52:41 +0300 (EEST)

One of the surprises for me at Code Generation 2009 was during the keynote, when I passed round a list for people to sign up to receive a pre-print of the Worst Practices for Domain-Specific Modeling article that was to appear in IEEE Software. When the list came back, I was astonished to see that basically the entire audience had signed up -- never underestimate the appeal of "free"!

I think the article is important, in that it is the first study of a large sample of DSM languages. The 20 worst practices identified were analysed from a sample spanning:

  • 76 cases of DSM
  • 15 years
  • 4 continents
  • several tools
  • 100 language creators
  • 3 to 300 modelers per case

IEEE Software has now published the next issue after the special issue on DSM, so it seems a fair time to point you to the final version of our article, available for free download:

Worst Practices for Domain-Specific Modeling
Steven Kelly and Risto Pohjonen
IEEE Software, vol. 26, no. 4, pp. 22-29, July/Aug. 2009

Stop press: thanks to the efforts of the tireless Yoshio Asano of Fujisetsubi, the article is also available in Japanese from the same page. Domo arigato, Yoshio-san!


Using UML takes 15% longer than just coding

October 07, 2009 16:20:51 +0300 (EEST)

In the keynote at Code Generation, I mentioned that empirical research shows that using UML does not improve software development productivity: depending on the study, reports ranged from -15% to +10% compared to just coding. I guess most people these days know those results from their own experience, but as the reports I was aware of were from the 1990s, it was interesting to see a more up-to-date article recently:

WJ Dzidek, E Arisholm, LC Briand: A Realistic Empirical Evaluation of the Costs and Benefits of UML in Software Maintenance, IEEE Transactions on Software Engineering, Vol 34 No 3, May/June 2008

Unlike many earlier studies, this uses professional developers and reasonably large tasks. The tasks all extended the same Java Struts web application, in total about 30 hours per developer. 10 developers performed the tasks with Java, and another 10 performed the same tasks with Java and Borland's Together UML tool. The developers using UML were somewhat more experienced -- 256 kLOC of Java under their belts rather than 187 kLOC, and 44% longer Struts experience -- but otherwise the groups were similar. Time was measured until submission of a correct solution, giving a reasonably sound basis for comparison. Here are the results:

Time to correctly complete 5 tasks: with UML 2300 minutes, without UML 2000 minutes

Compared to just coding, using UML took 15% longer to reach a correct solution (the green bar). In addition, it looks like even using UML to help you understand the code gives no benefit over just reading the code: the blue and red bars are the same length as the purple bar. As the tasks only looked at extending an existing system with existing models, we can't say for sure whether the story is the same in initial implementation, but other studies indicate it is.

One bad thing about the article is that it tries to obfuscate this clear result by subtracting the time spent on updating the models: the whole times are there, but the abstract, intro and conclusions concentrate on the doctored numbers, trying to show that UML is no slower. Worse, the authors try to give the impression that the results without UML contained more errors -- although they clearly state that they measured the time to a correct submission. They claim a "54% increase in functional correctness", which sounds impressive. However, alarm bells started ringing when I saw that the actual data even shows a 100% increase in correctness for one task. That would mean all the UML solutions were totally correct, and all the non-UML solutions were totally wrong, wouldn't it? But not in their world: what it actually meant was that out of 10 non-UML developers, all submissions were correct apart from one mistake made by one developer in an early submission, which he later corrected. Since none of the UML developers made a mistake in their initial submissions of that particular task, they calculated a 100% difference, and tried to claim that as a 100% improvement in correctness -- ludicrous!

To calculate correctness they should really have had a number of things that had to be correct, e.g. 20 function points. Calculated like that, the value for 1 mistake would drop by a factor of 20, down from 100% to just 5% for that developer, and 0.5% over all non-UML developers. I'm pretty sure that calculated like that there would be no statistically significant difference left. Even if there was, times were measured until all mistakes were corrected, so all it would mean is that the non-UML developers were more likely to submit a code change for testing before it was completely correct. Quite possibly the extra 15% of time spent on updating the models gave the developer time to notice a mistake, perhaps when updating that part of the model, and so he went straight back to making a fix rather than first submitting his code for testing. In any case, to reach the same eventual level of quality took 15% longer with UML than without: if you have a quality standard to meet, using UML won't make you get there any more certainly, it will just slow you down.
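A sketch of that recalculation, assuming the hypothetical 20 function points per task and the 10 non-UML developers (the numbers are illustrative, not from the paper's data):

```python
# One wrong function point, by one developer, in an early submission:
function_points = 20
developers = 10

per_developer = 1 / function_points        # 1 wrong point of 20 => 5%
across_group = per_developer / developers  # averaged over 10 developers => 0.5%
print(f"{per_developer:.0%} for that developer, "
      f"{across_group:.1%} over all non-UML developers")
```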

To their credit, the authors point out two similar experiments as related work. One showed UML took 27% longer, the other 48% longer. The percentage of time spent updating models was also larger: 30-35% (which may be because those studies only measured time until the first submission of a solution: correcting bugs was probably mostly coding, so if measured to a correct solution the UML time would only increase a little and hence the percentages would drop).

So what do we learn from all this? Probably nothing new about UML, but at least a confirmation that earlier results still apply, even for real developers on realistic projects using today's UML tools. Maybe more importantly, we can see that empirical research, properly written up, is valuable in helping us decide whether something really improves productivity or not. Ignore the conclusions (they probably existed in the minds of the authors before the paper was written), but look at the data and the analysis. Throw out the chaff, and draw your own conclusions from what is left. Above all, don't blindly accept or reject what they say, just because it agrees or disagrees with your existing prejudice. There's at least a chance that you might learn something!


Code Generation 2009 round-up

June 24, 2009 01:25:20 +0300 (EEST)

Once again, Code Generation proved itself as the best European conference on Model-Driven Development. Lots of smart people, lots of experience, lots of enthusiasm, lots of willingness to listen and learn from others. Even though having to prepare and run some sessions hampered me from seeing as much of the rest as I'd like, there's still too much to write for one blog post. I'll post about things I'm certain of first, and come back to things like Xtext and MPS after further investigation.


The two keynotes, both presented as a double act by me and Markus Völter, seemed to go down well. Mark Dalgarno had a surprise up his sleeve, presenting us with a blind choice of weapons from a black bag. We then had to duel it out, graphical DSM against textual DSLs, with the plastic gun and dagger we picked. Since I got the gun, I think the result was a foregone conclusion :-). The dagger may be a "weapon from an earlier, more civilized age", but it's only useful if you can get in close to your adversary. Similarly, text may be more familiar, but it does often tie you closer to the code; problem domain DSLs in text seem as rare as accurate knife throwers. Markus successfully stabbed me in the back later on, so that evened things up and emphasized the point from our slides: both text and graphics are useful in the right place. Choose, but choose wisely.

It was fun to see the keynote get picked up on Twitter:

EelcoVisser: keynote by @markusvoelter and Steven Kelly at #cg2009: great overview of issues in model-driven development
HBehrens: Steven Kelly at #cg2009 keynote: "wizard based generators create a large legacy application you've never seen before"

The latter was picked up by several people. The reference was to vendor-supplied wizards, often found in IDEs or SDKs, that create skeleton applications for you based on your input. Since the vendors take pride in just how much boilerplate they can spew out, you're left with a mass of generated code that you've never seen before, but must extend with your own code. Worse, you're responsible for maintaining the whole ensuing mixture, and there's no chance of re-running the wizard to change some of the choices -- at least not without losing or invalidating the code you've added. That's in sharp contrast with generation in DSM, where your input is in the form of a model which you can edit at any time. You get the speed of generation but can remain at a high level of abstraction throughout.

MetaEdit+ Hands-on

We'd decided to try something special in the hands-on: building 5 different graphical modeling languages from scratch in under 3 hours. Rather than being random exercises, the languages were increasingly good ways of modeling the same domain. We started with something that was basically just the current code turned into graphics, and ended up with a language that reduced the modeling work to a third of what it was at its worst, with many possible errors ruled out by the language design and rules, and with much better scope for reuse. We showed how to make generators for all the languages, and actually built them for two. And of course since this was MetaEdit+, simply defining the metamodel already gave you a full graphical modeling environment -- we just tweaked the symbols to taste.

Never having run the session before, we were rather nervous about how much we could achieve in the time available. In the end, thanks to great slides from Risto Pohjonen and testing from Janne Luoma, it seems we pretty much hit our target. Only at the very end of the last language did we have some people only just starting the last section (the generator) while others were finishing it and going on to beautify the symbols or play around with other fun features of MetaEdit+. Hopefully people learned not just about MetaEdit+ as a tool, but also how to make better languages and improve existing ones. Feedback online was encouraging:

PeterBell: Great metaedit hands on - built and refactored language and generator in just a couple of hours at #cg2009
elsvene: been to a great hands-on session for MetaEdit+. Really interesting tool! #cg2009
HBehrens: for me MetaEdit is the most sophisticated graphical modeling tool currently available #cg2009. Thanks for this session!


The conference dinner was of the high standard you'd expect from a Cambridge college. The airy hall and contemporary art lent a friendly ambience. The large round tables weren't particularly conducive to conversation: you could only really talk to the people on either side of you without shouting or craning your neck. On long tables you can reach 5 people for the same effort. I was fortunate to be sitting between Scott Finnie and Jon Hurwitz, so I certainly didn't suffer.

The "suffering" started later, when there was a raffle in aid of Bletchley Park, the home of Allied code-breaking work in World War II. I ended up winning a prize donated by Microsoft: a screwdriver toolkit and MSDN T-shirt, causing much hilarity and bad jokes about finally getting Microsoft tools that didn't crash. The irony continued when Alan Cameron Wills won a signed copy of our Domain-Specific Modeling book -- despite having received one from us last year. Either the older British segment of the audience were most inclined to support Bletchley Park by buying raffle tickets, or the draw was rigged to encourage vendor co-operation. The people on my table were having none of that, and encouraged me to cover up the Microsoft logos :-). All in all a good laugh, and in a good cause.
