

Ontologies and Domain-Specific Modeling

September 04, 2014 17:06:47 +0300 (EEST)

A little while back a customer asked about the difference between DSM and ontologies, here's my opinion.

A Domain-Specific Modeling language has many things in common with an ontology: classes in a hierarchy, slots and rules about what values they can hold (including instances of other classes), and of course the ability to instantiate the resulting language/ontology. Creating a DSM language also has things in common with creating an ontology: domain analysis, bounding the domain, trade-offs between theoretical accuracy and practical usability, the importance of good names, etc.
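The shared structure — classes in a hierarchy, slots with value constraints, and instantiation — can be sketched in a few lines. This is a toy illustration only; the names (Concept, Slot, etc.) are invented for the example and come from no particular ontology or DSM tool.

```python
# Toy sketch of what ontologies and DSM languages share:
# a class hierarchy, typed slots, and checked instantiation.

class Slot:
    def __init__(self, name, allowed_type):
        self.name, self.allowed_type = name, allowed_type

class Concept:
    def __init__(self, name, parent=None, slots=()):
        self.name, self.parent, self.slots = name, parent, list(slots)

    def all_slots(self):
        # Slots are inherited down the class hierarchy.
        inherited = self.parent.all_slots() if self.parent else []
        return inherited + self.slots

    def instantiate(self, **values):
        # Rules about what values slots may hold are enforced here.
        instance = {}
        for slot in self.all_slots():
            value = values[slot.name]
            if not isinstance(value, slot.allowed_type):
                raise TypeError(f"{slot.name} must be {slot.allowed_type.__name__}")
            instance[slot.name] = value
        return instance

device = Concept("Device", slots=[Slot("name", str)])
sensor = Concept("Sensor", parent=device, slots=[Slot("interval_ms", int)])
thermo = sensor.instantiate(name="Thermometer", interval_ms=500)
```

Both worlds have this skeleton; the table below is about what you then *do* with the instances.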

However, ontologies and DSM languages differ in how and why they are used, at least for a stereotypical case:

  Purpose:
    Ontology: describing something that exists
    DSM: designing something that will be created (often automatically from the instance)
  Instantiation:
    Ontology: only once, either globally or once by each user
    DSM: many times by each user, to create many different things
  Querying:
    Ontology: instances are often queried directly, like a database
    DSM: models are rarely queried manually, but are often read by generators that produce programs
  Cf.:
    Ontology: XML schema and instance
    DSM: programming language grammar and programs
  Creation UI:
    Ontology: tree view plus a property sheet; no possibility for manual layout
    DSM: graphical diagram, matrix or table; layout made by the creator of the instance


D-TDD: Destruction Test Driven Development

January 24, 2013 18:00:25 +0200 (EET)

If you've ever seen a child learn to stack blocks, you'll know that the greatest pleasure isn't derived from the beauty or height of the structure. No: the squeals of joy are reserved for when he knocks it down, and the order of the tower is replaced by the chaos of flying blocks.

Last Friday evening I took an equally constructive approach to work on the MetaEdit+ 5.0 multi-user version. We're at the stage where the single-user version is tested and released, and "the first build that could possibly work" of the multi-user clients had shown no problems in normal user testing. So I set up the server and a couple of clients on my PC, and scripted the clients to bash the server as fast as they could with requests. In each transaction, a client would try to lock and change 10% of the repository objects, then either abandon or commit the transaction.
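The scripted client loop can be sketched roughly as below. The Repository and Transaction classes here are a toy in-memory stand-in, not the MetaEdit+ API; only the shape of the loop (lock and change ~10% of objects, then commit or abandon at random) reflects the actual test.

```python
import random

# Toy stand-in for a repository with lockable objects; illustrative only.
class Transaction:
    def __init__(self, repo):
        self.repo, self.changes = repo, {}

    def lock_and_change(self, key):
        self.changes[key] = self.repo.data[key] + 1  # some mutation

    def commit(self):
        self.repo.data.update(self.changes)

    def abandon(self):
        self.changes.clear()

class Repository:
    def __init__(self, size):
        self.data = {i: 0 for i in range(size)}

    def begin(self):
        return Transaction(self)

def bash_server(repo, rounds, change_fraction=0.10, abandon_rate=0.5):
    """Each round: lock and change a fraction of objects, then commit or abandon."""
    committed = 0
    for _ in range(rounds):
        tx = repo.begin()
        victims = random.sample(list(repo.data),
                                max(1, int(len(repo.data) * change_fraction)))
        for key in victims:
            tx.lock_and_change(key)
        if random.random() < abandon_rate:
            tx.abandon()
        else:
            tx.commit()
            committed += 1
    return committed
```

In the real test, many such clients ran this loop concurrently against one server; the abandon/commit mix is what shakes out locking and rollback bugs.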

As that seemed to bubble along quite happily, I started up my laptop too, and used a batch file to start a MetaEdit+ client (from the original PC's disk) and run the same script. And again for a fourth client, whereupon I remembered a PC in the next office was unused, and I knew its MAC address so could start it up with WakeOnLAN. (You really learn to appreciate Remote Desktop Connection, VNC and WakeOnLAN when it's -29°C...)

By the end of the evening, I'd squished a couple of bugs: the more cycles you run, the less chance there is of bugs being able to hide. I'd also progressed to four PCs, running a total of 12 clients. Over the course of the weekend, I occasionally poked my nose into a remote desktop to see how things were doing, and another bug was found (apparently in Windows XP's networking, since it didn't appear on the other platforms; still, easy enough to just retry if it occurred).

At some point I restarted the experiment with the bug fixes in place from the start, to get a consistent set of data. At that point the original repository had grown from 40MB to over 4GB, as each client was creating several times the original amount of data each hour. As I woke up with the 'flu on Monday, the experiment continued in a similar fashion through to my return to work on Thursday. The last session was started Wednesday afternoon, with up to 20 clients, and by the same time Thursday its server had processed 1TB of disk I/O, in under 10 hours of CPU time and only 32MB of memory.

So, what do we learn from this?

  1. Destruction testing is fun! Just as much fun as with the blocks, and even more so: when you catch a bug as it happens, and fix it in the server on the fly, the tumbling client blocks reassemble themselves into the tower.
  2. Destruction testing is necessary. There are some bugs you'll never catch with code inspection, manual testing or unit tests. They might occur only once a year in normal use - and at a customer, where all you get is an error log. By forcing them out of the woodwork with massive destruction testing, you can see them several times, spot the pattern, and confirm a fix.
  3. Destruction testing is easier than ever before. Earlier, the operating system, hardware, or time would be the limiting factor more often than not. Now, normal PCs are reliable and performant enough to keep the focus on what you want to be tested, rather than how to keep enough PCs running long enough to test it.
  4. Destruction testing is not scalability testing. It may stray into that area, but it has a different purpose. The repository used in MetaEdit+ has been tested with hundreds of simultaneous users, so that wasn't in doubt. The point here was to flush out any bugs that had crept in with the new MetaEdit+ and platform versions.
  5. Destruction testing is not bulletproof. There are plenty of bugs that it won't find, but it will find bugs you can't find another way. Since you can't test everything to destruction, concentrate on testing lower-level services called by everything, or just the most common services. Other kinds of testing are better at covering the breadth of functionality.


Have your language built while you wait, Code Generation 2012

March 08, 2012 15:18:45 +0200 (EET)

From 15.15-16.45 on Thursday 28 March at the Code Generation conference, I'll be leading a new kind of session called "Have your language built while you wait". 15 master craftsmen, representing 11 top language workbench tools, have volunteered their time to build languages for participants' domains:

"Imagine the scene: master craftsmen await, hands poised over mouse and keyboard, ready for you to describe your domain. Together you draft out a prototype language for that domain, seeing it grow as they implement it in their tool. If you want, they might even give you the controls and let you get the feel of things yourself. When the whistle blows after 20 minutes, the work is saved for you and you move on to another craftsman, a different tool, and maybe an entirely different approach. Think of it as high tech speed dating, but without the crushing humiliation."

  • Get a language for your domain, free and made by an expert.
  • Learn about the process of language creation and implementation.
  • Get familiar with different tools and approaches.
  • See your domain from new points of view.

The session is intended for anyone interested in seeing what a language for their domain might look like, or how the language they already have in mind would look in different tools. If you don't have a domain of your own, we'll provide a choice of familiar domains to challenge the master craftsmen with, or you can just sit in and watch the fun.

If you've registered for Code Generation, you can choose which tools you're interested in, and we'll do our best to oblige. Since each master craftsman can only see a few people, places are limited, so choose quickly!


Guilty until proved innocent? Flagging unrecognized downloads as malicious

February 27, 2012 19:07:43 +0200 (EET)

Google Chrome's "this file appears malicious" warnings are false and unfounded in too many cases. Similar problems exist with IE, and some anti-virus software. Their tests include two factors that have nothing to do with whether the code is malicious: packed executable, and low number of previous downloads.

Packing executables is good practice: they take up less space and bandwidth, and are faster to start up from hard disk. Like including some form of software protection or obfuscation, packing may make it harder to recognize or analyse the program, but that does NOT mean it appears malicious.

Software downloads follow the law of the long tail: things like Flash and Adobe Reader installers are frequently encountered, but there is a massive amount of software that is rarely downloaded yet very useful to some. Recognizing something as a common download tells you it's non-malicious, but not recognizing something does NOT mean it appears malicious.

Both packing and infrequent downloads simply mean that you can't say much about that software. In that case, the principle must be 'innocent until proven guilty'.
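The argument amounts to a three-valued verdict rather than a two-valued one. A minimal sketch (the function and its fields are invented for illustration, not any vendor's actual logic):

```python
# "Innocent until proven guilty": packing and rarity can only ever
# leave a file in the "unknown" bucket, never push it to "malicious".

def verdict(known_bad: bool, known_good: bool,
            packed: bool = False, rarely_downloaded: bool = False) -> str:
    if known_bad:
        return "malicious"   # positive evidence of malice required
    if known_good:
        return "clean"       # positive evidence of safety
    # Note: packed and rarely_downloaded deliberately have no effect here.
    return "unknown"
```

Flagging "unknown" as malicious, as the browsers criticized above do, conflates absence of evidence with evidence of maliciousness.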

If you see someone on the street with a black mask and knife in his hand, he appears malicious; if you see a friend you recognize, he doesn't appear malicious; but if you see someone you don't recognise, and who is mostly obscured by a crowd, you can't go around shouting to everybody that he's malicious.


Code Generation 2011 -- fulfilling the promise of MDD

May 06, 2011 12:22:44 +0300 (EEST)

Code Generation 2011 is coming up fast: less than 3 weeks now! If you haven't already got your ticket, book now -- as I write, there's one discounted ticket left. As in previous years, the lineup is impressive: Ed Merks, Terence Parr, Jos Warmer, Johan den Haan, Pedro Molina and of course experts from MetaCase, Itemis, and Microsoft. It's great to see 10 out of 27 sessions are Experience Reports or Case Studies: Model-Driven Development is an industrial reality, and the media attention comes from real world successes, not academic theory or vendor hype.

Still, MDD has a long way to go to fulfill all its promise, and there are many misunderstandings and prejudices to be corrected. Often the best way for people to learn more is through a discussion, so I was pleased to see Johan den Haan's Goldfish Bowl on "Making Model-Driven Software Development live up to its promises" -- and happy to accept his invitation to kick things off on the topic of "To make MDD successful we need to focus on domain experts and really abstract away from code". Other suggested recipes for MDD's world domination include "marketing", "alignment with programming", and "better education".

Looks like there’s an interesting three-way divide on these MDD issues, depending on the kind of language and who supplies it:

  • vendor-supplied general purpose language (e.g. Rational Rose)
  • vendor-supplied domain-specific language (e.g. Mendix)
  • customer-created domain-specific language (e.g. with MetaEdit+)

The first two obviously form an important distinction, but not a black-and-white one: more a sliding scale of domain-specificity. And between vendor-supplied and customer-created are variations like vendor-supplied customizable, consultant-supplied etc. (about which more in Juha-Pekka's "Build or Buy" panel at CG2011).

Probably it comes down to three main orthogonal dimensions: domain-specific vs. general purpose, problem domain vs. solution domain, and in-house language ownership vs. outsider-supplied. We could add other dimensions, e.g. text vs. graphics, which is really a sliding scale from “text as characters” through MPS and Visio to “graphical with reusable objects”. Together these dimensions give us a wide and varied space, basically encompassing what gets labelled as MDD. The space is however far from homogeneous, and certainly not evenly populated. Instead, there are lots of interesting clusters where the answers to these issues are similar, but radically different from other clusters. In that respect, there's no one recipe for MDD promotion.

For me, there’s no one recipe for MDD practice either: it depends on the focus, scope and size of the project, the abilities of the developers, and the availability of tools. But I’m pretty sure industry behavior as a whole is inefficient in having too generic languages, too much focus on the solution domain, not enough in-house language building, and too much in-house tool building. So I’m happy to preach the good news of companies creating their own problem-domain specific modeling languages with a good language workbench!


How to build good graphical DSLs

February 21, 2011 17:44:22 +0200 (EET)

Daniel L. Moody (no, not that D L Moody!) has been working for well over a decade on what makes a good graphical language. He brings together previous work in cognitive psychology (which gave us the much-used and much-misunderstood 7±2) with empirical studies of graphical language usability. I've referenced him in some of my talks (e.g. Code Generation 2009 keynote), but I've always been frustrated by the lack of a freely available, in-depth presentation of his research.

That now changes: there's a direct link to his 2009 IEEE Transactions on Software Engineering paper, "The Physics of Notations" in a news item on the web site of his current employer, Ajilon:

"In a nutshell, my paper defines a set of principles for designing cognitively effective visual notations. Visual notations have been used in software engineering since its earliest beginnings, and even before - the first one was designed by Goldstine and von Neumann in 1946 - but surprisingly, this is the first time someone has tried to do this."
Daniel says he received a phone call last month from Grady Booch, one of the architects of UML (the industry standard language for software engineering), who is now Chief Scientist at IBM.
"He [Booch] told me he loved the paper and only wished he had this when they designed UML - if so, things could have been very different."

That last comment presumably refers to Moody's testing of UML based on the principles he collected. Predictably, as a black and white language relying on rectangles and solid or dashed lines, UML doesn't do very well in the analysis. You can see the slides from his SLE'08 talk on Evaluating the Visual Syntax of UML, which also form an easier introduction to the principles than the somewhat academic article above (151 references!).

Here's a picture from the slides: see how long it takes you to find the "odd one out" in the three grids below:

Another notation researcher I've referenced frequently, e.g. in the DSM book, is Alan Blackwell, whose PhD thesis showed that pictograms are better than simple geometric symbols or photorealistic bitmaps in graphical language notations. Alan is part of the wider research community of Cognitive Dimensions of Notations, whose site also has a lot to offer the graphical language designer.


Interview on Model Driven Software Network

November 12, 2010 01:31:20 +0200 (EET)

It's always fun to see someone put on the spot... On Monday 15th November at 17:00 GMT, you get the chance to listen in as Angelo Hulshout does just that to me! It's the first of a series of 1 hour interviews run by the Model Driven Software Network, looking at where MDD is, how it got there, and where things are going. We'll also be looking at the practical issues and objections that people run into with MDD.

To listen in, you have to sign up here:
Places are limited by the software, so don't wait!

If you have any questions that you'd like Angelo to pose, you can add them as comments at that link. Try to keep them general rather than MetaEdit+ related; if you have tooling questions, our Forums would be a better place.

Edit: If there's anyone out there who doesn't know Angelo, he has a long history in the embedded, modeling and generation communities. In addition to work with Philips, ICT and ASML, he runs his own company, Delphino Consultancy in the Netherlands.


Modeling Wizards keynote

August 05, 2010 15:57:31 +0300 (EEST)

A few Modeling Wizards

I'm privileged to have been invited to give a keynote session at the Modeling Wizards masterclass in Oslo, Sept. 30 -- Oct. 2. As you can see from the pictures, there's an impressive line-up of speakers: Jean Bézivin, Krzysztof Czarnecki, Øystein Haugen and other luminaries from the field of model-driven development.

Unlike other conferences and workshops, this isn't just people submitting their own papers. As the title and line-up maybe reveal, the idea is to offer the best possible training in MDD for the participants. The three-day program offers "a set of carefully selected lectures dealing with various aspects of modeling and with a particular focus on domain-specific languages. The objective is to provide each attendee with sufficient information to understand the main issues and challenges in the use of modeling and domain-specific languages, and also to have a clear picture of the most recent advances in the field."

One thing I'm particularly happy to see is a mini-project running each afternoon, where participants will get the chance to put what they are learning into practice, with a helping hand from the speakers. I'm a big believer in the master craftsman - apprentice mode of learning, and have benefitted greatly from it myself over the years. To ensure the personal attention necessary, places are limited -- so sign up now! The price of around 887€ includes accommodation and all meals for the three days, which compares very favorably with any other training I've seen. With the timing just before MODELS 2010, you can even get two events for the price of one set of air fares!


Code Generation 2010 talks

June 14, 2010 12:11:33 +0300 (EEST)

At this year's Code Generation conference in Cambridge, we're delighted to be able to offer our hands-on session teaching Domain-Specific Modeling with MetaEdit+, which was voted best session at last year's conference. We'll build five modeling languages and generators from scratch in 2½ hours on Wednesday. If you've never created your own modeling language, or have only used things like GMF and think all tools must be similar, you have to see this!

That's a hands-on talk, but there are also wider issues: most MDD projects fail before they even get started, or drag on but never really fulfill their promise. The reasons are as often human as technical. By knowing which tactics actually work and which don't -- many counter-intuitive or never considered by developers -- you can avoid the pitfalls and frustration of seeing your good ideas wasted.

We'll be looking at this in a new talk at Code Generation 2010: Proven best practices for successful MDD adoption (Thursday 10:45-12:00). This is intended for lead developers, team leads, architects, CTOs, managers - and also anybody trying to help or encourage other companies to adopt MDD.

The presentation will show you the roadmap for successfully introducing MDD into your organization, on both technical and human levels. Technical phases include the domain selection criteria, technical preparedness, language definition and testing, and the final roll-out. On the human level we will look at selling the idea, building mindshare, keeping momentum going, and managing organizational change. Drawing from our experience over the past fifteen years we will pinpoint the key challenges organizations usually face during MDD adoption, and offer some practical solutions to overcome them.

We will lead people through the technical tasks, people issues and process best practices in a tutorial format: teaching supported by slides, examples, cases, discussions and questions. Participants' own experiences and situations will be discussed throughout, so if you have a story to tell, do come along. We'll see if we can do a quick survey of participants to find which obstacles were most frequently encountered, and which were most serious.


Domain-Specific Modeling: MDD that works

March 17, 2010 20:04:43 +0200 (EET)

Long time no blog. Aside from working on the next version of MetaEdit+ (about which more in a later post), I've been speaking at a variety of events. The first, back in November, was for the members of the International Association of Architects: a webcast on Domain-Specific Modeling: MDD that works. (The link will take you to the webcast, which was recorded live. Update: Added the codec and a Flash video alternative.)

The format worked surprisingly well, even for the question and answer session: the participants could submit questions via the webcast chat tool, and Juha-Pekka picked a selection of them to ask me at the end. Something that was asked later was the size of the domain: with DSM, this is never something as broad as "embedded systems", or even "mobile phones". Normally it will be much narrower, e.g. a particular family or product line of mobile phones from a particular manufacturer, or home and auto insurance policies in the EU. Making things no broader than necessary is one of the keys to high productivity with your DSM language, and also to making it easier to build the language: stick to what you are experts in, and follow YAGNI to avoid creating extra work for yourself.

One thing that struck me when preparing for the talk was that we often show graphs of how productivity increased in various DSM cases, but we don't normally have time to provide the details of each case. The cases studies are, however, available (see table below): some even with the modeling languages and generators publicly available. Most importantly, some of the cases obtained their productivity figures with scientifically rigorous experiments; not many companies have time for that, but those experiments serve to confirm the validity of engineers' estimates in other cases.

Productivity figures can seem a little abstract, so in the talk I looked at them in terms of something more concrete. Imagine that you can build one product in a certain time by coding. How much more can you build if you apply various kinds of modeling? In a previous post we already saw how with normal UML, you can actually build less than with manual coding. With MDA, the tool vendors' own figures range from 22% extra (Obeo: average cost reduction of an MDA project is 18% => 22% productivity increase) to 35% extra (OptimalJ: 30–40%). You can see from the red bars on MDA1 and MDA2 below just how little you get for your investment in tools and training — not to mention the long term nightmare of trying to maintain CIM, PIM, PSM and code in synch with only semi-automatic support. The productivity results for DSM tell a very different story: 5–10x faster, and completely automatic code generation.
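The conversion from Obeo's "18% cost reduction" to "22% productivity increase" follows from simple arithmetic: if the same output costs 18% less effort, the same effort buys 1 / (1 - 0.18) ≈ 1.22 times the output. A quick check:

```python
# Convert a stated cost reduction into the equivalent productivity
# increase: same effort, more output.

def productivity_increase(cost_reduction: float) -> float:
    return 1.0 / (1.0 - cost_reduction) - 1.0

print(round(productivity_increase(0.18) * 100))  # 22 (%)
```

By the same formula, DSM's 5–10x figures would correspond to cost reductions of 80–90% — a different league entirely.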

But what is the cost of creating your own DSM language and generator? After all, the MDA tool vendors promise to provide those, so that's an extra cost for DSM. Here are the figures in person days for the cases above. I think they speak for themselves: 2–3 weeks by one person is so small it makes you cry. How many companies out there could spend 2–3 weeks to make themselves 5–10x faster?

Of course those figures demand good tools: all those cases are with MetaEdit+. Your mileage will definitely vary with other tools — and what's worse, because they make you do more work, your brain power will be directed away from creating the best possible language. At the end of the day, it's the size of the productivity improvement for the modelers that counts the most. If you can make 10 modelers 6x faster rather than 5x faster, that's worth 10 extra person years per year. With MetaEdit+, you're far more likely to get those extra person years, and you're sure to get your language ready faster. When we see the first case with GMF or DSL Tools making it into the 5–10x range, we can look at this topic again...
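The back-of-the-envelope behind those claims can be written out explicitly. The function names and the payback calculation are my own illustration of the post's figures (2–3 person-weeks to build the language, 10 modelers, 5–10x speedup), not numbers from the case studies themselves:

```python
# Output is measured in hand-coding person-years: a modeler who is
# N times faster produces N person-years of equivalent output per year.

def extra_person_years_per_year(modelers: int, speedup: float) -> float:
    """Extra annual output versus the same team hand-coding."""
    return modelers * (speedup - 1)

def payback_weeks(language_cost_weeks: float, modelers: int, speedup: float) -> float:
    """Weeks of team work until saved effort covers the language-building cost."""
    saved_per_week = modelers * (speedup - 1)
    return language_cost_weeks / saved_per_week

# 6x vs 5x for 10 modelers: 10 extra person-years per year.
print(extra_person_years_per_year(10, 6) - extra_person_years_per_year(10, 5))
# A 3-week language build for a 10-person team at 5x pays back in
# a small fraction of a week.
print(payback_weeks(3, 10, 5))
```

On these assumptions the language-building cost is recovered in under a day of team operation, which is why the 2–3 week figure "makes you cry".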

Referenced DSM Cases (for more see

  • Heart rate monitor (Polar): Kärnä et al., "Evaluating the use of DSM in embedded UI application development", Procs of DSM'09 at OOPSLA, 2009.
  • Call processing services: Kelly, S., Tolvanen, J.-P., Domain-Specific Modeling: Enabling Full Code Generation, Chapter 5, Wiley, 2008. [CPL project in MetaEdit+]
  • Touch screen UI applications (Panasonic): Safa, L., "The Making Of User-Interface Designer: A Proprietary DSM Tool", Procs of DSM'07 at OOPSLA, 2007.
  • Home automation: Kelly, S., Tolvanen, J.-P., Domain-Specific Modeling: Enabling Full Code Generation, Chapter 5, Wiley, 2008. [Home automation project in MetaEdit+]
  • Mobile phone applications (Nokia): MetaCase, Nokia case study, 2000.
  • Phone switch features: Weiss, D. M., Lai, C. T. R., Software Product-line Engineering: A Family-Based Software Development Process, Addison Wesley Longman, 1999.
  • Financial web application (Pecunet): Kelly, S., Tolvanen, J.-P., Domain-Specific Modeling: Enabling Full Code Generation, Chapter 6, Wiley, 2008; MetaCase, Pecunet case study, 2001; IASA Architect Skills Library: Domain-Specific Modeling, 2007. [Insurance project in MetaEdit+]